ScrapeGraphAI

The ScrapeGraphAI MCP Server connects your ADK agent to ScrapeGraphAI. This integration enables your agent to extract structured data using natural language prompts, handle dynamic content like infinite scrolling, and convert complex webpages into clean, usable JSON or Markdown.

Use cases

  • Scalable Extraction & Crawling: Extract structured data from single pages or crawl entire websites, leveraging AI to handle dynamic content, infinite scrolling, and large-scale asynchronous operations.

  • Research and Summarization: Execute AI-powered web searches to research topics, aggregate data from multiple sources, and summarize findings.

  • Agentic Workflows: Run advanced agentic scraping workflows with customizable steps, complex navigation (like authentication), and structured output schemas.

Prerequisites

Use with agent

from google.adk.agents import Agent
from google.adk.tools.mcp_tool.mcp_toolset import MCPToolset
from google.adk.tools.mcp_tool.mcp_session_manager import StdioConnectionParams
from mcp import StdioServerParameters

SGAI_API_KEY = "YOUR_SCRAPEGRAPHAI_API_KEY"

root_agent = Agent(
    model="gemini-2.5-pro",
    name="scrapegraph_assistant_agent",
    instruction="""Help the user with web scraping and data extraction using
                  ScrapeGraphAI. You can convert webpages to markdown, extract
                  structured data using AI, perform web searches, crawl
                  multiple pages, and automate complex scraping workflows.""",
    tools=[
        MCPToolset(
            connection_params=StdioConnectionParams(
                server_params=StdioServerParameters(
                    # The following CLI command is available
                    # from `pip install scrapegraph-mcp`
                    command="scrapegraph-mcp",
                    env={
                        "SGAI_API_KEY": SGAI_API_KEY,
                    },
                ),
                timeout=300,
            ),
            # Optional: Filter which tools from the MCP server are exposed
            # tool_filter=["markdownify", "smartscraper", "searchscraper"]
        ),
    ],
)
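The example above hardcodes the API key for brevity. As a minimal sketch, assuming you would rather export the key in your shell than commit it to source, the helper below reads it from the environment and fails early if it is missing. The function name `load_sgai_env` is hypothetical and not part of ADK or ScrapeGraphAI.

```python
import os


def load_sgai_env(var_name: str = "SGAI_API_KEY") -> dict[str, str]:
    """Build the env mapping for the MCP server from the process environment.

    Hypothetical helper for illustration; raises if the variable is unset or blank.
    """
    api_key = os.environ.get(var_name, "").strip()
    if not api_key:
        raise RuntimeError(f"Set {var_name} before starting the agent")
    return {var_name: api_key}
```

With this in place, `env=load_sgai_env()` could replace the literal `env={...}` dictionary passed to `StdioServerParameters`.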

Available tools

Tool                        Description
markdownify                 Transform any webpage into clean, structured markdown format
smartscraper                Leverage AI to extract structured data from any webpage with support for infinite scrolling
searchscraper               Execute AI-powered web searches with structured, actionable results
scrape                      Basic scraping endpoint to fetch page content with optional heavy JavaScript rendering
sitemap                     Extract sitemap URLs and structure for any website
smartcrawler_initiate       Initiate intelligent multi-page web crawling (asynchronous operation)
smartcrawler_fetch_results  Retrieve results from asynchronous crawling operations
agentic_scrapper            Run advanced agentic scraping workflows with customizable steps and structured output schemas
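When you restrict the toolset with `tool_filter`, a misspelled name can be easy to miss (note that `agentic_scrapper` is spelled with a double "p"). The sketch below checks a requested list against the tool names in the table above before it is passed to `MCPToolset`. The names `KNOWN_TOOLS` and `build_tool_filter` are hypothetical helpers, and the set would need updating if the server's tool list changes.

```python
# Tool names copied from the "Available tools" table; an assumption that
# this list matches the server version you have installed.
KNOWN_TOOLS = frozenset({
    "markdownify",
    "smartscraper",
    "searchscraper",
    "scrape",
    "sitemap",
    "smartcrawler_initiate",
    "smartcrawler_fetch_results",
    "agentic_scrapper",
})


def build_tool_filter(*names: str) -> list[str]:
    """Return the requested tool names, rejecting any that are not in KNOWN_TOOLS."""
    unknown = sorted(set(names) - KNOWN_TOOLS)
    if unknown:
        raise ValueError(f"Unknown ScrapeGraphAI tools: {', '.join(unknown)}")
    # dict.fromkeys keeps first-seen order while dropping duplicates.
    return list(dict.fromkeys(names))
```

For example, `tool_filter=build_tool_filter("markdownify", "smartscraper", "searchscraper")` would reproduce the commented-out filter from the agent definition.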

Additional resources