Firecrawl
The Firecrawl MCP Server connects your ADK agent to the Firecrawl API, a service that crawls websites and converts their content into clean, structured markdown. Your agent can then ingest, search, and reason over web content from a given URL and its subpages.
Features
- Agent-based Web Research: Deploy an agent that can take a topic, use the search tool to find relevant URLs, and then use the scrape tool to extract the full content of each page for analysis or summarization.
- Structured Data Extraction: Use the extract tool to pull specific, structured information (like product names, prices, or contact info) from a list of URLs, powered by LLM extraction.
- Large-Scale Content Ingestion: Automate the scraping of entire websites or large batches of URLs using the batch scrape and crawl tools. This is ideal for populating a vector database for a RAG (Retrieval-Augmented Generation) pipeline.
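
For the ingestion use case, scraped markdown usually has to be split into chunks before it is embedded and stored. The following is a minimal sketch of one way to do that in plain Python. The function name `chunk_markdown` and the `max_chars` limit are illustrative assumptions and are not part of Firecrawl or ADK.

```python
def chunk_markdown(markdown: str, max_chars: int = 1000) -> list[str]:
    """Split markdown into chunks for embedding (illustrative helper).

    A new chunk starts at every heading line, or when adding the next
    line would push the current chunk past max_chars. A single line
    longer than max_chars is kept whole as its own oversized chunk.
    Note: lines starting with '#' inside code fences are also treated
    as headings; a real pipeline may want a proper markdown parser.
    """
    chunks: list[str] = []
    current: list[str] = []
    size = 0
    for line in markdown.splitlines():
        starts_section = line.startswith("#")
        too_big = size + len(line) + 1 > max_chars
        if current and (starts_section or too_big):
            chunks.append("\n".join(current).strip())
            current, size = [], 0
        current.append(line)
        size += len(line) + 1  # +1 accounts for the newline
    if current:
        chunks.append("\n".join(current).strip())
    # Drop chunks that were only blank lines.
    return [chunk for chunk in chunks if chunk]
```

Each returned chunk can then be embedded and written to whichever vector store the RAG pipeline uses.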
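Prerequisites
- A Firecrawl API key. It is required for the cloud API (the default) and optional for a self-hosted instance; see Configuration.
- Node.js with npx, if you launch the MCP server locally with the npx command used in the usage examples.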
Usage with ADK
To run the Firecrawl MCP server locally over stdio (launched with npx):

```python
from google.adk.agents.llm_agent import Agent
from google.adk.tools.mcp_tool.mcp_session_manager import StdioConnectionParams
from google.adk.tools.mcp_tool.mcp_toolset import MCPToolset
from mcp import StdioServerParameters

# Placeholder: replace with your own Firecrawl API key.
FIRECRAWL_API_KEY = "YOUR_FIRECRAWL_API_KEY"

root_agent = Agent(
    model="gemini-2.5-pro",
    name="firecrawl_agent",
    description="A helpful assistant for scraping websites with Firecrawl",
    instruction="Help the user search for website content",
    tools=[
        MCPToolset(
            connection_params=StdioConnectionParams(
                # Start the Firecrawl MCP server as a local subprocess.
                server_params=StdioServerParameters(
                    command="npx",
                    args=[
                        "-y",
                        "firecrawl-mcp",
                    ],
                    # The server reads its API key from the environment.
                    env={
                        "FIRECRAWL_API_KEY": FIRECRAWL_API_KEY,
                    },
                ),
                timeout=30,
            ),
        )
    ],
)
```
To connect to the remote Firecrawl MCP server over Streamable HTTP:

```python
from google.adk.agents.llm_agent import Agent
from google.adk.tools.mcp_tool.mcp_session_manager import StreamableHTTPServerParams
from google.adk.tools.mcp_tool.mcp_toolset import MCPToolset

# Placeholder: replace with your own Firecrawl API key.
FIRECRAWL_API_KEY = "YOUR_FIRECRAWL_API_KEY"

root_agent = Agent(
    model="gemini-2.5-pro",
    name="firecrawl_agent",
    description="A helpful assistant for scraping websites with Firecrawl",
    instruction="Help the user search for website content",
    tools=[
        MCPToolset(
            connection_params=StreamableHTTPServerParams(
                # The API key is part of the remote server URL.
                url=f"https://mcp.firecrawl.dev/{FIRECRAWL_API_KEY}/v2/mcp",
            ),
        )
    ],
)
```
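
Both snippets above assign a placeholder string to `FIRECRAWL_API_KEY`. In practice you may prefer to read the key from the process environment instead of committing it to source. The sketch below shows one way to do that; `load_api_key` and `remote_mcp_url` are hypothetical helper names, and the URL pattern is simply the one used in the remote example.

```python
import os
from typing import Mapping, Optional


def load_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the Firecrawl API key from the environment (illustrative helper).

    Raises RuntimeError if FIRECRAWL_API_KEY is missing or blank, so a
    misconfigured agent fails at startup rather than on its first tool call.
    """
    env = os.environ if env is None else env
    key = env.get("FIRECRAWL_API_KEY", "").strip()
    if not key:
        raise RuntimeError("Set FIRECRAWL_API_KEY before starting the agent")
    return key


def remote_mcp_url(api_key: str) -> str:
    """Build the remote MCP URL using the pattern from the example above."""
    return f"https://mcp.firecrawl.dev/{api_key}/v2/mcp"
```

With these helpers, `FIRECRAWL_API_KEY = load_api_key()` can replace the hardcoded placeholder, and `remote_mcp_url(FIRECRAWL_API_KEY)` can be passed as the `url` argument in the remote configuration.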
Available tools
The toolset exposes the following tools for web crawling, scraping, searching, and structured extraction:
| Tool | Name | Description |
|---|---|---|
| Scrape Tool | `firecrawl_scrape` | Scrape content from a single URL with advanced options |
| Batch Scrape Tool | `firecrawl_batch_scrape` | Scrape multiple URLs efficiently with built-in rate limiting and parallel processing |
| Check Batch Status | `firecrawl_check_batch_status` | Check the status of a batch operation |
| Map Tool | `firecrawl_map` | Map a website to discover all indexed URLs on the site |
| Search Tool | `firecrawl_search` | Search the web and optionally extract content from search results |
| Crawl Tool | `firecrawl_crawl` | Start an asynchronous crawl with advanced options |
| Check Crawl Status | `firecrawl_check_crawl_status` | Check the status of a crawl job |
| Extract Tool | `firecrawl_extract` | Extract structured information from web pages using LLM capabilities. Supports both cloud AI and self-hosted LLM extraction |
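
Because crawl and batch scrape jobs are asynchronous, a caller starts the job and then polls the matching status tool until it finishes. In an ADK agent the model normally issues these tool calls itself, but the pattern can be sketched in plain Python. Everything here is an assumption for illustration: `check_status` is a hypothetical callable standing in for however you invoke `firecrawl_check_crawl_status`, and the `status` field with its terminal values is an assumed response shape, not a documented one.

```python
import time
from typing import Any, Callable, Dict

# Assumed terminal states; the real status values may differ.
TERMINAL_STATES = ("completed", "failed", "cancelled")


def wait_for_job(
    check_status: Callable[[str], Dict[str, Any]],
    job_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Poll check_status(job_id) until the job reaches a terminal state.

    Returns the last status payload, or raises TimeoutError if the job
    is still running once timeout_s seconds have elapsed.
    """
    deadline = clock() + timeout_s
    while True:
        result = check_status(job_id)
        if result.get("status") in TERMINAL_STATES:
            return result
        if clock() >= deadline:
            raise TimeoutError(f"Job {job_id} did not finish in {timeout_s}s")
        sleep(interval_s)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised in tests without real waiting.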
Configuration
The Firecrawl MCP server can be configured using environment variables:
Required:
- `FIRECRAWL_API_KEY`: Your Firecrawl API key
  - Required when using the cloud API (default)
  - Optional when using a self-hosted instance with `FIRECRAWL_API_URL`
Firecrawl API URL (optional):
- `FIRECRAWL_API_URL` (Optional): Custom API endpoint for self-hosted instances
  - Example: `https://firecrawl.your-domain.com`
  - If not provided, the cloud API will be used (requires API key)
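
The rule above (the key is required for the cloud API and optional once a self-hosted URL is set) can be checked before the server is launched. The following is a minimal sketch; `build_server_env` is a hypothetical helper, not part of Firecrawl or ADK, and its result is meant to be passed as the `env` argument of `StdioServerParameters`.

```python
from typing import Dict, Optional


def build_server_env(
    api_key: Optional[str] = None,
    api_url: Optional[str] = None,
) -> Dict[str, str]:
    """Build the environment for the Firecrawl MCP server (illustrative).

    Mirrors the documented rule: FIRECRAWL_API_KEY is required unless
    FIRECRAWL_API_URL points at a self-hosted instance.
    """
    env: Dict[str, str] = {}
    if api_url:
        env["FIRECRAWL_API_URL"] = api_url
    if api_key:
        env["FIRECRAWL_API_KEY"] = api_key
    elif not api_url:
        raise ValueError(
            "FIRECRAWL_API_KEY is required when FIRECRAWL_API_URL is not set"
        )
    return env
```

For a self-hosted instance, `build_server_env(api_url="https://firecrawl.your-domain.com")` returns an environment without a key, while calling it with no arguments fails early instead of starting a server that cannot authenticate.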
Retry configuration (optional):
- `FIRECRAWL_RETRY_MAX_ATTEMPTS`: Maximum number of retry attempts (default: 3)
- `FIRECRAWL_RETRY_INITIAL_DELAY`: Initial delay in milliseconds before first retry (default: 1000)
- `FIRECRAWL_RETRY_MAX_DELAY`: Maximum delay in milliseconds between retries (default: 10000)
- `FIRECRAWL_RETRY_BACKOFF_FACTOR`: Exponential backoff multiplier (default: 2)
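
To see how these four settings interact, the sketch below computes the wait before each retry under a conventional exponential backoff: the initial delay multiplied by the backoff factor once per prior retry, capped at the maximum delay. This formula is an assumption about how the settings combine, not a description of the server's actual implementation, and `retry_delays_ms` is a hypothetical helper.

```python
def retry_delays_ms(
    max_attempts: int = 3,
    initial_delay_ms: int = 1000,
    max_delay_ms: int = 10000,
    backoff_factor: float = 2.0,
) -> list[int]:
    """Return the assumed delay (in ms) before each retry attempt.

    Delay n is initial_delay_ms * backoff_factor**n, capped at
    max_delay_ms. Defaults match the documented default values.
    """
    return [
        int(min(initial_delay_ms * backoff_factor ** attempt, max_delay_ms))
        for attempt in range(max_attempts)
    ]
```

Under this assumption the defaults give delays of 1000, 2000, and 4000 ms, and the 10000 ms cap only takes effect from the fifth retry onward.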
Credit usage monitoring (optional):
- `FIRECRAWL_CREDIT_WARNING_THRESHOLD`: Credit usage warning threshold (default: 1000)
- `FIRECRAWL_CREDIT_CRITICAL_THRESHOLD`: Credit usage critical threshold (default: 100)
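
Since the critical default (100) is lower than the warning default (1000), one plausible reading is that both thresholds are compared against the number of credits remaining. The sketch below illustrates that reading only; it is an assumption, and `credit_level` is a hypothetical helper rather than anything the server exposes.

```python
def credit_level(remaining: int, warning: int = 1000, critical: int = 100) -> str:
    """Classify a remaining-credit balance against the two thresholds.

    Assumes thresholds apply to remaining credits: at or below
    `critical` is "critical", at or below `warning` is "warning",
    anything higher is "ok".
    """
    if remaining <= critical:
        return "critical"
    if remaining <= warning:
        return "warning"
    return "ok"
```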