# Agent Development Kit
> Build powerful multi-agent systems with Agent Development Kit
An open-source, code-first toolkit for building, evaluating, and deploying sophisticated AI agents with flexibility and control.
# Build Agents
# Get started
Agent Development Kit (ADK) is designed to empower developers to quickly build, manage, evaluate and deploy AI-powered agents. These quick start guides get you set up and running a simple agent in less than 20 minutes.
- **Python Quickstart**
______________________________________________________________________
Create your first Python ADK agent in minutes.
[Start with Python](https://google.github.io/adk-docs/get-started/python/index.md)
- **Go Quickstart**
______________________________________________________________________
Create your first Go ADK agent in minutes.
[Start with Go](https://google.github.io/adk-docs/get-started/go/index.md)
- **Java Quickstart**
______________________________________________________________________
Create your first Java ADK agent in minutes.
[Start with Java](https://google.github.io/adk-docs/get-started/java/index.md)
- **TypeScript Quickstart**
______________________________________________________________________
Create your first TypeScript ADK agent in minutes.
[Start with TypeScript](https://google.github.io/adk-docs/get-started/typescript/index.md)
# Agent Development Kit (ADK)
**Build, Evaluate and Deploy agents, seamlessly!**
ADK is designed to empower developers to build, manage, evaluate and deploy AI-powered agents. It provides a robust and flexible environment for creating both conversational and non-conversational agents, capable of handling complex tasks and workflows.
## Core Concepts
ADK is built around a few key primitives and concepts that make it powerful and flexible. Here are the essentials:
- **Agent:** The fundamental worker unit designed for specific tasks. Agents can use language models (`LlmAgent`) for complex reasoning, or act as deterministic controllers of execution, known as "[workflow agents](https://google.github.io/adk-docs/agents/workflow-agents/index.md)" (`SequentialAgent`, `ParallelAgent`, `LoopAgent`).
- **Tool:** Gives agents abilities beyond conversation, letting them interact with external APIs, search information, run code, or call other services.
- **Callbacks:** Custom code snippets you provide to run at specific points in the agent's process, allowing for checks, logging, or behavior modifications.
- **Session Management (`Session` & `State`):** Handles the context of a single conversation (`Session`), including its history (`Events`) and the agent's working memory for that conversation (`State`).
- **Memory:** Enables agents to recall information about a user across *multiple* sessions, providing long-term context (distinct from short-term session `State`).
- **Artifact Management (`Artifact`):** Allows agents to save, load, and manage files or binary data (like images, PDFs) associated with a session or user.
- **Code Execution:** The ability for agents (usually via Tools) to generate and execute code to perform complex calculations or actions.
- **Planning:** An advanced capability where agents can break down complex goals into smaller steps and plan how to achieve them, for example with a ReAct-style planner.
- **Models:** The underlying LLM that powers `LlmAgent`s, enabling their reasoning and language understanding abilities.
- **Event:** The basic unit of communication representing things that happen during a session (user message, agent reply, tool use), forming the conversation history.
- **Runner:** The engine that manages the execution flow, orchestrates agent interactions based on Events, and coordinates with backend services.
***Note:** Features like Multimodal Streaming, Evaluation, Deployment, Debugging, and Trace are also part of the broader ADK ecosystem, supporting real-time interaction and the development lifecycle.*
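As a rough mental model, the relationship between a `Session`, its `Events`, and its `State` can be sketched in plain Python. These are illustrative classes only, not ADK's actual API:

```python
from dataclasses import dataclass, field

# Illustrative sketch (not ADK's real classes): a Session groups the Events
# of one conversation and carries a State dict as short-term working memory.
@dataclass
class Event:
    author: str   # e.g. "user", "agent", or a tool name
    content: str

@dataclass
class Session:
    session_id: str
    events: list[Event] = field(default_factory=list)    # conversation history
    state: dict[str, str] = field(default_factory=dict)  # working memory

session = Session(session_id="s1")
session.events.append(Event(author="user", content="What time is it in Paris?"))
session.state["last_city"] = "Paris"
```

Long-term `Memory` sits outside this picture: it persists user information across many such sessions, while `state` is scoped to one.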
## Key Capabilities
ADK offers several key advantages for developers building agentic applications:
1. **Multi-Agent System Design:** Easily build applications composed of multiple, specialized agents arranged hierarchically. Agents can coordinate complex tasks and delegate sub-tasks using LLM-driven transfer or explicit `AgentTool` invocation, enabling modular and scalable solutions.
1. **Rich Tool Ecosystem:** Equip agents with diverse capabilities. ADK supports integrating custom functions (`FunctionTool`), using other agents as tools (`AgentTool`), leveraging built-in functionalities like code execution, and interacting with external data sources and APIs (e.g., Search, Databases). Support for long-running tools allows handling asynchronous operations effectively.
1. **Flexible Orchestration:** Define complex agent workflows using built-in workflow agents (`SequentialAgent`, `ParallelAgent`, `LoopAgent`) alongside LLM-driven dynamic routing. This allows for both predictable pipelines and adaptive agent behavior.
1. **Integrated Developer Tooling:** Develop and iterate locally with ease. ADK includes tools like a command-line interface (CLI) and a Developer UI for running agents, inspecting execution steps (events, state changes), debugging interactions, and visualizing agent definitions.
1. **Native Streaming Support:** Build real-time, interactive experiences with native support for bidirectional streaming (text and audio). This integrates seamlessly with underlying capabilities like the [Multimodal Live API for the Gemini Developer API](https://ai.google.dev/gemini-api/docs/live) (or for [Vertex AI](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/multimodal-live)), often enabled with simple configuration changes.
1. **Built-in Agent Evaluation:** Assess agent performance systematically. The framework includes tools to create multi-turn evaluation datasets and run evaluations locally (via CLI or the dev UI) to measure quality and guide improvements.
1. **Broad LLM Support:** While optimized for Google's Gemini models, the framework is designed for flexibility, allowing integration with various LLMs (potentially including open-source or fine-tuned models) through its `BaseLlm` interface.
1. **Artifact Management:** Enable agents to handle files and binary data. The framework provides mechanisms (`ArtifactService`, context methods) for agents to save, load, and manage versioned artifacts like images, documents, or generated reports during their execution.
1. **Extensibility and Interoperability:** ADK promotes an open ecosystem. While providing core tools, it allows developers to easily integrate and reuse third-party tools and data connectors.
1. **State and Memory Management:** Automatically handles short-term conversational memory (`State` within a `Session`) managed by the `SessionService`. Provides integration points for longer-term `Memory` services, allowing agents to recall user information across multiple sessions.
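The workflow-agent idea behind `SequentialAgent` can be sketched in plain Python (a schematic, not ADK's implementation): sub-steps run in a fixed, deterministic order and communicate through shared state.

```python
from typing import Callable

# Shared state passed between steps, analogous to session State.
State = dict[str, str]

def research(state: State) -> None:
    # A stand-in for an LLM sub-agent that gathers information.
    state["facts"] = "Paris is the capital of France."

def summarize(state: State) -> None:
    # A stand-in for a sub-agent that condenses the previous step's output.
    state["summary"] = state["facts"].split(".")[0] + "."

def run_sequential(steps: list[Callable[[State], None]], state: State) -> State:
    # Deterministic, in-order execution -- the core of a sequential workflow.
    for step in steps:
        step(state)
    return state

result = run_sequential([research, summarize], {})
```

In ADK proper, the sub-steps would be agents and the shared dict would be the session `State`; the point is that orchestration order is fixed by code, not decided by an LLM.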
## Get Started
- Ready to build your first agent? [Try the quickstart](https://google.github.io/adk-docs/get-started/quickstart/index.md)
# Go Quickstart for ADK
This guide shows you how to get up and running with Agent Development Kit for Go. Before you start, make sure you have the following installed:
- Go 1.24.4 or later
- ADK Go v0.2.0 or later
## Create an agent project
Create an agent project with the following files and directory structure:
```text
my_agent/
agent.go # main agent code
.env # API keys or project IDs
```
Create this project structure using the command line:
```console
mkdir my_agent
type nul > my_agent\agent.go
type nul > my_agent\env.bat
```
```bash
mkdir -p my_agent/ && \
touch my_agent/agent.go && \
touch my_agent/.env
```
### Define the agent code
Create the code for a basic agent that uses the built-in [Google Search tool](/adk-docs/tools/built-in-tools/#google-search). Add the following code to the `my_agent/agent.go` file in your project directory:
my_agent/agent.go
```go
package main
import (
"context"
"log"
"os"
"google.golang.org/adk/agent"
"google.golang.org/adk/agent/llmagent"
"google.golang.org/adk/cmd/launcher"
"google.golang.org/adk/cmd/launcher/full"
"google.golang.org/adk/model/gemini"
"google.golang.org/adk/tool"
"google.golang.org/adk/tool/geminitool"
"google.golang.org/genai"
)
func main() {
ctx := context.Background()
model, err := gemini.NewModel(ctx, "gemini-3-pro-preview", &genai.ClientConfig{
APIKey: os.Getenv("GOOGLE_API_KEY"),
})
if err != nil {
log.Fatalf("Failed to create model: %v", err)
}
timeAgent, err := llmagent.New(llmagent.Config{
Name: "hello_time_agent",
Model: model,
Description: "Tells the current time in a specified city.",
Instruction: "You are a helpful assistant that tells the current time in a city.",
Tools: []tool.Tool{
geminitool.GoogleSearch{},
},
})
if err != nil {
log.Fatalf("Failed to create agent: %v", err)
}
config := &launcher.Config{
AgentLoader: agent.NewSingleLoader(timeAgent),
}
l := full.NewLauncher()
if err = l.Execute(ctx, config, os.Args[1:]); err != nil {
log.Fatalf("Run failed: %v\n\n%s", err, l.CommandLineSyntax())
}
}
```
### Configure project and dependencies
Use the `go mod` command to initialize the project module and install the required packages based on the `import` statements in your agent code file:
```console
go mod init my-agent/main
go mod tidy
```
### Set your API key
This project uses the Gemini API, which requires an API key. If you don't already have a Gemini API key, create one in Google AI Studio on the [API Keys](https://aistudio.google.com/app/apikey) page.
In a terminal window, write your API key into the `.env` or `env.bat` file of your project to set environment variables:
Update: my_agent/.env
```bash
echo 'export GOOGLE_API_KEY="YOUR_API_KEY"' > .env
```
Update: my_agent/env.bat
```console
echo 'set GOOGLE_API_KEY="YOUR_API_KEY"' > env.bat
```
Using other AI models with ADK
ADK supports the use of many generative AI models. For more information on configuring other models in ADK agents, see [Models & Authentication](/adk-docs/agents/models).
## Run your agent
You can run your ADK agent using either the interactive command-line interface you defined or the ADK web user interface provided by the ADK Go command-line tool. Both options let you test and interact with your agent.
### Run with command-line interface
Run your agent using the following Go command:
Run from: my_agent/ directory
```console
# Remember to load keys and settings: source .env OR env.bat
go run agent.go
```
### Run with web interface
Run your agent with the ADK web interface using the following Go command:
Run from: my_agent/ directory
```console
# Remember to load keys and settings: source .env OR env.bat
go run agent.go web api webui
```
This command starts a web server with a chat interface for your agent. You can access the web interface at http://localhost:8080. Select your agent in the upper-left corner and type a request.
Caution: ADK Web for development only
ADK Web is ***not meant for use in production deployments***. You should use ADK Web for development and debugging purposes only.
## Next: Build your agent
Now that you have ADK installed and your first agent running, try building your own agent with our build guides:
- [Build your agent](/adk-docs/tutorials/)
# Installing ADK
## Create & activate virtual environment
We recommend creating a virtual Python environment using [venv](https://docs.python.org/3/library/venv.html):
```shell
python -m venv .venv
```
Now, you can activate the virtual environment using the appropriate command for your operating system and environment:
```text
# Mac / Linux
source .venv/bin/activate
# Windows CMD:
.venv\Scripts\activate.bat
# Windows PowerShell:
.venv\Scripts\Activate.ps1
```
### Install ADK
```bash
pip install google-adk
```
(Optional) Verify your installation:
```bash
pip show google-adk
```
### Install ADK and ADK DevTools
```bash
npm install @google/adk @google/adk-devtools
```
## Create a new Go module
If you are starting a new project, you can create a new Go module:
```shell
go mod init example.com/my-agent
```
## Install ADK
To add the ADK to your project, run the following command:
```shell
go get google.golang.org/adk
```
This will add the ADK as a dependency to your `go.mod` file.
(Optional) Verify your installation by checking your `go.mod` file for the `google.golang.org/adk` entry.
You can use either Maven or Gradle to add the `google-adk` and `google-adk-dev` packages.
`google-adk` is the core Java ADK library. The optional `google-adk-dev` package adds a pluggable example Spring Boot server for running your agents.
If you are using maven, add the following to your `pom.xml`:
pom.xml
```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example.agent</groupId>
  <artifactId>adk-agents</artifactId>
  <version>1.0-SNAPSHOT</version>
  <properties>
    <maven.compiler.source>17</maven.compiler.source>
    <maven.compiler.target>17</maven.compiler.target>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>
  <dependencies>
    <dependency>
      <groupId>com.google.adk</groupId>
      <artifactId>google-adk</artifactId>
      <version>0.5.0</version>
    </dependency>
    <dependency>
      <groupId>com.google.adk</groupId>
      <artifactId>google-adk-dev</artifactId>
      <version>0.5.0</version>
    </dependency>
  </dependencies>
</project>
```
Here's a [complete pom.xml](https://github.com/google/adk-docs/tree/main/examples/java/cloud-run/pom.xml) file for reference.
If you are using gradle, add the dependency to your build.gradle:
build.gradle
```text
dependencies {
implementation 'com.google.adk:google-adk:0.5.0'
implementation 'com.google.adk:google-adk-dev:0.5.0'
}
```
You should also configure Gradle to pass `-parameters` to `javac`. (Alternatively, use `@Schema(name = "...")`).
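One common way to pass `-parameters` in a Gradle build is a `JavaCompile` options block like the following. This is a sketch to adapt to your build script, not part of the quickstart project:

```text
// build.gradle: emit method parameter names so FunctionTool can read them
tasks.withType(JavaCompile).configureEach {
    options.compilerArgs << '-parameters'
}
```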
## Next steps
- Try creating your first agent with the [**Quickstart**](https://google.github.io/adk-docs/get-started/quickstart/index.md)
# Java Quickstart for ADK
This guide shows you how to get up and running with Agent Development Kit for Java. Before you start, make sure you have the following installed:
- Java 17 or later
- Maven 3.9 or later
## Create an agent project
Create an agent project with the following files and directory structure:
```text
my_agent/
src/main/java/com/example/agent/
HelloTimeAgent.java # main agent code
AgentCliRunner.java # command-line interface
pom.xml # project configuration
.env # API keys or project IDs
```
Create this project structure using the command line:
```console
mkdir my_agent\src\main\java\com\example\agent
type nul > my_agent\src\main\java\com\example\agent\HelloTimeAgent.java
type nul > my_agent\src\main\java\com\example\agent\AgentCliRunner.java
type nul > my_agent\pom.xml
type nul > my_agent\.env
```
```bash
mkdir -p my_agent/src/main/java/com/example/agent && \
touch my_agent/src/main/java/com/example/agent/HelloTimeAgent.java && \
touch my_agent/src/main/java/com/example/agent/AgentCliRunner.java && \
touch my_agent/pom.xml my_agent/.env
```
### Define the agent code
Create the code for a basic agent, including a simple implementation of an ADK [Function Tool](/adk-docs/tools-custom/function-tools/), called `getCurrentTime()`. Add the following code to the `HelloTimeAgent.java` file in your project directory:
my_agent/src/main/java/com/example/agent/HelloTimeAgent.java
```java
package com.example.agent;
import com.google.adk.agents.BaseAgent;
import com.google.adk.agents.LlmAgent;
import com.google.adk.tools.Annotations.Schema;
import com.google.adk.tools.FunctionTool;
import java.util.Map;
public class HelloTimeAgent {
public static BaseAgent ROOT_AGENT = initAgent();
private static BaseAgent initAgent() {
return LlmAgent.builder()
.name("hello-time-agent")
.description("Tells the current time in a specified city")
.instruction("""
You are a helpful assistant that tells the current time in a city.
Use the 'getCurrentTime' tool for this purpose.
""")
.model("gemini-2.5-flash")
.tools(FunctionTool.create(HelloTimeAgent.class, "getCurrentTime"))
.build();
}
/** Mock tool implementation */
@Schema(description = "Get the current time for a given city")
public static Map<String, String> getCurrentTime(
@Schema(name = "city", description = "Name of the city to get the time for") String city) {
return Map.of(
"city", city,
"forecast", "The time is 10:30am."
);
}
}
```
Caution: Gemini 3 compatibility
ADK Java v0.3.0 and lower is not compatible with [Gemini 3 Pro Preview](https://ai.google.dev/gemini-api/docs/models#gemini-3-pro) due to thought signature changes for function calling. Use Gemini 2.5 or lower models instead.
### Configure project and dependencies
An ADK agent project requires this dependency in your `pom.xml` project file:
my_agent/pom.xml (partial)
```xml
<dependency>
  <groupId>com.google.adk</groupId>
  <artifactId>google-adk</artifactId>
  <version>0.5.0</version>
</dependency>
```
Update the `pom.xml` project file to include this dependency and additional settings, as shown in the following configuration:
Complete `pom.xml` configuration for project
The following code shows a complete `pom.xml` configuration for this project:
my_agent/pom.xml
```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example.agent</groupId>
  <artifactId>adk-agents</artifactId>
  <version>1.0-SNAPSHOT</version>
  <properties>
    <maven.compiler.source>17</maven.compiler.source>
    <maven.compiler.target>17</maven.compiler.target>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>
  <dependencies>
    <dependency>
      <groupId>com.google.adk</groupId>
      <artifactId>google-adk</artifactId>
      <version>0.3.0</version>
    </dependency>
    <dependency>
      <groupId>com.google.adk</groupId>
      <artifactId>google-adk-dev</artifactId>
      <version>0.3.0</version>
    </dependency>
  </dependencies>
</project>
```
### Set your API key
This project uses the Gemini API, which requires an API key. If you don't already have a Gemini API key, create one in Google AI Studio on the [API Keys](https://aistudio.google.com/app/apikey) page.
In a terminal window, write your API key into the `.env` file of your project to set environment variables:
Update: my_agent/.env
```bash
echo 'export GOOGLE_API_KEY="YOUR_API_KEY"' > .env
```
Update: my_agent/env.bat
```console
echo 'set GOOGLE_API_KEY="YOUR_API_KEY"' > env.bat
```
Using other AI models with ADK
ADK supports the use of many generative AI models. For more information on configuring other models in ADK agents, see [Models & Authentication](/adk-docs/agents/models).
### Create an agent command-line interface
Create an `AgentCliRunner.java` class to run and interact with `HelloTimeAgent` from the command line. This code shows how to create a `RunConfig` object to run the agent and a `Session` object to interact with the running agent.
my_agent/src/main/java/com/example/agent/AgentCliRunner.java
```java
package com.example.agent;
import com.google.adk.agents.RunConfig;
import com.google.adk.events.Event;
import com.google.adk.runner.InMemoryRunner;
import com.google.adk.sessions.Session;
import com.google.genai.types.Content;
import com.google.genai.types.Part;
import io.reactivex.rxjava3.core.Flowable;
import java.util.Scanner;
import static java.nio.charset.StandardCharsets.UTF_8;
public class AgentCliRunner {
public static void main(String[] args) {
RunConfig runConfig = RunConfig.builder().build();
InMemoryRunner runner = new InMemoryRunner(HelloTimeAgent.ROOT_AGENT);
Session session = runner
.sessionService()
.createSession(runner.appName(), "user1234")
.blockingGet();
try (Scanner scanner = new Scanner(System.in, UTF_8)) {
while (true) {
System.out.print("\nYou > ");
String userInput = scanner.nextLine();
if ("quit".equalsIgnoreCase(userInput)) {
break;
}
Content userMsg = Content.fromParts(Part.fromText(userInput));
Flowable<Event> events = runner.runAsync(session.userId(), session.id(), userMsg, runConfig);
System.out.print("\nAgent > ");
events.blockingForEach(event -> {
if (event.finalResponse()) {
System.out.println(event.stringifyContent());
}
});
}
}
}
}
```
## Run your agent
You can run your ADK agent using either the `AgentCliRunner` command-line interface you defined or the ADK web user interface provided by the `AdkWebServer` class. Both options let you test and interact with your agent.
### Run with command-line interface
Run your agent with the command-line interface `AgentCliRunner` class using the following Maven command:
```console
# Remember to load keys and settings: source .env OR env.bat
mvn compile exec:java -Dexec.mainClass="com.example.agent.AgentCliRunner"
```
### Run with web interface
Run your agent with the ADK web interface using the following Maven command:
```console
# Remember to load keys and settings: source .env OR env.bat
mvn compile exec:java \
-Dexec.mainClass="com.google.adk.web.AdkWebServer" \
-Dexec.args="--adk.agents.source-dir=target --server.port=8000"
```
This command starts a web server with a chat interface for your agent. You can access the web interface at http://localhost:8000. Select your agent in the upper-left corner and type a request.
Caution: ADK Web for development only
ADK Web is ***not meant for use in production deployments***. You should use ADK Web for development and debugging purposes only.
## Next: Build your agent
Now that you have ADK installed and your first agent running, try building your own agent with our build guides:
- [Build your agent](/adk-docs/tutorials/)
# Python Quickstart for ADK
This guide shows you how to get up and running with Agent Development Kit (ADK) for Python. Before you start, make sure you have the following installed:
- Python 3.10 or later
- `pip` for installing packages
## Installation
Install ADK by running the following command:
```shell
pip install google-adk
```
Recommended: create and activate a Python virtual environment
Create a Python virtual environment:
```shell
python -m venv .venv
```
Activate the Python virtual environment:
```console
.venv\Scripts\activate.bat
```
```console
.venv\Scripts\Activate.ps1
```
```bash
source .venv/bin/activate
```
## Create an agent project
Run the `adk create` command to start a new agent project.
```shell
adk create my_agent
```
### Explore the agent project
The created agent project has the following structure, with the `agent.py` file containing the main control code for the agent.
```text
my_agent/
agent.py # main agent code
.env # API keys or project IDs
__init__.py
```
## Update your agent project
The `agent.py` file contains a `root_agent` definition which is the only required element of an ADK agent. You can also define tools for the agent to use. Update the generated `agent.py` code to include a `get_current_time` tool for use by the agent, as shown in the following code:
```python
from google.adk.agents.llm_agent import Agent
# Mock tool implementation
def get_current_time(city: str) -> dict:
"""Returns the current time in a specified city."""
return {"status": "success", "city": city, "time": "10:30 AM"}
root_agent = Agent(
model='gemini-3-flash-preview',
name='root_agent',
description="Tells the current time in a specified city.",
instruction="You are a helpful assistant that tells the current time in cities. Use the 'get_current_time' tool for this purpose.",
tools=[get_current_time],
)
```
### Set your API key
This project uses the Gemini API, which requires an API key. If you don't already have a Gemini API key, create one in Google AI Studio on the [API Keys](https://aistudio.google.com/app/apikey) page.
In a terminal window, write your API key into an `.env` file as an environment variable:
Update: my_agent/.env
```console
echo 'GOOGLE_API_KEY="YOUR_API_KEY"' > .env
```
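ADK's tooling reads this file for you when the agent runs; conceptually, each `KEY="VALUE"` line becomes an environment variable. A minimal stdlib sketch of that mapping (illustrative only, the helper name is made up):

```python
import os

# Illustrative sketch: how one KEY="VALUE" line in a .env file maps to an
# environment variable. ADK loads the real file automatically.
def load_env_line(line: str) -> None:
    key, _, value = line.strip().partition("=")
    os.environ[key] = value.strip('"')

load_env_line('GOOGLE_API_KEY="YOUR_API_KEY"')
```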
Using other AI models with ADK
ADK supports the use of many generative AI models. For more information on configuring other models in ADK agents, see [Models & Authentication](/adk-docs/agents/models).
## Run your agent
You can run your ADK agent with an interactive command-line interface using the `adk run` command, or with the ADK web user interface using the `adk web` command. Both options let you test and interact with your agent.
### Run with command-line interface
Run your agent using the `adk run` command-line tool.
```console
adk run my_agent
```
### Run with web interface
The ADK framework provides a web interface you can use to test and interact with your agent. Start the web interface using the following command:
```console
adk web --port 8000
```
Note
Run this command from the **parent directory** that contains your `my_agent/` folder. For example, if your agent is inside `agents/my_agent/`, run `adk web` from the `agents/` directory.
This command starts a web server with a chat interface for your agent. You can access the web interface at http://localhost:8000. Select the agent in the upper-left corner and type a request.
Caution: ADK Web for development only
ADK Web is ***not meant for use in production deployments***. You should use ADK Web for development and debugging purposes only.
## Next: Build your agent
Now that you have ADK installed and your first agent running, try building your own agent with our build guides:
- [Build your agent](/adk-docs/tutorials/)
# Build a multi-tool agent
This quickstart guides you through installing the Agent Development Kit (ADK), setting up a basic agent with multiple tools, and running it locally either in the terminal or in the interactive, browser-based dev UI.
This quickstart assumes a local IDE (VS Code, PyCharm, IntelliJ IDEA, etc.) with Python 3.10+ or Java 17+ and terminal access. This method runs the application entirely on your machine and is recommended for internal development.
## 1. Set up Environment & Install ADK
Create & Activate Virtual Environment (Recommended):
```bash
# Create
python -m venv .venv
# Activate (each new terminal)
# macOS/Linux: source .venv/bin/activate
# Windows CMD: .venv\Scripts\activate.bat
# Windows PowerShell: .venv\Scripts\Activate.ps1
```
Install ADK:
```bash
pip install google-adk
```
Create a new project directory, initialize it, and install dependencies:
```bash
mkdir my-adk-agent
cd my-adk-agent
npm init -y
npm install @google/adk @google/adk-devtools
npm install -D typescript
```
Create a `tsconfig.json` file with the following content. This configuration ensures your project correctly handles modern Node.js modules.
tsconfig.json
```json
{
"compilerOptions": {
"target": "es2020",
"module": "nodenext",
"moduleResolution": "nodenext",
"esModuleInterop": true,
"strict": true,
"skipLibCheck": true,
// set to false to allow CommonJS module syntax:
"verbatimModuleSyntax": false
}
}
```
To install ADK and set up the environment, proceed to the following steps.
## 2. Create Agent Project
### Project structure
You will need to create the following project structure:
```console
parent_folder/
multi_tool_agent/
__init__.py
agent.py
.env
```
Create the folder `multi_tool_agent`:
```bash
mkdir multi_tool_agent/
```
Note for Windows users
When using ADK on Windows for the next few steps, we recommend creating Python files using File Explorer or an IDE because the following commands (`mkdir`, `echo`) typically generate files with null bytes and/or incorrect encoding.
### `__init__.py`
Now create an `__init__.py` file in the folder:
```shell
echo "from . import agent" > multi_tool_agent/__init__.py
```
Your `__init__.py` should now look like this:
multi_tool_agent/__init__.py
```python
from . import agent
```
### `agent.py`
Create an `agent.py` file in the same folder:
```shell
touch multi_tool_agent/agent.py
```
```shell
type nul > multi_tool_agent/agent.py
```
Copy and paste the following code into `agent.py`:
multi_tool_agent/agent.py
```python
import datetime
from zoneinfo import ZoneInfo
from google.adk.agents import Agent
def get_weather(city: str) -> dict:
"""Retrieves the current weather report for a specified city.
Args:
city (str): The name of the city for which to retrieve the weather report.
Returns:
dict: status and result or error msg.
"""
if city.lower() == "new york":
return {
"status": "success",
"report": (
"The weather in New York is sunny with a temperature of 25 degrees"
" Celsius (77 degrees Fahrenheit)."
),
}
else:
return {
"status": "error",
"error_message": f"Weather information for '{city}' is not available.",
}
def get_current_time(city: str) -> dict:
"""Returns the current time in a specified city.
Args:
city (str): The name of the city for which to retrieve the current time.
Returns:
dict: status and result or error msg.
"""
if city.lower() == "new york":
tz_identifier = "America/New_York"
else:
return {
"status": "error",
"error_message": (
f"Sorry, I don't have timezone information for {city}."
),
}
tz = ZoneInfo(tz_identifier)
now = datetime.datetime.now(tz)
report = (
f'The current time in {city} is {now.strftime("%Y-%m-%d %H:%M:%S %Z%z")}'
)
return {"status": "success", "report": report}
root_agent = Agent(
name="weather_time_agent",
model="gemini-2.0-flash",
description=(
"Agent to answer questions about the time and weather in a city."
),
instruction=(
"You are a helpful agent who can answer user questions about the time and weather in a city."
),
tools=[get_weather, get_current_time],
)
```
### `.env`
Create a `.env` file in the same folder:
```shell
touch multi_tool_agent/.env
```
```shell
type nul > multi_tool_agent\.env
```
More instructions about this file are described in the next section on [Set up the model](#set-up-the-model).
You will need to create the following project structure in your `my-adk-agent` directory:
```console
my-adk-agent/
agent.ts
.env
package.json
tsconfig.json
```
### `agent.ts`
Create an `agent.ts` file in your project folder:
```shell
touch agent.ts
```
```shell
type nul > agent.ts
```
Copy and paste the following code into `agent.ts`:
agent.ts
```typescript
import 'dotenv/config';
import { FunctionTool, LlmAgent } from '@google/adk';
import { z } from 'zod';
const getWeather = new FunctionTool({
name: 'get_weather',
description: 'Retrieves the current weather report for a specified city.',
parameters: z.object({
city: z.string().describe('The name of the city for which to retrieve the weather report.'),
}),
execute: ({ city }) => {
if (city.toLowerCase() === 'new york') {
return {
status: 'success',
report:
'The weather in New York is sunny with a temperature of 25 degrees Celsius (77 degrees Fahrenheit).',
};
} else {
return {
status: 'error',
error_message: `Weather information for '${city}' is not available.`,
};
}
},
});
const getCurrentTime = new FunctionTool({
name: 'get_current_time',
description: 'Returns the current time in a specified city.',
parameters: z.object({
city: z.string().describe("The name of the city for which to retrieve the current time."),
}),
execute: ({ city }) => {
let tz_identifier: string;
if (city.toLowerCase() === 'new york') {
tz_identifier = 'America/New_York';
} else {
return {
status: 'error',
error_message: `Sorry, I don't have timezone information for ${city}.`,
};
}
const now = new Date();
const report = `The current time in ${city} is ${now.toLocaleString('en-US', { timeZone: tz_identifier })}`;
return { status: 'success', report: report };
},
});
export const rootAgent = new LlmAgent({
name: 'weather_time_agent',
model: 'gemini-2.5-flash',
description: 'Agent to answer questions about the time and weather in a city.',
instruction: 'You are a helpful agent who can answer user questions about the time and weather in a city.',
tools: [getWeather, getCurrentTime],
});
```
### `.env`
Create a `.env` file in the same folder:
```shell
touch .env
```
```shell
type nul > .env
```
More instructions about this file are described in the next section on [Set up the model](#set-up-the-model).
Java projects generally feature the following project structure:
```console
project_folder/
├── pom.xml (or build.gradle)
└── src/
    ├── main/
    │   └── java/
    │       └── agents/
    │           └── multitool/
    └── test/
```
### Create `MultiToolAgent.java`
Create a `MultiToolAgent.java` source file in the `agents.multitool` package in the `src/main/java/agents/multitool/` directory.
Copy and paste the following code into `MultiToolAgent.java`:
agents/multitool/MultiToolAgent.java
```java
package agents.multitool;
import com.google.adk.agents.BaseAgent;
import com.google.adk.agents.LlmAgent;
import com.google.adk.events.Event;
import com.google.adk.runner.InMemoryRunner;
import com.google.adk.sessions.Session;
import com.google.adk.tools.Annotations.Schema;
import com.google.adk.tools.FunctionTool;
import com.google.genai.types.Content;
import com.google.genai.types.Part;
import io.reactivex.rxjava3.core.Flowable;
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Map;
import java.util.Scanner;
public class MultiToolAgent {
private static String USER_ID = "student";
private static String NAME = "multi_tool_agent";
// To run your agent with the Dev UI, ROOT_AGENT must be a public static final variable.
public static final BaseAgent ROOT_AGENT = initAgent();
public static BaseAgent initAgent() {
return LlmAgent.builder()
.name(NAME)
.model("gemini-2.0-flash")
.description("Agent to answer questions about the time and weather in a city.")
.instruction(
"You are a helpful agent who can answer user questions about the time and weather"
+ " in a city.")
.tools(
FunctionTool.create(MultiToolAgent.class, "getCurrentTime"),
FunctionTool.create(MultiToolAgent.class, "getWeather"))
.build();
}
public static Map<String, String> getCurrentTime(
@Schema(name = "city",
description = "The name of the city for which to retrieve the current time")
String city) {
String normalizedCity =
Normalizer.normalize(city, Normalizer.Form.NFD)
.trim()
.toLowerCase()
.replaceAll("(\\p{IsM}+|\\p{IsP}+)", "")
.replaceAll("\\s+", "_");
return ZoneId.getAvailableZoneIds().stream()
.filter(zid -> zid.toLowerCase().endsWith("/" + normalizedCity))
.findFirst()
.map(
zid ->
Map.of(
"status",
"success",
"report",
"The current time in "
+ city
+ " is "
+ ZonedDateTime.now(ZoneId.of(zid))
.format(DateTimeFormatter.ofPattern("HH:mm"))
+ "."))
.orElse(
Map.of(
"status",
"error",
"report",
"Sorry, I don't have timezone information for " + city + "."));
}
public static Map<String, String> getWeather(
@Schema(name = "city",
description = "The name of the city for which to retrieve the weather report")
String city) {
if (city.toLowerCase().equals("new york")) {
return Map.of(
"status",
"success",
"report",
"The weather in New York is sunny with a temperature of 25 degrees Celsius (77 degrees"
+ " Fahrenheit).");
} else {
return Map.of(
"status", "error", "report", "Weather information for " + city + " is not available.");
}
}
public static void main(String[] args) throws Exception {
InMemoryRunner runner = new InMemoryRunner(ROOT_AGENT);
Session session =
runner
.sessionService()
.createSession(NAME, USER_ID)
.blockingGet();
try (Scanner scanner = new Scanner(System.in, StandardCharsets.UTF_8)) {
while (true) {
System.out.print("\nYou > ");
String userInput = scanner.nextLine();
if ("quit".equalsIgnoreCase(userInput)) {
break;
}
Content userMsg = Content.fromParts(Part.fromText(userInput));
Flowable<Event> events = runner.runAsync(USER_ID, session.id(), userMsg);
System.out.print("\nAgent > ");
events.blockingForEach(event -> System.out.println(event.stringifyContent()));
}
}
}
}
```
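As a reference for the lookup logic in `getCurrentTime` above, the same normalize-and-match approach can be sketched in Python using only the standard library (`zoneinfo`); the helper name is illustrative:

```python
import re
import unicodedata
from zoneinfo import available_timezones

def find_zone_id(city: str):
    """Normalize a city name (e.g. "New York" -> "new_york") and return
    the first IANA zone ID whose last segment matches, or None."""
    normalized = unicodedata.normalize("NFD", city.strip().lower())
    normalized = re.sub(r"[^\w\s]", "", normalized)  # drop accents/punctuation
    normalized = re.sub(r"\s+", "_", normalized)
    return next(
        (z for z in sorted(available_timezones())
         if z.lower().endswith("/" + normalized)),
        None,
    )
```

For example, `find_zone_id("New York")` resolves to `America/New_York`, while an unknown city returns `None`, mirroring the success/error branches of the Java tool.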
## 3. Set up the model
Your agent's ability to understand user requests and generate responses is powered by a Large Language Model (LLM). Your agent needs to make secure calls to this external LLM service, which **requires authentication credentials**. Without valid authentication, the LLM service will deny the agent's requests, and the agent will be unable to function.
Model Authentication guide
For a detailed guide on authenticating to different models, see the [Authentication guide](/adk-docs/agents/models/google-gemini#google-ai-studio). This is a critical step to ensure your agent can make calls to the LLM service.
1. Get an API key from [Google AI Studio](https://aistudio.google.com/apikey).
1. When using Python, open the **`.env`** file located inside the `multi_tool_agent/` folder and copy-paste the following code.
multi_tool_agent/.env
```text
GOOGLE_GENAI_USE_VERTEXAI=FALSE
GOOGLE_API_KEY=PASTE_YOUR_ACTUAL_API_KEY_HERE
```
When using Java, define environment variables:
terminal
```console
export GOOGLE_GENAI_USE_VERTEXAI=FALSE
export GOOGLE_API_KEY=PASTE_YOUR_ACTUAL_API_KEY_HERE
```
When using TypeScript, the `.env` file is automatically loaded by the `import 'dotenv/config';` line at the top of your `agent.ts` file.
multi_tool_agent/.env
```text
GOOGLE_GENAI_USE_VERTEXAI=FALSE
GOOGLE_GENAI_API_KEY=PASTE_YOUR_ACTUAL_API_KEY_HERE
```
1. Replace `PASTE_YOUR_ACTUAL_API_KEY_HERE` with your actual `API KEY`.
1. Set up a [Google Cloud project](https://cloud.google.com/vertex-ai/generative-ai/docs/start/quickstarts/quickstart-multimodal#setup-gcp) and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).
1. Set up the [gcloud CLI](https://cloud.google.com/vertex-ai/generative-ai/docs/start/quickstarts/quickstart-multimodal#setup-local).
1. Authenticate to Google Cloud from the terminal by running `gcloud auth application-default login`.
1. When using Python, open the **`.env`** file located inside the `multi_tool_agent/` folder. Copy-paste the following code and update the project ID and location.
multi_tool_agent/.env
```text
GOOGLE_GENAI_USE_VERTEXAI=TRUE
GOOGLE_CLOUD_PROJECT=YOUR_PROJECT_ID
GOOGLE_CLOUD_LOCATION=LOCATION
```
When using Java, define environment variables:
terminal
```console
export GOOGLE_GENAI_USE_VERTEXAI=TRUE
export GOOGLE_CLOUD_PROJECT=YOUR_PROJECT_ID
export GOOGLE_CLOUD_LOCATION=LOCATION
```
When using TypeScript, the `.env` file is automatically loaded by the `import 'dotenv/config';` line at the top of your `agent.ts` file.
.env
```text
GOOGLE_GENAI_USE_VERTEXAI=TRUE
GOOGLE_CLOUD_PROJECT=YOUR_PROJECT_ID
GOOGLE_CLOUD_LOCATION=LOCATION
```
1. You can sign up for a free Google Cloud project and use Gemini for free with an eligible account!
- Set up a [Google Cloud project with Vertex AI Express Mode](https://cloud.google.com/vertex-ai/generative-ai/docs/start/express-mode/overview)
- Get an API key from your Express mode project. This key can be used with ADK to access Gemini models for free, as well as Agent Engine services.
1. When using Python, open the **`.env`** file located inside the `multi_tool_agent/` folder. Copy-paste the following code and update the API key.
multi_tool_agent/.env
```text
GOOGLE_GENAI_USE_VERTEXAI=TRUE
GOOGLE_API_KEY=PASTE_YOUR_ACTUAL_EXPRESS_MODE_API_KEY_HERE
```
When using Java, define environment variables:
terminal
```console
export GOOGLE_GENAI_USE_VERTEXAI=TRUE
export GOOGLE_API_KEY=PASTE_YOUR_ACTUAL_EXPRESS_MODE_API_KEY_HERE
```
When using TypeScript, the `.env` file is automatically loaded by the `import 'dotenv/config';` line at the top of your `agent.ts` file.
.env
```text
GOOGLE_GENAI_USE_VERTEXAI=TRUE
GOOGLE_GENAI_API_KEY=PASTE_YOUR_ACTUAL_EXPRESS_MODE_API_KEY_HERE
```
## 4. Run Your Agent
Using the terminal, navigate to the parent directory of your agent project (e.g. using `cd ..`):
```console
parent_folder/ <-- navigate to this directory
multi_tool_agent/
__init__.py
agent.py
.env
```
There are multiple ways to interact with your agent:
Authentication Setup for Vertex AI Users
If you selected **"Gemini - Google Cloud Vertex AI"** in the previous step, you must authenticate with Google Cloud before launching the dev UI.
Run this command and follow the prompts:
```bash
gcloud auth application-default login
```
**Note:** Skip this step if you're using "Gemini - Google AI Studio".
Run the following command to launch the **dev UI**.
```shell
adk web
```
Caution: ADK Web for development only
ADK Web is ***not meant for use in production deployments***. You should use ADK Web for development and debugging purposes only.
Note for Windows users
When hitting the `_make_subprocess_transport NotImplementedError`, consider using `adk web --no-reload` instead.
**Step 1:** Open the URL provided (usually `http://localhost:8000` or `http://127.0.0.1:8000`) directly in your browser.
**Step 2.** In the top-left corner of the UI, you can select your agent in the dropdown. Select "multi_tool_agent".
Troubleshooting
If you do not see "multi_tool_agent" in the dropdown menu, make sure you are running `adk web` in the **parent folder** of your agent folder (i.e. the parent folder of multi_tool_agent).
**Step 3.** Now you can chat with your agent using the textbox:
**Step 4.** By using the `Events` tab at the left, you can inspect individual function calls, responses and model responses by clicking on the actions:
On the `Events` tab, you can also click the `Trace` button to see trace logs for each event, showing the latency of each function call:
**Step 5.** You can also enable your microphone and talk to your agent:
Model support for voice/video streaming
In order to use voice/video streaming in ADK, you will need to use Gemini models that support the Live API. You can find the **model IDs** that support the Gemini Live API in the documentation:
- [Google AI Studio: Gemini Live API](https://ai.google.dev/gemini-api/docs/models#live-api)
- [Vertex AI: Gemini Live API](https://cloud.google.com/vertex-ai/generative-ai/docs/live-api)
You can then replace the `model` string in `root_agent` in the `agent.py` file you created earlier ([jump to section](#agentpy)). Your code should look something like:
```py
root_agent = Agent(
name="weather_time_agent",
model="replace-me-with-model-id", #e.g. gemini-2.0-flash-live-001
...
```
Tip
When using `adk run` you can inject prompts into the agent to start by piping text to the command like so:
```shell
echo "Please start by listing files" | adk run file_listing_agent
```
Run the following command to chat with your Weather agent.
```text
adk run multi_tool_agent
```
To exit, use Cmd/Ctrl+C.
`adk api_server` lets you create a local FastAPI server in a single command, so you can test local cURL requests before you deploy your agent.
To learn how to use `adk api_server` for testing, refer to the [documentation on using the API server](/adk-docs/runtime/api-server/).
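To sketch what such a cURL test sends, the JSON body for a run request can be assembled ahead of time. The field names below (`app_name`, `new_message`, and so on) are assumptions about the server's schema, so verify them against the API server documentation before use:

```python
import json

def build_run_payload(app_name: str, user_id: str,
                      session_id: str, text: str) -> dict:
    """Assemble a run-request body for a local agent API server.
    Field names are assumptions; check the API server docs."""
    return {
        "app_name": app_name,
        "user_id": user_id,
        "session_id": session_id,
        "new_message": {"role": "user", "parts": [{"text": text}]},
    }

body = json.dumps(build_run_payload(
    "multi_tool_agent", "u_123", "s_123", "What is the time in New York?"))
```

With the server running, `body` could be POSTed to the run endpoint with cURL or an HTTP client; a session typically has to be created first.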
Using the terminal, navigate to your agent project directory:
```console
my-adk-agent/ <-- navigate to this directory
agent.ts
.env
package.json
tsconfig.json
```
There are multiple ways to interact with your agent:
Run the following command to launch the **dev UI**.
```shell
npx adk web
```
**Step 1:** Open the URL provided (usually `http://localhost:8000` or `http://127.0.0.1:8000`) directly in your browser.
**Step 2.** In the top-left corner of the UI, select your agent from the dropdown. The agents are listed by their filenames, so you should select "agent".
Troubleshooting
If you do not see "agent" in the dropdown menu, make sure you are running `npx adk web` in the directory containing your `agent.ts` file.
**Step 3.** Now you can chat with your agent using the textbox:
**Step 4.** By using the `Events` tab at the left, you can inspect individual function calls, responses and model responses by clicking on the actions:
On the `Events` tab, you can also click the `Trace` button to see trace logs for each event, showing the latency of each function call:
Run the following command to chat with your agent.
```text
npx adk run agent.ts
```
To exit, use Cmd/Ctrl+C.
`npx adk api_server` lets you create a local Express.js server in a single command, so you can test local cURL requests before you deploy your agent.
To learn how to use `api_server` for testing, refer to the [documentation on testing](/adk-docs/runtime/api-server/).
Using the terminal, navigate to the parent directory of your agent project (e.g. using `cd ..`):
```console
project_folder/ <-- navigate to this directory
├── pom.xml (or build.gradle)
├── src/
│   └── main/
│       └── java/
│           └── agents/
│               └── multitool/
│                   └── MultiToolAgent.java
└── test/
```
Run the following command from the terminal to launch the Dev UI.
**DO NOT change the main class name of the Dev UI server.**
terminal
```console
mvn exec:java \
-Dexec.mainClass="com.google.adk.web.AdkWebServer" \
-Dexec.args="--adk.agents.source-dir=src/main/java" \
-Dexec.classpathScope="compile"
```
**Step 1:** Open the URL provided (usually `http://localhost:8080` or `http://127.0.0.1:8080`) directly in your browser.
**Step 2.** In the top-left corner of the UI, you can select your agent in the dropdown. Select "multi_tool_agent".
Troubleshooting
If you do not see "multi_tool_agent" in the dropdown menu, make sure you are running the `mvn` command at the location where your Java source code is located (usually `src/main/java`).
**Step 3.** Now you can chat with your agent using the textbox:
**Step 4.** You can also inspect individual function calls, responses and model responses by clicking on the actions:
Caution: ADK Web for development only
ADK Web is ***not meant for use in production deployments***. You should use ADK Web for development and debugging purposes only.
With Maven, run the `main()` method of your Java class with the following command:
terminal
```console
mvn compile exec:java -Dexec.mainClass="agents.multitool.MultiToolAgent"
```
With Gradle, the `build.gradle` or `build.gradle.kts` build file should have the following Java plugin in its `plugins` section:
```groovy
plugins {
id('java')
// other plugins
}
```
Then, elsewhere in the build file, at the top-level, create a new task to run the `main()` method of your agent:
```groovy
tasks.register('runAgent', JavaExec) {
classpath = sourceSets.main.runtimeClasspath
mainClass = 'agents.multitool.MultiToolAgent'
}
```
Finally, on the command-line, run the following command:
```console
gradle runAgent
```
### 📝 Example prompts to try
- What is the weather in New York?
- What is the time in New York?
- What is the weather in Paris?
- What is the time in Paris?
## 🎉 Congratulations!
You've successfully created and interacted with your first agent using ADK!
______________________________________________________________________
## 🛣️ Next steps
- **Go to the tutorial**: Learn how to add memory, session, state to your agent: [tutorial](https://google.github.io/adk-docs/tutorials/index.md).
- **Delve into advanced configuration:** Explore the [setup](https://google.github.io/adk-docs/get-started/installation/index.md) section for deeper dives into project structure, configuration, and other interfaces.
- **Understand Core Concepts:** Learn about [agents concepts](https://google.github.io/adk-docs/agents/index.md).
# TypeScript Quickstart for ADK
This guide shows you how to get up and running with Agent Development Kit for TypeScript. Before you start, make sure you have the following installed:
- Node.js 24.13.0 or later
- Node Package Manager (npm) 11.8.0 or later
## Create an agent project
Create an empty `my-agent` directory for your project:
```text
my-agent/
```
Create this project structure from the command line.

macOS/Linux:

```bash
mkdir -p my-agent/
```

Windows:

```console
mkdir my-agent
```
### Configure project and dependencies
Use the `npm` tool to install and configure dependencies for your project, including the package file, ADK TypeScript main library, and developer tools. Run the following commands from your `my-agent/` directory to create the `package.json` file and install the project dependencies:
```console
cd my-agent/
# initialize a project as an ES module
npm init --yes
npm pkg set type="module"
npm pkg set main="agent.ts"
# install ADK libraries
npm install @google/adk
# install dev tools as a dev dependency
npm install -D @google/adk-devtools
```
### Define the agent code
Create the code for a basic agent, including a simple implementation of an ADK [Function Tool](/adk-docs/tools/function-tools/), called `getCurrentTime`. Create an `agent.ts` file in your project directory and add the following code:
my-agent/agent.ts
```typescript
import {FunctionTool, LlmAgent} from '@google/adk';
import {z} from 'zod';
/* Mock tool implementation */
const getCurrentTime = new FunctionTool({
name: 'get_current_time',
description: 'Returns the current time in a specified city.',
parameters: z.object({
city: z.string().describe("The name of the city for which to retrieve the current time."),
}),
execute: ({city}) => {
return {status: 'success', report: `The current time in ${city} is 10:30 AM`};
},
});
export const rootAgent = new LlmAgent({
name: 'hello_time_agent',
model: 'gemini-2.5-flash',
description: 'Tells the current time in a specified city.',
instruction: `You are a helpful assistant that tells the current time in a city.
Use the 'getCurrentTime' tool for this purpose.`,
tools: [getCurrentTime],
});
```
### Set your API key
This project uses the Gemini API, which requires an API key. If you don't already have Gemini API key, create a key in Google AI Studio on the [API Keys](https://aistudio.google.com/app/apikey) page.
In a terminal window, write your API key into the `.env` file of your project to set the environment variable:
Update: my-agent/.env
```bash
echo 'GEMINI_API_KEY="YOUR_API_KEY"' > .env
```
Using other AI models with ADK
ADK supports the use of many generative AI models. For more information on configuring other models in ADK agents, see [Models & Authentication](/adk-docs/agents/models).
## Run your agent
You can run your ADK agent with the `@google/adk-devtools` library as an interactive command-line interface using the `run` command or the ADK web user interface using the `web` command. Both these options allow you to test and interact with your agent.
### Run with command-line interface
Run your agent with the ADK TypeScript command-line interface tool using the following command:
```console
npx adk run agent.ts
```
### Run with web interface
Run your agent with the ADK web interface using the following command:
```console
npx adk web
```
This command starts a web server with a chat interface for your agent. You can access the web interface at `http://localhost:8000`. Select your agent from the dropdown and type a request.
Caution: ADK Web for development only
ADK Web is ***not meant for use in production deployments***. You should use ADK Web for development and debugging purposes only.
## Next: Build your agent
Now that you have ADK installed and your first agent running, try building your own agent with our build guides:
- [Build your agent](/adk-docs/tutorials/)
# Build a streaming agent
The Agent Development Kit (ADK) enables real-time, interactive experiences with your AI agents through streaming. This allows for features like live voice conversations, real-time tool use, and continuous updates from your agent.
This page provides quickstart examples to get you up and running with streaming capabilities in both Python and Java ADK.
- **Python ADK: Streaming agent**
______________________________________________________________________
This example demonstrates how to set up a basic streaming interaction with an agent using Python ADK. It typically involves using the `Runner.run_live()` method and handling asynchronous events.
[View Python Streaming Quickstart](https://google.github.io/adk-docs/get-started/streaming/quickstart-streaming/index.md)
- **Java ADK: Streaming agent**
______________________________________________________________________
This example demonstrates how to set up a basic streaming interaction with an agent using Java ADK. It involves using the `Runner.runLive()` method, a `LiveRequestQueue`, and handling the `Flowable` stream.
[View Java Streaming Quickstart](https://google.github.io/adk-docs/get-started/streaming/quickstart-streaming-java/index.md)
# Build a streaming agent with Java
This quickstart guide will walk you through the process of creating a basic agent and leveraging ADK Streaming with Java to facilitate low-latency, bidirectional voice interactions.
You'll begin by setting up your Java and Maven environment, structuring your project, and defining the necessary dependencies. Following this, you'll create a simple `ScienceTeacherAgent`, test its text-based streaming capabilities using the Dev UI, and then progress to enabling live audio communication, transforming your agent into an interactive voice-driven application.
## **Create your first agent**
### **Prerequisites**
- In this getting started guide, you will be programming in Java. Check that **Java** is installed on your machine. Ideally, you should be using Java 17 or later (you can check by running **java -version**).
- You’ll also be using the **Maven** build tool for Java. So be sure to have [Maven installed](https://maven.apache.org/install.html) on your machine before going further (this is the case for Cloud Top or Cloud Shell, but not necessarily for your laptop).
### **Prepare the project structure**
To get started with ADK Java, let’s create a Maven project with the following directory structure:
```text
adk-agents/
├── pom.xml
└── src/
    └── main/
        └── java/
            └── agents/
                └── ScienceTeacherAgent.java
```
Follow the instructions in [Installation](https://google.github.io/adk-docs/get-started/installation/index.md) page to add `pom.xml` for using the ADK package.
Note
Feel free to use whichever name you like for the root directory of your project (instead of adk-agents)
### **Running a compilation**
Let’s see if Maven is happy with this build by running a compilation (the **mvn compile** command):
```shell
$ mvn compile
[INFO] Scanning for projects...
[INFO]
[INFO] --------------------< adk-agents:adk-agents >--------------------
[INFO] Building adk-agents 1.0-SNAPSHOT
[INFO] from pom.xml
[INFO] --------------------------------[ jar ]---------------------------------
[INFO]
[INFO] --- resources:3.3.1:resources (default-resources) @ adk-demo ---
[INFO] skip non existing resourceDirectory /home/user/adk-demo/src/main/resources
[INFO]
[INFO] --- compiler:3.13.0:compile (default-compile) @ adk-demo ---
[INFO] Nothing to compile - all classes are up to date.
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 1.347 s
[INFO] Finished at: 2025-05-06T15:38:08Z
[INFO] ------------------------------------------------------------------------
```
Looks like the project is set up properly for compilation!
### **Creating an agent**
Create the **ScienceTeacherAgent.java** file under the `src/main/java/agents/` directory with the following content:
```java
package agents;
import com.google.adk.agents.BaseAgent;
import com.google.adk.agents.LlmAgent;
/** Science teacher agent. */
public class ScienceTeacherAgent {
// Field expected by the Dev UI to load the agent dynamically
// (the agent must be initialized at declaration time)
public static final BaseAgent ROOT_AGENT = initAgent();
// Please fill in the latest model id that supports live API from
// https://google.github.io/adk-docs/get-started/streaming/quickstart-streaming/#supported-models
public static BaseAgent initAgent() {
return LlmAgent.builder()
.name("science-app")
.description("Science teacher agent")
.model("...") // Please fill in the latest model id for live API
.instruction("""
You are a helpful science teacher that explains
science concepts to kids and teenagers.
""")
.build();
}
}
```
We will use `Dev UI` to run this agent later. For the tool to automatically recognize the agent, its Java class has to comply with the following two rules:
- The agent should be stored in a global **public static** variable named **ROOT_AGENT** of type **BaseAgent** and initialized at declaration time.
- The agent definition has to be a **static** method so it can be loaded during the class initialization by the dynamic compiling classloader.
## **Run agent with Dev UI**
`Dev UI` is a web server where you can quickly run and test your agents for development purposes, without building your own UI application for the agents.
### **Define environment variables**
To run the server, you’ll need to export two environment variables:
- a Gemini key that you can [get from AI Studio](https://ai.google.dev/gemini-api/docs/api-key),
- a variable to specify we’re not using Vertex AI this time.
```shell
export GOOGLE_GENAI_USE_VERTEXAI=FALSE
export GOOGLE_API_KEY=YOUR_API_KEY
```
### **Run Dev UI**
Run the following command from the terminal to launch the Dev UI.
terminal
```console
mvn exec:java \
-Dexec.mainClass="com.google.adk.web.AdkWebServer" \
-Dexec.args="--adk.agents.source-dir=." \
-Dexec.classpathScope="compile"
```
**Step 1:** Open the URL provided (usually `http://localhost:8080` or `http://127.0.0.1:8080`) directly in your browser.
**Step 2.** In the top-left corner of the UI, you can select your agent in the dropdown. Select "science-app".
Troubleshooting
If you do not see "science-app" in the dropdown menu, make sure you are running the `mvn` command from the root of your maven project.
Caution: ADK Web for development only
ADK Web is ***not meant for use in production deployments***. You should use ADK Web for development and debugging purposes only.
## Try Dev UI with voice and video
With your favorite browser, navigate to the Dev UI URL (usually `http://localhost:8080`).
You should see the following interface:
Click the microphone button to enable voice input, and ask a question like `What's the electron?` out loud. You will hear the answer in voice in real time.
To try video, reload the web browser, click the camera button to enable video input, and ask questions like "What do you see?". The agent will describe what it sees in the video input.
### Caveat
- You cannot use text chat with the native-audio models; entering text messages in `adk web` will produce errors.
### Stop the tool
Stop the tool by pressing `Ctrl-C` on the console.
## **Run agent with a custom live audio app**
Now, let's try audio streaming with the agent and a custom live audio application.
### **A Maven pom.xml build file for Live Audio**
Replace your existing pom.xml with the following.
```xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>com.google.adk.samples</groupId>
  <artifactId>google-adk-sample-live-audio</artifactId>
  <version>0.1.0</version>
  <name>Google ADK - Sample - Live Audio</name>
  <description>
    A sample application demonstrating a live audio conversation using ADK,
    runnable via samples.liveaudio.LiveAudioRun.
  </description>
  <packaging>jar</packaging>

  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <java.version>17</java.version>
    <auto-value.version>1.11.0</auto-value.version>
    <exec.mainClass>samples.liveaudio.LiveAudioRun</exec.mainClass>
    <google-adk.version>0.1.0</google-adk.version>
  </properties>

  <dependencyManagement>
    <dependencies>
      <dependency>
        <groupId>com.google.cloud</groupId>
        <artifactId>libraries-bom</artifactId>
        <version>26.53.0</version>
        <type>pom</type>
        <scope>import</scope>
      </dependency>
    </dependencies>
  </dependencyManagement>

  <dependencies>
    <dependency>
      <groupId>com.google.adk</groupId>
      <artifactId>google-adk</artifactId>
      <version>${google-adk.version}</version>
    </dependency>
    <dependency>
      <groupId>commons-logging</groupId>
      <artifactId>commons-logging</artifactId>
      <version>1.2</version>
    </dependency>
  </dependencies>

  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>3.13.0</version>
        <configuration>
          <source>${java.version}</source>
          <target>${java.version}</target>
          <parameters>true</parameters>
          <annotationProcessorPaths>
            <path>
              <groupId>com.google.auto.value</groupId>
              <artifactId>auto-value</artifactId>
              <version>${auto-value.version}</version>
            </path>
          </annotationProcessorPaths>
        </configuration>
      </plugin>
      <plugin>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>build-helper-maven-plugin</artifactId>
        <version>3.6.0</version>
        <executions>
          <execution>
            <id>add-source</id>
            <phase>generate-sources</phase>
            <goals>
              <goal>add-source</goal>
            </goals>
            <configuration>
              <sources>
                <source>.</source>
              </sources>
            </configuration>
          </execution>
        </executions>
      </plugin>
      <plugin>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>exec-maven-plugin</artifactId>
        <version>3.2.0</version>
        <configuration>
          <mainClass>${exec.mainClass}</mainClass>
          <classpathScope>runtime</classpathScope>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>
```
### **Creating Live Audio Run tool**
Create the **LiveAudioRun.java** file under the `src/main/java/` directory with the following content. This tool runs the agent with live audio input and output.
```java
package samples.liveaudio;
import com.google.adk.agents.LiveRequestQueue;
import com.google.adk.agents.RunConfig;
import com.google.adk.events.Event;
import com.google.adk.runner.Runner;
import com.google.adk.sessions.InMemorySessionService;
import com.google.common.collect.ImmutableList;
import com.google.genai.types.Blob;
import com.google.genai.types.Modality;
import com.google.genai.types.PrebuiltVoiceConfig;
import com.google.genai.types.Content;
import com.google.genai.types.Part;
import com.google.genai.types.SpeechConfig;
import com.google.genai.types.VoiceConfig;
import io.reactivex.rxjava3.core.Flowable;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.net.URL;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.LineUnavailableException;
import javax.sound.sampled.Mixer;
import javax.sound.sampled.SourceDataLine;
import javax.sound.sampled.TargetDataLine;
import java.util.UUID;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import agents.ScienceTeacherAgent;
/** Main class to demonstrate running the {@link LiveAudioAgent} for a voice conversation. */
public final class LiveAudioRun {
private final String userId;
private final String sessionId;
private final Runner runner;
private static final javax.sound.sampled.AudioFormat MIC_AUDIO_FORMAT =
new javax.sound.sampled.AudioFormat(16000.0f, 16, 1, true, false);
private static final javax.sound.sampled.AudioFormat SPEAKER_AUDIO_FORMAT =
new javax.sound.sampled.AudioFormat(24000.0f, 16, 1, true, false);
private static final int BUFFER_SIZE = 4096;
public LiveAudioRun() {
this.userId = "test_user";
String appName = "LiveAudioApp";
this.sessionId = UUID.randomUUID().toString();
InMemorySessionService sessionService = new InMemorySessionService();
this.runner = new Runner(ScienceTeacherAgent.ROOT_AGENT, appName, null, sessionService);
ConcurrentMap<String, Object> initialState = new ConcurrentHashMap<>();
var unused =
sessionService.createSession(appName, userId, initialState, sessionId).blockingGet();
}
private void runConversation() throws Exception {
System.out.println("Initializing microphone input and speaker output...");
RunConfig runConfig =
RunConfig.builder()
.setStreamingMode(RunConfig.StreamingMode.BIDI)
.setResponseModalities(ImmutableList.of(new Modality("AUDIO")))
.setSpeechConfig(
SpeechConfig.builder()
.voiceConfig(
VoiceConfig.builder()
.prebuiltVoiceConfig(
PrebuiltVoiceConfig.builder().voiceName("Aoede").build())
.build())
.languageCode("en-US")
.build())
.build();
LiveRequestQueue liveRequestQueue = new LiveRequestQueue();
Flowable<Event> eventStream =
this.runner.runLive(
runner.sessionService().createSession(userId, sessionId).blockingGet(),
liveRequestQueue,
runConfig);
AtomicBoolean isRunning = new AtomicBoolean(true);
AtomicBoolean conversationEnded = new AtomicBoolean(false);
ExecutorService executorService = Executors.newFixedThreadPool(2);
// Task for capturing microphone input
Future<?> microphoneTask =
executorService.submit(() -> captureAndSendMicrophoneAudio(liveRequestQueue, isRunning));
// Task for processing agent responses and playing audio
Future<?> outputTask =
executorService.submit(
() -> {
try {
processAudioOutput(eventStream, isRunning, conversationEnded);
} catch (Exception e) {
System.err.println("Error processing audio output: " + e.getMessage());
e.printStackTrace();
isRunning.set(false);
}
});
// Wait for user to press Enter to stop the conversation
System.out.println("Conversation started. Press Enter to stop...");
System.in.read();
System.out.println("Ending conversation...");
isRunning.set(false);
try {
// Give some time for ongoing processing to complete
microphoneTask.get(2, TimeUnit.SECONDS);
outputTask.get(2, TimeUnit.SECONDS);
} catch (Exception e) {
System.out.println("Stopping tasks...");
}
liveRequestQueue.close();
executorService.shutdownNow();
System.out.println("Conversation ended.");
}
private void captureAndSendMicrophoneAudio(
LiveRequestQueue liveRequestQueue, AtomicBoolean isRunning) {
TargetDataLine micLine = null;
try {
DataLine.Info info = new DataLine.Info(TargetDataLine.class, MIC_AUDIO_FORMAT);
if (!AudioSystem.isLineSupported(info)) {
System.err.println("Microphone line not supported!");
return;
}
micLine = (TargetDataLine) AudioSystem.getLine(info);
micLine.open(MIC_AUDIO_FORMAT);
micLine.start();
System.out.println("Microphone initialized. Start speaking...");
byte[] buffer = new byte[BUFFER_SIZE];
int bytesRead;
while (isRunning.get()) {
bytesRead = micLine.read(buffer, 0, buffer.length);
if (bytesRead > 0) {
byte[] audioChunk = new byte[bytesRead];
System.arraycopy(buffer, 0, audioChunk, 0, bytesRead);
Blob audioBlob = Blob.builder().data(audioChunk).mimeType("audio/pcm").build();
liveRequestQueue.realtime(audioBlob);
}
}
} catch (LineUnavailableException e) {
System.err.println("Error accessing microphone: " + e.getMessage());
e.printStackTrace();
} finally {
if (micLine != null) {
micLine.stop();
micLine.close();
}
}
}
private void processAudioOutput(
Flowable<Event> eventStream, AtomicBoolean isRunning, AtomicBoolean conversationEnded) {
SourceDataLine speakerLine = null;
try {
DataLine.Info info = new DataLine.Info(SourceDataLine.class, SPEAKER_AUDIO_FORMAT);
if (!AudioSystem.isLineSupported(info)) {
System.err.println("Speaker line not supported!");
return;
}
final SourceDataLine finalSpeakerLine = (SourceDataLine) AudioSystem.getLine(info);
finalSpeakerLine.open(SPEAKER_AUDIO_FORMAT);
finalSpeakerLine.start();
System.out.println("Speaker initialized.");
for (Event event : eventStream.blockingIterable()) {
if (!isRunning.get()) {
break;
}
AtomicBoolean audioReceived = new AtomicBoolean(false);
processEvent(event, audioReceived);
event.content().ifPresent(content -> content.parts().ifPresent(parts -> parts.forEach(part -> playAudioData(part, finalSpeakerLine))));
}
speakerLine = finalSpeakerLine; // Assign to outer variable for cleanup in finally block
} catch (LineUnavailableException e) {
System.err.println("Error accessing speaker: " + e.getMessage());
e.printStackTrace();
} finally {
if (speakerLine != null) {
speakerLine.drain();
speakerLine.stop();
speakerLine.close();
}
conversationEnded.set(true);
}
}
private void playAudioData(Part part, SourceDataLine speakerLine) {
part.inlineData()
.ifPresent(
inlineBlob ->
inlineBlob
.data()
.ifPresent(
audioBytes -> {
if (audioBytes.length > 0) {
System.out.printf(
"Playing audio (%s): %d bytes%n",
inlineBlob.mimeType(),
audioBytes.length);
speakerLine.write(audioBytes, 0, audioBytes.length);
}
}));
}
private void processEvent(Event event, AtomicBoolean audioReceived) {
event
.content()
.ifPresent(
content ->
content
.parts()
.ifPresent(parts -> parts.forEach(part -> logReceivedAudioData(part, audioReceived))));
}
private void logReceivedAudioData(Part part, AtomicBoolean audioReceived) {
part.inlineData()
.ifPresent(
inlineBlob ->
inlineBlob
.data()
.ifPresent(
audioBytes -> {
if (audioBytes.length > 0) {
System.out.printf(
" Audio (%s): received %d bytes.%n",
inlineBlob.mimeType(),
audioBytes.length);
audioReceived.set(true);
} else {
System.out.printf(
" Audio (%s): received empty audio data.%n",
inlineBlob.mimeType());
}
}));
}
public static void main(String[] args) throws Exception {
LiveAudioRun liveAudioRun = new LiveAudioRun();
liveAudioRun.runConversation();
System.out.println("Exiting Live Audio Run.");
}
}
```
### **Run the Live Audio Run tool**
To run the Live Audio Run tool, use the following command in the `adk-agents` directory:
```text
mvn compile exec:java
```
Then you should see:
```text
$ mvn compile exec:java
...
Initializing microphone input and speaker output...
Conversation started. Press Enter to stop...
Speaker initialized.
Microphone initialized. Start speaking...
```
With this message, the tool is ready to take voice input. Talk to the agent with a question like `What's an electron?`.
Caution
If the agent keeps speaking by itself and doesn't stop, try using earphones to suppress the echo.
## **Summary**
Streaming for ADK enables developers to create agents capable of low-latency, bidirectional voice and video communication, enhancing interactive experiences. The article demonstrates that text streaming is a built-in feature of ADK Agents, requiring no additional specific code, while also showcasing how to implement live audio conversations for real-time voice interaction with an agent. This allows for more natural and dynamic communication, as users can speak to and hear from the agent seamlessly.
# Build a streaming agent with Python
With this quickstart, you'll learn to create a simple agent and use ADK Streaming to enable low-latency, bidirectional voice and video communication with it. We will install ADK, set up a basic "Google Search" agent, try running the agent with streaming using the `adk web` tool, and then explain how to build a simple asynchronous web app yourself using ADK Streaming and [FastAPI](https://fastapi.tiangolo.com/).
**Note:** This guide assumes you have experience using a terminal on Windows, macOS, or Linux.
## Supported models for voice/video streaming
To use voice/video streaming in ADK, you will need Gemini models that support the Live API. You can find the **model ID(s)** that support the Gemini Live API in the documentation:
- [Google AI Studio: Gemini Live API](https://ai.google.dev/gemini-api/docs/models#live-api)
- [Vertex AI: Gemini Live API](https://cloud.google.com/vertex-ai/generative-ai/docs/live-api)
## 1. Setup Environment & Install ADK
Create & Activate Virtual Environment (Recommended):
```bash
# Create
python -m venv .venv
# Activate (each new terminal)
# macOS/Linux: source .venv/bin/activate
# Windows CMD: .venv\Scripts\activate.bat
# Windows PowerShell: .venv\Scripts\Activate.ps1
```
Install ADK:
```bash
pip install google-adk
```
## 2. Project Structure
Create the following folder structure with empty files:
```console
adk-streaming/  # Project folder
└── app/ # the web app folder
    ├── .env # Gemini API key
    └── google_search_agent/ # Agent folder
        ├── __init__.py # Python package
        └── agent.py # Agent definition
```
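If you prefer to script the setup, the same layout can be created with a short Python snippet (equivalent to making the folders and empty files by hand; the paths simply mirror the tree above):

```python
from pathlib import Path

# Mirror the quickstart layout: adk-streaming/app/google_search_agent/
base = Path("adk-streaming") / "app"
agent_dir = base / "google_search_agent"
agent_dir.mkdir(parents=True, exist_ok=True)

# Create the empty files that the following sections fill in.
for path in (base / ".env", agent_dir / "__init__.py", agent_dir / "agent.py"):
    path.touch()

print(sorted(p.name for p in agent_dir.iterdir()))  # ['__init__.py', 'agent.py']
```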
### agent.py
Copy-paste the following code block into the `agent.py` file.
For `model`, double-check the model ID as described earlier in the [Models section](#supported-models).
```py
from google.adk.agents import Agent
from google.adk.tools import google_search  # Import the tool

root_agent = Agent(
    # A unique name for the agent.
    name="basic_search_agent",
    # The Large Language Model (LLM) that the agent will use.
    # Please fill in the latest model ID that supports live from
    # https://google.github.io/adk-docs/get-started/streaming/quickstart-streaming/#supported-models
    model="...",
    # A short description of the agent's purpose.
    description="Agent to answer questions using Google Search.",
    # Instructions to set the agent's behavior.
    instruction="You are an expert researcher. You always stick to the facts.",
    # Add google_search tool to perform grounding with Google search.
    tools=[google_search]
)
```
`agent.py` is where all your agent(s)' logic will be stored, and you must have a `root_agent` defined.
Notice how easily you integrated [grounding with Google Search](https://ai.google.dev/gemini-api/docs/grounding?lang=python#configure-search) capabilities. The `Agent` class and the `google_search` tool handle the complex interactions with the LLM and grounding with the search API, allowing you to focus on the agent's *purpose* and *behavior*.
Copy-paste the following code block to `__init__.py` file.
__init__.py
```py
from . import agent
```
## 3. Set up the platform
To run the agent, choose a platform from either Google AI Studio or Google Cloud Vertex AI:
**Gemini - Google AI Studio**
1. Get an API key from [Google AI Studio](https://aistudio.google.com/apikey).
1. Open the **`.env`** file located inside the `app/` folder and copy-paste the following code.
.env
```text
GOOGLE_GENAI_USE_VERTEXAI=FALSE
GOOGLE_API_KEY=PASTE_YOUR_ACTUAL_API_KEY_HERE
```
1. Replace `PASTE_YOUR_ACTUAL_API_KEY_HERE` with your actual API key.
**Gemini - Google Cloud Vertex AI**
1. You need an existing [Google Cloud](https://cloud.google.com/?e=48754805&hl=en) account and a project.
- Set up a [Google Cloud project](https://cloud.google.com/vertex-ai/generative-ai/docs/start/quickstarts/quickstart-multimodal#setup-gcp)
- Set up the [gcloud CLI](https://cloud.google.com/vertex-ai/generative-ai/docs/start/quickstarts/quickstart-multimodal#setup-local)
- Authenticate to Google Cloud from the terminal by running `gcloud auth login`.
- [Enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).
1. Open the **`.env`** file located inside the `app/` folder. Copy-paste the following code and update the project ID and location.
.env
```text
GOOGLE_GENAI_USE_VERTEXAI=TRUE
GOOGLE_CLOUD_PROJECT=PASTE_YOUR_ACTUAL_PROJECT_ID
GOOGLE_CLOUD_LOCATION=us-central1
```
## 4. Try the agent with `adk web`
Now you're ready to try the agent. Run the following command to launch the **dev UI**. First, make sure to set the current directory to `app`:
```shell
cd app
```
Also, set the `SSL_CERT_FILE` environment variable with the following command. This is required for the voice and video tests later.
```bash
# macOS/Linux
export SSL_CERT_FILE=$(python -m certifi)
```
```powershell
# Windows PowerShell
$env:SSL_CERT_FILE = (python -m certifi)
```
Then, run the dev UI:
```shell
adk web
```
Note for Windows users
If you hit a `_make_subprocess_transport NotImplementedError`, consider using `adk web --no-reload` instead.
Caution: ADK Web for development only
ADK Web is ***not meant for use in production deployments***. You should use ADK Web for development and debugging purposes only.
Open the URL provided (usually `http://localhost:8000` or `http://127.0.0.1:8000`) **directly in your browser**. This connection stays entirely on your local machine. Select `google_search_agent`.
### Try with voice and video
To try with voice, reload the web browser, click the microphone button to enable voice input, and ask the following questions by voice. The agent will use the `google_search` tool to get the latest information to answer them, and you will hear the answers spoken back in real time.
- What is the weather in New York?
- What is the time in New York?
- What is the weather in Paris?
- What is the time in Paris?
To try with video, reload the web browser, click the camera button to enable video input, and ask questions like "What do you see?". The agent will describe what it sees in the video input.
#### Caveat
- You cannot use text chat with the native-audio models. You will see errors when entering text messages in `adk web`.
### Stop the tool
Stop `adk web` by pressing `Ctrl-C` on the console.
### Note on ADK Streaming
The following features will be supported in future versions of ADK Streaming: Callback, LongRunningTool, ExampleTool, and shell agents (e.g., SequentialAgent).
Congratulations! You've successfully created and interacted with your first Streaming agent using ADK!
## Next steps: build custom streaming app
The [Bidi-streaming development guide series](https://google.github.io/adk-docs/streaming/dev-guide/part1/index.md) gives an overview of the server and client code for a custom asynchronous web app built with ADK Streaming, enabling real-time, bidirectional audio and text communication.
# Build your agent with ADK
Get started with the Agent Development Kit (ADK) through our collection of practical guides. These tutorials are designed in a simple, progressive, step-by-step fashion, introducing you to different ADK features and capabilities.
This approach allows you to learn and build incrementally – starting with foundational concepts and gradually tackling more advanced agent development techniques. You'll explore how to apply these features effectively across various use cases, equipping you to build your own sophisticated agentic applications with ADK. Explore our collection below and happy building:
- **Multi-tool agent**
______________________________________________________________________
Create a workflow that uses multiple tools.
[Build a multi-tool agent](https://google.github.io/adk-docs/get-started/quickstart/index.md)
- **Agent team**
______________________________________________________________________
Build a multi-agent workflow including agent delegation, session management, and safety callbacks.
[Build an agent team](https://google.github.io/adk-docs/tutorials/agent-team/index.md)
- **Streaming agent**
______________________________________________________________________
Create an agent for handling streamed content.
[Build a streaming agent](https://google.github.io/adk-docs/get-started/streaming/index.md)
- **Discover sample agents**
______________________________________________________________________
Discover sample agents for retail, travel, customer service, and more!
[Discover adk-samples](https://github.com/google/adk-samples)
# Build Your First Intelligent Agent Team: A Progressive Weather Bot with ADK
Share to:
This tutorial extends from the [Quickstart example](https://google.github.io/adk-docs/get-started/quickstart/) for [Agent Development Kit](https://google.github.io/adk-docs/get-started/). Now, you're ready to dive deeper and construct a more sophisticated, **multi-agent system**.
We'll embark on building a **Weather Bot agent team**, progressively layering advanced features onto a simple foundation. Starting with a single agent that can look up weather, we will incrementally add capabilities like:
- Leveraging different AI models (Gemini, GPT, Claude).
- Designing specialized sub-agents for distinct tasks (like greetings and farewells).
- Enabling intelligent delegation between agents.
- Giving agents memory using persistent session state.
- Implementing crucial safety guardrails using callbacks.
**Why a Weather Bot Team?**
This use case, while seemingly simple, provides a practical and relatable canvas to explore core ADK concepts essential for building complex, real-world agentic applications. You'll learn how to structure interactions, manage state, ensure safety, and orchestrate multiple AI "brains" working together.
**What is ADK Again?**
As a reminder, ADK is a Python framework designed to streamline the development of applications powered by Large Language Models (LLMs). It offers robust building blocks for creating agents that can reason, plan, utilize tools, interact dynamically with users, and collaborate effectively within a team.
**In this advanced tutorial, you will master:**
- ✅ **Tool Definition & Usage:** Crafting Python functions (`tools`) that grant agents specific abilities (like fetching data) and instructing agents on how to use them effectively.
- ✅ **Multi-LLM Flexibility:** Configuring agents to utilize various leading LLMs (Gemini, GPT-4o, Claude Sonnet) via LiteLLM integration, allowing you to choose the best model for each task.
- ✅ **Agent Delegation & Collaboration:** Designing specialized sub-agents and enabling automatic routing (`auto flow`) of user requests to the most appropriate agent within a team.
- ✅ **Session State for Memory:** Utilizing `Session State` and `ToolContext` to enable agents to remember information across conversational turns, leading to more contextual interactions.
- ✅ **Safety Guardrails with Callbacks:** Implementing `before_model_callback` and `before_tool_callback` to inspect, modify, or block requests/tool usage based on predefined rules, enhancing application safety and control.
**End State Expectation:**
By completing this tutorial, you will have built a functional multi-agent Weather Bot system. This system will not only provide weather information but also handle conversational niceties, remember the last city checked, and operate within defined safety boundaries, all orchestrated using ADK.
**Prerequisites:**
- ✅ **Solid understanding of Python programming.**
- ✅ **Familiarity with Large Language Models (LLMs), APIs, and the concept of agents.**
- ❗ **Crucially: Completion of the ADK Quickstart tutorial(s) or equivalent foundational knowledge of ADK basics (Agent, Runner, SessionService, basic Tool usage).** This tutorial builds directly upon those concepts.
- ✅ **API Keys** for the LLMs you intend to use (e.g., Google AI Studio for Gemini, OpenAI Platform, Anthropic Console).
______________________________________________________________________
**Note on Execution Environment:**
This tutorial is structured for interactive notebook environments like Google Colab, Colab Enterprise, or Jupyter notebooks. Please keep the following in mind:
- **Running Async Code:** Notebook environments handle asynchronous code differently. You'll see examples using `await` (suitable when an event loop is already running, common in notebooks) or `asyncio.run()` (often needed when running as a standalone `.py` script or in specific notebook setups). The code blocks provide guidance for both scenarios.
- **Manual Runner/Session Setup:** The steps involve explicitly creating `Runner` and `SessionService` instances. This approach is shown because it gives you fine-grained control over the agent's execution lifecycle, session management, and state persistence.
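The two execution patterns above can be sketched with a plain coroutine (the `main` function here is a placeholder for your agent calls, not an ADK API):

```python
import asyncio

async def main() -> str:
    # Stand-in for awaiting agent calls; any awaitable work goes here.
    await asyncio.sleep(0)
    return "done"

# Standalone .py script: no event loop is running yet, so start one.
result = asyncio.run(main())
print(result)  # done

# Notebook cell (Colab/Jupyter): an event loop is already running,
# so asyncio.run() would raise; instead, await directly at top level:
#   result = await main()
```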
**Alternative: Using ADK's Built-in Tools (Web UI / CLI / API Server)**
If you prefer a setup that handles the runner and session management automatically using ADK's standard tools, you can find the equivalent code structured for that purpose [here](https://github.com/google/adk-docs/tree/main/examples/python/tutorial/agent_team/adk-tutorial). That version is designed to be run directly with commands like `adk web` (for a web UI), `adk run` (for CLI interaction), or `adk api_server` (to expose an API). Please follow the `README.md` instructions provided in that alternative resource.
______________________________________________________________________
**Ready to build your agent team? Let's dive in!**
> **Note:** This tutorial works with ADK version 1.0.0 and above.
```python
# @title Step 0: Setup and Installation
# Install ADK and LiteLLM for multi-model support
!pip install google-adk -q
!pip install litellm -q
print("Installation complete.")
```
```python
# @title Import necessary libraries
import os
import asyncio
from google.adk.agents import Agent
from google.adk.models.lite_llm import LiteLlm # For multi-model support
from google.adk.sessions import InMemorySessionService
from google.adk.runners import Runner
from google.genai import types # For creating message Content/Parts
import warnings
# Ignore all warnings
warnings.filterwarnings("ignore")
import logging
logging.basicConfig(level=logging.ERROR)
print("Libraries imported.")
```
```python
# @title Configure API Keys (Replace with your actual keys!)
# --- IMPORTANT: Replace placeholders with your real API keys ---
# Gemini API Key (Get from Google AI Studio: https://aistudio.google.com/app/apikey)
os.environ["GOOGLE_API_KEY"] = "YOUR_GOOGLE_API_KEY" # <--- REPLACE
# [Optional]
# OpenAI API Key (Get from OpenAI Platform: https://platform.openai.com/api-keys)
os.environ['OPENAI_API_KEY'] = 'YOUR_OPENAI_API_KEY' # <--- REPLACE
# [Optional]
# Anthropic API Key (Get from Anthropic Console: https://console.anthropic.com/settings/keys)
os.environ['ANTHROPIC_API_KEY'] = 'YOUR_ANTHROPIC_API_KEY' # <--- REPLACE
# --- Verify Keys (Optional Check) ---
print("API Keys Set:")
print(f"Google API Key set: {'Yes' if os.environ.get('GOOGLE_API_KEY') and os.environ['GOOGLE_API_KEY'] != 'YOUR_GOOGLE_API_KEY' else 'No (REPLACE PLACEHOLDER!)'}")
print(f"OpenAI API Key set: {'Yes' if os.environ.get('OPENAI_API_KEY') and os.environ['OPENAI_API_KEY'] != 'YOUR_OPENAI_API_KEY' else 'No (REPLACE PLACEHOLDER!)'}")
print(f"Anthropic API Key set: {'Yes' if os.environ.get('ANTHROPIC_API_KEY') and os.environ['ANTHROPIC_API_KEY'] != 'YOUR_ANTHROPIC_API_KEY' else 'No (REPLACE PLACEHOLDER!)'}")
# Configure ADK to use API keys directly (not Vertex AI for this multi-model setup)
os.environ["GOOGLE_GENAI_USE_VERTEXAI"] = "False"
# @markdown **Security Note:** It's best practice to manage API keys securely (e.g., using Colab Secrets or environment variables) rather than hardcoding them directly in the notebook. Replace the placeholder strings above.
```
```python
# --- Define Model Constants for easier use ---
# More supported models can be referenced here: https://ai.google.dev/gemini-api/docs/models#model-variations
MODEL_GEMINI_2_5_FLASH = "gemini-2.5-flash"
# More supported models can be referenced here: https://docs.litellm.ai/docs/providers/openai#openai-chat-completion-models
MODEL_GPT_4O = "openai/gpt-4.1" # You can also try: gpt-4.1-mini, gpt-4o etc.
# More supported models can be referenced here: https://docs.litellm.ai/docs/providers/anthropic
MODEL_CLAUDE_SONNET = "anthropic/claude-sonnet-4-20250514" # You can also try: claude-opus-4-20250514 , claude-3-7-sonnet-20250219 etc
print("\nEnvironment configured.")
```
______________________________________________________________________
## Step 1: Your First Agent - Basic Weather Lookup
Let's begin by building the fundamental component of our Weather Bot: a single agent capable of performing a specific task – looking up weather information. This involves creating two core pieces:
1. **A Tool:** A Python function that equips the agent with the *ability* to fetch weather data.
1. **An Agent:** The AI "brain" that understands the user's request, knows it has a weather tool, and decides when and how to use it.
______________________________________________________________________
**1. Define the Tool (`get_weather`)**
In ADK, **Tools** are the building blocks that give agents concrete capabilities beyond just text generation. They are typically regular Python functions that perform specific actions, like calling an API, querying a database, or performing calculations.
Our first tool will provide a *mock* weather report. This allows us to focus on the agent structure without needing external API keys yet. Later, you could easily swap this mock function with one that calls a real weather service.
**Key Concept: Docstrings are Crucial!** The agent's LLM relies heavily on the function's **docstring** to understand:
- *What* the tool does.
- *When* to use it.
- *What arguments* it requires (`city: str`).
- *What information* it returns.
**Best Practice:** Write clear, descriptive, and accurate docstrings for your tools. This is essential for the LLM to use the tool correctly.
```python
# @title Define the get_weather Tool
def get_weather(city: str) -> dict:
    """Retrieves the current weather report for a specified city.

    Args:
        city (str): The name of the city (e.g., "New York", "London", "Tokyo").

    Returns:
        dict: A dictionary containing the weather information.
              Includes a 'status' key ('success' or 'error').
              If 'success', includes a 'report' key with weather details.
              If 'error', includes an 'error_message' key.
    """
    print(f"--- Tool: get_weather called for city: {city} ---")  # Log tool execution
    city_normalized = city.lower().replace(" ", "")  # Basic normalization

    # Mock weather data
    mock_weather_db = {
        "newyork": {"status": "success", "report": "The weather in New York is sunny with a temperature of 25°C."},
        "london": {"status": "success", "report": "It's cloudy in London with a temperature of 15°C."},
        "tokyo": {"status": "success", "report": "Tokyo is experiencing light rain and a temperature of 18°C."},
    }

    if city_normalized in mock_weather_db:
        return mock_weather_db[city_normalized]
    else:
        return {"status": "error", "error_message": f"Sorry, I don't have weather information for '{city}'."}

# Example tool usage (optional test)
print(get_weather("New York"))
print(get_weather("Paris"))
```
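As a sketch of that later swap, here is a hypothetical `get_weather_live` that keeps the same `status`/`report`/`error_message` contract but delegates the actual lookup to an injectable `fetch` function. The provider response keys (`temp_c`, `condition`) are invented for illustration and would need to match your real weather API:

```python
from typing import Callable

def get_weather_live(city: str, fetch: Callable[[str], dict]) -> dict:
    """Same contract as the mock get_weather, but backed by a fetch function.

    `fetch` is whatever calls your weather provider (requests, httpx, ...)
    and returns the provider's JSON as a dict. Keeping it injectable makes
    the tool easy to unit-test without network access.
    """
    try:
        data = fetch(city)
        # Assumed provider shape: {"temp_c": 21, "condition": "Sunny"}.
        # Adapt the key names to your real provider's response.
        report = (f"The weather in {city} is {data['condition'].lower()} "
                  f"with a temperature of {data['temp_c']}°C.")
        return {"status": "success", "report": report}
    except Exception as e:
        return {"status": "error", "error_message": f"Could not fetch weather for '{city}': {e}"}

# Offline demo with a stub fetcher standing in for a real HTTP call:
def fake_fetch(city: str) -> dict:
    return {"temp_c": 21, "condition": "Sunny"}

print(get_weather_live("London", fake_fetch))
```

Because ADK derives a tool's schema from the function signature, in a real agent you would hide `fetch` behind a default argument or closure so the LLM only sees the `city` parameter.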
______________________________________________________________________
**2. Define the Agent (`weather_agent`)**
Now, let's create the **Agent** itself. An `Agent` in ADK orchestrates the interaction between the user, the LLM, and the available tools.
We configure it with several key parameters:
- `name`: A unique identifier for this agent (e.g., "weather_agent_v1").
- `model`: Specifies which LLM to use (e.g., `MODEL_GEMINI_2_5_FLASH`). We'll start with a specific Gemini model.
- `description`: A concise summary of the agent's overall purpose. This becomes crucial later when other agents need to decide whether to delegate tasks to *this* agent.
- `instruction`: Detailed guidance for the LLM on how to behave, its persona, its goals, and specifically *how and when* to utilize its assigned `tools`.
- `tools`: A list containing the actual Python tool functions the agent is allowed to use (e.g., `[get_weather]`).
**Best Practice:** Provide clear and specific `instruction` prompts. The more detailed the instructions, the better the LLM can understand its role and how to use its tools effectively. Be explicit about error handling if needed.
**Best Practice:** Choose descriptive `name` and `description` values. These are used internally by ADK and are vital for features like automatic delegation (covered later).
```python
# @title Define the Weather Agent
# Use one of the model constants defined earlier
AGENT_MODEL = MODEL_GEMINI_2_5_FLASH # Starting with Gemini
weather_agent = Agent(
    name="weather_agent_v1",
    model=AGENT_MODEL,  # Can be a string for Gemini or a LiteLlm object
    description="Provides weather information for specific cities.",
    instruction="You are a helpful weather assistant. "
                "When the user asks for the weather in a specific city, "
                "use the 'get_weather' tool to find the information. "
                "If the tool returns an error, inform the user politely. "
                "If the tool is successful, present the weather report clearly.",
    tools=[get_weather],  # Pass the function directly
)
print(f"Agent '{weather_agent.name}' created using model '{AGENT_MODEL}'.")
```
______________________________________________________________________
**3. Setup Runner and Session Service**
To manage conversations and execute the agent, we need two more components:
- `SessionService`: Responsible for managing conversation history and state for different users and sessions. The `InMemorySessionService` is a simple implementation that stores everything in memory, suitable for testing and simple applications. It keeps track of the messages exchanged. We'll explore state persistence more in Step 4.
- `Runner`: The engine that orchestrates the interaction flow. It takes user input, routes it to the appropriate agent, manages calls to the LLM and tools based on the agent's logic, handles session updates via the `SessionService`, and yields events representing the progress of the interaction.
```python
# @title Setup Session Service and Runner
# --- Session Management ---
# Key Concept: SessionService stores conversation history & state.
# InMemorySessionService is simple, non-persistent storage for this tutorial.
session_service = InMemorySessionService()
# Define constants for identifying the interaction context
APP_NAME = "weather_tutorial_app"
USER_ID = "user_1"
SESSION_ID = "session_001" # Using a fixed ID for simplicity
# Create the specific session where the conversation will happen
session = await session_service.create_session(
    app_name=APP_NAME,
    user_id=USER_ID,
    session_id=SESSION_ID
)
print(f"Session created: App='{APP_NAME}', User='{USER_ID}', Session='{SESSION_ID}'")

# --- OR ---
# Uncomment the following lines if running as a standard Python script (.py file):
# async def init_session(app_name: str, user_id: str, session_id: str):
#     session = await session_service.create_session(
#         app_name=app_name,
#         user_id=user_id,
#         session_id=session_id
#     )
#     print(f"Session created: App='{app_name}', User='{user_id}', Session='{session_id}'")
#     return session
#
# session = asyncio.run(init_session(APP_NAME, USER_ID, SESSION_ID))

# --- Runner ---
# Key Concept: Runner orchestrates the agent execution loop.
runner = Runner(
    agent=weather_agent,  # The agent we want to run
    app_name=APP_NAME,  # Associates runs with our app
    session_service=session_service  # Uses our session manager
)
print(f"Runner created for agent '{runner.agent.name}'.")
```
______________________________________________________________________
**4. Interact with the Agent**
We need a way to send messages to our agent and receive its responses. Since LLM calls and tool executions can take time, ADK's `Runner` operates asynchronously.
We'll define an `async` helper function (`call_agent_async`) that:
1. Takes a user query string.
1. Packages it into the ADK `Content` format.
1. Calls `runner.run_async`, providing the user/session context and the new message.
1. Iterates through the **Events** yielded by the runner. Events represent steps in the agent's execution (e.g., tool call requested, tool result received, intermediate LLM thought, final response).
1. Identifies and prints the **final response** event using `event.is_final_response()`.
**Why `async`?** Interactions with LLMs and potentially tools (like external APIs) are I/O-bound operations. Using `asyncio` allows the program to handle these operations efficiently without blocking execution.
```python
# @title Define Agent Interaction Function
from google.genai import types # For creating message Content/Parts
async def call_agent_async(query: str, runner, user_id, session_id):
    """Sends a query to the agent and prints the final response."""
    print(f"\n>>> User Query: {query}")

    # Prepare the user's message in ADK format
    content = types.Content(role='user', parts=[types.Part(text=query)])

    final_response_text = "Agent did not produce a final response."  # Default

    # Key Concept: run_async executes the agent logic and yields Events.
    # We iterate through events to find the final answer.
    async for event in runner.run_async(user_id=user_id, session_id=session_id, new_message=content):
        # You can uncomment the line below to see *all* events during execution
        # print(f"  [Event] Author: {event.author}, Type: {type(event).__name__}, Final: {event.is_final_response()}, Content: {event.content}")

        # Key Concept: is_final_response() marks the concluding message for the turn.
        if event.is_final_response():
            if event.content and event.content.parts:
                # Assuming text response in the first part
                final_response_text = event.content.parts[0].text
            elif event.actions and event.actions.escalate:  # Handle potential errors/escalations
                final_response_text = f"Agent escalated: {event.error_message or 'No specific message.'}"
            # Add more checks here if needed (e.g., specific error codes)
            break  # Stop processing events once the final response is found

    print(f"<<< Agent Response: {final_response_text}")
```
______________________________________________________________________
**5. Run the Conversation**
Finally, let's test our setup by sending a few queries to the agent. We wrap our `async` calls in a main `async` function and run it using `await`.
Watch the output:
- See the user queries.
- Notice the `--- Tool: get_weather called... ---` logs when the agent uses the tool.
- Observe the agent's final responses, including how it handles the case where weather data isn't available (for Paris).
```python
# @title Run the Initial Conversation
# We need an async function to await our interaction helper
async def run_conversation():
    await call_agent_async("What is the weather like in London?",
                           runner=runner,
                           user_id=USER_ID,
                           session_id=SESSION_ID)

    await call_agent_async("How about Paris?",
                           runner=runner,
                           user_id=USER_ID,
                           session_id=SESSION_ID)  # Expecting the tool's error message

    await call_agent_async("Tell me the weather in New York",
                           runner=runner,
                           user_id=USER_ID,
                           session_id=SESSION_ID)

# Execute the conversation using await in an async context (like Colab/Jupyter)
await run_conversation()

# --- OR ---

# Uncomment the following lines if running as a standard Python script (.py file):
# import asyncio
# if __name__ == "__main__":
#     try:
#         asyncio.run(run_conversation())
#     except Exception as e:
#         print(f"An error occurred: {e}")
```
______________________________________________________________________
Congratulations! You've successfully built and interacted with your first ADK agent. It understands the user's request, uses a tool to find information, and responds appropriately based on the tool's result.
In the next step, we'll explore how to easily switch the underlying Language Model powering this agent.
## Step 2: Going Multi-Model with LiteLLM [Optional]
In Step 1, we built a functional Weather Agent powered by a specific Gemini model. While effective, real-world applications often benefit from the flexibility to use *different* Large Language Models (LLMs). Why?
- **Performance:** Some models excel at specific tasks (e.g., coding, reasoning, creative writing).
- **Cost:** Different models have varying price points.
- **Capabilities:** Models offer diverse features, context window sizes, and fine-tuning options.
- **Availability/Redundancy:** Having alternatives ensures your application remains functional even if one provider experiences issues.
ADK makes switching between models seamless through its integration with the [**LiteLLM**](https://github.com/BerriAI/litellm) library. LiteLLM acts as a consistent interface to over 100 different LLMs.
**In this step, we will:**
1. Learn how to configure an ADK `Agent` to use models from providers like OpenAI (GPT) and Anthropic (Claude) using the `LiteLlm` wrapper.
1. Define, configure (with their own sessions and runners), and immediately test instances of our Weather Agent, each backed by a different LLM.
1. Interact with these different agents to observe potential variations in their responses, even when using the same underlying tool.
______________________________________________________________________
**1. Import `LiteLlm`**
We imported this during the initial setup (Step 0), but it's the key component for multi-model support:
```python
# @title 1. Import LiteLlm
from google.adk.models.lite_llm import LiteLlm
```
**2. Define and Test Multi-Model Agents**
Instead of passing only a model name string (which defaults to Google's Gemini models), we wrap the desired model identifier string within the `LiteLlm` class.
- **Key Concept: `LiteLlm` Wrapper:** The `LiteLlm(model="provider/model_name")` syntax tells ADK to route requests for this agent through the LiteLLM library to the specified model provider.
Make sure you have configured the necessary API keys for OpenAI and Anthropic in Step 0. We'll use the `call_agent_async` function (defined earlier, which now accepts `runner`, `user_id`, and `session_id`) to interact with each agent immediately after its setup.
Each block below will:
- Define the agent using a specific LiteLLM model (`MODEL_GPT_4O` or `MODEL_CLAUDE_SONNET`).
- Create a *new, separate* `InMemorySessionService` and session specifically for that agent's test run. This keeps the conversation histories isolated for this demonstration.
- Create a `Runner` configured for the specific agent and its session service.
- Immediately call `call_agent_async` to send a query and test the agent.
**Best Practice:** Use constants for model names (like `MODEL_GPT_4O`, `MODEL_CLAUDE_SONNET` defined in Step 0) to avoid typos and make code easier to manage.
**Error Handling:** We wrap the agent definitions in `try...except` blocks. This prevents the entire code cell from failing if an API key for a specific provider is missing or invalid, allowing the tutorial to proceed with the models that *are* configured.
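A stripped-down sketch of this guard pattern (using a hypothetical `create_agent` factory in place of the real `Agent` class) looks like:

```python
def create_agent(api_key: str) -> dict:
    """Stand-in for the real Agent(...) constructor: fails if the key is missing."""
    if not api_key:
        raise ValueError("missing API key")
    return {"name": "weather_agent_gpt", "key": api_key}

agent = None  # Initialize to None so a failed creation is detectable later
try:
    agent = create_agent(api_key="")  # Simulate an unconfigured provider
except Exception as e:
    print(f"❌ Could not create agent: {e}")

# Later cells check the variable instead of crashing:
if agent is None:
    print("Skipping tests for this provider.")
```

Because the variable is initialized to `None` before the `try`, any later cell can test it safely even when creation failed.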
First, let's create and test the agent using OpenAI's GPT-4o.
```python
# @title Define and Test GPT Agent
# Make sure 'get_weather' function from Step 1 is defined in your environment.
# Make sure 'call_agent_async' is defined from earlier.

# --- Agent using GPT-4o ---
weather_agent_gpt = None  # Initialize to None
runner_gpt = None         # Initialize runner to None

try:
    weather_agent_gpt = Agent(
        name="weather_agent_gpt",
        # Key change: Wrap the LiteLLM model identifier
        model=LiteLlm(model=MODEL_GPT_4O),
        description="Provides weather information (using GPT-4o).",
        instruction="You are a helpful weather assistant powered by GPT-4o. "
                    "Use the 'get_weather' tool for city weather requests. "
                    "Clearly present successful reports or polite error messages based on the tool's output status.",
        tools=[get_weather],  # Re-use the same tool
    )
    print(f"Agent '{weather_agent_gpt.name}' created using model '{MODEL_GPT_4O}'.")

    # InMemorySessionService is simple, non-persistent storage for this tutorial.
    session_service_gpt = InMemorySessionService()  # Create a dedicated service

    # Define constants for identifying the interaction context
    APP_NAME_GPT = "weather_tutorial_app_gpt"  # Unique app name for this test
    USER_ID_GPT = "user_1_gpt"
    SESSION_ID_GPT = "session_001_gpt"  # Using a fixed ID for simplicity

    # Create the specific session where the conversation will happen
    session_gpt = await session_service_gpt.create_session(
        app_name=APP_NAME_GPT,
        user_id=USER_ID_GPT,
        session_id=SESSION_ID_GPT
    )
    print(f"Session created: App='{APP_NAME_GPT}', User='{USER_ID_GPT}', Session='{SESSION_ID_GPT}'")

    # Create a runner specific to this agent and its session service
    runner_gpt = Runner(
        agent=weather_agent_gpt,
        app_name=APP_NAME_GPT,               # Use the specific app name
        session_service=session_service_gpt  # Use the specific session service
    )
    print(f"Runner created for agent '{runner_gpt.agent.name}'.")

    # --- Test the GPT Agent ---
    print("\n--- Testing GPT Agent ---")
    # Ensure call_agent_async uses the correct runner, user_id, session_id
    await call_agent_async(query = "What's the weather in Tokyo?",
                           runner=runner_gpt,
                           user_id=USER_ID_GPT,
                           session_id=SESSION_ID_GPT)
    # --- OR ---

    # Uncomment the following lines if running as a standard Python script (.py file):
    # import asyncio
    # if __name__ == "__main__":
    #     try:
    #         asyncio.run(call_agent_async(query = "What's the weather in Tokyo?",
    #                                      runner=runner_gpt,
    #                                      user_id=USER_ID_GPT,
    #                                      session_id=SESSION_ID_GPT))
    #     except Exception as e:
    #         print(f"An error occurred: {e}")

except Exception as e:
    print(f"❌ Could not create or run GPT agent '{MODEL_GPT_4O}'. Check API Key and model name. Error: {e}")
```
Next, we'll do the same for Anthropic's Claude Sonnet.
```python
# @title Define and Test Claude Agent
# Make sure 'get_weather' function from Step 1 is defined in your environment.
# Make sure 'call_agent_async' is defined from earlier.

# --- Agent using Claude Sonnet ---
weather_agent_claude = None  # Initialize to None
runner_claude = None         # Initialize runner to None

try:
    weather_agent_claude = Agent(
        name="weather_agent_claude",
        # Key change: Wrap the LiteLLM model identifier
        model=LiteLlm(model=MODEL_CLAUDE_SONNET),
        description="Provides weather information (using Claude Sonnet).",
        instruction="You are a helpful weather assistant powered by Claude Sonnet. "
                    "Use the 'get_weather' tool for city weather requests. "
                    "Analyze the tool's dictionary output ('status', 'report'/'error_message'). "
                    "Clearly present successful reports or polite error messages.",
        tools=[get_weather],  # Re-use the same tool
    )
    print(f"Agent '{weather_agent_claude.name}' created using model '{MODEL_CLAUDE_SONNET}'.")

    # InMemorySessionService is simple, non-persistent storage for this tutorial.
    session_service_claude = InMemorySessionService()  # Create a dedicated service

    # Define constants for identifying the interaction context
    APP_NAME_CLAUDE = "weather_tutorial_app_claude"  # Unique app name
    USER_ID_CLAUDE = "user_1_claude"
    SESSION_ID_CLAUDE = "session_001_claude"  # Using a fixed ID for simplicity

    # Create the specific session where the conversation will happen
    session_claude = await session_service_claude.create_session(
        app_name=APP_NAME_CLAUDE,
        user_id=USER_ID_CLAUDE,
        session_id=SESSION_ID_CLAUDE
    )
    print(f"Session created: App='{APP_NAME_CLAUDE}', User='{USER_ID_CLAUDE}', Session='{SESSION_ID_CLAUDE}'")

    # Create a runner specific to this agent and its session service
    runner_claude = Runner(
        agent=weather_agent_claude,
        app_name=APP_NAME_CLAUDE,               # Use the specific app name
        session_service=session_service_claude  # Use the specific session service
    )
    print(f"Runner created for agent '{runner_claude.agent.name}'.")

    # --- Test the Claude Agent ---
    print("\n--- Testing Claude Agent ---")
    # Ensure call_agent_async uses the correct runner, user_id, session_id
    await call_agent_async(query = "Weather in London please.",
                           runner=runner_claude,
                           user_id=USER_ID_CLAUDE,
                           session_id=SESSION_ID_CLAUDE)
    # --- OR ---

    # Uncomment the following lines if running as a standard Python script (.py file):
    # import asyncio
    # if __name__ == "__main__":
    #     try:
    #         asyncio.run(call_agent_async(query = "Weather in London please.",
    #                                      runner=runner_claude,
    #                                      user_id=USER_ID_CLAUDE,
    #                                      session_id=SESSION_ID_CLAUDE))
    #     except Exception as e:
    #         print(f"An error occurred: {e}")

except Exception as e:
    print(f"❌ Could not create or run Claude agent '{MODEL_CLAUDE_SONNET}'. Check API Key and model name. Error: {e}")
```
Observe the output carefully from both code blocks. You should see:
1. Each agent (`weather_agent_gpt`, `weather_agent_claude`) is created successfully (if API keys are valid).
1. A dedicated session and runner are set up for each.
1. Each agent correctly identifies the need to use the `get_weather` tool when processing the query (you'll see the `--- Tool: get_weather called... ---` log).
1. The *underlying tool logic* remains identical, always returning our mock data.
1. However, the **final textual response** generated by each agent might differ slightly in phrasing, tone, or formatting. This is because the instruction prompt is interpreted and executed by different LLMs (GPT-4o vs. Claude Sonnet).
This step demonstrates the power and flexibility ADK + LiteLLM provide. You can easily experiment with and deploy agents using various LLMs while keeping your core application logic (tools, fundamental agent structure) consistent.
In the next step, we'll move beyond a single agent and build a small team where agents can delegate tasks to each other!
______________________________________________________________________
## Step 3: Building an Agent Team - Delegation for Greetings & Farewells
In Steps 1 and 2, we built and experimented with a single agent focused solely on weather lookups. While effective for its specific task, real-world applications often involve handling a wider variety of user interactions. We *could* keep adding more tools and complex instructions to our single weather agent, but this can quickly become unmanageable and less efficient.
A more robust approach is to build an **Agent Team**. This involves:
1. Creating multiple, **specialized agents**, each designed for a specific capability (e.g., one for weather, one for greetings, one for calculations).
1. Designating a **root agent** (or orchestrator) that receives the initial user request.
1. Enabling the root agent to **delegate** the request to the most appropriate specialized sub-agent based on the user's intent.
**Why build an Agent Team?**
- **Modularity:** Easier to develop, test, and maintain individual agents.
- **Specialization:** Each agent can be fine-tuned (instructions, model choice) for its specific task.
- **Scalability:** Simpler to add new capabilities by adding new agents.
- **Efficiency:** Allows using potentially simpler/cheaper models for simpler tasks (like greetings).
**In this step, we will:**
1. Define simple tools for handling greetings (`say_hello`) and farewells (`say_goodbye`).
1. Create two new specialized sub-agents: `greeting_agent` and `farewell_agent`.
1. Update our main weather agent (`weather_agent_v2`) to act as the **root agent**.
1. Configure the root agent with its sub-agents, enabling **automatic delegation**.
1. Test the delegation flow by sending different types of requests to the root agent.
______________________________________________________________________
**1. Define Tools for Sub-Agents**
First, let's create the simple Python functions that will serve as tools for our new specialist agents. Remember, clear docstrings are vital for the agents that will use them.
```python
# @title Define Tools for Greeting and Farewell Agents
from typing import Optional  # Make sure to import Optional

# Ensure 'get_weather' from Step 1 is available if running this step independently.
# def get_weather(city: str) -> dict: ... (from Step 1)

def say_hello(name: Optional[str] = None) -> str:
    """Provides a simple greeting. If a name is provided, it will be used.

    Args:
        name (str, optional): The name of the person to greet. Defaults to a generic greeting if not provided.

    Returns:
        str: A friendly greeting message.
    """
    if name:
        greeting = f"Hello, {name}!"
        print(f"--- Tool: say_hello called with name: {name} ---")
    else:
        greeting = "Hello there!"  # Default greeting if name is None or not explicitly passed
        print(f"--- Tool: say_hello called without a specific name (name_arg_value: {name}) ---")
    return greeting

def say_goodbye() -> str:
    """Provides a simple farewell message to conclude the conversation."""
    print("--- Tool: say_goodbye called ---")
    return "Goodbye! Have a great day."

print("Greeting and Farewell tools defined.")

# Optional self-test
print(say_hello("Alice"))
print(say_hello())           # Test with no argument (should use default "Hello there!")
print(say_hello(name=None))  # Test with name explicitly as None (should use default "Hello there!")
```
______________________________________________________________________
**2. Define the Sub-Agents (Greeting & Farewell)**
Now, create the `Agent` instances for our specialists. Notice their highly focused `instruction` and, critically, their clear `description`. The `description` is the primary information the *root agent* uses to decide *when* to delegate to these sub-agents.
**Best Practice:** Sub-agent `description` fields should accurately and concisely summarize their specific capability. This is crucial for effective automatic delegation.
**Best Practice:** Sub-agent `instruction` fields should be tailored to their limited scope, telling them exactly what to do and *what not* to do (e.g., "Your *only* task is...").
```python
# @title Define Greeting and Farewell Sub-Agents
# If you want to use models other than Gemini, ensure LiteLlm is imported and API keys are set (from Step 0/2)
# from google.adk.models.lite_llm import LiteLlm
# MODEL_GPT_4O, MODEL_CLAUDE_SONNET etc. should be defined
# Otherwise, continue to use: model = MODEL_GEMINI_2_5_FLASH

# --- Greeting Agent ---
greeting_agent = None
try:
    greeting_agent = Agent(
        # Using a potentially different/cheaper model for a simple task
        model = MODEL_GEMINI_2_5_FLASH,
        # model=LiteLlm(model=MODEL_GPT_4O), # If you would like to experiment with other models
        name="greeting_agent",
        instruction="You are the Greeting Agent. Your ONLY task is to provide a friendly greeting to the user. "
                    "Use the 'say_hello' tool to generate the greeting. "
                    "If the user provides their name, make sure to pass it to the tool. "
                    "Do not engage in any other conversation or tasks.",
        description="Handles simple greetings and hellos using the 'say_hello' tool.",  # Crucial for delegation
        tools=[say_hello],
    )
    print(f"✅ Agent '{greeting_agent.name}' created using model '{greeting_agent.model}'.")
except Exception as e:
    # Note: don't reference greeting_agent here; it is still None if creation failed.
    print(f"❌ Could not create Greeting agent. Check the API key for its model. Error: {e}")

# --- Farewell Agent ---
farewell_agent = None
try:
    farewell_agent = Agent(
        # Can use the same or a different model
        model = MODEL_GEMINI_2_5_FLASH,
        # model=LiteLlm(model=MODEL_GPT_4O), # If you would like to experiment with other models
        name="farewell_agent",
        instruction="You are the Farewell Agent. Your ONLY task is to provide a polite goodbye message. "
                    "Use the 'say_goodbye' tool when the user indicates they are leaving or ending the conversation "
                    "(e.g., using words like 'bye', 'goodbye', 'thanks bye', 'see you'). "
                    "Do not perform any other actions.",
        description="Handles simple farewells and goodbyes using the 'say_goodbye' tool.",  # Crucial for delegation
        tools=[say_goodbye],
    )
    print(f"✅ Agent '{farewell_agent.name}' created using model '{farewell_agent.model}'.")
except Exception as e:
    # Note: don't reference farewell_agent here; it is still None if creation failed.
    print(f"❌ Could not create Farewell agent. Check the API key for its model. Error: {e}")
```
______________________________________________________________________
**3. Define the Root Agent (Weather Agent v2) with Sub-Agents**
Now, we upgrade our `weather_agent`. The key changes are:
- Adding the `sub_agents` parameter: We pass a list containing the `greeting_agent` and `farewell_agent` instances we just created.
- Updating the `instruction`: We explicitly tell the root agent *about* its sub-agents and *when* it should delegate tasks to them.
**Key Concept: Automatic Delegation (Auto Flow)** By providing the `sub_agents` list, ADK enables automatic delegation. When the root agent receives a user query, its LLM considers not only its own instructions and tools but also the `description` of each sub-agent. If the LLM determines that a query aligns better with a sub-agent's described capability (e.g., "Handles simple greetings"), it will automatically generate a special internal action to *transfer control* to that sub-agent for that turn. The sub-agent then processes the query using its own model, instructions, and tools.
**Best Practice:** Ensure the root agent's instructions clearly guide its delegation decisions. Mention the sub-agents by name and describe the conditions under which delegation should occur.
```python
# @title Define the Root Agent with Sub-Agents
# Ensure sub-agents were created successfully before defining the root agent.
# Also ensure the original 'get_weather' tool is defined.
root_agent = None
runner_root = None  # Initialize runner

if greeting_agent and farewell_agent and 'get_weather' in globals():
    # Let's use a capable Gemini model for the root agent to handle orchestration
    root_agent_model = MODEL_GEMINI_2_5_FLASH

    weather_agent_team = Agent(
        name="weather_agent_v2",  # Give it a new version name
        model=root_agent_model,
        description="The main coordinator agent. Handles weather requests and delegates greetings/farewells to specialists.",
        instruction="You are the main Weather Agent coordinating a team. Your primary responsibility is to provide weather information. "
                    "Use the 'get_weather' tool ONLY for specific weather requests (e.g., 'weather in London'). "
                    "You have specialized sub-agents: "
                    "1. 'greeting_agent': Handles simple greetings like 'Hi', 'Hello'. Delegate to it for these. "
                    "2. 'farewell_agent': Handles simple farewells like 'Bye', 'See you'. Delegate to it for these. "
                    "Analyze the user's query. If it's a greeting, delegate to 'greeting_agent'. If it's a farewell, delegate to 'farewell_agent'. "
                    "If it's a weather request, handle it yourself using 'get_weather'. "
                    "For anything else, respond appropriately or state you cannot handle it.",
        tools=[get_weather],  # Root agent still needs the weather tool for its core task
        # Key change: Link the sub-agents here!
        sub_agents=[greeting_agent, farewell_agent]
    )
    print(f"✅ Root Agent '{weather_agent_team.name}' created using model '{root_agent_model}' with sub-agents: {[sa.name for sa in weather_agent_team.sub_agents]}")
else:
    print("❌ Cannot create root agent because one or more sub-agents failed to initialize or 'get_weather' tool is missing.")
    if not greeting_agent: print(" - Greeting Agent is missing.")
    if not farewell_agent: print(" - Farewell Agent is missing.")
    if 'get_weather' not in globals(): print(" - get_weather function is missing.")
```
______________________________________________________________________
**4. Interact with the Agent Team**
Now that we've defined our root agent (named `weather_agent_team` in the previous code block) with its specialized sub-agents, let's test the delegation mechanism. The interaction code below checks for either `weather_agent_team` or `root_agent`, so it works with whichever name you used.
The following code block will:
1. Define an `async` function `run_team_conversation`.
1. Inside this function, create a *new, dedicated* `InMemorySessionService` and a specific session (`session_001_agent_team`) just for this test run. This isolates the conversation history for testing the team dynamics.
1. Create a `Runner` (`runner_agent_team`) configured to use our `weather_agent_team` (the root agent) and the dedicated session service.
1. Use our updated `call_agent_async` function to send different types of queries (greeting, weather request, farewell) to the `runner_agent_team`. We explicitly pass the runner, user ID, and session ID for this specific test.
1. Immediately execute the `run_team_conversation` function.
We expect the following flow:
1. The "Hello there!" query goes to `runner_agent_team`.
1. The root agent (`weather_agent_team`) receives it and, based on its instructions and the `greeting_agent`'s description, delegates the task.
1. `greeting_agent` handles the query, calls its `say_hello` tool, and generates the response.
1. The "What is the weather in New York?" query is *not* delegated and is handled directly by the root agent using its `get_weather` tool.
1. The "Thanks, bye!" query is delegated to the `farewell_agent`, which uses its `say_goodbye` tool.
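To build intuition for this flow, the routing decision can be caricatured as a keyword lookup. The toy sketch below is only an illustration: in ADK the root agent's LLM makes the choice by reading each sub-agent's `description`, not by keyword matching.

```python
# Toy stand-in for delegation: map keyword hints to the agent that handles them
SUB_AGENT_HINTS = {
    "greeting_agent": ["hello", "hi ", "hey"],
    "farewell_agent": ["bye", "goodbye", "see you"],
}

def toy_route(query: str) -> str:
    """Return the name of the agent that should handle the query."""
    q = query.lower()
    for agent_name, keywords in SUB_AGENT_HINTS.items():
        if any(kw in q for kw in keywords):
            return agent_name
    return "weather_agent_v2"  # the root agent keeps weather and everything else

print(toy_route("Hello there!"))                      # greeting_agent
print(toy_route("What is the weather in New York?"))  # weather_agent_v2
print(toy_route("Thanks, bye!"))                      # farewell_agent
```

The real mechanism is far more flexible, since the LLM can recognize a greeting phrased in ways no keyword list anticipates.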
```python
# @title Interact with the Agent Team
import asyncio  # Ensure asyncio is imported

# Ensure the root agent (e.g., 'weather_agent_team' or 'root_agent' from the previous cell) is defined.
# Ensure the call_agent_async function is defined.

# Check if the root agent variable exists before defining the conversation function
root_agent_var_name = 'root_agent'  # Default name from Step 3 guide
if 'weather_agent_team' in globals():  # Check if user used this name instead
    root_agent_var_name = 'weather_agent_team'
elif 'root_agent' not in globals():
    print("⚠️ Root agent ('root_agent' or 'weather_agent_team') not found. Cannot define run_team_conversation.")
    # Assign a dummy value to prevent NameError later if the code block runs anyway
    root_agent = None  # Or set a flag to prevent execution

# Only define and run if the root agent exists
if root_agent_var_name in globals() and globals()[root_agent_var_name]:
    # Define the main async function for the conversation logic.
    # The 'await' keywords INSIDE this function are necessary for async operations.
    async def run_team_conversation():
        print("\n--- Testing Agent Team Delegation ---")
        session_service = InMemorySessionService()
        APP_NAME = "weather_tutorial_agent_team"
        USER_ID = "user_1_agent_team"
        SESSION_ID = "session_001_agent_team"
        session = await session_service.create_session(
            app_name=APP_NAME, user_id=USER_ID, session_id=SESSION_ID
        )
        print(f"Session created: App='{APP_NAME}', User='{USER_ID}', Session='{SESSION_ID}'")

        actual_root_agent = globals()[root_agent_var_name]
        runner_agent_team = Runner(  # Or use InMemoryRunner
            agent=actual_root_agent,
            app_name=APP_NAME,
            session_service=session_service
        )
        print(f"Runner created for agent '{actual_root_agent.name}'.")

        # --- Interactions using await (correct within async def) ---
        await call_agent_async(query = "Hello there!",
                               runner=runner_agent_team,
                               user_id=USER_ID,
                               session_id=SESSION_ID)
        await call_agent_async(query = "What is the weather in New York?",
                               runner=runner_agent_team,
                               user_id=USER_ID,
                               session_id=SESSION_ID)
        await call_agent_async(query = "Thanks, bye!",
                               runner=runner_agent_team,
                               user_id=USER_ID,
                               session_id=SESSION_ID)

    # --- Execute the `run_team_conversation` async function ---
    # Choose ONE of the methods below based on your environment.
    # Note: This may require API keys for the models used!

    # METHOD 1: Direct await (Default for Notebooks/Async REPLs)
    # If your environment supports top-level await (like Colab/Jupyter notebooks),
    # it means an event loop is already running, so you can directly await the function.
    print("Attempting execution using 'await' (default for notebooks)...")
    await run_team_conversation()

    # METHOD 2: asyncio.run (For Standard Python Scripts [.py])
    # If running this code as a standard Python script from your terminal,
    # the script context is synchronous. `asyncio.run()` is needed to
    # create and manage an event loop to execute your async function.
    # To use this method:
    # 1. Comment out the `await run_team_conversation()` line above.
    # 2. Uncomment the following block:
    """
    import asyncio
    if __name__ == "__main__":  # Ensures this runs only when script is executed directly
        print("Executing using 'asyncio.run()' (for standard Python scripts)...")
        try:
            # This creates an event loop, runs your async function, and closes the loop.
            asyncio.run(run_team_conversation())
        except Exception as e:
            print(f"An error occurred: {e}")
    """

else:
    # This message prints if the root agent variable wasn't found earlier
    print("\n⚠️ Skipping agent team conversation execution as the root agent was not successfully defined in a previous step.")
```
______________________________________________________________________
Look closely at the output logs, especially the `--- Tool: ... called ---` messages. You should observe:
- For "Hello there!", the `say_hello` tool was called (indicating `greeting_agent` handled it).
- For "What is the weather in New York?", the `get_weather` tool was called (indicating the root agent handled it).
- For "Thanks, bye!", the `say_goodbye` tool was called (indicating `farewell_agent` handled it).
This confirms successful **automatic delegation**! The root agent, guided by its instructions and the `description`s of its `sub_agents`, correctly routed user requests to the appropriate specialist agent within the team.
You've now structured your application with multiple collaborating agents. This modular design is fundamental for building more complex and capable agent systems. In the next step, we'll give our agents the ability to remember information across turns using session state.
## Step 4: Adding Memory and Personalization with Session State
So far, our agent team can handle different tasks through delegation, but each interaction starts fresh – the agents have no memory of past conversations or user preferences within a session. To create more sophisticated and context-aware experiences, agents need **memory**. ADK provides this through **Session State**.
**What is Session State?**
- It's a Python dictionary (`session.state`) tied to a specific user session (identified by `APP_NAME`, `USER_ID`, `SESSION_ID`).
- It persists information *across multiple conversational turns* within that session.
- Agents and Tools can read from and write to this state, allowing them to remember details, adapt behavior, and personalize responses.
**How Agents Interact with State:**
1. **`ToolContext` (Primary Method):** Tools can accept a `ToolContext` object (automatically provided by ADK if declared as the last argument). This object gives direct access to the session state via `tool_context.state`, allowing tools to read preferences or save results *during* execution.
1. **`output_key` (Auto-Save Agent Response):** An `Agent` can be configured with an `output_key="your_key"`. ADK will then automatically save the agent's final textual response for a turn into `session.state["your_key"]`.
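In plain-Python terms, both mechanisms boil down to reads and writes on a per-session dictionary. The sketch below mimics the behavior only; it is not ADK internals, and `session_state` here stands in for the real `session.state`:

```python
# One state dict per (APP_NAME, USER_ID, SESSION_ID) triple
session_state = {"user_preference_temperature_unit": "Celsius"}

# 1. A tool reads a preference (ADK exposes this as tool_context.state)
unit = session_state.get("user_preference_temperature_unit", "Celsius")
print(f"Tool will format temperatures in {unit}")

# 2. output_key="last_weather_report" auto-saves the agent's final text for the turn
final_response = "The weather in London is cloudy with a temperature of 15°C."
session_state["last_weather_report"] = final_response
print(session_state["last_weather_report"])
```

Because the same dictionary survives across turns, a value written in turn 1 is visible to tools and agents in turn 2.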
**In this step, we will enhance our Weather Bot team by:**
1. Using a **new** `InMemorySessionService` to demonstrate state in isolation.
1. Initializing session state with a user preference for `temperature_unit`.
1. Creating a state-aware version of the weather tool (`get_weather_stateful`) that reads this preference via `ToolContext` and adjusts its output format (Celsius/Fahrenheit).
1. Updating the root agent to use this stateful tool and configuring it with an `output_key` to automatically save its final weather report to the session state.
1. Running a conversation to observe how the initial state affects the tool, how manual state changes alter subsequent behavior, and how `output_key` persists the agent's response.
______________________________________________________________________
**1. Initialize New Session Service and State**
To clearly demonstrate state management without interference from prior steps, we'll instantiate a new `InMemorySessionService`. We'll also create a session with an initial state defining the user's preferred temperature unit.
```python
# @title 1. Initialize New Session Service and State
# Import necessary session components
from google.adk.sessions import InMemorySessionService
# Create a NEW session service instance for this state demonstration
session_service_stateful = InMemorySessionService()
print("✅ New InMemorySessionService created for state demonstration.")
# Define a NEW session ID for this part of the tutorial
SESSION_ID_STATEFUL = "session_state_demo_001"
USER_ID_STATEFUL = "user_state_demo"
# Define initial state data - user prefers Celsius initially
initial_state = {
    "user_preference_temperature_unit": "Celsius"
}

# Create the session, providing the initial state
session_stateful = await session_service_stateful.create_session(
    app_name=APP_NAME,  # Use the consistent app name
    user_id=USER_ID_STATEFUL,
    session_id=SESSION_ID_STATEFUL,
    state=initial_state  # <<< Initialize state during creation
)
print(f"✅ Session '{SESSION_ID_STATEFUL}' created for user '{USER_ID_STATEFUL}'.")

# Verify the initial state was set correctly
retrieved_session = await session_service_stateful.get_session(app_name=APP_NAME,
                                                               user_id=USER_ID_STATEFUL,
                                                               session_id=SESSION_ID_STATEFUL)
print("\n--- Initial Session State ---")
if retrieved_session:
    print(retrieved_session.state)
else:
    print("Error: Could not retrieve session.")
```
______________________________________________________________________
**2. Create State-Aware Weather Tool (`get_weather_stateful`)**
Now, we create a new version of the weather tool. Its key feature is accepting `tool_context: ToolContext` which allows it to access `tool_context.state`. It will read the `user_preference_temperature_unit` and format the temperature accordingly.
- **Key Concept: `ToolContext`** This object is the bridge allowing your tool logic to interact with the session's context, including reading and writing state variables. ADK injects it automatically if defined as the last parameter of your tool function.
- **Best Practice:** When reading from state, use `dictionary.get('key', default_value)` to handle cases where the key might not exist yet, ensuring your tool doesn't crash.
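For example, with a plain dictionary standing in for `tool_context.state`:

```python
state = {"user_preference_temperature_unit": "Fahrenheit"}

# Key present: returns the stored value
print(state.get("user_preference_temperature_unit", "Celsius"))  # Fahrenheit

# Key absent: returns the default instead of raising KeyError
print(state.get("user_login_count", 0))  # 0
```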
```python
from google.adk.tools.tool_context import ToolContext
def get_weather_stateful(city: str, tool_context: ToolContext) -> dict:
    """Retrieves weather, converts temp unit based on session state."""
    print(f"--- Tool: get_weather_stateful called for {city} ---")

    # --- Read preference from state ---
    preferred_unit = tool_context.state.get("user_preference_temperature_unit", "Celsius")  # Default to Celsius
    print(f"--- Tool: Reading state 'user_preference_temperature_unit': {preferred_unit} ---")

    city_normalized = city.lower().replace(" ", "")

    # Mock weather data (always stored in Celsius internally)
    mock_weather_db = {
        "newyork": {"temp_c": 25, "condition": "sunny"},
        "london": {"temp_c": 15, "condition": "cloudy"},
        "tokyo": {"temp_c": 18, "condition": "light rain"},
    }

    if city_normalized in mock_weather_db:
        data = mock_weather_db[city_normalized]
        temp_c = data["temp_c"]
        condition = data["condition"]

        # Format temperature based on state preference
        if preferred_unit == "Fahrenheit":
            temp_value = (temp_c * 9/5) + 32  # Calculate Fahrenheit
            temp_unit = "°F"
        else:  # Default to Celsius
            temp_value = temp_c
            temp_unit = "°C"

        report = f"The weather in {city.capitalize()} is {condition} with a temperature of {temp_value:.0f}{temp_unit}."
        result = {"status": "success", "report": report}
        print(f"--- Tool: Generated report in {preferred_unit}. Result: {result} ---")

        # Example of writing back to state (optional for this tool)
        tool_context.state["last_city_checked_stateful"] = city
        print(f"--- Tool: Updated state 'last_city_checked_stateful': {city} ---")

        return result
    else:
        # Handle city not found
        error_msg = f"Sorry, I don't have weather information for '{city}'."
        print(f"--- Tool: City '{city}' not found. ---")
        return {"status": "error", "error_message": error_msg}
print("✅ State-aware 'get_weather_stateful' tool defined.")
```
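As a quick sanity check on the conversion the tool applies (`F = C × 9/5 + 32`), here it is run against the mock database values:

```python
def c_to_f(temp_c: float) -> float:
    """Convert Celsius to Fahrenheit, mirroring the tool's formula."""
    return (temp_c * 9 / 5) + 32

# Mock database values used by get_weather_stateful
print(c_to_f(25))  # New York -> 77.0
print(c_to_f(15))  # London   -> 59.0
print(c_to_f(18))  # Tokyo    -> 64.4
```

So with the preference set to "Fahrenheit", a Tokyo query reports roughly 64°F instead of 18°C.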
______________________________________________________________________
**3. Redefine Sub-Agents and Update Root Agent**
To ensure this step is self-contained and builds correctly, we first redefine the `greeting_agent` and `farewell_agent` exactly as they were in Step 3. Then, we define our new root agent (`weather_agent_v4_stateful`):
- It uses the new `get_weather_stateful` tool.
- It includes the greeting and farewell sub-agents for delegation.
- **Crucially**, it sets `output_key="last_weather_report"` which automatically saves its final weather response to the session state.
```python
# @title 3. Redefine Sub-Agents and Update Root Agent with output_key
# Ensure necessary imports: Agent, LiteLlm, Runner
from google.adk.agents import Agent
from google.adk.models.lite_llm import LiteLlm
from google.adk.runners import Runner
# Ensure tools 'say_hello', 'say_goodbye' are defined (from Step 3)
# Ensure model constants MODEL_GPT_4O, MODEL_GEMINI_2_5_FLASH etc. are defined
# --- Redefine Greeting Agent (from Step 3) ---
greeting_agent = None
try:
    greeting_agent = Agent(
        model=MODEL_GEMINI_2_5_FLASH,
        name="greeting_agent",
        instruction="You are the Greeting Agent. Your ONLY task is to provide a friendly greeting using the 'say_hello' tool. Do nothing else.",
        description="Handles simple greetings and hellos using the 'say_hello' tool.",
        tools=[say_hello],
    )
    print(f"✅ Agent '{greeting_agent.name}' redefined.")
except Exception as e:
    print(f"❌ Could not redefine Greeting agent. Error: {e}")

# --- Redefine Farewell Agent (from Step 3) ---
farewell_agent = None
try:
    farewell_agent = Agent(
        model=MODEL_GEMINI_2_5_FLASH,
        name="farewell_agent",
        instruction="You are the Farewell Agent. Your ONLY task is to provide a polite goodbye message using the 'say_goodbye' tool. Do not perform any other actions.",
        description="Handles simple farewells and goodbyes using the 'say_goodbye' tool.",
        tools=[say_goodbye],
    )
    print(f"✅ Agent '{farewell_agent.name}' redefined.")
except Exception as e:
    print(f"❌ Could not redefine Farewell agent. Error: {e}")

# --- Define the Updated Root Agent ---
root_agent_stateful = None
runner_root_stateful = None  # Initialize runner

# Check prerequisites before creating the root agent
if greeting_agent and farewell_agent and 'get_weather_stateful' in globals():
    root_agent_model = MODEL_GEMINI_2_5_FLASH  # Choose orchestration model

    root_agent_stateful = Agent(
        name="weather_agent_v4_stateful",  # New version name
        model=root_agent_model,
        description="Main agent: Provides weather (state-aware unit), delegates greetings/farewells, saves report to state.",
        instruction="You are the main Weather Agent. Your job is to provide weather using 'get_weather_stateful'. "
                    "The tool will format the temperature based on user preference stored in state. "
                    "Delegate simple greetings to 'greeting_agent' and farewells to 'farewell_agent'. "
                    "Handle only weather requests, greetings, and farewells.",
        tools=[get_weather_stateful],  # Use the state-aware tool
        sub_agents=[greeting_agent, farewell_agent],  # Include sub-agents
        output_key="last_weather_report"  # <<< Auto-save agent's final weather response
    )
    print(f"✅ Root Agent '{root_agent_stateful.name}' created using stateful tool and output_key.")

    # --- Create Runner for this Root Agent & NEW Session Service ---
    runner_root_stateful = Runner(
        agent=root_agent_stateful,
        app_name=APP_NAME,
        session_service=session_service_stateful  # Use the NEW stateful session service
    )
    print(f"✅ Runner created for stateful root agent '{runner_root_stateful.agent.name}' using stateful session service.")
else:
print("❌ Cannot create stateful root agent. Prerequisites missing.")
if not greeting_agent: print(" - greeting_agent definition missing.")
if not farewell_agent: print(" - farewell_agent definition missing.")
if 'get_weather_stateful' not in globals(): print(" - get_weather_stateful tool missing.")
```
______________________________________________________________________
**4. Interact and Test State Flow**
Now, let's execute a conversation designed to test the state interactions using the `runner_root_stateful` (associated with our stateful agent and the `session_service_stateful`). We'll use the `call_agent_async` function defined earlier, ensuring we pass the correct runner, user ID (`USER_ID_STATEFUL`), and session ID (`SESSION_ID_STATEFUL`).
The conversation flow will be:
1. **Check weather (London):** The `get_weather_stateful` tool should read the initial "Celsius" preference from the session state initialized in Section 1. The root agent's final response (the weather report in Celsius) should get saved to `state['last_weather_report']` via the `output_key` configuration.
1. **Manually update state:** We will *directly modify* the state stored within the `InMemorySessionService` instance (`session_service_stateful`).
- **Why direct modification?** The `session_service.get_session()` method returns a *copy* of the session. Modifying that copy wouldn't affect the state used in subsequent agent runs. For this testing scenario with `InMemorySessionService`, we access the internal `sessions` dictionary to change the *actual* stored state value for `user_preference_temperature_unit` to "Fahrenheit". *Note: In real applications, state changes are typically triggered by tools or agent logic returning `EventActions(state_delta=...)`, not direct manual updates.*
1. **Check weather again (New York):** The `get_weather_stateful` tool should now read the updated "Fahrenheit" preference from the state and convert the temperature accordingly. The root agent's *new* response (weather in Fahrenheit) will overwrite the previous value in `state['last_weather_report']` due to the `output_key`.
1. **Greet the agent:** Verify that delegation to the `greeting_agent` still works correctly alongside the stateful operations. This interaction will become the *last* response saved by `output_key` in this specific sequence.
1. **Inspect final state:** After the conversation, we retrieve the session one last time (getting a copy) and print its state to confirm the `user_preference_temperature_unit` is indeed "Fahrenheit", observe the final value saved by `output_key` (which will be the greeting in this run), and see the `last_city_checked_stateful` value written by the tool.
```python
# @title 4. Interact to Test State Flow and output_key
import asyncio # Ensure asyncio is imported
# Ensure the stateful runner (runner_root_stateful) is available from the previous cell
# Ensure call_agent_async, USER_ID_STATEFUL, SESSION_ID_STATEFUL, APP_NAME are defined
if 'runner_root_stateful' in globals() and runner_root_stateful:
# Define the main async function for the stateful conversation logic.
# The 'await' keywords INSIDE this function are necessary for async operations.
async def run_stateful_conversation():
print("\n--- Testing State: Temp Unit Conversion & output_key ---")
# 1. Check weather (Uses initial state: Celsius)
print("--- Turn 1: Requesting weather in London (expect Celsius) ---")
await call_agent_async(query= "What's the weather in London?",
runner=runner_root_stateful,
user_id=USER_ID_STATEFUL,
session_id=SESSION_ID_STATEFUL
)
# 2. Manually update state preference to Fahrenheit - DIRECTLY MODIFY STORAGE
print("\n--- Manually Updating State: Setting unit to Fahrenheit ---")
try:
# Access the internal storage directly - THIS IS SPECIFIC TO InMemorySessionService for testing
# NOTE: In production with persistent services (Database, VertexAI), you would
# typically update state via agent actions or specific service APIs if available,
# not by direct manipulation of internal storage.
stored_session = session_service_stateful.sessions[APP_NAME][USER_ID_STATEFUL][SESSION_ID_STATEFUL]
stored_session.state["user_preference_temperature_unit"] = "Fahrenheit"
# Optional: You might want to update the timestamp as well if any logic depends on it
# import time
# stored_session.last_update_time = time.time()
print(f"--- Stored session state updated. Current 'user_preference_temperature_unit': {stored_session.state.get('user_preference_temperature_unit', 'Not Set')} ---") # Added .get for safety
except KeyError:
print(f"--- Error: Could not retrieve session '{SESSION_ID_STATEFUL}' from internal storage for user '{USER_ID_STATEFUL}' in app '{APP_NAME}' to update state. Check IDs and if session was created. ---")
except Exception as e:
print(f"--- Error updating internal session state: {e} ---")
# 3. Check weather again (Tool should now use Fahrenheit)
# This will also update 'last_weather_report' via output_key
print("\n--- Turn 2: Requesting weather in New York (expect Fahrenheit) ---")
await call_agent_async(query= "Tell me the weather in New York.",
runner=runner_root_stateful,
user_id=USER_ID_STATEFUL,
session_id=SESSION_ID_STATEFUL
)
# 4. Test basic delegation (should still work)
# This will update 'last_weather_report' again, overwriting the NY weather report
print("\n--- Turn 3: Sending a greeting ---")
await call_agent_async(query= "Hi!",
runner=runner_root_stateful,
user_id=USER_ID_STATEFUL,
session_id=SESSION_ID_STATEFUL
)
# --- Execute the `run_stateful_conversation` async function ---
# Choose ONE of the methods below based on your environment.
# METHOD 1: Direct await (Default for Notebooks/Async REPLs)
# If your environment supports top-level await (like Colab/Jupyter notebooks),
# it means an event loop is already running, so you can directly await the function.
print("Attempting execution using 'await' (default for notebooks)...")
await run_stateful_conversation()
# METHOD 2: asyncio.run (For Standard Python Scripts [.py])
# If running this code as a standard Python script from your terminal,
# the script context is synchronous. `asyncio.run()` is needed to
# create and manage an event loop to execute your async function.
# To use this method:
# 1. Comment out the `await run_stateful_conversation()` line above.
# 2. Uncomment the following block:
"""
import asyncio
if __name__ == "__main__": # Ensures this runs only when script is executed directly
print("Executing using 'asyncio.run()' (for standard Python scripts)...")
try:
# This creates an event loop, runs your async function, and closes the loop.
asyncio.run(run_stateful_conversation())
except Exception as e:
print(f"An error occurred: {e}")
"""
# --- Inspect final session state after the conversation ---
# This block runs after either execution method completes.
print("\n--- Inspecting Final Session State ---")
final_session = await session_service_stateful.get_session(app_name=APP_NAME,
user_id= USER_ID_STATEFUL,
session_id=SESSION_ID_STATEFUL)
if final_session:
# Use .get() for safer access to potentially missing keys
print(f"Final Preference: {final_session.state.get('user_preference_temperature_unit', 'Not Set')}")
print(f"Final Last Weather Report (from output_key): {final_session.state.get('last_weather_report', 'Not Set')}")
print(f"Final Last City Checked (by tool): {final_session.state.get('last_city_checked_stateful', 'Not Set')}")
# Print full state for detailed view
# print(f"Full State Dict: {final_session.state}") # For detailed view
else:
print("\n❌ Error: Could not retrieve final session state.")
else:
print("\n⚠️ Skipping state test conversation. Stateful root agent runner ('runner_root_stateful') is not available.")
```
______________________________________________________________________
By reviewing the conversation flow and the final session state printout, you can confirm:
- **State Read:** The weather tool (`get_weather_stateful`) correctly read `user_preference_temperature_unit` from state, initially using "Celsius" for London.
- **State Update:** The direct modification successfully changed the stored preference to "Fahrenheit".
- **State Read (Updated):** The tool subsequently read "Fahrenheit" when asked for New York's weather and performed the conversion.
- **Tool State Write:** The tool successfully wrote the `last_city_checked_stateful` ("New York" after the second weather check) into the state via `tool_context.state`.
- **Delegation:** The delegation to the `greeting_agent` for "Hi!" functioned correctly even after state modifications.
- **`output_key`:** The `output_key="last_weather_report"` successfully saved the root agent's *final* response for *each turn* where the root agent was the one ultimately responding. In this sequence, the last response was the greeting ("Hello, there!"), so that overwrote the weather report in the state key.
- **Final State:** The final check confirms the preference persisted as "Fahrenheit".
You've now successfully integrated session state to personalize agent behavior using `ToolContext`, manually manipulated `InMemorySessionService` state for testing, and observed how `output_key` provides a simple mechanism for saving the agent's last response to state. This foundational understanding of state management is key as we proceed to implement safety guardrails using callbacks in the next steps.
______________________________________________________________________
## Step 5: Adding Safety - Input Guardrail with `before_model_callback`
Our agent team is becoming more capable, remembering preferences and using tools effectively. However, in real-world scenarios, we often need safety mechanisms to control the agent's behavior *before* potentially problematic requests even reach the core Large Language Model (LLM).
ADK provides **Callbacks** – functions that allow you to hook into specific points in the agent's execution lifecycle. The `before_model_callback` is particularly useful for input safety.
**What is `before_model_callback`?**
- It's a Python function you define that ADK executes *just before* an agent sends its compiled request (including conversation history, instructions, and the latest user message) to the underlying LLM.
- **Purpose:** Inspect the request, modify it if necessary, or block it entirely based on predefined rules.
**Common Use Cases:**
- **Input Validation/Filtering:** Check if user input meets criteria or contains disallowed content (like PII or keywords).
- **Guardrails:** Prevent harmful, off-topic, or policy-violating requests from being processed by the LLM.
- **Dynamic Prompt Modification:** Add timely information (e.g., from session state) to the LLM request context just before sending.
**How it Works:**
1. Define a function accepting `callback_context: CallbackContext` and `llm_request: LlmRequest`.
- `callback_context`: Provides access to agent info, session state (`callback_context.state`), etc.
- `llm_request`: Contains the full payload intended for the LLM (`contents`, `config`).
1. Inside the function:
- **Inspect:** Examine `llm_request.contents` (especially the last user message).
- **Modify (Use Caution):** You *can* change parts of `llm_request`.
- **Block (Guardrail):** Return an `LlmResponse` object. ADK will send this response back immediately, *skipping* the LLM call for that turn.
- **Allow:** Return `None`. ADK proceeds to call the LLM with the (potentially modified) request.
**In this step, we will:**
1. Define a `before_model_callback` function (`block_keyword_guardrail`) that checks the user's input for a specific keyword ("BLOCK").
1. Update our stateful root agent (`weather_agent_v4_stateful` from Step 4) to use this callback.
1. Create a new runner associated with this updated agent but using the *same stateful session service* to maintain state continuity.
1. Test the guardrail by sending both normal and keyword-containing requests.
______________________________________________________________________
**1. Define the Guardrail Callback Function**
This function will inspect the last user message within the `llm_request` content. If it finds "BLOCK" (case-insensitive), it constructs and returns an `LlmResponse` to block the flow; otherwise, it returns `None`.
```python
# @title 1. Define the before_model_callback Guardrail
# Ensure necessary imports are available
from google.adk.agents.callback_context import CallbackContext
from google.adk.models.llm_request import LlmRequest
from google.adk.models.llm_response import LlmResponse
from google.genai import types # For creating response content
from typing import Optional
def block_keyword_guardrail(
callback_context: CallbackContext, llm_request: LlmRequest
) -> Optional[LlmResponse]:
"""
Inspects the latest user message for 'BLOCK'. If found, blocks the LLM call
and returns a predefined LlmResponse. Otherwise, returns None to proceed.
"""
agent_name = callback_context.agent_name # Get the name of the agent whose model call is being intercepted
print(f"--- Callback: block_keyword_guardrail running for agent: {agent_name} ---")
# Extract the text from the latest user message in the request history
last_user_message_text = ""
if llm_request.contents:
# Find the most recent message with role 'user'
for content in reversed(llm_request.contents):
if content.role == 'user' and content.parts:
# Assuming text is in the first part for simplicity
if content.parts[0].text:
last_user_message_text = content.parts[0].text
break # Found the last user message text
print(f"--- Callback: Inspecting last user message: '{last_user_message_text[:100]}...' ---") # Log first 100 chars
# --- Guardrail Logic ---
keyword_to_block = "BLOCK"
if keyword_to_block in last_user_message_text.upper(): # Case-insensitive check
print(f"--- Callback: Found '{keyword_to_block}'. Blocking LLM call! ---")
# Optionally, set a flag in state to record the block event
callback_context.state["guardrail_block_keyword_triggered"] = True
print(f"--- Callback: Set state 'guardrail_block_keyword_triggered': True ---")
# Construct and return an LlmResponse to stop the flow and send this back instead
return LlmResponse(
content=types.Content(
role="model", # Mimic a response from the agent's perspective
parts=[types.Part(text=f"I cannot process this request because it contains the blocked keyword '{keyword_to_block}'.")],
)
# Note: You could also set an error_message field here if needed
)
else:
# Keyword not found, allow the request to proceed to the LLM
print(f"--- Callback: Keyword not found. Allowing LLM call for {agent_name}. ---")
return None # Returning None signals ADK to continue normally
print("✅ block_keyword_guardrail function defined.")
```
______________________________________________________________________
**2. Update Root Agent to Use the Callback**
We redefine the root agent, adding the `before_model_callback` parameter and pointing it to our new guardrail function. We'll give it a new version name for clarity.
*Important:* We need to redefine the sub-agents (`greeting_agent`, `farewell_agent`) and the stateful tool (`get_weather_stateful`) within this context if they are not already available from previous steps, ensuring the root agent definition has access to all its components.
```python
# @title 2. Update Root Agent with before_model_callback
# --- Redefine Sub-Agents (Ensures they exist in this context) ---
greeting_agent = None
try:
# Use a defined model constant
greeting_agent = Agent(
model=MODEL_GEMINI_2_5_FLASH,
name="greeting_agent", # Keep original name for consistency
instruction="You are the Greeting Agent. Your ONLY task is to provide a friendly greeting using the 'say_hello' tool. Do nothing else.",
description="Handles simple greetings and hellos using the 'say_hello' tool.",
tools=[say_hello],
)
print(f"✅ Sub-Agent '{greeting_agent.name}' redefined.")
except Exception as e:
    print(f"❌ Could not redefine Greeting agent. Check Model/API Key ({MODEL_GEMINI_2_5_FLASH}). Error: {e}")
farewell_agent = None
try:
# Use a defined model constant
farewell_agent = Agent(
model=MODEL_GEMINI_2_5_FLASH,
name="farewell_agent", # Keep original name
instruction="You are the Farewell Agent. Your ONLY task is to provide a polite goodbye message using the 'say_goodbye' tool. Do not perform any other actions.",
description="Handles simple farewells and goodbyes using the 'say_goodbye' tool.",
tools=[say_goodbye],
)
print(f"✅ Sub-Agent '{farewell_agent.name}' redefined.")
except Exception as e:
    print(f"❌ Could not redefine Farewell agent. Check Model/API Key ({MODEL_GEMINI_2_5_FLASH}). Error: {e}")
# --- Define the Root Agent with the Callback ---
root_agent_model_guardrail = None
runner_root_model_guardrail = None
# Check all components before proceeding
if greeting_agent and farewell_agent and 'get_weather_stateful' in globals() and 'block_keyword_guardrail' in globals():
# Use a defined model constant
root_agent_model = MODEL_GEMINI_2_5_FLASH
root_agent_model_guardrail = Agent(
name="weather_agent_v5_model_guardrail", # New version name for clarity
model=root_agent_model,
description="Main agent: Handles weather, delegates greetings/farewells, includes input keyword guardrail.",
instruction="You are the main Weather Agent. Provide weather using 'get_weather_stateful'. "
"Delegate simple greetings to 'greeting_agent' and farewells to 'farewell_agent'. "
"Handle only weather requests, greetings, and farewells.",
tools=[get_weather_stateful],
sub_agents=[greeting_agent, farewell_agent], # Reference the redefined sub-agents
output_key="last_weather_report", # Keep output_key from Step 4
before_model_callback=block_keyword_guardrail # <<< Assign the guardrail callback
)
print(f"✅ Root Agent '{root_agent_model_guardrail.name}' created with before_model_callback.")
# --- Create Runner for this Agent, Using SAME Stateful Session Service ---
# Ensure session_service_stateful exists from Step 4
if 'session_service_stateful' in globals():
runner_root_model_guardrail = Runner(
agent=root_agent_model_guardrail,
app_name=APP_NAME, # Use consistent APP_NAME
session_service=session_service_stateful # <<< Use the service from Step 4
)
print(f"✅ Runner created for guardrail agent '{runner_root_model_guardrail.agent.name}', using stateful session service.")
else:
print("❌ Cannot create runner. 'session_service_stateful' from Step 4 is missing.")
else:
print("❌ Cannot create root agent with model guardrail. One or more prerequisites are missing or failed initialization:")
if not greeting_agent: print(" - Greeting Agent")
if not farewell_agent: print(" - Farewell Agent")
if 'get_weather_stateful' not in globals(): print(" - 'get_weather_stateful' tool")
if 'block_keyword_guardrail' not in globals(): print(" - 'block_keyword_guardrail' callback")
```
______________________________________________________________________
**3. Interact to Test the Guardrail**
Let's test the guardrail's behavior. We'll use the *same session* (`SESSION_ID_STATEFUL`) as in Step 4 to show that state persists across these changes.
1. Send a normal weather request (should pass the guardrail and execute).
1. Send a request containing "BLOCK" (should be intercepted by the callback).
1. Send a greeting (should pass the root agent's guardrail, be delegated, and execute normally).
```python
# @title 3. Interact to Test the Model Input Guardrail
import asyncio # Ensure asyncio is imported
# Ensure the runner for the guardrail agent is available
if 'runner_root_model_guardrail' in globals() and runner_root_model_guardrail:
# Define the main async function for the guardrail test conversation.
# The 'await' keywords INSIDE this function are necessary for async operations.
async def run_guardrail_test_conversation():
print("\n--- Testing Model Input Guardrail ---")
# Use the runner for the agent with the callback and the existing stateful session ID
# Define a helper lambda for cleaner interaction calls
interaction_func = lambda query: call_agent_async(query,
runner_root_model_guardrail,
USER_ID_STATEFUL, # Use existing user ID
SESSION_ID_STATEFUL # Use existing session ID
)
# 1. Normal request (Callback allows, should use Fahrenheit from previous state change)
print("--- Turn 1: Requesting weather in London (expect allowed, Fahrenheit) ---")
await interaction_func("What is the weather in London?")
# 2. Request containing the blocked keyword (Callback intercepts)
print("\n--- Turn 2: Requesting with blocked keyword (expect blocked) ---")
await interaction_func("BLOCK the request for weather in Tokyo") # Callback should catch "BLOCK"
# 3. Normal greeting (Callback allows root agent, delegation happens)
print("\n--- Turn 3: Sending a greeting (expect allowed) ---")
await interaction_func("Hello again")
# --- Execute the `run_guardrail_test_conversation` async function ---
# Choose ONE of the methods below based on your environment.
# METHOD 1: Direct await (Default for Notebooks/Async REPLs)
# If your environment supports top-level await (like Colab/Jupyter notebooks),
# it means an event loop is already running, so you can directly await the function.
print("Attempting execution using 'await' (default for notebooks)...")
await run_guardrail_test_conversation()
# METHOD 2: asyncio.run (For Standard Python Scripts [.py])
# If running this code as a standard Python script from your terminal,
# the script context is synchronous. `asyncio.run()` is needed to
# create and manage an event loop to execute your async function.
# To use this method:
# 1. Comment out the `await run_guardrail_test_conversation()` line above.
# 2. Uncomment the following block:
"""
import asyncio
if __name__ == "__main__": # Ensures this runs only when script is executed directly
print("Executing using 'asyncio.run()' (for standard Python scripts)...")
try:
# This creates an event loop, runs your async function, and closes the loop.
asyncio.run(run_guardrail_test_conversation())
except Exception as e:
print(f"An error occurred: {e}")
"""
# --- Inspect final session state after the conversation ---
# This block runs after either execution method completes.
# Optional: Check state for the trigger flag set by the callback
print("\n--- Inspecting Final Session State (After Guardrail Test) ---")
# Use the session service instance associated with this stateful session
final_session = await session_service_stateful.get_session(app_name=APP_NAME,
user_id=USER_ID_STATEFUL,
session_id=SESSION_ID_STATEFUL)
if final_session:
# Use .get() for safer access
print(f"Guardrail Triggered Flag: {final_session.state.get('guardrail_block_keyword_triggered', 'Not Set (or False)')}")
print(f"Last Weather Report: {final_session.state.get('last_weather_report', 'Not Set')}") # Should be London weather if successful
print(f"Temperature Unit: {final_session.state.get('user_preference_temperature_unit', 'Not Set')}") # Should be Fahrenheit
# print(f"Full State Dict: {final_session.state}") # For detailed view
else:
print("\n❌ Error: Could not retrieve final session state.")
else:
print("\n⚠️ Skipping model guardrail test. Runner ('runner_root_model_guardrail') is not available.")
```
______________________________________________________________________
Observe the execution flow:
1. **London Weather:** The callback runs for `weather_agent_v5_model_guardrail`, inspects the message, prints "Keyword not found. Allowing LLM call.", and returns `None`. The agent proceeds, calls the `get_weather_stateful` tool (which uses the "Fahrenheit" preference from Step 4's state change), and returns the weather. This response updates `last_weather_report` via `output_key`.
1. **BLOCK Request:** The callback runs again for `weather_agent_v5_model_guardrail`, inspects the message, finds "BLOCK", prints "Blocking LLM call!", sets the state flag, and returns the predefined `LlmResponse`. The agent's underlying LLM is *never called* for this turn. The user sees the callback's blocking message.
1. **Hello Again:** The callback runs for `weather_agent_v5_model_guardrail`, allows the request. The root agent then delegates to `greeting_agent`. *Note: The `before_model_callback` defined on the root agent does NOT automatically apply to sub-agents.* The `greeting_agent` proceeds normally, calls its `say_hello` tool, and returns the greeting.
You have successfully implemented an input safety layer! The `before_model_callback` provides a powerful mechanism to enforce rules and control agent behavior *before* expensive or potentially risky LLM calls are made. Next, we'll apply a similar concept to add guardrails around tool usage itself.
## Step 6: Adding Safety - Tool Argument Guardrail (`before_tool_callback`)
In Step 5, we added a guardrail to inspect and potentially block user input *before* it reached the LLM. Now, we'll add another layer of control *after* the LLM has decided to use a tool but *before* that tool actually executes. This is useful for validating the *arguments* the LLM wants to pass to the tool.
ADK provides the `before_tool_callback` for this precise purpose.
**What is `before_tool_callback`?**
- It's a Python function executed just *before* a specific tool function runs, after the LLM has requested its use and decided on the arguments.
- **Purpose:** Validate tool arguments, prevent tool execution based on specific inputs, modify arguments dynamically, or enforce resource usage policies.
**Common Use Cases:**
- **Argument Validation:** Check if arguments provided by the LLM are valid, within allowed ranges, or conform to expected formats.
- **Resource Protection:** Prevent tools from being called with inputs that might be costly, access restricted data, or cause unwanted side effects (e.g., blocking API calls for certain parameters).
- **Dynamic Argument Modification:** Adjust arguments based on session state or other contextual information before the tool runs.
**How it Works:**
1. Define a function accepting `tool: BaseTool`, `args: Dict[str, Any]`, and `tool_context: ToolContext`.
- `tool`: The tool object about to be called (inspect `tool.name`).
- `args`: The dictionary of arguments the LLM generated for the tool.
- `tool_context`: Provides access to session state (`tool_context.state`), agent info, etc.
1. Inside the function:
- **Inspect:** Examine the `tool.name` and the `args` dictionary.
- **Modify:** Change values within the `args` dictionary *directly*. If you return `None`, the tool runs with these modified args.
- **Block/Override (Guardrail):** Return a **dictionary**. ADK treats this dictionary as the *result* of the tool call, completely *skipping* the execution of the original tool function. The dictionary should ideally match the expected return format of the tool it's blocking.
- **Allow:** Return `None`. ADK proceeds to execute the actual tool function with the (potentially modified) arguments.
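The blocking path is demonstrated in the tutorial code below, but the "Modify" path is not, so here is a minimal hedged sketch: a callback that normalizes the `city` argument in place and returns `None` so the real tool still runs with the cleaned value. The `tool` and `tool_context` parameters are duck-typed stubs here for illustration; in ADK they would be `BaseTool` and `ToolContext`, and `normalize_city_args` is a hypothetical helper name.

```python
from typing import Any, Dict, Optional

def normalize_city_args(tool, args: Dict[str, Any], tool_context) -> Optional[Dict]:
    """Illustrative before_tool_callback: tidy the 'city' arg, then allow the call."""
    if tool.name == "get_weather_stateful" and isinstance(args.get("city"), str):
        # Mutate the args dict directly; the tool will see the cleaned value.
        args["city"] = args["city"].strip().title()
    return None  # None => ADK executes the real tool with the (modified) args

# Quick check with a stand-in tool object (no ADK needed for the logic itself):
class _StubTool:
    name = "get_weather_stateful"

args = {"city": "  new york  "}
result = normalize_city_args(_StubTool(), args, tool_context=None)
print(args["city"])
```

Because the dictionary is mutated in place and `None` is returned, the actual tool function receives `"New York"` rather than the raw LLM-generated string.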
**In this step, we will:**
1. Define a `before_tool_callback` function (`block_paris_tool_guardrail`) that specifically checks if the `get_weather_stateful` tool is called with the city "Paris".
1. If "Paris" is detected, the callback will block the tool and return a custom error dictionary.
1. Update our root agent (`weather_agent_v6_tool_guardrail`) to include *both* the `before_model_callback` and this new `before_tool_callback`.
1. Create a new runner for this agent, using the same stateful session service.
1. Test the flow by requesting weather for allowed cities and the blocked city ("Paris").
______________________________________________________________________
**1. Define the Tool Guardrail Callback Function**
This function targets the `get_weather_stateful` tool. It checks the `city` argument. If it's "Paris", it returns an error dictionary that looks like the tool's own error response. Otherwise, it allows the tool to run by returning `None`.
```python
# @title 1. Define the before_tool_callback Guardrail
# Ensure necessary imports are available
from google.adk.tools.base_tool import BaseTool
from google.adk.tools.tool_context import ToolContext
from typing import Optional, Dict, Any # For type hints
def block_paris_tool_guardrail(
tool: BaseTool, args: Dict[str, Any], tool_context: ToolContext
) -> Optional[Dict]:
"""
Checks if 'get_weather_stateful' is called for 'Paris'.
If so, blocks the tool execution and returns a specific error dictionary.
Otherwise, allows the tool call to proceed by returning None.
"""
tool_name = tool.name
agent_name = tool_context.agent_name # Agent attempting the tool call
print(f"--- Callback: block_paris_tool_guardrail running for tool '{tool_name}' in agent '{agent_name}' ---")
print(f"--- Callback: Inspecting args: {args} ---")
# --- Guardrail Logic ---
target_tool_name = "get_weather_stateful" # Match the function name used by FunctionTool
blocked_city = "paris"
# Check if it's the correct tool and the city argument matches the blocked city
if tool_name == target_tool_name:
city_argument = args.get("city", "") # Safely get the 'city' argument
if city_argument and city_argument.lower() == blocked_city:
print(f"--- Callback: Detected blocked city '{city_argument}'. Blocking tool execution! ---")
# Optionally update state
tool_context.state["guardrail_tool_block_triggered"] = True
print(f"--- Callback: Set state 'guardrail_tool_block_triggered': True ---")
# Return a dictionary matching the tool's expected output format for errors
# This dictionary becomes the tool's result, skipping the actual tool run.
return {
"status": "error",
"error_message": f"Policy restriction: Weather checks for '{city_argument.capitalize()}' are currently disabled by a tool guardrail."
}
else:
print(f"--- Callback: City '{city_argument}' is allowed for tool '{tool_name}'. ---")
else:
print(f"--- Callback: Tool '{tool_name}' is not the target tool. Allowing. ---")
# If the checks above didn't return a dictionary, allow the tool to execute
print(f"--- Callback: Allowing tool '{tool_name}' to proceed. ---")
return None # Returning None allows the actual tool function to run
print("✅ block_paris_tool_guardrail function defined.")
```
______________________________________________________________________
**2. Update Root Agent to Use Both Callbacks**
We redefine the root agent again (`weather_agent_v6_tool_guardrail`), this time adding the `before_tool_callback` parameter alongside the `before_model_callback` from Step 5.
*Self-Contained Execution Note:* Similar to Step 5, ensure all prerequisites (sub-agents, tools, `before_model_callback`) are defined or available in the execution context before defining this agent.
```python
# @title 2. Update Root Agent with BOTH Callbacks (Self-Contained)
# --- Ensure Prerequisites are Defined ---
# (Include or ensure execution of definitions for: Agent, LiteLlm, Runner, ToolContext,
# MODEL constants, say_hello, say_goodbye, greeting_agent, farewell_agent,
# get_weather_stateful, block_keyword_guardrail, block_paris_tool_guardrail)
# --- Redefine Sub-Agents (Ensures they exist in this context) ---
greeting_agent = None
try:
# Use a defined model constant
greeting_agent = Agent(
model=MODEL_GEMINI_2_5_FLASH,
name="greeting_agent", # Keep original name for consistency
instruction="You are the Greeting Agent. Your ONLY task is to provide a friendly greeting using the 'say_hello' tool. Do nothing else.",
description="Handles simple greetings and hellos using the 'say_hello' tool.",
tools=[say_hello],
)
print(f"✅ Sub-Agent '{greeting_agent.name}' redefined.")
except Exception as e:
print(f"❌ Could not redefine Greeting agent. Check Model/API Key ({greeting_agent.model}). Error: {e}")
farewell_agent = None
try:
# Use a defined model constant
farewell_agent = Agent(
model=MODEL_GEMINI_2_5_FLASH,
name="farewell_agent", # Keep original name
instruction="You are the Farewell Agent. Your ONLY task is to provide a polite goodbye message using the 'say_goodbye' tool. Do not perform any other actions.",
description="Handles simple farewells and goodbyes using the 'say_goodbye' tool.",
tools=[say_goodbye],
)
print(f"✅ Sub-Agent '{farewell_agent.name}' redefined.")
except Exception as e:
print(f"❌ Could not redefine Farewell agent. Check Model/API Key ({farewell_agent.model}). Error: {e}")
# --- Define the Root Agent with Both Callbacks ---
root_agent_tool_guardrail = None
runner_root_tool_guardrail = None
if ('greeting_agent' in globals() and greeting_agent and
'farewell_agent' in globals() and farewell_agent and
'get_weather_stateful' in globals() and
'block_keyword_guardrail' in globals() and
'block_paris_tool_guardrail' in globals()):
root_agent_model = MODEL_GEMINI_2_5_FLASH
root_agent_tool_guardrail = Agent(
name="weather_agent_v6_tool_guardrail", # New version name
model=root_agent_model,
description="Main agent: Handles weather, delegates, includes input AND tool guardrails.",
instruction="You are the main Weather Agent. Provide weather using 'get_weather_stateful'. "
"Delegate greetings to 'greeting_agent' and farewells to 'farewell_agent'. "
"Handle only weather, greetings, and farewells.",
tools=[get_weather_stateful],
sub_agents=[greeting_agent, farewell_agent],
output_key="last_weather_report",
before_model_callback=block_keyword_guardrail, # Keep model guardrail
before_tool_callback=block_paris_tool_guardrail # <<< Add tool guardrail
)
print(f"✅ Root Agent '{root_agent_tool_guardrail.name}' created with BOTH callbacks.")
# --- Create Runner, Using SAME Stateful Session Service ---
if 'session_service_stateful' in globals():
runner_root_tool_guardrail = Runner(
agent=root_agent_tool_guardrail,
app_name=APP_NAME,
session_service=session_service_stateful # <<< Use the service from Step 4/5
)
print(f"✅ Runner created for tool guardrail agent '{runner_root_tool_guardrail.agent.name}', using stateful session service.")
else:
print("❌ Cannot create runner. 'session_service_stateful' from Step 4/5 is missing.")
else:
print("❌ Cannot create root agent with tool guardrail. Prerequisites missing.")
```
______________________________________________________________________
**3. Interact to Test the Tool Guardrail**
Let's test the interaction flow, again using the same stateful session (`SESSION_ID_STATEFUL`) from the previous steps.
1. Request weather for "New York": Passes both callbacks, tool executes (using Fahrenheit preference from state).
1. Request weather for "Paris": Passes `before_model_callback`. LLM decides to call `get_weather_stateful(city='Paris')`. `before_tool_callback` intercepts, blocks the tool, and returns the error dictionary. Agent relays this error.
1. Request weather for "London": Passes both callbacks, tool executes normally.
```python
# @title 3. Interact to Test the Tool Argument Guardrail
import asyncio # Ensure asyncio is imported
# Ensure the runner for the tool guardrail agent is available
if 'runner_root_tool_guardrail' in globals() and runner_root_tool_guardrail:
# Define the main async function for the tool guardrail test conversation.
# The 'await' keywords INSIDE this function are necessary for async operations.
async def run_tool_guardrail_test():
print("\n--- Testing Tool Argument Guardrail ('Paris' blocked) ---")
# Use the runner for the agent with both callbacks and the existing stateful session
# Define a helper lambda for cleaner interaction calls
interaction_func = lambda query: call_agent_async(query,
runner_root_tool_guardrail,
USER_ID_STATEFUL, # Use existing user ID
SESSION_ID_STATEFUL # Use existing session ID
)
# 1. Allowed city (Should pass both callbacks, use Fahrenheit state)
print("--- Turn 1: Requesting weather in New York (expect allowed) ---")
await interaction_func("What's the weather in New York?")
# 2. Blocked city (Should pass model callback, but be blocked by tool callback)
print("\n--- Turn 2: Requesting weather in Paris (expect blocked by tool guardrail) ---")
await interaction_func("How about Paris?") # Tool callback should intercept this
# 3. Another allowed city (Should work normally again)
print("\n--- Turn 3: Requesting weather in London (expect allowed) ---")
await interaction_func("Tell me the weather in London.")
# --- Execute the `run_tool_guardrail_test` async function ---
# Choose ONE of the methods below based on your environment.
# METHOD 1: Direct await (Default for Notebooks/Async REPLs)
# If your environment supports top-level await (like Colab/Jupyter notebooks),
# it means an event loop is already running, so you can directly await the function.
print("Attempting execution using 'await' (default for notebooks)...")
await run_tool_guardrail_test()
# METHOD 2: asyncio.run (For Standard Python Scripts [.py])
# If running this code as a standard Python script from your terminal,
# the script context is synchronous. `asyncio.run()` is needed to
# create and manage an event loop to execute your async function.
# To use this method:
# 1. Comment out the `await run_tool_guardrail_test()` line above.
# 2. Uncomment the following block:
"""
import asyncio
if __name__ == "__main__": # Ensures this runs only when script is executed directly
print("Executing using 'asyncio.run()' (for standard Python scripts)...")
try:
# This creates an event loop, runs your async function, and closes the loop.
asyncio.run(run_tool_guardrail_test())
except Exception as e:
print(f"An error occurred: {e}")
"""
# --- Inspect final session state after the conversation ---
# This block runs after either execution method completes.
# Optional: Check state for the tool block trigger flag
print("\n--- Inspecting Final Session State (After Tool Guardrail Test) ---")
# Use the session service instance associated with this stateful session
final_session = await session_service_stateful.get_session(app_name=APP_NAME,
user_id=USER_ID_STATEFUL,
session_id= SESSION_ID_STATEFUL)
if final_session:
# Use .get() for safer access
print(f"Tool Guardrail Triggered Flag: {final_session.state.get('guardrail_tool_block_triggered', 'Not Set (or False)')}")
print(f"Last Weather Report: {final_session.state.get('last_weather_report', 'Not Set')}") # Should be London weather if successful
print(f"Temperature Unit: {final_session.state.get('user_preference_temperature_unit', 'Not Set')}") # Should be Fahrenheit
# print(f"Full State Dict: {final_session.state}") # For detailed view
else:
print("\n❌ Error: Could not retrieve final session state.")
else:
print("\n⚠️ Skipping tool guardrail test. Runner ('runner_root_tool_guardrail') is not available.")
```
______________________________________________________________________
Analyze the output:
1. **New York:** The `before_model_callback` allows the request. The LLM requests `get_weather_stateful`. The `before_tool_callback` runs, inspects the args (`{'city': 'New York'}`), sees it's not "Paris", prints "Allowing tool..." and returns `None`. The actual `get_weather_stateful` function executes, reads "Fahrenheit" from state, and returns the weather report. The agent relays this, and it gets saved via `output_key`.
1. **Paris:** The `before_model_callback` allows the request. The LLM requests `get_weather_stateful(city='Paris')`. The `before_tool_callback` runs, inspects the args, detects "Paris", prints "Blocking tool execution!", sets the state flag, and returns the error dictionary `{'status': 'error', 'error_message': 'Policy restriction...'}`. The actual `get_weather_stateful` function is **never executed**. The agent receives the error dictionary *as if it were the tool's output* and formulates a response based on that error message.
1. **London:** Behaves like New York, passing both callbacks and executing the tool successfully. The new London weather report overwrites the `last_weather_report` in the state.
You've now added a crucial safety layer controlling not just *what* reaches the LLM, but also *how* the agent's tools can be used based on the specific arguments generated by the LLM. Callbacks like `before_model_callback` and `before_tool_callback` are essential for building robust, safe, and policy-compliant agent applications.
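Stripped of ADK specifics, the pattern that `before_tool_callback` implements is small enough to sketch framework-free: run the guard first; if it returns a dictionary, use that dictionary as the tool result; if it returns `None`, call the real tool. The following is an illustrative sketch with simplified, hypothetical signatures, not the actual ADK API:

```python
from typing import Callable, Optional

def run_tool_with_guard(
    tool: Callable[..., dict],
    args: dict,
    guard: Callable[[str, dict], Optional[dict]],
) -> dict:
    """Before-tool guardrail pattern: a non-None guard result replaces the tool call."""
    override = guard(tool.__name__, args)
    if override is not None:
        return override      # The guard's dict becomes the "tool result".
    return tool(**args)      # Guard returned None: run the real tool.

def get_weather(city: str) -> dict:
    # Stand-in for the real weather tool.
    return {"status": "success", "report": f"Sunny in {city}."}

def block_paris(tool_name: str, args: dict) -> Optional[dict]:
    # Same logic as the tutorial's guardrail, minus the ADK context objects.
    if tool_name == "get_weather" and args.get("city", "").lower() == "paris":
        return {"status": "error", "error_message": "Weather checks for Paris are disabled."}
    return None

print(run_tool_with_guard(get_weather, {"city": "London"}, block_paris))
print(run_tool_with_guard(get_weather, {"city": "Paris"}, block_paris))
```

The key design point carries over directly: because the guard's return value is shaped exactly like a tool result, the calling agent cannot tell the difference between a blocked call and a tool that genuinely returned an error.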
______________________________________________________________________
## Conclusion: Your Agent Team is Ready!
Congratulations! You've successfully journeyed from building a single, basic weather agent to constructing a sophisticated, multi-agent team using the Agent Development Kit (ADK).
**Let's recap what you've accomplished:**
- You started with a **fundamental agent** equipped with a single tool (`get_weather`).
- You explored ADK's **multi-model flexibility** using LiteLLM, running the same core logic with different LLMs like Gemini, GPT-4o, and Claude.
- You embraced **modularity** by creating specialized sub-agents (`greeting_agent`, `farewell_agent`) and enabling **automatic delegation** from a root agent.
- You gave your agents **memory** using **Session State**, allowing them to remember user preferences (`temperature_unit`) and past interactions (`output_key`).
- You implemented crucial **safety guardrails** using both `before_model_callback` (blocking specific input keywords) and `before_tool_callback` (blocking tool execution based on arguments like the city "Paris").
Through building this progressive Weather Bot team, you've gained hands-on experience with core ADK concepts essential for developing complex, intelligent applications.
**Key Takeaways:**
- **Agents & Tools:** The fundamental building blocks for defining capabilities and reasoning. Clear instructions and docstrings are paramount.
- **Runners & Session Services:** The engine and memory management system that orchestrate agent execution and maintain conversational context.
- **Delegation:** Designing multi-agent teams allows for specialization, modularity, and better management of complex tasks. Agent `description` is key for auto-flow.
- **Session State (`ToolContext`, `output_key`):** Essential for creating context-aware, personalized, and multi-turn conversational agents.
- **Callbacks (`before_model`, `before_tool`):** Powerful hooks for implementing safety, validation, policy enforcement, and dynamic modifications *before* critical operations (LLM calls or tool execution).
- **Flexibility (`LiteLlm`):** ADK empowers you to choose the best LLM for the job, balancing performance, cost, and features.
**Where to Go Next?**
Your Weather Bot team is a great starting point. Here are some ideas to further explore ADK and enhance your application:
1. **Real Weather API:** Replace the `mock_weather_db` in your `get_weather` tool with a call to a real weather API (like OpenWeatherMap, WeatherAPI).
1. **More Complex State:** Store more user preferences (e.g., preferred location, notification settings) or conversation summaries in the session state.
1. **Refine Delegation:** Experiment with different root agent instructions or sub-agent descriptions to fine-tune the delegation logic. Could you add a "forecast" agent?
1. **Advanced Callbacks:**
- Use `after_model_callback` to potentially reformat or sanitize the LLM's response *after* it's generated.
- Use `after_tool_callback` to process or log the results returned by a tool.
- Implement `before_agent_callback` or `after_agent_callback` for agent-level entry/exit logic.
1. **Error Handling:** Improve how the agent handles tool errors or unexpected API responses. Maybe add retry logic within a tool.
1. **Persistent Session Storage:** Explore alternatives to `InMemorySessionService` for storing session state persistently (e.g., using databases like Firestore or Cloud SQL – requires custom implementation or future ADK integrations).
1. **Streaming UI:** Integrate your agent team with a web framework (like FastAPI, as shown in the ADK Streaming Quickstart) to create a real-time chat interface.
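The first idea above, replacing the mock with a live API, can be sketched as follows. The endpoint URL and response fields are hypothetical, and the injectable `fetch` parameter is an assumption made here so the tool stays unit-testable without network access:

```python
import json
from typing import Callable, Optional
from urllib.request import urlopen

def get_weather(city: str, fetch: Optional[Callable[[str], dict]] = None) -> dict:
    """Weather tool backed by an HTTP API instead of a mock dictionary.

    The endpoint and response fields below are placeholders; swap in the
    real provider's URL, query parameters, and field names.
    """
    url = f"https://api.example-weather.invalid/v1/current?city={city}"  # placeholder URL
    if fetch is None:
        def fetch(u: str) -> dict:
            with urlopen(u) as resp:  # real network call in the default path
                return json.load(resp)
    try:
        data = fetch(url)
        return {"status": "success",
                "report": f"{data['condition']} in {city}, {data['temp_c']}°C."}
    except Exception as e:
        # Keep the same error shape the tutorial's tools use.
        return {"status": "error", "error_message": f"Weather lookup failed: {e}"}

# Unit-style check with a canned response (no network):
fake_fetch = lambda url: {"condition": "Cloudy", "temp_c": 12}
print(get_weather("London", fetch=fake_fetch))
```

Returning the same `{"status": ..., "report"/"error_message": ...}` shape as the mock version means the rest of the agent, including the guardrail callbacks, needs no changes.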
The Agent Development Kit provides a robust foundation for building sophisticated LLM-powered applications. By mastering the concepts covered in this tutorial – tools, state, delegation, and callbacks – you are well-equipped to tackle increasingly complex agentic systems.
Happy building!
# Coding with AI
The Agent Development Kit (ADK) documentation supports the [`/llms.txt` standard](https://llmstxt.org/), providing a machine-readable index of the documentation optimized for Large Language Models (LLMs). This allows you to easily use the ADK documentation as context in your AI-powered development environment.
## What is llms.txt?
`llms.txt` is a standardized text file that acts as a map for LLMs, listing the most important documentation pages and their descriptions. This helps AI tools understand the structure of the ADK documentation and retrieve relevant information to answer your questions.
The ADK documentation provides the following files that are automatically generated with every update:
| File | Best For... | URL |
| ------------------- | ------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------- |
| **`llms.txt`** | Tools that can fetch links dynamically | [`https://google.github.io/adk-docs/llms.txt`](https://google.github.io/adk-docs/llms.txt) |
| **`llms-full.txt`** | Tools that need a single, static text dump of the entire site | [`https://google.github.io/adk-docs/llms-full.txt`](https://google.github.io/adk-docs/llms-full.txt) |
## Usage in Development Tools
You can use these files to power your AI coding assistants with ADK knowledge. This functionality allows your agents to autonomously search and read the ADK documentation while planning tasks and generating code.
### Gemini CLI
The [Gemini CLI](https://geminicli.com/) can be configured to query the ADK documentation using the [ADK Docs Extension](https://github.com/derailed-dash/adk-docs-ext).
**Installation:**
To install the extension, run the following command:
```bash
gemini extensions install https://github.com/derailed-dash/adk-docs-ext
```
**Usage:**
Once installed, the extension is automatically enabled. You can ask questions about ADK directly in the Gemini CLI, and it will use the `llms.txt` file and ADK documentation to provide accurate answers and generate code.
For example, you can ask the following question from within Gemini CLI:
> How do I create a function tool using Agent Development Kit?
______________________________________________________________________
### Antigravity
The [Antigravity](https://antigravity.google/) IDE can be configured to access the ADK documentation by running a custom MCP server that points to the `llms.txt` file for ADK.
**Prerequisites:**
Ensure you have the [`uv`](https://docs.astral.sh/uv/) tool installed, as this configuration uses `uvx` to run the documentation server without manual installation.
**Configuration:**
1. Open the MCP store via the **...** (more) menu at the top of the editor's agent panel.
1. Click on **Manage MCP Servers**.
1. Click on **View raw config**.
1. Add the following entry to `mcp_config.json` with your custom MCP server configuration. If this is your first MCP server, you can paste the entire code block:
```json
{
"mcpServers": {
"adk-docs-mcp": {
"command": "uvx",
"args": [
"--from",
"mcpdoc",
"mcpdoc",
"--urls",
"AgentDevelopmentKit:https://google.github.io/adk-docs/llms.txt",
"--transport",
"stdio"
]
}
}
}
```
Refer to the [Antigravity MCP documentation](https://antigravity.google/docs/mcp) for more information on managing MCP servers.
**Usage:**
Once configured, you can prompt the coding agent with instructions like:
> Use the ADK docs to build a multi-tool agent that uses Gemini 2.5 Pro and includes a mock weather lookup tool and a custom calculator tool. Verify the agent using `adk run`.
______________________________________________________________________
### Claude Code
[Claude Code](https://code.claude.com/docs/en/overview) can be configured to query the ADK documentation by adding an [MCP server](https://code.claude.com/docs/en/mcp).
**Installation:**
To add an MCP server for the ADK docs to Claude Code, run the following command:
```bash
claude mcp add adk-docs --transport stdio -- uvx --from mcpdoc mcpdoc --urls AgentDevelopmentKit:https://google.github.io/adk-docs/llms.txt --transport stdio
```
**Usage:**
Once installed, the MCP server is automatically enabled. You can ask questions about ADK directly in Claude Code, and it will use the `llms.txt` file and ADK documentation to provide accurate answers and generate code.
For example, you can ask the following question from within Claude Code:
> How do I create a function tool using Agent Development Kit?
______________________________________________________________________
### Cursor
The [Cursor](https://cursor.com/) IDE can be configured to access the ADK documentation by running a custom MCP server that points to the `llms.txt` file for ADK.
**Prerequisites:**
Ensure you have the [`uv`](https://docs.astral.sh/uv/) tool installed, as this configuration uses `uvx` to run the documentation server without manual installation.
**Configuration:**
1. Open **Cursor Settings** and navigate to the **Tools & MCP** tab.
1. Click on **New MCP Server**, which will open `mcp.json` for editing.
1. Add the following entry to `mcp.json` with your custom MCP server configuration. If this is your first MCP server, you can paste the entire code block:
```json
{
"mcpServers": {
"adk-docs-mcp": {
"command": "uvx",
"args": [
"--from",
"mcpdoc",
"mcpdoc",
"--urls",
"AgentDevelopmentKit:https://google.github.io/adk-docs/llms.txt",
"--transport",
"stdio"
]
}
}
}
```
Refer to the [Cursor MCP documentation](https://cursor.com/docs/context/mcp) for more information on managing MCP servers.
**Usage:**
Once configured, you can prompt the coding agent with instructions like:
> Use the ADK docs to build a multi-tool agent that uses Gemini 2.5 Pro and includes a mock weather lookup tool and a custom calculator tool. Verify the agent using `adk run`.
______________________________________________________________________
### Other Tools
Any tool that supports the `llms.txt` standard or can ingest documentation from a URL can benefit from these files. You can provide the URL `https://google.github.io/adk-docs/llms.txt` (or `llms-full.txt`) to your tool's knowledge base configuration or MCP server configuration.
# Visual Builder for agents
Supported in ADK: Python v1.18.0 (Experimental)
The ADK Visual Builder is a web-based tool that provides a visual workflow design environment for creating and managing ADK agents. It allows you to design, build, and test your agents in a beginner-friendly graphical interface, and includes an AI-powered assistant to help you build agents.
Experimental
The Visual Builder feature is an experimental release. We welcome your [feedback](https://github.com/google/adk-python/issues/new?template=feature_request.md)!
## Get started
The Visual Builder interface is part of the ADK Web tool user interface. Make sure you have the ADK library [installed](/adk-docs/get-started/installation/#python), and then run the ADK Web user interface:
```console
adk web --port 8000
```
Tip: Run from a code development directory
The Visual Builder tool writes project files to new subdirectories of the directory where you run the ADK Web tool. Make sure you run this command from a development directory where you have write access.
**Figure 1:** ADK Web controls to start the Visual Builder tool.
To create an agent with Visual Builder:
1. In the top left of the page, select the **+** (plus sign), as shown in *Figure 1*, to start creating an agent.
1. Type a name for your agent application and select **Create**.
1. Edit your agent by doing any of the following:
   - In the left panel, edit agent component values.
   - In the central panel, add new agent components.
   - In the right panel, use prompts to modify the agent or get help.
1. In the bottom left corner, select **Save** to save your agent.
1. Interact with your new agent to test it.
1. In the top left of the page, select the pencil icon, as shown in *Figure 1*, to continue editing your agent.
Here are a few things to note when using Visual Builder:
- **Create agent and save:** When creating an agent, make sure you select **Save** before exiting the editing interface; otherwise, your new agent may not be editable.
- **Agent editing:** Editing (pencil icon) is available *only* for agents created with Visual Builder.
- **Add tools:** When adding existing custom tools to a Visual Builder agent, specify a fully qualified Python function name.
## Workflow component support
The Visual Builder tool provides a drag-and-drop user interface for constructing agents, as well as an AI-powered development Assistant that can answer questions and edit your agent workflow. The tool supports all the essential components for building an ADK agent workflow, including:
- **Agents**
- **Root Agent**: The primary controlling agent for a workflow. All other agents in an ADK agent workflow are considered Sub Agents.
- [**LLM Agent:**](/adk-docs/agents/llm-agents/) An agent powered by a generative AI model.
- [**Sequential Agent:**](/adk-docs/agents/workflow-agents/sequential-agents/) A workflow agent that executes a series of sub-agents in a sequence.
- [**Loop Agent:**](/adk-docs/agents/workflow-agents/loop-agents/) A workflow agent that repeatedly executes a sub-agent until a certain condition is met.
- [**Parallel Agent:**](/adk-docs/agents/workflow-agents/parallel-agents/) A workflow agent that executes multiple sub-agents concurrently.
- **Tools**
- [**Prebuilt tools:**](/adk-docs/tools/built-in-tools/) A limited set of ADK-provided tools can be added to agents.
- [**Custom tools:**](/adk-docs/tools-custom/) You can build and add custom tools to your workflow.
- **Components**
- [**Callbacks:**](/adk-docs/callbacks/) A flow control component that lets you modify the behavior of agents at the start and end of agent workflow events.
Some advanced ADK features are not supported by Visual Builder due to limitations of the Agent Config feature. For more information, see the Agent Config [Known limitations](/adk-docs/agents/config/#known-limitations).
## Project code output
The Visual Builder tool generates code in the [Agent Config](/adk-docs/agents/config/) format, using `.yaml` configuration files for agents and Python code for custom tools. These files are generated in a subfolder of the directory where you ran the ADK Web interface. The following listing shows an example layout for a DiceAgent project:
```text
DiceAgent/
root_agent.yaml # main agent code
sub_agent_1.yaml # sub agents (if any)
tools/ # tools directory
__init__.py
dice_tool.py # tool code
```
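As an illustration of the split between YAML agent definitions and Python tool code, the `dice_tool.py` file in the layout above might contain an ordinary Python function like the following. This is a hypothetical implementation; Visual Builder would reference it by its fully qualified name, such as `DiceAgent.tools.dice_tool.roll_dice`:

```python
import random

def roll_dice(sides: int = 6) -> dict:
    """Rolls a single die with the given number of sides.

    Returns a dict, the conventional shape for ADK tool results.
    """
    if sides < 2:
        return {"status": "error", "error_message": "A die needs at least 2 sides."}
    return {"status": "success", "roll": random.randint(1, sides)}

print(roll_dice())
```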
Editing generated agents
You can edit the generated files in your development environment. However, some changes may not be compatible with Visual Builder.
## Next steps
Using the Visual Builder development Assistant, try building a new agent using this prompt:
```text
Help me add a dice roll tool to my current agent.
Use the default model if you need to configure that.
```
Check out more information on the Agent Config code format used by Visual Builder and the available options:
- [Agent Config](/adk-docs/agents/config/)
- [Agent Config YAML schema](/adk-docs/api-reference/agentconfig/)
# Agents
Supported in ADK: Python, TypeScript, Go, Java
In Agent Development Kit (ADK), an **Agent** is a self-contained execution unit designed to act autonomously to achieve specific goals. Agents can perform tasks, interact with users, utilize external tools, and coordinate with other agents.
The foundation for all agents in ADK is the `BaseAgent` class. It serves as the fundamental blueprint. To create functional agents, you typically extend `BaseAgent` in one of three main ways, catering to different needs – from intelligent reasoning to structured process control.
## Core Agent Categories
ADK provides distinct agent categories to build sophisticated applications:
1. [**LLM Agents (`LlmAgent`, `Agent`)**](https://google.github.io/adk-docs/agents/llm-agents/index.md): These agents utilize Large Language Models (LLMs) as their core engine to understand natural language, reason, plan, generate responses, and dynamically decide how to proceed or which tools to use, making them ideal for flexible, language-centric tasks. [Learn more about LLM Agents...](https://google.github.io/adk-docs/agents/llm-agents/index.md)
1. [**Workflow Agents (`SequentialAgent`, `ParallelAgent`, `LoopAgent`)**](https://google.github.io/adk-docs/agents/workflow-agents/index.md): These specialized agents control the execution flow of other agents in predefined, deterministic patterns (sequence, parallel, or loop) without using an LLM for the flow control itself, perfect for structured processes needing predictable execution. [Explore Workflow Agents...](https://google.github.io/adk-docs/agents/workflow-agents/index.md)
1. [**Custom Agents**](https://google.github.io/adk-docs/agents/custom-agents/index.md): Created by extending `BaseAgent` directly, these agents allow you to implement unique operational logic, specific control flows, or specialized integrations not covered by the standard types, catering to highly tailored application requirements. [Discover how to build Custom Agents...](https://google.github.io/adk-docs/agents/custom-agents/index.md)
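To make the distinction concrete, here is a framework-free sketch of what the "workflow" category means: deterministic orchestration with no model in the control loop. The toy `ToyAgent` type and `sequential` helper below are illustrative only, not the ADK API:

```python
from typing import Callable

# Toy agent type for illustration: text in, text out.
ToyAgent = Callable[[str], str]

def sequential(sub_agents: list[ToyAgent]) -> ToyAgent:
    """Deterministic 'sequential' pattern: pipe each agent's output to the next."""
    def run(text: str) -> str:
        for agent in sub_agents:
            text = agent(text)
        return text
    return run

summarize = lambda s: s.split(".")[0] + "."   # stand-in for an LLM agent
shout = lambda s: s.upper()                   # stand-in for a post-processing agent

pipeline = sequential([summarize, shout])
print(pipeline("First sentence. Second sentence."))  # → FIRST SENTENCE.
```

The control flow here is fixed code, so the pipeline behaves identically on every run; only the sub-agents themselves (which could be LLM-backed) introduce non-determinism.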
## Choosing the Right Agent Type
The following table provides a high-level comparison to help distinguish between the agent types. As you explore each type in more detail in the subsequent sections, these distinctions will become clearer.
| Feature | LLM Agent (`LlmAgent`) | Workflow Agent | Custom Agent (`BaseAgent` subclass) |
| -------------------- | --------------------------------- | ------------------------------------------- | ----------------------------------------- |
| **Primary Function** | Reasoning, Generation, Tool Use | Controlling Agent Execution Flow | Implementing Unique Logic/Integrations |
| **Core Engine** | Large Language Model (LLM) | Predefined Logic (Sequence, Parallel, Loop) | Custom Code |
| **Determinism** | Non-deterministic (Flexible) | Deterministic (Predictable) | Can be either, based on implementation |
| **Primary Use** | Language tasks, Dynamic decisions | Structured processes, Orchestration | Tailored requirements, Specific workflows |
## Agents Working Together: Multi-Agent Systems
While each agent type serves a distinct purpose, the true power often comes from combining them. Complex applications frequently employ [multi-agent architectures](https://google.github.io/adk-docs/agents/multi-agents/index.md) where:
- **LLM Agents** handle intelligent, language-based task execution.
- **Workflow Agents** manage the overall process flow using standard patterns.
- **Custom Agents** provide specialized capabilities or rules needed for unique integrations.
Understanding these core types is the first step toward building sophisticated, capable AI applications with ADK.
______________________________________________________________________
## What's Next?
Now that you have an overview of the different agent types available in ADK, dive deeper into how they work and how to use them effectively:
- [**LLM Agents:**](https://google.github.io/adk-docs/agents/llm-agents/index.md) Explore how to configure agents powered by large language models, including setting instructions, providing tools, and enabling advanced features like planning and code execution.
- [**Workflow Agents:**](https://google.github.io/adk-docs/agents/workflow-agents/index.md) Learn how to orchestrate tasks using `SequentialAgent`, `ParallelAgent`, and `LoopAgent` for structured and predictable processes.
- [**Custom Agents:**](https://google.github.io/adk-docs/agents/custom-agents/index.md) Discover the principles of extending `BaseAgent` to build agents with unique logic and integrations tailored to your specific needs.
- [**Multi-Agents:**](https://google.github.io/adk-docs/agents/multi-agents/index.md) Understand how to combine different agent types to create sophisticated, collaborative systems capable of tackling complex problems.
- [**Models:**](/adk-docs/agents/models/) Learn about the different LLM integrations available and how to select the right model for your agents.
# Build agents with Agent Config
Supported in ADK: Python v1.11.0 (Experimental)
The ADK Agent Config feature lets you build an ADK workflow without writing code. An Agent Config is a YAML text file that briefly describes an agent, allowing just about anyone to assemble and run an ADK agent. The following is a simple example of a basic Agent Config definition:
```yaml
name: assistant_agent
model: gemini-2.5-flash
description: A helper agent that can answer users' questions.
instruction: You are an agent to help answer users' various questions.
```
You can use Agent Config files to build more complex agents that incorporate functions, tools, sub-agents, and more. This page describes how to build and run ADK workflows with the Agent Config feature. For detailed information on the syntax and settings supported by the Agent Config format, see the [Agent Config syntax reference](/adk-docs/api-reference/agentconfig/).
Experimental
The Agent Config feature is experimental and has some [known limitations](#known-limitations). We welcome your [feedback](https://github.com/google/adk-python/issues/new?template=feature_request.md&labels=agent%20config)!
## Get started
This section describes how to set up and start building agents with the ADK and the Agent Config feature, including installation setup, building an agent, and running your agent.
### Setup
You need to install the Google Agent Development Kit libraries and provide an access key for a generative AI model, such as the Gemini API. This section provides details on what you must install and configure before you can run agents with Agent Config files.
Note
The Agent Config feature currently supports only Gemini models. For more information about additional functional restrictions, see [Known limitations](#known-limitations).
To set up ADK for use with Agent Config:
1. Install the ADK Python libraries by following the [Installation](/adk-docs/get-started/installation/#python) instructions. *Python is currently required.* For more information, see the [Known limitations](#known-limitations).
1. Verify that ADK is installed by running the following command in your terminal:
```shell
adk --version
```
This command should show the ADK version you have installed.
Tip
If the `adk` command fails or no version is listed, make sure your Python virtual environment is active: run `source .venv/bin/activate` in your terminal on Mac and Linux. For commands on other platforms, see the [Installation](/adk-docs/get-started/installation/#python) page.
### Build an agent
To build an agent with Agent Config, use the `adk create` command to generate the project files for an agent, then edit the `root_agent.yaml` file it creates for you.
To create an ADK project for use with Agent Config:
1. In your terminal window, run the following command to create a config-based agent:
```shell
adk create --type=config my_agent
```
This command generates a `my_agent/` folder, containing a `root_agent.yaml` file and an `.env` file.
1. In the `my_agent/.env` file, set environment variables for your agent to access generative AI models and other services:
1. For Gemini model access through Google API, add a line to the file with your API key:
```shell
GOOGLE_GENAI_USE_VERTEXAI=0
GOOGLE_API_KEY=
```
You can get an API key from the Google AI Studio [API Keys](https://aistudio.google.com/app/apikey) page.
1. For Gemini model access through Google Cloud, add these lines to the file:
```shell
GOOGLE_GENAI_USE_VERTEXAI=1
GOOGLE_CLOUD_PROJECT=
GOOGLE_CLOUD_LOCATION=us-central1
```
For information on creating a Cloud Project, see the Google Cloud docs for [Creating and managing projects](https://cloud.google.com/resource-manager/docs/creating-managing-projects).
1. Using a text editor, edit the Agent Config file `my_agent/root_agent.yaml` as shown below:
```yaml
# yaml-language-server: $schema=https://raw.githubusercontent.com/google/adk-python/refs/heads/main/src/google/adk/agents/config_schemas/AgentConfig.json
name: assistant_agent
model: gemini-2.5-flash
description: A helper agent that can answer users' questions.
instruction: You are an agent to help answer users' various questions.
```
You can discover more configuration options for your `root_agent.yaml` agent configuration file by referring to the ADK [samples repository](https://github.com/search?q=repo%3Agoogle%2Fadk-python+path%3A%2F%5Econtributing%5C%2Fsamples%5C%2F%2F+.yaml&type=code) or the [Agent Config syntax](/adk-docs/api-reference/agentconfig/) reference.
### Run the agent
Once you have finished editing your Agent Config, you can run your agent using the web interface, the command-line terminal, or API server mode.
To run your Agent Config-defined agent:
1. In your terminal, navigate to the `my_agent/` directory containing the `root_agent.yaml` file.
1. Type one of the following commands to run your agent:
- `adk web` - Run a web interface for your agent.
- `adk run` - Run your agent in the terminal without a user interface.
- `adk api_server` - Run your agent as a service that can be used by other applications.
For more information on the ways to run your agent, see the *Run Your Agent* topic in the [Quickstart](/adk-docs/get-started/quickstart/#run-your-agent). For more information about the ADK command line options, see the [ADK CLI reference](/adk-docs/api-reference/cli/).
## Example configs
This section shows examples of Agent Config files to get you started building agents. For additional and more complete examples, see the ADK [samples repository](https://github.com/search?q=repo%3Agoogle%2Fadk-python+path%3A%2F%5Econtributing%5C%2Fsamples%5C%2F%2F+root_agent.yaml&type=code).
### Built-in tool example
The following example uses the built-in ADK `google_search` tool to provide search functionality to the agent. The agent automatically uses the search tool to reply to user requests.
```yaml
# yaml-language-server: $schema=https://raw.githubusercontent.com/google/adk-python/refs/heads/main/src/google/adk/agents/config_schemas/AgentConfig.json
name: search_agent
model: gemini-2.0-flash
description: 'an agent whose job it is to perform Google search queries and answer questions about the results.'
instruction: You are an agent whose job is to perform Google search queries and answer questions about the results.
tools:
  - name: google_search
```
For more details, see the full code for this sample in the [ADK sample repository](https://github.com/google/adk-python/blob/main/contributing/samples/tool_builtin_config/root_agent.yaml).
### Custom tool example
The following example uses a custom tool built with Python code and listed in the `tools:` section of the config file. The agent uses this tool to check whether numbers provided by the user are prime.
```yaml
# yaml-language-server: $schema=https://raw.githubusercontent.com/google/adk-python/refs/heads/main/src/google/adk/agents/config_schemas/AgentConfig.json
agent_class: LlmAgent
model: gemini-2.5-flash
name: prime_agent
description: Handles checking if numbers are prime.
instruction: |
  You are responsible for checking whether numbers are prime.
  When asked to check primes, you must call the check_prime tool with a list of integers.
  Never attempt to determine prime numbers manually.
  Return the prime number results to the root agent.
tools:
  - name: ma_llm.check_prime
```
For more details, see the full code for this sample in the [ADK sample repository](https://github.com/google/adk-python/blob/main/contributing/samples/multi_agent_llm_config/prime_agent.yaml).
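The `ma_llm.check_prime` entry points at an ordinary Python function in the agent's module. The sample repository has its own implementation; what follows is only a minimal sketch of what such a tool function could look like. ADK derives the tool's schema from the function's signature and docstring, so clear type hints and descriptions help the model call it correctly.

```python
def check_prime(numbers: list[int]) -> str:
    """Returns which of the given integers are prime, as a short report string."""
    primes = []
    for n in numbers:
        if n < 2:
            continue
        # Trial division up to sqrt(n) is enough for small inputs.
        if all(n % d != 0 for d in range(2, int(n**0.5) + 1)):
            primes.append(n)
    if not primes:
        return "No prime numbers found."
    return f"{', '.join(map(str, primes))} are prime numbers."
```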
### Sub-agents example
The following example shows an agent defined with two sub-agents in the `sub_agents:` section, and an example tool in the `tools:` section of the config file. This agent determines what the user wants, and delegates to one of the sub-agents to resolve the request. The sub-agents are defined using Agent Config YAML files.
```yaml
# yaml-language-server: $schema=https://raw.githubusercontent.com/google/adk-python/refs/heads/main/src/google/adk/agents/config_schemas/AgentConfig.json
agent_class: LlmAgent
model: gemini-2.5-flash
name: root_agent
description: Learning assistant that provides tutoring in code and math.
instruction: |
  You are a learning assistant that helps students with coding and math questions.
  You delegate coding questions to the code_tutor_agent and math questions to the math_tutor_agent.
  Follow these steps:
  1. If the user asks about programming or coding, delegate to the code_tutor_agent.
  2. If the user asks about math concepts or problems, delegate to the math_tutor_agent.
  3. Always provide clear explanations and encourage learning.
sub_agents:
  - config_path: code_tutor_agent.yaml
  - config_path: math_tutor_agent.yaml
```
For more details, see the full code for this sample in the [ADK sample repository](https://github.com/google/adk-python/blob/main/contributing/samples/multi_agent_basic_config/root_agent.yaml).
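The `config_path` entries reference sibling Agent Config files in the same directory. As an illustration only (the actual sample files may differ), a `code_tutor_agent.yaml` could be as simple as:

```yaml
name: code_tutor_agent
model: gemini-2.5-flash
description: A tutor agent that helps students with programming and coding questions.
instruction: You are a coding tutor. Answer programming questions with clear explanations and short code examples.
```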
## Deploy agent configs
You can deploy Agent Config agents with [Cloud Run](/adk-docs/deploy/cloud-run/) and [Agent Engine](/adk-docs/deploy/agent-engine/) using the same procedure as code-based agents. For details on preparing and deploying Agent Config-based agents, see those deployment guides.
## Known limitations
The Agent Config feature is experimental and includes the following limitations:
- **Model support:** Only Gemini models are currently supported. Integration with third-party models is in progress.
- **Programming language:** The Agent Config feature currently supports only Python code for tools and other functionality requiring programming code.
- **ADK Tool support:** The following ADK tools are supported by the Agent Config feature, but *not all tools are fully supported*:
    - `google_search`
    - `load_artifacts`
    - `url_context`
    - `exit_loop`
    - `preload_memory`
    - `get_user_choice`
    - `enterprise_web_search`
    - `load_web_page`: Requires a fully-qualified path to access web pages.
    - `AgentTool`
    - `LongRunningFunctionTool`
    - `VertexAiSearchTool`
    - `McpToolset`
    - `ExampleTool`
- **Agent Type Support:** The `LangGraphAgent` and `A2aAgent` types are not yet supported.
## Next steps
For ideas on how and what to build with ADK Agent Configs, see the yaml-based agent definitions in the ADK [adk-samples](https://github.com/search?q=repo:google/adk-python+path:/%5Econtributing%5C/samples%5C//+root_agent.yaml&type=code) repository. For detailed information on the syntax and settings supported by the Agent Config format, see the [Agent Config syntax reference](/adk-docs/api-reference/agentconfig/).
# Custom agents
Supported in ADK: Python v0.1.0, TypeScript v0.2.0, Go v0.1.0, Java v0.1.0
Custom agents provide the ultimate flexibility in ADK, allowing you to define **arbitrary orchestration logic** by inheriting directly from `BaseAgent` and implementing your own control flow. This goes beyond the predefined patterns of `SequentialAgent`, `LoopAgent`, and `ParallelAgent`, enabling you to build highly specific and complex agentic workflows.
Advanced Concept
Building custom agents by directly implementing `_run_async_impl` (or its equivalent in other languages) provides powerful control but is more complex than using the predefined `LlmAgent` or standard `WorkflowAgent` types. We recommend understanding those foundational agent types first before tackling custom orchestration logic.
## Introduction: Beyond Predefined Workflows
### What is a Custom Agent?
A Custom Agent is essentially any class you create that inherits from `google.adk.agents.BaseAgent` and implements its core execution logic within the `_run_async_impl` asynchronous method. You have complete control over how this method calls other agents (sub-agents), manages state, and handles events.
Note
The specific method name for implementing an agent's core asynchronous logic varies slightly by SDK language (e.g., `_run_async_impl` in Python, or `runAsyncImpl` in Java and TypeScript). Refer to the language-specific API documentation for details.
### Why Use Them?
While the standard [Workflow Agents](https://google.github.io/adk-docs/agents/workflow-agents/index.md) (`SequentialAgent`, `LoopAgent`, `ParallelAgent`) cover common orchestration patterns, you'll need a Custom agent when your requirements include:
- **Conditional Logic:** Executing different sub-agents or taking different paths based on runtime conditions or the results of previous steps.
- **Complex State Management:** Implementing intricate logic for maintaining and updating state throughout the workflow beyond simple sequential passing.
- **External Integrations:** Incorporating calls to external APIs, databases, or custom libraries directly within the orchestration flow control.
- **Dynamic Agent Selection:** Choosing which sub-agent(s) to run next based on dynamic evaluation of the situation or input.
- **Unique Workflow Patterns:** Implementing orchestration logic that doesn't fit the standard sequential, parallel, or loop structures.
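To make the first two points concrete, here is a toy, framework-free Python sketch of the pattern a custom agent implements: an async generator that routes to different "sub-agents" based on a runtime result. The names and routing logic are made up for illustration; the real ADK types appear in the sections below.

```python
import asyncio
from typing import AsyncGenerator


async def classify(text: str) -> str:
    # Stand-in for an LLM call that labels the request (hypothetical logic).
    return "code" if "python" in text.lower() else "math"


async def run_workflow(text: str) -> AsyncGenerator[str, None]:
    # Dynamic agent selection: choose the next step from a runtime result,
    # which the fixed sequential/parallel/loop patterns cannot express alone.
    topic = await classify(text)
    if topic == "code":
        yield "code_tutor handled the request"
    else:
        yield "math_tutor handled the request"


async def main() -> list[str]:
    # Collect every yielded event, the way a runner would.
    return [event async for event in run_workflow("How do I sort a list in Python?")]


events = asyncio.run(main())
```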
## Implementing Custom Logic
The core of any custom agent is the method where you define its unique asynchronous behavior. This method allows you to orchestrate sub-agents and manage the flow of execution.
The heart of any custom agent is the `_run_async_impl` method. This is where you define its unique behavior.
- **Signature:** `async def _run_async_impl(self, ctx: InvocationContext) -> AsyncGenerator[Event, None]:`
- **Asynchronous Generator:** It must be an `async def` function and return an `AsyncGenerator`. This allows it to `yield` events produced by sub-agents or its own logic back to the runner.
- **`ctx` (InvocationContext):** Provides access to crucial runtime information, most importantly `ctx.session.state`, which is the primary way to share data between steps orchestrated by your custom agent.
The heart of any custom agent is the `runAsyncImpl` method. This is where you define its unique behavior.
- **Signature:** `async* runAsyncImpl(ctx: InvocationContext): AsyncGenerator`
- **Asynchronous Generator:** It must be an `async` generator function (`async*`).
- **`ctx` (InvocationContext):** Provides access to crucial runtime information, most importantly `ctx.session.state`, which is the primary way to share data between steps orchestrated by your custom agent.
In Go, you implement the `Run` method as part of a struct that satisfies the `agent.Agent` interface. The actual logic is typically a method on your custom agent struct.
- **Signature:** `Run(ctx agent.InvocationContext) iter.Seq2[*session.Event, error]`
- **Iterator:** The `Run` method returns an iterator (`iter.Seq2`) that yields events and errors. This is the standard way to handle streaming results from an agent's execution.
- **`ctx` (InvocationContext):** The `agent.InvocationContext` provides access to the session, including state, and other crucial runtime information.
- **Session State:** You can access the session state through `ctx.Session().State()`.
The heart of any custom agent is the `runAsyncImpl` method, which you override from `BaseAgent`.
- **Signature:** `protected Flowable<Event> runAsyncImpl(InvocationContext ctx)`
- **Reactive Stream (`Flowable`):** It must return an `io.reactivex.rxjava3.core.Flowable`. This `Flowable` represents the stream of events produced by the custom agent's logic, often built by combining or transforming multiple `Flowable`s from sub-agents.
- **`ctx` (InvocationContext):** Provides access to crucial runtime information, most importantly `ctx.session().state()`, which is a `java.util.concurrent.ConcurrentMap`. This is the primary way to share data between steps orchestrated by your custom agent.
**Key Capabilities within the Core Asynchronous Method:**
1. **Calling Sub-Agents:** You invoke sub-agents (which are typically stored as instance attributes like `self.my_llm_agent`) using their `run_async` method and yield their events:
```python
async for event in self.some_sub_agent.run_async(ctx):
    # Optionally inspect or log the event
    yield event  # Pass the event up
```
1. **Managing State:** Read from and write to the session state dictionary (`ctx.session.state`) to pass data between sub-agent calls or make decisions:
```python
# Read data set by a previous agent
previous_result = ctx.session.state.get("some_key")

# Make a decision based on state
if previous_result == "some_value":
    pass  # ... call a specific sub-agent ...
else:
    pass  # ... call another sub-agent ...

# Store a result for a later step (often done via a sub-agent's output_key)
# ctx.session.state["my_custom_result"] = "calculated_value"
```
1. **Implementing Control Flow:** Use standard Python constructs (`if`/`elif`/`else`, `for`/`while` loops, `try`/`except`) to create sophisticated, conditional, or iterative workflows involving your sub-agents.
1. **Calling Sub-Agents:** You invoke sub-agents (which are typically stored as instance properties like `this.myLlmAgent`) using their `runAsync` method and yield their events:
```typescript
for await (const event of this.someSubAgent.runAsync(ctx)) {
  // Optionally inspect or log the event
  yield event; // Pass the event up to the runner
}
```
1. **Managing State:** Read from and write to the session state object (`ctx.session.state`) to pass data between sub-agent calls or make decisions:
```typescript
// Read data set by a previous agent
const previousResult = ctx.session.state['some_key'];

// Make a decision based on state
if (previousResult === 'some_value') {
  // ... call a specific sub-agent ...
} else {
  // ... call another sub-agent ...
}

// Store a result for a later step (often done via a sub-agent's outputKey)
// ctx.session.state['my_custom_result'] = 'calculated_value';
```
1. **Implementing Control Flow:** Use standard TypeScript/JavaScript constructs (`if`/`else`, `for`/`while` loops, `try`/`catch`) to create sophisticated, conditional, or iterative workflows involving your sub-agents.
1. **Calling Sub-Agents:** You invoke sub-agents by calling their `Run` method.
```go
// Example: Running one sub-agent and yielding its events
for event, err := range someSubAgent.Run(ctx) {
	if err != nil {
		// Handle or propagate the error
		return
	}
	// Yield the event up to the caller
	if !yield(event, nil) {
		return
	}
}
```
1. **Managing State:** Read from and write to the session state to pass data between sub-agent calls or make decisions.
```go
// The `ctx` (`agent.InvocationContext`) is passed directly to your agent's `Run` function.

// Read data set by a previous agent
previousResult, err := ctx.Session().State().Get("some_key")
if err != nil {
	// Handle cases where the key might not exist yet
}

// Make a decision based on state
if val, ok := previousResult.(string); ok && val == "some_value" {
	// ... call a specific sub-agent ...
} else {
	// ... call another sub-agent ...
}

// Store a result for a later step
if err := ctx.Session().State().Set("my_custom_result", "calculated_value"); err != nil {
	// Handle error
}
```
1. **Implementing Control Flow:** Use standard Go constructs (`if`/`else`, `for` loops, `switch` statements, goroutines, channels) to create sophisticated, conditional, or iterative workflows involving your sub-agents.
1. **Calling Sub-Agents:** You invoke sub-agents (which are typically stored as instance attributes or objects) using their asynchronous run method and return their event streams:
You typically chain `Flowable`s from sub-agents using RxJava operators like `concatWith`, `flatMapPublisher`, or `concatArray`.
```java
// Example: Running one sub-agent
// return someSubAgent.runAsync(ctx);

// Example: Running sub-agents sequentially
Flowable<Event> firstAgentEvents = someSubAgent1.runAsync(ctx)
    .doOnNext(event -> System.out.println("Event from agent 1: " + event.id()));

Flowable<Event> secondAgentEvents = Flowable.defer(() ->
    someSubAgent2.runAsync(ctx)
        .doOnNext(event -> System.out.println("Event from agent 2: " + event.id()))
);

return firstAgentEvents.concatWith(secondAgentEvents);
```
`Flowable.defer()` is often used for subsequent stages whose execution depends on the completion of, or the state produced by, prior stages.
1. **Managing State:** Read from and write to the session state to pass data between sub-agent calls or make decisions. The session state is a `java.util.concurrent.ConcurrentMap` obtained via `ctx.session().state()`.
```java
// Read data set by a previous agent
Object previousResult = ctx.session().state().get("some_key");

// Make a decision based on state
if ("some_value".equals(previousResult)) {
  // ... logic to include a specific sub-agent's Flowable ...
} else {
  // ... logic to include another sub-agent's Flowable ...
}

// Store a result for a later step (often done via a sub-agent's output_key)
// ctx.session().state().put("my_custom_result", "calculated_value");
```
1. **Implementing Control Flow:** Use standard language constructs (`if`/`else`, loops, `try`/`catch`) combined with reactive operators (RxJava) to create sophisticated workflows.
- **Conditional:** `Flowable.defer()` to choose which `Flowable` to subscribe to based on a condition, or `filter()` if you're filtering events within a stream.
- **Iterative:** Operators like `repeat()`, `retry()`, or by structuring your `Flowable` chain to recursively call parts of itself based on conditions (often managed with `flatMapPublisher` or `concatMap`).
## Managing Sub-Agents and State
Typically, a custom agent orchestrates other agents (like `LlmAgent`, `LoopAgent`, etc.).
- **Initialization:** You usually pass instances of these sub-agents into your custom agent's constructor and store them as instance fields/attributes (e.g., `self.story_generator = story_generator_instance` in Python or `this.storyGenerator = storyGeneratorInstance` in TypeScript). This makes them accessible within the custom agent's core asynchronous execution logic (such as the `_run_async_impl` method).
- **Sub Agents List:** When initializing the `BaseAgent` using its `super()` constructor, you should pass a `sub_agents` list. This list tells the ADK framework about the agents that are part of this custom agent's immediate hierarchy. It's important for framework features like lifecycle management, introspection, and potentially future routing capabilities, even if your core execution logic (`_run_async_impl`) calls the agents directly via `self.xxx_agent`. Include the agents that your custom logic directly invokes at the top level.
- **State:** As mentioned, `ctx.session.state` is the standard way sub-agents (especially `LlmAgent`s using `output_key`) communicate results back to the orchestrator and how the orchestrator passes necessary inputs down.
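As a framework-free illustration of this contract (the step names and values here are made up to mirror the story example), two steps communicating through a shared state mapping:

```python
# Toy stand-in for ctx.session.state: a plain mutable mapping shared by all steps.
state: dict[str, str] = {}


def story_generator_step(state: dict[str, str]) -> None:
    # An LlmAgent configured with output_key="current_story" would write here
    # automatically; this stand-in does it by hand.
    state["current_story"] = "A cheerful tale about a lighthouse keeper."


def tone_check_step(state: dict[str, str]) -> None:
    # Downstream steps read what earlier steps wrote, then publish their own result.
    story = state["current_story"]
    state["tone_check_result"] = "negative" if "gloomy" in story else "positive"


story_generator_step(state)
tone_check_step(state)
```

The orchestrator can then branch on `state["tone_check_result"]`, which is exactly what the custom `StoryFlowAgent` below does with `ctx.session.state`.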
## Design Pattern Example: `StoryFlowAgent`
Let's illustrate the power of custom agents with an example pattern: a multi-stage content generation workflow with conditional logic.
**Goal:** Create a system that generates a story, iteratively refines it through critique and revision, performs final checks, and crucially, *regenerates the story if the final tone check fails*.
**Why Custom?** The core requirement driving the need for a custom agent here is the **conditional regeneration based on the tone check**. Standard workflow agents don't have built-in conditional branching based on the outcome of a sub-agent's task. We need custom logic (`if tone == "negative": ...`) within the orchestrator.
______________________________________________________________________
### Part 1: Simplified custom agent Initialization
We define the `StoryFlowAgent` inheriting from `BaseAgent`. In `__init__`, we store the necessary sub-agents (passed in) as instance attributes and tell the `BaseAgent` framework about the top-level agents this custom agent will directly orchestrate.
```python
class StoryFlowAgent(BaseAgent):
    """
    Custom agent for a story generation and refinement workflow.

    This agent orchestrates a sequence of LLM agents to generate a story,
    critique it, revise it, check grammar and tone, and potentially
    regenerate the story if the tone is negative.
    """

    # --- Field Declarations for Pydantic ---
    # Declare the agents passed during initialization as class attributes with type hints
    story_generator: LlmAgent
    critic: LlmAgent
    reviser: LlmAgent
    grammar_check: LlmAgent
    tone_check: LlmAgent
    loop_agent: LoopAgent
    sequential_agent: SequentialAgent

    # model_config allows setting Pydantic configurations if needed, e.g., arbitrary_types_allowed
    model_config = {"arbitrary_types_allowed": True}

    def __init__(
        self,
        name: str,
        story_generator: LlmAgent,
        critic: LlmAgent,
        reviser: LlmAgent,
        grammar_check: LlmAgent,
        tone_check: LlmAgent,
    ):
        """
        Initializes the StoryFlowAgent.

        Args:
            name: The name of the agent.
            story_generator: An LlmAgent to generate the initial story.
            critic: An LlmAgent to critique the story.
            reviser: An LlmAgent to revise the story based on criticism.
            grammar_check: An LlmAgent to check the grammar.
            tone_check: An LlmAgent to analyze the tone.
        """
        # Create internal agents *before* calling super().__init__
        loop_agent = LoopAgent(
            name="CriticReviserLoop", sub_agents=[critic, reviser], max_iterations=2
        )
        sequential_agent = SequentialAgent(
            name="PostProcessing", sub_agents=[grammar_check, tone_check]
        )

        # Define the sub_agents list for the framework
        sub_agents_list = [
            story_generator,
            loop_agent,
            sequential_agent,
        ]

        # Pydantic will validate and assign them based on the class annotations.
        super().__init__(
            name=name,
            story_generator=story_generator,
            critic=critic,
            reviser=reviser,
            grammar_check=grammar_check,
            tone_check=tone_check,
            loop_agent=loop_agent,
            sequential_agent=sequential_agent,
            sub_agents=sub_agents_list,  # Pass the sub_agents list directly
        )
```
We define the `StoryFlowAgent` by extending `BaseAgent`. In its constructor, we:
1. Create any internal composite agents (like `LoopAgent` or `SequentialAgent`).
1. Pass the list of all top-level sub-agents to the `super()` constructor.
1. Store the sub-agents (passed in or created internally) as instance properties (e.g., `this.storyGenerator`) so they can be accessed in the custom `runAsyncImpl` logic.
```typescript
class StoryFlowAgent extends BaseAgent {
  // --- Property Declarations for TypeScript ---
  private storyGenerator: LlmAgent;
  private critic: LlmAgent;
  private reviser: LlmAgent;
  private grammarCheck: LlmAgent;
  private toneCheck: LlmAgent;
  private loopAgent: LoopAgent;
  private sequentialAgent: SequentialAgent;

  constructor(
    name: string,
    storyGenerator: LlmAgent,
    critic: LlmAgent,
    reviser: LlmAgent,
    grammarCheck: LlmAgent,
    toneCheck: LlmAgent
  ) {
    // Create internal composite agents
    const loopAgent = new LoopAgent({
      name: "CriticReviserLoop",
      subAgents: [critic, reviser],
      maxIterations: 2,
    });
    const sequentialAgent = new SequentialAgent({
      name: "PostProcessing",
      subAgents: [grammarCheck, toneCheck],
    });

    // Define the sub-agents for the framework to know about
    const subAgentsList = [
      storyGenerator,
      loopAgent,
      sequentialAgent,
    ];

    // Call the parent constructor
    super({
      name,
      subAgents: subAgentsList,
    });

    // Assign agents to class properties for use in the custom run logic
    this.storyGenerator = storyGenerator;
    this.critic = critic;
    this.reviser = reviser;
    this.grammarCheck = grammarCheck;
    this.toneCheck = toneCheck;
    this.loopAgent = loopAgent;
    this.sequentialAgent = sequentialAgent;
  }
```
We define the `StoryFlowAgent` struct and a constructor. In the constructor, we store the necessary sub-agents and tell the `BaseAgent` framework about the top-level agents this custom agent will directly orchestrate.
```go
// StoryFlowAgent is a custom agent that orchestrates a story generation workflow.
// It encapsulates the logic of running sub-agents in a specific sequence.
type StoryFlowAgent struct {
	storyGenerator     agent.Agent
	revisionLoopAgent  agent.Agent
	postProcessorAgent agent.Agent
}

// NewStoryFlowAgent creates and configures the entire custom agent workflow.
// It takes individual LLM agents as input and internally creates the necessary
// workflow agents (loop, sequential), returning the final orchestrator agent.
func NewStoryFlowAgent(
	storyGenerator,
	critic,
	reviser,
	grammarCheck,
	toneCheck agent.Agent,
) (agent.Agent, error) {
	loopAgent, err := loopagent.New(loopagent.Config{
		MaxIterations: 2,
		AgentConfig: agent.Config{
			Name:      "CriticReviserLoop",
			SubAgents: []agent.Agent{critic, reviser},
		},
	})
	if err != nil {
		return nil, fmt.Errorf("failed to create loop agent: %w", err)
	}

	sequentialAgent, err := sequentialagent.New(sequentialagent.Config{
		AgentConfig: agent.Config{
			Name:      "PostProcessing",
			SubAgents: []agent.Agent{grammarCheck, toneCheck},
		},
	})
	if err != nil {
		return nil, fmt.Errorf("failed to create sequential agent: %w", err)
	}

	// The StoryFlowAgent struct holds the agents needed for the Run method.
	orchestrator := &StoryFlowAgent{
		storyGenerator:     storyGenerator,
		revisionLoopAgent:  loopAgent,
		postProcessorAgent: sequentialAgent,
	}

	// agent.New creates the final agent, wiring up the Run method.
	return agent.New(agent.Config{
		Name:        "StoryFlowAgent",
		Description: "Orchestrates story generation, critique, revision, and checks.",
		SubAgents:   []agent.Agent{storyGenerator, loopAgent, sequentialAgent},
		Run:         orchestrator.Run,
	})
}
```
We define the `StoryFlowAgentExample` by extending `BaseAgent`. In its **constructor**, we store the necessary sub-agent instances (passed as parameters) as instance fields. These top-level sub-agents, which this custom agent will directly orchestrate, are also passed to the `super` constructor of `BaseAgent` as a list.
```java
private final LlmAgent storyGenerator;
private final LoopAgent loopAgent;
private final SequentialAgent sequentialAgent;

public StoryFlowAgentExample(
    String name, LlmAgent storyGenerator, LoopAgent loopAgent, SequentialAgent sequentialAgent) {
  super(
      name,
      "Orchestrates story generation, critique, revision, and checks.",
      List.of(storyGenerator, loopAgent, sequentialAgent),
      null,
      null);
  this.storyGenerator = storyGenerator;
  this.loopAgent = loopAgent;
  this.sequentialAgent = sequentialAgent;
}
```
______________________________________________________________________
### Part 2: Defining the Custom Execution Logic
This method orchestrates the sub-agents using standard Python async/await and control flow.
```python
@override
async def _run_async_impl(
    self, ctx: InvocationContext
) -> AsyncGenerator[Event, None]:
    """
    Implements the custom orchestration logic for the story workflow.
    Uses the instance attributes assigned by Pydantic (e.g., self.story_generator).
    """
    logger.info(f"[{self.name}] Starting story generation workflow.")

    # 1. Initial Story Generation
    logger.info(f"[{self.name}] Running StoryGenerator...")
    async for event in self.story_generator.run_async(ctx):
        logger.info(f"[{self.name}] Event from StoryGenerator: {event.model_dump_json(indent=2, exclude_none=True)}")
        yield event

    # Check if story was generated before proceeding
    if "current_story" not in ctx.session.state or not ctx.session.state["current_story"]:
        logger.error(f"[{self.name}] Failed to generate initial story. Aborting workflow.")
        return  # Stop processing if initial story failed

    logger.info(f"[{self.name}] Story state after generator: {ctx.session.state.get('current_story')}")

    # 2. Critic-Reviser Loop
    logger.info(f"[{self.name}] Running CriticReviserLoop...")
    # Use the loop_agent instance attribute assigned during init
    async for event in self.loop_agent.run_async(ctx):
        logger.info(f"[{self.name}] Event from CriticReviserLoop: {event.model_dump_json(indent=2, exclude_none=True)}")
        yield event

    logger.info(f"[{self.name}] Story state after loop: {ctx.session.state.get('current_story')}")

    # 3. Sequential Post-Processing (Grammar and Tone Check)
    logger.info(f"[{self.name}] Running PostProcessing...")
    # Use the sequential_agent instance attribute assigned during init
    async for event in self.sequential_agent.run_async(ctx):
        logger.info(f"[{self.name}] Event from PostProcessing: {event.model_dump_json(indent=2, exclude_none=True)}")
        yield event

    # 4. Tone-Based Conditional Logic
    tone_check_result = ctx.session.state.get("tone_check_result")
    logger.info(f"[{self.name}] Tone check result: {tone_check_result}")
    if tone_check_result == "negative":
        logger.info(f"[{self.name}] Tone is negative. Regenerating story...")
        async for event in self.story_generator.run_async(ctx):
            logger.info(f"[{self.name}] Event from StoryGenerator (Regen): {event.model_dump_json(indent=2, exclude_none=True)}")
            yield event
    else:
        logger.info(f"[{self.name}] Tone is not negative. Keeping current story.")

    logger.info(f"[{self.name}] Workflow finished.")
```
**Explanation of Logic:**
1. The initial `story_generator` runs. Its output is expected to be in `ctx.session.state["current_story"]`.
1. The `loop_agent` runs, which internally calls the `critic` and `reviser` sequentially for `max_iterations` times. They read/write `current_story` and `criticism` from/to the state.
1. The `sequential_agent` runs, calling `grammar_check` then `tone_check`, reading `current_story` and writing `grammar_suggestions` and `tone_check_result` to the state.
1. **Custom Part:** The `if` statement checks the `tone_check_result` from the state. If it's "negative", the `story_generator` is called *again*, overwriting the `current_story` in the state. Otherwise, the flow ends.
The `runAsyncImpl` method orchestrates the sub-agents using standard TypeScript `async`/`await` and control flow. A `runLiveImpl` is also added to handle live streaming scenarios.
```typescript
// Handles live streaming scenarios by delegating to the async implementation.
async* runLiveImpl(ctx: InvocationContext): AsyncGenerator {
  yield* this.runAsyncImpl(ctx);
}

// Implements the custom orchestration logic for the story workflow.
async* runAsyncImpl(ctx: InvocationContext): AsyncGenerator {
  console.log(`[${this.name}] Starting story generation workflow.`);

  // 1. Initial Story Generation
  console.log(`[${this.name}] Running StoryGenerator...`);
  for await (const event of this.storyGenerator.runAsync(ctx)) {
    console.log(`[${this.name}] Event from StoryGenerator: ${JSON.stringify(event, null, 2)}`);
    yield event;
  }

  // Check if the story was generated before proceeding
  if (!ctx.session.state["current_story"]) {
    console.error(`[${this.name}] Failed to generate initial story. Aborting workflow.`);
    return; // Stop processing
  }
  console.log(`[${this.name}] Story state after generator: ${ctx.session.state['current_story']}`);

  // 2. Critic-Reviser Loop
  console.log(`[${this.name}] Running CriticReviserLoop...`);
  for await (const event of this.loopAgent.runAsync(ctx)) {
    console.log(`[${this.name}] Event from CriticReviserLoop: ${JSON.stringify(event, null, 2)}`);
    yield event;
  }
  console.log(`[${this.name}] Story state after loop: ${ctx.session.state['current_story']}`);

  // 3. Sequential Post-Processing (Grammar and Tone Check)
  console.log(`[${this.name}] Running PostProcessing...`);
  for await (const event of this.sequentialAgent.runAsync(ctx)) {
    console.log(`[${this.name}] Event from PostProcessing: ${JSON.stringify(event, null, 2)}`);
    yield event;
  }

  // 4. Tone-Based Conditional Logic
  const toneCheckResult = ctx.session.state["tone_check_result"] as string;
  console.log(`[${this.name}] Tone check result: ${toneCheckResult}`);
  if (toneCheckResult === "negative") {
    console.log(`[${this.name}] Tone is negative. Regenerating story...`);
    for await (const event of this.storyGenerator.runAsync(ctx)) {
      console.log(`[${this.name}] Event from StoryGenerator (Regen): ${JSON.stringify(event, null, 2)}`);
      yield event;
    }
  } else {
    console.log(`[${this.name}] Tone is not negative. Keeping current story.`);
  }

  console.log(`[${this.name}] Workflow finished.`);
}
```
**Explanation of Logic:**
1. The initial `storyGenerator` runs. Its output is expected to be in `ctx.session.state['current_story']`.
1. The `loopAgent` runs, which internally calls the `critic` and `reviser` sequentially, up to `maxIterations` times. They read/write `current_story` and `criticism` from/to the state.
1. The `sequentialAgent` runs, calling `grammarCheck` then `toneCheck`, reading `current_story` and writing `grammar_suggestions` and `tone_check_result` to the state.
1. **Custom Part:** The `if` statement checks the `tone_check_result` from the state. If it's "negative", the `storyGenerator` is called *again*, overwriting the `current_story` in the state. Otherwise, the flow ends.
The `Run` method orchestrates the sub-agents by calling their respective `Run` methods in a loop and yielding their events.
```go
// Run defines the custom execution logic for the StoryFlowAgent.
func (s *StoryFlowAgent) Run(ctx agent.InvocationContext) iter.Seq2[*session.Event, error] {
return func(yield func(*session.Event, error) bool) {
// Stage 1: Initial Story Generation
for event, err := range s.storyGenerator.Run(ctx) {
if err != nil {
yield(nil, fmt.Errorf("story generator failed: %w", err))
return
}
if !yield(event, nil) {
return
}
}
// Check if story was generated before proceeding
currentStory, err := ctx.Session().State().Get("current_story")
if err != nil || currentStory == "" {
log.Println("Failed to generate initial story. Aborting workflow.")
return
}
// Stage 2: Critic-Reviser Loop
for event, err := range s.revisionLoopAgent.Run(ctx) {
if err != nil {
yield(nil, fmt.Errorf("loop agent failed: %w", err))
return
}
if !yield(event, nil) {
return
}
}
// Stage 3: Post-Processing
for event, err := range s.postProcessorAgent.Run(ctx) {
if err != nil {
yield(nil, fmt.Errorf("sequential agent failed: %w", err))
return
}
if !yield(event, nil) {
return
}
}
// Stage 4: Conditional Regeneration
toneResult, err := ctx.Session().State().Get("tone_check_result")
if err != nil {
log.Printf("Could not read tone_check_result from state: %v. Assuming tone is not negative.", err)
return
}
if tone, ok := toneResult.(string); ok && tone == "negative" {
log.Println("Tone is negative. Regenerating story...")
for event, err := range s.storyGenerator.Run(ctx) {
if err != nil {
yield(nil, fmt.Errorf("story regeneration failed: %w", err))
return
}
if !yield(event, nil) {
return
}
}
} else {
log.Println("Tone is not negative. Keeping current story.")
}
}
}
```
**Explanation of Logic:**
1. The initial `storyGenerator` runs. Its output is expected to be in the session state under the key `"current_story"`.
1. The `revisionLoopAgent` runs, which internally calls the `critic` and `reviser` sequentially, up to the configured maximum number of iterations. They read/write `current_story` and `criticism` from/to the state.
1. The `postProcessorAgent` runs, calling `grammarCheck` then `toneCheck`, reading `current_story` and writing `grammar_suggestions` and `tone_check_result` to the state.
1. **Custom Part:** The code checks the `tone_check_result` from the state. If it's "negative", the `storyGenerator` is called *again*, overwriting the `current_story` in the state. Otherwise, the flow ends.
The `runAsyncImpl` method orchestrates the sub-agents using RxJava's Flowable streams and operators for asynchronous control flow.
```java
@Override
protected Flowable<Event> runAsyncImpl(InvocationContext invocationContext) {
// Implements the custom orchestration logic for the story workflow.
// Uses the agent instances supplied to the constructor (e.g., storyGenerator).
logger.log(Level.INFO, () -> String.format("[%s] Starting story generation workflow.", name()));
// Stage 1: Initial Story Generation
Flowable<Event> storyGenFlow = runStage(storyGenerator, invocationContext, "StoryGenerator");
// Stage 2: Critic-Reviser Loop (runs after story generation completes)
Flowable<Event> criticReviserFlow = Flowable.defer(() -> {
if (!isStoryGenerated(invocationContext)) {
logger.log(Level.SEVERE,() ->
String.format("[%s] Failed to generate initial story. Aborting after StoryGenerator.",
name()));
return Flowable.empty(); // Stop further processing if no story
}
logger.log(Level.INFO, () ->
String.format("[%s] Story state after generator: %s",
name(), invocationContext.session().state().get("current_story")));
return runStage(loopAgent, invocationContext, "CriticReviserLoop");
});
// Stage 3: Post-Processing (runs after critic-reviser loop completes)
Flowable<Event> postProcessingFlow = Flowable.defer(() -> {
logger.log(Level.INFO, () ->
String.format("[%s] Story state after loop: %s",
name(), invocationContext.session().state().get("current_story")));
return runStage(sequentialAgent, invocationContext, "PostProcessing");
});
// Stage 4: Conditional Regeneration (runs after post-processing completes)
Flowable<Event> conditionalRegenFlow = Flowable.defer(() -> {
String toneCheckResult = (String) invocationContext.session().state().get("tone_check_result");
logger.log(Level.INFO, () -> String.format("[%s] Tone check result: %s", name(), toneCheckResult));
if ("negative".equalsIgnoreCase(toneCheckResult)) {
logger.log(Level.INFO, () ->
String.format("[%s] Tone is negative. Regenerating story...", name()));
return runStage(storyGenerator, invocationContext, "StoryGenerator (Regen)");
} else {
logger.log(Level.INFO, () ->
String.format("[%s] Tone is not negative. Keeping current story.", name()));
return Flowable.empty(); // No regeneration needed
}
});
return Flowable.concatArray(storyGenFlow, criticReviserFlow, postProcessingFlow, conditionalRegenFlow)
.doOnComplete(() -> logger.log(Level.INFO, () -> String.format("[%s] Workflow finished.", name())));
}
// Helper method for a single agent run stage with logging
private Flowable<Event> runStage(BaseAgent agentToRun, InvocationContext ctx, String stageName) {
logger.log(Level.INFO, () -> String.format("[%s] Running %s...", name(), stageName));
return agentToRun
.runAsync(ctx)
.doOnNext(event ->
logger.log(Level.INFO,() ->
String.format("[%s] Event from %s: %s", name(), stageName, event.toJson())))
.doOnError(err ->
logger.log(Level.SEVERE,
String.format("[%s] Error in %s", name(), stageName), err))
.doOnComplete(() ->
logger.log(Level.INFO, () ->
String.format("[%s] %s finished.", name(), stageName)));
}
```
**Explanation of Logic:**
1. The initial `storyGenerator.runAsync(invocationContext)` Flowable is executed. Its output is expected to be in `invocationContext.session().state().get("current_story")`.
1. The `loopAgent`'s Flowable runs next (due to `Flowable.concatArray` and `Flowable.defer`). The LoopAgent internally calls the `critic` and `reviser` sub-agents sequentially for up to `maxIterations` iterations. They read/write `current_story` and `criticism` from/to the state.
1. Then, the `sequentialAgent`'s Flowable executes. It calls `grammarCheck` then `toneCheck`, reading `current_story` and writing `grammar_suggestions` and `tone_check_result` to the state.
1. **Custom Part:** After the sequentialAgent completes, logic within a `Flowable.defer` checks the "tone_check_result" from `invocationContext.session().state()`. If it's "negative", the `storyGenerator` Flowable is *conditionally concatenated* and executed again, overwriting "current_story". Otherwise, an empty Flowable is used, and the overall workflow proceeds to completion.
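The `Flowable.defer` wrapping matters because `tone_check_result` does not exist until the earlier stages have run: the decision must be made at subscription time, not when the pipeline is assembled. Python generators give the same deferred evaluation, as this language-neutral sketch (illustrative names, no ADK involved) shows:

```python
def make_conditional_stage(state: dict):
    # The generator body runs only when iterated, so the state lookup is
    # deferred until the pipeline actually reaches this stage.
    def conditional_stage():
        if state.get("tone_check_result") == "negative":
            yield "regenerating story"
    return conditional_stage

state: dict = {}
stage = make_conditional_stage(state)    # built before the result exists
state["tone_check_result"] = "negative"  # written later, as post-processing would
events = list(stage())                   # decision made here, at iteration time
```

Building the stage eagerly instead would read an empty state and always skip regeneration; deferral is what makes the conditional correct.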
______________________________________________________________________
### Part 3: Defining the LLM Sub-Agents
These are standard `LlmAgent` definitions, responsible for specific tasks. Their `output_key` parameter (`outputKey` in TypeScript and Java, `OutputKey` in Go) is crucial for placing results into the `session.state` where other agents or the custom orchestrator can access them.
**Direct State Injection in Instructions**
Notice the `story_generator`'s instruction. The `{var}` syntax is a placeholder. Before the instruction is sent to the LLM, the ADK framework automatically replaces placeholders such as `{topic}` with the value of `session.state['topic']`. This is the recommended way to provide context to an agent, using templating in the instructions. For more details, see the [State documentation](https://google.github.io/adk-docs/sessions/state/#accessing-session-state-in-agent-instructions).
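As a rough illustration of what this templating does (a naive sketch, not ADK's actual resolver, which also handles escaping, prefixes, and missing keys):

```python
def resolve_instruction(template: str, state: dict) -> str:
    # Naive single-pass substitution of {key} placeholders from session state.
    result = template
    for key, value in state.items():
        result = result.replace("{" + key + "}", str(value))
    return result

state = {"topic": "a brave kitten exploring a haunted house"}
prompt = resolve_instruction(
    "You are a story writer. Write a short story on the following topic: {topic}",
    state,
)
```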
```python
GEMINI_2_FLASH = "gemini-2.0-flash"  # Define model constant
# --- Define the individual LLM agents ---
story_generator = LlmAgent(
    name="StoryGenerator",
    model=GEMINI_2_FLASH,
    instruction="""You are a story writer. Write a short story (around 100 words), on the following topic: {topic}""",
    input_schema=None,
    output_key="current_story",  # Key for storing output in session state
)
critic = LlmAgent(
    name="Critic",
    model=GEMINI_2_FLASH,
    instruction="""You are a story critic. Review the story provided: {current_story}. Provide 1-2 sentences of constructive criticism
on how to improve it. Focus on plot or character.""",
    input_schema=None,
    output_key="criticism",  # Key for storing criticism in session state
)
reviser = LlmAgent(
    name="Reviser",
    model=GEMINI_2_FLASH,
    instruction="""You are a story reviser. Revise the story provided: {current_story}, based on the criticism in
{criticism}. Output only the revised story.""",
    input_schema=None,
    output_key="current_story",  # Overwrites the original story
)
grammar_check = LlmAgent(
    name="GrammarCheck",
    model=GEMINI_2_FLASH,
    instruction="""You are a grammar checker. Check the grammar of the story provided: {current_story}. Output only the suggested
corrections as a list, or output 'Grammar is good!' if there are no errors.""",
    input_schema=None,
    output_key="grammar_suggestions",
)
tone_check = LlmAgent(
    name="ToneCheck",
    model=GEMINI_2_FLASH,
    instruction="""You are a tone analyzer. Analyze the tone of the story provided: {current_story}. Output only one word: 'positive' if
the tone is generally positive, 'negative' if the tone is generally negative, or 'neutral'
otherwise.""",
    input_schema=None,
    output_key="tone_check_result",  # This agent's output determines the conditional flow
)
```
```typescript
// --- Define the individual LLM agents ---
const storyGenerator = new LlmAgent({
name: "StoryGenerator",
model: GEMINI_MODEL,
instruction: `You are a story writer. Write a short story (around 100 words), on the following topic: {topic}`,
outputKey: "current_story",
});
const critic = new LlmAgent({
name: "Critic",
model: GEMINI_MODEL,
instruction: `You are a story critic. Review the story provided: {current_story}. Provide 1-2 sentences of constructive criticism
on how to improve it. Focus on plot or character.`,
outputKey: "criticism",
});
const reviser = new LlmAgent({
name: "Reviser",
model: GEMINI_MODEL,
instruction: `You are a story reviser. Revise the story provided: {current_story}, based on the criticism in
{criticism}. Output only the revised story.`,
outputKey: "current_story", // Overwrites the original story
});
const grammarCheck = new LlmAgent({
name: "GrammarCheck",
model: GEMINI_MODEL,
instruction: `You are a grammar checker. Check the grammar of the story provided: {current_story}. Output only the suggested
corrections as a list, or output 'Grammar is good!' if there are no errors.`,
outputKey: "grammar_suggestions",
});
const toneCheck = new LlmAgent({
name: "ToneCheck",
model: GEMINI_MODEL,
instruction: `You are a tone analyzer. Analyze the tone of the story provided: {current_story}. Output only one word: 'positive' if
the tone is generally positive, 'negative' if the tone is generally negative, or 'neutral'
otherwise.`,
outputKey: "tone_check_result",
});
```
```go
// --- Define the individual LLM agents ---
storyGenerator, err := llmagent.New(llmagent.Config{
Name: "StoryGenerator",
Model: model,
Description: "Generates the initial story.",
Instruction: "You are a story writer. Write a short story (around 100 words) about a cat, based on the topic: {topic}",
OutputKey: "current_story",
})
if err != nil {
log.Fatalf("Failed to create StoryGenerator agent: %v", err)
}
critic, err := llmagent.New(llmagent.Config{
Name: "Critic",
Model: model,
Description: "Critiques the story.",
Instruction: "You are a story critic. Review the story: {current_story}. Provide 1-2 sentences of constructive criticism on how to improve it. Focus on plot or character.",
OutputKey: "criticism",
})
if err != nil {
log.Fatalf("Failed to create Critic agent: %v", err)
}
reviser, err := llmagent.New(llmagent.Config{
Name: "Reviser",
Model: model,
Description: "Revises the story based on criticism.",
Instruction: "You are a story reviser. Revise the story: {current_story}, based on the criticism: {criticism}. Output only the revised story.",
OutputKey: "current_story",
})
if err != nil {
log.Fatalf("Failed to create Reviser agent: %v", err)
}
grammarCheck, err := llmagent.New(llmagent.Config{
Name: "GrammarCheck",
Model: model,
Description: "Checks grammar and suggests corrections.",
Instruction: "You are a grammar checker. Check the grammar of the story: {current_story}. Output only the suggested corrections as a list, or output 'Grammar is good!' if there are no errors.",
OutputKey: "grammar_suggestions",
})
if err != nil {
log.Fatalf("Failed to create GrammarCheck agent: %v", err)
}
toneCheck, err := llmagent.New(llmagent.Config{
Name: "ToneCheck",
Model: model,
Description: "Analyzes the tone of the story.",
Instruction: "You are a tone analyzer. Analyze the tone of the story: {current_story}. Output only one word: 'positive' if the tone is generally positive, 'negative' if the tone is generally negative, or 'neutral' otherwise.",
OutputKey: "tone_check_result",
})
if err != nil {
log.Fatalf("Failed to create ToneCheck agent: %v", err)
}
```
```java
// --- Define the individual LLM agents ---
LlmAgent storyGenerator =
LlmAgent.builder()
.name("StoryGenerator")
.model(MODEL_NAME)
.description("Generates the initial story.")
.instruction(
"""
You are a story writer. Write a short story (around 100 words) about a cat,
based on the topic: {topic}
""")
.inputSchema(null)
.outputKey("current_story") // Key for storing output in session state
.build();
LlmAgent critic =
LlmAgent.builder()
.name("Critic")
.model(MODEL_NAME)
.description("Critiques the story.")
.instruction(
"""
You are a story critic. Review the story: {current_story}. Provide 1-2 sentences of constructive criticism
on how to improve it. Focus on plot or character.
""")
.inputSchema(null)
.outputKey("criticism") // Key for storing criticism in session state
.build();
LlmAgent reviser =
LlmAgent.builder()
.name("Reviser")
.model(MODEL_NAME)
.description("Revises the story based on criticism.")
.instruction(
"""
You are a story reviser. Revise the story: {current_story}, based on the criticism: {criticism}. Output only the revised story.
""")
.inputSchema(null)
.outputKey("current_story") // Overwrites the original story
.build();
LlmAgent grammarCheck =
LlmAgent.builder()
.name("GrammarCheck")
.model(MODEL_NAME)
.description("Checks grammar and suggests corrections.")
.instruction(
"""
You are a grammar checker. Check the grammar of the story: {current_story}. Output only the suggested
corrections as a list, or output 'Grammar is good!' if there are no errors.
""")
.outputKey("grammar_suggestions")
.build();
LlmAgent toneCheck =
LlmAgent.builder()
.name("ToneCheck")
.model(MODEL_NAME)
.description("Analyzes the tone of the story.")
.instruction(
"""
You are a tone analyzer. Analyze the tone of the story: {current_story}. Output only one word: 'positive' if
the tone is generally positive, 'negative' if the tone is generally negative, or 'neutral'
otherwise.
""")
.outputKey("tone_check_result") // This agent's output determines the conditional flow
.build();
LoopAgent loopAgent =
LoopAgent.builder()
.name("CriticReviserLoop")
.description("Iteratively critiques and revises the story.")
.subAgents(critic, reviser)
.maxIterations(2)
.build();
SequentialAgent sequentialAgent =
SequentialAgent.builder()
.name("PostProcessing")
.description("Performs grammar and tone checks sequentially.")
.subAgents(grammarCheck, toneCheck)
.build();
```
______________________________________________________________________
### Part 4: Instantiating and Running the custom agent
Finally, you instantiate your `StoryFlowAgent` and use the `Runner` as usual.
```python
# --- Create the custom agent instance ---
story_flow_agent = StoryFlowAgent(
    name="StoryFlowAgent",
    story_generator=story_generator,
    critic=critic,
    reviser=reviser,
    grammar_check=grammar_check,
    tone_check=tone_check,
)
INITIAL_STATE = {"topic": "a brave kitten exploring a haunted house"}

# --- Setup Runner and Session ---
async def setup_session_and_runner():
    session_service = InMemorySessionService()
    session = await session_service.create_session(app_name=APP_NAME, user_id=USER_ID, session_id=SESSION_ID, state=INITIAL_STATE)
    logger.info(f"Initial session state: {session.state}")
    runner = Runner(
        agent=story_flow_agent,  # Pass the custom orchestrator agent
        app_name=APP_NAME,
        session_service=session_service
    )
    return session_service, runner

# --- Function to Interact with the Agent ---
async def call_agent_async(user_input_topic: str):
    """
    Sends a new topic to the agent (overwriting the initial one if needed)
    and runs the workflow.
    """
    session_service, runner = await setup_session_and_runner()
    current_session = session_service.sessions[APP_NAME][USER_ID][SESSION_ID]
    current_session.state["topic"] = user_input_topic
    logger.info(f"Updated session state topic to: {user_input_topic}")
    content = types.Content(role='user', parts=[types.Part(text="Generate a story about the preset topic.")])
    events = runner.run_async(user_id=USER_ID, session_id=SESSION_ID, new_message=content)
    final_response = "No final response captured."
    async for event in events:
        if event.is_final_response() and event.content and event.content.parts:
            logger.info(f"Potential final response from [{event.author}]: {event.content.parts[0].text}")
            final_response = event.content.parts[0].text
    print("\n--- Agent Interaction Result ---")
    print("Agent Final Response: ", final_response)
    final_session = await session_service.get_session(app_name=APP_NAME,
                                                      user_id=USER_ID,
                                                      session_id=SESSION_ID)
    print("Final Session State:")
    import json
    print(json.dumps(final_session.state, indent=2))
    print("-------------------------------\n")

# --- Run the Agent ---
# Note: In Colab, you can directly use 'await' at the top level.
# If running this code as a standalone Python script, you'll need to use asyncio.run() or manage the event loop.
await call_agent_async("a lonely robot finding a friend in a junkyard")
```
```typescript
// --- Create the custom agent instance ---
const storyFlowAgent = new StoryFlowAgent(
"StoryFlowAgent",
storyGenerator,
critic,
reviser,
grammarCheck,
toneCheck
);
const INITIAL_STATE = { "topic": "a brave kitten exploring a haunted house" };
// --- Setup Runner and Session ---
async function setupRunnerAndSession() {
const runner = new InMemoryRunner({
agent: storyFlowAgent,
appName: APP_NAME,
});
const session = await runner.sessionService.createSession({
appName: APP_NAME,
userId: USER_ID,
sessionId: SESSION_ID,
state: INITIAL_STATE,
});
console.log(`Initial session state: ${JSON.stringify(session.state, null, 2)}`);
return runner;
}
// --- Function to Interact with the Agent ---
async function callAgent(runner: InMemoryRunner, userInputTopic: string) {
const currentSession = await runner.sessionService.getSession({
appName: APP_NAME,
userId: USER_ID,
sessionId: SESSION_ID
});
if (!currentSession) {
return;
}
// Update the state with the new topic for this run
currentSession.state["topic"] = userInputTopic;
console.log(`Updated session state topic to: ${userInputTopic}`);
let finalResponse = "No final response captured.";
for await (const event of runner.runAsync({
userId: USER_ID,
sessionId: SESSION_ID,
newMessage: createUserContent(`Generate a story about: ${userInputTopic}`)
})) {
if (isFinalResponse(event) && event.content?.parts?.length) {
console.log(`Potential final response from [${event.author}]: ${event.content.parts.map(part => part.text ?? '').join('')}`);
finalResponse = event.content.parts.map(part => part.text ?? '').join('');
}
}
const finalSession = await runner.sessionService.getSession({
appName: APP_NAME,
userId: USER_ID,
sessionId: SESSION_ID
});
console.log("\n--- Agent Interaction Result ---");
console.log("Agent Final Response: ", finalResponse);
console.log("Final Session State:");
console.log(JSON.stringify(finalSession?.state, null, 2));
console.log("-------------------------------\n");
}
// --- Run the Agent ---
async function main() {
const runner = await setupRunnerAndSession();
await callAgent(runner, "a lonely robot finding a friend in a junkyard");
}
main();
```
```go
// Instantiate the custom agent, which encapsulates the workflow agents.
storyFlowAgent, err := NewStoryFlowAgent(
storyGenerator,
critic,
reviser,
grammarCheck,
toneCheck,
)
if err != nil {
log.Fatalf("Failed to create story flow agent: %v", err)
}
// --- Run the Agent ---
sessionService := session.InMemoryService()
initialState := map[string]any{
"topic": "a brave kitten exploring a haunted house",
}
sessionInstance, err := sessionService.Create(ctx, &session.CreateRequest{
AppName: appName,
UserID: userID,
State: initialState,
})
if err != nil {
log.Fatalf("Failed to create session: %v", err)
}
userTopic := "a lonely robot finding a friend in a junkyard"
r, err := runner.New(runner.Config{
AppName: appName,
Agent: storyFlowAgent,
SessionService: sessionService,
})
if err != nil {
log.Fatalf("Failed to create runner: %v", err)
}
input := genai.NewContentFromText("Generate a story about: "+userTopic, genai.RoleUser)
events := r.Run(ctx, userID, sessionInstance.Session.ID(), input, agent.RunConfig{
StreamingMode: agent.StreamingModeSSE,
})
var finalResponse string
for event, err := range events {
if err != nil {
log.Fatalf("An error occurred during agent execution: %v", err)
}
for _, part := range event.Content.Parts {
// Accumulate text from all parts of the final response.
finalResponse += part.Text
}
}
fmt.Println("\n--- Agent Interaction Result ---")
fmt.Println("Agent Final Response: " + finalResponse)
finalSession, err := sessionService.Get(ctx, &session.GetRequest{
UserID: userID,
AppName: appName,
SessionID: sessionInstance.Session.ID(),
})
if err != nil {
log.Fatalf("Failed to retrieve final session: %v", err)
}
fmt.Println("Final Session State:", finalSession.Session.State())
}
```
```java
// --- Function to Interact with the Agent ---
// Sends a new topic to the agent (overwriting the initial one if needed)
// and runs the workflow.
public static void runAgent(StoryFlowAgentExample agent, String userTopic) {
// --- Setup Runner and Session ---
InMemoryRunner runner = new InMemoryRunner(agent);
Map<String, Object> initialState = new HashMap<>();
initialState.put("topic", "a brave kitten exploring a haunted house");
Session session =
runner
.sessionService()
.createSession(APP_NAME, USER_ID, new ConcurrentHashMap<>(initialState), SESSION_ID)
.blockingGet();
logger.log(Level.INFO, () -> String.format("Initial session state: %s", session.state()));
session.state().put("topic", userTopic); // Update the state in the retrieved session
logger.log(Level.INFO, () -> String.format("Updated session state topic to: %s", userTopic));
Content userMessage = Content.fromParts(Part.fromText("Generate a story about: " + userTopic));
// Use the modified session object for the run
Flowable<Event> eventStream = runner.runAsync(USER_ID, session.id(), userMessage);
final String[] finalResponse = {"No final response captured."};
eventStream.blockingForEach(
event -> {
if (event.finalResponse() && event.content().isPresent()) {
String author = event.author() != null ? event.author() : "UNKNOWN_AUTHOR";
Optional<String> textOpt =
event
.content()
.flatMap(Content::parts)
.filter(parts -> !parts.isEmpty())
.map(parts -> parts.get(0).text().orElse(""));
logger.log(Level.INFO, () ->
String.format("Potential final response from [%s]: %s", author, textOpt.orElse("N/A")));
textOpt.ifPresent(text -> finalResponse[0] = text);
}
});
System.out.println("\n--- Agent Interaction Result ---");
System.out.println("Agent Final Response: " + finalResponse[0]);
// Retrieve session again to see the final state after the run
Session finalSession =
runner
.sessionService()
.getSession(APP_NAME, USER_ID, SESSION_ID, Optional.empty())
.blockingGet();
assert finalSession != null;
System.out.println("Final Session State:" + finalSession.state());
System.out.println("-------------------------------\n");
}
```
*(Note: The full runnable code, including imports and execution logic, can be found linked below.)*
______________________________________________________________________
## Full Code Example
StoryFlow Agent
```python
# Full runnable code for the StoryFlowAgent example
# Copyright 2025 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import logging
from typing import AsyncGenerator
from typing_extensions import override
from google.adk.agents import LlmAgent, BaseAgent, LoopAgent, SequentialAgent
from google.adk.agents.invocation_context import InvocationContext
from google.genai import types
from google.adk.sessions import InMemorySessionService
from google.adk.runners import Runner
from google.adk.events import Event
from pydantic import BaseModel, Field
# --- Constants ---
APP_NAME = "story_app"
USER_ID = "12345"
SESSION_ID = "123344"
GEMINI_2_FLASH = "gemini-2.0-flash"
# --- Configure Logging ---
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
# --- Custom Orchestrator Agent ---
class StoryFlowAgent(BaseAgent):
    """
    Custom agent for a story generation and refinement workflow.

    This agent orchestrates a sequence of LLM agents to generate a story,
    critique it, revise it, check grammar and tone, and potentially
    regenerate the story if the tone is negative.
    """

    # --- Field Declarations for Pydantic ---
    # Declare the agents passed during initialization as class attributes with type hints
    story_generator: LlmAgent
    critic: LlmAgent
    reviser: LlmAgent
    grammar_check: LlmAgent
    tone_check: LlmAgent
    loop_agent: LoopAgent
    sequential_agent: SequentialAgent

    # model_config allows setting Pydantic configurations if needed, e.g., arbitrary_types_allowed
    model_config = {"arbitrary_types_allowed": True}

    def __init__(
        self,
        name: str,
        story_generator: LlmAgent,
        critic: LlmAgent,
        reviser: LlmAgent,
        grammar_check: LlmAgent,
        tone_check: LlmAgent,
    ):
        """
        Initializes the StoryFlowAgent.

        Args:
            name: The name of the agent.
            story_generator: An LlmAgent to generate the initial story.
            critic: An LlmAgent to critique the story.
            reviser: An LlmAgent to revise the story based on criticism.
            grammar_check: An LlmAgent to check the grammar.
            tone_check: An LlmAgent to analyze the tone.
        """
        # Create internal agents *before* calling super().__init__
        loop_agent = LoopAgent(
            name="CriticReviserLoop", sub_agents=[critic, reviser], max_iterations=2
        )
        sequential_agent = SequentialAgent(
            name="PostProcessing", sub_agents=[grammar_check, tone_check]
        )
        # Define the sub_agents list for the framework
        sub_agents_list = [
            story_generator,
            loop_agent,
            sequential_agent,
        ]
        # Pydantic will validate and assign them based on the class annotations.
        super().__init__(
            name=name,
            story_generator=story_generator,
            critic=critic,
            reviser=reviser,
            grammar_check=grammar_check,
            tone_check=tone_check,
            loop_agent=loop_agent,
            sequential_agent=sequential_agent,
            sub_agents=sub_agents_list,  # Pass the sub_agents list directly
        )

    @override
    async def _run_async_impl(
        self, ctx: InvocationContext
    ) -> AsyncGenerator[Event, None]:
        """
        Implements the custom orchestration logic for the story workflow.
        Uses the instance attributes assigned by Pydantic (e.g., self.story_generator).
        """
        logger.info(f"[{self.name}] Starting story generation workflow.")

        # 1. Initial Story Generation
        logger.info(f"[{self.name}] Running StoryGenerator...")
        async for event in self.story_generator.run_async(ctx):
            logger.info(f"[{self.name}] Event from StoryGenerator: {event.model_dump_json(indent=2, exclude_none=True)}")
            yield event

        # Check if story was generated before proceeding
        if "current_story" not in ctx.session.state or not ctx.session.state["current_story"]:
            logger.error(f"[{self.name}] Failed to generate initial story. Aborting workflow.")
            return  # Stop processing if initial story failed

        logger.info(f"[{self.name}] Story state after generator: {ctx.session.state.get('current_story')}")

        # 2. Critic-Reviser Loop
        logger.info(f"[{self.name}] Running CriticReviserLoop...")
        # Use the loop_agent instance attribute assigned during init
        async for event in self.loop_agent.run_async(ctx):
            logger.info(f"[{self.name}] Event from CriticReviserLoop: {event.model_dump_json(indent=2, exclude_none=True)}")
            yield event

        logger.info(f"[{self.name}] Story state after loop: {ctx.session.state.get('current_story')}")

        # 3. Sequential Post-Processing (Grammar and Tone Check)
        logger.info(f"[{self.name}] Running PostProcessing...")
        # Use the sequential_agent instance attribute assigned during init
        async for event in self.sequential_agent.run_async(ctx):
            logger.info(f"[{self.name}] Event from PostProcessing: {event.model_dump_json(indent=2, exclude_none=True)}")
            yield event

        # 4. Tone-Based Conditional Logic
        tone_check_result = ctx.session.state.get("tone_check_result")
        logger.info(f"[{self.name}] Tone check result: {tone_check_result}")
        if tone_check_result == "negative":
            logger.info(f"[{self.name}] Tone is negative. Regenerating story...")
            async for event in self.story_generator.run_async(ctx):
                logger.info(f"[{self.name}] Event from StoryGenerator (Regen): {event.model_dump_json(indent=2, exclude_none=True)}")
                yield event
        else:
            logger.info(f"[{self.name}] Tone is not negative. Keeping current story.")

        logger.info(f"[{self.name}] Workflow finished.")
# --- Define the individual LLM agents ---
story_generator = LlmAgent(
name="StoryGenerator",
model=GEMINI_2_FLASH,
instruction="""You are a story writer. Write a short story (around 100 words), on the following topic: {topic}""",
input_schema=None,
output_key="current_story", # Key for storing output in session state
)
critic = LlmAgent(
name="Critic",
model=GEMINI_2_FLASH,
instruction="""You are a story critic. Review the story provided: {current_story}. Provide 1-2 sentences of constructive criticism
on how to improve it. Focus on plot or character.""",
input_schema=None,
output_key="criticism", # Key for storing criticism in session state
)
reviser = LlmAgent(
name="Reviser",
model=GEMINI_2_FLASH,
instruction="""You are a story reviser. Revise the story provided: {current_story}, based on the criticism in
{criticism}. Output only the revised story.""",
input_schema=None,
output_key="current_story", # Overwrites the original story
)
grammar_check = LlmAgent(
name="GrammarCheck",
model=GEMINI_2_FLASH,
instruction="""You are a grammar checker. Check the grammar of the story provided: {current_story}. Output only the suggested
corrections as a list, or output 'Grammar is good!' if there are no errors.""",
input_schema=None,
output_key="grammar_suggestions",
)
tone_check = LlmAgent(
name="ToneCheck",
model=GEMINI_2_FLASH,
instruction="""You are a tone analyzer. Analyze the tone of the story provided: {current_story}. Output only one word: 'positive' if
the tone is generally positive, 'negative' if the tone is generally negative, or 'neutral'
otherwise.""",
input_schema=None,
output_key="tone_check_result", # This agent's output determines the conditional flow
)
# --- Create the custom agent instance ---
story_flow_agent = StoryFlowAgent(
name="StoryFlowAgent",
story_generator=story_generator,
critic=critic,
reviser=reviser,
grammar_check=grammar_check,
tone_check=tone_check,
)
INITIAL_STATE = {"topic": "a brave kitten exploring a haunted house"}
# --- Setup Runner and Session ---
async def setup_session_and_runner():
session_service = InMemorySessionService()
session = await session_service.create_session(app_name=APP_NAME, user_id=USER_ID, session_id=SESSION_ID, state=INITIAL_STATE)
logger.info(f"Initial session state: {session.state}")
runner = Runner(
agent=story_flow_agent, # Pass the custom orchestrator agent
app_name=APP_NAME,
session_service=session_service
)
return session_service, runner
# --- Function to Interact with the Agent ---
async def call_agent_async(user_input_topic: str):
"""
Sends a new topic to the agent (overwriting the initial one if needed)
and runs the workflow.
"""
session_service, runner = await setup_session_and_runner()
current_session = session_service.sessions[APP_NAME][USER_ID][SESSION_ID]
current_session.state["topic"] = user_input_topic
logger.info(f"Updated session state topic to: {user_input_topic}")
content = types.Content(role='user', parts=[types.Part(text=f"Generate a story about: {user_input_topic}")])
events = runner.run_async(user_id=USER_ID, session_id=SESSION_ID, new_message=content)
final_response = "No final response captured."
async for event in events:
if event.is_final_response() and event.content and event.content.parts:
logger.info(f"Potential final response from [{event.author}]: {event.content.parts[0].text}")
final_response = event.content.parts[0].text
print("\n--- Agent Interaction Result ---")
print("Agent Final Response: ", final_response)
final_session = await session_service.get_session(app_name=APP_NAME,
user_id=USER_ID,
session_id=SESSION_ID)
print("Final Session State:")
import json
print(json.dumps(final_session.state, indent=2))
print("-------------------------------\n")
# --- Run the Agent ---
# Note: In Colab, you can directly use 'await' at the top level.
# If running this code as a standalone Python script, you'll need to use asyncio.run() or manage the event loop.
await call_agent_async("a lonely robot finding a friend in a junkyard")
```
```typescript
// Full runnable code for the StoryFlowAgent example
/**
* Copyright 2025 Google LLC
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import {
LlmAgent,
BaseAgent,
LoopAgent,
SequentialAgent,
InMemoryRunner,
InvocationContext,
Event,
isFinalResponse,
} from '@google/adk';
import { createUserContent } from "@google/genai";
// --- Constants ---
const APP_NAME = "story_app_ts";
const USER_ID = "12345";
const SESSION_ID = "123344_ts";
const GEMINI_MODEL = "gemini-2.5-flash";
// --- Custom Orchestrator Agent ---
class StoryFlowAgent extends BaseAgent {
// --- Property Declarations for TypeScript ---
private storyGenerator: LlmAgent;
private critic: LlmAgent;
private reviser: LlmAgent;
private grammarCheck: LlmAgent;
private toneCheck: LlmAgent;
private loopAgent: LoopAgent;
private sequentialAgent: SequentialAgent;
constructor(
name: string,
storyGenerator: LlmAgent,
critic: LlmAgent,
reviser: LlmAgent,
grammarCheck: LlmAgent,
toneCheck: LlmAgent
) {
// Create internal composite agents
const loopAgent = new LoopAgent({
name: "CriticReviserLoop",
subAgents: [critic, reviser],
maxIterations: 2,
});
const sequentialAgent = new SequentialAgent({
name: "PostProcessing",
subAgents: [grammarCheck, toneCheck],
});
// Define the sub-agents for the framework to know about
const subAgentsList = [
storyGenerator,
loopAgent,
sequentialAgent,
];
// Call the parent constructor
super({
name,
subAgents: subAgentsList,
});
// Assign agents to class properties for use in the custom run logic
this.storyGenerator = storyGenerator;
this.critic = critic;
this.reviser = reviser;
this.grammarCheck = grammarCheck;
this.toneCheck = toneCheck;
this.loopAgent = loopAgent;
this.sequentialAgent = sequentialAgent;
}
// Implements the custom orchestration logic for the story workflow.
async* runLiveImpl(ctx: InvocationContext): AsyncGenerator {
yield* this.runAsyncImpl(ctx);
}
// Implements the custom orchestration logic for the story workflow.
async* runAsyncImpl(ctx: InvocationContext): AsyncGenerator {
console.log(`[${this.name}] Starting story generation workflow.`);
// 1. Initial Story Generation
console.log(`[${this.name}] Running StoryGenerator...`);
for await (const event of this.storyGenerator.runAsync(ctx)) {
console.log(`[${this.name}] Event from StoryGenerator: ${JSON.stringify(event, null, 2)}`);
yield event;
}
// Check if the story was generated before proceeding
if (!ctx.session.state["current_story"]) {
console.error(`[${this.name}] Failed to generate initial story. Aborting workflow.`);
return; // Stop processing
}
console.log(`[${this.name}] Story state after generator: ${ctx.session.state['current_story']}`);
// 2. Critic-Reviser Loop
console.log(`[${this.name}] Running CriticReviserLoop...`);
for await (const event of this.loopAgent.runAsync(ctx)) {
console.log(`[${this.name}] Event from CriticReviserLoop: ${JSON.stringify(event, null, 2)}`);
yield event;
}
console.log(`[${this.name}] Story state after loop: ${ctx.session.state['current_story']}`);
// 3. Sequential Post-Processing (Grammar and Tone Check)
console.log(`[${this.name}] Running PostProcessing...`);
for await (const event of this.sequentialAgent.runAsync(ctx)) {
console.log(`[${this.name}] Event from PostProcessing: ${JSON.stringify(event, null, 2)}`);
yield event;
}
// 4. Tone-Based Conditional Logic
const toneCheckResult = ctx.session.state["tone_check_result"] as string;
console.log(`[${this.name}] Tone check result: ${toneCheckResult}`);
if (toneCheckResult === "negative") {
console.log(`[${this.name}] Tone is negative. Regenerating story...`);
for await (const event of this.storyGenerator.runAsync(ctx)) {
console.log(`[${this.name}] Event from StoryGenerator (Regen): ${JSON.stringify(event, null, 2)}`);
yield event;
}
} else {
console.log(`[${this.name}] Tone is not negative. Keeping current story.`);
}
console.log(`[${this.name}] Workflow finished.`);
}
}
// --- Define the individual LLM agents ---
const storyGenerator = new LlmAgent({
name: "StoryGenerator",
model: GEMINI_MODEL,
instruction: `You are a story writer. Write a short story (around 100 words), on the following topic: {topic}`,
outputKey: "current_story",
});
const critic = new LlmAgent({
name: "Critic",
model: GEMINI_MODEL,
instruction: `You are a story critic. Review the story provided: {current_story}. Provide 1-2 sentences of constructive criticism
on how to improve it. Focus on plot or character.`,
outputKey: "criticism",
});
const reviser = new LlmAgent({
name: "Reviser",
model: GEMINI_MODEL,
instruction: `You are a story reviser. Revise the story provided: {current_story}, based on the criticism in
{criticism}. Output only the revised story.`,
outputKey: "current_story", // Overwrites the original story
});
const grammarCheck = new LlmAgent({
name: "GrammarCheck",
model: GEMINI_MODEL,
instruction: `You are a grammar checker. Check the grammar of the story provided: {current_story}. Output only the suggested
corrections as a list, or output 'Grammar is good!' if there are no errors.`,
outputKey: "grammar_suggestions",
});
const toneCheck = new LlmAgent({
name: "ToneCheck",
model: GEMINI_MODEL,
instruction: `You are a tone analyzer. Analyze the tone of the story provided: {current_story}. Output only one word: 'positive' if
the tone is generally positive, 'negative' if the tone is generally negative, or 'neutral'
otherwise.`,
outputKey: "tone_check_result",
});
// --- Create the custom agent instance ---
const storyFlowAgent = new StoryFlowAgent(
"StoryFlowAgent",
storyGenerator,
critic,
reviser,
grammarCheck,
toneCheck
);
const INITIAL_STATE = { "topic": "a brave kitten exploring a haunted house" };
// --- Setup Runner and Session ---
async function setupRunnerAndSession() {
const runner = new InMemoryRunner({
agent: storyFlowAgent,
appName: APP_NAME,
});
const session = await runner.sessionService.createSession({
appName: APP_NAME,
userId: USER_ID,
sessionId: SESSION_ID,
state: INITIAL_STATE,
});
console.log(`Initial session state: ${JSON.stringify(session.state, null, 2)}`);
return runner;
}
// --- Function to Interact with the Agent ---
async function callAgent(runner: InMemoryRunner, userInputTopic: string) {
const currentSession = await runner.sessionService.getSession({
appName: APP_NAME,
userId: USER_ID,
sessionId: SESSION_ID
});
if (!currentSession) {
return;
}
// Update the state with the new topic for this run
currentSession.state["topic"] = userInputTopic;
console.log(`Updated session state topic to: ${userInputTopic}`);
let finalResponse = "No final response captured.";
for await (const event of runner.runAsync({
userId: USER_ID,
sessionId: SESSION_ID,
newMessage: createUserContent(`Generate a story about: ${userInputTopic}`)
})) {
if (isFinalResponse(event) && event.content?.parts?.length) {
console.log(`Potential final response from [${event.author}]: ${event.content.parts.map(part => part.text ?? '').join('')}`);
finalResponse = event.content.parts.map(part => part.text ?? '').join('');
}
}
const finalSession = await runner.sessionService.getSession({
appName: APP_NAME,
userId: USER_ID,
sessionId: SESSION_ID
});
console.log("\n--- Agent Interaction Result ---");
console.log("Agent Final Response: ", finalResponse);
console.log("Final Session State:");
console.log(JSON.stringify(finalSession?.state, null, 2));
console.log("-------------------------------\n");
}
// --- Run the Agent ---
async function main() {
const runner = await setupRunnerAndSession();
await callAgent(runner, "a lonely robot finding a friend in a junkyard");
}
main();
```
```go
// Full runnable code for the StoryFlowAgent example
package main
import (
"context"
"fmt"
"iter"
"log"
"google.golang.org/adk/agent/workflowagents/loopagent"
"google.golang.org/adk/agent/workflowagents/sequentialagent"
"google.golang.org/adk/agent"
"google.golang.org/adk/agent/llmagent"
"google.golang.org/adk/model/gemini"
"google.golang.org/adk/runner"
"google.golang.org/adk/session"
"google.golang.org/genai"
)
// StoryFlowAgent is a custom agent that orchestrates a story generation workflow.
// It encapsulates the logic of running sub-agents in a specific sequence.
type StoryFlowAgent struct {
storyGenerator agent.Agent
revisionLoopAgent agent.Agent
postProcessorAgent agent.Agent
}
// NewStoryFlowAgent creates and configures the entire custom agent workflow.
// It takes individual LLM agents as input and internally creates the necessary
// workflow agents (loop, sequential), returning the final orchestrator agent.
func NewStoryFlowAgent(
storyGenerator,
critic,
reviser,
grammarCheck,
toneCheck agent.Agent,
) (agent.Agent, error) {
loopAgent, err := loopagent.New(loopagent.Config{
MaxIterations: 2,
AgentConfig: agent.Config{
Name: "CriticReviserLoop",
SubAgents: []agent.Agent{critic, reviser},
},
})
if err != nil {
return nil, fmt.Errorf("failed to create loop agent: %w", err)
}
sequentialAgent, err := sequentialagent.New(sequentialagent.Config{
AgentConfig: agent.Config{
Name: "PostProcessing",
SubAgents: []agent.Agent{grammarCheck, toneCheck},
},
})
if err != nil {
return nil, fmt.Errorf("failed to create sequential agent: %w", err)
}
// The StoryFlowAgent struct holds the agents needed for the Run method.
orchestrator := &StoryFlowAgent{
storyGenerator: storyGenerator,
revisionLoopAgent: loopAgent,
postProcessorAgent: sequentialAgent,
}
// agent.New creates the final agent, wiring up the Run method.
return agent.New(agent.Config{
Name: "StoryFlowAgent",
Description: "Orchestrates story generation, critique, revision, and checks.",
SubAgents: []agent.Agent{storyGenerator, loopAgent, sequentialAgent},
Run: orchestrator.Run,
})
}
// Run defines the custom execution logic for the StoryFlowAgent.
func (s *StoryFlowAgent) Run(ctx agent.InvocationContext) iter.Seq2[*session.Event, error] {
return func(yield func(*session.Event, error) bool) {
// Stage 1: Initial Story Generation
for event, err := range s.storyGenerator.Run(ctx) {
if err != nil {
yield(nil, fmt.Errorf("story generator failed: %w", err))
return
}
if !yield(event, nil) {
return
}
}
// Check if story was generated before proceeding
currentStory, err := ctx.Session().State().Get("current_story")
if err != nil || currentStory == "" {
log.Println("Failed to generate initial story. Aborting workflow.")
return
}
// Stage 2: Critic-Reviser Loop
for event, err := range s.revisionLoopAgent.Run(ctx) {
if err != nil {
yield(nil, fmt.Errorf("loop agent failed: %w", err))
return
}
if !yield(event, nil) {
return
}
}
// Stage 3: Post-Processing
for event, err := range s.postProcessorAgent.Run(ctx) {
if err != nil {
yield(nil, fmt.Errorf("sequential agent failed: %w", err))
return
}
if !yield(event, nil) {
return
}
}
// Stage 4: Conditional Regeneration
toneResult, err := ctx.Session().State().Get("tone_check_result")
if err != nil {
log.Printf("Could not read tone_check_result from state: %v. Assuming tone is not negative.", err)
return
}
if tone, ok := toneResult.(string); ok && tone == "negative" {
log.Println("Tone is negative. Regenerating story...")
for event, err := range s.storyGenerator.Run(ctx) {
if err != nil {
yield(nil, fmt.Errorf("story regeneration failed: %w", err))
return
}
if !yield(event, nil) {
return
}
}
} else {
log.Println("Tone is not negative. Keeping current story.")
}
}
}
const (
modelName = "gemini-2.0-flash"
appName = "story_app"
userID = "user_12345"
)
func main() {
ctx := context.Background()
model, err := gemini.NewModel(ctx, modelName, &genai.ClientConfig{})
if err != nil {
log.Fatalf("Failed to create model: %v", err)
}
// --- Define the individual LLM agents ---
storyGenerator, err := llmagent.New(llmagent.Config{
Name: "StoryGenerator",
Model: model,
Description: "Generates the initial story.",
Instruction: "You are a story writer. Write a short story (around 100 words) about a cat, based on the topic: {topic}",
OutputKey: "current_story",
})
if err != nil {
log.Fatalf("Failed to create StoryGenerator agent: %v", err)
}
critic, err := llmagent.New(llmagent.Config{
Name: "Critic",
Model: model,
Description: "Critiques the story.",
Instruction: "You are a story critic. Review the story: {current_story}. Provide 1-2 sentences of constructive criticism on how to improve it. Focus on plot or character.",
OutputKey: "criticism",
})
if err != nil {
log.Fatalf("Failed to create Critic agent: %v", err)
}
reviser, err := llmagent.New(llmagent.Config{
Name: "Reviser",
Model: model,
Description: "Revises the story based on criticism.",
Instruction: "You are a story reviser. Revise the story: {current_story}, based on the criticism: {criticism}. Output only the revised story.",
OutputKey: "current_story",
})
if err != nil {
log.Fatalf("Failed to create Reviser agent: %v", err)
}
grammarCheck, err := llmagent.New(llmagent.Config{
Name: "GrammarCheck",
Model: model,
Description: "Checks grammar and suggests corrections.",
Instruction: "You are a grammar checker. Check the grammar of the story: {current_story}. Output only the suggested corrections as a list, or output 'Grammar is good!' if there are no errors.",
OutputKey: "grammar_suggestions",
})
if err != nil {
log.Fatalf("Failed to create GrammarCheck agent: %v", err)
}
toneCheck, err := llmagent.New(llmagent.Config{
Name: "ToneCheck",
Model: model,
Description: "Analyzes the tone of the story.",
Instruction: "You are a tone analyzer. Analyze the tone of the story: {current_story}. Output only one word: 'positive' if the tone is generally positive, 'negative' if the tone is generally negative, or 'neutral' otherwise.",
OutputKey: "tone_check_result",
})
if err != nil {
log.Fatalf("Failed to create ToneCheck agent: %v", err)
}
// Instantiate the custom agent, which encapsulates the workflow agents.
storyFlowAgent, err := NewStoryFlowAgent(
storyGenerator,
critic,
reviser,
grammarCheck,
toneCheck,
)
if err != nil {
log.Fatalf("Failed to create story flow agent: %v", err)
}
// --- Run the Agent ---
sessionService := session.InMemoryService()
initialState := map[string]any{
"topic": "a brave kitten exploring a haunted house",
}
sessionInstance, err := sessionService.Create(ctx, &session.CreateRequest{
AppName: appName,
UserID: userID,
State: initialState,
})
if err != nil {
log.Fatalf("Failed to create session: %v", err)
}
userTopic := "a lonely robot finding a friend in a junkyard"
r, err := runner.New(runner.Config{
AppName: appName,
Agent: storyFlowAgent,
SessionService: sessionService,
})
if err != nil {
log.Fatalf("Failed to create runner: %v", err)
}
input := genai.NewContentFromText("Generate a story about: "+userTopic, genai.RoleUser)
events := r.Run(ctx, userID, sessionInstance.Session.ID(), input, agent.RunConfig{
StreamingMode: agent.StreamingModeSSE,
})
var finalResponse string
for event, err := range events {
if err != nil {
log.Fatalf("An error occurred during agent execution: %v", err)
}
for _, part := range event.Content.Parts {
// Accumulate text from all parts of the final response.
finalResponse += part.Text
}
}
fmt.Println("\n--- Agent Interaction Result ---")
fmt.Println("Agent Final Response: " + finalResponse)
finalSession, err := sessionService.Get(ctx, &session.GetRequest{
UserID: userID,
AppName: appName,
SessionID: sessionInstance.Session.ID(),
})
if err != nil {
log.Fatalf("Failed to retrieve final session: %v", err)
}
fmt.Println("Final Session State:", finalSession.Session.State())
}
```
```java
// Full runnable code for the StoryFlowAgent example
import com.google.adk.agents.LlmAgent;
import com.google.adk.agents.BaseAgent;
import com.google.adk.agents.InvocationContext;
import com.google.adk.agents.LoopAgent;
import com.google.adk.agents.SequentialAgent;
import com.google.adk.events.Event;
import com.google.adk.runner.InMemoryRunner;
import com.google.adk.sessions.Session;
import com.google.genai.types.Content;
import com.google.genai.types.Part;
import io.reactivex.rxjava3.core.Flowable;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.logging.Level;
import java.util.logging.Logger;
public class StoryFlowAgentExample extends BaseAgent {
// --- Constants ---
private static final String APP_NAME = "story_app";
private static final String USER_ID = "user_12345";
private static final String SESSION_ID = "session_123344";
private static final String MODEL_NAME = "gemini-2.0-flash"; // Ensure this model is available
private static final Logger logger = Logger.getLogger(StoryFlowAgentExample.class.getName());
private final LlmAgent storyGenerator;
private final LoopAgent loopAgent;
private final SequentialAgent sequentialAgent;
public StoryFlowAgentExample(
String name, LlmAgent storyGenerator, LoopAgent loopAgent, SequentialAgent sequentialAgent) {
super(
name,
"Orchestrates story generation, critique, revision, and checks.",
List.of(storyGenerator, loopAgent, sequentialAgent),
null,
null);
this.storyGenerator = storyGenerator;
this.loopAgent = loopAgent;
this.sequentialAgent = sequentialAgent;
}
public static void main(String[] args) {
// --- Define the individual LLM agents ---
LlmAgent storyGenerator =
LlmAgent.builder()
.name("StoryGenerator")
.model(MODEL_NAME)
.description("Generates the initial story.")
.instruction(
"""
You are a story writer. Write a short story (around 100 words) about a cat,
based on the topic: {topic}
""")
.inputSchema(null)
.outputKey("current_story") // Key for storing output in session state
.build();
LlmAgent critic =
LlmAgent.builder()
.name("Critic")
.model(MODEL_NAME)
.description("Critiques the story.")
.instruction(
"""
You are a story critic. Review the story: {current_story}. Provide 1-2 sentences of constructive criticism
on how to improve it. Focus on plot or character.
""")
.inputSchema(null)
.outputKey("criticism") // Key for storing criticism in session state
.build();
LlmAgent reviser =
LlmAgent.builder()
.name("Reviser")
.model(MODEL_NAME)
.description("Revises the story based on criticism.")
.instruction(
"""
You are a story reviser. Revise the story: {current_story}, based on the criticism: {criticism}. Output only the revised story.
""")
.inputSchema(null)
.outputKey("current_story") // Overwrites the original story
.build();
LlmAgent grammarCheck =
LlmAgent.builder()
.name("GrammarCheck")
.model(MODEL_NAME)
.description("Checks grammar and suggests corrections.")
.instruction(
"""
You are a grammar checker. Check the grammar of the story: {current_story}. Output only the suggested
corrections as a list, or output 'Grammar is good!' if there are no errors.
""")
.outputKey("grammar_suggestions")
.build();
LlmAgent toneCheck =
LlmAgent.builder()
.name("ToneCheck")
.model(MODEL_NAME)
.description("Analyzes the tone of the story.")
.instruction(
"""
You are a tone analyzer. Analyze the tone of the story: {current_story}. Output only one word: 'positive' if
the tone is generally positive, 'negative' if the tone is generally negative, or 'neutral'
otherwise.
""")
.outputKey("tone_check_result") // This agent's output determines the conditional flow
.build();
LoopAgent loopAgent =
LoopAgent.builder()
.name("CriticReviserLoop")
.description("Iteratively critiques and revises the story.")
.subAgents(critic, reviser)
.maxIterations(2)
.build();
SequentialAgent sequentialAgent =
SequentialAgent.builder()
.name("PostProcessing")
.description("Performs grammar and tone checks sequentially.")
.subAgents(grammarCheck, toneCheck)
.build();
StoryFlowAgentExample storyFlowAgentExample =
new StoryFlowAgentExample(APP_NAME, storyGenerator, loopAgent, sequentialAgent);
// --- Run the Agent ---
runAgent(storyFlowAgentExample, "a lonely robot finding a friend in a junkyard");
}
// --- Function to Interact with the Agent ---
// Sends a new topic to the agent (overwriting the initial one if needed)
// and runs the workflow.
public static void runAgent(StoryFlowAgentExample agent, String userTopic) {
// --- Setup Runner and Session ---
InMemoryRunner runner = new InMemoryRunner(agent);
Map<String, Object> initialState = new HashMap<>();
initialState.put("topic", "a brave kitten exploring a haunted house");
Session session =
runner
.sessionService()
.createSession(APP_NAME, USER_ID, new ConcurrentHashMap<>(initialState), SESSION_ID)
.blockingGet();
logger.log(Level.INFO, () -> String.format("Initial session state: %s", session.state()));
session.state().put("topic", userTopic); // Update the state in the retrieved session
logger.log(Level.INFO, () -> String.format("Updated session state topic to: %s", userTopic));
Content userMessage = Content.fromParts(Part.fromText("Generate a story about: " + userTopic));
// Use the modified session object for the run
Flowable<Event> eventStream = runner.runAsync(USER_ID, session.id(), userMessage);
final String[] finalResponse = {"No final response captured."};
eventStream.blockingForEach(
event -> {
if (event.finalResponse() && event.content().isPresent()) {
String author = event.author() != null ? event.author() : "UNKNOWN_AUTHOR";
Optional<String> textOpt =
event
.content()
.flatMap(Content::parts)
.filter(parts -> !parts.isEmpty())
.map(parts -> parts.get(0).text().orElse(""));
logger.log(Level.INFO, () ->
String.format("Potential final response from [%s]: %s", author, textOpt.orElse("N/A")));
textOpt.ifPresent(text -> finalResponse[0] = text);
}
});
System.out.println("\n--- Agent Interaction Result ---");
System.out.println("Agent Final Response: " + finalResponse[0]);
// Retrieve session again to see the final state after the run
Session finalSession =
runner
.sessionService()
.getSession(APP_NAME, USER_ID, SESSION_ID, Optional.empty())
.blockingGet();
assert finalSession != null;
System.out.println("Final Session State:" + finalSession.state());
System.out.println("-------------------------------\n");
}
private boolean isStoryGenerated(InvocationContext ctx) {
Object currentStoryObj = ctx.session().state().get("current_story");
return currentStoryObj != null && !String.valueOf(currentStoryObj).isEmpty();
}
@Override
protected Flowable<Event> runAsyncImpl(InvocationContext invocationContext) {
// Implements the custom orchestration logic for the story workflow.
// Uses the instance attributes assigned in the constructor (e.g., storyGenerator).
logger.log(Level.INFO, () -> String.format("[%s] Starting story generation workflow.", name()));
// Stage 1. Initial Story Generation
Flowable<Event> storyGenFlow = runStage(storyGenerator, invocationContext, "StoryGenerator");
// Stage 2: Critic-Reviser Loop (runs after story generation completes)
Flowable<Event> criticReviserFlow = Flowable.defer(() -> {
if (!isStoryGenerated(invocationContext)) {
logger.log(Level.SEVERE,() ->
String.format("[%s] Failed to generate initial story. Aborting after StoryGenerator.",
name()));
return Flowable.empty(); // Stop further processing if no story
}
logger.log(Level.INFO, () ->
String.format("[%s] Story state after generator: %s",
name(), invocationContext.session().state().get("current_story")));
return runStage(loopAgent, invocationContext, "CriticReviserLoop");
});
// Stage 3: Post-Processing (runs after critic-reviser loop completes)
Flowable<Event> postProcessingFlow = Flowable.defer(() -> {
logger.log(Level.INFO, () ->
String.format("[%s] Story state after loop: %s",
name(), invocationContext.session().state().get("current_story")));
return runStage(sequentialAgent, invocationContext, "PostProcessing");
});
// Stage 4: Conditional Regeneration (runs after post-processing completes)
Flowable<Event> conditionalRegenFlow = Flowable.defer(() -> {
String toneCheckResult = (String) invocationContext.session().state().get("tone_check_result");
logger.log(Level.INFO, () -> String.format("[%s] Tone check result: %s", name(), toneCheckResult));
if ("negative".equalsIgnoreCase(toneCheckResult)) {
logger.log(Level.INFO, () ->
String.format("[%s] Tone is negative. Regenerating story...", name()));
return runStage(storyGenerator, invocationContext, "StoryGenerator (Regen)");
} else {
logger.log(Level.INFO, () ->
String.format("[%s] Tone is not negative. Keeping current story.", name()));
return Flowable.empty(); // No regeneration needed
}
});
return Flowable.concatArray(storyGenFlow, criticReviserFlow, postProcessingFlow, conditionalRegenFlow)
.doOnComplete(() -> logger.log(Level.INFO, () -> String.format("[%s] Workflow finished.", name())));
}
// Helper method for a single agent run stage with logging
private Flowable<Event> runStage(BaseAgent agentToRun, InvocationContext ctx, String stageName) {
logger.log(Level.INFO, () -> String.format("[%s] Running %s...", name(), stageName));
return agentToRun
.runAsync(ctx)
.doOnNext(event ->
logger.log(Level.INFO,() ->
String.format("[%s] Event from %s: %s", name(), stageName, event.toJson())))
.doOnError(err ->
logger.log(Level.SEVERE,
String.format("[%s] Error in %s", name(), stageName), err))
.doOnComplete(() ->
logger.log(Level.INFO, () ->
String.format("[%s] %s finished.", name(), stageName)));
}
@Override
protected Flowable<Event> runLiveImpl(InvocationContext invocationContext) {
return Flowable.error(new UnsupportedOperationException("runLive not implemented."));
}
}
```
# LLM Agent
Supported in ADK: Python v0.1.0, TypeScript v0.2.0, Go v0.1.0, Java v0.1.0
The `LlmAgent` (often aliased simply as `Agent`) is a core component in ADK, acting as the "thinking" part of your application. It leverages the power of a Large Language Model (LLM) for reasoning, understanding natural language, making decisions, generating responses, and interacting with tools.
Unlike deterministic [Workflow Agents](https://google.github.io/adk-docs/agents/workflow-agents/index.md) that follow predefined execution paths, `LlmAgent` behavior is non-deterministic. It uses the LLM to interpret instructions and context, deciding dynamically how to proceed, which tools to use (if any), or whether to transfer control to another agent.
Building an effective `LlmAgent` involves defining its identity, clearly guiding its behavior through instructions, and equipping it with the necessary tools and capabilities.
## Defining the Agent's Identity and Purpose
First, you need to establish what the agent *is* and what it's *for*.
- **`name` (Required):** Every agent needs a unique string identifier. This `name` is crucial for internal operations, especially in multi-agent systems where agents need to refer to or delegate tasks to each other. Choose a descriptive name that reflects the agent's function (e.g., `customer_support_router`, `billing_inquiry_agent`). Avoid reserved names like `user`.
- **`description` (Optional, Recommended for Multi-Agent):** Provide a concise summary of the agent's capabilities. This description is primarily used by *other* LLM agents to determine if they should route a task to this agent. Make it specific enough to differentiate it from peers (e.g., "Handles inquiries about current billing statements," not just "Billing agent").
- **`model` (Required):** Specify the underlying LLM that will power this agent's reasoning. This is a string identifier like `"gemini-2.5-flash"`. The choice of model impacts the agent's capabilities, cost, and performance. See the [Models](/adk-docs/agents/models/) page for available options and considerations.
```python
# Example: Defining the basic identity
capital_agent = LlmAgent(
model="gemini-2.5-flash",
name="capital_agent",
description="Answers user questions about the capital city of a given country."
# instruction and tools will be added next
)
```
```typescript
// Example: Defining the basic identity
const capitalAgent = new LlmAgent({
model: 'gemini-2.5-flash',
name: 'capital_agent',
description: 'Answers user questions about the capital city of a given country.',
// instruction and tools will be added next
});
```
```go
// Example: Defining the basic identity
agent, err := llmagent.New(llmagent.Config{
Name: "capital_agent",
Model: model,
Description: "Answers user questions about the capital city of a given country.",
// instruction and tools will be added next
})
```
```java
// Example: Defining the basic identity
LlmAgent capitalAgent =
LlmAgent.builder()
.model("gemini-2.5-flash")
.name("capital_agent")
.description("Answers user questions about the capital city of a given country.")
// instruction and tools will be added next
.build();
```
## Guiding the Agent: Instructions (`instruction`)
The `instruction` parameter is arguably the most critical for shaping an `LlmAgent`'s behavior. It's a string (or a function returning a string) that tells the agent:
- Its core task or goal.
- Its personality or persona (e.g., "You are a helpful assistant," "You are a witty pirate").
- Constraints on its behavior (e.g., "Only answer questions about X," "Never reveal Y").
- How and when to use its `tools`. You should explain the purpose of each tool and the circumstances under which it should be called, supplementing any descriptions within the tool itself.
- The desired format for its output (e.g., "Respond in JSON," "Provide a bulleted list").
**Tips for Effective Instructions:**
- **Be Clear and Specific:** Avoid ambiguity. Clearly state the desired actions and outcomes.
- **Use Markdown:** Improve readability for complex instructions using headings, lists, etc.
- **Provide Examples (Few-Shot):** For complex tasks or specific output formats, include examples directly in the instruction.
- **Guide Tool Use:** Don't just list tools; explain *when* and *why* the agent should use them.
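As a sketch of the few-shot tip, an instruction for a hypothetical sentiment-classification agent might embed example input/output pairs directly; the task and examples below are purely illustrative:

```python
# Illustrative few-shot instruction for a hypothetical sentiment classifier.
# Embedding input/output pairs in the instruction shows the model the exact
# label format you expect it to produce.
FEW_SHOT_INSTRUCTION = """You are a sentiment classifier.
Classify each user message as POSITIVE, NEGATIVE, or NEUTRAL.
Respond with the label only.

Example Input: "I love this product!"
Example Output: POSITIVE

Example Input: "The package arrived late and damaged."
Example Output: NEGATIVE
"""
```

This string would then be passed as the `instruction` parameter of an `LlmAgent`.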
**State:**
- The instruction is a string template; you can use the `{var}` syntax to insert dynamic values into the instruction.
- `{var}` is used to insert the value of the state variable named var.
- `{artifact.var}` is used to insert the text content of the artifact named var.
- If the state variable or artifact does not exist, the agent will raise an error. If you want to ignore the error, you can append a `?` to the variable name as in `{var?}`.
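A minimal sketch of these placeholders together (the state variable and artifact names here are hypothetical; ADK performs the substitution at runtime, so this string is just the template):

```python
# Hypothetical instruction template using ADK's state/artifact placeholders.
# ADK substitutes the placeholders before the instruction reaches the model.
TEMPLATED_INSTRUCTION = (
    "You are a travel assistant for {user_name}.\n"    # state variable; raises if missing
    "Preferred airline: {airline?}\n"                  # trailing '?' suppresses the error if unset
    "Current itinerary draft:\n{artifact.itinerary?}"  # text content of an artifact, also optional
)
```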
```python
# Example: Adding instructions
capital_agent = LlmAgent(
model="gemini-2.5-flash",
name="capital_agent",
description="Answers user questions about the capital city of a given country.",
instruction="""You are an agent that provides the capital city of a country.
When a user asks for the capital of a country:
1. Identify the country name from the user's query.
2. Use the `get_capital_city` tool to find the capital.
3. Respond clearly to the user, stating the capital city.
Example Query: "What's the capital of {country}?"
Example Response: "The capital of France is Paris."
""",
# tools will be added next
)
```
```typescript
// Example: Adding instructions
const capitalAgent = new LlmAgent({
model: 'gemini-2.5-flash',
name: 'capital_agent',
description: 'Answers user questions about the capital city of a given country.',
instruction: `You are an agent that provides the capital city of a country.
When a user asks for the capital of a country:
1. Identify the country name from the user's query.
2. Use the \`getCapitalCity\` tool to find the capital.
3. Respond clearly to the user, stating the capital city.
Example Query: "What's the capital of {country}?"
Example Response: "The capital of France is Paris."
`,
// tools will be added next
});
```
```go
// Example: Adding instructions
agent, err := llmagent.New(llmagent.Config{
Name: "capital_agent",
Model: model,
Description: "Answers user questions about the capital city of a given country.",
Instruction: `You are an agent that provides the capital city of a country.
When a user asks for the capital of a country:
1. Identify the country name from the user's query.
2. Use the 'get_capital_city' tool to find the capital.
3. Respond clearly to the user, stating the capital city.
Example Query: "What's the capital of {country}?"
Example Response: "The capital of France is Paris."`,
// tools will be added next
})
```
```java
// Example: Adding instructions
LlmAgent capitalAgent =
LlmAgent.builder()
.model("gemini-2.5-flash")
.name("capital_agent")
.description("Answers user questions about the capital city of a given country.")
.instruction(
"""
You are an agent that provides the capital city of a country.
When a user asks for the capital of a country:
1. Identify the country name from the user's query.
2. Use the `get_capital_city` tool to find the capital.
3. Respond clearly to the user, stating the capital city.
Example Query: "What's the capital of {country}?"
Example Response: "The capital of France is Paris."
""")
// tools will be added next
.build();
```
*(Note: For instructions that apply to* all *agents in a system, consider using `global_instruction` on the root agent, detailed further in the [Multi-Agents](https://google.github.io/adk-docs/agents/multi-agents/index.md) section.)*
## Equipping the Agent: Tools (`tools`)
Tools give your `LlmAgent` capabilities beyond the LLM's built-in knowledge or reasoning. They allow the agent to interact with the outside world, perform calculations, fetch real-time data, or execute specific actions.
- **`tools` (Optional):** Provide a list of tools the agent can use. Each item in the list can be:
- A native function or method (wrapped as a `FunctionTool`). The Python ADK automatically wraps a native function into a `FunctionTool`, whereas in Java you must explicitly wrap your methods using `FunctionTool.create(...)`.
- An instance of a class inheriting from `BaseTool`.
- An instance of another agent (`AgentTool`, enabling agent-to-agent delegation - see [Multi-Agents](https://google.github.io/adk-docs/agents/multi-agents/index.md)).
The LLM uses the function/tool names, descriptions (from docstrings or the `description` field), and parameter schemas to decide which tool to call based on the conversation and its instructions.
```python
# Define a tool function
def get_capital_city(country: str) -> str:
"""Retrieves the capital city for a given country."""
# Replace with actual logic (e.g., API call, database lookup)
capitals = {"france": "Paris", "japan": "Tokyo", "canada": "Ottawa"}
return capitals.get(country.lower(), f"Sorry, I don't know the capital of {country}.")
# Add the tool to the agent
capital_agent = LlmAgent(
model="gemini-2.5-flash",
name="capital_agent",
description="Answers user questions about the capital city of a given country.",
instruction="""You are an agent that provides the capital city of a country... (previous instruction text)""",
tools=[get_capital_city] # Provide the function directly
)
```
```typescript
import {z} from 'zod';
import { LlmAgent, FunctionTool } from '@google/adk';
// Define the schema for the tool's input parameters
const getCapitalCityParamsSchema = z.object({
country: z.string().describe('The country to get capital for.'),
});
// Define the tool function itself
async function getCapitalCity(params: z.infer<typeof getCapitalCityParamsSchema>): Promise<{ capitalCity: string }> {
const capitals: Record<string, string> = {
'france': 'Paris',
'japan': 'Tokyo',
'canada': 'Ottawa',
};
const result = capitals[params.country.toLowerCase()] ??
`Sorry, I don't know the capital of ${params.country}.`;
return {capitalCity: result}; // Tools must return an object
}
// Create an instance of the FunctionTool
const getCapitalCityTool = new FunctionTool({
name: 'getCapitalCity',
description: 'Retrieves the capital city for a given country.',
parameters: getCapitalCityParamsSchema,
execute: getCapitalCity,
});
// Add the tool to the agent
const capitalAgent = new LlmAgent({
model: 'gemini-2.5-flash',
name: 'capitalAgent',
description: 'Answers user questions about the capital city of a given country.',
instruction: 'You are an agent that provides the capital city of a country...', // Note: the full instruction is omitted for brevity
tools: [getCapitalCityTool], // Provide the FunctionTool instance in an array
});
```
```go
// Define a tool function
type getCapitalCityArgs struct {
Country string `json:"country" jsonschema:"The country to get the capital of."`
}
getCapitalCity := func(ctx tool.Context, args getCapitalCityArgs) (map[string]any, error) {
// Replace with actual logic (e.g., API call, database lookup)
capitals := map[string]string{"france": "Paris", "japan": "Tokyo", "canada": "Ottawa"}
capital, ok := capitals[strings.ToLower(args.Country)]
if !ok {
return nil, fmt.Errorf("Sorry, I don't know the capital of %s.", args.Country)
}
return map[string]any{"result": capital}, nil
}
// Add the tool to the agent
capitalTool, err := functiontool.New(
functiontool.Config{
Name: "get_capital_city",
Description: "Retrieves the capital city for a given country.",
},
getCapitalCity,
)
if err != nil {
log.Fatal(err)
}
agent, err := llmagent.New(llmagent.Config{
Name: "capital_agent",
Model: model,
Description: "Answers user questions about the capital city of a given country.",
Instruction: "You are an agent that provides the capital city of a country... (previous instruction text)",
Tools: []tool.Tool{capitalTool},
})
```
```java
// Define a tool function
// Retrieves the capital city of a given country.
public static Map<String, String> getCapitalCity(
@Schema(name = "country", description = "The country to get capital for")
String country) {
// Replace with actual logic (e.g., API call, database lookup)
Map<String, String> countryCapitals = new HashMap<>();
countryCapitals.put("canada", "Ottawa");
countryCapitals.put("france", "Paris");
countryCapitals.put("japan", "Tokyo");
String result =
countryCapitals.getOrDefault(
country.toLowerCase(), "Sorry, I couldn't find the capital for " + country + ".");
return Map.of("result", result); // Tools must return a Map
}
// Add the tool to the agent
FunctionTool capitalTool = FunctionTool.create(experiment.getClass(), "getCapitalCity");
LlmAgent capitalAgent =
LlmAgent.builder()
.model("gemini-2.5-flash")
.name("capital_agent")
.description("Answers user questions about the capital city of a given country.")
.instruction("You are an agent that provides the capital city of a country... (previous instruction text)")
.tools(capitalTool) // Provide the function wrapped as a FunctionTool
.build();
```
Learn more about Tools in [Custom Tools](/adk-docs/tools-custom/).
## Advanced Configuration & Control
Beyond the core parameters, `LlmAgent` offers several options for finer control:
### Fine-Tuning LLM Generation (`generate_content_config`)
You can adjust how the underlying LLM generates responses using `generate_content_config`.
- **`generate_content_config` (Optional):** Pass an instance of [`google.genai.types.GenerateContentConfig`](https://googleapis.github.io/python-genai/genai.html#genai.types.GenerateContentConfig) to control parameters like `temperature` (randomness), `max_output_tokens` (response length), `top_p`, `top_k`, and safety settings.
```python
from google.genai import types
agent = LlmAgent(
# ... other params
generate_content_config=types.GenerateContentConfig(
temperature=0.2, # More deterministic output
max_output_tokens=250,
safety_settings=[
types.SafetySetting(
category=types.HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
threshold=types.HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
)
]
)
)
```
```typescript
import { GenerateContentConfig } from '@google/genai';
const generateContentConfig: GenerateContentConfig = {
temperature: 0.2, // More deterministic output
maxOutputTokens: 250,
};
const agent = new LlmAgent({
// ... other params
generateContentConfig,
});
```
```go
import "google.golang.org/genai"
temperature := float32(0.2)
agent, err := llmagent.New(llmagent.Config{
Name: "gen_config_agent",
Model: model,
GenerateContentConfig: &genai.GenerateContentConfig{
Temperature: &temperature,
MaxOutputTokens: 250,
},
})
```
```java
import com.google.genai.types.GenerateContentConfig;
LlmAgent agent =
LlmAgent.builder()
// ... other params
.generateContentConfig(GenerateContentConfig.builder()
.temperature(0.2F) // More deterministic output
.maxOutputTokens(250)
.build())
.build();
```
### Structuring Data (`input_schema`, `output_schema`, `output_key`)
For scenarios requiring structured data exchange with an `LlmAgent`, ADK provides mechanisms to define expected input and desired output formats using schema definitions.
- **`input_schema` (Optional):** Define a schema representing the expected input structure. If set, the user message content passed to this agent *must* be a JSON string conforming to this schema. Your instructions should guide the user or preceding agent accordingly.
- **`output_schema` (Optional):** Define a schema representing the desired output structure. If set, the agent's final response *must* be a JSON string conforming to this schema.
- **`output_key` (Optional):** Provide a string key. If set, the text content of the agent's *final* response will be automatically saved to the session's state dictionary under this key. This is useful for passing results between agents or steps in a workflow.
- In Python, this might look like: `session.state[output_key] = agent_response_text`
- In Java: `session.state().put(outputKey, agentResponseText)`
- In Go, within a callback handler: `ctx.State().Set(output_key, agentResponseText)`
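The hand-off can be pictured with a plain-Python sketch; this only simulates what ADK does with `output_key` (it is not the ADK API itself, and the key and texts are made up):

```python
# Simulated session state: after an agent with output_key="found_capital"
# finishes, ADK stores its final response text under that key.
session_state = {}
session_state["found_capital"] = "The capital of France is Paris."

# A downstream agent's instruction template can then reference the key;
# here we mimic ADK's substitution with str.format for illustration.
downstream_template = "Verify this statement and add one fact: {found_capital}"
resolved = downstream_template.format(found_capital=session_state["found_capital"])
```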
In Python, the input and output schemas are typically Pydantic `BaseModel` classes.
```python
from pydantic import BaseModel, Field
class CapitalOutput(BaseModel):
capital: str = Field(description="The capital of the country.")
structured_capital_agent = LlmAgent(
# ... name, model, description
instruction="""You are a Capital Information Agent. Given a country, respond ONLY with a JSON object containing the capital. Format: {"capital": "capital_name"}""",
output_schema=CapitalOutput, # Enforce JSON output
output_key="found_capital" # Store result in state['found_capital']
# Cannot use tools=[get_capital_city] effectively here
)
```
```typescript
import {z} from 'zod';
import { Schema, Type } from '@google/genai';
// Define the schema for the output
const CapitalOutputSchema: Schema = {
type: Type.OBJECT,
properties: {
capital: {
type: Type.STRING,
description: 'The capital of the country.',
},
},
required: ['capital'],
};
// Create the LlmAgent instance
const structuredCapitalAgent = new LlmAgent({
// ... name, model, description
instruction: `You are a Capital Information Agent. Given a country, respond ONLY with a JSON object containing the capital. Format: {"capital": "capital_name"}`,
outputSchema: CapitalOutputSchema, // Enforce JSON output
outputKey: 'found_capital', // Store result in state['found_capital']
// Cannot use tools effectively here
});
```
In Go, the input and output schemas are `genai.Schema` objects (from `google.golang.org/genai`).
```go
capitalOutput := &genai.Schema{
Type: genai.TypeObject,
Description: "Schema for capital city information.",
Properties: map[string]*genai.Schema{
"capital": {
Type: genai.TypeString,
Description: "The capital city of the country.",
},
},
}
agent, err := llmagent.New(llmagent.Config{
Name: "structured_capital_agent",
Model: model,
Description: "Provides capital information in a structured format.",
Instruction: `You are a Capital Information Agent. Given a country, respond ONLY with a JSON object containing the capital. Format: {"capital": "capital_name"}`,
OutputSchema: capitalOutput,
OutputKey: "found_capital",
// Cannot use the capitalTool tool effectively here
})
```
In Java, the input and output schemas are `com.google.genai.types.Schema` objects.
```java
private static final Schema CAPITAL_OUTPUT =
Schema.builder()
.type("OBJECT")
.description("Schema for capital city information.")
.properties(
Map.of(
"capital",
Schema.builder()
.type("STRING")
.description("The capital city of the country.")
.build()))
.build();
LlmAgent structuredCapitalAgent =
LlmAgent.builder()
// ... name, model, description
.instruction(
"You are a Capital Information Agent. Given a country, respond ONLY with a JSON object containing the capital. Format: {\"capital\": \"capital_name\"}")
.outputSchema(CAPITAL_OUTPUT) // Enforce JSON output
.outputKey("found_capital") // Store result in state.get("found_capital")
// Cannot use tools(getCapitalCity) effectively here
.build();
```
### Managing Context (`include_contents`)
Control whether the agent receives the prior conversation history.
- **`include_contents` (Optional, Default: `'default'`):** Determines if the `contents` (history) are sent to the LLM.
- `'default'`: The agent receives the relevant conversation history.
- `'none'`: The agent receives no prior `contents`. It operates based solely on its current instruction and any input provided in the *current* turn (useful for stateless tasks or enforcing specific contexts).
```python
stateless_agent = LlmAgent(
# ... other params
include_contents='none'
)
```
```typescript
const statelessAgent = new LlmAgent({
// ... other params
includeContents: 'none',
});
```
```go
import "google.golang.org/adk/agent/llmagent"
agent, err := llmagent.New(llmagent.Config{
Name: "stateless_agent",
Model: model,
IncludeContents: llmagent.IncludeContentsNone,
})
```
```java
import com.google.adk.agents.LlmAgent.IncludeContents;
LlmAgent statelessAgent =
LlmAgent.builder()
// ... other params
.includeContents(IncludeContents.NONE)
.build();
```
### Planner
Supported in ADK: Python v0.1.0
**`planner` (Optional):** Assign a `BasePlanner` instance to enable multi-step reasoning and planning before execution. There are two main planners:
- **`BuiltInPlanner`:** Leverages the model's built-in planning capabilities (e.g., Gemini's thinking feature). See [Gemini Thinking](https://ai.google.dev/gemini-api/docs/thinking) for details and examples.
Here, the `thinking_budget` parameter guides the model on the number of thinking tokens to use when generating a response. The `include_thoughts` parameter controls whether the model should include its raw thoughts and internal reasoning process in the response.
```python
from google.adk import Agent
from google.adk.planners import BuiltInPlanner
from google.genai import types
my_agent = Agent(
model="gemini-2.5-flash",
planner=BuiltInPlanner(
thinking_config=types.ThinkingConfig(
include_thoughts=True,
thinking_budget=1024,
)
),
# ... your tools here
)
```
- **`PlanReActPlanner`:** This planner instructs the model to follow a specific structure in its output: first create a plan, then execute actions (like calling tools), and provide reasoning for its steps. *It's particularly useful for models that don't have a built-in "thinking" feature*.
```python
from google.adk import Agent
from google.adk.planners import PlanReActPlanner
my_agent = Agent(
model="gemini-2.5-flash",
planner=PlanReActPlanner(),
# ... your tools here
)
```
The agent's response will follow a structured format:
```text
[user]: ai news
[google_search_agent]: /*PLANNING*/
1. Perform a Google search for "latest AI news" to get current updates and headlines related to artificial intelligence.
2. Synthesize the information from the search results to provide a summary of recent AI news.
/*ACTION*/
/*REASONING*/
The search results provide a comprehensive overview of recent AI news, covering various aspects like company developments, research breakthroughs, and applications. I have enough information to answer the user's request.
/*FINAL_ANSWER*/
Here's a summary of recent AI news:
....
```
A complete example using `BuiltInPlanner`:
```python
import asyncio
import datetime
from zoneinfo import ZoneInfo
from google.genai import types
from google.genai.types import ThinkingConfig
from google.adk.agents.llm_agent import LlmAgent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.adk.planners import BuiltInPlanner
APP_NAME = "weather_app"
USER_ID = "1234"
SESSION_ID = "session1234"
def get_weather(city: str) -> dict:
"""Retrieves the current weather report for a specified city.
Args:
city (str): The name of the city for which to retrieve the weather report.
Returns:
dict: status and result or error msg.
"""
if city.lower() == "new york":
return {
"status": "success",
"report": (
"The weather in New York is sunny with a temperature of 25 degrees"
" Celsius (77 degrees Fahrenheit)."
),
}
else:
return {
"status": "error",
"error_message": f"Weather information for '{city}' is not available.",
}
def get_current_time(city: str) -> dict:
"""Returns the current time in a specified city.
Args:
city (str): The name of the city for which to retrieve the current time.
Returns:
dict: status and result or error msg.
"""
if city.lower() == "new york":
tz_identifier = "America/New_York"
else:
return {
"status": "error",
"error_message": (
f"Sorry, I don't have timezone information for {city}."
),
}
tz = ZoneInfo(tz_identifier)
now = datetime.datetime.now(tz)
report = (
f'The current time in {city} is {now.strftime("%Y-%m-%d %H:%M:%S %Z%z")}'
)
return {"status": "success", "report": report}
# Step 1: Create a ThinkingConfig
thinking_config = ThinkingConfig(
include_thoughts=True, # Ask the model to include its thoughts in the response
thinking_budget=256 # Limit the 'thinking' to 256 tokens (adjust as needed)
)
print("ThinkingConfig:", thinking_config)
# Step 2: Instantiate BuiltInPlanner
planner = BuiltInPlanner(
thinking_config=thinking_config
)
print("BuiltInPlanner created.")
# Step 3: Wrap the planner in an LlmAgent
agent = LlmAgent(
model="gemini-2.5-pro-preview-03-25", # Set your model name
name="weather_and_time_agent",
instruction="You are an agent that returns time and weather",
planner=planner,
tools=[get_weather, get_current_time]
)
# Session and Runner
session_service = InMemorySessionService()
session = asyncio.run(session_service.create_session(app_name=APP_NAME, user_id=USER_ID, session_id=SESSION_ID))
runner = Runner(agent=agent, app_name=APP_NAME, session_service=session_service)
# Agent Interaction
def call_agent(query):
content = types.Content(role='user', parts=[types.Part(text=query)])
events = runner.run(user_id=USER_ID, session_id=SESSION_ID, new_message=content)
for event in events:
print(f"\nDEBUG EVENT: {event}\n")
if event.is_final_response() and event.content:
final_answer = event.content.parts[0].text.strip()
print("\n🟢 FINAL ANSWER\n", final_answer, "\n")
call_agent("If it's raining in New York right now, what is the current temperature?")
```
### Code Execution
Supported in ADK: Python v0.1.0, Java v0.1.0
- **`code_executor` (Optional):** Provide a `BaseCodeExecutor` instance to allow the agent to execute code blocks found in the LLM's response. For more information, see [Code Execution with Gemini API](/adk-docs/tools/gemini-api/code-execution/).
````python
# Copyright 2025 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import asyncio
from google.adk.agents import LlmAgent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.adk.code_executors import BuiltInCodeExecutor
from google.genai import types
AGENT_NAME = "calculator_agent"
APP_NAME = "calculator"
USER_ID = "user1234"
SESSION_ID = "session_code_exec_async"
GEMINI_MODEL = "gemini-2.0-flash"
# Agent Definition
code_agent = LlmAgent(
name=AGENT_NAME,
model=GEMINI_MODEL,
code_executor=BuiltInCodeExecutor(),
instruction="""You are a calculator agent.
When given a mathematical expression, write and execute Python code to calculate the result.
Return only the final numerical result as plain text, without markdown or code blocks.
""",
description="Executes Python code to perform calculations.",
)
# Session and Runner
session_service = InMemorySessionService()
session = asyncio.run(session_service.create_session(
app_name=APP_NAME, user_id=USER_ID, session_id=SESSION_ID
))
runner = Runner(agent=code_agent, app_name=APP_NAME,
session_service=session_service)
# Agent Interaction (Async)
async def call_agent_async(query):
content = types.Content(role="user", parts=[types.Part(text=query)])
print(f"\n--- Running Query: {query} ---")
final_response_text = "No final text response captured."
try:
# Use run_async
async for event in runner.run_async(
user_id=USER_ID, session_id=SESSION_ID, new_message=content
):
print(f"Event ID: {event.id}, Author: {event.author}")
# --- Check for specific parts FIRST ---
has_specific_part = False
if event.content and event.content.parts:
for part in event.content.parts: # Iterate through all parts
if part.executable_code:
# Access the actual code string via .code
print(
f" Debug: Agent generated code:\n```python\n{part.executable_code.code}\n```"
)
has_specific_part = True
elif part.code_execution_result:
# Access outcome and output correctly
print(
f" Debug: Code Execution Result: {part.code_execution_result.outcome} - Output:\n{part.code_execution_result.output}"
)
has_specific_part = True
# Also print any text parts found in any event for debugging
elif part.text and not part.text.isspace():
print(f" Text: '{part.text.strip()}'")
# Do not set has_specific_part=True here, as we want the final response logic below
# --- Check for final response AFTER specific parts ---
# Only consider it final if it doesn't have the specific code parts we just handled
if not has_specific_part and event.is_final_response():
if (
event.content
and event.content.parts
and event.content.parts[0].text
):
final_response_text = event.content.parts[0].text.strip()
print(f"==> Final Agent Response: {final_response_text}")
else:
print(
"==> Final Agent Response: [No text content in final event]")
except Exception as e:
print(f"ERROR during agent run: {e}")
print("-" * 30)
# Main async function to run the examples
async def main():
await call_agent_async("Calculate the value of (5 + 7) * 3")
await call_agent_async("What is 10 factorial?")
# Execute the main async function
try:
asyncio.run(main())
except RuntimeError as e:
# Handle specific error when running asyncio.run in an already running loop (like Jupyter/Colab)
if "cannot be called from a running event loop" in str(e):
print("\nRunning in an existing event loop (like Colab/Jupyter).")
print("Please run `await main()` in a notebook cell instead.")
# If in an interactive environment like a notebook, you might need to run:
# await main()
else:
raise e # Re-raise other runtime errors
````
````java
import com.google.adk.agents.BaseAgent;
import com.google.adk.agents.LlmAgent;
import com.google.adk.runner.Runner;
import com.google.adk.sessions.InMemorySessionService;
import com.google.adk.sessions.Session;
import com.google.adk.tools.BuiltInCodeExecutionTool;
import com.google.common.collect.ImmutableList;
import com.google.genai.types.Content;
import com.google.genai.types.Part;
public class CodeExecutionAgentApp {
private static final String AGENT_NAME = "calculator_agent";
private static final String APP_NAME = "calculator";
private static final String USER_ID = "user1234";
private static final String SESSION_ID = "session_code_exec_sync";
private static final String GEMINI_MODEL = "gemini-2.0-flash";
/**
* Calls the agent with a query and prints the interaction events and final response.
*
* @param runner The runner instance for the agent.
* @param query The query to send to the agent.
*/
public static void callAgent(Runner runner, String query) {
Content content =
Content.builder().role("user").parts(ImmutableList.of(Part.fromText(query))).build();
InMemorySessionService sessionService = (InMemorySessionService) runner.sessionService();
Session session =
sessionService
.createSession(APP_NAME, USER_ID, /* state= */ null, SESSION_ID)
.blockingGet();
System.out.println("\n--- Running Query: " + query + " ---");
final String[] finalResponseText = {"No final text response captured."};
try {
runner
.runAsync(session.userId(), session.id(), content)
.forEach(
event -> {
System.out.println("Event ID: " + event.id() + ", Author: " + event.author());
boolean hasSpecificPart = false;
if (event.content().isPresent() && event.content().get().parts().isPresent()) {
for (Part part : event.content().get().parts().get()) {
if (part.executableCode().isPresent()) {
System.out.println(
" Debug: Agent generated code:\n```python\n"
+ part.executableCode().get().code()
+ "\n```");
hasSpecificPart = true;
} else if (part.codeExecutionResult().isPresent()) {
System.out.println(
" Debug: Code Execution Result: "
+ part.codeExecutionResult().get().outcome()
+ " - Output:\n"
+ part.codeExecutionResult().get().output());
hasSpecificPart = true;
} else if (part.text().isPresent() && !part.text().get().trim().isEmpty()) {
System.out.println(" Text: '" + part.text().get().trim() + "'");
}
}
}
if (!hasSpecificPart && event.finalResponse()) {
if (event.content().isPresent()
&& event.content().get().parts().isPresent()
&& !event.content().get().parts().get().isEmpty()
&& event.content().get().parts().get().get(0).text().isPresent()) {
finalResponseText[0] =
event.content().get().parts().get().get(0).text().get().trim();
System.out.println("==> Final Agent Response: " + finalResponseText[0]);
} else {
System.out.println(
"==> Final Agent Response: [No text content in final event]");
}
}
});
} catch (Exception e) {
System.err.println("ERROR during agent run: " + e.getMessage());
e.printStackTrace();
}
System.out.println("------------------------------");
}
public static void main(String[] args) {
BuiltInCodeExecutionTool codeExecutionTool = new BuiltInCodeExecutionTool();
BaseAgent codeAgent =
LlmAgent.builder()
.name(AGENT_NAME)
.model(GEMINI_MODEL)
.tools(ImmutableList.of(codeExecutionTool))
.instruction(
"""
You are a calculator agent.
When given a mathematical expression, write and execute Python code to calculate the result.
Return only the final numerical result as plain text, without markdown or code blocks.
""")
.description("Executes Python code to perform calculations.")
.build();
InMemorySessionService sessionService = new InMemorySessionService();
Runner runner = new Runner(codeAgent, APP_NAME, null, sessionService);
callAgent(runner, "Calculate the value of (5 + 7) * 3");
callAgent(runner, "What is 10 factorial?");
}
}
````
## Putting It Together: Example
Here's the complete basic `capital_agent`:
```python
# Copyright 2025 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# --- Full example code demonstrating LlmAgent with Tools vs. Output Schema ---
import json # Needed for pretty printing dicts
import asyncio
from google.adk.agents import LlmAgent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.genai import types
from pydantic import BaseModel, Field
# --- 1. Define Constants ---
APP_NAME = "agent_comparison_app"
USER_ID = "test_user_456"
SESSION_ID_TOOL_AGENT = "session_tool_agent_xyz"
SESSION_ID_SCHEMA_AGENT = "session_schema_agent_xyz"
MODEL_NAME = "gemini-2.0-flash"
# --- 2. Define Schemas ---
# Input schema used by both agents
class CountryInput(BaseModel):
country: str = Field(description="The country to get information about.")
# Output schema ONLY for the second agent
class CapitalInfoOutput(BaseModel):
capital: str = Field(description="The capital city of the country.")
# Note: Population is illustrative; the LLM will infer or estimate this
# as it cannot use tools when output_schema is set.
population_estimate: str = Field(description="An estimated population of the capital city.")
# --- 3. Define the Tool (Only for the first agent) ---
def get_capital_city(country: str) -> str:
"""Retrieves the capital city of a given country."""
print(f"\n-- Tool Call: get_capital_city(country='{country}') --")
country_capitals = {
"united states": "Washington, D.C.",
"canada": "Ottawa",
"france": "Paris",
"japan": "Tokyo",
}
result = country_capitals.get(country.lower(), f"Sorry, I couldn't find the capital for {country}.")
print(f"-- Tool Result: '{result}' --")
return result
# --- 4. Configure Agents ---
# Agent 1: Uses a tool and output_key
capital_agent_with_tool = LlmAgent(
model=MODEL_NAME,
name="capital_agent_tool",
description="Retrieves the capital city using a specific tool.",
instruction="""You are a helpful agent that provides the capital city of a country using a tool.
The user will provide the country name in a JSON format like {"country": "country_name"}.
1. Extract the country name.
2. Use the `get_capital_city` tool to find the capital.
3. Respond clearly to the user, stating the capital city found by the tool.
""",
tools=[get_capital_city],
input_schema=CountryInput,
output_key="capital_tool_result", # Store final text response
)
# Agent 2: Uses output_schema (NO tools possible)
structured_info_agent_schema = LlmAgent(
model=MODEL_NAME,
name="structured_info_agent_schema",
description="Provides capital and estimated population in a specific JSON format.",
instruction=f"""You are an agent that provides country information.
The user will provide the country name in a JSON format like {{"country": "country_name"}}.
Respond ONLY with a JSON object matching this exact schema:
{json.dumps(CapitalInfoOutput.model_json_schema(), indent=2)}
Use your knowledge to determine the capital and estimate the population. Do not use any tools.
""",
# *** NO tools parameter here - using output_schema prevents tool use ***
input_schema=CountryInput,
output_schema=CapitalInfoOutput, # Enforce JSON output structure
output_key="structured_info_result", # Store final JSON response
)
# --- 5. Set up Session Management and Runners ---
session_service = InMemorySessionService()
# Create a runner for EACH agent
capital_runner = Runner(
agent=capital_agent_with_tool,
app_name=APP_NAME,
session_service=session_service
)
structured_runner = Runner(
agent=structured_info_agent_schema,
app_name=APP_NAME,
session_service=session_service
)
# --- 6. Define Agent Interaction Logic ---
async def call_agent_and_print(
runner_instance: Runner,
agent_instance: LlmAgent,
session_id: str,
query_json: str
):
"""Sends a query to the specified agent/runner and prints results."""
print(f"\n>>> Calling Agent: '{agent_instance.name}' | Query: {query_json}")
user_content = types.Content(role='user', parts=[types.Part(text=query_json)])
final_response_content = "No final response received."
async for event in runner_instance.run_async(user_id=USER_ID, session_id=session_id, new_message=user_content):
# print(f"Event: {event.type}, Author: {event.author}") # Uncomment for detailed logging
if event.is_final_response() and event.content and event.content.parts:
# For output_schema, the content is the JSON string itself
final_response_content = event.content.parts[0].text
print(f"<<< Agent '{agent_instance.name}' Response: {final_response_content}")
current_session = await session_service.get_session(app_name=APP_NAME,
user_id=USER_ID,
session_id=session_id)
stored_output = current_session.state.get(agent_instance.output_key)
# Pretty print if the stored output looks like JSON (likely from output_schema)
print(f"--- Session State ['{agent_instance.output_key}']: ", end="")
try:
# Attempt to parse and pretty print if it's JSON
parsed_output = json.loads(stored_output)
print(json.dumps(parsed_output, indent=2))
except (json.JSONDecodeError, TypeError):
# Otherwise, print as string
print(stored_output)
print("-" * 30)
# --- 7. Run Interactions ---
async def main():
# Create separate sessions for clarity, though not strictly necessary if context is managed
print("--- Creating Sessions ---")
await session_service.create_session(app_name=APP_NAME, user_id=USER_ID, session_id=SESSION_ID_TOOL_AGENT)
await session_service.create_session(app_name=APP_NAME, user_id=USER_ID, session_id=SESSION_ID_SCHEMA_AGENT)
print("--- Testing Agent with Tool ---")
await call_agent_and_print(capital_runner, capital_agent_with_tool, SESSION_ID_TOOL_AGENT, '{"country": "France"}')
await call_agent_and_print(capital_runner, capital_agent_with_tool, SESSION_ID_TOOL_AGENT, '{"country": "Canada"}')
print("\n\n--- Testing Agent with Output Schema (No Tool Use) ---")
await call_agent_and_print(structured_runner, structured_info_agent_schema, SESSION_ID_SCHEMA_AGENT, '{"country": "France"}')
await call_agent_and_print(structured_runner, structured_info_agent_schema, SESSION_ID_SCHEMA_AGENT, '{"country": "Japan"}')
# --- Run the Agent ---
# Note: In Colab, you can directly use 'await' at the top level.
# If running this code as a standalone Python script, you'll need to use asyncio.run() or manage the event loop.
if __name__ == "__main__":
asyncio.run(main())
```
```typescript
// Copyright 2025 Google LLC
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
import { LlmAgent, FunctionTool, InMemoryRunner, isFinalResponse } from '@google/adk';
import { createUserContent, Schema, Type } from '@google/genai';
import type { Part } from '@google/genai';
import { z } from 'zod';
// --- 1. Define Constants ---
const APP_NAME = "capital_app_ts";
const USER_ID = "test_user_789";
const SESSION_ID_TOOL_AGENT = "session_tool_agent_ts";
const SESSION_ID_SCHEMA_AGENT = "session_schema_agent_ts";
const MODEL_NAME = "gemini-2.5-flash"; // Using flash for speed
// --- 2. Define Schemas ---
// A. Schema for the Tool's parameters (using Zod)
const CountryInput = z.object({
country: z.string().describe('The country to get the capital for.'),
});
// B. Output schema ONLY for the second agent (using ADK's Schema type)
const CapitalInfoOutputSchema: Schema = {
type: Type.OBJECT,
description: "Schema for capital city information.",
properties: {
capital: {
type: Type.STRING,
description: "The capital city of the country."
},
population_estimate: {
type: Type.STRING,
description: "An estimated population of the capital city."
},
},
required: ["capital", "population_estimate"],
};
// --- 3. Define the Tool (Only for the first agent) ---
async function getCapitalCity(params: z.infer<typeof CountryInput>): Promise<{ result: string }> {
console.log(`\n-- Tool Call: getCapitalCity(country='${params.country}') --`);
  const capitals: Record<string, string> = {
'united states': 'Washington, D.C.',
'canada': 'Ottawa',
'france': 'Paris',
'japan': 'Tokyo',
};
const result = capitals[params.country.toLowerCase()] ??
`Sorry, I couldn't find the capital for ${params.country}.`;
console.log(`-- Tool Result: '${result}' --`);
return { result: result }; // Tools must return an object
}
// --- 4. Configure Agents ---
// Agent 1: Uses a tool and outputKey
const getCapitalCityTool = new FunctionTool({
name: 'get_capital_city',
description: 'Retrieves the capital city for a given country',
parameters: CountryInput,
execute: getCapitalCity,
});
const capitalAgentWithTool = new LlmAgent({
model: MODEL_NAME,
name: 'capital_agent_tool',
description: 'Retrieves the capital city using a specific tool.',
instruction: `You are a helpful agent that provides the capital city of a country using a tool.
The user will provide the country name in a JSON format like {"country": "country_name"}.
1. Extract the country name.
2. Use the \`get_capital_city\` tool to find the capital.
3. Respond with a JSON object with the key 'capital' and the value as the capital city.
`,
tools: [getCapitalCityTool],
outputKey: "capital_tool_result", // Store final text response
});
// Agent 2: Uses outputSchema (NO tools possible)
const structuredInfoAgentSchema = new LlmAgent({
model: MODEL_NAME,
name: 'structured_info_agent_schema',
description: 'Provides capital and estimated population in a specific JSON format.',
instruction: `You are an agent that provides country information.
The user will provide the country name in a JSON format like {"country": "country_name"}.
Respond ONLY with a JSON object matching this exact schema:
${JSON.stringify(CapitalInfoOutputSchema, null, 2)}
Use your knowledge to determine the capital and estimate the population. Do not use any tools.
`,
// *** NO tools parameter here - using outputSchema prevents tool use ***
outputSchema: CapitalInfoOutputSchema,
outputKey: "structured_info_result",
});
// --- 5. Define Agent Interaction Logic ---
async function callAgentAndPrint(
runner: InMemoryRunner,
agent: LlmAgent,
sessionId: string,
queryJson: string
) {
console.log(`\n>>> Calling Agent: '${agent.name}' | Query: ${queryJson}`);
const message = createUserContent(queryJson);
let finalResponseContent = "No final response received.";
for await (const event of runner.runAsync({ userId: USER_ID, sessionId: sessionId, newMessage: message })) {
if (isFinalResponse(event) && event.content?.parts?.length) {
finalResponseContent = event.content.parts.map((part: Part) => part.text ?? '').join('');
}
}
console.log(`<<< Agent '${agent.name}' Response: ${finalResponseContent}`);
// Check the session state
const currentSession = await runner.sessionService.getSession({ appName: APP_NAME, userId: USER_ID, sessionId: sessionId });
if (!currentSession) {
console.log(`--- Session not found: ${sessionId} ---`);
return;
}
const storedOutput = currentSession.state[agent.outputKey!];
console.log(`--- Session State ['${agent.outputKey}']: `);
try {
// Attempt to parse and pretty print if it's JSON
const parsedOutput = JSON.parse(storedOutput as string);
console.log(JSON.stringify(parsedOutput, null, 2));
} catch (e) {
// Otherwise, print as a string
console.log(storedOutput);
}
console.log("-".repeat(30));
}
// --- 6. Run Interactions ---
async function main() {
// Set up runners for each agent
const capitalRunner = new InMemoryRunner({ appName: APP_NAME, agent: capitalAgentWithTool });
const structuredRunner = new InMemoryRunner({ appName: APP_NAME, agent: structuredInfoAgentSchema });
// Create sessions
console.log("--- Creating Sessions ---");
await capitalRunner.sessionService.createSession({ appName: APP_NAME, userId: USER_ID, sessionId: SESSION_ID_TOOL_AGENT });
await structuredRunner.sessionService.createSession({ appName: APP_NAME, userId: USER_ID, sessionId: SESSION_ID_SCHEMA_AGENT });
console.log("\n--- Testing Agent with Tool ---");
await callAgentAndPrint(capitalRunner, capitalAgentWithTool, SESSION_ID_TOOL_AGENT, '{"country": "France"}');
await callAgentAndPrint(capitalRunner, capitalAgentWithTool, SESSION_ID_TOOL_AGENT, '{"country": "Canada"}');
console.log("\n\n--- Testing Agent with Output Schema (No Tool Use) ---");
await callAgentAndPrint(structuredRunner, structuredInfoAgentSchema, SESSION_ID_SCHEMA_AGENT, '{"country": "France"}');
await callAgentAndPrint(structuredRunner, structuredInfoAgentSchema, SESSION_ID_SCHEMA_AGENT, '{"country": "Japan"}');
}
main();
```
```go
package main
import (
"context"
"encoding/json"
"errors"
"fmt"
"log"
"strings"
"google.golang.org/adk/agent"
"google.golang.org/adk/agent/llmagent"
"google.golang.org/adk/model/gemini"
"google.golang.org/adk/runner"
"google.golang.org/adk/session"
"google.golang.org/adk/tool"
"google.golang.org/adk/tool/functiontool"
"google.golang.org/genai"
)
// --- Main Runnable Example ---
const (
modelName = "gemini-2.0-flash"
appName = "agent_comparison_app"
userID = "test_user_456"
)
type getCapitalCityArgs struct {
Country string `json:"country" jsonschema:"The country to get the capital of."`
}
// getCapitalCity retrieves the capital city of a given country.
func getCapitalCity(ctx tool.Context, args getCapitalCityArgs) (map[string]any, error) {
fmt.Printf("\n-- Tool Call: getCapitalCity(country='%s') --\n", args.Country)
capitals := map[string]string{
"united states": "Washington, D.C.",
"canada": "Ottawa",
"france": "Paris",
"japan": "Tokyo",
}
capital, ok := capitals[strings.ToLower(args.Country)]
if !ok {
result := fmt.Sprintf("Sorry, I couldn't find the capital for %s.", args.Country)
fmt.Printf("-- Tool Result: '%s' --\n", result)
return nil, errors.New(result)
}
fmt.Printf("-- Tool Result: '%s' --\n", capital)
return map[string]any{"result": capital}, nil
}
// callAgent is a helper function to execute an agent with a given prompt and handle its output.
func callAgent(ctx context.Context, a agent.Agent, outputKey string, prompt string) {
fmt.Printf("\n>>> Calling Agent: '%s' | Query: %s\n", a.Name(), prompt)
// Create an in-memory session service to manage agent state.
sessionService := session.InMemoryService()
// Create a new session for the agent interaction.
sessionCreateResponse, err := sessionService.Create(ctx, &session.CreateRequest{
AppName: appName,
UserID: userID,
})
if err != nil {
log.Fatalf("Failed to create the session service: %v", err)
}
session := sessionCreateResponse.Session
// Configure the runner with the application name, agent, and session service.
config := runner.Config{
AppName: appName,
Agent: a,
SessionService: sessionService,
}
// Create a new runner instance.
r, err := runner.New(config)
if err != nil {
log.Fatalf("Failed to create the runner: %v", err)
}
// Prepare the user's message to send to the agent.
sessionID := session.ID()
userMsg := &genai.Content{
Parts: []*genai.Part{
genai.NewPartFromText(prompt),
},
Role: string(genai.RoleUser),
}
// Run the agent and process the streaming events.
for event, err := range r.Run(ctx, userID, sessionID, userMsg, agent.RunConfig{
StreamingMode: agent.StreamingModeSSE,
}) {
if err != nil {
fmt.Printf("\nAGENT_ERROR: %v\n", err)
} else if event.Partial {
// Print partial responses as they are received.
for _, p := range event.Content.Parts {
fmt.Print(p.Text)
}
}
}
// After the run, check if there's an expected output key in the session state.
if outputKey != "" {
		storedOutput, err := session.State().Get(outputKey)
		if err == nil {
// Pretty-print the stored output if it's a JSON string.
fmt.Printf("\n--- Session State ['%s']: ", outputKey)
storedString, isString := storedOutput.(string)
if isString {
var prettyJSON map[string]interface{}
if err := json.Unmarshal([]byte(storedString), &prettyJSON); err == nil {
indentedJSON, err := json.MarshalIndent(prettyJSON, "", " ")
if err == nil {
fmt.Println(string(indentedJSON))
} else {
fmt.Println(storedString)
}
} else {
fmt.Println(storedString)
}
} else {
fmt.Println(storedOutput)
}
fmt.Println(strings.Repeat("-", 30))
}
}
}
func main() {
ctx := context.Background()
model, err := gemini.NewModel(ctx, modelName, &genai.ClientConfig{})
if err != nil {
log.Fatalf("Failed to create model: %v", err)
}
capitalTool, err := functiontool.New(
functiontool.Config{
Name: "get_capital_city",
Description: "Retrieves the capital city for a given country.",
},
getCapitalCity,
)
if err != nil {
log.Fatalf("Failed to create function tool: %v", err)
}
countryInputSchema := &genai.Schema{
Type: genai.TypeObject,
Description: "Input for specifying a country.",
Properties: map[string]*genai.Schema{
"country": {
Type: genai.TypeString,
Description: "The country to get information about.",
},
},
Required: []string{"country"},
}
capitalAgentWithTool, err := llmagent.New(llmagent.Config{
Name: "capital_agent_tool",
Model: model,
Description: "Retrieves the capital city using a specific tool.",
Instruction: `You are a helpful agent that provides the capital city of a country using a tool.
The user will provide the country name in a JSON format like {"country": "country_name"}.
1. Extract the country name.
2. Use the 'get_capital_city' tool to find the capital.
3. Respond clearly to the user, stating the capital city found by the tool.`,
Tools: []tool.Tool{capitalTool},
InputSchema: countryInputSchema,
OutputKey: "capital_tool_result",
})
if err != nil {
log.Fatalf("Failed to create capital agent with tool: %v", err)
}
capitalInfoOutputSchema := &genai.Schema{
Type: genai.TypeObject,
Description: "Schema for capital city information.",
Properties: map[string]*genai.Schema{
"capital": {
Type: genai.TypeString,
Description: "The capital city of the country.",
},
"population_estimate": {
Type: genai.TypeString,
Description: "An estimated population of the capital city.",
},
},
Required: []string{"capital", "population_estimate"},
}
schemaJSON, _ := json.Marshal(capitalInfoOutputSchema)
structuredInfoAgentSchema, err := llmagent.New(llmagent.Config{
Name: "structured_info_agent_schema",
Model: model,
Description: "Provides capital and estimated population in a specific JSON format.",
Instruction: fmt.Sprintf(`You are an agent that provides country information.
The user will provide the country name in a JSON format like {"country": "country_name"}.
Respond ONLY with a JSON object matching this exact schema:
%s
Use your knowledge to determine the capital and estimate the population. Do not use any tools.`, string(schemaJSON)),
InputSchema: countryInputSchema,
OutputSchema: capitalInfoOutputSchema,
OutputKey: "structured_info_result",
})
if err != nil {
log.Fatalf("Failed to create structured info agent: %v", err)
}
fmt.Println("--- Testing Agent with Tool ---")
callAgent(ctx, capitalAgentWithTool, "capital_tool_result", `{"country": "France"}`)
callAgent(ctx, capitalAgentWithTool, "capital_tool_result", `{"country": "Canada"}`)
fmt.Println("\n\n--- Testing Agent with Output Schema (No Tool Use) ---")
callAgent(ctx, structuredInfoAgentSchema, "structured_info_result", `{"country": "France"}`)
callAgent(ctx, structuredInfoAgentSchema, "structured_info_result", `{"country": "Japan"}`)
}
```
```java
// --- Full example code demonstrating LlmAgent with Tools vs. Output Schema ---
import com.google.adk.agents.LlmAgent;
import com.google.adk.events.Event;
import com.google.adk.runner.Runner;
import com.google.adk.sessions.InMemorySessionService;
import com.google.adk.sessions.Session;
import com.google.adk.tools.Annotations;
import com.google.adk.tools.FunctionTool;
import com.google.genai.types.Content;
import com.google.genai.types.Part;
import com.google.genai.types.Schema;
import io.reactivex.rxjava3.core.Flowable;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
public class LlmAgentExample {
// --- 1. Define Constants ---
private static final String MODEL_NAME = "gemini-2.0-flash";
private static final String APP_NAME = "capital_agent_tool";
private static final String USER_ID = "test_user_456";
private static final String SESSION_ID_TOOL_AGENT = "session_tool_agent_xyz";
private static final String SESSION_ID_SCHEMA_AGENT = "session_schema_agent_xyz";
// --- 2. Define Schemas ---
// Input schema used by both agents
private static final Schema COUNTRY_INPUT_SCHEMA =
Schema.builder()
.type("OBJECT")
.description("Input for specifying a country.")
.properties(
Map.of(
"country",
Schema.builder()
.type("STRING")
.description("The country to get information about.")
.build()))
.required(List.of("country"))
.build();
// Output schema ONLY for the second agent
private static final Schema CAPITAL_INFO_OUTPUT_SCHEMA =
Schema.builder()
.type("OBJECT")
.description("Schema for capital city information.")
.properties(
Map.of(
"capital",
Schema.builder()
.type("STRING")
.description("The capital city of the country.")
.build(),
"population_estimate",
Schema.builder()
.type("STRING")
.description("An estimated population of the capital city.")
.build()))
.required(List.of("capital", "population_estimate"))
.build();
// --- 3. Define the Tool (Only for the first agent) ---
// Retrieves the capital city of a given country.
  public static Map<String, String> getCapitalCity(
@Annotations.Schema(name = "country", description = "The country to get capital for")
String country) {
System.out.printf("%n-- Tool Call: getCapitalCity(country='%s') --%n", country);
    Map<String, String> countryCapitals = new HashMap<>();
countryCapitals.put("united states", "Washington, D.C.");
countryCapitals.put("canada", "Ottawa");
countryCapitals.put("france", "Paris");
countryCapitals.put("japan", "Tokyo");
String result =
countryCapitals.getOrDefault(
country.toLowerCase(), "Sorry, I couldn't find the capital for " + country + ".");
System.out.printf("-- Tool Result: '%s' --%n", result);
return Map.of("result", result); // Tools must return a Map
}
public static void main(String[] args){
LlmAgentExample agentExample = new LlmAgentExample();
FunctionTool capitalTool = FunctionTool.create(agentExample.getClass(), "getCapitalCity");
// --- 4. Configure Agents ---
// Agent 1: Uses a tool and output_key
LlmAgent capitalAgentWithTool =
LlmAgent.builder()
.model(MODEL_NAME)
.name("capital_agent_tool")
.description("Retrieves the capital city using a specific tool.")
.instruction(
"""
You are a helpful agent that provides the capital city of a country using a tool.
1. Extract the country name.
2. Use the `get_capital_city` tool to find the capital.
3. Respond clearly to the user, stating the capital city found by the tool.
""")
.tools(capitalTool)
.inputSchema(COUNTRY_INPUT_SCHEMA)
.outputKey("capital_tool_result") // Store final text response
.build();
// Agent 2: Uses an output schema
LlmAgent structuredInfoAgentSchema =
LlmAgent.builder()
.model(MODEL_NAME)
.name("structured_info_agent_schema")
.description("Provides capital and estimated population in a specific JSON format.")
.instruction(
String.format("""
You are an agent that provides country information.
Respond ONLY with a JSON object matching this exact schema: %s
Use your knowledge to determine the capital and estimate the population. Do not use any tools.
""", CAPITAL_INFO_OUTPUT_SCHEMA.toJson()))
// *** NO tools parameter here - using output_schema prevents tool use ***
.inputSchema(COUNTRY_INPUT_SCHEMA)
.outputSchema(CAPITAL_INFO_OUTPUT_SCHEMA) // Enforce JSON output structure
.outputKey("structured_info_result") // Store final JSON response
.build();
// --- 5. Set up Session Management and Runners ---
InMemorySessionService sessionService = new InMemorySessionService();
sessionService.createSession(APP_NAME, USER_ID, null, SESSION_ID_TOOL_AGENT).blockingGet();
sessionService.createSession(APP_NAME, USER_ID, null, SESSION_ID_SCHEMA_AGENT).blockingGet();
Runner capitalRunner = new Runner(capitalAgentWithTool, APP_NAME, null, sessionService);
Runner structuredRunner = new Runner(structuredInfoAgentSchema, APP_NAME, null, sessionService);
// --- 6. Run Interactions ---
System.out.println("--- Testing Agent with Tool ---");
agentExample.callAgentAndPrint(
capitalRunner, capitalAgentWithTool, SESSION_ID_TOOL_AGENT, "{\"country\": \"France\"}");
agentExample.callAgentAndPrint(
capitalRunner, capitalAgentWithTool, SESSION_ID_TOOL_AGENT, "{\"country\": \"Canada\"}");
System.out.println("\n\n--- Testing Agent with Output Schema (No Tool Use) ---");
agentExample.callAgentAndPrint(
structuredRunner,
structuredInfoAgentSchema,
SESSION_ID_SCHEMA_AGENT,
"{\"country\": \"France\"}");
agentExample.callAgentAndPrint(
structuredRunner,
structuredInfoAgentSchema,
SESSION_ID_SCHEMA_AGENT,
"{\"country\": \"Japan\"}");
}
// --- 7. Define Agent Interaction Logic ---
public void callAgentAndPrint(Runner runner, LlmAgent agent, String sessionId, String queryJson) {
System.out.printf(
"%n>>> Calling Agent: '%s' | Session: '%s' | Query: %s%n",
agent.name(), sessionId, queryJson);
Content userContent = Content.fromParts(Part.fromText(queryJson));
final String[] finalResponseContent = {"No final response received."};
    Flowable<Event> eventStream = runner.runAsync(USER_ID, sessionId, userContent);
// Stream event response
eventStream.blockingForEach(event -> {
if (event.finalResponse() && event.content().isPresent()) {
event
.content()
.get()
.parts()
.flatMap(parts -> parts.isEmpty() ? Optional.empty() : Optional.of(parts.get(0)))
.flatMap(Part::text)
.ifPresent(text -> finalResponseContent[0] = text);
}
});
System.out.printf("<<< Agent '%s' Response: %s%n", agent.name(), finalResponseContent[0]);
// Retrieve the session again to get the updated state
Session updatedSession =
runner
.sessionService()
.getSession(APP_NAME, USER_ID, sessionId, Optional.empty())
.blockingGet();
    if (updatedSession != null && agent.outputKey().isPresent()) {
      String outputKey = agent.outputKey().get();
      // Print the stored output (a JSON string when output_schema is used)
      System.out.printf("--- Session State ['%s']: %s%n", outputKey, updatedSession.state().get(outputKey));
    }
}
}
```
*(This example demonstrates the core concepts. More complex agents might incorporate schemas, context control, planning, etc.)*
## Related Concepts (Deferred Topics)
While this page covers the core configuration of `LlmAgent`, several related concepts provide more advanced control and are detailed elsewhere:
- **Callbacks:** Intercepting execution points (before/after model calls, before/after tool calls) using `before_model_callback`, `after_model_callback`, etc. See [Callbacks](https://google.github.io/adk-docs/callbacks/types-of-callbacks/index.md).
- **Multi-Agent Control:** Advanced strategies for agent interaction, including planning (`planner`), controlling agent transfer (`disallow_transfer_to_parent`, `disallow_transfer_to_peers`), and system-wide instructions (`global_instruction`). See [Multi-Agents](https://google.github.io/adk-docs/agents/multi-agents/index.md).
# Multi-Agent Systems in ADK
Supported in ADK: Python v0.1.0, TypeScript v0.2.0, Go v0.1.0, Java v0.1.0
As agentic applications grow in complexity, structuring them as a single, monolithic agent can become challenging to develop, maintain, and reason about. The Agent Development Kit (ADK) supports building sophisticated applications by composing multiple, distinct `BaseAgent` instances into a **Multi-Agent System (MAS)**.
In ADK, a multi-agent system is an application where different agents, often forming a hierarchy, collaborate or coordinate to achieve a larger goal. Structuring your application this way offers significant advantages, including enhanced modularity, specialization, reusability, maintainability, and the ability to define structured control flows using dedicated workflow agents.
You can compose various types of agents derived from `BaseAgent` to build these systems:
- **LLM Agents:** Agents powered by large language models. (See [LLM Agents](https://google.github.io/adk-docs/agents/llm-agents/index.md))
- **Workflow Agents:** Specialized agents (`SequentialAgent`, `ParallelAgent`, `LoopAgent`) designed to manage the execution flow of their sub-agents. (See [Workflow Agents](https://google.github.io/adk-docs/agents/workflow-agents/index.md))
- **Custom agents:** Your own agents inheriting from `BaseAgent` with specialized, non-LLM logic. (See [Custom Agents](https://google.github.io/adk-docs/agents/custom-agents/index.md))
The following sections detail the core ADK primitives—such as agent hierarchy, workflow agents, and interaction mechanisms—that enable you to construct and manage these multi-agent systems effectively.
## 1. ADK Primitives for Agent Composition
ADK provides core building blocks—primitives—that enable you to structure and manage interactions within your multi-agent system.
Note
The specific parameters or method names for the primitives may vary slightly by SDK language (e.g., `sub_agents` in Python, `subAgents` in Java). Refer to the language-specific API documentation for details.
### 1.1. Agent Hierarchy (Parent agent, Sub Agents)
The foundation for structuring multi-agent systems is the parent-child relationship defined in `BaseAgent`.
- **Establishing Hierarchy:** You create a tree structure by passing a list of agent instances to the `sub_agents` argument when initializing a parent agent. ADK automatically sets the `parent_agent` attribute on each child agent during initialization.
- **Single Parent Rule:** An agent instance can only be added as a sub-agent once. Attempting to assign a second parent will result in a `ValueError`.
- **Importance:** This hierarchy defines the scope for [Workflow Agents](#workflow-agents-as-orchestrators) and influences the potential targets for LLM-Driven Delegation. You can navigate the hierarchy using `agent.parent_agent` or find descendants using `agent.find_agent(name)`.
```python
# Conceptual Example: Defining Hierarchy
from google.adk.agents import LlmAgent, BaseAgent
# Define individual agents
greeter = LlmAgent(name="Greeter", model="gemini-2.0-flash")
task_doer = BaseAgent(name="TaskExecutor") # Custom non-LLM agent
# Create parent agent and assign children via sub_agents
coordinator = LlmAgent(
name="Coordinator",
model="gemini-2.0-flash",
description="I coordinate greetings and tasks.",
sub_agents=[ # Assign sub_agents here
greeter,
task_doer
]
)
# Framework automatically sets:
# assert greeter.parent_agent == coordinator
# assert task_doer.parent_agent == coordinator
```
```typescript
// Conceptual Example: Defining Hierarchy
import { LlmAgent, BaseAgent, createEventActions } from '@google/adk';
import type { Event, InvocationContext } from '@google/adk';
class TaskExecutorAgent extends BaseAgent {
  async *runAsyncImpl(context: InvocationContext): AsyncGenerator<Event> {
yield {
id: 'event-1',
invocationId: context.invocationId,
author: this.name,
content: { parts: [{ text: 'Task completed!' }] },
actions: createEventActions(),
timestamp: Date.now(),
};
}
  async *runLiveImpl(context: InvocationContext): AsyncGenerator<Event> {
    yield* this.runAsyncImpl(context);
  }
}
// Define individual agents
const greeter = new LlmAgent({name: 'Greeter', model: 'gemini-2.5-flash'});
const taskDoer = new TaskExecutorAgent({name: 'TaskExecutor'}); // Custom non-LLM agent
// Create parent agent and assign children via subAgents
const coordinator = new LlmAgent({
name: 'Coordinator',
model: 'gemini-2.5-flash',
description: 'I coordinate greetings and tasks.',
subAgents: [ // Assign subAgents here
greeter,
taskDoer
],
});
// Framework automatically sets:
// console.assert(greeter.parentAgent === coordinator);
// console.assert(taskDoer.parentAgent === coordinator);
```
```go
import (
"google.golang.org/adk/agent"
"google.golang.org/adk/agent/llmagent"
)
// Conceptual Example: Defining Hierarchy
// Define individual agents
greeter, _ := llmagent.New(llmagent.Config{Name: "Greeter", Model: m})
taskDoer, _ := agent.New(agent.Config{Name: "TaskExecutor"}) // Custom non-LLM agent
// Create parent agent and assign children via sub_agents
coordinator, _ := llmagent.New(llmagent.Config{
Name: "Coordinator",
Model: m,
Description: "I coordinate greetings and tasks.",
SubAgents: []agent.Agent{greeter, taskDoer}, // Assign sub_agents here
})
```
```java
// Conceptual Example: Defining Hierarchy
import com.google.adk.agents.SequentialAgent;
import com.google.adk.agents.LlmAgent;
// Define individual agents
LlmAgent greeter = LlmAgent.builder().name("Greeter").model("gemini-2.0-flash").build();
SequentialAgent taskDoer = SequentialAgent.builder().name("TaskExecutor").subAgents(...).build(); // Sequential Agent
// Create parent agent and assign sub_agents
LlmAgent coordinator = LlmAgent.builder()
.name("Coordinator")
.model("gemini-2.0-flash")
.description("I coordinate greetings and tasks")
.subAgents(greeter, taskDoer) // Assign sub_agents here
.build();
// Framework automatically sets:
// assert greeter.parentAgent().equals(coordinator);
// assert taskDoer.parentAgent().equals(coordinator);
```
### 1.2. Workflow Agents as Orchestrators
ADK includes specialized agents derived from `BaseAgent` that don't perform tasks themselves but orchestrate the execution flow of their `sub_agents`.
- **[`SequentialAgent`](https://google.github.io/adk-docs/agents/workflow-agents/sequential-agents/index.md):** Executes its `sub_agents` one after another in the order they are listed.
- **Context:** Passes the *same* [`InvocationContext`](https://google.github.io/adk-docs/runtime/index.md) sequentially, allowing agents to easily pass results via shared state.
```python
# Conceptual Example: Sequential Pipeline
from google.adk.agents import SequentialAgent, LlmAgent
step1 = LlmAgent(name="Step1_Fetch", output_key="data") # Saves output to state['data']
step2 = LlmAgent(name="Step2_Process", instruction="Process data from {data}.")
pipeline = SequentialAgent(name="MyPipeline", sub_agents=[step1, step2])
# When pipeline runs, Step2 can access the state['data'] set by Step1.
```
```typescript
// Conceptual Example: Sequential Pipeline
import { SequentialAgent, LlmAgent } from '@google/adk';
const step1 = new LlmAgent({name: 'Step1_Fetch', outputKey: 'data'}); // Saves output to state['data']
const step2 = new LlmAgent({name: 'Step2_Process', instruction: 'Process data from {data}.'});
const pipeline = new SequentialAgent({name: 'MyPipeline', subAgents: [step1, step2]});
// When pipeline runs, Step2 can access the state['data'] set by Step1.
```
```go
import (
"google.golang.org/adk/agent"
"google.golang.org/adk/agent/llmagent"
"google.golang.org/adk/agent/workflowagents/sequentialagent"
)
// Conceptual Example: Sequential Pipeline
step1, _ := llmagent.New(llmagent.Config{Name: "Step1_Fetch", OutputKey: "data", Model: m}) // Saves output to state["data"]
step2, _ := llmagent.New(llmagent.Config{Name: "Step2_Process", Instruction: "Process data from {data}.", Model: m})
pipeline, _ := sequentialagent.New(sequentialagent.Config{
AgentConfig: agent.Config{Name: "MyPipeline", SubAgents: []agent.Agent{step1, step2}},
})
// When pipeline runs, Step2 can access the state["data"] set by Step1.
```
```java
// Conceptual Example: Sequential Pipeline
import com.google.adk.agents.SequentialAgent;
import com.google.adk.agents.LlmAgent;
LlmAgent step1 = LlmAgent.builder().name("Step1_Fetch").outputKey("data").build(); // Saves output to state.get("data")
LlmAgent step2 = LlmAgent.builder().name("Step2_Process").instruction("Process data from {data}.").build();
SequentialAgent pipeline = SequentialAgent.builder().name("MyPipeline").subAgents(step1, step2).build();
// When pipeline runs, Step2 can access the state.get("data") set by Step1.
```
- **[`ParallelAgent`](https://google.github.io/adk-docs/agents/workflow-agents/parallel-agents/index.md):** Executes its `sub_agents` in parallel. Events from sub-agents may be interleaved.
- **Context:** Modifies the `InvocationContext.branch` for each child agent (e.g., `ParentBranch.ChildName`), providing a distinct contextual path which can be useful for isolating history in some memory implementations.
- **State:** Despite different branches, all parallel children access the *same shared* `session.state`, enabling them to read initial state and write results (use distinct keys to avoid race conditions).
```python
# Conceptual Example: Parallel Execution
from google.adk.agents import ParallelAgent, LlmAgent
fetch_weather = LlmAgent(name="WeatherFetcher", output_key="weather")
fetch_news = LlmAgent(name="NewsFetcher", output_key="news")
gatherer = ParallelAgent(name="InfoGatherer", sub_agents=[fetch_weather, fetch_news])
# When gatherer runs, WeatherFetcher and NewsFetcher run concurrently.
# A subsequent agent could read state['weather'] and state['news'].
```
```typescript
// Conceptual Example: Parallel Execution
import { ParallelAgent, LlmAgent } from '@google/adk';
const fetchWeather = new LlmAgent({name: 'WeatherFetcher', outputKey: 'weather'});
const fetchNews = new LlmAgent({name: 'NewsFetcher', outputKey: 'news'});
const gatherer = new ParallelAgent({name: 'InfoGatherer', subAgents: [fetchWeather, fetchNews]});
// When gatherer runs, WeatherFetcher and NewsFetcher run concurrently.
// A subsequent agent could read state['weather'] and state['news'].
```
```go
import (
"google.golang.org/adk/agent"
"google.golang.org/adk/agent/llmagent"
"google.golang.org/adk/agent/workflowagents/parallelagent"
)
// Conceptual Example: Parallel Execution
fetchWeather, _ := llmagent.New(llmagent.Config{Name: "WeatherFetcher", OutputKey: "weather", Model: m})
fetchNews, _ := llmagent.New(llmagent.Config{Name: "NewsFetcher", OutputKey: "news", Model: m})
gatherer, _ := parallelagent.New(parallelagent.Config{
AgentConfig: agent.Config{Name: "InfoGatherer", SubAgents: []agent.Agent{fetchWeather, fetchNews}},
})
// When gatherer runs, WeatherFetcher and NewsFetcher run concurrently.
// A subsequent agent could read state["weather"] and state["news"].
```
```java
// Conceptual Example: Parallel Execution
import com.google.adk.agents.LlmAgent;
import com.google.adk.agents.ParallelAgent;
LlmAgent fetchWeather = LlmAgent.builder()
.name("WeatherFetcher")
.outputKey("weather")
.build();
LlmAgent fetchNews = LlmAgent.builder()
.name("NewsFetcher")
.outputKey("news")
.build();
ParallelAgent gatherer = ParallelAgent.builder()
.name("InfoGatherer")
.subAgents(fetchWeather, fetchNews)
.build();
// When gatherer runs, WeatherFetcher and NewsFetcher run concurrently.
// A subsequent agent could read state.get("weather") and state.get("news").
```
- **[`LoopAgent`](https://google.github.io/adk-docs/agents/workflow-agents/loop-agents/index.md):** Executes its `sub_agents` sequentially in a loop.
- **Termination:** The loop stops if the optional `max_iterations` is reached, or if any sub-agent returns an [`Event`](https://google.github.io/adk-docs/events/index.md) with `escalate=True` in its Event Actions.
- **Context & State:** Passes the *same* `InvocationContext` in each iteration, allowing state changes (e.g., counters, flags) to persist across loops.
```python
# Conceptual Example: Loop with Condition
from google.adk.agents import LoopAgent, LlmAgent, BaseAgent
from google.adk.events import Event, EventActions
from google.adk.agents.invocation_context import InvocationContext
from typing import AsyncGenerator
class CheckCondition(BaseAgent): # Custom agent to check state
async def _run_async_impl(self, ctx: InvocationContext) -> AsyncGenerator[Event, None]:
status = ctx.session.state.get("status", "pending")
is_done = (status == "completed")
yield Event(author=self.name, actions=EventActions(escalate=is_done)) # Escalate if done
process_step = LlmAgent(name="ProcessingStep") # Agent that might update state['status']
poller = LoopAgent(
name="StatusPoller",
max_iterations=10,
sub_agents=[process_step, CheckCondition(name="Checker")]
)
# When poller runs, it executes process_step then Checker repeatedly
# until Checker escalates (state['status'] == 'completed') or 10 iterations pass.
```
```typescript
// Conceptual Example: Loop with Condition
import { LoopAgent, LlmAgent, BaseAgent, InvocationContext, createEvent, createEventActions } from '@google/adk';
import type { Event, EventActions } from '@google/adk';
class CheckConditionAgent extends BaseAgent { // Custom agent to check state
async *runAsyncImpl(ctx: InvocationContext): AsyncGenerator {
const status = ctx.session.state['status'] || 'pending';
const isDone = status === 'completed';
yield createEvent({ author: this.name, actions: createEventActions({ escalate: isDone }) });
}
async *runLiveImpl(ctx: InvocationContext): AsyncGenerator {
// This is not implemented.
}
}
const processStep = new LlmAgent({name: 'ProcessingStep'}); // Agent that might update state['status']
const poller = new LoopAgent({
name: 'StatusPoller',
maxIterations: 10,
// Executes its sub_agents sequentially in a loop
subAgents: [processStep, new CheckConditionAgent({name: 'Checker'})]
});
// When poller runs, it executes processStep then Checker repeatedly
// until Checker escalates (state['status'] === 'completed') or 10 iterations pass.
```
```go
import (
"iter"
"google.golang.org/adk/agent"
"google.golang.org/adk/agent/llmagent"
"google.golang.org/adk/agent/workflowagents/loopagent"
"google.golang.org/adk/session"
)
// Conceptual Example: Loop with Condition
// Custom agent to check state
checkCondition, _ := agent.New(agent.Config{
Name: "Checker",
Run: func(ctx agent.InvocationContext) iter.Seq2[*session.Event, error] {
return func(yield func(*session.Event, error) bool) {
status, err := ctx.Session().State().Get("status")
// If "status" is not in the state, default to "pending".
// This is idiomatic Go for handling a potential error on lookup.
if err != nil {
status = "pending"
}
isDone := status == "completed"
yield(&session.Event{Author: "Checker", Actions: session.EventActions{Escalate: isDone}}, nil)
}
},
})
processStep, _ := llmagent.New(llmagent.Config{Name: "ProcessingStep", Model: m}) // Agent that might update state["status"]
poller, _ := loopagent.New(loopagent.Config{
MaxIterations: 10,
AgentConfig: agent.Config{Name: "StatusPoller", SubAgents: []agent.Agent{processStep, checkCondition}},
})
// When poller runs, it executes processStep then Checker repeatedly
// until Checker escalates (state["status"] == "completed") or 10 iterations pass.
```
```java
// Conceptual Example: Loop with Condition
import com.google.adk.agents.BaseAgent;
import com.google.adk.agents.LlmAgent;
import com.google.adk.agents.LoopAgent;
// Custom agent to check state and potentially escalate
public static class CheckConditionAgent extends BaseAgent {
public CheckConditionAgent(String name, String description) {
super(name, description, List.of(), null, null);
}
@Override
protected Flowable<Event> runAsyncImpl(InvocationContext ctx) {
String status = (String) ctx.session().state().getOrDefault("status", "pending");
boolean isDone = "completed".equalsIgnoreCase(status);
// Emit an event that signals to escalate (exit the loop) if the condition is met.
// If not done, the escalate flag will be false or absent, and the loop continues.
Event checkEvent = Event.builder()
.author(name())
.id(Event.generateEventId()) // Important to give events unique IDs
.actions(EventActions.builder().escalate(isDone).build()) // Escalate if done
.build();
return Flowable.just(checkEvent);
}
}
// Agent that might update state.put("status", ...)
LlmAgent processingStepAgent = LlmAgent.builder().name("ProcessingStep").build();
// Custom agent instance for checking the condition
CheckConditionAgent conditionCheckerAgent = new CheckConditionAgent(
"ConditionChecker",
"Checks if the status is 'completed'."
);
LoopAgent poller = LoopAgent.builder().name("StatusPoller").maxIterations(10).subAgents(processingStepAgent, conditionCheckerAgent).build();
// When poller runs, it executes processingStepAgent then conditionCheckerAgent repeatedly
// until Checker escalates (state.get("status") == "completed") or 10 iterations pass.
````
### 1.3. Interaction & Communication Mechanisms
Agents within a system often need to exchange data or trigger actions in one another. ADK facilitates this through:
#### a) Shared Session State (`session.state`)
The most fundamental way for agents operating within the same invocation (and thus sharing the same [`Session`](/adk-docs/sessions/session/) object via the `InvocationContext`) to communicate passively.
- **Mechanism:** One agent (or its tool/callback) writes a value (`context.state['data_key'] = processed_data`), and a subsequent agent reads it (`data = context.state.get('data_key')`). State changes are tracked via [`CallbackContext`](https://google.github.io/adk-docs/callbacks/index.md).
- **Convenience:** The `output_key` property on [`LlmAgent`](https://google.github.io/adk-docs/agents/llm-agents/index.md) automatically saves the agent's final response text (or structured output) to the specified state key.
- **Nature:** Asynchronous, passive communication. Ideal for pipelines orchestrated by `SequentialAgent` or passing data across `LoopAgent` iterations.
- **See Also:** [State Management](https://google.github.io/adk-docs/sessions/state/index.md)
**Invocation Context and `temp:` State:** When a parent agent invokes a sub-agent, it passes the same `InvocationContext`. This means they share the same temporary (`temp:`) state, which is ideal for passing data that is only relevant for the current turn.
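The prefix semantics can be sketched with a plain dictionary (no ADK required; the key names are illustrative): any agent sharing the invocation context can read a `temp:`-prefixed key, but such keys are dropped before the session state is persisted.

```python
# Conceptual sketch (plain dict, illustrative key names): temp:-prefixed
# keys are visible to all agents sharing one InvocationContext, but are
# filtered out when the session state is persisted.
state = {"user:name": "Ada", "temp:raw_payload": {"status": "pending"}}

# Any sub-agent in the same invocation can read the turn-local value:
payload = state.get("temp:raw_payload")

# On persistence, only non-temp keys survive:
persisted = {k: v for k, v in state.items() if not k.startswith("temp:")}
print(persisted)  # {'user:name': 'Ada'}
```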
```python
# Conceptual Example: Using output_key and reading state
from google.adk.agents import LlmAgent, SequentialAgent
agent_A = LlmAgent(name="AgentA", instruction="Find the capital of France.", output_key="capital_city")
agent_B = LlmAgent(name="AgentB", instruction="Tell me about the city stored in {capital_city}.")
pipeline = SequentialAgent(name="CityInfo", sub_agents=[agent_A, agent_B])
# AgentA runs, saves "Paris" to state['capital_city'].
# AgentB runs, its instruction processor reads state['capital_city'] to get "Paris".
```
```typescript
// Conceptual Example: Using outputKey and reading state
import { LlmAgent, SequentialAgent } from '@google/adk';
const agentA = new LlmAgent({name: 'AgentA', instruction: 'Find the capital of France.', outputKey: 'capital_city'});
const agentB = new LlmAgent({name: 'AgentB', instruction: 'Tell me about the city stored in {capital_city}.'});
const pipeline = new SequentialAgent({name: 'CityInfo', subAgents: [agentA, agentB]});
// AgentA runs, saves "Paris" to state['capital_city'].
// AgentB runs, its instruction processor reads state['capital_city'] to get "Paris".
```
```go
import (
"google.golang.org/adk/agent"
"google.golang.org/adk/agent/llmagent"
"google.golang.org/adk/agent/workflowagents/sequentialagent"
)
// Conceptual Example: Using output_key and reading state
agentA, _ := llmagent.New(llmagent.Config{Name: "AgentA", Instruction: "Find the capital of France.", OutputKey: "capital_city", Model: m})
agentB, _ := llmagent.New(llmagent.Config{Name: "AgentB", Instruction: "Tell me about the city stored in {capital_city}.", Model: m})
pipeline2, _ := sequentialagent.New(sequentialagent.Config{
AgentConfig: agent.Config{Name: "CityInfo", SubAgents: []agent.Agent{agentA, agentB}},
})
// AgentA runs, saves "Paris" to state["capital_city"].
// AgentB runs, its instruction processor reads state["capital_city"] to get "Paris".
```
```java
// Conceptual Example: Using outputKey and reading state
import com.google.adk.agents.LlmAgent;
import com.google.adk.agents.SequentialAgent;
LlmAgent agentA = LlmAgent.builder()
.name("AgentA")
.instruction("Find the capital of France.")
.outputKey("capital_city")
.build();
LlmAgent agentB = LlmAgent.builder()
.name("AgentB")
.instruction("Tell me about the city stored in {capital_city}.")
.build();
SequentialAgent pipeline = SequentialAgent.builder().name("CityInfo").subAgents(agentA, agentB).build();
// AgentA runs, saves "Paris" to state.get("capital_city").
// AgentB runs, its instruction processor reads state.get("capital_city") to get "Paris".
```
#### b) LLM-Driven Delegation (Agent Transfer)
Leverages an [`LlmAgent`](https://google.github.io/adk-docs/agents/llm-agents/index.md)'s understanding to dynamically route tasks to other suitable agents within the hierarchy.
- **Mechanism:** The agent's LLM generates a specific function call: `transfer_to_agent(agent_name='target_agent_name')`.
- **Handling:** The `AutoFlow`, used by default when sub-agents are present or transfer isn't disallowed, intercepts this call. It identifies the target agent using `root_agent.find_agent()` and updates the `InvocationContext` to switch execution focus.
- **Requires:** The calling `LlmAgent` needs clear `instructions` on when to transfer, and potential target agents need distinct `description`s for the LLM to make informed decisions. Transfer scope (parent, sub-agent, siblings) can be configured on the `LlmAgent`.
- **Nature:** Dynamic, flexible routing based on LLM interpretation.
```python
# Conceptual Setup: LLM Transfer
from google.adk.agents import LlmAgent
booking_agent = LlmAgent(name="Booker", description="Handles flight and hotel bookings.")
info_agent = LlmAgent(name="Info", description="Provides general information and answers questions.")
coordinator = LlmAgent(
name="Coordinator",
model="gemini-2.0-flash",
instruction="You are an assistant. Delegate booking tasks to Booker and info requests to Info.",
description="Main coordinator.",
# AutoFlow is typically used implicitly here
sub_agents=[booking_agent, info_agent]
)
# If coordinator receives "Book a flight", its LLM should generate:
# FunctionCall(name='transfer_to_agent', args={'agent_name': 'Booker'})
# ADK framework then routes execution to booking_agent.
```
```typescript
// Conceptual Setup: LLM Transfer
import { LlmAgent } from '@google/adk';
const bookingAgent = new LlmAgent({name: 'Booker', description: 'Handles flight and hotel bookings.'});
const infoAgent = new LlmAgent({name: 'Info', description: 'Provides general information and answers questions.'});
const coordinator = new LlmAgent({
name: 'Coordinator',
model: 'gemini-2.5-flash',
instruction: 'You are an assistant. Delegate booking tasks to Booker and info requests to Info.',
description: 'Main coordinator.',
// AutoFlow is typically used implicitly here
subAgents: [bookingAgent, infoAgent]
});
// If coordinator receives "Book a flight", its LLM should generate:
// {functionCall: {name: 'transfer_to_agent', args: {agent_name: 'Booker'}}}
// ADK framework then routes execution to bookingAgent.
```
```go
import (
"google.golang.org/adk/agent"
"google.golang.org/adk/agent/llmagent"
)
// Conceptual Setup: LLM Transfer
bookingAgent, _ := llmagent.New(llmagent.Config{Name: "Booker", Description: "Handles flight and hotel bookings.", Model: m})
infoAgent, _ := llmagent.New(llmagent.Config{Name: "Info", Description: "Provides general information and answers questions.", Model: m})
coordinator, _ := llmagent.New(llmagent.Config{
Name: "Coordinator",
Model: m,
Instruction: "You are an assistant. Delegate booking tasks to Booker and info requests to Info.",
Description: "Main coordinator.",
SubAgents: []agent.Agent{bookingAgent, infoAgent},
})
// If coordinator receives "Book a flight", its LLM should generate:
// FunctionCall{Name: "transfer_to_agent", Args: map[string]any{"agent_name": "Booker"}}
// ADK framework then routes execution to bookingAgent.
```
```java
// Conceptual Setup: LLM Transfer
import com.google.adk.agents.LlmAgent;
LlmAgent bookingAgent = LlmAgent.builder()
.name("Booker")
.description("Handles flight and hotel bookings.")
.build();
LlmAgent infoAgent = LlmAgent.builder()
.name("Info")
.description("Provides general information and answers questions.")
.build();
// Define the coordinator agent
LlmAgent coordinator = LlmAgent.builder()
.name("Coordinator")
.model("gemini-2.0-flash") // Or your desired model
.instruction("You are an assistant. Delegate booking tasks to Booker and info requests to Info.")
.description("Main coordinator.")
// AutoFlow will be used by default (implicitly) because subAgents are present
// and transfer is not disallowed.
.subAgents(bookingAgent, infoAgent)
.build();
// If coordinator receives "Book a flight", its LLM should generate:
// FunctionCall.builder().name("transferToAgent").args(ImmutableMap.of("agent_name", "Booker")).build()
// ADK framework then routes execution to bookingAgent.
```
#### c) Explicit Invocation (`AgentTool`)
Allows an [`LlmAgent`](https://google.github.io/adk-docs/agents/llm-agents/index.md) to treat another `BaseAgent` instance as a callable function or [Tool](/adk-docs/tools-custom/).
- **Mechanism:** Wrap the target agent instance in `AgentTool` and include it in the parent `LlmAgent`'s `tools` list. `AgentTool` generates a corresponding function declaration for the LLM.
- **Handling:** When the parent LLM generates a function call targeting the `AgentTool`, the framework executes `AgentTool.run_async`. This method runs the target agent, captures its final response, forwards any state/artifact changes back to the parent's context, and returns the response as the tool's result.
- **Nature:** Synchronous (within the parent's flow), explicit, controlled invocation like any other tool.
- **Note:** `AgentTool` must be imported and used explicitly.
```python
# Conceptual Setup: Agent as a Tool
from google.adk.agents import LlmAgent, BaseAgent
from google.adk.events import Event
from google.adk.tools import agent_tool
from google.genai import types
# Define a target agent (could be LlmAgent or custom BaseAgent)
class ImageGeneratorAgent(BaseAgent): # Example custom agent
name: str = "ImageGen"
description: str = "Generates an image based on a prompt."
# ... internal logic ...
async def _run_async_impl(self, ctx): # Simplified run logic
prompt = ctx.session.state.get("image_prompt", "default prompt")
# ... generate image bytes ...
image_bytes = b"..."
yield Event(author=self.name, content=types.Content(parts=[types.Part.from_bytes(data=image_bytes, mime_type="image/png")]))
image_agent = ImageGeneratorAgent()
image_tool = agent_tool.AgentTool(agent=image_agent) # Wrap the agent
# Parent agent uses the AgentTool
artist_agent = LlmAgent(
name="Artist",
model="gemini-2.0-flash",
instruction="Create a prompt and use the ImageGen tool to generate the image.",
tools=[image_tool] # Include the AgentTool
)
# Artist LLM generates a prompt, then calls:
# FunctionCall(name='ImageGen', args={'image_prompt': 'a cat wearing a hat'})
# Framework calls image_tool.run_async(...), which runs ImageGeneratorAgent.
# The resulting image Part is returned to the Artist agent as the tool result.
```
```typescript
// Conceptual Setup: Agent as a Tool
import { LlmAgent, BaseAgent, AgentTool, InvocationContext, createEvent } from '@google/adk';
import type { Event } from '@google/adk';
import type { Part } from '@google/genai';
// Define a target agent (could be LlmAgent or custom BaseAgent)
class ImageGeneratorAgent extends BaseAgent { // Example custom agent
constructor() {
super({name: 'ImageGen', description: 'Generates an image based on a prompt.'});
}
// ... internal logic ...
async *runAsyncImpl(ctx: InvocationContext): AsyncGenerator { // Simplified run logic
const prompt = ctx.session.state['image_prompt'] || 'default prompt';
// ... generate image bytes ...
const imageBytes = new Uint8Array(); // placeholder
const imagePart: Part = {inlineData: {data: Buffer.from(imageBytes).toString('base64'), mimeType: 'image/png'}};
yield createEvent({author: this.name, content: {parts: [imagePart]}});
}
async *runLiveImpl(ctx: InvocationContext): AsyncGenerator {
// Not implemented for this agent.
}
}
const imageAgent = new ImageGeneratorAgent();
const imageTool = new AgentTool({agent: imageAgent}); // Wrap the agent
// Parent agent uses the AgentTool
const artistAgent = new LlmAgent({
name: 'Artist',
model: 'gemini-2.5-flash',
instruction: 'Create a prompt and use the ImageGen tool to generate the image.',
tools: [imageTool] // Include the AgentTool
});
// Artist LLM generates a prompt, then calls:
// {functionCall: {name: 'ImageGen', args: {image_prompt: 'a cat wearing a hat'}}}
// Framework calls imageTool.runAsync(...), which runs ImageGeneratorAgent.
// The resulting image Part is returned to the Artist agent as the tool result.
```
```go
import (
"fmt"
"iter"
"google.golang.org/adk/agent"
"google.golang.org/adk/agent/llmagent"
"google.golang.org/adk/model"
"google.golang.org/adk/session"
"google.golang.org/adk/tool"
"google.golang.org/adk/tool/agenttool"
"google.golang.org/genai"
)
// Conceptual Setup: Agent as a Tool
// Define a target agent (could be LlmAgent or custom BaseAgent)
imageAgent, _ := agent.New(agent.Config{
Name: "ImageGen",
Description: "Generates an image based on a prompt.",
Run: func(ctx agent.InvocationContext) iter.Seq2[*session.Event, error] {
return func(yield func(*session.Event, error) bool) {
prompt, _ := ctx.Session().State().Get("image_prompt")
fmt.Printf("Generating image for prompt: %v\n", prompt)
imageBytes := []byte("...") // Simulate image bytes
yield(&session.Event{
Author: "ImageGen",
LLMResponse: model.LLMResponse{
Content: &genai.Content{
Parts: []*genai.Part{genai.NewPartFromBytes(imageBytes, "image/png")},
},
},
}, nil)
}
},
})
// Wrap the agent
imageTool := agenttool.New(imageAgent, nil)
// Now imageTool can be used as a tool by other agents.
// Parent agent uses the AgentTool
artistAgent, _ := llmagent.New(llmagent.Config{
Name: "Artist",
Model: m,
Instruction: "Create a prompt and use the ImageGen tool to generate the image.",
Tools: []tool.Tool{imageTool}, // Include the AgentTool
})
// Artist LLM generates a prompt, then calls:
// FunctionCall{Name: "ImageGen", Args: map[string]any{"image_prompt": "a cat wearing a hat"}}
// Framework calls imageTool.Run(...), which runs ImageGeneratorAgent.
// The resulting image Part is returned to the Artist agent as the tool result.
```
```java
// Conceptual Setup: Agent as a Tool
import com.google.adk.agents.BaseAgent;
import com.google.adk.agents.LlmAgent;
import com.google.adk.tools.AgentTool;
// Example custom agent (could be LlmAgent or custom BaseAgent)
public class ImageGeneratorAgent extends BaseAgent {
public ImageGeneratorAgent(String name, String description) {
super(name, description, List.of(), null, null);
}
// ... internal logic ...
@Override
protected Flowable<Event> runAsyncImpl(InvocationContext invocationContext) { // Simplified run logic
Object imagePrompt = invocationContext.session().state().get("image_prompt");
// Generate image bytes
// ...
Event responseEvent = Event.builder()
.author(this.name())
.content(Content.fromParts(Part.fromText("...")))
.build();
return Flowable.just(responseEvent);
}
@Override
protected Flowable<Event> runLiveImpl(InvocationContext invocationContext) {
return Flowable.empty(); // Live mode not implemented for this agent
}
}
// Wrap the agent using AgentTool
ImageGeneratorAgent imageAgent = new ImageGeneratorAgent("image_agent", "generates images");
AgentTool imageTool = AgentTool.create(imageAgent);
// Parent agent uses the AgentTool
LlmAgent artistAgent = LlmAgent.builder()
.name("Artist")
.model("gemini-2.0-flash")
.instruction(
"You are an artist. Create a detailed prompt for an image and then " +
"use the 'ImageGen' tool to generate the image. " +
"The 'ImageGen' tool expects a single string argument named 'request' " +
"containing the image prompt. The tool will return a JSON string in its " +
"'result' field, containing 'image_base64', 'mime_type', and 'status'."
)
.description("An agent that can create images using a generation tool.")
.tools(imageTool) // Include the AgentTool
.build();
// Artist LLM generates a prompt, then calls:
// FunctionCall(name='ImageGen', args={'request': 'a cat wearing a hat'})
// Framework calls imageTool.runAsync(...), which runs ImageGeneratorAgent.
// The resulting image Part is returned to the Artist agent as the tool result.
```
These primitives provide the flexibility to design multi-agent interactions ranging from tightly coupled sequential workflows to dynamic, LLM-driven delegation networks.
## 2. Common Multi-Agent Patterns using ADK Primitives
By combining ADK's composition primitives, you can implement various established patterns for multi-agent collaboration.
### Coordinator/Dispatcher Pattern
- **Structure:** A central [`LlmAgent`](https://google.github.io/adk-docs/agents/llm-agents/index.md) (Coordinator) manages several specialized `sub_agents`.
- **Goal:** Route incoming requests to the appropriate specialist agent.
- **ADK Primitives Used:**
- **Hierarchy:** Coordinator has specialists listed in `sub_agents`.
- **Interaction:** Primarily uses **LLM-Driven Delegation** (requires clear `description`s on sub-agents and appropriate `instruction` on Coordinator) or **Explicit Invocation (`AgentTool`)** (Coordinator includes `AgentTool`-wrapped specialists in its `tools`).
```python
# Conceptual Code: Coordinator using LLM Transfer
from google.adk.agents import LlmAgent
billing_agent = LlmAgent(name="Billing", description="Handles billing inquiries.")
support_agent = LlmAgent(name="Support", description="Handles technical support requests.")
coordinator = LlmAgent(
name="HelpDeskCoordinator",
model="gemini-2.0-flash",
instruction="Route user requests: Use Billing agent for payment issues, Support agent for technical problems.",
description="Main help desk router.",
# allow_transfer=True is often implicit with sub_agents in AutoFlow
sub_agents=[billing_agent, support_agent]
)
# User asks "My payment failed" -> Coordinator's LLM should call transfer_to_agent(agent_name='Billing')
# User asks "I can't log in" -> Coordinator's LLM should call transfer_to_agent(agent_name='Support')
```
```typescript
// Conceptual Code: Coordinator using LLM Transfer
import { LlmAgent } from '@google/adk';
const billingAgent = new LlmAgent({name: 'Billing', description: 'Handles billing inquiries.'});
const supportAgent = new LlmAgent({name: 'Support', description: 'Handles technical support requests.'});
const coordinator = new LlmAgent({
name: 'HelpDeskCoordinator',
model: 'gemini-2.5-flash',
instruction: 'Route user requests: Use Billing agent for payment issues, Support agent for technical problems.',
description: 'Main help desk router.',
// allowTransfer=true is often implicit with subAgents in AutoFlow
subAgents: [billingAgent, supportAgent]
});
// User asks "My payment failed" -> Coordinator's LLM should call {functionCall: {name: 'transfer_to_agent', args: {agent_name: 'Billing'}}}
// User asks "I can't log in" -> Coordinator's LLM should call {functionCall: {name: 'transfer_to_agent', args: {agent_name: 'Support'}}}
```
```go
import (
"google.golang.org/adk/agent"
"google.golang.org/adk/agent/llmagent"
)
// Conceptual Code: Coordinator using LLM Transfer
billingAgent, _ := llmagent.New(llmagent.Config{Name: "Billing", Description: "Handles billing inquiries.", Model: m})
supportAgent, _ := llmagent.New(llmagent.Config{Name: "Support", Description: "Handles technical support requests.", Model: m})
coordinator, _ := llmagent.New(llmagent.Config{
Name: "HelpDeskCoordinator",
Model: m,
Instruction: "Route user requests: Use Billing agent for payment issues, Support agent for technical problems.",
Description: "Main help desk router.",
SubAgents: []agent.Agent{billingAgent, supportAgent},
})
// User asks "My payment failed" -> Coordinator's LLM should call transfer_to_agent(agent_name='Billing')
// User asks "I can't log in" -> Coordinator's LLM should call transfer_to_agent(agent_name='Support')
```
```java
// Conceptual Code: Coordinator using LLM Transfer
import com.google.adk.agents.LlmAgent;
LlmAgent billingAgent = LlmAgent.builder()
.name("Billing")
.description("Handles billing inquiries and payment issues.")
.build();
LlmAgent supportAgent = LlmAgent.builder()
.name("Support")
.description("Handles technical support requests and login problems.")
.build();
LlmAgent coordinator = LlmAgent.builder()
.name("HelpDeskCoordinator")
.model("gemini-2.0-flash")
.instruction("Route user requests: Use Billing agent for payment issues, Support agent for technical problems.")
.description("Main help desk router.")
.subAgents(billingAgent, supportAgent)
// Agent transfer is implicit with sub agents in the Autoflow, unless specified
// using .disallowTransferToParent or disallowTransferToPeers
.build();
// User asks "My payment failed" -> Coordinator's LLM should call
// transferToAgent(agentName='Billing')
// User asks "I can't log in" -> Coordinator's LLM should call
// transferToAgent(agentName='Support')
```
### Sequential Pipeline Pattern
- **Structure:** A [`SequentialAgent`](https://google.github.io/adk-docs/agents/workflow-agents/sequential-agents/index.md) contains `sub_agents` executed in a fixed order.
- **Goal:** Implement a multi-step process where the output of one step feeds into the next.
- **ADK Primitives Used:**
- **Workflow:** `SequentialAgent` defines the order.
- **Communication:** Primarily uses **Shared Session State**. Earlier agents write results (often via `output_key`), later agents read those results from `context.state`.
```python
# Conceptual Code: Sequential Data Pipeline
from google.adk.agents import SequentialAgent, LlmAgent
validator = LlmAgent(name="ValidateInput", instruction="Validate the input.", output_key="validation_status")
processor = LlmAgent(name="ProcessData", instruction="Process data if {validation_status} is 'valid'.", output_key="result")
reporter = LlmAgent(name="ReportResult", instruction="Report the result from {result}.")
data_pipeline = SequentialAgent(
name="DataPipeline",
sub_agents=[validator, processor, reporter]
)
# validator runs -> saves to state['validation_status']
# processor runs -> reads state['validation_status'], saves to state['result']
# reporter runs -> reads state['result']
```
```typescript
// Conceptual Code: Sequential Data Pipeline
import { SequentialAgent, LlmAgent } from '@google/adk';
const validator = new LlmAgent({name: 'ValidateInput', instruction: 'Validate the input.', outputKey: 'validation_status'});
const processor = new LlmAgent({name: 'ProcessData', instruction: 'Process data if {validation_status} is "valid".', outputKey: 'result'});
const reporter = new LlmAgent({name: 'ReportResult', instruction: 'Report the result from {result}.'});
const dataPipeline = new SequentialAgent({
name: 'DataPipeline',
subAgents: [validator, processor, reporter]
});
// validator runs -> saves to state['validation_status']
// processor runs -> reads state['validation_status'], saves to state['result']
// reporter runs -> reads state['result']
```
```go
import (
"google.golang.org/adk/agent"
"google.golang.org/adk/agent/llmagent"
"google.golang.org/adk/agent/workflowagents/sequentialagent"
)
// Conceptual Code: Sequential Data Pipeline
validator, _ := llmagent.New(llmagent.Config{Name: "ValidateInput", Instruction: "Validate the input.", OutputKey: "validation_status", Model: m})
processor, _ := llmagent.New(llmagent.Config{Name: "ProcessData", Instruction: "Process data if {validation_status} is 'valid'.", OutputKey: "result", Model: m})
reporter, _ := llmagent.New(llmagent.Config{Name: "ReportResult", Instruction: "Report the result from {result}.", Model: m})
dataPipeline, _ := sequentialagent.New(sequentialagent.Config{
AgentConfig: agent.Config{Name: "DataPipeline", SubAgents: []agent.Agent{validator, processor, reporter}},
})
// validator runs -> saves to state["validation_status"]
// processor runs -> reads state["validation_status"], saves to state["result"]
// reporter runs -> reads state["result"]
```
```java
// Conceptual Code: Sequential Data Pipeline
import com.google.adk.agents.SequentialAgent;
LlmAgent validator = LlmAgent.builder()
.name("ValidateInput")
.instruction("Validate the input")
.outputKey("validation_status") // Saves its main text output to session.state["validation_status"]
.build();
LlmAgent processor = LlmAgent.builder()
.name("ProcessData")
.instruction("Process data if {validation_status} is 'valid'")
.outputKey("result") // Saves its main text output to session.state["result"]
.build();
LlmAgent reporter = LlmAgent.builder()
.name("ReportResult")
.instruction("Report the result from {result}")
.build();
SequentialAgent dataPipeline = SequentialAgent.builder()
.name("DataPipeline")
.subAgents(validator, processor, reporter)
.build();
// validator runs -> saves to state['validation_status']
// processor runs -> reads state['validation_status'], saves to state['result']
// reporter runs -> reads state['result']
```
### Parallel Fan-Out/Gather Pattern
- **Structure:** A [`ParallelAgent`](https://google.github.io/adk-docs/agents/workflow-agents/parallel-agents/index.md) runs multiple `sub_agents` concurrently, often followed by a later agent (in a `SequentialAgent`) that aggregates results.
- **Goal:** Execute independent tasks simultaneously to reduce latency, then combine their outputs.
- **ADK Primitives Used:**
- **Workflow:** `ParallelAgent` for concurrent execution (Fan-Out). Often nested within a `SequentialAgent` to handle the subsequent aggregation step (Gather).
- **Communication:** Sub-agents write results to distinct keys in **Shared Session State**. The subsequent "Gather" agent reads multiple state keys.
```python
# Conceptual Code: Parallel Information Gathering
from google.adk.agents import SequentialAgent, ParallelAgent, LlmAgent
fetch_api1 = LlmAgent(name="API1Fetcher", instruction="Fetch data from API 1.", output_key="api1_data")
fetch_api2 = LlmAgent(name="API2Fetcher", instruction="Fetch data from API 2.", output_key="api2_data")
gather_concurrently = ParallelAgent(
name="ConcurrentFetch",
sub_agents=[fetch_api1, fetch_api2]
)
synthesizer = LlmAgent(
name="Synthesizer",
instruction="Combine results from {api1_data} and {api2_data}."
)
overall_workflow = SequentialAgent(
name="FetchAndSynthesize",
sub_agents=[gather_concurrently, synthesizer] # Run parallel fetch, then synthesize
)
# fetch_api1 and fetch_api2 run concurrently, saving to state.
# synthesizer runs afterwards, reading state['api1_data'] and state['api2_data'].
```
```typescript
// Conceptual Code: Parallel Information Gathering
import { SequentialAgent, ParallelAgent, LlmAgent } from '@google/adk';
const fetchApi1 = new LlmAgent({name: 'API1Fetcher', instruction: 'Fetch data from API 1.', outputKey: 'api1_data'});
const fetchApi2 = new LlmAgent({name: 'API2Fetcher', instruction: 'Fetch data from API 2.', outputKey: 'api2_data'});
const gatherConcurrently = new ParallelAgent({
name: 'ConcurrentFetch',
subAgents: [fetchApi1, fetchApi2]
});
const synthesizer = new LlmAgent({
name: 'Synthesizer',
instruction: 'Combine results from {api1_data} and {api2_data}.'
});
const overallWorkflow = new SequentialAgent({
name: 'FetchAndSynthesize',
subAgents: [gatherConcurrently, synthesizer] // Run parallel fetch, then synthesize
});
// fetchApi1 and fetchApi2 run concurrently, saving to state.
// synthesizer runs afterwards, reading state['api1_data'] and state['api2_data'].
```
```go
import (
"google.golang.org/adk/agent"
"google.golang.org/adk/agent/llmagent"
"google.golang.org/adk/agent/workflowagents/parallelagent"
"google.golang.org/adk/agent/workflowagents/sequentialagent"
)
// Conceptual Code: Parallel Information Gathering
fetchAPI1, _ := llmagent.New(llmagent.Config{Name: "API1Fetcher", Instruction: "Fetch data from API 1.", OutputKey: "api1_data", Model: m})
fetchAPI2, _ := llmagent.New(llmagent.Config{Name: "API2Fetcher", Instruction: "Fetch data from API 2.", OutputKey: "api2_data", Model: m})
gatherConcurrently, _ := parallelagent.New(parallelagent.Config{
AgentConfig: agent.Config{Name: "ConcurrentFetch", SubAgents: []agent.Agent{fetchAPI1, fetchAPI2}},
})
synthesizer, _ := llmagent.New(llmagent.Config{Name: "Synthesizer", Instruction: "Combine results from {api1_data} and {api2_data}.", Model: m})
overallWorkflow, _ := sequentialagent.New(sequentialagent.Config{
AgentConfig: agent.Config{Name: "FetchAndSynthesize", SubAgents: []agent.Agent{gatherConcurrently, synthesizer}},
})
// fetchAPI1 and fetchAPI2 run concurrently, saving to state.
// synthesizer runs afterwards, reading state["api1_data"] and state["api2_data"].
```
```java
// Conceptual Code: Parallel Information Gathering
import com.google.adk.agents.LlmAgent;
import com.google.adk.agents.ParallelAgent;
import com.google.adk.agents.SequentialAgent;
LlmAgent fetchApi1 = LlmAgent.builder()
.name("API1Fetcher")
.instruction("Fetch data from API 1.")
.outputKey("api1_data")
.build();
LlmAgent fetchApi2 = LlmAgent.builder()
.name("API2Fetcher")
.instruction("Fetch data from API 2.")
.outputKey("api2_data")
.build();
ParallelAgent gatherConcurrently = ParallelAgent.builder()
.name("ConcurrentFetch")
.subAgents(fetchApi1, fetchApi2)
.build();
LlmAgent synthesizer = LlmAgent.builder()
.name("Synthesizer")
.instruction("Combine results from {api1_data} and {api2_data}.")
.build();
SequentialAgent overallWorkflow = SequentialAgent.builder()
.name("FetchAndSynthesize") // Run parallel fetch, then synthesize
.subAgents(gatherConcurrently, synthesizer)
.build();
// fetchApi1 and fetchApi2 run concurrently, saving to state.
// synthesizer runs afterwards, reading state['api1_data'] and state['api2_data'].
```
### Hierarchical Task Decomposition
- **Structure:** A multi-level tree of agents where higher-level agents break down complex goals and delegate sub-tasks to lower-level agents.
- **Goal:** Solve complex problems by recursively breaking them down into simpler, executable steps.
- **ADK Primitives Used:**
- **Hierarchy:** Multi-level `parent_agent`/`sub_agents` structure.
- **Interaction:** Primarily **LLM-Driven Delegation** or **Explicit Invocation (`AgentTool`)** used by parent agents to assign tasks to sub-agents. Results are returned up the hierarchy (via tool responses or state).
```python
# Conceptual Code: Hierarchical Research Task
from google.adk.agents import LlmAgent
from google.adk.tools import agent_tool
# Low-level tool-like agents
web_searcher = LlmAgent(name="WebSearch", description="Performs web searches for facts.")
summarizer = LlmAgent(name="Summarizer", description="Summarizes text.")
# Mid-level agent combining tools
research_assistant = LlmAgent(
name="ResearchAssistant",
model="gemini-2.0-flash",
description="Finds and summarizes information on a topic.",
tools=[agent_tool.AgentTool(agent=web_searcher), agent_tool.AgentTool(agent=summarizer)]
)
# High-level agent delegating research
report_writer = LlmAgent(
name="ReportWriter",
model="gemini-2.0-flash",
instruction="Write a report on topic X. Use the ResearchAssistant to gather information.",
tools=[agent_tool.AgentTool(agent=research_assistant)]
# Alternatively, could use LLM Transfer if research_assistant is a sub_agent
)
# User interacts with ReportWriter.
# ReportWriter calls ResearchAssistant tool.
# ResearchAssistant calls WebSearch and Summarizer tools.
# Results flow back up.
```
```typescript
// Conceptual Code: Hierarchical Research Task
import { LlmAgent, AgentTool } from '@google/adk';
// Low-level tool-like agents
const webSearcher = new LlmAgent({name: 'WebSearch', description: 'Performs web searches for facts.'});
const summarizer = new LlmAgent({name: 'Summarizer', description: 'Summarizes text.'});
// Mid-level agent combining tools
const researchAssistant = new LlmAgent({
name: 'ResearchAssistant',
model: 'gemini-2.5-flash',
description: 'Finds and summarizes information on a topic.',
tools: [new AgentTool({agent: webSearcher}), new AgentTool({agent: summarizer})]
});
// High-level agent delegating research
const reportWriter = new LlmAgent({
name: 'ReportWriter',
model: 'gemini-2.5-flash',
instruction: 'Write a report on topic X. Use the ResearchAssistant to gather information.',
tools: [new AgentTool({agent: researchAssistant})]
// Alternatively, could use LLM Transfer if researchAssistant is a subAgent
});
// User interacts with ReportWriter.
// ReportWriter calls ResearchAssistant tool.
// ResearchAssistant calls WebSearch and Summarizer tools.
// Results flow back up.
```
```go
import (
"google.golang.org/adk/agent/llmagent"
"google.golang.org/adk/tool"
"google.golang.org/adk/tool/agenttool"
)
// Conceptual Code: Hierarchical Research Task
// Low-level tool-like agents
webSearcher, _ := llmagent.New(llmagent.Config{Name: "WebSearch", Description: "Performs web searches for facts.", Model: m})
summarizer, _ := llmagent.New(llmagent.Config{Name: "Summarizer", Description: "Summarizes text.", Model: m})
// Mid-level agent combining tools
webSearcherTool := agenttool.New(webSearcher, nil)
summarizerTool := agenttool.New(summarizer, nil)
researchAssistant, _ := llmagent.New(llmagent.Config{
Name: "ResearchAssistant",
Model: m,
Description: "Finds and summarizes information on a topic.",
Tools: []tool.Tool{webSearcherTool, summarizerTool},
})
// High-level agent delegating research
researchAssistantTool := agenttool.New(researchAssistant, nil)
reportWriter, _ := llmagent.New(llmagent.Config{
Name: "ReportWriter",
Model: m,
Instruction: "Write a report on topic X. Use the ResearchAssistant to gather information.",
Tools: []tool.Tool{researchAssistantTool},
})
// User interacts with ReportWriter.
// ReportWriter calls ResearchAssistant tool.
// ResearchAssistant calls WebSearch and Summarizer tools.
// Results flow back up.
```
```java
// Conceptual Code: Hierarchical Research Task
import com.google.adk.agents.LlmAgent;
import com.google.adk.tools.AgentTool;
// Low-level tool-like agents
LlmAgent webSearcher = LlmAgent.builder()
.name("WebSearch")
.description("Performs web searches for facts.")
.build();
LlmAgent summarizer = LlmAgent.builder()
.name("Summarizer")
.description("Summarizes text.")
.build();
// Mid-level agent combining tools
LlmAgent researchAssistant = LlmAgent.builder()
.name("ResearchAssistant")
.model("gemini-2.0-flash")
.description("Finds and summarizes information on a topic.")
.tools(AgentTool.create(webSearcher), AgentTool.create(summarizer))
.build();
// High-level agent delegating research
LlmAgent reportWriter = LlmAgent.builder()
.name("ReportWriter")
.model("gemini-2.0-flash")
.instruction("Write a report on topic X. Use the ResearchAssistant to gather information.")
.tools(AgentTool.create(researchAssistant))
// Alternatively, could use LLM Transfer if researchAssistant is a sub-agent
.build();
// User interacts with ReportWriter.
// ReportWriter calls ResearchAssistant tool.
// ResearchAssistant calls WebSearch and Summarizer tools.
// Results flow back up.
```
### Review/Critique Pattern (Generator-Critic)
- **Structure:** Typically involves two agents within a [`SequentialAgent`](https://google.github.io/adk-docs/agents/workflow-agents/sequential-agents/index.md): a Generator and a Critic/Reviewer.
- **Goal:** Improve the quality or validity of generated output by having a dedicated agent review it.
- **ADK Primitives Used:**
- **Workflow:** `SequentialAgent` ensures generation happens before review.
- **Communication:** **Shared Session State** (Generator uses `output_key` to save output; Reviewer reads that state key). The Reviewer might save its feedback to another state key for subsequent steps.
```python
# Conceptual Code: Generator-Critic
from google.adk.agents import SequentialAgent, LlmAgent
generator = LlmAgent(
name="DraftWriter",
instruction="Write a short paragraph about subject X.",
output_key="draft_text"
)
reviewer = LlmAgent(
name="FactChecker",
instruction="Review the text in {draft_text} for factual accuracy. Output 'valid' or 'invalid' with reasons.",
output_key="review_status"
)
# Optional: Further steps based on review_status
review_pipeline = SequentialAgent(
name="WriteAndReview",
sub_agents=[generator, reviewer]
)
# generator runs -> saves draft to state['draft_text']
# reviewer runs -> reads state['draft_text'], saves status to state['review_status']
```
```typescript
// Conceptual Code: Generator-Critic
import { SequentialAgent, LlmAgent } from '@google/adk';
const generator = new LlmAgent({
name: 'DraftWriter',
instruction: 'Write a short paragraph about subject X.',
outputKey: 'draft_text'
});
const reviewer = new LlmAgent({
name: 'FactChecker',
instruction: 'Review the text in {draft_text} for factual accuracy. Output "valid" or "invalid" with reasons.',
outputKey: 'review_status'
});
// Optional: Further steps based on review_status
const reviewPipeline = new SequentialAgent({
name: 'WriteAndReview',
subAgents: [generator, reviewer]
});
// generator runs -> saves draft to state['draft_text']
// reviewer runs -> reads state['draft_text'], saves status to state['review_status']
```
```go
import (
"google.golang.org/adk/agent"
"google.golang.org/adk/agent/llmagent"
"google.golang.org/adk/agent/workflowagents/sequentialagent"
)
// Conceptual Code: Generator-Critic
generator, _ := llmagent.New(llmagent.Config{
Name: "DraftWriter",
Instruction: "Write a short paragraph about subject X.",
OutputKey: "draft_text",
Model: m,
})
reviewer, _ := llmagent.New(llmagent.Config{
Name: "FactChecker",
Instruction: "Review the text in {draft_text} for factual accuracy. Output 'valid' or 'invalid' with reasons.",
OutputKey: "review_status",
Model: m,
})
reviewPipeline, _ := sequentialagent.New(sequentialagent.Config{
AgentConfig: agent.Config{Name: "WriteAndReview", SubAgents: []agent.Agent{generator, reviewer}},
})
// generator runs -> saves draft to state["draft_text"]
// reviewer runs -> reads state["draft_text"], saves status to state["review_status"]
```
```java
// Conceptual Code: Generator-Critic
import com.google.adk.agents.LlmAgent;
import com.google.adk.agents.SequentialAgent;
LlmAgent generator = LlmAgent.builder()
.name("DraftWriter")
.instruction("Write a short paragraph about subject X.")
.outputKey("draft_text")
.build();
LlmAgent reviewer = LlmAgent.builder()
.name("FactChecker")
.instruction("Review the text in {draft_text} for factual accuracy. Output 'valid' or 'invalid' with reasons.")
.outputKey("review_status")
.build();
// Optional: Further steps based on review_status
SequentialAgent reviewPipeline = SequentialAgent.builder()
.name("WriteAndReview")
.subAgents(generator, reviewer)
.build();
// generator runs -> saves draft to state['draft_text']
// reviewer runs -> reads state['draft_text'], saves status to state['review_status']
```
### Iterative Refinement Pattern
- **Structure:** Uses a [`LoopAgent`](https://google.github.io/adk-docs/agents/workflow-agents/loop-agents/index.md) containing one or more agents that work on a task over multiple iterations.
- **Goal:** Progressively improve a result (e.g., code, text, plan) stored in the session state until a quality threshold is met or a maximum number of iterations is reached.
- **ADK Primitives Used:**
- **Workflow:** `LoopAgent` manages the repetition.
- **Communication:** **Shared Session State** is essential for agents to read the previous iteration's output and save the refined version.
- **Termination:** The loop typically ends based on `max_iterations` or a dedicated checking agent setting `escalate=True` in the `Event Actions` when the result is satisfactory.
```python
# Conceptual Code: Iterative Code Refinement
from google.adk.agents import LoopAgent, LlmAgent, BaseAgent
from google.adk.events import Event, EventActions
from google.adk.agents.invocation_context import InvocationContext
from typing import AsyncGenerator
# Agent to generate/refine code based on state['current_code'] and state['requirements']
code_refiner = LlmAgent(
name="CodeRefiner",
instruction="Read state['current_code'] (if exists) and state['requirements']. Generate/refine Python code to meet requirements. Save to state['current_code'].",
output_key="current_code" # Overwrites previous code in state
)
# Agent to check if the code meets quality standards
quality_checker = LlmAgent(
name="QualityChecker",
instruction="Evaluate the code in state['current_code'] against state['requirements']. Output 'pass' or 'fail'.",
output_key="quality_status"
)
# Custom agent to check the status and escalate if 'pass'
class CheckStatusAndEscalate(BaseAgent):
async def _run_async_impl(self, ctx: InvocationContext) -> AsyncGenerator[Event, None]:
status = ctx.session.state.get("quality_status", "fail")
should_stop = (status == "pass")
yield Event(author=self.name, actions=EventActions(escalate=should_stop))
refinement_loop = LoopAgent(
name="CodeRefinementLoop",
max_iterations=5,
sub_agents=[code_refiner, quality_checker, CheckStatusAndEscalate(name="StopChecker")]
)
# Loop runs: Refiner -> Checker -> StopChecker
# State['current_code'] is updated each iteration.
# Loop stops if QualityChecker outputs 'pass' (leading to StopChecker escalating) or after 5 iterations.
```
```typescript
// Conceptual Code: Iterative Code Refinement
import { LoopAgent, LlmAgent, BaseAgent, InvocationContext } from '@google/adk';
import { createEvent, createEventActions, type Event } from '@google/adk';
// Agent to generate/refine code based on state['current_code'] and state['requirements']
const codeRefiner = new LlmAgent({
name: 'CodeRefiner',
instruction: 'Read state["current_code"] (if exists) and state["requirements"]. Generate/refine Typescript code to meet requirements. Save to state["current_code"].',
outputKey: 'current_code' // Overwrites previous code in state
});
// Agent to check if the code meets quality standards
const qualityChecker = new LlmAgent({
name: 'QualityChecker',
instruction: 'Evaluate the code in state["current_code"] against state["requirements"]. Output "pass" or "fail".',
outputKey: 'quality_status'
});
// Custom agent to check the status and escalate if 'pass'
class CheckStatusAndEscalate extends BaseAgent {
async *runAsyncImpl(ctx: InvocationContext): AsyncGenerator {
const status = ctx.session.state.quality_status;
const shouldStop = status === 'pass';
// Always yield an event; set escalate=true only when the quality check passed.
yield createEvent({
author: this.name,
actions: createEventActions({escalate: shouldStop}),
});
}
async *runLiveImpl(ctx: InvocationContext): AsyncGenerator {
// This agent doesn't have a live implementation.
yield createEvent({ author: this.name });
}
}
// Loop runs: Refiner -> Checker -> StopChecker
// State['current_code'] is updated each iteration.
// Loop stops if QualityChecker outputs 'pass' (leading to StopChecker escalating) or after 5 iterations.
const refinementLoop = new LoopAgent({
name: 'CodeRefinementLoop',
maxIterations: 5,
subAgents: [codeRefiner, qualityChecker, new CheckStatusAndEscalate({name: 'StopChecker'})]
});
```
```go
import (
"iter"
"google.golang.org/adk/agent"
"google.golang.org/adk/agent/llmagent"
"google.golang.org/adk/agent/workflowagents/loopagent"
"google.golang.org/adk/session"
)
// Conceptual Code: Iterative Code Refinement
codeRefiner, _ := llmagent.New(llmagent.Config{
Name: "CodeRefiner",
Instruction: "Read state['current_code'] (if exists) and state['requirements']. Generate/refine Go code to meet requirements. Save to state['current_code'].",
OutputKey: "current_code",
Model: m,
})
qualityChecker, _ := llmagent.New(llmagent.Config{
Name: "QualityChecker",
Instruction: "Evaluate the code in state['current_code'] against state['requirements']. Output 'pass' or 'fail'.",
OutputKey: "quality_status",
Model: m,
})
checkStatusAndEscalate, _ := agent.New(agent.Config{
Name: "StopChecker",
Run: func(ctx agent.InvocationContext) iter.Seq2[*session.Event, error] {
return func(yield func(*session.Event, error) bool) {
status, _ := ctx.Session().State().Get("quality_status")
shouldStop := status == "pass"
yield(&session.Event{Author: "StopChecker", Actions: session.EventActions{Escalate: shouldStop}}, nil)
}
},
})
refinementLoop, _ := loopagent.New(loopagent.Config{
MaxIterations: 5,
AgentConfig: agent.Config{Name: "CodeRefinementLoop", SubAgents: []agent.Agent{codeRefiner, qualityChecker, checkStatusAndEscalate}},
})
// Loop runs: Refiner -> Checker -> StopChecker
// State["current_code"] is updated each iteration.
// Loop stops if QualityChecker outputs 'pass' (leading to StopChecker escalating) or after 5 iterations.
```
```java
// Conceptual Code: Iterative Code Refinement
import com.google.adk.agents.BaseAgent;
import com.google.adk.agents.LlmAgent;
import com.google.adk.agents.LoopAgent;
import com.google.adk.events.Event;
import com.google.adk.events.EventActions;
import com.google.adk.agents.InvocationContext;
import io.reactivex.rxjava3.core.Flowable;
import java.util.List;
// Agent to generate/refine code based on state['current_code'] and state['requirements']
LlmAgent codeRefiner = LlmAgent.builder()
.name("CodeRefiner")
.instruction("Read state['current_code'] (if exists) and state['requirements']. Generate/refine Java code to meet requirements. Save to state['current_code'].")
.outputKey("current_code") // Overwrites previous code in state
.build();
// Agent to check if the code meets quality standards
LlmAgent qualityChecker = LlmAgent.builder()
.name("QualityChecker")
.instruction("Evaluate the code in state['current_code'] against state['requirements']. Output 'pass' or 'fail'.")
.outputKey("quality_status")
.build();
BaseAgent checkStatusAndEscalate = new BaseAgent(
"StopChecker","Checks quality_status and escalates if 'pass'.", List.of(), null, null) {
@Override
protected Flowable<Event> runAsyncImpl(InvocationContext invocationContext) {
String status = (String) invocationContext.session().state().getOrDefault("quality_status", "fail");
boolean shouldStop = "pass".equals(status);
EventActions actions = EventActions.builder().escalate(shouldStop).build();
Event event = Event.builder()
.author(this.name())
.actions(actions)
.build();
return Flowable.just(event);
}
};
LoopAgent refinementLoop = LoopAgent.builder()
.name("CodeRefinementLoop")
.maxIterations(5)
.subAgents(codeRefiner, qualityChecker, checkStatusAndEscalate)
.build();
// Loop runs: Refiner -> Checker -> StopChecker
// State['current_code'] is updated each iteration.
// Loop stops if QualityChecker outputs 'pass' (leading to StopChecker escalating) or after 5 iterations.
```
### Human-in-the-Loop Pattern
- **Structure:** Integrates human intervention points within an agent workflow.
- **Goal:** Allow for human oversight, approval, correction, or tasks that AI cannot perform.
- **ADK Primitives Used (Conceptual):**
- **Interaction:** Can be implemented using a custom **Tool** that pauses execution, sends a request to an external system (e.g., a UI or ticketing system), and waits for human input. The tool then returns the human's response to the agent.
- **Workflow:** Could use **LLM-Driven Delegation** (`transfer_to_agent`) targeting a conceptual "Human Agent" that triggers the external workflow, or use the custom tool within an `LlmAgent`.
- **State/Callbacks:** State can hold task details for the human; callbacks can manage the interaction flow.
- **Note:** ADK doesn't have a built-in "Human Agent" type, so this requires custom integration.
```python
# Conceptual Code: Using a Tool for Human Approval
from google.adk.agents import LlmAgent, SequentialAgent
from google.adk.tools import FunctionTool
# --- Assume external_approval_tool exists ---
# This tool would:
# 1. Take details (e.g., request_id, amount, reason).
# 2. Send these details to a human review system (e.g., via API).
# 3. Poll or wait for the human response (approved/rejected).
# 4. Return the human's decision.
# async def external_approval_tool(amount: float, reason: str) -> str: ...
approval_tool = FunctionTool(func=external_approval_tool)
# Agent that prepares the request
prepare_request = LlmAgent(
name="PrepareApproval",
instruction="Prepare the approval request details based on user input. Store amount and reason in state.",
# ... likely sets state['approval_amount'] and state['approval_reason'] ...
)
# Agent that calls the human approval tool
request_approval = LlmAgent(
name="RequestHumanApproval",
instruction="Use the external_approval_tool with amount from state['approval_amount'] and reason from state['approval_reason'].",
tools=[approval_tool],
output_key="human_decision"
)
# Agent that proceeds based on human decision
process_decision = LlmAgent(
name="ProcessDecision",
instruction="Check {human_decision}. If 'approved', proceed. If 'rejected', inform user."
)
approval_workflow = SequentialAgent(
name="HumanApprovalWorkflow",
sub_agents=[prepare_request, request_approval, process_decision]
)
```
```typescript
// Conceptual Code: Using a Tool for Human Approval
import { LlmAgent, SequentialAgent, FunctionTool } from '@google/adk';
import { z } from 'zod';
// --- Assume externalApprovalTool exists ---
// This tool would:
// 1. Take details (e.g., request_id, amount, reason).
// 2. Send these details to a human review system (e.g., via API).
// 3. Poll or wait for the human response (approved/rejected).
// 4. Return the human's decision.
async function externalApprovalTool(params: {amount: number, reason: string}): Promise<{decision: string}> {
// ... implementation to call external system
return {decision: 'approved'}; // or 'rejected'
}
const approvalTool = new FunctionTool({
name: 'external_approval_tool',
description: 'Sends a request for human approval.',
parameters: z.object({
amount: z.number(),
reason: z.string(),
}),
execute: externalApprovalTool,
});
// Agent that prepares the request
const prepareRequest = new LlmAgent({
name: 'PrepareApproval',
instruction: 'Prepare the approval request details based on user input. Store amount and reason in state.',
// ... likely sets state['approval_amount'] and state['approval_reason'] ...
});
// Agent that calls the human approval tool
const requestApproval = new LlmAgent({
name: 'RequestHumanApproval',
instruction: 'Use the external_approval_tool with amount from state["approval_amount"] and reason from state["approval_reason"].',
tools: [approvalTool],
outputKey: 'human_decision'
});
// Agent that proceeds based on human decision
const processDecision = new LlmAgent({
name: 'ProcessDecision',
instruction: 'Check {human_decision}. If "approved", proceed. If "rejected", inform user.'
});
const approvalWorkflow = new SequentialAgent({
name: 'HumanApprovalWorkflow',
subAgents: [prepareRequest, requestApproval, processDecision]
});
```
```go
import (
"google.golang.org/adk/agent"
"google.golang.org/adk/agent/llmagent"
"google.golang.org/adk/agent/workflowagents/sequentialagent"
"google.golang.org/adk/tool"
"google.golang.org/adk/tool/functiontool"
)
// Conceptual Code: Using a Tool for Human Approval
// --- Assume externalApprovalTool exists ---
// func externalApprovalTool(ctx tool.Context, args externalApprovalToolArgs) (string, error) { ... }
type externalApprovalToolArgs struct {
Amount float64 `json:"amount" jsonschema:"The amount for which approval is requested."`
Reason string `json:"reason" jsonschema:"The reason for the approval request."`
}
var externalApprovalTool func(tool.Context, externalApprovalToolArgs) (string, error)
approvalTool, _ := functiontool.New(
functiontool.Config{
Name: "external_approval_tool",
Description: "Sends a request for human approval.",
},
externalApprovalTool,
)
prepareRequest, _ := llmagent.New(llmagent.Config{
Name: "PrepareApproval",
Instruction: "Prepare the approval request details based on user input. Store amount and reason in state.",
Model: m,
})
requestApproval, _ := llmagent.New(llmagent.Config{
Name: "RequestHumanApproval",
Instruction: "Use the external_approval_tool with amount from state['approval_amount'] and reason from state['approval_reason'].",
Tools: []tool.Tool{approvalTool},
OutputKey: "human_decision",
Model: m,
})
processDecision, _ := llmagent.New(llmagent.Config{
Name: "ProcessDecision",
Instruction: "Check {human_decision}. If 'approved', proceed. If 'rejected', inform user.",
Model: m,
})
approvalWorkflow, _ := sequentialagent.New(sequentialagent.Config{
AgentConfig: agent.Config{Name: "HumanApprovalWorkflow", SubAgents: []agent.Agent{prepareRequest, requestApproval, processDecision}},
})
```
```java
// Conceptual Code: Using a Tool for Human Approval
import com.google.adk.agents.LlmAgent;
import com.google.adk.agents.SequentialAgent;
import com.google.adk.tools.FunctionTool;
// --- Assume external_approval_tool exists ---
// This tool would:
// 1. Take details (e.g., request_id, amount, reason).
// 2. Send these details to a human review system (e.g., via API).
// 3. Poll or wait for the human response (approved/rejected).
// 4. Return the human's decision.
// public boolean externalApprovalTool(float amount, String reason) { ... }
FunctionTool approvalTool = FunctionTool.create(externalApprovalTool);
// Agent that prepares the request
LlmAgent prepareRequest = LlmAgent.builder()
.name("PrepareApproval")
.instruction("Prepare the approval request details based on user input. Store amount and reason in state.")
// ... likely sets state['approval_amount'] and state['approval_reason'] ...
.build();
// Agent that calls the human approval tool
LlmAgent requestApproval = LlmAgent.builder()
.name("RequestHumanApproval")
.instruction("Use the external_approval_tool with amount from state['approval_amount'] and reason from state['approval_reason'].")
.tools(approvalTool)
.outputKey("human_decision")
.build();
// Agent that proceeds based on human decision
LlmAgent processDecision = LlmAgent.builder()
.name("ProcessDecision")
.instruction("Check {human_decision}. If 'approved', proceed. If 'rejected', inform user.")
.build();
SequentialAgent approvalWorkflow = SequentialAgent.builder()
.name("HumanApprovalWorkflow")
.subAgents(prepareRequest, requestApproval, processDecision)
.build();
```
#### Human in the Loop with Policy
A more advanced and structured way to implement Human-in-the-Loop is by using a `PolicyEngine`. This approach allows you to define policies that can trigger a confirmation step from a user before a tool is executed. The `SecurityPlugin` intercepts a tool call, consults the `PolicyEngine`, and if the policy dictates, it will automatically request user confirmation. This pattern is more robust for enforcing governance and security rules.
Here's how it works:
1. **`SecurityPlugin`**: You add this plugin to your `Runner`. It acts as an interceptor for all tool calls.
1. **`BasePolicyEngine`**: You create a custom class that implements this interface. Its `evaluate()` method contains your logic to decide if a tool call needs confirmation.
1. **`PolicyOutcome.CONFIRM`**: When your `evaluate()` method returns this outcome, the `SecurityPlugin` pauses the tool execution and generates a special `FunctionCall` using `getAskUserConfirmationFunctionCalls`.
1. **Application Handling**: Your application code receives this special function call and presents the confirmation request to the user.
1. **User Confirmation**: Once the user confirms, your application sends a `FunctionResponse` back to the agent, which allows the `SecurityPlugin` to proceed with the original tool execution.
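The five steps above amount to an intercept-pause-resume control flow. The following minimal Python sketch illustrates that flow only; the names (`run_tool_with_policy`, `ask_user`, the `PolicyOutcome` enum) are hypothetical and unrelated to any ADK API:

```python
from enum import Enum

class PolicyOutcome(Enum):
    ALLOW = "allow"
    CONFIRM = "confirm"
    DENY = "deny"

def run_tool_with_policy(tool, args, evaluate, ask_user):
    # Steps 1-2: the interceptor consults the policy engine before executing.
    outcome = evaluate(tool.__name__, args)
    if outcome is PolicyOutcome.DENY:
        raise PermissionError(f"{tool.__name__} denied by policy")
    if outcome is PolicyOutcome.CONFIRM:
        # Steps 3-5: pause, surface a confirmation request, resume on approval.
        if not ask_user(f"Allow call to {tool.__name__} with {args}?"):
            return {"status": "rejected_by_user"}
    return tool(**args)
```

In the real plugin, the pause is expressed as a special `FunctionCall` event and the resume as a `FunctionResponse`, rather than a synchronous callback.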
TypeScript Recommended Pattern
The Policy-based pattern is the recommended approach for implementing Human-in-the-Loop workflows in TypeScript. Support in other ADK languages is planned for future releases.
A conceptual example of using a `CustomPolicyEngine` to require user confirmation before executing any tool is shown below.
```typescript
const rootAgent = new LlmAgent({
name: 'weather_time_agent',
model: 'gemini-2.5-flash',
description:
'Agent to answer questions about the time and weather in a city.',
instruction:
'You are a helpful agent who can answer user questions about the time and weather in a city.',
tools: [getWeatherTool],
});
class CustomPolicyEngine implements BasePolicyEngine {
  async evaluate(_context: ToolCallPolicyContext): Promise<{outcome: PolicyOutcome; reason: string}> {
    // Require confirmation for every tool call
return Promise.resolve({
outcome: PolicyOutcome.CONFIRM,
reason: 'Needs confirmation for tool call',
});
}
}
const runner = new InMemoryRunner({
agent: rootAgent,
appName,
plugins: [new SecurityPlugin({policyEngine: new CustomPolicyEngine()})]
});
```
You can find the full code sample [here](https://github.com/google/adk-docs/blob/main/examples/typescript/snippets/agents/workflow-agents/hitl_confirmation_agent.ts).
### Combining Patterns
These patterns provide starting points for structuring your multi-agent systems. You can mix and match them as needed to create the most effective architecture for your specific application.
# AI Models for ADK agents
Supported in ADK: Python, TypeScript, Go, Java
Agent Development Kit (ADK) is designed for flexibility, allowing you to integrate various Large Language Models (LLMs) into your agents. This section details how to leverage Gemini and integrate other popular models effectively, including those hosted externally or running locally.
ADK primarily uses two mechanisms for model integration:
1. **Direct String / Registry:** For models tightly integrated with Google Cloud, such as Gemini models accessed via Google AI Studio or Vertex AI, or models hosted on Vertex AI endpoints. You access these models by providing the model name or endpoint resource string and ADK's internal registry resolves this string to the appropriate backend client.
- [Gemini models](/adk-docs/agents/models/google-gemini/)
- [Claude models](/adk-docs/agents/models/anthropic/)
- [Vertex AI hosted models](/adk-docs/agents/models/vertex/)
1. **Model connectors:** For broader compatibility, especially models outside the Google ecosystem or those requiring specific client configurations, such as models accessed via Apigee or LiteLLM. You instantiate a specific wrapper class, such as `ApigeeLlm` or `LiteLlm`, and pass this object as the `model` parameter to your `LlmAgent`.
- [Apigee models](/adk-docs/agents/models/apigee/)
- [LiteLLM models](/adk-docs/agents/models/litellm/)
- [Ollama model hosting](/adk-docs/agents/models/ollama/)
- [vLLM model hosting](/adk-docs/agents/models/vllm/)
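The difference between the two mechanisms can be sketched as a simple dispatch rule. The following is illustrative Python only, not ADK's actual implementation (all names are hypothetical): a plain string is resolved through a registry of known patterns, while a connector object is used as-is.

```python
import re

class GeminiBackend:
    """Stand-in for a registry-resolved backend client."""
    def __init__(self, model_name: str):
        self.model_name = model_name

# Hypothetical registry mapping model-string patterns to backend classes.
_REGISTRY = [(re.compile(r"gemini-.*"), GeminiBackend)]

def resolve_model(model):
    # Mechanism 2: a connector object (e.g. a LiteLlm or ApigeeLlm instance)
    # is passed through unchanged.
    if not isinstance(model, str):
        return model
    # Mechanism 1: a string consults the registry.
    for pattern, backend_cls in _REGISTRY:
        if pattern.fullmatch(model):
            return backend_cls(model)
    raise ValueError(f"No registered backend for model string: {model!r}")
```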
# Claude models for ADK agents
Supported in ADK: Java v0.2.0
You can integrate Anthropic's Claude models into your Java ADK applications by using the ADK's `Claude` wrapper class, either directly with an Anthropic API key or through a Vertex AI backend. For more information on accessing Anthropic models through Google Cloud Vertex AI services, see the [Third-Party Models on Vertex AI](/adk-docs/agents/models/vertex/#third-party-models-on-vertex-ai-eg-anthropic-claude) section. You can also use Anthropic models through the [LiteLLM](/adk-docs/agents/models/litellm/) library for Python.
## Get started
The following code example shows a basic implementation for using Claude models in your agents:
```java
public static LlmAgent createAgent() {
AnthropicClient anthropicClient = AnthropicOkHttpClient.builder()
    .apiKey(System.getenv("ANTHROPIC_API_KEY"))
.build();
Claude claudeModel = new Claude(
"claude-3-7-sonnet-latest", anthropicClient
);
return LlmAgent.builder()
.name("claude_direct_agent")
.model(claudeModel)
.instruction("You are a helpful AI assistant powered by Anthropic Claude.")
.build();
}
```
## Prerequisites
1. **Dependencies:**
- **Anthropic SDK Classes (Transitive):** The Java ADK's `com.google.adk.models.Claude` wrapper relies on classes from Anthropic's official Java SDK. These are typically included as *transitive dependencies*. For more information, see the [Anthropic Java SDK](https://github.com/anthropics/anthropic-sdk-java).
1. **Anthropic API Key:**
- Obtain an API key from Anthropic. Securely manage this key using a secret manager.
## Example implementation
Instantiate `com.google.adk.models.Claude`, providing the desired Claude model name and an `AnthropicOkHttpClient` configured with your API key. Then, pass the `Claude` instance to your `LlmAgent`, as shown in the following example:
```java
import com.anthropic.client.AnthropicClient;
import com.google.adk.agents.LlmAgent;
import com.google.adk.models.Claude;
import com.anthropic.client.okhttp.AnthropicOkHttpClient; // From Anthropic's SDK
public class DirectAnthropicAgent {
private static final String CLAUDE_MODEL_ID = "claude-3-7-sonnet-latest"; // Or your preferred Claude model
public static LlmAgent createAgent() {
// It's recommended to load sensitive keys from a secure config
AnthropicClient anthropicClient = AnthropicOkHttpClient.builder()
        .apiKey(System.getenv("ANTHROPIC_API_KEY"))
.build();
Claude claudeModel = new Claude(
CLAUDE_MODEL_ID,
anthropicClient
);
return LlmAgent.builder()
.name("claude_direct_agent")
.model(claudeModel)
.instruction("You are a helpful AI assistant powered by Anthropic Claude.")
// ... other LlmAgent configurations
.build();
}
public static void main(String[] args) {
try {
LlmAgent agent = createAgent();
System.out.println("Successfully created direct Anthropic agent: " + agent.name());
} catch (IllegalStateException e) {
System.err.println("Error creating agent: " + e.getMessage());
}
}
}
```
# Apigee AI Gateway for ADK agents
Supported in ADK: Python v1.18.0, Java v0.4.0
[Apigee](https://docs.cloud.google.com/apigee/docs/api-platform/get-started/what-apigee) provides a powerful [AI Gateway](https://cloud.google.com/solutions/apigee-ai), transforming how you manage and govern your generative AI model traffic. By exposing your AI model endpoint (like Vertex AI or the Gemini API) through an Apigee proxy, you immediately gain enterprise-grade capabilities:
- **Model Safety:** Implement security policies like Model Armor for threat protection.
- **Traffic Governance:** Enforce Rate Limiting and Token Limiting to manage costs and prevent abuse.
- **Performance:** Improve response times and efficiency using Semantic Caching and advanced model routing.
- **Monitoring & Visibility:** Get granular monitoring, analysis, and auditing of all your AI requests.
Note
The `ApigeeLlm` wrapper is currently designed for use with Vertex AI and the Gemini API (generateContent). We are continually expanding support for other models and interfaces.
## Example implementation
Integrate Apigee's governance into your agent's workflow by instantiating the `ApigeeLlm` wrapper object and passing it to an `LlmAgent` or other agent type.
```python
from google.adk.agents import LlmAgent
from google.adk.models.apigee_llm import ApigeeLlm
# Instantiate the ApigeeLlm wrapper
model = ApigeeLlm(
# Specify the Apigee route to your model. For more info, check out the ApigeeLlm documentation (https://github.com/google/adk-python/tree/main/contributing/samples/hello_world_apigeellm).
model="apigee/gemini-2.5-flash",
# The proxy URL of your deployed Apigee proxy including the base path
proxy_url=f"https://{APIGEE_PROXY_URL}",
# Pass necessary authentication/authorization headers (like an API key)
custom_headers={"foo": "bar"}
)
# Pass the configured model wrapper to your LlmAgent
agent = LlmAgent(
model=model,
name="my_governed_agent",
instruction="You are a helpful assistant powered by Gemini and governed by Apigee.",
# ... other agent parameters
)
```
```java
import com.google.adk.agents.LlmAgent;
import com.google.adk.models.ApigeeLlm;
import com.google.common.collect.ImmutableMap;
ApigeeLlm apigeeLlm =
ApigeeLlm.builder()
.modelName("apigee/gemini-2.5-flash") // Specify the Apigee route to your model. For more info, check out the ApigeeLlm documentation
.proxyUrl(APIGEE_PROXY_URL) //The proxy URL of your deployed Apigee proxy including the base path
.customHeaders(ImmutableMap.of("foo", "bar")) //Pass necessary authentication/authorization headers (like an API key)
.build();
LlmAgent agent =
LlmAgent.builder()
.model(apigeeLlm)
.name("my_governed_agent")
.description("my_governed_agent")
.instruction("You are a helpful assistant powered by Gemini and governed by Apigee.")
// tools will be added next
.build();
```
With this configuration, every API call from your agent will be routed through Apigee first, where all necessary policies (security, rate limiting, logging) are executed before the request is securely forwarded to the underlying AI model endpoint. For a full code example using the Apigee proxy, see [Hello World Apigee LLM](https://github.com/google/adk-python/tree/main/contributing/samples/hello_world_apigeellm).
# Google Gemini models for ADK agents
Supported in ADK: Python v0.1.0, TypeScript v0.2.0, Go v0.1.0, Java v0.2.0
ADK supports the Google Gemini family of generative AI models, which provide a wide range of features. ADK provides support for many Gemini features, including [Code Execution](/adk-docs/tools/gemini-api/code-execution/), [Google Search](/adk-docs/tools/gemini-api/google-search/), [Context caching](/adk-docs/context/caching/), [Computer use](/adk-docs/tools/gemini-api/computer-use/), and the [Interactions API](#interactions-api).
## Get started
The following code examples show a basic implementation for using Gemini models in your agents:
```python
from google.adk.agents import LlmAgent
# --- Example using a stable Gemini Flash model ---
agent_gemini_flash = LlmAgent(
# Use the latest stable Flash model identifier
model="gemini-2.5-flash",
name="gemini_flash_agent",
instruction="You are a fast and helpful Gemini assistant.",
# ... other agent parameters
)
```
```typescript
import {LlmAgent} from '@google/adk';
// --- Example using a stable Gemini Flash model ---
export const rootAgent = new LlmAgent({
name: 'hello_time_agent',
model: 'gemini-2.5-flash',
description: 'Gemini flash agent',
instruction: `You are a fast and helpful Gemini assistant.`,
});
```
```go
import (
"google.golang.org/adk/agent/llmagent"
"google.golang.org/adk/model/gemini"
"google.golang.org/genai"
)
// --- Example using a stable Gemini Flash model ---
modelFlash, err := gemini.NewModel(ctx, "gemini-2.5-flash", &genai.ClientConfig{})
if err != nil {
log.Fatalf("failed to create model: %v", err)
}
agentGeminiFlash, err := llmagent.New(llmagent.Config{
// Use the latest stable Flash model identifier
Model: modelFlash,
Name: "gemini_flash_agent",
Instruction: "You are a fast and helpful Gemini assistant.",
// ... other agent parameters
})
if err != nil {
log.Fatalf("failed to create agent: %v", err)
}
```
```java
// --- Example #1: using a stable Gemini Flash model with ENV variables---
LlmAgent agentGeminiFlash =
LlmAgent.builder()
// Use the latest stable Flash model identifier
.model("gemini-2.5-flash") // Set ENV variables to use this model
.name("gemini_flash_agent")
.instruction("You are a fast and helpful Gemini assistant.")
// ... other agent parameters
.build();
```
## Gemini model authentication
This section covers authenticating with Google's Gemini models, either through Google AI Studio for rapid development or Google Cloud Vertex AI for enterprise applications. This is the most direct way to use Google's flagship models within ADK.
**Integration Method:** Once you are authenticated using one of the methods below, you can pass the model's identifier string directly to the `model` parameter of `LlmAgent`.
Tip
The `google-genai` library, used internally by ADK for Gemini models, can connect through either Google AI Studio or Vertex AI.
**Model support for voice/video streaming**
In order to use voice/video streaming in ADK, you will need to use Gemini models that support the Live API. You can find the **model ID(s)** that support the Gemini Live API in the documentation:
- [Google AI Studio: Gemini Live API](https://ai.google.dev/gemini-api/docs/models#live-api)
- [Vertex AI: Gemini Live API](https://cloud.google.com/vertex-ai/generative-ai/docs/live-api)
### Google AI Studio
This is the simplest method and is recommended for getting started quickly.
- **Authentication Method:** API Key
- **Setup:**
1. **Get an API key:** Obtain your key from [Google AI Studio](https://aistudio.google.com/apikey).
1. **Set environment variables:** Create a `.env` file (Python) or `.properties` (Java) in your project's root directory and add the following lines. ADK will automatically load this file.
```shell
export GOOGLE_API_KEY="YOUR_GOOGLE_API_KEY"
export GOOGLE_GENAI_USE_VERTEXAI=FALSE
```
Alternatively, you can pass these values programmatically when you initialize the model client instead of relying on environment variables.
- **Models:** Find all available models on the [Google AI for Developers site](https://ai.google.dev/gemini-api/docs/models).
### Google Cloud Vertex AI
For scalable and production-oriented use cases, Vertex AI is the recommended platform. Gemini on Vertex AI supports enterprise-grade features, security, and compliance controls. Based on your development environment and use case, *choose one of the following methods to authenticate*.
**Pre-requisites:** A Google Cloud Project with [Vertex AI enabled](https://console.cloud.google.com/apis/enableflow;apiid=aiplatform.googleapis.com).
### **Method A: User Credentials (for Local Development)**
1. **Install the gcloud CLI:** Follow the official [installation instructions](https://cloud.google.com/sdk/docs/install).
1. **Log in using ADC:** This command opens a browser to authenticate your user account for local development.
```bash
gcloud auth application-default login
```
1. **Set environment variables:**
```shell
export GOOGLE_CLOUD_PROJECT="YOUR_PROJECT_ID"
export GOOGLE_CLOUD_LOCATION="YOUR_VERTEX_AI_LOCATION" # e.g., us-central1
```
Explicitly tell the library to use Vertex AI:
```shell
export GOOGLE_GENAI_USE_VERTEXAI=TRUE
```
1. **Models:** Find available model IDs in the [Vertex AI documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models).
### **Method B: Vertex AI Express Mode**
[Vertex AI Express Mode](https://cloud.google.com/vertex-ai/generative-ai/docs/start/express-mode/overview) offers a simplified, API-key-based setup for rapid prototyping.
1. **Sign up for Express Mode** to get your API key.
1. **Set environment variables:**
```shell
export GOOGLE_API_KEY="PASTE_YOUR_EXPRESS_MODE_API_KEY_HERE"
export GOOGLE_GENAI_USE_VERTEXAI=TRUE
```
### **Method C: Service Account (for Production & Automation)**
For deployed applications, a service account is the standard method.
1. [**Create a Service Account**](https://cloud.google.com/iam/docs/service-accounts-create#console) and grant it the `Vertex AI User` role.
1. **Provide credentials to your application:**
- **On Google Cloud:** If you are running the agent on Cloud Run, GKE, a VM, or another Google Cloud service, the environment can automatically provide the service account credentials. You don't have to create a key file.
- **Elsewhere:** Create a [service account key file](https://cloud.google.com/iam/docs/keys-create-delete#console) and point to it with an environment variable:
```bash
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/your/keyfile.json"
```
Instead of a key file, you can also authenticate the service account using Workload Identity, but that is outside the scope of this guide.
Secure Your Credentials
Service account credentials or API keys are powerful credentials. Never expose them publicly. Use a secret manager such as [Google Cloud Secret Manager](https://cloud.google.com/security/products/secret-manager) to store and access them securely in production.
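As a minimal illustration of this advice, an application can at least fail fast when a key is absent instead of hard-coding it in source; in production, the environment lookup below would be replaced by a secret manager call (the helper name is illustrative):

```python
import os

def load_api_key(var_name: str = "GOOGLE_API_KEY") -> str:
    # In production, replace this environment lookup with a call to your
    # secret manager (e.g. Google Cloud Secret Manager).
    key = os.environ.get(var_name)
    if not key:
        raise RuntimeError(
            f"{var_name} is not set; fetch it from your secret manager "
            "instead of hard-coding it in source."
        )
    return key
```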
Gemini model versions
Always check the official Gemini documentation for the latest model names, including specific preview versions if needed. Preview models might have different availability or quota limitations.
## Troubleshooting
### Error Code 429 - RESOURCE_EXHAUSTED
This error usually occurs when the number of requests you send exceeds your allocated quota for the model.
To mitigate this, you can do one of the following:
1. Request higher quota limits for the model you are trying to use.
1. Enable client-side retries. Retries allow the client to automatically retry the request after a delay, which can help if the quota issue is temporary.
There are two ways you can set retry options:
**Option 1:** Set retry options on the agent as part of `generate_content_config`.
Use this option if you pass the model as an identifier string and let ADK instantiate the model adapter for you.
```python
root_agent = Agent(
model='gemini-2.5-flash',
...
generate_content_config=types.GenerateContentConfig(
...
http_options=types.HttpOptions(
...
retry_options=types.HttpRetryOptions(initial_delay=1, attempts=2),
...
),
...
    ),
)
```
**Option 2:** Set retry options on the model adapter directly.
Use this option if you instantiate the `Gemini` model adapter yourself.
```python
from google.adk.agents.llm_agent import Agent
from google.adk.models.google_llm import Gemini
from google.genai import types

agent = Agent(
    name='gemini_retry_agent',
    model=Gemini(
        model='gemini-2.5-flash',
        retry_options=types.HttpRetryOptions(initial_delay=1, attempts=2),
    ),
)
```
## Gemini Interactions API
Supported in ADK: Python v1.21.0
The Gemini [Interactions API](https://ai.google.dev/gemini-api/docs/interactions) is an alternative to the `generateContent` inference API that provides stateful conversation capabilities, allowing you to chain interactions using a `previous_interaction_id` instead of sending the full conversation history with each request. This can be more efficient for long conversations.
You can enable the Interactions API by setting the `use_interactions_api=True` parameter in the Gemini model configuration, as shown in the following code snippet:
```python
from google.adk.agents.llm_agent import Agent
from google.adk.models.google_llm import Gemini
from google.adk.tools.google_search_tool import GoogleSearchTool
root_agent = Agent(
model=Gemini(
model="gemini-2.5-flash",
use_interactions_api=True, # Enable Interactions API
),
name="interactions_test_agent",
tools=[
GoogleSearchTool(bypass_multi_tools_limit=True), # Converted to function tool
get_current_weather, # Custom function tool
],
)
```
For a complete code sample, see the [Interactions API sample](https://github.com/google/adk-python/tree/main/contributing/samples/interactions_api).
### Known limitations
The Interactions API **does not** support mixing custom function calling tools with built-in tools, such as the [Google Search](/adk-docs/tools/built-in-tools/#google-search) tool, within the same agent. You can work around this limitation by configuring the built-in tool to operate as a custom tool using the `bypass_multi_tools_limit` parameter:
```python
# Use bypass_multi_tools_limit=True to convert google_search to a function tool
GoogleSearchTool(bypass_multi_tools_limit=True)
```
In this example, this option converts the built-in `google_search` tool to a function calling tool (via `GoogleSearchAgentTool`), which allows it to work alongside custom function tools.
# LiteLLM model connector for ADK agents
Supported in ADK: Python v0.1.0
[LiteLLM](https://docs.litellm.ai/) is a Python library that acts as a translation layer for models and model hosting services, providing a standardized, OpenAI-compatible interface to more than 100 LLMs. ADK provides integration through the LiteLLM library, allowing you to access a vast range of LLMs from providers like OpenAI, Anthropic (non-Vertex AI), Cohere, and many others. You can run open-source models locally or self-host them and integrate them using LiteLLM for operational control, cost savings, privacy, or offline use cases.
You can use the LiteLLM library to access remote or locally hosted AI models:
- **Remote model host:** Use the `LiteLlm` wrapper class and set it as the `model` parameter of `LlmAgent`.
- **Local model host:** Use the `LiteLlm` wrapper class configured to point to your local model server. For examples of local model hosting solutions, see the [Ollama](/adk-docs/agents/models/ollama/) or [vLLM](/adk-docs/agents/models/vllm/) documentation.
Windows Encoding with LiteLLM
When using ADK agents with LiteLLM on Windows, you might encounter a `UnicodeDecodeError`. This error occurs because LiteLLM may attempt to read cached files using the default Windows encoding (`cp1252`) instead of UTF-8. Prevent this error by setting the `PYTHONUTF8` environment variable to `1`. This forces Python to use UTF-8 for all file I/O.
**Example (PowerShell):**
```powershell
# Set for the current session
$env:PYTHONUTF8 = "1"
# Set persistently for the user
[System.Environment]::SetEnvironmentVariable('PYTHONUTF8', '1', [System.EnvironmentVariableTarget]::User)
```
## Setup
1. **Install LiteLLM:**
```shell
pip install litellm
```
1. **Set Provider API Keys:** Configure API keys as environment variables for the specific providers you intend to use.
- *Example for OpenAI:*
```shell
export OPENAI_API_KEY="YOUR_OPENAI_API_KEY"
```
- *Example for Anthropic (non-Vertex AI):*
```shell
export ANTHROPIC_API_KEY="YOUR_ANTHROPIC_API_KEY"
```
- *Consult the [LiteLLM Providers Documentation](https://docs.litellm.ai/docs/providers) for the correct environment variable names for other providers.*
## Example implementation
```python
from google.adk.agents import LlmAgent
from google.adk.models.lite_llm import LiteLlm
# --- Example Agent using OpenAI's GPT-4o ---
# (Requires OPENAI_API_KEY)
agent_openai = LlmAgent(
model=LiteLlm(model="openai/gpt-4o"), # LiteLLM model string format
name="openai_agent",
instruction="You are a helpful assistant powered by GPT-4o.",
# ... other agent parameters
)
# --- Example Agent using Anthropic's Claude Haiku (non-Vertex) ---
# (Requires ANTHROPIC_API_KEY)
agent_claude_direct = LlmAgent(
model=LiteLlm(model="anthropic/claude-3-haiku-20240307"),
name="claude_direct_agent",
instruction="You are an assistant powered by Claude Haiku.",
# ... other agent parameters
)
```
# Ollama model host for ADK agents
Supported in ADK: Python v0.1.0
[Ollama](https://ollama.com/) is a tool that allows you to host and run open-source models locally. ADK integrates with Ollama-hosted models through the [LiteLLM](/adk-docs/agents/models/litellm/) model connector library.
## Get started
Use the LiteLLM wrapper to create agents with Ollama-hosted models. The following code example shows a basic implementation for using Gemma open models with your agents:
```py
root_agent = Agent(
model=LiteLlm(model="ollama_chat/gemma3:latest"),
name="dice_agent",
description=(
"hello world agent that can roll a dice of 8 sides and check prime"
" numbers."
),
instruction="""
You roll dice and answer questions about the outcome of the dice rolls.
""",
tools=[
roll_die,
check_prime,
],
)
```
Warning: Use the `ollama_chat` provider
Make sure you set the provider `ollama_chat` instead of `ollama`. Using `ollama` can result in unexpected behaviors such as infinite tool call loops and ignoring previous context.
Use `OLLAMA_API_BASE` environment variable
Although you can specify the `api_base` parameter in LiteLLM for generation, as of v1.65.5, the library relies on the environment variable for other API calls. Therefore, you should set the `OLLAMA_API_BASE` environment variable for your Ollama server URL to ensure all requests are routed correctly.
```bash
export OLLAMA_API_BASE="http://localhost:11434"
adk web
```
## Model choice
If your agent relies on tools, make sure you select a model with tool support from the [Ollama website](https://ollama.com/search?c=tools). You can check tool support for a model using the following command:
```bash
ollama show mistral-small3.1
Model
architecture mistral3
parameters 24.0B
context length 131072
embedding length 5120
quantization Q4_K_M
Capabilities
completion
vision
tools
```
You should see **tools** listed under capabilities. You can also look at the template the model is using and tweak it based on your needs.
```bash
ollama show --modelfile llama3.2 > model_file_to_modify
```
For instance, the default template for the above model suggests that the model should call a function every time. This can result in an infinite loop of function calls.
```text
Given the following functions, please respond with a JSON for a function call
with its proper arguments that best answers the given prompt.
Respond in the format {"name": function name, "parameters": dictionary of
argument name and its value}. Do not use variables.
```
You can replace such a prompt with a more descriptive one to prevent infinite tool call loops, for instance:
```text
Review the user's prompt and the available functions listed below.
First, determine if calling one of these functions is the most appropriate way
to respond. A function call is likely needed if the prompt asks for a specific
action, requires external data lookup, or involves calculations handled by the
functions. If the prompt is a general question or can be answered directly, a
function call is likely NOT needed.
If you determine a function call IS required: Respond ONLY with a JSON object in
the format {"name": "function_name", "parameters": {"argument_name": "value"}}.
Ensure parameter values are concrete, not variables.
If you determine a function call IS NOT required: Respond directly to the user's
prompt in plain text, providing the answer or information requested. Do not
output any JSON.
```
Then you can create a new model with the following command:
```bash
ollama create llama3.2-modified -f model_file_to_modify
```
## Use OpenAI provider
Alternatively, you can use `openai` as the provider name. This approach requires setting the `OPENAI_API_BASE=http://localhost:11434/v1` and `OPENAI_API_KEY=anything` environment variables instead of `OLLAMA_API_BASE`. Note that the `API_BASE` value ends with *`/v1`*.
```py
root_agent = Agent(
model=LiteLlm(model="openai/mistral-small3.1"),
name="dice_agent",
description=(
"hello world agent that can roll a dice of 8 sides and check prime"
" numbers."
),
instruction="""
You roll dice and answer questions about the outcome of the dice rolls.
""",
tools=[
roll_die,
check_prime,
],
)
```
```bash
export OPENAI_API_BASE=http://localhost:11434/v1
export OPENAI_API_KEY=anything
adk web
```
### Debugging
You can see the request sent to the Ollama server by adding the following to your agent code just after the imports:
```py
import litellm
litellm._turn_on_debug()
```
Look for a line like the following:
```bash
Request Sent from LiteLLM:
curl -X POST \
http://localhost:11434/api/chat \
-d '{'model': 'mistral-small3.1', 'messages': [{'role': 'system', 'content': ...
```
# Vertex AI hosted models for ADK agents
For enterprise-grade scalability, reliability, and integration with Google Cloud's MLOps ecosystem, you can use models deployed to Vertex AI Endpoints. This includes models from Model Garden or your own fine-tuned models.
**Integration Method:** Pass the full Vertex AI Endpoint resource string (`projects/PROJECT_ID/locations/LOCATION/endpoints/ENDPOINT_ID`) directly to the `model` parameter of `LlmAgent`.
## Vertex AI Setup
Ensure your environment is configured for Vertex AI:
1. **Authentication:** Use Application Default Credentials (ADC):
```shell
gcloud auth application-default login
```
1. **Environment Variables:** Set your project and location:
```shell
export GOOGLE_CLOUD_PROJECT="YOUR_PROJECT_ID"
export GOOGLE_CLOUD_LOCATION="YOUR_VERTEX_AI_LOCATION" # e.g., us-central1
```
1. **Enable Vertex Backend:** Crucially, ensure the `google-genai` library targets Vertex AI:
```shell
export GOOGLE_GENAI_USE_VERTEXAI=TRUE
```
## Model Garden Deployments
Supported in ADK: Python v0.2.0
You can deploy various open and proprietary models from the [Vertex AI Model Garden](https://console.cloud.google.com/vertex-ai/model-garden) to an endpoint.
**Example:**
```python
from google.adk.agents import LlmAgent
from google.genai import types # For config objects
# --- Example Agent using a Llama 3 model deployed from Model Garden ---
# Replace with your actual Vertex AI Endpoint resource name
llama3_endpoint = "projects/YOUR_PROJECT_ID/locations/us-central1/endpoints/YOUR_LLAMA3_ENDPOINT_ID"
agent_llama3_vertex = LlmAgent(
model=llama3_endpoint,
name="llama3_vertex_agent",
instruction="You are a helpful assistant based on Llama 3, hosted on Vertex AI.",
generate_content_config=types.GenerateContentConfig(max_output_tokens=2048),
# ... other agent parameters
)
```
## Fine-tuned Model Endpoints
Supported in ADK: Python v0.2.0
Deploying your fine-tuned models (whether based on Gemini or other architectures supported by Vertex AI) results in an endpoint that can be used directly.
**Example:**
```python
from google.adk.agents import LlmAgent
# --- Example Agent using a fine-tuned Gemini model endpoint ---
# Replace with your fine-tuned model's endpoint resource name
finetuned_gemini_endpoint = "projects/YOUR_PROJECT_ID/locations/us-central1/endpoints/YOUR_FINETUNED_ENDPOINT_ID"
agent_finetuned_gemini = LlmAgent(
model=finetuned_gemini_endpoint,
name="finetuned_gemini_agent",
instruction="You are a specialized assistant trained on specific data.",
# ... other agent parameters
)
```
## Anthropic Claude on Vertex AI
Supported in ADK: Python v0.2.0, Java v0.1.0
Some providers, like Anthropic, make their models available directly through Vertex AI.
**Integration Method:** Uses the direct model string (e.g., `"claude-3-sonnet@20240229"`), *but requires manual registration* within ADK.
**Why Registration?** ADK's registry automatically recognizes `gemini-*` strings and standard Vertex AI endpoint strings (`projects/.../endpoints/...`) and routes them via the `google-genai` library. For other model types used directly via Vertex AI (like Claude), you must explicitly tell the ADK registry which specific wrapper class (`Claude` in this case) knows how to handle that model identifier string with the Vertex AI backend.
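The routing rule described above can be sketched as follows; this is an illustrative reconstruction, not ADK's actual registry code:

```python
import re

# Patterns described above as recognized automatically (illustrative).
_AUTO_PATTERNS = [
    re.compile(r"gemini-.*"),
    re.compile(r"projects/[^/]+/locations/[^/]+/endpoints/[^/]+"),
]

def needs_registration(model_string: str) -> bool:
    """Return True when a model string requires an explicit wrapper registration."""
    return not any(p.fullmatch(model_string) for p in _AUTO_PATTERNS)
```

Under this rule, a string like `"claude-3-sonnet@20240229"` matches neither pattern, which is why the explicit `LLMRegistry.register(Claude)` call is needed.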
**Setup:**
1. **Vertex AI Environment:** Ensure the consolidated Vertex AI setup (ADC, Env Vars, `GOOGLE_GENAI_USE_VERTEXAI=TRUE`) is complete.
1. **Install Provider Library:** Install the necessary client library configured for Vertex AI.
```shell
pip install "anthropic[vertex]"
```
1. **Register Model Class:** Add this code near the start of your application, *before* creating an agent using the Claude model string:
```python
# Required for using Claude model strings directly via Vertex AI with LlmAgent
from google.adk.models.anthropic_llm import Claude
from google.adk.models.registry import LLMRegistry
LLMRegistry.register(Claude)
```
**Example:**
```python
from google.adk.agents import LlmAgent
from google.adk.models.anthropic_llm import Claude # Import needed for registration
from google.adk.models.registry import LLMRegistry # Import needed for registration
from google.genai import types
# --- Register Claude class (do this once at startup) ---
LLMRegistry.register(Claude)
# --- Example Agent using Claude 3 Sonnet on Vertex AI ---
# Standard model name for Claude 3 Sonnet on Vertex AI
claude_model_vertexai = "claude-3-sonnet@20240229"
agent_claude_vertexai = LlmAgent(
model=claude_model_vertexai, # Pass the direct string after registration
name="claude_vertexai_agent",
instruction="You are an assistant powered by Claude 3 Sonnet on Vertex AI.",
generate_content_config=types.GenerateContentConfig(max_output_tokens=4096),
# ... other agent parameters
)
```
**Integration Method:** Directly instantiate the provider-specific model class (e.g., `com.google.adk.models.Claude`) and configure it with a Vertex AI backend.
**Why Direct Instantiation?** The Java ADK's `LlmRegistry` primarily handles Gemini models by default. For third-party models like Claude on Vertex AI, you directly provide an instance of the ADK's wrapper class (e.g., `Claude`) to the `LlmAgent`. This wrapper class is responsible for interacting with the model via its specific client library, configured for Vertex AI.
**Setup:**
1. **Vertex AI Environment:**
- Ensure your Google Cloud project and region are correctly set up.
- **Application Default Credentials (ADC):** Make sure ADC is configured correctly in your environment. This is typically done by running `gcloud auth application-default login`. The Java client libraries use these credentials to authenticate with Vertex AI. Follow the [Google Cloud Java documentation on ADC](https://cloud.google.com/java/docs/reference/google-auth-library/latest/com.google.auth.oauth2.GoogleCredentials#com_google_auth_oauth2_GoogleCredentials_getApplicationDefault__) for detailed setup.
1. **Provider Library Dependencies:**
- **Third-Party Client Libraries (Often Transitive):** The ADK core library often includes the necessary client libraries for common third-party models on Vertex AI (like Anthropic's required classes) as **transitive dependencies**. This means you might not need to explicitly add a separate dependency for the Anthropic Vertex SDK in your `pom.xml` or `build.gradle`.
1. **Instantiate and Configure the Model:** When creating your `LlmAgent`, instantiate the `Claude` class (or the equivalent for another provider) and configure its `VertexBackend`.
**Example:**
```java
import com.anthropic.client.AnthropicClient;
import com.anthropic.client.okhttp.AnthropicOkHttpClient;
import com.anthropic.vertex.backends.VertexBackend;
import com.google.adk.agents.LlmAgent;
import com.google.adk.models.Claude; // ADK's wrapper for Claude
import com.google.auth.oauth2.GoogleCredentials;
import java.io.IOException;
// ... other imports
public class ClaudeVertexAiAgent {
public static LlmAgent createAgent() throws IOException {
// Model name for Claude 3 Sonnet on Vertex AI (or other versions)
String claudeModelVertexAi = "claude-3-7-sonnet"; // Or any other Claude model
// Configure the AnthropicOkHttpClient with the VertexBackend
AnthropicClient anthropicClient = AnthropicOkHttpClient.builder()
.backend(
VertexBackend.builder()
.region("us-east5") // Specify your Vertex AI region
.project("your-gcp-project-id") // Specify your GCP Project ID
.googleCredentials(GoogleCredentials.getApplicationDefault())
.build())
.build();
// Instantiate LlmAgent with the ADK Claude wrapper
LlmAgent agentClaudeVertexAi = LlmAgent.builder()
.model(new Claude(claudeModelVertexAi, anthropicClient)) // Pass the Claude instance
.name("claude_vertexai_agent")
.instruction("You are an assistant powered by Claude 3 Sonnet on Vertex AI.")
// .generateContentConfig(...) // Optional: Add generation config if needed
// ... other agent parameters
.build();
return agentClaudeVertexAi;
}
public static void main(String[] args) {
try {
LlmAgent agent = createAgent();
System.out.println("Successfully created agent: " + agent.name());
// Here you would typically set up a Runner and Session to interact with the agent
} catch (IOException e) {
System.err.println("Failed to create agent: " + e.getMessage());
e.printStackTrace();
}
}
}
```
## Open Models on Vertex AI
Supported in ADK: Python v0.1.0
Vertex AI offers a curated selection of open-source models, such as Meta Llama, through Model-as-a-Service (MaaS). These models are accessible via managed APIs, allowing you to deploy and scale without managing the underlying infrastructure. For a full list of available options, see the [Vertex AI open models for MaaS](https://docs.cloud.google.com/vertex-ai/generative-ai/docs/maas/use-open-models#open-models) documentation.
You can use the [LiteLLM](https://docs.litellm.ai/) library to access open models such as Meta's Llama on Vertex AI MaaS.
**Integration Method:** Use the `LiteLlm` wrapper class and set it as the `model` parameter of `LlmAgent`. See the [LiteLLM model connector for ADK agents](/adk-docs/agents/models/litellm/#litellm-model-connector-for-adk-agents) documentation for details on using LiteLLM in ADK.
**Setup:**
1. **Vertex AI Environment:** Ensure the consolidated Vertex AI setup (ADC, Env Vars, `GOOGLE_GENAI_USE_VERTEXAI=TRUE`) is complete.
1. **Install LiteLLM:**
```shell
pip install litellm
```
**Example:**
```python
from google.adk.agents import LlmAgent
from google.adk.models.lite_llm import LiteLlm
# --- Example Agent using Meta's Llama 4 Scout ---
agent_llama_vertexai = LlmAgent(
model=LiteLlm(model="vertex_ai/meta/llama-4-scout-17b-16e-instruct-maas"), # LiteLLM model string format
name="llama4_agent",
instruction="You are a helpful assistant powered by Llama 4 Scout.",
# ... other agent parameters
)
```
# vLLM model host for ADK agents
Supported in ADK: Python v0.1.0
Tools such as [vLLM](https://github.com/vllm-project/vllm) allow you to host models efficiently and serve them as an OpenAI-compatible API endpoint. You can use vLLM models through the [LiteLLM](/adk-docs/agents/models/litellm/) library for Python.
## Setup
1. **Deploy Model:** Deploy your chosen model using vLLM (or a similar tool). Note the API base URL (e.g., `https://your-vllm-endpoint.run.app/v1`).
- *Important for ADK Tools:* When deploying, ensure the serving tool supports and enables OpenAI-compatible tool/function calling. For vLLM, this might involve flags like `--enable-auto-tool-choice` and potentially a specific `--tool-call-parser`, depending on the model. Refer to the vLLM documentation on Tool Use.
1. **Authentication:** Determine how your endpoint handles authentication (e.g., API key, bearer token).
## Integration Example
The following example shows how to use a vLLM endpoint with ADK agents.
```python
import subprocess
from google.adk.agents import LlmAgent
from google.adk.models.lite_llm import LiteLlm
# --- Example Agent using a model hosted on a vLLM endpoint ---
# Endpoint URL provided by your vLLM deployment
api_base_url = "https://your-vllm-endpoint.run.app/v1"
# Model name as recognized by *your* vLLM endpoint configuration
model_name_at_endpoint = "hosted_vllm/google/gemma-3-4b-it" # Example model name
# Authentication (Example: using gcloud identity token for a Cloud Run deployment)
# Adapt this based on your endpoint's security
try:
gcloud_token = subprocess.check_output(
["gcloud", "auth", "print-identity-token", "-q"]
).decode().strip()
auth_headers = {"Authorization": f"Bearer {gcloud_token}"}
except Exception as e:
print(f"Warning: Could not get gcloud token - {e}. Endpoint might be unsecured or require different auth.")
auth_headers = None # Or handle error appropriately
agent_vllm = LlmAgent(
model=LiteLlm(
model=model_name_at_endpoint,
api_base=api_base_url,
# Pass authentication headers if needed
extra_headers=auth_headers
# Alternatively, if endpoint uses an API key:
# api_key="YOUR_ENDPOINT_API_KEY"
),
name="vllm_agent",
instruction="You are a helpful assistant running on a self-hosted vLLM endpoint.",
# ... other agent parameters
)
```
# Workflow Agents
Supported in ADK: Python, TypeScript, Go, Java
This section introduces "*workflow agents*" - **specialized agents that control the execution flow of their sub-agents**.
Workflow agents are specialized components in ADK designed purely for **orchestrating the execution flow of sub-agents**. Their primary role is to manage *how* and *when* other agents run, defining the control flow of a process.
Unlike [LLM Agents](https://google.github.io/adk-docs/agents/llm-agents/index.md), which use Large Language Models for dynamic reasoning and decision-making, Workflow Agents operate based on **predefined logic**. They determine the execution sequence according to their type (e.g., sequential, parallel, loop) without consulting an LLM for the orchestration itself. This results in **deterministic and predictable execution patterns**.
ADK provides three core workflow agent types, each implementing a distinct execution pattern:
- **Sequential Agents**
______________________________________________________________________
Executes sub-agents one after another, in **sequence**.
[Learn more](https://google.github.io/adk-docs/agents/workflow-agents/sequential-agents/index.md)
- **Loop Agents**
______________________________________________________________________
**Repeatedly** executes its sub-agents until a specific termination condition is met.
[Learn more](https://google.github.io/adk-docs/agents/workflow-agents/loop-agents/index.md)
- **Parallel Agents**
______________________________________________________________________
Executes multiple sub-agents in **parallel**.
[Learn more](https://google.github.io/adk-docs/agents/workflow-agents/parallel-agents/index.md)
## Why Use Workflow Agents?
Workflow agents are essential when you need explicit control over how a series of tasks or agents are executed. They provide:
- **Predictability:** The flow of execution is guaranteed based on the agent type and configuration.
- **Reliability:** Ensures tasks run in the required order or pattern consistently.
- **Structure:** Allows you to build complex processes by composing agents within clear control structures.
While the workflow agent manages the control flow deterministically, the sub-agents it orchestrates can themselves be any type of agent, including intelligent LLM Agent instances. This allows you to combine structured process control with flexible, LLM-powered task execution.
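To make the deterministic control flow concrete, here is a plain-Python sketch of the sequential and parallel patterns. This is illustrative only, not the ADK API - in ADK you would use the `SequentialAgent` and `ParallelAgent` classes, and the stand-in functions below play the role of sub-agents:

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

SubAgent = Callable[[str], str]  # stand-in for a sub-agent: input -> output

def run_sequential(sub_agents: list[SubAgent], state: str) -> str:
    # Sequential pattern: each sub-agent receives the previous one's output.
    for agent in sub_agents:
        state = agent(state)
    return state

def run_parallel(sub_agents: list[SubAgent], state: str) -> list[str]:
    # Parallel pattern: every sub-agent starts from the same input; results
    # are collected in declared order, so the outcome stays predictable.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda agent: agent(state), sub_agents))

upper = lambda s: s.upper()
exclaim = lambda s: s + "!"
print(run_sequential([upper, exclaim], "hi"))  # HI!
print(run_parallel([upper, exclaim], "hi"))    # ['HI', 'hi!']
```

Note that both orchestration functions are ordinary deterministic code: no LLM decides the order of execution, which is exactly the property workflow agents provide.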
# Loop agents
Supported in ADK: Python v0.1.0, TypeScript v0.2.0, Go v0.1.0, Java v0.2.0
The `LoopAgent` is a workflow agent that executes its sub-agents in a loop (i.e. iteratively). It ***repeatedly runs* a sequence of agents** for a specified number of iterations or until a termination condition is met.
Use the `LoopAgent` when your workflow involves repetition or iterative refinement, such as revising code.
### Example
- Suppose you want to build an agent that generates images of food, but when you request a specific number of items (e.g., 5 bananas), it sometimes renders a different count, such as an image of 7 bananas. You have two tools: `Generate Image` and `Count Food Items`. Because you want to keep generating images until the specified number of items appears, or until a maximum number of attempts is reached, you should build your agent with a `LoopAgent`.
As with other [workflow agents](https://google.github.io/adk-docs/agents/workflow-agents/index.md), the `LoopAgent` is not powered by an LLM, and is thus deterministic in how it executes. That being said, workflow agents are only concerned with their execution, such as in a loop, and not their internal logic; the tools or sub-agents of a workflow agent may or may not utilize LLMs.
### How it Works
When the `LoopAgent`'s `Run Async` method is called, it performs the following actions:
1. **Sub-Agent Execution:** It iterates through the sub-agents list *in order*. For *each* sub-agent, it calls the agent's `Run Async` method.
1. **Termination Check:**
*Crucially*, the `LoopAgent` itself does *not* inherently decide when to stop looping. You *must* implement a termination mechanism to prevent infinite loops. Common strategies include:
- **Max Iterations**: Set a maximum number of iterations in the `LoopAgent`. **The loop will terminate after that many iterations**.
- **Escalation from sub-agent**: Design one or more sub-agents to evaluate a condition (e.g., "Is the document quality good enough?", "Has a consensus been reached?"). If the condition is met, the sub-agent can signal termination (e.g., by raising a custom event, setting a flag in a shared context, or returning a specific value).
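The loop-plus-escalation contract can be sketched in plain Python. This is illustrative only, not the ADK API - in ADK a sub-agent signals escalation through its context (for example, a tool setting `tool_context.actions.escalate = True`) rather than a return value:

```python
from typing import Callable

# A stand-in sub-agent returns (new_state, escalate); in ADK the agent
# signals escalation via its context instead of a return value.
SubAgent = Callable[[list], tuple[list, bool]]

def run_loop(sub_agents: list[SubAgent], state: list, max_iterations: int) -> list:
    # Loop pattern: run the sub-agents in order, repeating until one
    # escalates or the iteration cap is reached.
    for _ in range(max_iterations):
        for agent in sub_agents:
            state, escalate = agent(state)
            if escalate:
                return state
    return state

writer = lambda draft: (draft + ["revision"], False)          # always revises
critic = lambda draft: (draft, len(draft) >= 3)               # escalates at 3 revisions
print(len(run_loop([writer, critic], [], max_iterations=5)))  # 3
```

Both termination paths are visible here: the critic's escalation ends the loop after three revisions, and `max_iterations` would end it after five passes even if the critic never escalated.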
### Full Example: Iterative Document Improvement
Imagine a scenario where you want to iteratively improve a document:
- **Writer Agent:** An `LlmAgent` that generates or refines a draft on a topic.
- **Critic Agent:** An `LlmAgent` that critiques the draft, identifying areas for improvement.
```py
LoopAgent(sub_agents=[WriterAgent, CriticAgent], max_iterations=5)
```
In this setup, the `LoopAgent` would manage the iterative process. The `CriticAgent` could be **designed to return a "STOP" signal when the document reaches a satisfactory quality level**, preventing further iterations. Alternatively, the `max_iterations` parameter could be used to limit the process to a fixed number of cycles, or external logic could be implemented to make stop decisions. The **loop would run at most five times**, ensuring the iterative refinement doesn't continue indefinitely.
Full Code
````py
from google.adk.agents import LoopAgent, LlmAgent, SequentialAgent
from google.adk.tools.tool_context import ToolContext
from google.adk.agents.callback_context import CallbackContext
# --- Constants ---
GEMINI_MODEL = "gemini-2.5-flash"
# --- State Keys ---
STATE_CURRENT_DOC = "current_document"
STATE_CRITICISM = "criticism"
# Define the exact phrase the Critic should use to signal completion
COMPLETION_PHRASE = "No major issues found."
# --- Tool Definition ---
def exit_loop(tool_context: ToolContext):
"""Call this function ONLY when the critique indicates no further changes are needed, signaling the iterative process should end."""
print(f" [Tool Call] exit_loop triggered by {tool_context.agent_name}")
tool_context.actions.escalate = True
tool_context.actions.skip_summarization = True
# Return empty dict as tools should typically return JSON-serializable output
return {}
# --- Before Agent Callback ---
def update_initial_topic_state(callback_context: CallbackContext):
"""Ensure 'initial_topic' is set in state before pipeline starts."""
callback_context.state['initial_topic'] = callback_context.state.get('initial_topic', 'a robot developing unexpected emotions')
# --- Agent Definitions ---
# STEP 1: Initial Writer Agent (Runs ONCE at the beginning)
initial_writer_agent = LlmAgent(
name="InitialWriterAgent",
model=GEMINI_MODEL,
include_contents='none',
instruction=f"""
You are a Creative Writing Assistant tasked with starting a story.
Write a *very basic* first draft of a short story (just 1-2 simple sentences).
Keep it plain and minimal - do NOT add descriptive language yet.
Topic: {{initial_topic}}
Output *only* the story/document text. Do not add introductions or explanations.
""",
description="Writes the initial document draft based on the topic, aiming for some initial substance.",
output_key=STATE_CURRENT_DOC
)
# STEP 2a: Critic Agent (Inside the Refinement Loop)
critic_agent_in_loop = LlmAgent(
name="CriticAgent",
model=GEMINI_MODEL,
include_contents='none',
instruction=f"""
You are a Constructive Critic AI reviewing a short story draft.
**Document to Review:**
```
{{current_document}}
```
**Completion Criteria (ALL must be met):**
1. At least 4 sentences long
2. Has a clear beginning, middle, and end
3. Includes at least one descriptive detail (sensory or emotional)
**Task:**
Check the document against the criteria above.
IF any criteria is NOT met, provide specific feedback on what to add or improve.
Output *only* the critique text.
IF ALL criteria are met, respond *exactly* with: "{COMPLETION_PHRASE}"
""",
description="Reviews the current draft, providing critique if clear improvements are needed, otherwise signals completion.",
output_key=STATE_CRITICISM
)
# STEP 2b: Refiner/Exiter Agent (Inside the Refinement Loop)
refiner_agent_in_loop = LlmAgent(
name="RefinerAgent",
model=GEMINI_MODEL,
# Relies solely on state via placeholders
include_contents='none',
instruction=f"""
You are a Creative Writing Assistant refining a document based on feedback OR exiting the process.
**Current Document:**
```
{{current_document}}
```
**Critique/Suggestions:**
{{criticism}}
**Task:**
Analyze the 'Critique/Suggestions'.
IF the critique is *exactly* "{COMPLETION_PHRASE}":
You MUST call the 'exit_loop' function. Do not output any text.
ELSE (the critique contains actionable feedback):
Carefully apply the suggestions to improve the 'Current Document'. Output *only* the refined document text.
Do not add explanations. Either output the refined document OR call the exit_loop function.
""",
description="Refines the document based on critique, or calls exit_loop if critique indicates completion.",
tools=[exit_loop], # Provide the exit_loop tool
output_key=STATE_CURRENT_DOC # Overwrites state['current_document'] with the refined version
)
# STEP 2: Refinement Loop Agent
refinement_loop = LoopAgent(
name="RefinementLoop",
# Agent order is crucial: Critique first, then Refine/Exit
sub_agents=[
critic_agent_in_loop,
refiner_agent_in_loop,
],
max_iterations=5 # Limit loops
)
# STEP 3: Overall Sequential Pipeline
# For ADK tools compatibility, the root agent must be named `root_agent`
root_agent = SequentialAgent(
name="IterativeWritingPipeline",
sub_agents=[
initial_writer_agent, # Run first to create initial doc
refinement_loop # Then run the critique/refine loop
],
before_agent_callback=update_initial_topic_state, # set initial topic in state
description="Writes an initial document and then iteratively refines it with critique using an exit tool."
)
````
```typescript
// Part of agent.ts --> Follow https://google.github.io/adk-docs/get-started/quickstart/ to learn the setup
import { LoopAgent, LlmAgent, SequentialAgent, FunctionTool } from '@google/adk';
import { z } from 'zod';
// --- Constants ---
const GEMINI_MODEL = "gemini-2.5-flash";
const STATE_INITIAL_TOPIC = "initial_topic";
// --- State Keys ---
const STATE_CURRENT_DOC = "current_document";
const STATE_CRITICISM = "criticism";
// Define the exact phrase the Critic should use to signal completion
const COMPLETION_PHRASE = "No major issues found.";
// --- Tool Definition ---
const exitLoopTool = new FunctionTool({
name: 'exit_loop',
description: 'Call this function ONLY when the critique indicates no further changes are needed, signaling the iterative process should end.',
parameters: z.object({}),
execute: (input, toolContext) => {
if (toolContext) {
console.log(` [Tool Call] exit_loop triggered by ${toolContext.agentName} with input: ${input}`);
toolContext.actions.escalate = true;
}
return {};
},
});
// --- Agent Definitions ---
// STEP 1: Initial Writer Agent (Runs ONCE at the beginning)
const initialWriterAgent = new LlmAgent({
name: "InitialWriterAgent",
model: GEMINI_MODEL,
includeContents: 'none',
// MODIFIED Instruction: Ask for a slightly more developed start
instruction: `You are a Creative Writing Assistant tasked with starting a story.
Write the *first draft* of a short story (aim for 2-4 sentences).
Base the content *only* on the topic provided below. Try to introduce a specific element (like a character, a setting detail, or a starting action) to make it engaging.
Topic: {{${STATE_INITIAL_TOPIC}}}
Output *only* the story/document text. Do not add introductions or explanations.
`,
description: "Writes the initial document draft based on the topic, aiming for some initial substance.",
outputKey: STATE_CURRENT_DOC
});
// STEP 2a: Critic Agent (Inside the Refinement Loop)
const criticAgentInLoop = new LlmAgent({
name: "CriticAgent",
model: GEMINI_MODEL,
includeContents: 'none',
// MODIFIED Instruction: More nuanced completion criteria, look for clear improvement paths.
instruction: `You are a Constructive Critic AI reviewing a short document draft (typically 2-6 sentences). Your goal is balanced feedback.
**Document to Review:**
{{current_document}}
**Task:**
Review the document for clarity, engagement, and basic coherence according to the initial topic (if known).
IF you identify 1-2 *clear and actionable* ways the document could be improved to better capture the topic or enhance reader engagement (e.g., "Needs a stronger opening sentence", "Clarify the character's goal"):
Provide these specific suggestions concisely. Output *only* the critique text.
ELSE IF the document is coherent, addresses the topic adequately for its length, and has no glaring errors or obvious omissions:
Respond *exactly* with the phrase "${COMPLETION_PHRASE}" and nothing else. It doesn't need to be perfect, just functionally complete for this stage. Avoid suggesting purely subjective stylistic preferences if the core is sound.
Do not add explanations. Output only the critique OR the exact completion.
`,
description: "Reviews the current draft, providing critique if clear improvements are needed, otherwise signals completion.",
outputKey: STATE_CRITICISM
});
// STEP 2b: Refiner/Exiter Agent (Inside the Refinement Loop)
const refinerAgentInLoop = new LlmAgent({
name: "RefinerAgent",
model: GEMINI_MODEL,
// Relies solely on state via placeholders
includeContents: 'none',
instruction: `You are a Creative Writing Assistant refining a document based on feedback OR exiting the process.
**Current Document:**
{{current_document}}
**Critique/Suggestions:**
{{criticism}}
**Task:**
Analyze the 'Critique/Suggestions'.
IF the critique is *exactly* "${COMPLETION_PHRASE}":
You MUST call the 'exit_loop' function. Do not output any text.
ELSE (the critique contains actionable feedback):
Carefully apply the suggestions to improve the 'Current Document'. Output *only* the refined document text.
Do not add explanations. Either output the refined document OR call the exit_loop function.
`,
tools: [exitLoopTool],
description: "Refines the document based on critique, or calls exit_loop if critique indicates completion.",
outputKey: STATE_CURRENT_DOC
});
// STEP 2: Refinement Loop Agent
const refinementLoop = new LoopAgent({
name: "RefinementLoop",
// Agent order is crucial: Critique first, then Refine/Exit
subAgents: [
criticAgentInLoop,
refinerAgentInLoop,
],
maxIterations: 5 // Limit loops
});
// STEP 3: Overall Sequential Pipeline
// For ADK tools compatibility, the root agent must be named `root_agent`
export const rootAgent = new SequentialAgent({
name: "IterativeWritingPipeline",
subAgents: [
initialWriterAgent, // Run first to create initial doc
refinementLoop // Then run the critique/refine loop
],
description: "Writes an initial document and then iteratively refines it with critique using an exit tool."
});
```
```go
// ExitLoopArgs defines the (empty) arguments for the ExitLoop tool.
type ExitLoopArgs struct{}
// ExitLoopResults defines the output of the ExitLoop tool.
type ExitLoopResults struct{}
// ExitLoop is a tool that signals the loop to terminate by setting Escalate to true.
func ExitLoop(ctx tool.Context, input ExitLoopArgs) (ExitLoopResults, error) {
fmt.Printf("[Tool Call] exitLoop triggered by %s \n", ctx.AgentName())
ctx.Actions().Escalate = true
return ExitLoopResults{}, nil
}
func main() {
ctx := context.Background()
if err := runAgent(ctx, "Write a document about a cat"); err != nil {
log.Fatalf("Agent execution failed: %v", err)
}
}
func runAgent(ctx context.Context, prompt string) error {
model, err := gemini.NewModel(ctx, modelName, &genai.ClientConfig{})
if err != nil {
return fmt.Errorf("failed to create model: %v", err)
}
// STEP 1: Initial Writer Agent (Runs ONCE at the beginning)
initialWriterAgent, err := llmagent.New(llmagent.Config{
Name: "InitialWriterAgent",
Model: model,
Description: "Writes the initial document draft based on the topic.",
Instruction: `You are a Creative Writing Assistant tasked with starting a story.
Write the *first draft* of a short story (aim for 2-4 sentences).
Base the content *only* on the topic provided in the user's prompt.
Output *only* the story/document text. Do not add introductions or explanations.`,
OutputKey: stateDoc,
})
if err != nil {
return fmt.Errorf("failed to create initial writer agent: %v", err)
}
// STEP 2a: Critic Agent (Inside the Refinement Loop)
criticAgentInLoop, err := llmagent.New(llmagent.Config{
Name: "CriticAgent",
Model: model,
Description: "Reviews the current draft, providing critique or signaling completion.",
Instruction: fmt.Sprintf(`You are a Constructive Critic AI reviewing a short document draft.
**Document to Review:**
"""
{%s}
"""
**Task:**
Review the document.
IF you identify 1-2 *clear and actionable* ways it could be improved:
Provide these specific suggestions concisely. Output *only* the critique text.
ELSE IF the document is coherent and addresses the topic adequately:
Respond *exactly* with the phrase "%s" and nothing else.`, stateDoc, donePhrase),
OutputKey: stateCrit,
})
if err != nil {
return fmt.Errorf("failed to create critic agent: %v", err)
}
exitLoopTool, err := functiontool.New(
functiontool.Config{
Name: "exitLoop",
Description: "Call this function ONLY when the critique indicates no further changes are needed.",
},
ExitLoop,
)
if err != nil {
return fmt.Errorf("failed to create exit loop tool: %v", err)
}
// STEP 2b: Refiner/Exiter Agent (Inside the Refinement Loop)
refinerAgentInLoop, err := llmagent.New(llmagent.Config{
Name: "RefinerAgent",
Model: model,
Instruction: fmt.Sprintf(`You are a Creative Writing Assistant refining a document based on feedback OR exiting the process.
**Current Document:**
"""
{%s}
"""
**Critique/Suggestions:**
{%s}
**Task:**
Analyze the 'Critique/Suggestions'.
IF the critique is *exactly* "%s":
You MUST call the 'exitLoop' function. Do not output any text.
ELSE (the critique contains actionable feedback):
Carefully apply the suggestions to improve the 'Current Document'. Output *only* the refined document text.`, stateDoc, stateCrit, donePhrase),
Description: "Refines the document based on critique, or calls exitLoop if critique indicates completion.",
Tools: []tool.Tool{exitLoopTool},
OutputKey: stateDoc,
})
if err != nil {
return fmt.Errorf("failed to create refiner agent: %v", err)
}
// STEP 2: Refinement Loop Agent
refinementLoop, err := loopagent.New(loopagent.Config{
AgentConfig: agent.Config{
Name: "RefinementLoop",
SubAgents: []agent.Agent{criticAgentInLoop, refinerAgentInLoop},
},
MaxIterations: 5,
})
if err != nil {
return fmt.Errorf("failed to create loop agent: %v", err)
}
// STEP 3: Overall Sequential Pipeline
iterativeWriterAgent, err := sequentialagent.New(sequentialagent.Config{
AgentConfig: agent.Config{
Name: appName,
SubAgents: []agent.Agent{initialWriterAgent, refinementLoop},
},
})
if err != nil {
return fmt.Errorf("failed to create sequential agent pipeline: %v", err)
}
```
````java
import static com.google.adk.agents.LlmAgent.IncludeContents.NONE;
import com.google.adk.agents.LlmAgent;
import com.google.adk.agents.LoopAgent;
import com.google.adk.agents.SequentialAgent;
import com.google.adk.events.Event;
import com.google.adk.runner.InMemoryRunner;
import com.google.adk.sessions.Session;
import com.google.adk.tools.Annotations.Schema;
import com.google.adk.tools.FunctionTool;
import com.google.adk.tools.ToolContext;
import com.google.genai.types.Content;
import com.google.genai.types.Part;
import io.reactivex.rxjava3.core.Flowable;
import java.util.Map;
public class LoopAgentExample {
// --- Constants ---
private static final String APP_NAME = "IterativeWritingPipeline";
private static final String USER_ID = "test_user_456";
private static final String MODEL_NAME = "gemini-2.0-flash";
// --- State Keys ---
private static final String STATE_CURRENT_DOC = "current_document";
private static final String STATE_CRITICISM = "criticism";
public static void main(String[] args) {
LoopAgentExample loopAgentExample = new LoopAgentExample();
loopAgentExample.runAgent("Write a document about a cat");
}
// --- Tool Definition ---
@Schema(
description =
"Call this function ONLY when the critique indicates no further changes are needed,"
+ " signaling the iterative process should end.")
public static Map<String, Object> exitLoop(@Schema(name = "toolContext") ToolContext toolContext) {
System.out.printf("[Tool Call] exitLoop triggered by %s \n", toolContext.agentName());
toolContext.actions().setEscalate(true);
// Return empty dict as tools should typically return JSON-serializable output
return Map.of();
}
// --- Agent Definitions ---
public void runAgent(String prompt) {
// STEP 1: Initial Writer Agent (Runs ONCE at the beginning)
LlmAgent initialWriterAgent =
LlmAgent.builder()
.model(MODEL_NAME)
.name("InitialWriterAgent")
.description(
"Writes the initial document draft based on the topic, aiming for some initial"
+ " substance.")
.instruction(
"""
You are a Creative Writing Assistant tasked with starting a story.
Write the *first draft* of a short story (aim for 2-4 sentences).
Base the content *only* on the topic provided below. Try to introduce a specific element (like a character, a setting detail, or a starting action) to make it engaging.
Output *only* the story/document text. Do not add introductions or explanations.
""")
.outputKey(STATE_CURRENT_DOC)
.includeContents(NONE)
.build();
// STEP 2a: Critic Agent (Inside the Refinement Loop)
LlmAgent criticAgentInLoop =
LlmAgent.builder()
.model(MODEL_NAME)
.name("CriticAgent")
.description(
"Reviews the current draft, providing critique if clear improvements are needed,"
+ " otherwise signals completion.")
.instruction(
"""
You are a Constructive Critic AI reviewing a short document draft (typically 2-6 sentences). Your goal is balanced feedback.
**Document to Review:**
```
{{current_document}}
```
**Task:**
Review the document for clarity, engagement, and basic coherence according to the initial topic (if known).
IF you identify 1-2 *clear and actionable* ways the document could be improved to better capture the topic or enhance reader engagement (e.g., "Needs a stronger opening sentence", "Clarify the character's goal"):
Provide these specific suggestions concisely. Output *only* the critique text.
ELSE IF the document is coherent, addresses the topic adequately for its length, and has no glaring errors or obvious omissions:
Respond *exactly* with the phrase "No major issues found." and nothing else. It doesn't need to be perfect, just functionally complete for this stage. Avoid suggesting purely subjective stylistic preferences if the core is sound.
Do not add explanations. Output only the critique OR the exact completion phrase.
""")
.outputKey(STATE_CRITICISM)
.includeContents(NONE)
.build();
// STEP 2b: Refiner/Exiter Agent (Inside the Refinement Loop)
LlmAgent refinerAgentInLoop =
LlmAgent.builder()
.model(MODEL_NAME)
.name("RefinerAgent")
.description(
"Refines the document based on critique, or calls exitLoop if critique indicates"
+ " completion.")
.instruction(
"""
You are a Creative Writing Assistant refining a document based on feedback OR exiting the process.
**Current Document:**
```
{{current_document}}
```
**Critique/Suggestions:**
{{criticism}}
**Task:**
Analyze the 'Critique/Suggestions'.
IF the critique is *exactly* "No major issues found.":
You MUST call the 'exitLoop' function. Do not output any text.
ELSE (the critique contains actionable feedback):
Carefully apply the suggestions to improve the 'Current Document'. Output *only* the refined document text.
Do not add explanations. Either output the refined document OR call the exitLoop function.
""")
.outputKey(STATE_CURRENT_DOC)
.includeContents(NONE)
.tools(FunctionTool.create(LoopAgentExample.class, "exitLoop"))
.build();
// STEP 2: Refinement Loop Agent
LoopAgent refinementLoop =
LoopAgent.builder()
.name("RefinementLoop")
.description("Repeatedly refines the document with critique and then exits.")
.subAgents(criticAgentInLoop, refinerAgentInLoop)
.maxIterations(5)
.build();
// STEP 3: Overall Sequential Pipeline
SequentialAgent iterativeWriterAgent =
SequentialAgent.builder()
.name(APP_NAME)
.description(
"Writes an initial document and then iteratively refines it with critique using an"
+ " exit tool.")
.subAgents(initialWriterAgent, refinementLoop)
.build();
// Create an InMemoryRunner
InMemoryRunner runner = new InMemoryRunner(iterativeWriterAgent, APP_NAME);
// InMemoryRunner automatically creates a session service. Create a session using the service
Session session = runner.sessionService().createSession(APP_NAME, USER_ID).blockingGet();
Content userMessage = Content.fromParts(Part.fromText(prompt));
// Run the agent
Flowable<Event> eventStream = runner.runAsync(USER_ID, session.id(), userMessage);
// Stream event response
eventStream.blockingForEach(
event -> {
if (event.finalResponse()) {
System.out.println(event.stringifyContent());
}
});
}
}
````
# Parallel agents
Supported in ADK: Python v0.1.0, TypeScript v0.2.0, Go v0.1.0, Java v0.2.0
The `ParallelAgent` is a [workflow agent](https://google.github.io/adk-docs/agents/workflow-agents/index.md) that executes its sub-agents *concurrently*. This dramatically speeds up workflows where tasks can be performed independently.
Use `ParallelAgent` for scenarios that prioritize speed and involve independent, resource-intensive tasks. **When sub-agents operate without dependencies, their tasks can be performed concurrently**, significantly reducing overall processing time.
As with other [workflow agents](https://google.github.io/adk-docs/agents/workflow-agents/index.md), the `ParallelAgent` is not powered by an LLM, and is thus deterministic in how it executes. That being said, workflow agents are only concerned with their execution (i.e. executing sub-agents in parallel), and not their internal logic; the tools or sub-agents of a workflow agent may or may not utilize LLMs.
This approach is particularly beneficial for operations like multi-source data retrieval or heavy computations, where parallelization yields substantial performance gains. Importantly, this strategy assumes no inherent need for shared state or direct information exchange between the concurrently executing agents.
### How it works
When the `ParallelAgent`'s `run_async()` method is called:
1. **Concurrent Execution:** It initiates the `run_async()` method of *each* sub-agent present in the `sub_agents` list *concurrently*. This means all the agents start running at (approximately) the same time.
1. **Independent Branches:** Each sub-agent operates in its own execution branch. There is ***no* automatic sharing of conversation history or state between these branches** during execution.
1. **Result Collection:** The `ParallelAgent` manages the parallel execution and, typically, provides a way to access the results from each sub-agent after they have completed (e.g., through a list of results or events). The order of results may not be deterministic.
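As a rough mental model of these three steps (plain `asyncio`, not the ADK API — the `research` coroutine and its arguments are hypothetical stand-ins for sub-agent work), the fan-out and result collection can be sketched as:

```python
import asyncio

# Hypothetical stand-in for a sub-agent's work; ADK's ParallelAgent drives
# real sub-agents via their run_async() methods instead.
async def research(topic: str, delay: float) -> str:
    await asyncio.sleep(delay)  # simulate independent, I/O-bound work
    return f"summary of {topic}"

async def main() -> list[str]:
    # All branches start together and run independently; gather() collects
    # results only after every branch has completed. Completion order varies,
    # but gather() returns results in call order.
    return await asyncio.gather(
        research("renewable energy", 0.03),
        research("electric vehicles", 0.01),
        research("carbon capture", 0.02),
    )

results = asyncio.run(main())
print(results)
```

Note that each coroutine here receives no data from its siblings while running, mirroring the independent-branch behavior described above.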
### Independent Execution and State Management
It's *crucial* to understand that sub-agents within a `ParallelAgent` run independently. If you *need* communication or data sharing between these agents, you must implement it explicitly. Possible approaches include:
- **Shared `InvocationContext`:** You could pass a shared `InvocationContext` object to each sub-agent. This object could act as a shared data store. However, you'd need to manage concurrent access to this shared context carefully (e.g., using locks) to avoid race conditions.
- **External State Management:** Use an external database, message queue, or other mechanism to manage shared state and facilitate communication between agents.
- **Post-Processing:** Collect results from each branch, and then implement logic to coordinate data afterwards.
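The "shared context guarded by a lock" approach can be sketched in plain Python (hypothetical `branch` coroutine, not the ADK API): concurrent branches write into one shared mapping, with a lock serializing access to avoid race conditions.

```python
import asyncio

# Sketch of lock-guarded shared state between concurrent branches.
async def branch(name: str, shared: dict, lock: asyncio.Lock) -> None:
    result = f"{name}-result"  # stand-in for real agent work
    async with lock:           # serialize writes to the shared mapping
        shared[name] = result

async def main() -> dict:
    shared: dict = {}
    lock = asyncio.Lock()
    await asyncio.gather(*(branch(n, shared, lock) for n in ("a", "b", "c")))
    return shared

shared = asyncio.run(main())
print(sorted(shared))
```

With single-threaded `asyncio` the lock mainly guards multi-step read-modify-write sequences; in a multi-threaded setting, the same pattern with a threading lock becomes essential.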
### Full Example: Parallel Web Research
Imagine researching multiple topics simultaneously:
1. **Researcher Agent 1:** An `LlmAgent` that researches "renewable energy sources."
1. **Researcher Agent 2:** An `LlmAgent` that researches "electric vehicle technology."
1. **Researcher Agent 3:** An `LlmAgent` that researches "carbon capture methods."
```py
ParallelAgent(sub_agents=[ResearcherAgent1, ResearcherAgent2, ResearcherAgent3])
```
These research tasks are independent. Using a `ParallelAgent` allows them to run concurrently, potentially reducing the total research time significantly compared to running them sequentially. The results from each agent would be collected separately after they finish.
Full Code
```py
from google.adk.agents.parallel_agent import ParallelAgent
from google.adk.agents.llm_agent import LlmAgent
from google.adk.agents.sequential_agent import SequentialAgent
from google.adk.tools import google_search
# --- Constants ---
GEMINI_MODEL = "gemini-2.5-flash"
# --- 1. Define Researcher Sub-Agents (to run in parallel) ---
# Researcher 1: Renewable Energy
researcher_agent_1 = LlmAgent(
name="RenewableEnergyResearcher",
model=GEMINI_MODEL,
instruction="""
You are an AI Research Assistant specializing in energy.
Research the latest advancements in 'renewable energy sources'.
Use the Google Search tool provided.
Summarize your key findings concisely (1-2 sentences).
Output *only* the summary.
""",
description="Researches renewable energy sources.",
tools=[google_search],
# Store result in state for the merger agent
output_key="renewable_energy_result"
)
# Researcher 2: Electric Vehicles
researcher_agent_2 = LlmAgent(
name="EVResearcher",
model=GEMINI_MODEL,
instruction="""
You are an AI Research Assistant specializing in transportation.
Research the latest developments in 'electric vehicle technology'.
Use the Google Search tool provided.
Summarize your key findings concisely (1-2 sentences).
Output *only* the summary.
""",
description="Researches electric vehicle technology.",
tools=[google_search],
# Store result in state for the merger agent
output_key="ev_technology_result"
)
# Researcher 3: Carbon Capture
researcher_agent_3 = LlmAgent(
name="CarbonCaptureResearcher",
model=GEMINI_MODEL,
instruction="""
You are an AI Research Assistant specializing in climate solutions.
Research the current state of 'carbon capture methods'.
Use the Google Search tool provided.
Summarize your key findings concisely (1-2 sentences).
Output *only* the summary.
""",
description="Researches carbon capture methods.",
tools=[google_search],
# Store result in state for the merger agent
output_key="carbon_capture_result"
)
# --- 2. Create the ParallelAgent (Runs researchers concurrently) ---
# This agent orchestrates the concurrent execution of the researchers.
# It finishes once all researchers have completed and stored their results in state.
parallel_research_agent = ParallelAgent(
name="ParallelWebResearchAgent",
sub_agents=[researcher_agent_1, researcher_agent_2, researcher_agent_3],
description="Runs multiple research agents in parallel to gather information."
)
# --- 3. Define the Merger Agent (Runs *after* the parallel agents) ---
# This agent takes the results stored in the session state by the parallel agents
# and synthesizes them into a single, structured response with attributions.
merger_agent = LlmAgent(
name="SynthesisAgent",
model=GEMINI_MODEL, # Or potentially a more powerful model if needed for synthesis
instruction="""
You are an AI Assistant responsible for combining research findings into a structured report.
Your primary task is to synthesize the following research summaries, clearly attributing findings to their source areas. Structure your response using headings for each topic. Ensure the report is coherent and integrates the key points smoothly.
**Crucially: Your entire response MUST be grounded *exclusively* on the information provided in the 'Input Summaries' below. Do NOT add any external knowledge, facts, or details not present in these specific summaries.**
**Input Summaries:**
* **Renewable Energy:**
{renewable_energy_result}
* **Electric Vehicles:**
{ev_technology_result}
* **Carbon Capture:**
{carbon_capture_result}
**Output Format:**
## Summary of Recent Sustainable Technology Advancements
### Renewable Energy Findings
(Based on RenewableEnergyResearcher's findings)
[Synthesize and elaborate *only* on the renewable energy input summary provided above.]
### Electric Vehicle Findings
(Based on EVResearcher's findings)
[Synthesize and elaborate *only* on the EV input summary provided above.]
### Carbon Capture Findings
(Based on CarbonCaptureResearcher's findings)
[Synthesize and elaborate *only* on the carbon capture input summary provided above.]
### Overall Conclusion
[Provide a brief (1-2 sentence) concluding statement that connects *only* the findings presented above.]
Output *only* the structured report following this format. Do not include introductory or concluding phrases outside this structure, and strictly adhere to using only the provided input summary content.
""",
description="Combines research findings from parallel agents into a structured, cited report, strictly grounded on provided inputs.",
# No tools needed for merging
# No output_key needed here, as its direct response is the final output of the sequence
)
# --- 4. Create the SequentialAgent (Orchestrates the overall flow) ---
# This is the main agent that will be run. It first executes the ParallelAgent
# to populate the state, and then executes the MergerAgent to produce the final output.
sequential_pipeline_agent = SequentialAgent(
name="ResearchAndSynthesisPipeline",
# Run parallel research first, then merge
sub_agents=[parallel_research_agent, merger_agent],
description="Coordinates parallel research and synthesizes the results."
)
root_agent = sequential_pipeline_agent
```
```typescript
// Part of agent.ts --> Follow https://google.github.io/adk-docs/get-started/quickstart/ to learn the setup
// --- 1. Define Researcher Sub-Agents (to run in parallel) ---
const researchTools = [GOOGLE_SEARCH];
// Researcher 1: Renewable Energy
const researcherAgent1 = new LlmAgent({
name: "RenewableEnergyResearcher",
model: GEMINI_MODEL,
instruction: `You are an AI Research Assistant specializing in energy.
Research the latest advancements in 'renewable energy sources'.
Use the Google Search tool provided.
Summarize your key findings concisely (1-2 sentences).
Output *only* the summary.
`,
description: "Researches renewable energy sources.",
tools: researchTools,
// Store result in state for the merger agent
outputKey: "renewable_energy_result"
});
// Researcher 2: Electric Vehicles
const researcherAgent2 = new LlmAgent({
name: "EVResearcher",
model: GEMINI_MODEL,
instruction: `You are an AI Research Assistant specializing in transportation.
Research the latest developments in 'electric vehicle technology'.
Use the Google Search tool provided.
Summarize your key findings concisely (1-2 sentences).
Output *only* the summary.
`,
description: "Researches electric vehicle technology.",
tools: researchTools,
// Store result in state for the merger agent
outputKey: "ev_technology_result"
});
// Researcher 3: Carbon Capture
const researcherAgent3 = new LlmAgent({
name: "CarbonCaptureResearcher",
model: GEMINI_MODEL,
instruction: `You are an AI Research Assistant specializing in climate solutions.
Research the current state of 'carbon capture methods'.
Use the Google Search tool provided.
Summarize your key findings concisely (1-2 sentences).
Output *only* the summary.
`,
description: "Researches carbon capture methods.",
tools: researchTools,
// Store result in state for the merger agent
outputKey: "carbon_capture_result"
});
// --- 2. Create the ParallelAgent (Runs researchers concurrently) ---
// This agent orchestrates the concurrent execution of the researchers.
// It finishes once all researchers have completed and stored their results in state.
const parallelResearchAgent = new ParallelAgent({
name: "ParallelWebResearchAgent",
subAgents: [researcherAgent1, researcherAgent2, researcherAgent3],
description: "Runs multiple research agents in parallel to gather information."
});
// --- 3. Define the Merger Agent (Runs *after* the parallel agents) ---
// This agent takes the results stored in the session state by the parallel agents
// and synthesizes them into a single, structured response with attributions.
const mergerAgent = new LlmAgent({
name: "SynthesisAgent",
model: GEMINI_MODEL, // Or potentially a more powerful model if needed for synthesis
instruction: `You are an AI Assistant responsible for combining research findings into a structured report.
Your primary task is to synthesize the following research summaries, clearly attributing findings to their source areas. Structure your response using headings for each topic. Ensure the report is coherent and integrates the key points smoothly.
**Crucially: Your entire response MUST be grounded *exclusively* on the information provided in the 'Input Summaries' below. Do NOT add any external knowledge, facts, or details not present in these specific summaries.**
**Input Summaries:**
* **Renewable Energy:**
{renewable_energy_result}
* **Electric Vehicles:**
{ev_technology_result}
* **Carbon Capture:**
{carbon_capture_result}
**Output Format:**
## Summary of Recent Sustainable Technology Advancements
### Renewable Energy Findings
(Based on RenewableEnergyResearcher's findings)
[Synthesize and elaborate *only* on the renewable energy input summary provided above.]
### Electric Vehicle Findings
(Based on EVResearcher's findings)
[Synthesize and elaborate *only* on the EV input summary provided above.]
### Carbon Capture Findings
(Based on CarbonCaptureResearcher's findings)
[Synthesize and elaborate *only* on the carbon capture input summary provided above.]
### Overall Conclusion
[Provide a brief (1-2 sentence) concluding statement that connects *only* the findings presented above.]
Output *only* the structured report following this format. Do not include introductory or concluding phrases outside this structure, and strictly adhere to using only the provided input summary content.
`,
description: "Combines research findings from parallel agents into a structured, cited report, strictly grounded on provided inputs.",
// No tools needed for merging
// No output_key needed here, as its direct response is the final output of the sequence
});
// --- 4. Create the SequentialAgent (Orchestrates the overall flow) ---
// This is the main agent that will be run. It first executes the ParallelAgent
// to populate the state, and then executes the MergerAgent to produce the final output.
const rootAgent = new SequentialAgent({
name: "ResearchAndSynthesisPipeline",
// Run parallel research first, then merge
subAgents: [parallelResearchAgent, mergerAgent],
description: "Coordinates parallel research and synthesizes the results."
});
```
```go
model, err := gemini.NewModel(ctx, modelName, &genai.ClientConfig{})
if err != nil {
return fmt.Errorf("failed to create model: %v", err)
}
// --- 1. Define Researcher Sub-Agents (to run in parallel) ---
researcher1, err := llmagent.New(llmagent.Config{
Name: "RenewableEnergyResearcher",
Model: model,
Instruction: `You are an AI Research Assistant specializing in energy.
Research the latest advancements in 'renewable energy sources'.
Use the Google Search tool provided.
Summarize your key findings concisely (1-2 sentences).
Output *only* the summary.`,
Description: "Researches renewable energy sources.",
OutputKey: "renewable_energy_result",
})
if err != nil {
return err
}
researcher2, err := llmagent.New(llmagent.Config{
Name: "EVResearcher",
Model: model,
Instruction: `You are an AI Research Assistant specializing in transportation.
Research the latest developments in 'electric vehicle technology'.
Use the Google Search tool provided.
Summarize your key findings concisely (1-2 sentences).
Output *only* the summary.`,
Description: "Researches electric vehicle technology.",
OutputKey: "ev_technology_result",
})
if err != nil {
return err
}
researcher3, err := llmagent.New(llmagent.Config{
Name: "CarbonCaptureResearcher",
Model: model,
Instruction: `You are an AI Research Assistant specializing in climate solutions.
Research the current state of 'carbon capture methods'.
Use the Google Search tool provided.
Summarize your key findings concisely (1-2 sentences).
Output *only* the summary.`,
Description: "Researches carbon capture methods.",
OutputKey: "carbon_capture_result",
})
if err != nil {
return err
}
// --- 2. Create the ParallelAgent (Runs researchers concurrently) ---
parallelResearchAgent, err := parallelagent.New(parallelagent.Config{
AgentConfig: agent.Config{
Name: "ParallelWebResearchAgent",
Description: "Runs multiple research agents in parallel to gather information.",
SubAgents: []agent.Agent{researcher1, researcher2, researcher3},
},
})
if err != nil {
return fmt.Errorf("failed to create parallel agent: %v", err)
}
// --- 3. Define the Merger Agent (Runs *after* the parallel agents) ---
synthesisAgent, err := llmagent.New(llmagent.Config{
Name: "SynthesisAgent",
Model: model,
Instruction: `You are an AI Assistant responsible for combining research findings into a structured report.
Your primary task is to synthesize the following research summaries, clearly attributing findings to their source areas. Structure your response using headings for each topic. Ensure the report is coherent and integrates the key points smoothly.
**Crucially: Your entire response MUST be grounded *exclusively* on the information provided in the 'Input Summaries' below. Do NOT add any external knowledge, facts, or details not present in these specific summaries.**
**Input Summaries:**
* **Renewable Energy:**
{renewable_energy_result}
* **Electric Vehicles:**
{ev_technology_result}
* **Carbon Capture:**
{carbon_capture_result}
**Output Format:**
## Summary of Recent Sustainable Technology Advancements
### Renewable Energy Findings
(Based on RenewableEnergyResearcher's findings)
[Synthesize and elaborate *only* on the renewable energy input summary provided above.]
### Electric Vehicle Findings
(Based on EVResearcher's findings)
[Synthesize and elaborate *only* on the EV input summary provided above.]
### Carbon Capture Findings
(Based on CarbonCaptureResearcher's findings)
[Synthesize and elaborate *only* on the carbon capture input summary provided above.]
### Overall Conclusion
[Provide a brief (1-2 sentence) concluding statement that connects *only* the findings presented above.]
Output *only* the structured report following this format. Do not include introductory or concluding phrases outside this structure, and strictly adhere to using only the provided input summary content.`,
Description: "Combines research findings from parallel agents into a structured, cited report, strictly grounded on provided inputs.",
})
if err != nil {
return fmt.Errorf("failed to create synthesis agent: %v", err)
}
// --- 4. Create the SequentialAgent (Orchestrates the overall flow) ---
pipeline, err := sequentialagent.New(sequentialagent.Config{
AgentConfig: agent.Config{
Name: "ResearchAndSynthesisPipeline",
Description: "Coordinates parallel research and synthesizes the results.",
SubAgents: []agent.Agent{parallelResearchAgent, synthesisAgent},
},
})
if err != nil {
return fmt.Errorf("failed to create sequential agent pipeline: %v", err)
}
```
```java
import com.google.adk.agents.LlmAgent;
import com.google.adk.agents.ParallelAgent;
import com.google.adk.agents.SequentialAgent;
import com.google.adk.events.Event;
import com.google.adk.runner.InMemoryRunner;
import com.google.adk.sessions.Session;
import com.google.adk.tools.GoogleSearchTool;
import com.google.genai.types.Content;
import com.google.genai.types.Part;
import io.reactivex.rxjava3.core.Flowable;
public class ParallelResearchPipeline {
private static final String APP_NAME = "parallel_research_app";
private static final String USER_ID = "research_user_01";
private static final String GEMINI_MODEL = "gemini-2.5-flash";
// Assume google_search is an instance of the GoogleSearchTool
private static final GoogleSearchTool googleSearchTool = new GoogleSearchTool();
public static void main(String[] args) {
String query = "Summarize recent sustainable tech advancements.";
SequentialAgent sequentialPipelineAgent = initAgent();
runAgent(sequentialPipelineAgent, query);
}
public static SequentialAgent initAgent() {
// --- 1. Define Researcher Sub-Agents (to run in parallel) ---
// Researcher 1: Renewable Energy
LlmAgent researcherAgent1 = LlmAgent.builder()
.name("RenewableEnergyResearcher")
.model(GEMINI_MODEL)
.instruction("""
You are an AI Research Assistant specializing in energy.
Research the latest advancements in 'renewable energy sources'.
Use the Google Search tool provided.
Summarize your key findings concisely (1-2 sentences).
Output *only* the summary.
""")
.description("Researches renewable energy sources.")
.tools(googleSearchTool)
.outputKey("renewable_energy_result") // Store result in state
.build();
// Researcher 2: Electric Vehicles
LlmAgent researcherAgent2 = LlmAgent.builder()
.name("EVResearcher")
.model(GEMINI_MODEL)
.instruction("""
You are an AI Research Assistant specializing in transportation.
Research the latest developments in 'electric vehicle technology'.
Use the Google Search tool provided.
Summarize your key findings concisely (1-2 sentences).
Output *only* the summary.
""")
.description("Researches electric vehicle technology.")
.tools(googleSearchTool)
.outputKey("ev_technology_result") // Store result in state
.build();
// Researcher 3: Carbon Capture
LlmAgent researcherAgent3 = LlmAgent.builder()
.name("CarbonCaptureResearcher")
.model(GEMINI_MODEL)
.instruction("""
You are an AI Research Assistant specializing in climate solutions.
Research the current state of 'carbon capture methods'.
Use the Google Search tool provided.
Summarize your key findings concisely (1-2 sentences).
Output *only* the summary.
""")
.description("Researches carbon capture methods.")
.tools(googleSearchTool)
.outputKey("carbon_capture_result") // Store result in state
.build();
// --- 2. Create the ParallelAgent (Runs researchers concurrently) ---
// This agent orchestrates the concurrent execution of the researchers.
// It finishes once all researchers have completed and stored their results in state.
ParallelAgent parallelResearchAgent =
ParallelAgent.builder()
.name("ParallelWebResearchAgent")
.subAgents(researcherAgent1, researcherAgent2, researcherAgent3)
.description("Runs multiple research agents in parallel to gather information.")
.build();
// --- 3. Define the Merger Agent (Runs *after* the parallel agents) ---
// This agent takes the results stored in the session state by the parallel agents
// and synthesizes them into a single, structured response with attributions.
LlmAgent mergerAgent =
LlmAgent.builder()
.name("SynthesisAgent")
.model(GEMINI_MODEL)
.instruction(
"""
You are an AI Assistant responsible for combining research findings into a structured report.
Your primary task is to synthesize the following research summaries, clearly attributing findings to their source areas. Structure your response using headings for each topic. Ensure the report is coherent and integrates the key points smoothly.
**Crucially: Your entire response MUST be grounded *exclusively* on the information provided in the 'Input Summaries' below. Do NOT add any external knowledge, facts, or details not present in these specific summaries.**
**Input Summaries:**
* **Renewable Energy:**
{renewable_energy_result}
* **Electric Vehicles:**
{ev_technology_result}
* **Carbon Capture:**
{carbon_capture_result}
**Output Format:**
## Summary of Recent Sustainable Technology Advancements
### Renewable Energy Findings
(Based on RenewableEnergyResearcher's findings)
[Synthesize and elaborate *only* on the renewable energy input summary provided above.]
### Electric Vehicle Findings
(Based on EVResearcher's findings)
[Synthesize and elaborate *only* on the EV input summary provided above.]
### Carbon Capture Findings
(Based on CarbonCaptureResearcher's findings)
[Synthesize and elaborate *only* on the carbon capture input summary provided above.]
### Overall Conclusion
[Provide a brief (1-2 sentence) concluding statement that connects *only* the findings presented above.]
Output *only* the structured report following this format. Do not include introductory or concluding phrases outside this structure, and strictly adhere to using only the provided input summary content.
""")
.description(
"Combines research findings from parallel agents into a structured, cited report, strictly grounded on provided inputs.")
// No tools needed for merging
// No output_key needed here, as its direct response is the final output of the sequence
.build();
// --- 4. Create the SequentialAgent (Orchestrates the overall flow) ---
// This is the main agent that will be run. It first executes the ParallelAgent
// to populate the state, and then executes the MergerAgent to produce the final output.
SequentialAgent sequentialPipelineAgent =
SequentialAgent.builder()
.name("ResearchAndSynthesisPipeline")
// Run parallel research first, then merge
.subAgents(parallelResearchAgent, mergerAgent)
.description("Coordinates parallel research and synthesizes the results.")
.build();
return sequentialPipelineAgent;
}
public static void runAgent(SequentialAgent sequentialPipelineAgent, String query) {
// Create an InMemoryRunner
InMemoryRunner runner = new InMemoryRunner(sequentialPipelineAgent, APP_NAME);
// InMemoryRunner automatically creates a session service. Create a session using the service
Session session = runner.sessionService().createSession(APP_NAME, USER_ID).blockingGet();
Content userMessage = Content.fromParts(Part.fromText(query));
// Run the agent
Flowable<Event> eventStream = runner.runAsync(USER_ID, session.id(), userMessage);
// Stream event response
eventStream.blockingForEach(
event -> {
if (event.finalResponse()) {
System.out.printf("Event Author: %s \n Event Response: %s \n\n\n", event.author(), event.stringifyContent());
}
});
}
}
```
# Sequential agents
Supported in ADK: Python v0.1.0, TypeScript v0.2.0, Go v0.1.0, Java v0.2.0
The `SequentialAgent` is a [workflow agent](https://google.github.io/adk-docs/agents/workflow-agents/index.md) that executes its sub-agents in the order they are specified in the list. Use the `SequentialAgent` when you want the execution to occur in a fixed, strict order.
### Example
- You want to build an agent that can summarize any webpage, using two tools: `Get Page Contents` and `Summarize Page`. Because the agent must always call `Get Page Contents` before calling `Summarize Page` (you can't summarize from nothing!), you should build your agent using a `SequentialAgent`.
As with other [workflow agents](https://google.github.io/adk-docs/agents/workflow-agents/index.md), the `SequentialAgent` is not powered by an LLM, and is thus deterministic in how it executes. That being said, workflow agents are concerned only with their execution (i.e. in sequence), and not their internal logic; the tools or sub-agents of a workflow agent may or may not utilize LLMs.
### How it works
When the `SequentialAgent`'s `run_async()` method is called, it performs the following actions:
1. **Iteration:** It iterates through the sub-agents list in the order they were provided.
1. **Sub-Agent Execution:** For each sub-agent in the list, it calls the sub-agent's `run_async()` method.
### Full Example: Code Development Pipeline
Consider a simplified code development pipeline:
- **Code Writer Agent:** An LLM Agent that generates initial code based on a specification.
- **Code Reviewer Agent:** An LLM Agent that reviews the generated code for errors, style issues, and adherence to best practices. It receives the output of the Code Writer Agent.
- **Code Refactorer Agent:** An LLM Agent that takes the reviewed code (and the reviewer's comments) and refactors it to improve quality and address issues.
A `SequentialAgent` is perfect for this:
```py
SequentialAgent(sub_agents=[CodeWriterAgent, CodeReviewerAgent, CodeRefactorerAgent])
```
This ensures the code is written, *then* reviewed, and *finally* refactored, in a strict, dependable order. **The output from each sub-agent is passed to the next by storing them in state via [Output Key](https://google.github.io/adk-docs/agents/llm-agents/#structuring-data-input_schema-output_schema-output_key)**.
Shared Invocation Context
The `SequentialAgent` passes the same `InvocationContext` to each of its sub-agents. This means they all share the same session state, including the temporary (`temp:`) namespace, making it easy to pass data between steps within a single turn.
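The state hand-off described above can be sketched in plain Python (hypothetical step functions over a bare `dict`, not the ADK API): each step writes under its output key, and later steps read earlier keys from the same shared mapping, in a fixed order.

```python
# Minimal sketch of sequential steps passing data through shared state.
def write_code(state: dict) -> None:
    # Analogous to an agent storing its response under output_key.
    state["generated_code"] = "def add(a, b):\n    return a + b"

def review_code(state: dict) -> None:
    # Reads the previous step's output from the shared state,
    # analogous to a {generated_code} placeholder in an instruction.
    code = state["generated_code"]
    state["review_comments"] = (
        "No major issues found." if "return" in code else "Missing return."
    )

state: dict = {}
for step in (write_code, review_code):  # fixed, strict order
    step(state)
print(state["review_comments"])
```

Because every step sees the same mapping, each stage can build on all prior outputs without any explicit wiring beyond the agreed-upon keys.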
Code
````py
from google.adk.agents.sequential_agent import SequentialAgent
from google.adk.agents.llm_agent import LlmAgent
# --- Constants ---
GEMINI_MODEL = "gemini-2.5-flash"
# --- 1. Define Sub-Agents for Each Pipeline Stage ---
# Code Writer Agent
# Takes the initial specification (from user query) and writes code.
code_writer_agent = LlmAgent(
name="CodeWriterAgent",
model=GEMINI_MODEL,
instruction="""
You are a Python Code Generator.
Based *only* on the user's request, write Python code that fulfills the requirement.
Output *only* the complete Python code block, enclosed in triple backticks (```python ... ```).
Do not add any other text before or after the code block.
""",
description="Writes initial Python code based on a specification.",
output_key="generated_code"
)
# Code Reviewer Agent
# Takes the code generated by the previous agent (read from state) and provides feedback.
code_reviewer_agent = LlmAgent(
name="CodeReviewerAgent",
model=GEMINI_MODEL,
instruction="""
You are an expert Python Code Reviewer.
Your task is to provide constructive feedback on the provided code.
**Code to Review:**
```python
{generated_code}
```
**Review Criteria:**
1. **Correctness:** Does the code work as intended? Are there logic errors?
2. **Readability:** Is the code clear and easy to understand? Follows PEP 8 style guidelines?
3. **Efficiency:** Is the code reasonably efficient? Any obvious performance bottlenecks?
4. **Edge Cases:** Does the code handle potential edge cases or invalid inputs gracefully?
5. **Best Practices:** Does the code follow common Python best practices?
**Output:**
Provide your feedback as a concise, bulleted list. Focus on the most important points for improvement.
If the code is excellent and requires no changes, simply state: "No major issues found."
Output *only* the review comments or the "No major issues" statement.
""",
description="Reviews code and provides feedback.",
output_key="review_comments"
)
# Code Refactorer Agent
# Takes the original code and the review comments (read from state) and refactors the code.
code_refactorer_agent = LlmAgent(
name="CodeRefactorerAgent",
model=GEMINI_MODEL,
instruction="""
You are a Python Code Refactoring AI.
Your goal is to improve the given Python code based on the provided review comments.
**Original Code:**
```python
{generated_code}
```
**Review Comments:**
{review_comments}
**Task:**
Carefully apply the suggestions from the review comments to refactor the original code.
If the review comments state "No major issues found," return the original code unchanged.
Ensure the final code is complete, functional, and includes necessary imports and docstrings.
**Output:**
Output *only* the final, refactored Python code block, enclosed in triple backticks (```python ... ```).
Do not add any other text before or after the code block.
""",
description="Refactors code based on review comments.",
output_key="refactored_code"
)
# --- 2. Create the SequentialAgent ---
# This agent orchestrates the pipeline by running the sub_agents in order.
code_pipeline_agent = SequentialAgent(
name="CodePipelineAgent",
sub_agents=[code_writer_agent, code_reviewer_agent, code_refactorer_agent],
description="Executes a sequence of code writing, reviewing, and refactoring.",
)
root_agent = code_pipeline_agent
````
```typescript
// Part of agent.ts --> Follow https://google.github.io/adk-docs/get-started/quickstart/ to learn the setup
// --- 1. Define Sub-Agents for Each Pipeline Stage ---
// Code Writer Agent
// Takes the initial specification (from user query) and writes code.
const codeWriterAgent = new LlmAgent({
name: "CodeWriterAgent",
model: GEMINI_MODEL,
instruction: `You are a Python Code Generator.
Based *only* on the user's request, write Python code that fulfills the requirement.
Output *only* the complete Python code block, enclosed in triple backticks (\`\`\`python ... \`\`\`).
Do not add any other text before or after the code block.
`,
description: "Writes initial Python code based on a specification.",
outputKey: "generated_code" // Stores output in state['generated_code']
});
// Code Reviewer Agent
// Takes the code generated by the previous agent (read from state) and provides feedback.
const codeReviewerAgent = new LlmAgent({
name: "CodeReviewerAgent",
model: GEMINI_MODEL,
instruction: `You are an expert Python Code Reviewer.
Your task is to provide constructive feedback on the provided code.
**Code to Review:**
\`\`\`python
{generated_code}
\`\`\`
**Review Criteria:**
1. **Correctness:** Does the code work as intended? Are there logic errors?
2. **Readability:** Is the code clear and easy to understand? Does it follow PEP 8 style guidelines?
3. **Efficiency:** Is the code reasonably efficient? Any obvious performance bottlenecks?
4. **Edge Cases:** Does the code handle potential edge cases or invalid inputs gracefully?
5. **Best Practices:** Does the code follow common Python best practices?
**Output:**
Provide your feedback as a concise, bulleted list. Focus on the most important points for improvement.
If the code is excellent and requires no changes, simply state: "No major issues found."
Output *only* the review comments or the "No major issues" statement.
`,
description: "Reviews code and provides feedback.",
outputKey: "review_comments", // Stores output in state['review_comments']
});
// Code Refactorer Agent
// Takes the original code and the review comments (read from state) and refactors the code.
const codeRefactorerAgent = new LlmAgent({
name: "CodeRefactorerAgent",
model: GEMINI_MODEL,
instruction: `You are a Python Code Refactoring AI.
Your goal is to improve the given Python code based on the provided review comments.
**Original Code:**
\`\`\`python
{generated_code}
\`\`\`
**Review Comments:**
{review_comments}
**Task:**
Carefully apply the suggestions from the review comments to refactor the original code.
If the review comments state "No major issues found," return the original code unchanged.
Ensure the final code is complete, functional, and includes necessary imports and docstrings.
**Output:**
Output *only* the final, refactored Python code block, enclosed in triple backticks (\`\`\`python ... \`\`\`).
Do not add any other text before or after the code block.
`,
description: "Refactors code based on review comments.",
outputKey: "refactored_code", // Stores output in state['refactored_code']
});
// --- 2. Create the SequentialAgent ---
// This agent orchestrates the pipeline by running the sub_agents in order.
const rootAgent = new SequentialAgent({
name: "CodePipelineAgent",
subAgents: [codeWriterAgent, codeReviewerAgent, codeRefactorerAgent],
description: "Executes a sequence of code writing, reviewing, and refactoring.",
// The agents will run in the order provided: Writer -> Reviewer -> Refactorer
});
```
```go
model, err := gemini.NewModel(ctx, modelName, &genai.ClientConfig{})
if err != nil {
return fmt.Errorf("failed to create model: %v", err)
}
codeWriterAgent, err := llmagent.New(llmagent.Config{
Name: "CodeWriterAgent",
Model: model,
Description: "Writes initial Go code based on a specification.",
Instruction: `You are a Go Code Generator.
Based *only* on the user's request, write Go code that fulfills the requirement.
Output *only* the complete Go code block, enclosed in triple backticks ('''go ... ''').
Do not add any other text before or after the code block.`,
OutputKey: "generated_code",
})
if err != nil {
return fmt.Errorf("failed to create code writer agent: %v", err)
}
codeReviewerAgent, err := llmagent.New(llmagent.Config{
Name: "CodeReviewerAgent",
Model: model,
Description: "Reviews code and provides feedback.",
Instruction: `You are an expert Go Code Reviewer.
Your task is to provide constructive feedback on the provided code.
**Code to Review:**
'''go
{generated_code}
'''
**Review Criteria:**
1. **Correctness:** Does the code work as intended? Are there logic errors?
2. **Readability:** Is the code clear and easy to understand? Does it follow Go style guidelines?
3. **Idiomatic Go:** Does the code use Go's features in a natural and standard way?
4. **Edge Cases:** Does the code handle potential edge cases or invalid inputs gracefully?
5. **Best Practices:** Does the code follow common Go best practices?
**Output:**
Provide your feedback as a concise, bulleted list. Focus on the most important points for improvement.
If the code is excellent and requires no changes, simply state: "No major issues found."
Output *only* the review comments or the "No major issues" statement.`,
OutputKey: "review_comments",
})
if err != nil {
return fmt.Errorf("failed to create code reviewer agent: %v", err)
}
codeRefactorerAgent, err := llmagent.New(llmagent.Config{
Name: "CodeRefactorerAgent",
Model: model,
Description: "Refactors code based on review comments.",
Instruction: `You are a Go Code Refactoring AI.
Your goal is to improve the given Go code based on the provided review comments.
**Original Code:**
'''go
{generated_code}
'''
**Review Comments:**
{review_comments}
**Task:**
Carefully apply the suggestions from the review comments to refactor the original code.
If the review comments state "No major issues found," return the original code unchanged.
Ensure the final code is complete, functional, and includes necessary imports.
**Output:**
Output *only* the final, refactored Go code block, enclosed in triple backticks ('''go ... ''').
Do not add any other text before or after the code block.`,
OutputKey: "refactored_code",
})
if err != nil {
return fmt.Errorf("failed to create code refactorer agent: %v", err)
}
codePipelineAgent, err := sequentialagent.New(sequentialagent.Config{
AgentConfig: agent.Config{
Name: appName,
Description: "Executes a sequence of code writing, reviewing, and refactoring.",
SubAgents: []agent.Agent{
codeWriterAgent,
codeReviewerAgent,
codeRefactorerAgent,
},
},
})
if err != nil {
return fmt.Errorf("failed to create sequential agent: %v", err)
}
```
````java
import com.google.adk.agents.LlmAgent;
import com.google.adk.agents.SequentialAgent;
import com.google.adk.events.Event;
import com.google.adk.runner.InMemoryRunner;
import com.google.adk.sessions.Session;
import com.google.genai.types.Content;
import com.google.genai.types.Part;
import io.reactivex.rxjava3.core.Flowable;
public class SequentialAgentExample {
private static final String APP_NAME = "CodePipelineAgent";
private static final String USER_ID = "test_user_456";
private static final String MODEL_NAME = "gemini-2.0-flash";
public static void main(String[] args) {
SequentialAgentExample sequentialAgentExample = new SequentialAgentExample();
sequentialAgentExample.runAgent(
"Write a Java function to calculate the factorial of a number.");
}
public void runAgent(String prompt) {
LlmAgent codeWriterAgent =
LlmAgent.builder()
.model(MODEL_NAME)
.name("CodeWriterAgent")
.description("Writes initial Java code based on a specification.")
.instruction(
"""
You are a Java Code Generator.
Based *only* on the user's request, write Java code that fulfills the requirement.
Output *only* the complete Java code block, enclosed in triple backticks (```java ... ```).
Do not add any other text before or after the code block.
""")
.outputKey("generated_code")
.build();
LlmAgent codeReviewerAgent =
LlmAgent.builder()
.model(MODEL_NAME)
.name("CodeReviewerAgent")
.description("Reviews code and provides feedback.")
.instruction(
"""
You are an expert Java Code Reviewer.
Your task is to provide constructive feedback on the provided code.
**Code to Review:**
```java
{generated_code}
```
**Review Criteria:**
1. **Correctness:** Does the code work as intended? Are there logic errors?
2. **Readability:** Is the code clear and easy to understand? Does it follow Java style guidelines?
3. **Efficiency:** Is the code reasonably efficient? Any obvious performance bottlenecks?
4. **Edge Cases:** Does the code handle potential edge cases or invalid inputs gracefully?
5. **Best Practices:** Does the code follow common Java best practices?
**Output:**
Provide your feedback as a concise, bulleted list. Focus on the most important points for improvement.
If the code is excellent and requires no changes, simply state: "No major issues found."
Output *only* the review comments or the "No major issues" statement.
""")
.outputKey("review_comments")
.build();
LlmAgent codeRefactorerAgent =
LlmAgent.builder()
.model(MODEL_NAME)
.name("CodeRefactorerAgent")
.description("Refactors code based on review comments.")
.instruction(
"""
You are a Java Code Refactoring AI.
Your goal is to improve the given Java code based on the provided review comments.
**Original Code:**
```java
{generated_code}
```
**Review Comments:**
{review_comments}
**Task:**
Carefully apply the suggestions from the review comments to refactor the original code.
If the review comments state "No major issues found," return the original code unchanged.
Ensure the final code is complete, functional, and includes necessary imports and Javadoc comments.
**Output:**
Output *only* the final, refactored Java code block, enclosed in triple backticks (```java ... ```).
Do not add any other text before or after the code block.
""")
.outputKey("refactored_code")
.build();
SequentialAgent codePipelineAgent =
SequentialAgent.builder()
.name(APP_NAME)
.description("Executes a sequence of code writing, reviewing, and refactoring.")
// The agents will run in the order provided: Writer -> Reviewer -> Refactorer
.subAgents(codeWriterAgent, codeReviewerAgent, codeRefactorerAgent)
.build();
// Create an InMemoryRunner
InMemoryRunner runner = new InMemoryRunner(codePipelineAgent, APP_NAME);
// InMemoryRunner automatically creates a session service. Create a session using the service
Session session = runner.sessionService().createSession(APP_NAME, USER_ID).blockingGet();
Content userMessage = Content.fromParts(Part.fromText(prompt));
// Run the agent
Flowable<Event> eventStream = runner.runAsync(USER_ID, session.id(), userMessage);
// Stream event response
eventStream.blockingForEach(
event -> {
if (event.finalResponse()) {
System.out.println(event.stringifyContent());
}
});
}
}
````
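In every language above, the pipeline relies on the same mechanism: each agent writes its response to session state under its `output_key`, and later agents read those values back through `{placeholder}` substitution in their instructions. A minimal, framework-free Python sketch of that flow (illustrative only; `StubAgent` and `run_sequential` are made-up names standing in for ADK's `LlmAgent` and `SequentialAgent`):

```python
class StubAgent:
    """Stands in for an LlmAgent: fills {placeholders} from state, writes output_key."""

    def __init__(self, name, instruction, output_key, respond):
        self.name = name
        self.instruction = instruction
        self.output_key = output_key
        self.respond = respond  # stand-in for the LLM call

    def run(self, state):
        prompt = self.instruction.format(**state)      # {generated_code} etc. read from state
        state[self.output_key] = self.respond(prompt)  # stored for downstream agents

def run_sequential(agents, state):
    for agent in agents:
        agent.run(state)  # agents run strictly in order, sharing one state dict
    return state

writer = StubAgent("Writer", "Write code for: {request}", "generated_code",
                   lambda p: "def add(a, b): return a + b")
reviewer = StubAgent("Reviewer", "Review:\n{generated_code}", "review_comments",
                     lambda p: "No major issues found.")
refactorer = StubAgent("Refactorer", "Apply {review_comments} to {generated_code}",
                       "refactored_code", lambda p: "def add(a, b): return a + b")

final_state = run_sequential([writer, reviewer, refactorer], {"request": "an add function"})
```

Because the state dict is shared, the refactorer sees both the writer's code and the reviewer's comments without any explicit wiring between agents.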
# Tools and Integrations for Agents
Check out the following pre-built tools and integrations that you can use with ADK agents. For information on building custom tools, see [Custom Tools](/adk-docs/tools-custom/). For information on submitting integrations to the catalog, see the [Contribution Guide for Integrations](https://github.com/google/adk-docs/blob/main/CONTRIBUTING.md#integrations).
# AG-UI user interface for ADK
Supported in ADK: Python, TypeScript, Go, Java
Turn your ADK agents into full-featured applications with rich, responsive UIs. [AG-UI](https://docs.ag-ui.com/) is an open protocol that handles streaming events, client state, and bi-directional communication between your agents and users.
[AG-UI](https://github.com/ag-ui-protocol/ag-ui) provides a consistent interface to empower rich clients across technology stacks, from mobile to the web and even the command line. There are a number of different clients that support AG-UI:
- [CopilotKit](https://copilotkit.ai) provides tooling and components to tightly integrate your agent with web applications
- Clients for [Kotlin](https://github.com/ag-ui-protocol/ag-ui/tree/main/sdks/community/kotlin), [Java](https://github.com/ag-ui-protocol/ag-ui/tree/main/sdks/community/java), [Go](https://github.com/ag-ui-protocol/ag-ui/tree/main/sdks/community/go/example/client), and [CLI implementations](https://github.com/ag-ui-protocol/ag-ui/tree/main/apps/client-cli-example/src) in TypeScript
This tutorial uses CopilotKit to create a sample app backed by an ADK agent that demonstrates some of the features supported by AG-UI.
## Quickstart
To get started, let's create a sample application with an ADK agent and a simple web client:
1. Create the app:
```bash
npx copilotkit@latest create -f adk
```
1. Set your Google API key:
```bash
export GOOGLE_API_KEY="your-api-key"
```
1. Install dependencies and run:
```bash
npm install && npm run dev
```
This starts two servers:
- **http://localhost:3000** - The web UI (open this in your browser)
- **http://localhost:8000** - The ADK agent API (backend only)
Open http://localhost:3000 in your browser to chat with your agent.
## Features
### Chat
Chat is a familiar interface for exposing your agent, and AG-UI handles streaming messages between your users and agents:
Learn more about the chat UI [in the CopilotKit docs](https://docs.copilotkit.ai/adk/agentic-chat-ui).
### Generative UI
AG-UI lets you share tool information with a Generative UI so that it can be displayed to users:
src/app/page.tsx
```tsx
useRenderToolCall(
{
name: "get_weather",
description: "Get the weather for a given location.",
parameters: [{ name: "location", type: "string", required: true }],
render: ({ args }) => {
return <div>Weather in {args.location}</div>; // placeholder element (original component omitted)
},
},
[themeColor],
);
```
Learn more about Generative UI [in the CopilotKit docs](https://docs.copilotkit.ai/adk/generative-ui).
### Shared State
ADK agents can be stateful, and synchronizing that state between your agents and your UIs enables powerful and fluid user experiences. State can be synchronized both ways so agents are automatically aware of changes made by your user or other parts of your application:
src/app/page.tsx
```tsx
const { state, setState } = useCoAgent({
name: "my_agent",
initialState: {
proverbs: [
"A journey of a thousand miles begins with a single step.",
],
},
})
```
Learn more about shared state [in the CopilotKit docs](https://docs.copilotkit.ai/adk/shared-state).
## Resources
To see what other features you can build into your UI with AG-UI, refer to the CopilotKit docs:
- [Agentic Generative UI](https://docs.copilotkit.ai/adk/generative-ui/agentic)
- [Human in the Loop](https://docs.copilotkit.ai/adk/human-in-the-loop)
- [Frontend Actions](https://docs.copilotkit.ai/adk/frontend-actions)
Or try them out in the [AG-UI Dojo](https://dojo.ag-ui.com).
# AgentOps observability for ADK
Supported in ADK: Python
**With just two lines of code**, [AgentOps](https://www.agentops.ai) provides session replays, metrics, and monitoring for agents.
## Why AgentOps for ADK?
Observability is a key aspect of developing and deploying conversational AI agents. It allows developers to understand how their agents perform, how they interact with users, and how they use external tools and APIs.
By integrating AgentOps, developers can gain deep insights into their ADK agent's behavior, LLM interactions, and tool usage.
Google ADK includes its own OpenTelemetry-based tracing system, primarily aimed at providing developers with a way to trace the basic flow of execution within their agents. AgentOps enhances this by offering a dedicated and more comprehensive observability platform with:
- **Unified Tracing and Replay Analytics:** Consolidate traces from ADK and other components of your AI stack.
- **Rich Visualization:** Intuitive dashboards to visualize agent execution flow, LLM calls, and tool performance.
- **Detailed Debugging:** Drill down into specific spans, view prompts, completions, token counts, and errors.
- **LLM Cost and Latency Tracking:** Track latencies, costs (via token usage), and identify bottlenecks.
- **Simplified Setup:** Get started with just a few lines of code.
*AgentOps dashboard displaying a trace from a multi-step ADK application execution. You can see the hierarchical structure of spans, including the main agent workflow, individual sub-agents, LLM calls, and tool executions. Note the clear hierarchy: the main workflow agent span contains child spans for various sub-agent operations, LLM calls, and tool executions.*
## Getting Started with AgentOps and ADK
Integrating AgentOps into your ADK application is straightforward:
1. **Install AgentOps:**
```bash
pip install -U agentops
```
1. **Create an API Key:** Create a user API key here: [Create API Key](https://app.agentops.ai/settings/projects), then add it to your environment variables:
```text
AGENTOPS_API_KEY=
```
1. **Initialize AgentOps:** Add the following lines at the beginning of your ADK application script (e.g., your main Python file running the ADK `Runner`):
```python
import agentops
agentops.init()
```
This will initiate an AgentOps session as well as automatically track ADK agents.
Detailed example:
```python
import agentops
import os
from dotenv import load_dotenv
# Load environment variables (optional, if you use a .env file for API keys)
load_dotenv()
agentops.init(
api_key=os.getenv("AGENTOPS_API_KEY"), # Your AgentOps API Key
trace_name="my-adk-app-trace" # Optional: A name for your trace
# auto_start_session=True is the default.
# Set to False if you want to manually control session start/end.
)
```
> 🚨 🔑 You can find your AgentOps API key on your [AgentOps Dashboard](https://app.agentops.ai/) after signing up. It's recommended to set it as an environment variable (`AGENTOPS_API_KEY`).
Once initialized, AgentOps will automatically begin instrumenting your ADK agent.
**This is all you need to capture telemetry data for your ADK agent.**
## How AgentOps Instruments ADK
AgentOps employs a sophisticated strategy to provide seamless observability without conflicting with ADK's native telemetry:
1. **Neutralizing ADK's Native Telemetry:** AgentOps detects ADK and intelligently patches ADK's internal OpenTelemetry tracer (typically `trace.get_tracer('gcp.vertex.agent')`). It replaces it with a `NoOpTracer`, ensuring that ADK's own attempts to create telemetry spans are effectively silenced. This prevents duplicate traces and allows AgentOps to be the authoritative source for observability data.
1. **AgentOps-Controlled Span Creation:** AgentOps takes control by wrapping key ADK methods to create a logical hierarchy of spans:
- **Agent Execution Spans (e.g., `adk.agent.MySequentialAgent`):** When an ADK agent (like `BaseAgent`, `SequentialAgent`, or `LlmAgent`) starts its `run_async` method, AgentOps initiates a parent span for that agent's execution.
- **LLM Interaction Spans (e.g., `adk.llm.gemini-pro`):** For calls made by an agent to an LLM (via ADK's `BaseLlmFlow._call_llm_async`), AgentOps creates a dedicated child span, typically named after the LLM model. This span captures request details (prompts, model parameters) and, upon completion (via ADK's `_finalize_model_response_event`), records response details like completions, token usage, and finish reasons.
- **Tool Usage Spans (e.g., `adk.tool.MyCustomTool`):** When an agent uses a tool (via ADK's `functions.__call_tool_async`), AgentOps creates a single, comprehensive child span named after the tool. This span includes the tool's input parameters and the result it returns.
1. **Rich Attribute Collection:** AgentOps reuses ADK's internal data extraction logic. It patches ADK's specific telemetry functions (e.g., `google.adk.telemetry.trace_tool_call`, `trace_call_llm`). The AgentOps wrappers for these functions take the detailed information ADK gathers and attach it as attributes to the *currently active AgentOps span*.
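The patching described above is ordinary Python function wrapping: replace a method with a wrapper that runs the original and records a span around it. A minimal, dependency-free sketch of the pattern (illustrative only; `SPANS`, `traced`, and `Agent` are made-up names, not AgentOps or ADK APIs):

```python
import functools

SPANS = []  # stand-in for a span exporter; a real tracer ships spans to a backend

def traced(span_name):
    """Wrap a function so each call records a span with its result as an attribute."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            result = fn(*args, **kwargs)  # delegate to the original method
            SPANS.append({"name": span_name, "result": result})
            return result
        return wrapper
    return decorator

class Agent:
    def call_llm(self, prompt):
        return f"completion for: {prompt}"

# Patch the class, the same way an instrumentation library wraps internals at import time.
Agent.call_llm = traced("adk.llm.example-model")(Agent.call_llm)

agent = Agent()
agent.call_llm("hello")
```

Replacing a tracer with a no-op works the same way in reverse: the original telemetry call sites keep running, but the substituted object silently discards their spans.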
## Visualizing Your ADK Agent in AgentOps
When you instrument your ADK application with AgentOps, you gain a clear, hierarchical view of your agent's execution in the AgentOps dashboard.
1. **Initialization:** When `agentops.init()` is called (e.g., `agentops.init(trace_name="my_adk_application")`), an initial parent span is created if `auto_start_session=True` (the default). This span, often named similar to `my_adk_application.session`, will be the root for all operations within that trace.
1. **ADK Runner Execution:** When an ADK `Runner` executes a top-level agent (e.g., a `SequentialAgent` orchestrating a workflow), AgentOps creates a corresponding agent span under the session trace. This span will reflect the name of your top-level ADK agent (e.g., `adk.agent.YourMainWorkflowAgent`).
1. **Sub-Agent and LLM/Tool Calls:** As this main agent executes its logic, including calling sub-agents, LLMs, or tools:
- Each **sub-agent execution** will appear as a nested child span under its parent agent.
- Calls to **Large Language Models** will generate further nested child spans (e.g., `adk.llm.`), capturing prompt details, responses, and token usage.
- **Tool invocations** will also result in distinct child spans (e.g., `adk.tool.`), showing their parameters and results.
This creates a waterfall of spans, allowing you to see the sequence, duration, and details of each step in your ADK application. All relevant attributes, such as LLM prompts, completions, token counts, tool inputs/outputs, and agent names, are captured and displayed.
For a practical demonstration, you can explore a sample Jupyter Notebook that illustrates a human approval workflow using Google ADK and AgentOps: [Google ADK Human Approval Example on GitHub](https://github.com/AgentOps-AI/agentops/blob/main/examples/google_adk/human_approval.ipynb).
This example showcases how a multi-step agent process with tool usage is visualized in AgentOps.
## Benefits
- **Effortless Setup:** Minimal code changes for comprehensive ADK tracing.
- **Deep Visibility:** Understand the inner workings of complex ADK agent flows.
- **Faster Debugging:** Quickly pinpoint issues with detailed trace data.
- **Performance Optimization:** Analyze latencies and token usage.
By integrating AgentOps, ADK developers can significantly enhance their ability to build, debug, and maintain robust AI agents.
## Further Information
To get started, [create an AgentOps account](http://app.agentops.ai). For feature requests or bug reports, please reach out to the AgentOps team on the [AgentOps Repo](https://github.com/AgentOps-AI/agentops).
### Extra links
🐦 [Twitter](http://x.com/agentopsai) • 📢 [Discord](http://x.com/agentopsai) • 🖇️ [AgentOps Dashboard](http://app.agentops.ai) • 📙 [Documentation](http://docs.agentops.ai)
# Google Cloud API Registry tool for ADK
Supported in ADK: Python v1.20.0 (Preview)
The Google Cloud API Registry connector tool for Agent Development Kit (ADK) lets you access a wide range of Google Cloud services for your agents as Model Context Protocol (MCP) servers through the [Google Cloud API Registry](https://docs.cloud.google.com/api-registry/docs/overview). You can configure this tool to connect your agent to your Google Cloud projects and dynamically access Cloud services enabled for that project.
Preview release
The Google Cloud API Registry feature is a Preview release. For more information, see the [launch stage descriptions](https://cloud.google.com/products#product-launch-stages).
## Prerequisites
Before using the API Registry with your agent, you need to ensure the following:
- **Google Cloud project:** Configure your agent to access AI models using an existing Google Cloud project.
- **API Registry access:** The environment where your agent runs needs Google Cloud [Application Default Credentials](https://docs.cloud.google.com/docs/authentication/provide-credentials-adc) with the `apiregistry.viewer` role to list available MCP servers.
- **Cloud APIs:** In your Google Cloud project, enable the *cloudapiregistry.googleapis.com* and *apihub.googleapis.com* Google Cloud APIs.
- **MCP Server and Tool access:** Make sure you enable the MCP servers in the API Registry for the Google Cloud services in your Cloud project that you want to access with your agent. You can enable this in the Cloud Console or use a gcloud command such as: `gcloud beta api-registry mcp enable bigquery.googleapis.com --project={PROJECT_ID}`. The credentials used by the agent must have permissions to access the MCP server and the underlying services used by the tools. For example, to use BigQuery tools, the service account needs BigQuery IAM roles like `bigquery.dataViewer` and `bigquery.jobUser`. For more information about required permissions, see [Authentication and access](#auth).
You can check what MCP servers are enabled with API Registry using the following gcloud command:
```console
gcloud beta api-registry mcp servers list --project={PROJECT_ID}
```
## Use with agent
When configuring the API Registry connector tool with an agent, you first initialize the `ApiRegistry` class to establish a connection with Cloud services, and then use the `get_toolset()` function to retrieve a toolset for a specific MCP server registered in the API Registry. The following code example demonstrates how to create an agent that uses tools from an MCP server listed in API Registry. This agent is designed to interact with BigQuery:
```python
import os
from google.adk.agents.llm_agent import LlmAgent
from google.adk.tools.api_registry import ApiRegistry
# Configure with your Google Cloud Project ID and registered MCP server name
PROJECT_ID = "your-google-cloud-project-id"
MCP_SERVER_NAME = "projects/your-google-cloud-project-id/locations/global/mcpServers/your-mcp-server-name"
# Example header provider for BigQuery, a project header is required.
def header_provider(context):
return {"x-goog-user-project": PROJECT_ID}
# Initialize ApiRegistry
api_registry = ApiRegistry(
api_registry_project_id=PROJECT_ID,
header_provider=header_provider
)
# Get the toolset for the specific MCP server
registry_tools = api_registry.get_toolset(
mcp_server_name=MCP_SERVER_NAME,
# Optionally filter tools:
#tool_filter=["list_datasets", "run_query"]
)
# Create an agent with the tools
root_agent = LlmAgent(
model="gemini-1.5-flash", # Or your preferred model
name="bigquery_assistant",
instruction="""
Help user access their BigQuery data using the available tools.
""",
tools=[registry_tools],
)
```
For the complete code for this example, see the [api_registry_agent](https://github.com/google/adk-python/tree/main/contributing/samples/api_registry_agent/) sample. For information on the configuration options, see [Configuration](#configuration). For information on the authentication for this tool, see [Authentication and access](#auth).
## Authentication and access
Using the API Registry with your agent requires authentication for the services the agent accesses. By default the tool uses Google Cloud [Application Default Credentials](https://docs.cloud.google.com/docs/authentication/provide-credentials-adc) for authentication. When using this tool make sure your agent has the following permissions and access:
- **API Registry access:** The `ApiRegistry` class uses Application Default Credentials (`google.auth.default()`) to authenticate requests to the Google Cloud API Registry to list the available MCP servers. Ensure the environment where the agent runs has credentials with the necessary permissions to view the API Registry resources, such as `apiregistry.viewer`.
- **MCP Server and Tool access:** The `McpToolset` returned by `get_toolset` also uses the Google Cloud Application Default Credentials by default to authenticate calls to the actual MCP server endpoint. The credentials used must have the necessary permissions for both:
1. Accessing the MCP server itself.
1. Utilizing the underlying services and resources that the tools interact with.
- **MCP Tool user role:** Allow the account used by your agent to call MCP tools through the API registry by granting the MCP tool user role: `gcloud projects add-iam-policy-binding {PROJECT_ID} --member={member} --role="roles/mcp.toolUser"`
For example, when using MCP server tools that interact with BigQuery, the account associated with the credentials, such as a service account, must be granted appropriate BigQuery IAM roles, such as `bigquery.dataViewer` or `bigquery.jobUser`, within your Google Cloud project to access datasets and run queries. In the case of the BigQuery MCP server, an `"x-goog-user-project": PROJECT_ID` header is required to use its tools. Additional headers for authentication or project context can be injected via the `header_provider` argument in the `ApiRegistry` constructor.
## Configuration
The `ApiRegistry` object has the following configuration options:
- **`api_registry_project_id`** (str): The Google Cloud Project ID where the API Registry is located.
- **`location`** (str, optional): The location of the API Registry resources. Defaults to `"global"`.
- **`header_provider`** (Callable, optional): A function that takes the call context and returns a dictionary of additional HTTP headers to be sent with requests to the MCP server. This is often used for dynamic authentication or project-specific headers.
The `get_toolset()` function has the following configuration options:
- **`mcp_server_name`** (str): The full name of the registered MCP server from which to load tools, for example: `projects/my-project/locations/global/mcpServers/my-server`.
- **`tool_filter`** (Union\[ToolPredicate, List[str]\], optional): Specifies which tools to include in the toolset.
- If a list of strings, only tools with names in the list are included.
- If a `ToolPredicate` function, the function is called for each tool, and only tools for which it returns `True` are included.
- If `None`, all tools from the MCP server are included.
- **`tool_name_prefix`** (str, optional): A prefix to add to the name of each tool in the resulting toolset.
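The `tool_filter` behavior described above can be summarized in a few lines of plain Python (a sketch of the documented semantics, not the ADK implementation; tool names are illustrative):

```python
def filter_tools(tools, tool_filter=None):
    """Sketch of the documented tool_filter semantics:
    None keeps every tool, a list of names keeps matching names,
    and a ToolPredicate-style callable keeps tools it returns True for."""
    if tool_filter is None:
        return list(tools)
    if callable(tool_filter):
        return [tool for tool in tools if tool_filter(tool)]
    return [tool for tool in tools if tool in tool_filter]

# Tool names here are illustrative, matching the BigQuery example above.
available = ["list_datasets", "run_query", "delete_table"]
by_name = filter_tools(available, ["list_datasets", "run_query"])
by_predicate = filter_tools(available, lambda name: not name.startswith("delete"))
```

A predicate is useful when you want a policy (e.g., "no mutating tools") rather than a fixed allowlist.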
## Additional resources
- [api_registry_agent](https://github.com/google/adk-python/tree/main/contributing/samples/api_registry_agent/) ADK code sample
- [Google Cloud API Registry](https://docs.cloud.google.com/api-registry/docs/overview) documentation
# Apigee API Hub tool for ADK
Supported in ADK: Python v0.1.0
**APIHubToolset** lets you turn any documented API from Apigee API hub into a tool with a few lines of code. This section provides step-by-step instructions, including setting up authentication for a secure connection to your APIs.
**Prerequisites**
1. [Install ADK](/adk-docs/get-started/installation/)
1. Install the [Google Cloud CLI](https://cloud.google.com/sdk/docs/install?db=bigtable-docs#installation_instructions).
1. [Apigee API hub](https://cloud.google.com/apigee/docs/apihub/what-is-api-hub) instance with documented APIs (i.e., with OpenAPI specs)
1. Set up your project structure and create required files
```console
project_root_folder
|
`-- my_agent
|-- .env
|-- __init__.py
|-- agent.py
`-- tools.py
```
## Create an API Hub Toolset
Note: This tutorial includes creating an agent. If you already have an agent, you only need to follow a subset of these steps.
1. Get your access token so that `APIHubToolset` can fetch the spec from the API Hub API. Run the following command in your terminal:
```shell
gcloud auth print-access-token
# Prints your access token like 'ya29....'
```
1. Ensure that the account used has the required permissions. You can use the pre-defined role `roles/apihub.viewer` or assign the following permissions:
1. **apihub.specs.get (required)**
1. apihub.apis.get (optional)
1. apihub.apis.list (optional)
1. apihub.versions.get (optional)
1. apihub.versions.list (optional)
1. apihub.specs.list (optional)
1. Create a tool with `APIHubToolset`. Add the following to `tools.py`:
If your API requires authentication, you must configure authentication for the tool. The following code sample demonstrates how to configure an API key. ADK supports token-based auth (API Key, Bearer token), service accounts, and OpenID Connect. We will soon add support for various OAuth2 flows.
```py
from google.adk.tools.openapi_tool.auth.auth_helpers import token_to_scheme_credential
from google.adk.tools.apihub_tool.apihub_toolset import APIHubToolset

# Provide authentication for your APIs. Not required if your APIs
# don't require authentication.
auth_scheme, auth_credential = token_to_scheme_credential(
    "apikey", "query", "apikey", apikey_credential_str
)

sample_toolset = APIHubToolset(
    name="apihub-sample-tool",
    description="Sample Tool",
    access_token="...",  # Copy your access token generated in step 1
    apihub_resource_name="...",  # API Hub resource name
    auth_scheme=auth_scheme,
    auth_credential=auth_credential,
)
```
For production deployments, we recommend using a service account instead of an access token. In the code snippet above, use `service_account_json=service_account_cred_json_str` and provide your service account credentials instead of the token.
For `apihub_resource_name`, if you know the specific ID of the OpenAPI spec being used for your API, use `projects/my-project-id/locations/us-west1/apis/my-api-id/versions/version-id/specs/spec-id`. If you would like the toolset to automatically pull the first available spec from the API, use `projects/my-project-id/locations/us-west1/apis/my-api-id`.
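As a quick illustration, the two accepted resource-name formats can be assembled from your project, location, and API IDs (the IDs below are the placeholders from above, not real resources):

```python
# Sketch: the two accepted apihub_resource_name formats, built from
# the placeholder IDs used above (replace with your own values).
project, location, api = "my-project-id", "us-west1", "my-api-id"

# Pin a specific OpenAPI spec version:
spec_resource = (
    f"projects/{project}/locations/{location}/apis/{api}"
    "/versions/version-id/specs/spec-id"
)

# Or let the toolset pull the first available spec for the API:
api_resource = f"projects/{project}/locations/{location}/apis/{api}"

print(spec_resource)
print(api_resource)
```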
1. Create your agent file `agent.py` and add the created tools to your agent definition:
```py
from google.adk.agents.llm_agent import LlmAgent
from .tools import sample_toolset

root_agent = LlmAgent(
    model='gemini-2.0-flash',
    name='enterprise_assistant',
    instruction='Help user, leverage the tools you have access to',
    tools=sample_toolset.get_tools(),
)
```
1. Configure your `__init__.py` to expose your agent:
```py
from . import agent
```
1. Start the Google ADK Web UI and try your agent:
```shell
# make sure to run `adk web` from your project_root_folder
adk web
```
Then open the ADK Web UI in your browser to try your agent.
# Google Cloud Application Integration tool for ADK
Supported in ADK Python v0.1.0, Java v0.3.0
With **ApplicationIntegrationToolset**, you can seamlessly give your agents secure and governed access to enterprise applications using Integration Connectors' 100+ pre-built connectors for systems like Salesforce, ServiceNow, JIRA, SAP, and more.
It supports both on-premise and SaaS applications. In addition, you can turn your existing Application Integration process automations into agentic workflows by providing application integration workflows as tools to your ADK agents.
Federated search within Application Integration lets you use ADK agents to query multiple enterprise applications and data sources simultaneously.
[See how ADK Federated Search in Application Integration works in this video walkthrough](https://www.youtube.com/watch?v=JdlWOQe5RgU)
## Prerequisites
### 1. Install ADK
Install Agent Development Kit following the steps in the [installation guide](/adk-docs/get-started/installation/).
### 2. Install CLI
Install the [Google Cloud CLI](https://cloud.google.com/sdk/docs/install#installation_instructions). To use the tool with default credentials, run the following commands:
```shell
gcloud config set project PROJECT_ID
gcloud auth application-default login
gcloud auth application-default set-quota-project PROJECT_ID
```
Replace `PROJECT_ID` with the unique ID of your Google Cloud project.
### 3. Provision Application Integration workflow and publish Connection Tool
Use an existing [Application Integration](https://cloud.google.com/application-integration/docs/overview) workflow or [Integrations Connector](https://cloud.google.com/integration-connectors/docs/overview) connection you want to use with your agent. You can also create a new [Application Integration workflow](https://cloud.google.com/application-integration/docs/setup-application-integration) or a [connection](https://cloud.google.com/integration-connectors/docs/connectors/neo4j/configure#configure-the-connector).
Import and publish the [Connection Tool](https://console.cloud.google.com/integrations/templates/connection-tool/locations/global) from the template library.
**Note**: To use a connector from Integration Connectors, you must provision Application Integration in the same region as your connection.
### 4. Create project structure
For Python, set up your project structure and create the required files:
```console
project_root_folder
├── .env
└── my_agent
├── __init__.py
├── agent.py
└── tools.py
```
When running the agent, make sure to run `adk web` from the `project_root_folder`.
For Java, set up your project structure and create the required files:
```console
project_root_folder
└── my_agent
├── agent.java
└── pom.xml
```
When running the agent, make sure to run the commands from the `project_root_folder`.
### 5. Set roles and permissions
To get the permissions that you need to set up **ApplicationIntegrationToolset**, you must have the following IAM roles on the project (common to both Integration Connectors and Application Integration Workflows):
```text
- roles/integrations.integrationEditor
- roles/connectors.invoker
- roles/secretmanager.secretAccessor
```
**Note:** When using Agent Engine (AE) for deployment, don't use `roles/integrations.integrationInvoker`, as it can result in 403 errors. Use `roles/integrations.integrationEditor` instead.
## Use Integration Connectors
Connect your agent to enterprise applications using [Integration Connectors](https://cloud.google.com/integration-connectors/docs/overview).
### Before you begin
**Note:** The *ExecuteConnection* integration is typically created automatically when you provision Application Integration in a given region. If the *ExecuteConnection* doesn't exist in the [list of integrations](https://console.cloud.google.com/integrations/list), you must follow these steps to create it:
1. To use a connector from Integration Connectors, click **QUICK SETUP** and [provision](https://console.cloud.google.com/integrations) Application Integration in the same region as your connection.
1. Go to the [Connection Tool](https://console.cloud.google.com/integrations/templates/connection-tool/locations/us-central1) template in the template library and click **USE TEMPLATE**.
1. Enter the Integration Name as *ExecuteConnection* (you must use this exact integration name). Then select the region to match your connection region and click **CREATE**.
1. Click **PUBLISH** to publish the integration in the *Application Integration* editor.
### Create an Application Integration Toolset
To create an Application Integration Toolset for Integration Connectors, follow these steps:
1. Create a tool with `ApplicationIntegrationToolset` in the `tools.py` file:
```py
from google.adk.tools.application_integration_tool.application_integration_toolset import ApplicationIntegrationToolset

connector_tool = ApplicationIntegrationToolset(
    project="test-project",  # TODO: replace with GCP project of the connection
    location="us-central1",  # TODO: replace with location of the connection
    connection="test-connection",  # TODO: replace with connection name
    # An empty list of operations means all operations on the entity are supported.
    entity_operations={"Entity_One": ["LIST", "CREATE"], "Entity_Two": []},
    actions=["action1"],  # TODO: replace with actions
    service_account_json='{...}',  # Optional. Stringified JSON for the service account key
    tool_name_prefix="tool_prefix2",
    tool_instructions="..."
)
```
**Note:**
- You can provide a service account to be used instead of default credentials by generating a [Service Account Key](https://cloud.google.com/iam/docs/keys-create-delete#creating), and providing the right [Application Integration and Integration Connector IAM roles](#prerequisites) to the service account.
- To find the list of supported entities and actions for a connection, use the Connectors APIs: [listActions](https://cloud.google.com/integration-connectors/docs/reference/rest/v1/projects.locations.connections.connectionSchemaMetadata/listActions) or [listEntityTypes](https://cloud.google.com/integration-connectors/docs/reference/rest/v1/projects.locations.connections.connectionSchemaMetadata/listEntityTypes).
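For instance, the `listActions` request URL follows the resource path of the connection. A minimal sketch, assuming the standard Connectors v1 REST endpoint and the placeholder project, location, and connection names used above:

```python
# Sketch: building the listActions request URL for the placeholder
# connection above (assumes the standard Connectors v1 REST endpoint).
base = "https://connectors.googleapis.com/v1"
connection = "projects/test-project/locations/us-central1/connections/test-connection"

list_actions_url = f"{base}/{connection}/connectionSchemaMetadata:listActions"
print(list_actions_url)
```

The `listEntityTypes` method follows the same pattern, with `:listEntityTypes` in place of `:listActions`.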
`ApplicationIntegrationToolset` supports `auth_scheme` and `auth_credential` for **dynamic OAuth2 authentication** for Integration Connectors. To use it, create a tool similar to this in the `tools.py` file:
```py
from google.adk.tools.application_integration_tool.application_integration_toolset import ApplicationIntegrationToolset
from google.adk.tools.openapi_tool.auth.auth_helpers import dict_to_auth_scheme
from google.adk.auth import AuthCredential
from google.adk.auth import AuthCredentialTypes
from google.adk.auth import OAuth2Auth
oauth2_data_google_cloud = {
    "type": "oauth2",
    "flows": {
        "authorizationCode": {
            "authorizationUrl": "https://accounts.google.com/o/oauth2/auth",
            "tokenUrl": "https://oauth2.googleapis.com/token",
            "scopes": {
                "https://www.googleapis.com/auth/cloud-platform": (
                    "View and manage your data across Google Cloud Platform"
                    " services"
                ),
                "https://www.googleapis.com/auth/calendar.readonly": "View your calendars",
            },
        }
    },
}

oauth_scheme = dict_to_auth_scheme(oauth2_data_google_cloud)

auth_credential = AuthCredential(
    auth_type=AuthCredentialTypes.OAUTH2,
    oauth2=OAuth2Auth(
        client_id="...",  # TODO: replace with client_id
        client_secret="...",  # TODO: replace with client_secret
    ),
)

connector_tool = ApplicationIntegrationToolset(
    project="test-project",  # TODO: replace with GCP project of the connection
    location="us-central1",  # TODO: replace with location of the connection
    connection="test-connection",  # TODO: replace with connection name
    # An empty list of operations means all operations on the entity are supported.
    entity_operations={"Entity_One": ["LIST", "CREATE"], "Entity_Two": []},
    actions=["GET_calendars/%7BcalendarId%7D/events"],  # TODO: replace with actions; this one lists events
    service_account_json='{...}',  # Optional. Stringified JSON for the service account key
    tool_name_prefix="tool_prefix2",
    tool_instructions="...",
    auth_scheme=oauth_scheme,
    auth_credential=auth_credential,
)
```
1. Update the `agent.py` file and add tool to your agent:
```py
from google.adk.agents.llm_agent import LlmAgent
from .tools import connector_tool

root_agent = LlmAgent(
    model='gemini-2.0-flash',
    name='connector_agent',
    instruction="Help user, leverage the tools you have access to",
    tools=[connector_tool],
)
```
1. Configure `__init__.py` to expose your agent:
```py
from . import agent
```
1. Start the Google ADK Web UI and use your agent:
```shell
# make sure to run `adk web` from your project_root_folder
adk web
```
After completing the above steps, open the ADK Web UI and choose the `my_agent` agent (the same as the agent folder name).
## Use Application Integration Workflows
Use an existing [Application Integration](https://cloud.google.com/application-integration/docs/overview) workflow as a tool for your agent or create a new one.
### 1. Create a tool
To create a tool with `ApplicationIntegrationToolset` in the `tools.py` file, use the following code:
```py
from google.adk.tools.application_integration_tool.application_integration_toolset import ApplicationIntegrationToolset

integration_tool = ApplicationIntegrationToolset(
    project="test-project",  # TODO: replace with GCP project of the integration
    location="us-central1",  # TODO: replace with location of the integration
    integration="test-integration",  # TODO: replace with integration name
    # An empty list means all API triggers in the integration are considered.
    triggers=["api_trigger/test_trigger"],  # TODO: replace with trigger id(s)
    service_account_json='{...}',  # Optional. Stringified JSON for the service account key
    tool_name_prefix="tool_prefix1",
    tool_instructions="..."
)
```
**Note:** You can provide a service account to be used instead of using default credentials. To do this, generate a [Service Account Key](https://cloud.google.com/iam/docs/keys-create-delete#creating) and provide the correct [Application Integration and Integration Connector IAM roles](#prerequisites) to the service account. For more details about the IAM roles, refer to the [Prerequisites](#prerequisites) section.
To create a tool with `ApplicationIntegrationToolset` in the `tools.java` file, use the following code:
```java
import com.google.adk.tools.applicationintegrationtoolset.ApplicationIntegrationToolset;
import com.google.common.collect.ImmutableList;
import com.google.common.collect.ImmutableMap;
public class Tools {
  // Public so the agent class in the next step can reference these toolsets.
  public static ApplicationIntegrationToolset integrationTool;
  public static ApplicationIntegrationToolset connectionsTool;

  static {
    integrationTool =
        new ApplicationIntegrationToolset(
            "test-project",
            "us-central1",
            "test-integration",
            ImmutableList.of("api_trigger/test-api"),
            null,
            null,
            null,
            "{...}",
            "tool_prefix1",
            "...");
    connectionsTool =
        new ApplicationIntegrationToolset(
            "test-project",
            "us-central1",
            null,
            null,
            "test-connection",
            ImmutableMap.of("Issue", ImmutableList.of("GET")),
            ImmutableList.of("ExecuteCustomQuery"),
            "{...}",
            "tool_prefix",
            "...");
  }
}
```
**Note:** You can provide a service account to be used instead of using default credentials. To do this, generate a [Service Account Key](https://cloud.google.com/iam/docs/keys-create-delete#creating) and provide the correct [Application Integration and Integration Connector IAM roles](#prerequisites) to the service account. For more details about the IAM roles, refer to the [Prerequisites](#prerequisites) section.
### 2. Add the tool to your agent
To update the `agent.py` file and add the tool to your agent, use the following code:
```py
from google.adk.agents.llm_agent import LlmAgent
from .tools import integration_tool, connector_tool
root_agent = LlmAgent(
model='gemini-2.0-flash',
name='integration_agent',
instruction="Help user, leverage the tools you have access to",
tools=[integration_tool],
)
```
To update the `agent.java` file and add the tool to your agent, use the following code:
```java
import com.google.adk.agents.LlmAgent;
import com.google.adk.tools.BaseTool;
import com.google.common.collect.ImmutableList;

public class MyAgent {
  public static void main(String[] args) {
    // Assuming the Tools class is defined as in the previous step
    ImmutableList tools =
        ImmutableList.builder()
            .add(Tools.integrationTool)
            .add(Tools.connectionsTool)
            .build();

    // Finally, create your agent with the tools generated automatically.
    LlmAgent rootAgent =
        LlmAgent.builder()
            .name("science-teacher")
            .description("Science teacher agent")
            .model("gemini-2.0-flash")
            .instruction("Help user, leverage the tools you have access to.")
            .tools(tools)
            .build();

    // You can now use rootAgent to interact with the LLM,
    // for example by starting a conversation with the agent.
  }
}
```
**Note:** To find the list of supported entities and actions for a
connection, use these Connector APIs: `listActions`, `listEntityTypes`.
### 3. Expose your agent
To configure `__init__.py` to expose your agent, use the following code:
```py
from . import agent
```
### 4. Use your agent
To start the Google ADK Web UI and use your agent, use the following commands:
```shell
# make sure to run `adk web` from your project_root_folder
adk web
```
After completing the above steps, open the ADK Web UI and choose the `my_agent` agent (the same as the agent folder name).
To start the Google ADK Web UI and use your agent, use the following commands:
```bash
mvn install
mvn exec:java \
-Dexec.mainClass="com.google.adk.web.AdkWebServer" \
-Dexec.args="--adk.agents.source-dir=src/main/java" \
-Dexec.classpathScope="compile"
```
After completing the above steps, open the ADK Web UI and choose the `my_agent` agent (the same as the agent folder name).
# Arize AX observability for ADK
[Arize AX](https://arize.com/docs/ax) is a production-grade observability platform for monitoring, debugging, and improving LLM applications and AI Agents at scale. It provides comprehensive tracing, evaluation, and monitoring capabilities for your Google ADK applications. To get started, sign up for a [free account](https://app.arize.com/auth/join).
For an open-source, self-hosted alternative, check out [Phoenix](https://arize.com/docs/phoenix).
## Overview
Arize AX can automatically collect traces from Google ADK using [OpenInference instrumentation](https://github.com/Arize-ai/openinference/tree/main/python/instrumentation/openinference-instrumentation-google-adk), allowing you to:
- **Trace agent interactions** - Automatically capture every agent run, tool call, model request, and response with context and metadata
- **Evaluate performance** - Assess agent behavior using custom or pre-built evaluators and run experiments to test agent configurations
- **Monitor in production** - Set up real-time dashboards and alerts to track performance
- **Debug issues** - Analyze detailed traces to quickly identify bottlenecks, failed tool calls, and any unexpected agent behavior
## Installation
Install the required packages:
```bash
pip install openinference-instrumentation-google-adk google-adk arize-otel
```
## Setup
### 1. Configure Environment Variables
Set your Google API key:
```bash
export GOOGLE_API_KEY=[your_key_here]
```
### 2. Connect your application to Arize AX
```python
from arize.otel import register

# Register with Arize AX
tracer_provider = register(
    space_id="your-space-id",  # Found in app space settings page
    api_key="your-api-key",  # Found in app space settings page
    project_name="your-project-name",  # Name this whatever you prefer
)

# Import and configure the automatic instrumentor from OpenInference
from openinference.instrumentation.google_adk import GoogleADKInstrumentor

# Finish automatic instrumentation
GoogleADKInstrumentor().instrument(tracer_provider=tracer_provider)
```
## Observe
Now that you have tracing set up, all Google ADK SDK requests will be streamed to Arize AX for observability and evaluation.
```python
import nest_asyncio
nest_asyncio.apply()

from google.adk.agents import Agent
from google.adk.runners import InMemoryRunner
from google.genai import types

# Define a tool function
def get_weather(city: str) -> dict:
    """Retrieves the current weather report for a specified city.

    Args:
        city (str): The name of the city for which to retrieve the weather report.

    Returns:
        dict: status and result or error msg.
    """
    if city.lower() == "new york":
        return {
            "status": "success",
            "report": (
                "The weather in New York is sunny with a temperature of 25 degrees"
                " Celsius (77 degrees Fahrenheit)."
            ),
        }
    else:
        return {
            "status": "error",
            "error_message": f"Weather information for '{city}' is not available.",
        }

# Create an agent with tools
agent = Agent(
    name="weather_agent",
    model="gemini-2.0-flash-exp",
    description="Agent to answer questions using weather tools.",
    instruction="You must use the available tools to find an answer.",
    tools=[get_weather],
)

app_name = "weather_app"
user_id = "test_user"
session_id = "test_session"

runner = InMemoryRunner(agent=agent, app_name=app_name)
session_service = runner.session_service
await session_service.create_session(
    app_name=app_name,
    user_id=user_id,
    session_id=session_id,
)

# Run the agent (all interactions will be traced)
async for event in runner.run_async(
    user_id=user_id,
    session_id=session_id,
    new_message=types.Content(
        role="user",
        parts=[types.Part(text="What is the weather in New York?")],
    ),
):
    if event.is_final_response():
        print(event.content.parts[0].text.strip())
```
## View Results in Arize AX
## Support and Resources
- [Arize AX Documentation](https://arize.com/docs/ax/integrations/frameworks-and-platforms/google-adk)
- [Arize Community Slack](https://arize-ai.slack.com/join/shared_invite/zt-11t1vbu4x-xkBIHmOREQnYnYDH1GDfCg#/shared-invite/email)
- [OpenInference Package](https://github.com/Arize-ai/openinference/tree/main/python/instrumentation/openinference-instrumentation-google-adk)
# Asana MCP tool for ADK
Supported in ADK Python, TypeScript
The [Asana MCP Server](https://developers.asana.com/docs/using-asanas-mcp-server) connects your ADK agent to the [Asana](https://asana.com/) work management platform. This integration gives your agent the ability to manage projects, tasks, goals, and team collaboration using natural language.
## Use cases
- **Track Project Status**: Get real-time updates on project progress, view status reports, and retrieve information about milestones and deadlines.
- **Manage Tasks**: Create, update, and organize tasks using natural language. Let your agent handle task assignments, status changes, and priority updates.
- **Monitor Goals**: Access and update Asana Goals to track team objectives and key results across your organization.
## Prerequisites
- An [Asana](https://asana.com/) account with access to a workspace
## Use with agent
```python
from google.adk.agents import Agent
from google.adk.tools.mcp_tool import McpToolset
from google.adk.tools.mcp_tool.mcp_session_manager import StdioConnectionParams
from mcp import StdioServerParameters

root_agent = Agent(
    model="gemini-2.5-pro",
    name="asana_agent",
    instruction="Help users manage projects, tasks, and goals in Asana",
    tools=[
        McpToolset(
            connection_params=StdioConnectionParams(
                server_params=StdioServerParameters(
                    command="npx",
                    args=[
                        "-y",
                        "mcp-remote",
                        "https://mcp.asana.com/sse",
                    ],
                ),
                timeout=30,
            ),
        )
    ],
)
```
```typescript
import { LlmAgent, MCPToolset } from "@google/adk";

const rootAgent = new LlmAgent({
  model: "gemini-2.5-pro",
  name: "asana_agent",
  instruction: "Help users manage projects, tasks, and goals in Asana",
  tools: [
    new MCPToolset({
      type: "StdioConnectionParams",
      serverParams: {
        command: "npx",
        args: [
          "-y",
          "mcp-remote",
          "https://mcp.asana.com/sse",
        ],
      },
    }),
  ],
});

export { rootAgent };
```
**Note:** When you run this agent for the first time, a browser window opens automatically to request access via OAuth. Alternatively, you can use the authorization URL printed in the console. You must approve this request to allow the agent to access your Asana data.
## Available tools
Asana's MCP server includes 30+ tools organized by category. Tools are automatically discovered when your agent connects. Use the [ADK Web UI](/adk-docs/runtime/web-interface/) to view available tools in the trace graph after running your agent.
| Category | Description |
| ----------------- | ------------------------------------------- |
| Project tracking | Get project status updates and reports |
| Task management | Create, update, and organize tasks |
| User information | Access user details and assignments |
| Goals | Track and update Asana Goals |
| Team organization | Manage team structures and membership |
| Object search | Quick typeahead search across Asana objects |
## Additional resources
- [Asana MCP Server Documentation](https://developers.asana.com/docs/using-asanas-mcp-server)
- [Asana MCP Integration Guide](https://developers.asana.com/docs/integrating-with-asanas-mcp-server)
# Atlassian MCP tool for ADK
Supported in ADK Python, TypeScript
The [Atlassian MCP Server](https://github.com/atlassian/atlassian-mcp-server) connects your ADK agent to the [Atlassian](https://www.atlassian.com/) ecosystem, bridging the gap between project tracking in Jira and knowledge management in Confluence. This integration gives your agent the ability to manage issues, search and update documentation pages, and streamline collaboration workflows using natural language.
## Use cases
- **Unified Knowledge Search**: Search across both Jira issues and Confluence pages simultaneously to find project specs, decisions, or historical context.
- **Automate Issue Management**: Create, edit, and transition Jira issues, or add comments to existing tickets.
- **Documentation Assistant**: Retrieve page content, generate drafts, or add inline comments to Confluence documents directly from your agent.
## Prerequisites
- Sign up for an [Atlassian account](https://id.atlassian.com/signup)
- An Atlassian Cloud site with Jira and/or Confluence
## Use with agent
```python
from google.adk.agents import Agent
from google.adk.tools.mcp_tool import McpToolset
from google.adk.tools.mcp_tool.mcp_session_manager import StdioConnectionParams
from mcp import StdioServerParameters

root_agent = Agent(
    model="gemini-2.5-pro",
    name="atlassian_agent",
    instruction="Help users work with data in Atlassian products",
    tools=[
        McpToolset(
            connection_params=StdioConnectionParams(
                server_params=StdioServerParameters(
                    command="npx",
                    args=[
                        "-y",
                        "mcp-remote",
                        "https://mcp.atlassian.com/v1/mcp",
                    ],
                ),
                timeout=30,
            ),
        )
    ],
)
```
```typescript
import { LlmAgent, MCPToolset } from "@google/adk";

const rootAgent = new LlmAgent({
  model: "gemini-2.5-pro",
  name: "atlassian_agent",
  instruction: "Help users work with data in Atlassian products",
  tools: [
    new MCPToolset({
      type: "StdioConnectionParams",
      serverParams: {
        command: "npx",
        args: [
          "-y",
          "mcp-remote",
          "https://mcp.atlassian.com/v1/mcp",
        ],
      },
    }),
  ],
});

export { rootAgent };
```
**Note:** When you run this agent for the first time, a browser window opens automatically to request access via OAuth. Alternatively, you can use the authorization URL printed in the console. You must approve this request to allow the agent to access your Atlassian data.
## Available tools
| Tool | Description |
| ---------------------------------- | ---------------------------------------------------------- |
| `atlassianUserInfo` | Get information about the user |
| `getAccessibleAtlassianResources` | Get information about accessible Atlassian resources |
| `getJiraIssue` | Get information about a Jira issue |
| `editJiraIssue` | Edit a Jira issue |
| `createJiraIssue` | Create a new Jira issue |
| `getTransitionsForJiraIssue` | Get transitions for a Jira issue |
| `transitionJiraIssue` | Transition a Jira issue |
| `lookupJiraAccountId` | Lookup a Jira account ID |
| `searchJiraIssuesUsingJql` | Search Jira issues using JQL |
| `addCommentToJiraIssue` | Add a comment to a Jira issue |
| `getJiraIssueRemoteIssueLinks` | Get remote issue links for a Jira issue |
| `getVisibleJiraProjects` | Get visible Jira projects |
| `getJiraProjectIssueTypesMetadata` | Get issue types metadata for a Jira project |
| `getJiraIssueTypeMetaWithFields` | Get issue type metadata with fields for a Jira issue |
| `getConfluenceSpaces` | Get information about Confluence spaces |
| `getConfluencePage` | Get information about a Confluence page |
| `getPagesInConfluenceSpace` | Get information about pages in a Confluence space |
| `getConfluencePageFooterComments` | Get information about footer comments in a Confluence page |
| `getConfluencePageInlineComments` | Get information about inline comments in a Confluence page |
| `getConfluencePageDescendants` | Get information about descendants of a Confluence page |
| `createConfluencePage` | Create a new Confluence page |
| `updateConfluencePage` | Update an existing Confluence page |
| `createConfluenceFooterComment` | Create a footer comment in a Confluence page |
| `createConfluenceInlineComment` | Create an inline comment in a Confluence page |
| `searchConfluenceUsingCql` | Search Confluence using CQL |
| `search` | Search for information |
| `fetch` | Fetch information |
## Additional resources
- [Atlassian MCP Server Repository](https://github.com/atlassian/atlassian-mcp-server)
- [Atlassian MCP Server Documentation](https://support.atlassian.com/atlassian-rovo-mcp-server/docs/getting-started-with-the-atlassian-remote-mcp-server/)
# BigQuery Agent Analytics plugin for ADK
Supported in ADK Python v1.21.0 (Preview)
Version Requirement
Use ADK Python version 1.21.0 or higher to make full use of the features described in this document.
The BigQuery Agent Analytics Plugin significantly enhances the Agent Development Kit (ADK) by providing a robust solution for in-depth agent behavior analysis. Using the ADK Plugin architecture and the **BigQuery Storage Write API**, it captures and logs critical operational events directly into a Google BigQuery table, empowering you with advanced capabilities for debugging, real-time monitoring, and comprehensive offline performance evaluation.
Version 1.21.0 introduces **Hybrid Multimodal Logging**, allowing you to log large payloads (images, audio, blobs) by offloading them to Google Cloud Storage (GCS) while keeping a structured reference (`ObjectRef`) in BigQuery.
Preview release
The BigQuery Agent Analytics Plugin is in Preview release. For more information, see the [launch stage descriptions](https://cloud.google.com/products#product-launch-stages).
BigQuery Storage Write API
This feature uses **BigQuery Storage Write API**, which is a paid service. For information on costs, see the [BigQuery documentation](https://cloud.google.com/bigquery/pricing?e=48754805&hl=en#data-ingestion-pricing).
## Use cases
- **Agent workflow debugging and analysis:** Capture a wide range of *plugin lifecycle events* (LLM calls, tool usage) and *agent-yielded events* (user input, model responses), into a well-defined schema.
- **High-volume analysis and debugging:** Logging operations are performed asynchronously using the Storage Write API to allow high throughput and low latency.
- **Multimodal Analysis**: Log and analyze text, images, and other modalities. Large files are offloaded to GCS, making them accessible to BigQuery ML via Object Tables.
- **Distributed Tracing**: Built-in support for OpenTelemetry-style tracing (`trace_id`, `span_id`) to visualize agent execution flows.
The agent event data recorded varies based on the ADK event type. For more information, see [Event types and payloads](#event-types).
## Prerequisites
- **Google Cloud Project** with the **BigQuery API** enabled.
- **BigQuery Dataset:** Create a dataset to store logging tables before using the plugin. The plugin automatically creates the necessary events table within the dataset if the table does not exist.
- **Google Cloud Storage Bucket (Optional):** If you plan to log multimodal content (images, audio, etc.), creating a GCS bucket is recommended for offloading large files.
- **Authentication:**
- **Local:** Run `gcloud auth application-default login`.
- **Cloud:** Ensure your service account has the required permissions.
### IAM permissions
For the agent to work properly, the principal (for example, a service account or user account) under which the agent is running needs these Google Cloud roles:

- `roles/bigquery.jobUser` at the project level, to run BigQuery queries.
- `roles/bigquery.dataEditor` at the table level, to write log/event data.
- **If using GCS offloading:** `roles/storage.objectCreator` and `roles/storage.objectViewer` on the target bucket.
## Use with agent
You use the BigQuery Agent Analytics Plugin by configuring and registering it with your ADK agent's App object. The following example shows an implementation of an agent with this plugin, including GCS offloading:
my_bq_agent/agent.py
```python
# my_bq_agent/agent.py
import os

import google.auth
from google.adk.apps import App
from google.adk.plugins.bigquery_agent_analytics_plugin import (
    BigQueryAgentAnalyticsPlugin,
    BigQueryLoggerConfig,
)
from google.adk.agents import Agent
from google.adk.models.google_llm import Gemini
from google.adk.tools.bigquery import BigQueryToolset, BigQueryCredentialsConfig

# --- OpenTelemetry Initialization (Optional) ---
# Recommended for enabling distributed tracing (populates trace_id, span_id).
# If not configured, the plugin uses internal UUIDs for span correlation.
try:
    from opentelemetry import trace
    from opentelemetry.sdk.trace import TracerProvider

    trace.set_tracer_provider(TracerProvider())
except ImportError:
    pass  # OpenTelemetry is optional

# --- Configuration ---
PROJECT_ID = os.environ.get("GOOGLE_CLOUD_PROJECT", "your-gcp-project-id")
DATASET_ID = os.environ.get("BIG_QUERY_DATASET_ID", "your-big-query-dataset-id")
LOCATION = os.environ.get("GOOGLE_CLOUD_LOCATION", "US")  # default location is US in the plugin
GCS_BUCKET = os.environ.get("GCS_BUCKET_NAME", "your-gcs-bucket-name")  # Optional

if PROJECT_ID == "your-gcp-project-id":
    raise ValueError("Please set GOOGLE_CLOUD_PROJECT or update the code.")

# --- CRITICAL: Set environment variables BEFORE Gemini instantiation ---
os.environ['GOOGLE_CLOUD_PROJECT'] = PROJECT_ID
os.environ['GOOGLE_CLOUD_LOCATION'] = LOCATION
os.environ['GOOGLE_GENAI_USE_VERTEXAI'] = 'True'

# --- Initialize the Plugin with Config ---
bq_config = BigQueryLoggerConfig(
    enabled=True,
    gcs_bucket_name=GCS_BUCKET,  # Enable GCS offloading for multimodal content
    log_multi_modal_content=True,
    max_content_length=500 * 1024,  # 500 KB limit for inline text
    batch_size=1,  # Default is 1 for low latency; increase for high throughput
    shutdown_timeout=10.0,
)

bq_logging_plugin = BigQueryAgentAnalyticsPlugin(
    project_id=PROJECT_ID,
    dataset_id=DATASET_ID,
    table_id="agent_events_v2",  # default table name is agent_events_v2
    config=bq_config,
    location=LOCATION,
)

# --- Initialize Tools and Model ---
credentials, _ = google.auth.default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
bigquery_toolset = BigQueryToolset(
    credentials_config=BigQueryCredentialsConfig(credentials=credentials)
)

llm = Gemini(model="gemini-2.5-flash")

root_agent = Agent(
    model=llm,
    name='my_bq_agent',
    instruction="You are a helpful assistant with access to BigQuery tools.",
    tools=[bigquery_toolset],
)

# --- Create the App ---
app = App(
    name="my_bq_agent",
    root_agent=root_agent,
    plugins=[bq_logging_plugin],
)
```
### Run and test agent
Test the plugin by running the agent and making a few requests through the chat interface, such as "tell me what you can do" or "List datasets in my cloud project". These actions create events that are recorded in BigQuery in your Google Cloud project. Once the events have been processed, you can view them in the [BigQuery Console](https://console.cloud.google.com/bigquery) using this query:
```sql
SELECT timestamp, event_type, content
FROM `your-gcp-project-id.your-big-query-dataset-id.agent_events_v2`
ORDER BY timestamp DESC
LIMIT 20;
```
## Tracing and Observability
The plugin supports **OpenTelemetry** for distributed tracing.
- **Automatic Span Management**: The plugin automatically generates spans for Agent execution, LLM calls, and Tool executions.
- **OpenTelemetry Integration**: If an OpenTelemetry `TracerProvider` is configured (as shown in the example above), the plugin will use valid OTel spans, populating `trace_id`, `span_id`, and `parent_span_id` with standard OTel identifiers. This allows you to correlate agent logs with other services in your distributed system.
- **Fallback Mechanism**: If OpenTelemetry is not installed or configured, the plugin automatically falls back to generating internal UUIDs for spans and uses the `invocation_id` as the trace ID. This ensures that the parent-child hierarchy (Agent -> Span -> Tool/LLM) is *always* preserved in the BigQuery logs, even without a full OTel setup.
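The fallback behavior can be illustrated with a short sketch. This is illustrative only: the function name and structure are not part of the plugin's API, and the span-context object stands in for an OpenTelemetry `SpanContext`.

```python
import uuid

def span_ids(invocation_id, otel_span_context=None):
    """Illustrates the documented fallback: use OTel IDs when a tracer is
    configured, otherwise derive IDs from the invocation (hypothetical helper)."""
    if otel_span_context is not None:
        # Standard OTel identifiers: 32-char hex trace ID, 16-char hex span ID.
        return (
            format(otel_span_context.trace_id, "032x"),
            format(otel_span_context.span_id, "016x"),
        )
    # Fallback: invocation_id doubles as the trace ID; span ID is an internal UUID.
    return invocation_id, uuid.uuid4().hex[:16]

trace_id, span_id = span_ids("e-b55b2000-68c6-4e8b-b3b3-ffb454a92e40")
```

Either way, every row carries a `trace_id`/`span_id` pair, so the parent-child hierarchy survives in the logs.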
## Configuration options
You can customize the plugin using `BigQueryLoggerConfig`.
- **`enabled`** (`bool`, default: `True`): Set to `False` to disable logging of agent data to the BigQuery table.
- **`clustering_fields`** (`List[str]`, default: `["event_type", "agent", "user_id"]`): The fields used to cluster the BigQuery table when it is automatically created.
- **`gcs_bucket_name`** (`Optional[str]`, default: `None`): The name of the GCS bucket to which large content (images, blobs, large text) is offloaded. If not provided, large content may be truncated or replaced with placeholders.
- **`connection_id`** (`Optional[str]`, default: `None`): The BigQuery connection ID (e.g., `us.my-connection`) to use as the authorizer for `ObjectRef` columns. Required for using `ObjectRef` with BigQuery ML.
- **`max_content_length`** (`int`, default: `500 * 1024`): The maximum length (in characters) of text content to store **inline** in BigQuery before offloading to GCS (if configured) or truncating. Default is 500 KB.
- **`batch_size`** (`int`, default: `1`): The number of events to batch before writing to BigQuery.
- **`batch_flush_interval`** (`float`, default: `1.0`): The maximum time (in seconds) to wait before flushing a partial batch.
- **`shutdown_timeout`** (`float`, default: `10.0`): Seconds to wait for logs to flush during shutdown.
- **`event_allowlist`** (`Optional[List[str]]`, default: `None`): A list of event types to log. If `None`, all events are logged except those in `event_denylist`. For a comprehensive list of supported event types, refer to the [Event types and payloads](#event-types) section.
- **`event_denylist`** (`Optional[List[str]]`, default: `None`): A list of event types to skip logging. For a comprehensive list of supported event types, refer to the [Event types and payloads](#event-types) section.
- **`content_formatter`** (`Optional[Callable[[Any, str], Any]]`, default: `None`): An optional function to format event content before logging.
- **`log_multi_modal_content`** (`bool`, default: `True`): Whether to log detailed content parts (including GCS references).
- **`queue_max_size`** (`int`, default: `10000`): The maximum number of events to hold in the in-memory queue before dropping new events.
- **`retry_config`** (`RetryConfig`, default: `RetryConfig()`): Configuration for retrying failed BigQuery writes (attributes: `max_retries`, `initial_delay`, `multiplier`, `max_delay`).
- **`log_session_metadata`** (`bool`, default: `True`): If True, logs metadata from the `session` object (e.g., `session.metadata`) into the `attributes` column.
- **`custom_tags`** (`Dict[str, Any]`, default: `{}`): A dictionary of static tags (e.g., `{"env": "prod", "version": "1.0"}`) to be included in the `attributes` column for every event.
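The interaction between `event_allowlist` and `event_denylist` described above can be sketched as a filter predicate. This is a hypothetical helper, not plugin code, and it assumes the allowlist takes precedence when both are set:

```python
def should_log(event_type, event_allowlist=None, event_denylist=None):
    """Sketch of the documented semantics: with an allowlist, only listed
    event types are logged; otherwise everything except denylisted types."""
    if event_allowlist is not None:
        return event_type in event_allowlist  # assumed to take precedence
    if event_denylist is not None:
        return event_type not in event_denylist
    return True  # neither list set: log all events

should_log("LLM_REQUEST", event_allowlist=["LLM_REQUEST"])     # True
should_log("TOOL_STARTING", event_denylist=["TOOL_STARTING"])  # False
```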
The following code sample shows how to define a configuration for the BigQuery Agent Analytics plugin:
```python
import json
import re
from typing import Any

from google.adk.plugins.bigquery_agent_analytics_plugin import (
    BigQueryAgentAnalyticsPlugin,
    BigQueryLoggerConfig,
)

def redact_dollar_amounts(event_content: Any) -> str:
    """Custom formatter to redact dollar amounts (e.g., $600, $12.50)
    and ensure JSON output if the input is a dict.
    """
    if isinstance(event_content, dict):
        text_content = json.dumps(event_content)
    else:
        text_content = str(event_content)
    # Regex to find dollar amounts: $ followed by digits, optionally with commas or decimals.
    # Examples: $600, $1,200.50, $0.99
    return re.sub(r'\$\d+(?:,\d{3})*(?:\.\d+)?', 'xxx', text_content)

config = BigQueryLoggerConfig(
    enabled=True,
    event_allowlist=["LLM_REQUEST", "LLM_RESPONSE"],  # Only log these events
    # event_denylist=["TOOL_STARTING"],  # Skip these events
    shutdown_timeout=10.0,  # Wait up to 10s for logs to flush on exit
    client_close_timeout=2.0,  # Wait up to 2s for the BigQuery client to close
    max_content_length=500,  # Truncate content to 500 chars
    content_formatter=redact_dollar_amounts,  # Redact dollar amounts in logged content
    queue_max_size=10000,  # Max events to hold in memory
    # retry_config=RetryConfig(max_retries=3),  # Optional: configure retries
)

plugin = BigQueryAgentAnalyticsPlugin(..., config=config)
```
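The documented `RetryConfig` attributes (`max_retries`, `initial_delay`, `multiplier`, `max_delay`) suggest a capped exponential backoff. The sketch below illustrates that schedule; it is an assumption about the retry shape, not the plugin's actual implementation:

```python
def backoff_delays(max_retries=3, initial_delay=1.0, multiplier=2.0, max_delay=30.0):
    """Delay (in seconds) before each retry attempt, capped at max_delay."""
    return [min(initial_delay * multiplier ** attempt, max_delay)
            for attempt in range(max_retries)]

backoff_delays()                   # [1.0, 2.0, 4.0]
backoff_delays(5, 1.0, 4.0, 30.0)  # [1.0, 4.0, 16.0, 30.0, 30.0]
```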
## Schema and production setup
### Schema Reference
The events table (`agent_events_v2`) uses a flexible schema. The following table provides a comprehensive reference with example values.
| Field Name | Type | Mode | Description | Example Value |
| ------------------ | ----------- | ---------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------- |
| **timestamp** | `TIMESTAMP` | `REQUIRED` | UTC timestamp of event creation. Acts as the primary ordering key and often the daily partitioning key. Precision is microsecond. | `2026-02-03 20:52:17 UTC` |
| **event_type** | `STRING` | `NULLABLE` | The canonical event category. Standard values include `LLM_REQUEST`, `LLM_RESPONSE`, `TOOL_STARTING`, `TOOL_COMPLETED`, `AGENT_STARTING`, `AGENT_COMPLETED`, `STATE_DELTA`. Used for high-level filtering. | `LLM_REQUEST` |
| **agent** | `STRING` | `NULLABLE` | The name of the agent responsible for this event. Defined during agent initialization or via the `root_agent_name` context. | `my_bq_agent` |
| **session_id** | `STRING` | `NULLABLE` | A persistent identifier for the entire conversation thread. Stays constant across multiple turns and sub-agent calls. | `04275a01-1649-4a30-b6a7-5b443c69a7bc` |
| **invocation_id** | `STRING` | `NULLABLE` | The unique identifier for a single execution turn or request cycle. Corresponds to `trace_id` in many contexts. | `e-b55b2000-68c6-4e8b-b3b3-ffb454a92e40` |
| **user_id** | `STRING` | `NULLABLE` | The identifier of the user (human or system) initiating the session. Extracted from the `User` object or metadata. | `test_user` |
| **trace_id** | `STRING` | `NULLABLE` | The **OpenTelemetry** Trace ID (32-char hex). Links all operations within a single distributed request lifecycle. | `e-b55b2000-68c6-4e8b-b3b3-ffb454a92e40` |
| **span_id** | `STRING` | `NULLABLE` | The **OpenTelemetry** Span ID (16-char hex). Uniquely identifies this specific atomic operation. | `69867a836cd94798be2759d8e0d70215` |
| **parent_span_id** | `STRING` | `NULLABLE` | The Span ID of the immediate caller. Used to reconstruct the parent-child execution tree (DAG). | `ef5843fe40764b4b8afec44e78044205` |
| **content** | `JSON` | `NULLABLE` | The primary event payload. Structure is polymorphic based on `event_type`. | `{"system_prompt": "You are...", "prompt": [{"role": "user", "content": "hello"}], "response": "Hi", "usage": {"total": 15}}` |
| **attributes** | `JSON` | `NULLABLE` | Metadata/Enrichment (usage stats, model info, custom tags). | `{"model": "gemini-2.5-flash", "usage_metadata": {"total_token_count": 15}, "state_delta": {"key": "val"}, "session_metadata": {"key": "val"}}` |
| **latency_ms** | `JSON` | `NULLABLE` | Performance metrics. Standard keys are `total_ms` (wall-clock duration) and `time_to_first_token_ms` (streaming latency). | `{"total_ms": 1250, "time_to_first_token_ms": 450}` |
| **status** | `STRING` | `NULLABLE` | High-level outcome. Values: `OK` (success) or `ERROR` (failure). | `OK` |
| **error_message** | `STRING` | `NULLABLE` | Human-readable exception message or stack trace fragment. Populated only when `status` is `ERROR`. | `Error 404: Dataset not found` |
| **is_truncated** | `BOOLEAN` | `NULLABLE` | `true` if `content` or `attributes` exceeded the BigQuery cell size limit (default 10MB) and were partially dropped. | `false` |
| **content_parts** | `RECORD` | `REPEATED` | Array of multi-modal segments (Text, Image, Blob). Used when content cannot be serialized as simple JSON (e.g., large binaries or GCS refs). | `[{"mime_type": "text/plain", "text": "hello"}]` |
The plugin automatically creates the table if it does not exist. However, for production, we recommend creating the table manually using the following DDL, which utilizes the **JSON** type for flexibility and **REPEATED RECORD**s for multimodal content.
**Recommended DDL:**
```sql
CREATE TABLE `your-gcp-project-id.adk_agent_logs.agent_events_v2`
(
  timestamp TIMESTAMP NOT NULL OPTIONS(description="The UTC time at which the event was logged."),
  event_type STRING OPTIONS(description="Indicates the type of event being logged (e.g., 'LLM_REQUEST', 'TOOL_COMPLETED')."),
  agent STRING OPTIONS(description="The name of the ADK agent or author associated with the event."),
  session_id STRING OPTIONS(description="A unique identifier to group events within a single conversation or user session."),
  invocation_id STRING OPTIONS(description="A unique identifier for each individual agent execution or turn within a session."),
  user_id STRING OPTIONS(description="The identifier of the user associated with the current session."),
  trace_id STRING OPTIONS(description="OpenTelemetry trace ID for distributed tracing."),
  span_id STRING OPTIONS(description="OpenTelemetry span ID for this specific operation."),
  parent_span_id STRING OPTIONS(description="OpenTelemetry parent span ID to reconstruct hierarchy."),
  content JSON OPTIONS(description="The event-specific data (payload) stored as JSON."),
  content_parts ARRAY<STRUCT<
    mime_type STRING,
    text STRING,
    part_index INT64,
    part_attributes STRING,
    storage_mode STRING,
    object_ref STRUCT<uri STRING, authorizer STRING, details JSON>
  >> OPTIONS(description="Detailed content parts for multi-modal data."),
  attributes JSON OPTIONS(description="Arbitrary key-value pairs for additional metadata (e.g., 'root_agent_name', 'model_version', 'usage_metadata', 'session_metadata', 'custom_tags')."),
  latency_ms JSON OPTIONS(description="Latency measurements (e.g., total_ms)."),
  status STRING OPTIONS(description="The outcome of the event, typically 'OK' or 'ERROR'."),
  error_message STRING OPTIONS(description="Populated if an error occurs."),
  is_truncated BOOLEAN OPTIONS(description="Flag indicating whether content was truncated.")
)
PARTITION BY DATE(timestamp)
CLUSTER BY event_type, agent, user_id;
```
### Event types and payloads
The `content` column contains a **JSON** object specific to the `event_type`. The `content_parts` column provides a structured view of the content, especially useful for images or offloaded data.
**Content truncation**
- Variable content fields are truncated to `max_content_length` (configured in `BigQueryLoggerConfig`, default 500KB).
- If `gcs_bucket_name` is configured, large content is offloaded to GCS instead of being truncated, and a reference is stored in `content_parts.object_ref`.
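The inline/offload/truncate decision can be summarized in a short sketch. This is a hypothetical helper, not plugin code; `GCS_REFERENCE` is the documented `content_parts.storage_mode` value, while the `INLINE` and `TRUNCATED` labels and the object path are placeholders for illustration:

```python
def plan_storage(text: str, max_content_length: int, gcs_bucket_name=None):
    """Sketch of the documented rule: inline if small enough, offload to GCS
    if a bucket is configured, otherwise truncate to max_content_length."""
    if len(text) <= max_content_length:
        return ("INLINE", text)
    if gcs_bucket_name:
        # Actual object path is assigned by the plugin; shown here as a placeholder.
        return ("GCS_REFERENCE", f"gs://{gcs_bucket_name}/...")
    return ("TRUNCATED", text[:max_content_length])
```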
#### LLM interactions (plugin lifecycle)
These events track the raw requests sent to and responses received from the LLM.
**1. LLM_REQUEST**
Captures the prompt sent to the model, including conversation history and system instructions.
```json
{
  "event_type": "LLM_REQUEST",
  "content": {
    "system_prompt": "You are a helpful assistant...",
    "prompt": [
      {
        "role": "user",
        "content": "hello how are you today"
      }
    ]
  },
  "attributes": {
    "model": "gemini-2.5-flash",
    "llm_config": {
      "temperature": 0.5,
      "top_p": 0.9
    }
  }
}
```
**2. LLM_RESPONSE**
Captures the model's output and token usage statistics.
```json
{
  "event_type": "LLM_RESPONSE",
  "content": {
    "response": "text: 'Hello! I'm doing well...'",
    "usage": {
      "completion": 19,
      "prompt": 10129,
      "total": 10148
    }
  },
  "attributes": {
    "usage_metadata": {
      "prompt_token_count": 10129,
      "candidates_token_count": 19,
      "total_token_count": 10148
    }
  },
  "latency_ms": {
    "time_to_first_token_ms": 2579,
    "total_ms": 2579
  }
}
```
#### Tool usage (plugin lifecycle)
These events track the execution of tools by the agent.
**3. TOOL_STARTING**
Logged when an agent begins executing a tool.
```json
{
  "event_type": "TOOL_STARTING",
  "content": {
    "tool": "list_dataset_ids",
    "args": {
      "project_id": "bigquery-public-data"
    }
  }
}
```
**4. TOOL_COMPLETED**
Logged when a tool execution finishes.
```json
{
  "event_type": "TOOL_COMPLETED",
  "content": {
    "tool": "list_dataset_ids",
    "result": [
      "austin_311",
      "austin_bikeshare"
    ]
  },
  "latency_ms": {
    "total_ms": 467
  }
}
```
#### State Management
These events track changes to the agent's state, typically triggered by tools.
**5. STATE_DELTA**
Tracks changes to the agent's internal state (e.g., token cache updates).
```json
{
  "event_type": "STATE_DELTA",
  "attributes": {
    "state_delta": {
      "bigquery_token_cache": "{\"token\": \"ya29...\", \"expiry\": \"...\"}"
    }
  }
}
```
#### Agent lifecycle & Generic Events
| **Event Type** | **Content (JSON) Structure** |
| ----------------------- | -------------------------------------------- |
| `INVOCATION_STARTING` | `{}` |
| `INVOCATION_COMPLETED` | `{}` |
| `AGENT_STARTING` | `"You are a helpful agent..."` |
| `AGENT_COMPLETED` | `{}` |
| `USER_MESSAGE_RECEIVED` | `{"text_summary": "Help me book a flight."}` |
#### GCS Offloading Examples (Multimodal & Large Text)
When `gcs_bucket_name` is configured, large text and multimodal content (images, audio, etc.) are automatically offloaded to GCS. The `content` column will contain a summary or placeholder, while `content_parts` contains the `object_ref` pointing to the GCS URI.
**Offloaded Text Example**
```json
{
  "event_type": "LLM_REQUEST",
  "content_parts": [
    {
      "part_index": 1,
      "mime_type": "text/plain",
      "storage_mode": "GCS_REFERENCE",
      "text": "AAAA... [OFFLOADED]",
      "object_ref": {
        "uri": "gs://haiyuan-adk-debug-verification-1765319132/2025-12-10/e-f9545d6d/ae5235e6_p1.txt",
        "authorizer": "us.bqml_connection",
        "details": {"gcs_metadata": {"content_type": "text/plain"}}
      }
    }
  ]
}
```
**Offloaded Image Example**
```json
{
  "event_type": "LLM_REQUEST",
  "content_parts": [
    {
      "part_index": 2,
      "mime_type": "image/png",
      "storage_mode": "GCS_REFERENCE",
      "text": "[MEDIA OFFLOADED]",
      "object_ref": {
        "uri": "gs://haiyuan-adk-debug-verification-1765319132/2025-12-10/e-f9545d6d/ae5235e6_p2.png",
        "authorizer": "us.bqml_connection",
        "details": {"gcs_metadata": {"content_type": "image/png"}}
      }
    }
  ]
}
```
**Querying Offloaded Content (Get Signed URLs)**
```sql
SELECT
timestamp,
event_type,
part.mime_type,
part.storage_mode,
part.object_ref.uri AS gcs_uri,
-- Generate a signed URL to read the content directly (requires connection_id configuration)
STRING(OBJ.GET_ACCESS_URL(part.object_ref, 'r').access_urls.read_url) AS signed_url
FROM `your-gcp-project-id.your-dataset-id.agent_events_v2`,
UNNEST(content_parts) AS part
WHERE part.storage_mode = 'GCS_REFERENCE'
ORDER BY timestamp DESC
LIMIT 10;
```
## Advanced analysis queries
**Trace a specific conversation turn using trace_id**
```sql
SELECT timestamp, event_type, agent, JSON_VALUE(content, '$.response') as summary
FROM `your-gcp-project-id.your-dataset-id.agent_events_v2`
WHERE trace_id = 'your-trace-id'
ORDER BY timestamp ASC;
```
**Token usage analysis (accessing JSON fields)**
```sql
SELECT
AVG(CAST(JSON_VALUE(content, '$.usage.total') AS INT64)) as avg_tokens
FROM `your-gcp-project-id.your-dataset-id.agent_events_v2`
WHERE event_type = 'LLM_RESPONSE';
```
**Querying Multimodal Content (using content_parts and ObjectRef)**
```sql
SELECT
timestamp,
part.mime_type,
part.object_ref.uri as gcs_uri
FROM `your-gcp-project-id.your-dataset-id.agent_events_v2`,
UNNEST(content_parts) as part
WHERE part.mime_type LIKE 'image/%'
ORDER BY timestamp DESC;
```
**Analyze Multimodal Content with BigQuery Remote Model (Gemini)**
```sql
SELECT
logs.session_id,
-- Get a signed URL for the image
STRING(OBJ.GET_ACCESS_URL(parts.object_ref, "r").access_urls.read_url) as signed_url,
-- Analyze the image using a remote model (e.g., gemini-pro-vision)
AI.GENERATE(
('Describe this image briefly. What company logo?', parts.object_ref)
) AS generated_result
FROM
`your-gcp-project-id.your-dataset-id.agent_events_v2` logs,
UNNEST(logs.content_parts) AS parts
WHERE
parts.mime_type LIKE 'image/%'
ORDER BY logs.timestamp DESC
LIMIT 1;
```
**Latency Analysis (LLM & Tools)**
```sql
SELECT
event_type,
AVG(CAST(JSON_VALUE(latency_ms, '$.total_ms') AS INT64)) as avg_latency_ms
FROM `your-gcp-project-id.your-dataset-id.agent_events_v2`
WHERE event_type IN ('LLM_RESPONSE', 'TOOL_COMPLETED')
GROUP BY event_type;
```
**Span Hierarchy & Duration Analysis**
```sql
SELECT
span_id,
parent_span_id,
event_type,
timestamp,
-- Extract duration from latency_ms for completed operations
CAST(JSON_VALUE(latency_ms, '$.total_ms') AS INT64) as duration_ms,
-- Identify the specific tool or operation
COALESCE(
JSON_VALUE(content, '$.tool'),
'LLM_CALL'
) as operation
FROM `your-gcp-project-id.your-dataset-id.agent_events_v2`
WHERE trace_id = 'your-trace-id'
AND event_type IN ('LLM_RESPONSE', 'TOOL_COMPLETED')
ORDER BY timestamp ASC;
```
**AI-Powered Root Cause Analysis (Agent Ops)**
Automatically analyze failed sessions to determine the root cause of errors using BigQuery ML and Gemini.
```sql
DECLARE failed_session_id STRING;
-- Find a recent failed session
SET failed_session_id = (
SELECT session_id
FROM `your-gcp-project-id.your-dataset-id.agent_events_v2`
WHERE error_message IS NOT NULL
ORDER BY timestamp DESC
LIMIT 1
);
-- Reconstruct the full conversation context
WITH SessionContext AS (
SELECT
session_id,
STRING_AGG(CONCAT(event_type, ': ', COALESCE(TO_JSON_STRING(content), '')), '\n' ORDER BY timestamp) as full_history
FROM `your-gcp-project-id.your-dataset-id.agent_events_v2`
WHERE session_id = failed_session_id
GROUP BY session_id
)
-- Ask Gemini to diagnose the issue
SELECT
session_id,
AI.GENERATE(
('Analyze this conversation log and explain the root cause of the failure. Log: ', full_history),
connection_id => 'your-gcp-project-id.us.my-connection',
endpoint => 'gemini-2.5-flash'
).result AS root_cause_explanation
FROM SessionContext;
```
## Conversational Analytics in BigQuery
You can also use [BigQuery Conversational Analytics](https://cloud.google.com/bigquery/docs/conversational-analytics) to analyze your agent logs using natural language. Use this tool to answer questions like:
- "Show me the error rate over time"
- "What are the most common tool calls?"
- "Identify sessions with high token usage"
## Looker Studio Dashboard
You can visualize your agent's performance using our pre-built [Looker Studio Dashboard template](https://lookerstudio.google.com/c/reporting/f1c5b513-3095-44f8-90a2-54953d41b125/page/8YdhF).
To connect this dashboard to your own BigQuery table, use the following link format, replacing the placeholders with your specific project, dataset, and table IDs:
```text
https://lookerstudio.google.com/reporting/create?c.reportId=f1c5b513-3095-44f8-90a2-54953d41b125&ds.ds3.connector=bigQuery&ds.ds3.type=TABLE&ds.ds3.projectId=&ds.ds3.datasetId=&ds.ds3.tableId=
```
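If you prefer to build that link programmatically, the query parameters can be assembled with the standard library. The parameter names below are exactly those in the template above; only the helper function itself is illustrative:

```python
from urllib.parse import urlencode

def looker_studio_link(project_id: str, dataset_id: str, table_id: str) -> str:
    """Build the Looker Studio create-report URL for the dashboard template."""
    params = {
        "c.reportId": "f1c5b513-3095-44f8-90a2-54953d41b125",
        "ds.ds3.connector": "bigQuery",
        "ds.ds3.type": "TABLE",
        "ds.ds3.projectId": project_id,
        "ds.ds3.datasetId": dataset_id,
        "ds.ds3.tableId": table_id,
    }
    return "https://lookerstudio.google.com/reporting/create?" + urlencode(params)
```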
## Additional resources
- [BigQuery Storage Write API](https://cloud.google.com/bigquery/docs/write-api)
- [Introduction to Object Tables](https://cloud.google.com/bigquery/docs/object-tables-intro)
- [Interactive Demo Notebook](https://github.com/haiyuan-eng-google/demo_BQ_agent_analytics_plugin_notebook)
# BigQuery tool for ADK
Supported in ADK Python v1.1.0
These are a set of tools that provide integration with BigQuery, namely:
- **`list_dataset_ids`**: Fetches BigQuery dataset ids present in a GCP project.
- **`get_dataset_info`**: Fetches metadata about a BigQuery dataset.
- **`list_table_ids`**: Fetches table ids present in a BigQuery dataset.
- **`get_table_info`**: Fetches metadata about a BigQuery table.
- **`execute_sql`**: Runs a SQL query in BigQuery and fetches the result.
- **`forecast`**: Runs a BigQuery AI time series forecast using the `AI.FORECAST` function.
- **`ask_data_insights`**: Answers questions about data in BigQuery tables using natural language.
They are packaged in the toolset `BigQueryToolset`.
```py
# Copyright 2025 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import asyncio

from google.adk.agents import Agent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.adk.tools.bigquery import BigQueryCredentialsConfig
from google.adk.tools.bigquery import BigQueryToolset
from google.adk.tools.bigquery.config import BigQueryToolConfig
from google.adk.tools.bigquery.config import WriteMode
from google.genai import types
import google.auth

# Define constants for this example agent
AGENT_NAME = "bigquery_agent"
APP_NAME = "bigquery_app"
USER_ID = "user1234"
SESSION_ID = "1234"
GEMINI_MODEL = "gemini-2.0-flash"

# Define a tool configuration to block any write operations
tool_config = BigQueryToolConfig(write_mode=WriteMode.BLOCKED)

# Use Application Default Credentials (ADC) for BigQuery authentication
# https://cloud.google.com/docs/authentication/provide-credentials-adc
application_default_credentials, _ = google.auth.default()
credentials_config = BigQueryCredentialsConfig(
    credentials=application_default_credentials
)

# Instantiate a BigQuery toolset
bigquery_toolset = BigQueryToolset(
    credentials_config=credentials_config, bigquery_tool_config=tool_config
)

# Agent Definition
bigquery_agent = Agent(
    model=GEMINI_MODEL,
    name=AGENT_NAME,
    description=(
        "Agent to answer questions about BigQuery data and models and execute"
        " SQL queries."
    ),
    instruction="""\
You are a data science agent with access to several BigQuery tools.
Make use of those tools to answer the user's questions.
""",
    tools=[bigquery_toolset],
)

# Session and Runner
session_service = InMemorySessionService()
session = asyncio.run(
    session_service.create_session(
        app_name=APP_NAME, user_id=USER_ID, session_id=SESSION_ID
    )
)
runner = Runner(
    agent=bigquery_agent, app_name=APP_NAME, session_service=session_service
)

# Agent Interaction
def call_agent(query):
    """Helper function to call the agent with a query."""
    content = types.Content(role="user", parts=[types.Part(text=query)])
    events = runner.run(user_id=USER_ID, session_id=SESSION_ID, new_message=content)
    print("USER:", query)
    for event in events:
        if event.is_final_response():
            final_response = event.content.parts[0].text
            print("AGENT:", final_response)

call_agent("Are there any ml datasets in bigquery-public-data project?")
call_agent("Tell me more about ml_datasets.")
call_agent("Which all tables does it have?")
call_agent("Tell me more about the census_adult_income table.")
call_agent("How many rows are there per income bracket?")
call_agent(
    "What is the statistical correlation between education_num, age, and the income_bracket?"
)
```
Note: If you want to access a BigQuery data agent as a tool, see [Data Agents tools for ADK](https://google.github.io/adk-docs/integrations/data-agent/index.md).
# Bigtable tool for ADK
Supported in ADK Python v1.12.0
These are a set of tools that provide integration with Bigtable, namely:
- **`list_instances`**: Fetches Bigtable instances in a Google Cloud project.
- **`get_instance_info`**: Fetches metadata about a Bigtable instance.
- **`list_tables`**: Fetches tables in a Bigtable instance.
- **`get_table_info`**: Fetches metadata about a Bigtable table.
- **`execute_sql`**: Runs a SQL query against a Bigtable table and fetches the result.
They are packaged in the toolset `BigtableToolset`.
```py
# Copyright 2025 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import asyncio

from google.adk.agents import Agent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.adk.tools.google_tool import GoogleTool
from google.adk.tools.bigtable import query_tool
from google.adk.tools.bigtable.settings import BigtableToolSettings
from google.adk.tools.bigtable.bigtable_credentials import BigtableCredentialsConfig
from google.adk.tools.bigtable.bigtable_toolset import BigtableToolset
from google.genai import types
from google.adk.tools.tool_context import ToolContext
import google.auth
from google.auth.credentials import Credentials

# Define constants for this example agent
AGENT_NAME = "bigtable_agent"
APP_NAME = "bigtable_app"
USER_ID = "user1234"
SESSION_ID = "1234"
GEMINI_MODEL = "gemini-2.5-flash"

# Define Bigtable tool config with read capability set to allowed.
tool_settings = BigtableToolSettings()

# Define a credentials config - in this example we are using application default
# credentials
# https://cloud.google.com/docs/authentication/provide-credentials-adc
application_default_credentials, _ = google.auth.default()
credentials_config = BigtableCredentialsConfig(
    credentials=application_default_credentials
)

# Instantiate a Bigtable toolset
bigtable_toolset = BigtableToolset(
    credentials_config=credentials_config, bigtable_tool_settings=tool_settings
)

# Optional
# Create a wrapped function tool for the agent on top of the built-in
# `execute_sql` tool in the bigtable toolset.
# For example, this customized tool can perform a dynamically-built query.
def count_rows_tool(
    table_name: str,
    credentials: Credentials,  # GoogleTool handles `credentials`
    settings: BigtableToolSettings,  # GoogleTool handles `settings`
    tool_context: ToolContext,  # GoogleTool handles `tool_context`
):
    """Counts the total number of rows for a specified table.

    Args:
        table_name: The name of the table for which to count rows.

    Returns:
        The total number of rows in the table.
    """
    # Replace the following settings for a specific bigtable database.
    PROJECT_ID = ""
    INSTANCE_ID = ""
    query = f"""
        SELECT count(*) FROM {table_name}
    """
    return query_tool.execute_sql(
        project_id=PROJECT_ID,
        instance_id=INSTANCE_ID,
        query=query,
        credentials=credentials,
        settings=settings,
        tool_context=tool_context,
    )

# Agent Definition
bigtable_agent = Agent(
    model=GEMINI_MODEL,
    name=AGENT_NAME,
    description=(
        "Agent to answer questions about bigtable database and execute SQL queries."
    ),
    instruction="""\
You are a data assistant agent with access to several bigtable tools.
Make use of those tools to answer the user's questions.
""",
    tools=[
        bigtable_toolset,
        # Add customized bigtable tool based on the built-in bigtable toolset.
        GoogleTool(
            func=count_rows_tool,
            credentials_config=credentials_config,
            tool_settings=tool_settings,
        ),
    ],
)

# Session and Runner
session_service = InMemorySessionService()
session = asyncio.run(
    session_service.create_session(
        app_name=APP_NAME, user_id=USER_ID, session_id=SESSION_ID
    )
)
runner = Runner(
    agent=bigtable_agent, app_name=APP_NAME, session_service=session_service
)

# Agent Interaction
def call_agent(query):
    """Helper function to call the agent with a query."""
    content = types.Content(role="user", parts=[types.Part(text=query)])
    events = runner.run(user_id=USER_ID, session_id=SESSION_ID, new_message=content)
    print("USER:", query)
    for event in events:
        if event.is_final_response():
            final_response = event.content.parts[0].text
            print("AGENT:", final_response)

# Replace the bigtable instance and table names below with your own.
call_agent("List all tables in projects/<PROJECT_ID>/instances/<INSTANCE_ID>")
call_agent("List the top 5 rows in <TABLE_NAME>")
```
# Cartesia MCP tool for ADK
Supported in ADK Python and TypeScript
The [Cartesia MCP Server](https://github.com/cartesia-ai/cartesia-mcp) connects your ADK agent to the [Cartesia](https://cartesia.ai/) AI audio platform. This integration gives your agent the ability to generate speech, localize voices across languages, and create audio content using natural language.
## Use cases
- **Text-to-Speech Generation**: Convert text into natural-sounding speech using Cartesia's diverse voice library, with control over voice selection and output format.
- **Voice Localization**: Transform existing voices into different languages while preserving the original speaker's characteristics—ideal for multilingual content creation.
- **Audio Infill**: Fill gaps between audio segments to create smooth transitions, useful for podcast editing or audiobook production.
- **Voice Transformation**: Convert audio clips to sound like different voices from Cartesia's library.
## Prerequisites
- Sign up for a [Cartesia account](https://play.cartesia.ai/sign-in)
- Generate an [API key](https://play.cartesia.ai/keys) from the Cartesia playground
## Use with agent
```python
from google.adk.agents import Agent
from google.adk.tools.mcp_tool import McpToolset
from google.adk.tools.mcp_tool.mcp_session_manager import StdioConnectionParams
from mcp import StdioServerParameters

CARTESIA_API_KEY = "YOUR_CARTESIA_API_KEY"

root_agent = Agent(
    model="gemini-2.5-pro",
    name="cartesia_agent",
    instruction="Help users generate speech and work with audio content",
    tools=[
        McpToolset(
            connection_params=StdioConnectionParams(
                server_params=StdioServerParameters(
                    command="uvx",
                    args=["cartesia-mcp"],
                    env={
                        "CARTESIA_API_KEY": CARTESIA_API_KEY,
                        # "OUTPUT_DIRECTORY": "/path/to/output",  # Optional
                    },
                ),
                timeout=30,
            ),
        )
    ],
)
```
```typescript
import { LlmAgent, MCPToolset } from "@google/adk";
const CARTESIA_API_KEY = "YOUR_CARTESIA_API_KEY";
const rootAgent = new LlmAgent({
model: "gemini-2.5-pro",
name: "cartesia_agent",
instruction: "Help users generate speech and work with audio content",
tools: [
new MCPToolset({
type: "StdioConnectionParams",
serverParams: {
command: "uvx",
args: ["cartesia-mcp"],
env: {
CARTESIA_API_KEY: CARTESIA_API_KEY,
// OUTPUT_DIRECTORY: "/path/to/output", // Optional
},
},
}),
],
});
export { rootAgent };
```
## Available tools
| Tool | Description |
| ---------------- | ---------------------------------------------- |
| `text_to_speech` | Convert text to audio using a specified voice |
| `list_voices` | List all available Cartesia voices |
| `get_voice` | Get details about a specific voice |
| `clone_voice` | Clone a voice from audio samples |
| `update_voice` | Update an existing voice |
| `delete_voice` | Delete a voice from your library |
| `localize_voice` | Transform a voice into a different language |
| `voice_change` | Convert an audio file to use a different voice |
| `infill` | Fill gaps between audio segments |
## Configuration
The Cartesia MCP server can be configured using environment variables:
| Variable | Description | Required |
| ------------------ | ---------------------------------------- | -------- |
| `CARTESIA_API_KEY` | Your Cartesia API key | Yes |
| `OUTPUT_DIRECTORY` | Directory to store generated audio files | No |
## Additional resources
- [Cartesia MCP Server Repository](https://github.com/cartesia-ai/cartesia-mcp)
- [Cartesia MCP Documentation](https://docs.cartesia.ai/integrations/mcp)
- [Cartesia Playground](https://play.cartesia.ai/)
# Chroma MCP tool for ADK
Supported in ADK: Python, TypeScript
The [Chroma MCP Server](https://github.com/chroma-core/chroma-mcp) connects your ADK agent to [Chroma](https://www.trychroma.com/), an open-source embedding database. This integration gives your agent the ability to create collections, store documents, and retrieve information using semantic search, full text search, and metadata filtering.
## Use cases
- **Semantic Memory for Agents**: Store conversation context, facts, or learned information that agents can retrieve later using natural language queries.
- **Knowledge Base Retrieval**: Build a retrieval-augmented generation (RAG) system by storing documents and retrieving relevant context for responses.
- **Persistent Context Across Sessions**: Maintain long-term memory across conversations, allowing agents to reference past interactions and accumulated knowledge.
## Prerequisites
- **For local storage**: A directory path to persist data
- **For Chroma Cloud**: A [Chroma Cloud](https://www.trychroma.com/) account with tenant ID, database name, and API key
## Use with agent
```python
from google.adk.agents import Agent
from google.adk.tools.mcp_tool import McpToolset
from google.adk.tools.mcp_tool.mcp_session_manager import StdioConnectionParams
from mcp import StdioServerParameters
# For local storage, use:
DATA_DIR = "/path/to/your/data/directory"
# For Chroma Cloud, use:
# CHROMA_TENANT = "your-tenant-id"
# CHROMA_DATABASE = "your-database-name"
# CHROMA_API_KEY = "your-api-key"
root_agent = Agent(
model="gemini-2.5-pro",
name="chroma_agent",
instruction="Help users store and retrieve information using semantic search",
tools=[
McpToolset(
connection_params=StdioConnectionParams(
server_params=StdioServerParameters(
command="uvx",
args=[
"chroma-mcp",
# For local storage, use:
"--client-type",
"persistent",
"--data-dir",
DATA_DIR,
# For Chroma Cloud, use:
# "--client-type",
# "cloud",
# "--tenant",
# CHROMA_TENANT,
# "--database",
# CHROMA_DATABASE,
# "--api-key",
# CHROMA_API_KEY,
],
),
timeout=30,
),
)
],
)
```
```typescript
import { LlmAgent, MCPToolset } from "@google/adk";
// For local storage, use:
const DATA_DIR = "/path/to/your/data/directory";
// For Chroma Cloud, use:
// const CHROMA_TENANT = "your-tenant-id";
// const CHROMA_DATABASE = "your-database-name";
// const CHROMA_API_KEY = "your-api-key";
const rootAgent = new LlmAgent({
model: "gemini-2.5-pro",
name: "chroma_agent",
instruction: "Help users store and retrieve information using semantic search",
tools: [
new MCPToolset({
type: "StdioConnectionParams",
serverParams: {
command: "uvx",
args: [
"chroma-mcp",
// For local storage, use:
"--client-type",
"persistent",
"--data-dir",
DATA_DIR,
// For Chroma Cloud, use:
// "--client-type",
// "cloud",
// "--tenant",
// CHROMA_TENANT,
// "--database",
// CHROMA_DATABASE,
// "--api-key",
// CHROMA_API_KEY,
],
},
}),
],
});
export { rootAgent };
```
## Available tools
### Collection management
| Tool | Description |
| ----------------------------- | -------------------------------------------------------- |
| `chroma_list_collections` | List all collections with pagination support |
| `chroma_create_collection` | Create a new collection with optional HNSW configuration |
| `chroma_get_collection_info` | Get detailed information about a collection |
| `chroma_get_collection_count` | Get the number of documents in a collection |
| `chroma_modify_collection` | Update a collection's name or metadata |
| `chroma_delete_collection` | Delete a collection |
| `chroma_peek_collection` | View a sample of documents in a collection |
### Document operations
| Tool | Description |
| ------------------------- | ------------------------------------------------------------- |
| `chroma_add_documents` | Add documents with optional metadata and custom IDs |
| `chroma_query_documents` | Query documents using semantic search with advanced filtering |
| `chroma_get_documents` | Retrieve documents by IDs or filters with pagination |
| `chroma_update_documents` | Update existing documents' content, metadata, or embeddings |
| `chroma_delete_documents` | Delete specific documents from a collection |
## Configuration
The Chroma MCP server supports multiple client types to suit different needs:
### Client types
| Client Type | Description | Key Arguments |
| ------------ | ---------------------------------------------------------- | -------------------------------------------------------- |
| `ephemeral` | In-memory storage, cleared on restart. Useful for testing. | None (default) |
| `persistent` | File-based storage on your local machine | `--data-dir` |
| `http` | Connect to a self-hosted Chroma server | `--host`, `--port`, `--ssl`, `--custom-auth-credentials` |
| `cloud` | Connect to Chroma Cloud (api.trychroma.com) | `--tenant`, `--database`, `--api-key` |
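The four client types differ only in which command-line arguments are passed to `chroma-mcp`. As a rough illustration, a helper like the following (hypothetical, not part of ADK or Chroma) could assemble the `args` list used in the agent examples above:

```python
# Hypothetical helper: builds the `args` list for launching chroma-mcp
# from a client type and its settings. Flags mirror the table above.
def chroma_mcp_args(client_type: str, **settings: str) -> list[str]:
    flag_map = {
        "ephemeral": [],
        "persistent": ["data-dir"],
        "http": ["host", "port", "ssl", "custom-auth-credentials"],
        "cloud": ["tenant", "database", "api-key"],
    }
    args = ["chroma-mcp", "--client-type", client_type]
    for flag in flag_map[client_type]:
        key = flag.replace("-", "_")  # e.g. "api-key" -> settings["api_key"]
        if key in settings:
            args += [f"--{flag}", settings[key]]
    return args

print(chroma_mcp_args("persistent", data_dir="/tmp/chroma"))
# ['chroma-mcp', '--client-type', 'persistent', '--data-dir', '/tmp/chroma']
```

The resulting list can be passed as the `args` of `StdioServerParameters`, exactly as in the Python example above.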
### Environment variables
You can also configure the client using environment variables. Command-line arguments take precedence over environment variables.
| Variable | Description |
| -------------------- | ---------------------------------------------------------- |
| `CHROMA_CLIENT_TYPE` | Client type: `ephemeral`, `persistent`, `http`, or `cloud` |
| `CHROMA_DATA_DIR` | Path for persistent local storage |
| `CHROMA_TENANT` | Tenant ID for Chroma Cloud |
| `CHROMA_DATABASE` | Database name for Chroma Cloud |
| `CHROMA_API_KEY` | API key for Chroma Cloud |
| `CHROMA_HOST` | Host for self-hosted HTTP client |
| `CHROMA_PORT` | Port for self-hosted HTTP client |
| `CHROMA_SSL` | Enable SSL for HTTP client (`true` or `false`) |
| `CHROMA_DOTENV_PATH` | Path to `.env` file (defaults to `.chroma_env`) |
## Additional resources
- [Chroma MCP Server Repository](https://github.com/chroma-core/chroma-mcp)
- [Chroma Documentation](https://docs.trychroma.com/)
- [Chroma Cloud](https://www.trychroma.com/)
# Google Cloud Trace observability for ADK
Supported in ADK: Python
With ADK, you can already inspect and observe your agent interactions locally using the web development UI discussed [here](https://google.github.io/adk-docs/evaluate/#debugging-with-the-trace-view). For cloud deployments, however, you need a centralized dashboard to observe real traffic.
Cloud Trace is a component of Google Cloud Observability. It is a powerful tool for monitoring, debugging, and improving the performance of your applications by focusing specifically on tracing capabilities. For Agent Development Kit (ADK) applications, Cloud Trace enables comprehensive tracing, helping you understand how requests flow through your agent's interactions and identify performance bottlenecks or errors within your AI agents.
## Overview
Cloud Trace is built on [OpenTelemetry](https://opentelemetry.io/), an open-source standard that supports many languages and ingestion methods for generating trace data. This aligns with observability practices for ADK applications, which also leverage OpenTelemetry-compatible instrumentation, allowing you to:
- **Trace agent interactions**: Cloud Trace continuously gathers and analyzes trace data from your project, enabling you to rapidly diagnose latency issues and errors within your ADK applications. This automatic data collection simplifies the process of identifying problems in complex agent workflows.
- **Debug issues**: Quickly diagnose latency issues and errors by analyzing detailed traces. This is crucial for understanding issues that manifest as increased communication latency across different services or during specific agent actions like tool calls.
- **In-depth analysis and visualization**: Trace Explorer is the primary tool for analyzing traces, offering visual aids like heatmaps for span duration and line charts for request/error rates. It also provides a spans table, groupable by service and operation, which gives one-click access to representative traces and a waterfall view to easily identify bottlenecks and sources of errors within your agent's execution path.
The following example assumes this agent directory structure:
```text
working_dir/
├── weather_agent/
│   ├── agent.py
│   └── __init__.py
├── deploy_agent_engine.py
├── deploy_fast_api_app.py
└── agent_runner.py
```
```python
# weather_agent/agent.py
import os
from google.adk.agents import Agent
os.environ.setdefault("GOOGLE_CLOUD_PROJECT", "{your-project-id}")
os.environ.setdefault("GOOGLE_CLOUD_LOCATION", "global")
os.environ.setdefault("GOOGLE_GENAI_USE_VERTEXAI", "True")
# Define a tool function
def get_weather(city: str) -> dict:
"""Retrieves the current weather report for a specified city.
Args:
city (str): The name of the city for which to retrieve the weather report.
Returns:
dict: status and result or error msg.
"""
if city.lower() == "new york":
return {
"status": "success",
"report": (
"The weather in New York is sunny with a temperature of 25 degrees"
" Celsius (77 degrees Fahrenheit)."
),
}
else:
return {
"status": "error",
"error_message": f"Weather information for '{city}' is not available.",
}
# Create an agent with tools
root_agent = Agent(
name="weather_agent",
model="gemini-2.5-flash",
description="Agent to answer questions using weather tools.",
instruction="You must use the available tools to find an answer.",
tools=[get_weather],
)
```
## Cloud Trace Setup
### Setup for Agent Engine Deployment
#### Agent Engine Deployment - from ADK CLI
You can enable cloud tracing by adding the `--trace_to_cloud` flag when deploying your agent with the `adk deploy agent_engine` command.
```bash
adk deploy agent_engine \
--project=$GOOGLE_CLOUD_PROJECT \
--region=$GOOGLE_CLOUD_LOCATION \
--staging_bucket=$STAGING_BUCKET \
--trace_to_cloud \
$AGENT_PATH
```
#### Agent Engine Deployment - from Python SDK
If you prefer the Python SDK, you can enable cloud tracing by setting `enable_tracing=True` when initializing the `AdkApp` object.
```python
# deploy_agent_engine.py
from vertexai.preview import reasoning_engines
from vertexai import agent_engines
from weather_agent.agent import root_agent
import vertexai
PROJECT_ID = "{your-project-id}"
LOCATION = "{your-preferred-location}"
STAGING_BUCKET = "{your-staging-bucket}"
vertexai.init(
project=PROJECT_ID,
location=LOCATION,
staging_bucket=STAGING_BUCKET,
)
adk_app = reasoning_engines.AdkApp(
agent=root_agent,
enable_tracing=True,
)
remote_app = agent_engines.create(
agent_engine=adk_app,
extra_packages=[
"./weather_agent",
],
requirements=[
"google-cloud-aiplatform[adk,agent_engines]",
],
)
```
### Setup for Cloud Run Deployment
#### Cloud Run Deployment - from ADK CLI
You can enable cloud tracing by adding the `--trace_to_cloud` flag when deploying your agent with the `adk deploy cloud_run` command.
```bash
adk deploy cloud_run \
--project=$GOOGLE_CLOUD_PROJECT \
--region=$GOOGLE_CLOUD_LOCATION \
--trace_to_cloud \
$AGENT_PATH
```
If you want to enable cloud tracing and are using a customized agent service deployment on Cloud Run, refer to the [Setup for Customized Deployment](#setup-for-customized-deployment) section below.
### Setup for Customized Deployment
#### From Built-in `get_fast_api_app` Module
If you want to customize your own agent service, you can enable cloud tracing by initializing the FastAPI app with the built-in `get_fast_api_app` module and setting `trace_to_cloud=True`.
```python
# deploy_fast_api_app.py
import os
from google.adk.cli.fast_api import get_fast_api_app
from fastapi import FastAPI
# Set GOOGLE_CLOUD_PROJECT environment variable for cloud tracing
os.environ.setdefault("GOOGLE_CLOUD_PROJECT", "{your-project-id}")
# Discover the `weather_agent` directory in current working dir
AGENT_DIR = os.path.dirname(os.path.abspath(__file__))
# Create FastAPI app with enabled cloud tracing
app: FastAPI = get_fast_api_app(
agents_dir=AGENT_DIR,
web=True,
trace_to_cloud=True,
)
app.title = "weather-agent"
app.description = "API for interacting with the Agent weather-agent"
# Main execution
if __name__ == "__main__":
import uvicorn
uvicorn.run(app, host="0.0.0.0", port=8080)
```
#### From Customized Agent Runner
If you want to fully customize your ADK agent runtime, you can enable cloud tracing by using the `CloudTraceSpanExporter` module from OpenTelemetry.
```python
# agent_runner.py
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from weather_agent.agent import root_agent as weather_agent
from google.genai.types import Content, Part
from opentelemetry import trace
from opentelemetry.exporter.cloud_trace import CloudTraceSpanExporter
from opentelemetry.sdk.trace import export
from opentelemetry.sdk.trace import TracerProvider
APP_NAME = "weather_agent"
USER_ID = "u_123"
SESSION_ID = "s_123"
provider = TracerProvider()
processor = export.BatchSpanProcessor(
CloudTraceSpanExporter(project_id="{your-project-id}")
)
provider.add_span_processor(processor)
trace.set_tracer_provider(provider)
session_service = InMemorySessionService()
runner = Runner(agent=weather_agent, app_name=APP_NAME, session_service=session_service)
async def main():
session = await session_service.get_session(
app_name=APP_NAME, user_id=USER_ID, session_id=SESSION_ID
)
if session is None:
session = await session_service.create_session(
app_name=APP_NAME, user_id=USER_ID, session_id=SESSION_ID
)
user_content = Content(
role="user", parts=[Part(text="what's weather in paris?")]
)
final_response_content = "No response"
async for event in runner.run_async(
user_id=USER_ID, session_id=SESSION_ID, new_message=user_content
):
if event.is_final_response() and event.content and event.content.parts:
final_response_content = event.content.parts[0].text
print(final_response_content)
if __name__ == "__main__":
import asyncio
asyncio.run(main())
```
## Inspect Cloud Traces
After the setup is complete, every interaction with the agent automatically sends trace data to Cloud Trace. You can inspect the traces by going to [console.cloud.google.com](https://console.cloud.google.com) and opening Trace Explorer in the configured Google Cloud project.
You will then see all traces produced by your ADK agent, organized under span names such as `invocation`, `agent_run`, `call_llm`, and `execute_tool`.
If you click one of the traces, you will see a waterfall view of the detailed process, similar to the view in the web development UI started with the `adk web` command.
## Resources
- [Google Cloud Trace Documentation](https://cloud.google.com/trace)
# Agent Engine Code Execution tool for ADK
Supported in ADK: Python v1.17.0 (Preview)
The Agent Engine Code Execution ADK Tool provides a low-latency, highly efficient method for running AI-generated code using the [Google Cloud Agent Engine](https://cloud.google.com/vertex-ai/generative-ai/docs/agent-engine/overview) service. This tool is designed for fast execution, tailored for agentic workflows, and uses sandboxed environments for improved security. The Code Execution tool allows code and data to persist over multiple requests, enabling complex, multi-step coding tasks, including:
- **Code development and debugging:** Create agent tasks that test and iterate on versions of code over multiple requests.
- **Code with data analysis:** Upload data files up to 100MB, and run multiple code-based analyses without the need to reload data for each code run.
This code execution tool is part of the Agent Engine suite; however, you do not have to deploy your agent to Agent Engine to use it. You can run your agent locally or with other services and still use this tool. For more information about the Code Execution feature in Agent Engine, see the [Agent Engine Code Execution](https://cloud.google.com/vertex-ai/generative-ai/docs/agent-engine/code-execution/overview) documentation.
Preview release
The Agent Engine Code Execution feature is a Preview release. For more information, see the [launch stage descriptions](https://cloud.google.com/products#product-launch-stages).
## Use the Tool
Using the Agent Engine Code Execution tool requires that you create a sandbox environment with Google Cloud Agent Engine before using the tool with an ADK agent.
To use the Code Execution tool with your ADK agent:
1. Follow the instructions in the Agent Engine [Code Execution quickstart](https://cloud.google.com/vertex-ai/generative-ai/docs/agent-engine/code-execution/quickstart) to create a code execution sandbox environment.
1. Create an ADK agent with settings to access the Google Cloud project where you created the sandbox environment.
1. The following code example shows an agent configured to use the Code Executor tool. Replace `SANDBOX_RESOURCE_NAME` with the sandbox environment resource name you created.
```python
from google.adk.agents.llm_agent import Agent
from google.adk.code_executors.agent_engine_sandbox_code_executor import AgentEngineSandboxCodeExecutor
root_agent = Agent(
model="gemini-2.5-flash",
name="agent_engine_code_execution_agent",
instruction="You are a helpful agent that can write and execute code to answer questions and solve problems.",
code_executor=AgentEngineSandboxCodeExecutor(
sandbox_resource_name="SANDBOX_RESOURCE_NAME",
),
)
```
For details on the expected format of the `sandbox_resource_name` value, and the alternative `agent_engine_resource_name` parameter, see [Configuration parameters](#config-parameters). For a more advanced example, including recommended system instructions for the tool, see the [Advanced example](#advanced-example) or the full [agent code example](https://github.com/google/adk-python/tree/main/contributing/samples/agent_engine_code_execution).
## How it works
The `AgentEngineSandboxCodeExecutor` tool maintains a single sandbox throughout an agent's task, meaning the sandbox's state persists across all operations within an ADK workflow session.
1. **Sandbox creation:** For multi-step tasks requiring code execution, the Agent Engine creates a sandbox with specified language and machine configurations, isolating the code execution environment. If no sandbox is pre-created, the code execution tool will automatically create one using default settings.
1. **Code execution with persistence:** AI-generated code for a tool call is streamed to the sandbox and then executed within the isolated environment. After execution, the sandbox *remains active* for subsequent tool calls within the same session, preserving variables, imported modules, and file state for the next tool call from the same agent.
1. **Result retrieval:** The standard output and any captured error streams are collected and passed back to the calling agent.
1. **Sandbox clean up:** Once the agent task or conversation concludes, the agent can explicitly delete the sandbox, or rely on the TTL feature of the sandbox specified when creating the sandbox.
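The persistence in step 2 is what allows multi-step tasks to build on earlier results. As a purely local analogy, with no Agent Engine involved, a sandbox that keeps a single namespace across executions behaves like this:

```python
class StatefulSandbox:
    """Local analogy of a stateful sandbox: every run shares one
    namespace, so variables defined by earlier snippets stay visible."""

    def __init__(self) -> None:
        self.namespace: dict = {}

    def run(self, code: str) -> None:
        # Executing against the same dict preserves variables between runs.
        exec(code, self.namespace)

sandbox = StatefulSandbox()
sandbox.run("total = 40")   # first "tool call" defines a variable
sandbox.run("total += 2")   # a later call reuses it, no re-initialization
print(sandbox.namespace["total"])  # → 42
```

The real sandbox adds process isolation, file persistence, and a TTL on top of this idea; the sketch only illustrates why a second tool call never needs to redefine earlier state.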
## Key benefits
- **Persistent state:** Solve complex tasks where data manipulation or variable context must carry over between multiple tool calls.
- **Targeted Isolation:** Provides robust process-level isolation, ensuring that tool code execution is safe while remaining lightweight.
- **Agent Engine integration:** Tightly integrated into the Agent Engine tool-use and orchestration layer.
- **Low-latency performance:** Designed for speed, allowing agents to execute complex tool-use workflows efficiently without significant overhead.
- **Flexible compute configurations:** Create sandboxes with specific programming language, processing power, and memory configurations.
## System requirements
The following requirements must be met to successfully use the Agent Engine Code Execution tool with your ADK agents:
- Google Cloud project with the Vertex AI API enabled
- The agent's service account requires the **roles/aiplatform.user** role, which allows it to:
  - Create, get, list, and delete code execution sandboxes
  - Execute code in code execution sandboxes
## Configuration parameters
The Agent Engine Code Execution tool has the following parameters. You must set one of the following resource parameters:
- **`sandbox_resource_name`**: The resource path of an existing sandbox environment, used for each tool call. The expected string format is as follows:
```text
projects/{$PROJECT_ID}/locations/{$LOCATION_ID}/reasoningEngines/{$REASONING_ENGINE_ID}/sandboxEnvironments/{$SANDBOX_ENVIRONMENT_ID}
# Example:
projects/my-vertex-agent-project/locations/us-central1/reasoningEngines/6842888880301111172/sandboxEnvironments/6545148888889161728
```
- **`agent_engine_resource_name`**: The Agent Engine resource name under which the tool creates a sandbox environment. The expected string format is as follows:
```text
projects/{$PROJECT_ID}/locations/{$LOCATION_ID}/reasoningEngines/{$REASONING_ENGINE_ID}
# Example:
projects/my-vertex-agent-project/locations/us-central1/reasoningEngines/6842888880301111172
```
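Both formats share the same path prefix, so a malformed value can be caught before the agent starts. The validator below is a hypothetical sketch, not part of ADK:

```python
import re

# Matches both resource-name formats shown above; the sandbox
# segment is optional so one pattern covers both parameters.
RESOURCE_RE = re.compile(
    r"^projects/(?P<project>[^/]+)/locations/(?P<location>[^/]+)"
    r"/reasoningEngines/(?P<engine>[^/]+)"
    r"(?:/sandboxEnvironments/(?P<sandbox>[^/]+))?$"
)

def parse_resource_name(name: str) -> dict:
    match = RESOURCE_RE.match(name)
    if match is None:
        raise ValueError(f"Malformed resource name: {name!r}")
    return {k: v for k, v in match.groupdict().items() if v is not None}

parts = parse_resource_name(
    "projects/my-vertex-agent-project/locations/us-central1"
    "/reasoningEngines/6842888880301111172/sandboxEnvironments/6545148888889161728"
)
print(parts["location"])  # → us-central1
```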
You can use the Google Cloud Agent Engine API to configure sandbox environments separately through a Google Cloud client connection, including the following settings:
- **Programming languages,** including Python and JavaScript
- **Compute environment**, including CPU and memory sizes
For more information on connecting to Google Cloud Agent Engine and configuring sandbox environments, see the Agent Engine [Code Execution quickstart](https://cloud.google.com/vertex-ai/generative-ai/docs/agent-engine/code-execution/quickstart#create_a_sandbox).
## Advanced example
The following example code shows how to use the Code Executor tool in an ADK agent. This example includes a `base_system_instruction` clause to set the operating guidelines for code execution. This instruction clause is optional, but strongly recommended for getting the best results from this tool.
````python
from google.adk.agents.llm_agent import Agent
from google.adk.code_executors.agent_engine_sandbox_code_executor import AgentEngineSandboxCodeExecutor
def base_system_instruction():
"""Returns: data science agent system instruction."""
return """
# Guidelines
**Objective:** Assist the user in achieving their data analysis goals, **with emphasis on avoiding assumptions and ensuring accuracy.** Reaching that goal can involve multiple steps. When you need to generate code, you **don't** need to solve the goal in one go. Only generate the next step at a time.
**Code Execution:** All code snippets provided will be executed within the sandbox environment.
**Statefulness:** All code snippets are executed and the variables stays in the environment. You NEVER need to re-initialize variables. You NEVER need to reload files. You NEVER need to re-import libraries.
**Output Visibility:** Always print the output of code execution to visualize results, especially for data exploration and analysis. For example:
- To look at the shape of a pandas.DataFrame, do:
```tool_code
print(df.shape)
```
The output will be presented to you as:
```tool_outputs
(49, 7)
```
- To display the result of a numerical computation:
```tool_code
x = 10 ** 9 - 12 ** 5
print(f'{{x=}}')
```
The output will be presented to you as:
```tool_outputs
x=999751168
```
- You **never** generate ```tool_outputs yourself.
- You can then use this output to decide on next steps.
- Print just variables (e.g., `print(f'{{variable=}}')`).
**No Assumptions:** **Crucially, avoid making assumptions about the nature of the data or column names.** Base findings solely on the data itself. Always use the information obtained from `explore_df` to guide your analysis.
**Available files:** Only use the files that are available as specified in the list of available files.
**Data in prompt:** Some queries contain the input data directly in the prompt. You have to parse that data into a pandas DataFrame. ALWAYS parse all the data. NEVER edit the data that are given to you.
**Answerability:** Some queries may not be answerable with the available data. In those cases, inform the user why you cannot process their query and suggest what type of data would be needed to fulfill their request.
"""
root_agent = Agent(
model="gemini-2.5-flash",
name="agent_engine_code_execution_agent",
instruction=base_system_instruction() + """
You need to assist the user with their queries by looking at the data and the context in the conversation.
Your final answer should summarize the code and code execution relevant to the user query.
You should include all pieces of data to answer the user query, such as the table from code execution results.
If you cannot answer the question directly, you should follow the guidelines above to generate the next step.
If the question can be answered directly without writing any code, you should do that.
If you don't have enough data to answer the question, you should ask for clarification from the user.
You should NEVER install any package on your own like `pip install ...`.
When plotting trends, you should make sure to sort and order the data by the x-axis.
""",
code_executor=AgentEngineSandboxCodeExecutor(
# Replace with your sandbox resource name if you already have one.
sandbox_resource_name="SANDBOX_RESOURCE_NAME",
# Replace with agent engine resource name used for creating sandbox if
# sandbox_resource_name is not set:
# agent_engine_resource_name="AGENT_ENGINE_RESOURCE_NAME",
),
)
````
For a complete version of an ADK agent using this example code, see the [agent_engine_code_execution sample](https://github.com/google/adk-python/tree/main/contributing/samples/agent_engine_code_execution).
# Gemini API Code Execution tool for ADK
Supported in ADK: Python v0.1.0, Java v0.2.0
The `built_in_code_execution` tool enables the agent to execute code, specifically when using Gemini 2 and higher models. This allows the model to perform tasks like calculations, data manipulation, or running small scripts.
Warning: Single tool per agent limitation
This tool can only be used ***by itself*** within an agent instance. For more information about this limitation and workarounds, see [Limitations for ADK tools](/adk-docs/tools/limitations/#one-tool-one-agent).
````py
# Copyright 2025 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import asyncio
from google.adk.agents import LlmAgent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.adk.code_executors import BuiltInCodeExecutor
from google.genai import types
AGENT_NAME = "calculator_agent"
APP_NAME = "calculator"
USER_ID = "user1234"
SESSION_ID = "session_code_exec_async"
GEMINI_MODEL = "gemini-2.0-flash"
# Agent Definition
code_agent = LlmAgent(
name=AGENT_NAME,
model=GEMINI_MODEL,
code_executor=BuiltInCodeExecutor(),
instruction="""You are a calculator agent.
When given a mathematical expression, write and execute Python code to calculate the result.
Return only the final numerical result as plain text, without markdown or code blocks.
""",
description="Executes Python code to perform calculations.",
)
# Session and Runner
session_service = InMemorySessionService()
session = asyncio.run(session_service.create_session(
app_name=APP_NAME, user_id=USER_ID, session_id=SESSION_ID
))
runner = Runner(agent=code_agent, app_name=APP_NAME,
session_service=session_service)
# Agent Interaction (Async)
async def call_agent_async(query):
content = types.Content(role="user", parts=[types.Part(text=query)])
print(f"\n--- Running Query: {query} ---")
final_response_text = "No final text response captured."
try:
# Use run_async
async for event in runner.run_async(
user_id=USER_ID, session_id=SESSION_ID, new_message=content
):
print(f"Event ID: {event.id}, Author: {event.author}")
# --- Check for specific parts FIRST ---
has_specific_part = False
if event.content and event.content.parts:
for part in event.content.parts: # Iterate through all parts
if part.executable_code:
# Access the actual code string via .code
print(
f" Debug: Agent generated code:\n```python\n{part.executable_code.code}\n```"
)
has_specific_part = True
elif part.code_execution_result:
# Access outcome and output correctly
print(
f" Debug: Code Execution Result: {part.code_execution_result.outcome} - Output:\n{part.code_execution_result.output}"
)
has_specific_part = True
# Also print any text parts found in any event for debugging
elif part.text and not part.text.isspace():
print(f" Text: '{part.text.strip()}'")
# Do not set has_specific_part=True here, as we want the final response logic below
# --- Check for final response AFTER specific parts ---
# Only consider it final if it doesn't have the specific code parts we just handled
if not has_specific_part and event.is_final_response():
if (
event.content
and event.content.parts
and event.content.parts[0].text
):
final_response_text = event.content.parts[0].text.strip()
print(f"==> Final Agent Response: {final_response_text}")
else:
print(
"==> Final Agent Response: [No text content in final event]")
except Exception as e:
print(f"ERROR during agent run: {e}")
print("-" * 30)
# Main async function to run the examples
async def main():
await call_agent_async("Calculate the value of (5 + 7) * 3")
await call_agent_async("What is 10 factorial?")
# Execute the main async function
try:
asyncio.run(main())
except RuntimeError as e:
# Handle specific error when running asyncio.run in an already running loop (like Jupyter/Colab)
if "cannot be called from a running event loop" in str(e):
print("\nRunning in an existing event loop (like Colab/Jupyter).")
print("Please run `await main()` in a notebook cell instead.")
# If in an interactive environment like a notebook, you might need to run:
# await main()
else:
raise e # Re-raise other runtime errors
````
````java
import com.google.adk.agents.BaseAgent;
import com.google.adk.agents.LlmAgent;
import com.google.adk.runner.Runner;
import com.google.adk.sessions.InMemorySessionService;
import com.google.adk.sessions.Session;
import com.google.adk.tools.BuiltInCodeExecutionTool;
import com.google.common.collect.ImmutableList;
import com.google.genai.types.Content;
import com.google.genai.types.Part;
public class CodeExecutionAgentApp {
private static final String AGENT_NAME = "calculator_agent";
private static final String APP_NAME = "calculator";
private static final String USER_ID = "user1234";
private static final String SESSION_ID = "session_code_exec_sync";
private static final String GEMINI_MODEL = "gemini-2.0-flash";
/**
* Calls the agent with a query and prints the interaction events and final response.
*
* @param runner The runner instance for the agent.
* @param query The query to send to the agent.
*/
public static void callAgent(Runner runner, String query) {
Content content =
Content.builder().role("user").parts(ImmutableList.of(Part.fromText(query))).build();
InMemorySessionService sessionService = (InMemorySessionService) runner.sessionService();
Session session =
sessionService
.createSession(APP_NAME, USER_ID, /* state= */ null, SESSION_ID)
.blockingGet();
System.out.println("\n--- Running Query: " + query + " ---");
final String[] finalResponseText = {"No final text response captured."};
try {
runner
.runAsync(session.userId(), session.id(), content)
.forEach(
event -> {
System.out.println("Event ID: " + event.id() + ", Author: " + event.author());
boolean hasSpecificPart = false;
if (event.content().isPresent() && event.content().get().parts().isPresent()) {
for (Part part : event.content().get().parts().get()) {
if (part.executableCode().isPresent()) {
System.out.println(
" Debug: Agent generated code:\n```python\n"
+ part.executableCode().get().code()
+ "\n```");
hasSpecificPart = true;
} else if (part.codeExecutionResult().isPresent()) {
System.out.println(
" Debug: Code Execution Result: "
+ part.codeExecutionResult().get().outcome()
+ " - Output:\n"
+ part.codeExecutionResult().get().output());
hasSpecificPart = true;
} else if (part.text().isPresent() && !part.text().get().trim().isEmpty()) {
System.out.println(" Text: '" + part.text().get().trim() + "'");
}
}
}
if (!hasSpecificPart && event.finalResponse()) {
if (event.content().isPresent()
&& event.content().get().parts().isPresent()
&& !event.content().get().parts().get().isEmpty()
&& event.content().get().parts().get().get(0).text().isPresent()) {
finalResponseText[0] =
event.content().get().parts().get().get(0).text().get().trim();
System.out.println("==> Final Agent Response: " + finalResponseText[0]);
} else {
System.out.println(
"==> Final Agent Response: [No text content in final event]");
}
}
});
} catch (Exception e) {
System.err.println("ERROR during agent run: " + e.getMessage());
e.printStackTrace();
}
System.out.println("------------------------------");
}
public static void main(String[] args) {
BuiltInCodeExecutionTool codeExecutionTool = new BuiltInCodeExecutionTool();
BaseAgent codeAgent =
LlmAgent.builder()
.name(AGENT_NAME)
.model(GEMINI_MODEL)
.tools(ImmutableList.of(codeExecutionTool))
.instruction(
"""
You are a calculator agent.
When given a mathematical expression, write and execute Python code to calculate the result.
Return only the final numerical result as plain text, without markdown or code blocks.
""")
.description("Executes Python code to perform calculations.")
.build();
InMemorySessionService sessionService = new InMemorySessionService();
Runner runner = new Runner(codeAgent, APP_NAME, null, sessionService);
callAgent(runner, "Calculate the value of (5 + 7) * 3");
callAgent(runner, "What is 10 factorial?");
}
}
````
# Gemini API Computer Use tool for ADK
Supported in ADK: Python v1.17.0 (Preview)
The Computer Use Toolset allows an agent to operate a computer's user interface, such as a web browser, to complete tasks. This tool uses a specific Gemini model and the [Playwright](https://playwright.dev/) testing tool to control a Chromium browser, and can interact with web pages by taking screenshots, clicking, typing, and navigating.
For more information about the computer use model, see Gemini API [Computer use](https://ai.google.dev/gemini-api/docs/computer-use) or the Google Cloud Vertex AI API [Computer use](https://cloud.google.com/vertex-ai/generative-ai/docs/computer-use).
Preview release
The Computer Use model and tool are a Preview release. For more information, see the [launch stage descriptions](https://cloud.google.com/products#product-launch-stages).
## Setup
You must install Playwright and its dependencies, including Chromium, to be able to use the Computer Use Toolset.
Recommended: create and activate a Python virtual environment
Create a Python virtual environment:
```shell
python -m venv .venv
```
Activate the Python virtual environment:

Windows CMD:

```console
.venv\Scripts\activate.bat
```

Windows PowerShell:

```console
.venv\Scripts\Activate.ps1
```

macOS/Linux:

```bash
source .venv/bin/activate
```
To set up the required software libraries for the Computer Use Toolset:
1. Install Python dependencies:
```console
pip install termcolor==3.1.0
pip install playwright==1.52.0
pip install browserbase==1.3.0
pip install rich
```
1. Install the Playwright dependencies, including the Chromium browser:
```console
playwright install-deps chromium
playwright install chromium
```
## Use the tool
Use the Computer Use Toolset by adding it as a tool to your agent. When you configure the tool, you must provide an implementation of the `BaseComputer` class, which defines an interface for an agent to use a computer. In the following example, the `PlaywrightComputer` class is defined for this purpose. You can find the code for this implementation in the `playwright.py` file of the [computer_use](https://github.com/google/adk-python/blob/main/contributing/samples/computer_use/playwright.py) agent sample project.
```python
from google.adk import Agent
from google.adk.tools.computer_use.computer_use_toolset import ComputerUseToolset
from .playwright import PlaywrightComputer
root_agent = Agent(
model='gemini-2.5-computer-use-preview-10-2025',
name='hello_world_agent',
description=(
'computer use agent that can operate a browser on a computer to finish'
' user tasks'
),
instruction='you are a computer use agent',
tools=[
ComputerUseToolset(computer=PlaywrightComputer(screen_size=(1280, 936)))
],
)
```
For a complete code example, see the [computer_use](https://github.com/google/adk-python/tree/main/contributing/samples/computer_use) agent sample project.
# Google Cloud Data Agents tool for ADK
Supported in ADK: Python v1.23.0
These tools provide integration with Data Agents powered by the [Conversational Analytics API](https://docs.cloud.google.com/gemini/docs/conversational-analytics-api/overview).
Data Agents are AI-powered agents that help you analyze your data using natural language. When configuring a Data Agent, you can choose from supported data sources, including **BigQuery**, **Looker**, and **Looker Studio**.
**Prerequisites**
Before using these tools, you must build and configure your Data Agents in Google Cloud:
- [Build a data agent using HTTP and Python](https://docs.cloud.google.com/gemini/docs/conversational-analytics-api/build-agent-http)
- [Build a data agent using the Python SDK](https://docs.cloud.google.com/gemini/docs/conversational-analytics-api/build-agent-sdk)
- [Create a data agent in BigQuery Studio](https://docs.cloud.google.com/bigquery/docs/create-data-agents#create_a_data_agent)
The `DataAgentToolset` includes the following tools:
- **`list_accessible_data_agents`**: Lists Data Agents you have permission to access in the configured GCP project.
- **`get_data_agent_info`**: Retrieves details about a specific Data Agent given its full resource name.
- **`ask_data_agent`**: Chats with a specific Data Agent using natural language.
```py
# Copyright 2026 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import asyncio
from google.adk.agents import Agent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.adk.tools.data_agent.config import DataAgentToolConfig
from google.adk.tools.data_agent.credentials import DataAgentCredentialsConfig
from google.adk.tools.data_agent.data_agent_toolset import DataAgentToolset
from google.genai import types
import google.auth
# Define constants for this example agent
AGENT_NAME = "data_agent_example"
APP_NAME = "data_agent_app"
USER_ID = "user1234"
SESSION_ID = "1234"
GEMINI_MODEL = "gemini-2.5-flash"
# Define tool configuration
tool_config = DataAgentToolConfig(
max_query_result_rows=100,
)
# Use Application Default Credentials (ADC)
# https://cloud.google.com/docs/authentication/provide-credentials-adc
application_default_credentials, _ = google.auth.default()
credentials_config = DataAgentCredentialsConfig(
credentials=application_default_credentials
)
# Instantiate a Data Agent toolset
da_toolset = DataAgentToolset(
credentials_config=credentials_config,
data_agent_tool_config=tool_config,
tool_filter=[
"list_accessible_data_agents",
"get_data_agent_info",
"ask_data_agent",
],
)
# Agent Definition
data_agent = Agent(
name=AGENT_NAME,
model=GEMINI_MODEL,
description="Agent to answer user questions using Data Agents.",
instruction=(
"## Persona\nYou are a helpful assistant that uses Data Agents"
" to answer user questions about their data.\n\n"
),
tools=[da_toolset],
)
# Session and Runner
session_service = InMemorySessionService()
session = asyncio.run(
session_service.create_session(
app_name=APP_NAME, user_id=USER_ID, session_id=SESSION_ID
)
)
runner = Runner(
agent=data_agent, app_name=APP_NAME, session_service=session_service
)
# Agent Interaction
def call_agent(query):
"""
Helper function to call the agent with a query.
"""
content = types.Content(role="user", parts=[types.Part(text=query)])
events = runner.run(user_id=USER_ID, session_id=SESSION_ID, new_message=content)
print("USER:", query)
for event in events:
if event.is_final_response():
final_response = event.content.parts[0].text
print("AGENT:", final_response)
call_agent("List accessible data agents in project .")
call_agent("Get information about .")
# The data agent in this example is configured with the BigQuery table:
# `bigquery-public-data.san_francisco.street_trees`
call_agent("Ask to count the rows in the table.")
call_agent("What are the columns in the table?")
call_agent("What are the top 5 tree species?")
call_agent("For those species, what is the distribution of legal status?")
```
# Daytona plugin for ADK
Supported in ADK: Python
The [Daytona ADK plugin](https://github.com/daytonaio/daytona-adk-plugin) connects your ADK agent to [Daytona](https://www.daytona.io/) sandboxes. This integration gives your agent the ability to execute code, run shell commands, and manage files in isolated environments, enabling secure execution of AI-generated code.
## Use cases
- **Secure Code Execution**: Run Python, JavaScript, and TypeScript code in isolated sandboxes without risking your local environment.
- **Shell Command Automation**: Execute shell commands with configurable timeouts and working directories for build tasks, installations, or system operations.
- **File Management**: Upload scripts and datasets to sandboxes, then retrieve generated outputs and results.
## Prerequisites
- A [Daytona](https://www.daytona.io/) account
- Daytona API key
## Installation
```bash
pip install daytona-adk
```
## Use with agent
```python
from daytona_adk import DaytonaPlugin
from google.adk.agents import Agent
plugin = DaytonaPlugin(
api_key="your-daytona-api-key" # Or set DAYTONA_API_KEY environment variable
)
root_agent = Agent(
model="gemini-2.5-pro",
name="sandbox_agent",
instruction="Help users execute code and commands in a secure sandbox",
tools=plugin.get_tools(),
)
```
## Available tools
| Tool | Description |
| ------------------------------------ | ---------------------------------------------- |
| `execute_code_in_daytona` | Execute Python, JavaScript, or TypeScript code |
| `execute_command_in_daytona` | Run shell commands |
| `upload_file_to_daytona` | Upload scripts or data files to the sandbox |
| `read_file_from_daytona` | Read script outputs or generated files |
| `start_long_running_command_daytona` | Start background processes (servers, watchers) |
## Learn more
For a detailed guide on building a code generator agent that writes, tests, and verifies code in secure sandboxes, check out [this guide](https://www.daytona.io/docs/en/google-adk-code-generator).
## Additional resources
- [Code Generator Agent Guide](https://www.daytona.io/docs/en/google-adk-code-generator)
- [Daytona ADK on PyPI](https://pypi.org/project/daytona-adk/)
- [Daytona ADK on GitHub](https://github.com/daytonaio/daytona-adk-plugin)
- [Daytona Documentation](https://www.daytona.io/docs)
# ElevenLabs MCP tool for ADK
Supported in ADK: Python, TypeScript
The [ElevenLabs MCP Server](https://github.com/elevenlabs/elevenlabs-mcp) connects your ADK agent to the [ElevenLabs](https://elevenlabs.io/) AI audio platform. This integration gives your agent the ability to generate speech, clone voices, transcribe audio, create sound effects, and build conversational AI experiences using natural language.
## Use cases
- **Text-to-Speech Generation**: Convert text into natural-sounding speech using a variety of voices, with fine-grained control over stability, style, and similarity settings.
- **Voice Cloning & Design**: Clone voices from audio samples or generate new voices from text descriptions of desired characteristics like age, gender, accent, and tone.
- **Audio Processing**: Isolate speech from background noise, convert audio to sound like different voices, or transcribe speech to text with speaker identification.
- **Sound Effects & Soundscapes**: Generate sound effects and ambient soundscapes from text descriptions, such as "a thunderstorm in a dense jungle with animals reacting to the weather."
## Prerequisites
- Sign up for an [ElevenLabs account](https://elevenlabs.io/app/sign-up)
- Generate an [API key](https://elevenlabs.io/app/settings/api-keys) from your account settings
## Use with agent
```python
from google.adk.agents import Agent
from google.adk.tools.mcp_tool import McpToolset
from google.adk.tools.mcp_tool.mcp_session_manager import StdioConnectionParams
from mcp import StdioServerParameters
ELEVENLABS_API_KEY = "YOUR_ELEVENLABS_API_KEY"
root_agent = Agent(
model="gemini-2.5-pro",
name="elevenlabs_agent",
instruction="Help users generate speech, clone voices, and process audio",
tools=[
McpToolset(
connection_params=StdioConnectionParams(
server_params=StdioServerParameters(
command="uvx",
args=["elevenlabs-mcp"],
env={
"ELEVENLABS_API_KEY": ELEVENLABS_API_KEY,
}
),
timeout=30,
),
)
],
)
```
```typescript
import { LlmAgent, MCPToolset } from "@google/adk";
const ELEVENLABS_API_KEY = "YOUR_ELEVENLABS_API_KEY";
const rootAgent = new LlmAgent({
model: "gemini-2.5-pro",
name: "elevenlabs_agent",
instruction: "Help users generate speech, clone voices, and process audio",
tools: [
new MCPToolset({
type: "StdioConnectionParams",
serverParams: {
command: "uvx",
args: ["elevenlabs-mcp"],
env: {
ELEVENLABS_API_KEY: ELEVENLABS_API_KEY,
},
},
}),
],
});
export { rootAgent };
```
## Available tools
### Text-to-speech and voice
| Tool | Description |
| --------------------------- | ------------------------------------------------- |
| `text_to_speech` | Generate speech from text using a specified voice |
| `speech_to_speech` | Transform audio to sound like a different voice |
| `text_to_voice` | Generate a voice preview from text description |
| `create_voice_from_preview` | Save a generated voice preview to your library |
| `voice_clone` | Clone a voice from audio samples |
| `get_voice` | Get details about a specific voice |
| `search_voices` | Search for voices in your library |
| `search_voice_library` | Search the public voice library |
| `list_models` | List available text-to-speech models |
### Audio processing
| Tool | Description |
| ------------------------- | ---------------------------------------------------- |
| `speech_to_text` | Transcribe audio to text with speaker identification |
| `text_to_sound_effects` | Generate sound effects from text descriptions |
| `isolate_audio` | Separate speech from background noise and music |
| `play_audio` | Play an audio file locally |
| `compose_music` | Generate music from a description |
| `create_composition_plan` | Create a plan for music composition |
### Conversational AI
| Tool | Description |
| ----------------------------- | ---------------------------------------------- |
| `create_agent` | Create a conversational AI agent |
| `get_agent` | Get details about a specific agent |
| `list_agents` | List all your conversational AI agents |
| `add_knowledge_base_to_agent` | Add a knowledge base to an agent |
| `make_outbound_call` | Initiate an outbound phone call using an agent |
| `list_phone_numbers` | List available phone numbers |
| `get_conversation` | Get details about a specific conversation |
| `list_conversations` | List all conversations |
### Account
| Tool | Description |
| -------------------- | ---------------------------------------- |
| `check_subscription` | Check your subscription and credit usage |
## Configuration
The ElevenLabs MCP server can be configured using environment variables:
| Variable | Description | Default |
| ---------------------------- | --------------------------------------- | ----------- |
| `ELEVENLABS_API_KEY` | Your ElevenLabs API key | Required |
| `ELEVENLABS_MCP_BASE_PATH` | Base path for file operations | `~/Desktop` |
| `ELEVENLABS_MCP_OUTPUT_MODE` | How generated files are returned | `files` |
| `ELEVENLABS_API_RESIDENCY` | Data residency region (enterprise only) | `us` |
### Output modes
The `ELEVENLABS_MCP_OUTPUT_MODE` environment variable supports three modes:
- **`files`** (default): Save files to disk and return file paths
- **`resources`**: Return files as MCP resources (base64-encoded binary data)
- **`both`**: Save files to disk AND return as MCP resources
## Additional resources
- [ElevenLabs MCP Server Repository](https://github.com/elevenlabs/elevenlabs-mcp)
- [Introducing ElevenLabs MCP](https://elevenlabs.io/blog/introducing-elevenlabs-mcp)
- [ElevenLabs Documentation](https://elevenlabs.io/docs)
# Google Cloud Vertex AI express mode for ADK
Supported in ADK: Python v0.1.0, Java v0.1.0 (Preview)
Google Cloud Vertex AI express mode provides a no-cost access tier for prototyping and development, allowing you to use Vertex AI services without creating a full Google Cloud Project. This tier gives you access to many powerful Vertex AI services, including:
- [Vertex AI SessionService](#vertex-ai-session-service)
- [Vertex AI MemoryBankService](#vertex-ai-memory-bank)
You can sign up for an express mode account using a Gmail account and receive an API key to use with the ADK. Obtain an API key through the [Google Cloud Console](https://console.cloud.google.com/expressmode). For more information, see [Vertex AI express mode](https://cloud.google.com/vertex-ai/generative-ai/docs/start/express-mode/overview).
Preview release
The Vertex AI express mode feature is a Preview release. For more information, see the [launch stage descriptions](https://cloud.google.com/products#product-launch-stages).
Vertex AI express mode limitations
Vertex AI express mode projects are only valid for 90 days and only select services are available to be used with limited quota. For example, the number of Agent Engines is restricted to 10 and deployment to Agent Engine requires paid access. To remove the quota restrictions and use all of Vertex AI's services, add a billing account to your express mode project.
## Configure Agent Engine container
When using Vertex AI express mode, create an `AgentEngine` object to enable Vertex AI management of agent components such as `Session` and `Memory` objects. With this approach, `Session` objects are handled as children of the `AgentEngine` object. Before running your agent, make sure your environment variables are set correctly, as shown below:
agent/.env
```text
GOOGLE_GENAI_USE_VERTEXAI=TRUE
GOOGLE_API_KEY=PASTE_YOUR_ACTUAL_EXPRESS_MODE_API_KEY_HERE
```
Next, create your Agent Engine instance using the Vertex AI SDK.
1. Import Vertex AI SDK.
```py
import vertexai
from vertexai import agent_engines
```
1. Initialize the Vertex AI Client with your API key and create an agent engine instance.
```py
# Create Agent Engine with the Vertex AI SDK
client = vertexai.Client(
api_key="YOUR_API_KEY",
)
agent_engine = client.agent_engines.create(
config={
"display_name": "Demo Agent Engine",
"description": "Agent Engine for Session and Memory",
})
```
1. Get the Agent Engine name and ID from the response to use with Memories and Sessions.
```py
APP_ID = agent_engine.api_resource.name.split('/')[-1]
```
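The full resource name has the form `projects/<project>/locations/<location>/reasoningEngines/<id>`, so taking the last path segment yields the ID. A quick illustration with a made-up resource name:

```python
# Illustrative only: a real resource name comes from agent_engine.api_resource.name.
resource_name = (
    "projects/my-project/locations/us-central1/reasoningEngines/1234567890"
)
app_id = resource_name.split("/")[-1]  # "1234567890"
```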
## Manage Sessions with `VertexAiSessionService`
[`VertexAiSessionService`](/adk-docs/sessions/session.md#sessionservice-implementations) is compatible with Vertex AI express mode API keys. You can initialize the session service without specifying a project or location.
```py
# Requires: pip install google-adk[vertexai]
# Plus environment variable setup:
# GOOGLE_GENAI_USE_VERTEXAI=TRUE
# GOOGLE_API_KEY=PASTE_YOUR_ACTUAL_EXPRESS_MODE_API_KEY_HERE
from google.adk.sessions import VertexAiSessionService
# The app_name used with this service should be the Reasoning Engine ID or name
APP_ID = "your-reasoning-engine-id"
# Project and location are not required when initializing with Vertex express mode
session_service = VertexAiSessionService(agent_engine_id=APP_ID)
# Use APP_ID when calling service methods, e.g.:
# session = await session_service.create_session(app_name=APP_ID, user_id= ...)
```
Session Service Quotas
For free express mode projects, `VertexAiSessionService` has the following quotas:
- 10 Create, delete, or update Vertex AI Agent Engine sessions per minute
- 30 Append event to Vertex AI Agent Engine sessions per minute
## Manage Memory with `VertexAiMemoryBankService`
[`VertexAiMemoryBankService`](/adk-docs/sessions/memory.md#vertex-ai-memory-bank) is compatible with Vertex AI express mode API keys. You can initialize the memory service without specifying a project or location.
```py
# Requires: pip install google-adk[vertexai]
# Plus environment variable setup:
# GOOGLE_GENAI_USE_VERTEXAI=TRUE
# GOOGLE_API_KEY=PASTE_YOUR_ACTUAL_EXPRESS_MODE_API_KEY_HERE
from google.adk.memory import VertexAiMemoryBankService
# The app_name used with this service should be the Reasoning Engine ID or name
APP_ID = "your-reasoning-engine-id"
# Project and location are not required when initializing with express mode
memory_service = VertexAiMemoryBankService(agent_engine_id=APP_ID)
# Generate a memory from that session so the Agent can remember relevant details about the user
# memory = await memory_service.add_session_to_memory(session)
```
Memory Service Quotas
For free express mode projects, `VertexAiMemoryBankService` has the following quotas:
- 10 Create, delete, or update Vertex AI Agent Engine memory resources per minute
- 10 Get, list, or retrieve from Vertex AI Agent Engine Memory Bank per minute
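These quotas are enforced server-side, but when batching session or memory operations it can help to throttle the client as well. A minimal sliding-window limiter, purely illustrative and not part of ADK:

```python
import time
from collections import deque

class RateLimiter:
    """Allow at most `limit` calls per `period` seconds (client-side sketch)."""

    def __init__(self, limit: int, period: float = 60.0):
        self.limit = limit
        self.period = period
        self.calls = deque()  # timestamps of recent calls

    def acquire(self) -> float:
        """Record a call and return how long to wait before making it."""
        now = time.monotonic()
        # Drop timestamps that fell out of the sliding window.
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()
        wait = 0.0
        if len(self.calls) >= self.limit:
            wait = self.period - (now - self.calls[0])
        self.calls.append(now + wait)
        return wait

# e.g. stay under the 10-per-minute memory-resource quota:
limiter = RateLimiter(limit=10, period=60.0)
```

Calling `time.sleep(limiter.acquire())` before each create, update, or delete request keeps the client under the stated per-minute limits.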
### Code Sample: Weather Agent with Session and Memory
This code sample shows a weather agent that utilizes both `VertexAiSessionService` and `VertexAiMemoryBankService` for context management, allowing your agent to recall user preferences and conversations.
- [Weather Agent with Session and Memory](https://github.com/google/adk-docs/blob/main/examples/python/notebooks/express-mode-weather-agent.ipynb) using Vertex AI express mode
# Freeplay observability for ADK
Supported in ADK: Python
[Freeplay](https://freeplay.ai/) provides an end-to-end workflow for building and optimizing AI agents, and it can be integrated with ADK. With Freeplay your whole team can easily collaborate to iterate on agent instructions (prompts), experiment with and compare different models and agent changes, run evals both offline and online to measure quality, monitor production, and review data by hand.
Key benefits of Freeplay:
- **Simple observability** - focused on agents, LLM calls and tool calls for easy human review
- **Online evals/automated scorers** - for error detection in production
- **Offline evals and experiment comparison** - to test changes before deploying
- **Prompt management** - supports pushing changes straight from the Freeplay playground to code
- **Human review workflow** - for collaboration on error analysis and data annotation
- **Powerful UI** - makes it possible for domain experts to collaborate closely with engineers
Freeplay and ADK complement one another. ADK gives you a powerful and expressive agent orchestration framework while Freeplay plugs in for observability, prompt management, evaluation and testing. Once you integrate with Freeplay, you can update prompts and evals from the Freeplay UI or from code, so that anyone on your team can contribute.
## Getting Started
Below is a guide for getting started with Freeplay and ADK. You can also find a full sample ADK agent repo [here](https://github.com/228Labs/freeplay-google-demo).
### Create a Freeplay Account
Sign up for a free [Freeplay account](https://freeplay.ai/signup).
After creating an account, you can define the following environment variables:
```text
FREEPLAY_PROJECT_ID=
FREEPLAY_API_KEY=
FREEPLAY_API_URL=
```
### Use Freeplay ADK Library
Install the Freeplay ADK library:
```shell
pip install freeplay-python-adk
```
Freeplay will automatically capture OTel logs from your ADK application when you initialize observability:
```python
from freeplay_python_adk.client import FreeplayADK
FreeplayADK.initialize_observability()
```
You'll also want to pass in the Freeplay plugin to your App:
```python
from app.agent import root_agent
from freeplay_python_adk.freeplay_observability_plugin import FreeplayObservabilityPlugin
from google.adk.runners import App
app = App(
name="app",
root_agent=root_agent,
plugins=[FreeplayObservabilityPlugin()],
)
__all__ = ["app"]
```
You can now use ADK as you normally would, and you will see logs flowing to Freeplay in the Observability section.
## Observability
Freeplay's Observability feature gives you a clear view into how your agent is behaving in production. You can dig into individual agent traces to understand each step and diagnose issues.
You can also use Freeplay's filtering functionality to search and filter the data across any segment of interest.
## Prompt Management (optional)
Freeplay offers [native prompt management](https://docs.freeplay.ai/docs/managing-prompts), which simplifies versioning and testing different prompts. It allows you to experiment with changes to ADK agent instructions in the Freeplay UI, test different models, and push updates straight to your code, similar to a feature flag.
To leverage Freeplay's prompt management capabilities alongside ADK, you'll want to use the Freeplay ADK agent wrapper. `FreeplayLLMAgent` extends ADK's base `LlmAgent` class, so instead of having to hard code your prompts as agent instructions, you can version prompts in the Freeplay application.
First, define a prompt in Freeplay by going to Prompts -> Create prompt template.
When creating your prompt template, you'll need to add three elements, as described in the following sections:
### System Message
This corresponds to the "instructions" section in your code.
### Agent Context Variable
Adding the following to the bottom of your system message will create a variable for the ongoing agent context to be passed through:
```python
{{agent_context}}
```
### History Block
Click new message and change the role to 'history'. This will ensure the past messages are passed through when present.
Now in your code you can use the `FreeplayLLMAgent`:
```python
from freeplay_python_adk.client import FreeplayADK
from freeplay_python_adk.freeplay_llm_agent import (
FreeplayLLMAgent,
)
FreeplayADK.initialize_observability()
root_agent = FreeplayLLMAgent(
name="social_product_researcher",
tools=[tavily_search],
)
```
When the `social_product_researcher` is invoked, the prompt will be retrieved from Freeplay and formatted with the proper input variables.
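Conceptually, placeholders such as `{{agent_context}}` are filled by mustache-style substitution when the prompt is retrieved. A simplified sketch of that idea (this is not Freeplay's actual renderer):

```python
import re

def render(template: str, variables: dict) -> str:
    # Replace {{name}} placeholders; unknown names are left intact.
    return re.sub(
        r"\{\{(\w+)\}\}",
        lambda m: variables.get(m.group(1), m.group(0)),
        template,
    )

prompt = render(
    "You are a research agent.\n{{agent_context}}",
    {"agent_context": "The user is comparing two products."},
)
```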
## Evaluation
Freeplay enables you to define, version, and run [evaluations](https://docs.freeplay.ai/docs/evaluations) from the Freeplay web application. You can define evaluations for any of your prompts or agents by going to Evaluations -> "New evaluation".
These evaluations can be configured to run for both online monitoring and offline evaluation. Datasets for offline evaluation can be uploaded to Freeplay or saved from log examples.
## Dataset Management
As you get data flowing into Freeplay, you can use these logs to start building up [datasets](https://docs.freeplay.ai/docs/datasets) to test against on a repeated basis. Use production logs to create golden datasets or collections of failure cases that you can use to test against as you make changes.
## Batch Testing
As you iterate on your agent, you can run batch tests (i.e., offline experiments) at both the [prompt](https://docs.freeplay.ai/docs/component-level-test-runs) and [end-to-end](https://docs.freeplay.ai/docs/end-to-end-test-runs) agent level. This allows you to compare multiple different models or prompt changes and quantify changes head to head across your full agent execution.
[Here](https://github.com/freeplayai/freeplay-google-demo/blob/main/examples/example_test_run.py) is a code example for executing a batch test on Freeplay with ADK.
## Sign up now
Go to [Freeplay](https://freeplay.ai/) to sign up for an account, and check out a full Freeplay and ADK integration example [here](https://github.com/freeplayai/freeplay-google-demo/tree/main).
# GitHub MCP tool for ADK
Supported in ADK: Python, TypeScript
The [GitHub MCP Server](https://github.com/github/github-mcp-server) connects AI tools directly to GitHub's platform. This gives your ADK agent the ability to read repositories and code files, manage issues and PRs, analyze code, and automate workflows using natural language.
## Use cases
- **Repository Management**: Browse and query code, search files, analyze commits, and understand project structure across any repository you have access to.
- **Issue & PR Automation**: Create, update, and manage issues and pull requests. Let AI help triage bugs, review code changes, and maintain project boards.
- **Code Analysis**: Examine security findings, review Dependabot alerts, understand code patterns, and get comprehensive insights into your codebase.
## Prerequisites
- Create a [Personal Access Token](https://github.com/settings/personal-access-tokens/new) in GitHub. Refer to the [documentation](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens) for more information.
## Use with agent
```python
from google.adk.agents import Agent
from google.adk.tools.mcp_tool import McpToolset
from google.adk.tools.mcp_tool.mcp_session_manager import StreamableHTTPServerParams

GITHUB_TOKEN = "YOUR_GITHUB_TOKEN"

root_agent = Agent(
    model="gemini-2.5-pro",
    name="github_agent",
    instruction="Help users get information from GitHub",
    tools=[
        McpToolset(
            connection_params=StreamableHTTPServerParams(
                url="https://api.githubcopilot.com/mcp/",
                headers={
                    "Authorization": f"Bearer {GITHUB_TOKEN}",
                    "X-MCP-Toolsets": "all",
                    "X-MCP-Readonly": "true",
                },
            ),
        )
    ],
)
```
```typescript
import { LlmAgent, MCPToolset } from "@google/adk";

const GITHUB_TOKEN = "YOUR_GITHUB_TOKEN";

const rootAgent = new LlmAgent({
  model: "gemini-2.5-pro",
  name: "github_agent",
  instruction: "Help users get information from GitHub",
  tools: [
    new MCPToolset({
      type: "StreamableHTTPConnectionParams",
      url: "https://api.githubcopilot.com/mcp/",
      header: {
        Authorization: `Bearer ${GITHUB_TOKEN}`,
        "X-MCP-Toolsets": "all",
        "X-MCP-Readonly": "true",
      },
    }),
  ],
});

export { rootAgent };
```
## Available tools
| Tool | Description |
| ---------------------------- | ----------------------------------------------------------------------------------------- |
| `context` | Tools that provide context about the current user and GitHub context you are operating in |
| `copilot` | Copilot related tools (e.g. Copilot Coding Agent) |
| `copilot_spaces` | Copilot Spaces related tools |
| `actions` | GitHub Actions workflows and CI/CD operations |
| `code_security` | Code security related tools, such as GitHub Code Scanning |
| `dependabot` | Dependabot tools |
| `discussions` | GitHub Discussions related tools |
| `experiments` | Experimental features that are not considered stable yet |
| `gists` | GitHub Gist related tools |
| `github_support_docs_search` | Search docs to answer GitHub product and support questions |
| `issues` | GitHub Issues related tools |
| `labels` | GitHub Labels related tools |
| `notifications` | GitHub Notifications related tools |
| `orgs` | GitHub Organization related tools |
| `projects` | GitHub Projects related tools |
| `pull_requests` | GitHub Pull Request related tools |
| `repos` | GitHub Repository related tools |
| `secret_protection` | Secret protection related tools, such as GitHub Secret Scanning |
| `security_advisories` | Security advisories related tools |
| `stargazers` | GitHub Stargazers related tools |
| `users` | GitHub User related tools |
## Configuration
The Remote GitHub MCP server has optional headers that can be used to configure available toolsets and read-only mode:
- `X-MCP-Toolsets`: Comma-separated list of toolsets to enable (e.g., `"repos,issues"`). Whitespace is ignored.
  - If the list is empty, the default toolsets are used. If an invalid toolset is provided, the server fails to start with a `400 Bad Request` status.
- `X-MCP-Readonly`: Enables only "read" tools.
  - If this header is empty, or is "false", "f", "no", "n", "0", or "off" (ignoring whitespace and case), it is interpreted as false. All other values are interpreted as true.
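As an illustrative sketch, the header assembly and the documented read-only parsing can be modeled in a few lines of plain Python (the helper names here are hypothetical, not part of ADK or the GitHub MCP server):

```python
# Values the server documents as "false" for X-MCP-Readonly
# (ignoring case and surrounding whitespace).
FALSY = {"", "false", "f", "no", "n", "0", "off"}


def github_mcp_headers(token, toolsets=None, readonly=False):
    """Assemble the optional headers for the remote GitHub MCP server.

    An empty or omitted toolset list falls back to the server defaults.
    """
    headers = {"Authorization": f"Bearer {token}"}
    if toolsets:
        headers["X-MCP-Toolsets"] = ",".join(toolsets)
    if readonly:
        headers["X-MCP-Readonly"] = "true"
    return headers


def readonly_enabled(header_value):
    """Mirror the documented parsing of X-MCP-Readonly: a handful of
    falsy strings disable it; anything else enables it."""
    return header_value.strip().lower() not in FALSY
```

Passing the result of `github_mcp_headers(...)` as the `headers` argument of `StreamableHTTPServerParams` reproduces the configuration shown in the agent example above.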
## Additional resources
- [GitHub MCP Server Repository](https://github.com/github/github-mcp-server)
- [Remote GitHub MCP Server Documentation](https://github.com/github/github-mcp-server/blob/main/docs/remote-server.md)
- [Policies and Governance for the GitHub MCP Server](https://github.com/github/github-mcp-server/blob/main/docs/policies-and-governance.md)
# GitLab MCP tool for ADK
Supported in ADK: Python, TypeScript
The [GitLab MCP Server](https://docs.gitlab.com/user/gitlab_duo/model_context_protocol/mcp_server/) connects your ADK agent directly to [GitLab.com](https://gitlab.com/) or your self-managed GitLab instance. This integration gives your agent the ability to manage issues and merge requests, inspect CI/CD pipelines, perform semantic code searches, and automate development workflows using natural language.
## Use cases
- **Semantic Code Exploration**: Navigate your codebase using natural language. Unlike standard text search, you can query the logic and intent of your code to quickly understand complex implementations.
- **Accelerate Merge Request Reviews**: Get up to speed on code changes instantly. Retrieve full merge request contexts, analyze specific diffs, and review commit history to provide faster, more meaningful feedback to your team.
- **Troubleshoot CI/CD Pipelines**: Diagnose build failures without leaving your chat. Inspect pipeline statuses and retrieve detailed job logs to pinpoint exactly why a specific merge request or commit failed its checks.
## Prerequisites
- A GitLab account with a Premium or Ultimate subscription and [GitLab Duo](https://docs.gitlab.com/user/gitlab_duo/) enabled
- [Beta and experimental features](https://docs.gitlab.com/user/gitlab_duo/turn_on_off/#turn-on-beta-and-experimental-features) enabled in your GitLab settings
## Use with agent
```python
from google.adk.agents import Agent
from google.adk.tools.mcp_tool import McpToolset
from google.adk.tools.mcp_tool.mcp_session_manager import StdioConnectionParams
from mcp import StdioServerParameters

# Replace with your instance URL if self-hosted (e.g., "gitlab.example.com")
GITLAB_INSTANCE_URL = "gitlab.com"

root_agent = Agent(
    model="gemini-2.5-pro",
    name="gitlab_agent",
    instruction="Help users get information from GitLab",
    tools=[
        McpToolset(
            connection_params=StdioConnectionParams(
                server_params=StdioServerParameters(
                    command="npx",
                    args=[
                        "-y",
                        "mcp-remote",
                        f"https://{GITLAB_INSTANCE_URL}/api/v4/mcp",
                        "--static-oauth-client-metadata",
                        '{"scope": "mcp"}',
                    ],
                ),
                timeout=30,
            ),
        )
    ],
)
```
```typescript
import { LlmAgent, MCPToolset } from "@google/adk";

// Replace with your instance URL if self-hosted (e.g., "gitlab.example.com")
const GITLAB_INSTANCE_URL = "gitlab.com";

const rootAgent = new LlmAgent({
  model: "gemini-2.5-pro",
  name: "gitlab_agent",
  instruction: "Help users get information from GitLab",
  tools: [
    new MCPToolset({
      type: "StdioConnectionParams",
      serverParams: {
        command: "npx",
        args: [
          "-y",
          "mcp-remote",
          `https://${GITLAB_INSTANCE_URL}/api/v4/mcp`,
          "--static-oauth-client-metadata",
          '{"scope": "mcp"}',
        ],
      },
    }),
  ],
});

export { rootAgent };
```
Note
When you run this agent for the first time, a browser window will open automatically (and an authorization URL will be printed) requesting OAuth permissions. You must approve this request to allow the agent to access your GitLab data.
## Available tools
| Tool | Description |
| ----------------------------- | ------------------------------------------------------------------------- |
| `get_mcp_server_version` | Returns the current version of the GitLab MCP server |
| `create_issue` | Creates a new issue in a GitLab project |
| `get_issue` | Retrieves detailed information about a specific GitLab issue |
| `create_merge_request` | Creates a merge request in a project |
| `get_merge_request` | Retrieves detailed information about a specific GitLab merge request |
| `get_merge_request_commits` | Retrieves the list of commits in a specific merge request |
| `get_merge_request_diffs` | Retrieves the diffs for a specific merge request |
| `get_merge_request_pipelines` | Retrieves the pipelines for a specific merge request |
| `get_pipeline_jobs` | Retrieves the jobs for a specific CI/CD pipeline |
| `gitlab_search` | Searches for a term across the entire GitLab instance with the search API |
| `semantic_code_search` | Searches for relevant code snippets in a project |
## Additional resources
- [GitLab MCP Server Documentation](https://docs.gitlab.com/user/gitlab_duo/model_context_protocol/mcp_server/)
# Google Cloud GKE Code Executor tool for ADK
Supported in ADK: Python v1.14.0
The GKE Code Executor (`GkeCodeExecutor`) provides a secure and scalable method for running LLM-generated code by leveraging the GKE (Google Kubernetes Engine) Sandbox environment, which uses gVisor for workload isolation. For each code execution request, it dynamically creates an ephemeral, sandboxed Kubernetes Job with a hardened Pod configuration. You should use this executor for production environments on GKE where security and isolation are critical.
## How it Works
When a request to execute code is made, the `GkeCodeExecutor` performs the following steps:
1. **Creates a ConfigMap:** A Kubernetes ConfigMap is created to store the Python code that needs to be executed.
1. **Creates a Sandboxed Pod:** A new Kubernetes Job is created, which in turn creates a Pod with a hardened security context and the gVisor runtime enabled. The code from the ConfigMap is mounted into this Pod.
1. **Executes the Code:** The code is executed within the sandboxed Pod, isolated from the underlying node and other workloads.
1. **Retrieves the Result:** The standard output and error streams from the execution are captured from the Pod's logs.
1. **Cleans Up Resources:** Once the execution is complete, the Job and the associated ConfigMap are automatically deleted, ensuring that no artifacts are left behind.
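The steps above can be sketched as a manifest builder. Everything below (names, field choices, security settings) is an illustrative approximation of the kind of Job `GkeCodeExecutor` creates, not its actual implementation:

```python
def build_sandbox_job(name, configmap_name, image="python:3.11-slim",
                      cpu_limit="500m", mem_limit="512Mi"):
    """Return an illustrative Kubernetes Job manifest (as a dict) that
    runs mounted code under the gVisor runtime with a hardened Pod spec."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            # Let Kubernetes garbage-collect the Job once it finishes.
            "ttlSecondsAfterFinished": 0,
            "template": {
                "spec": {
                    "runtimeClassName": "gvisor",  # gVisor sandbox isolation
                    "restartPolicy": "Never",
                    "containers": [{
                        "name": "executor",
                        "image": image,
                        # Run the code mounted from the ConfigMap.
                        "command": ["python", "/code/script.py"],
                        "securityContext": {
                            "runAsNonRoot": True,
                            "allowPrivilegeEscalation": False,
                        },
                        "resources": {
                            "limits": {"cpu": cpu_limit, "memory": mem_limit},
                        },
                        "volumeMounts": [{"name": "code", "mountPath": "/code"}],
                    }],
                    "volumes": [{
                        "name": "code",
                        "configMap": {"name": configmap_name},
                    }],
                },
            },
        },
    }
```

In the real executor, a ConfigMap holding the generated code is created first, the Job is submitted, logs are read back from the resulting Pod, and both resources are deleted afterward.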
## Key Benefits
- **Enhanced Security:** Code is executed in a gVisor-sandboxed environment with kernel-level isolation.
- **Ephemeral Environments:** Each code execution runs in its own ephemeral Pod, preventing state from leaking between executions.
- **Resource Control:** You can configure CPU and memory limits for the execution Pods to prevent resource abuse.
- **Scalability:** Allows you to run a large number of code executions in parallel, with GKE handling the scheduling and scaling of the underlying nodes.
## System requirements
The following requirements must be met to successfully deploy your ADK project with the GKE Code Executor tool:
- GKE cluster with a **gVisor-enabled node pool**.
- The agent's service account requires specific **RBAC permissions**, which allow it to:
  - Create, watch, and delete **Jobs** for each execution request.
  - Manage **ConfigMaps** to inject code into the Job's Pod.
  - List **Pods** and read their **logs** to retrieve the execution result.
- Install the client library with GKE extras: `pip install google-adk[gke]`
For a complete, ready-to-use configuration, see the [deployment_rbac.yaml](https://github.com/google/adk-python/blob/main/contributing/samples/gke_agent_sandbox/deployment_rbac.yaml) sample. For more information on deploying ADK workflows to GKE, see [Deploy to Google Kubernetes Engine (GKE)](/adk-docs/deploy/gke/).
```python
from google.adk.agents import LlmAgent
from google.adk.code_executors import GkeCodeExecutor

# Initialize the executor, targeting the namespace where its ServiceAccount
# has the required RBAC permissions.
# This example also sets a custom timeout and resource limits.
gke_executor = GkeCodeExecutor(
    namespace="agent-sandbox",
    timeout_seconds=600,
    cpu_limit="1000m",  # 1 CPU core
    mem_limit="1Gi",
)

# The agent now uses this executor for any code it generates.
gke_agent = LlmAgent(
    name="gke_coding_agent",
    model="gemini-2.0-flash",
    instruction="You are a helpful AI agent that writes and executes Python code.",
    code_executor=gke_executor,
)
```
## Configuration parameters
The `GkeCodeExecutor` can be configured with the following parameters:
| Parameter | Type | Description |
| -------------------- | ----- | --------------------------------------------------------------------------------------------------------------------- |
| `namespace` | `str` | Kubernetes namespace where the execution Jobs will be created. Defaults to `"default"`. |
| `image` | `str` | Container image to use for the execution Pod. Defaults to `"python:3.11-slim"`. |
| `timeout_seconds` | `int` | Timeout in seconds for the code execution. Defaults to `300`. |
| `cpu_requested` | `str` | Amount of CPU to request for the execution Pod. Defaults to `"200m"`. |
| `mem_requested` | `str` | Amount of memory to request for the execution Pod. Defaults to `"256Mi"`. |
| `cpu_limit` | `str` | Maximum amount of CPU the execution Pod can use. Defaults to `"500m"`. |
| `mem_limit` | `str` | Maximum amount of memory the execution Pod can use. Defaults to `"512Mi"`. |
| `kubeconfig_path` | `str` | Path to a kubeconfig file to use for authentication. Falls back to in-cluster config or the default local kubeconfig. |
| `kubeconfig_context` | `str` | The `kubeconfig` context to use. |
# Gemini API Google Search tool for ADK
Supported in ADK: Python v0.1.0, TypeScript v0.2.0, Go v0.1.0, Java v0.2.0
The `google_search` tool allows the agent to perform web searches using Google Search. It is only compatible with Gemini 2 models. For more details, see [Understanding Google Search grounding](/adk-docs/grounding/google_search_grounding/).
Additional requirements when using the `google_search` tool
When you use grounding with Google Search, and you receive Search suggestions in your response, you must display the Search suggestions in production and in your applications. For more information on grounding with Google Search, see Grounding with Google Search documentation for [Google AI Studio](https://ai.google.dev/gemini-api/docs/grounding/search-suggestions) or [Vertex AI](https://cloud.google.com/vertex-ai/generative-ai/docs/grounding/grounding-search-suggestions). The UI code (HTML) is returned in the Gemini response as `renderedContent`, and you will need to show the HTML in your app, in accordance with the policy.
Warning: Single tool per agent limitation
This tool can only be used ***by itself*** within an agent instance. For more information about this limitation and workarounds, see [Limitations for ADK tools](/adk-docs/tools/limitations/#one-tool-one-agent).
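As a sketch of handling the display requirement described above, a small helper (hypothetical, not part of ADK) can pull the `renderedContent` HTML out of a response's grounding metadata. The response is modeled here as plain dicts with the Gemini API's JSON field names:

```python
def extract_search_suggestions(response):
    """Return the Search-suggestion HTML (renderedContent) from a Gemini
    response dict, or None if the response carries no grounding metadata."""
    for candidate in response.get("candidates", []):
        metadata = candidate.get("groundingMetadata") or {}
        entry_point = metadata.get("searchEntryPoint") or {}
        html = entry_point.get("renderedContent")
        if html:
            return html
    return None
```

Whatever HTML this returns must be rendered in your application's UI, per the grounding policy linked above.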
```py
# Copyright 2025 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from google.adk.agents import Agent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.adk.tools import google_search
from google.genai import types

APP_NAME = "google_search_agent"
USER_ID = "user1234"
SESSION_ID = "1234"

root_agent = Agent(
    name="basic_search_agent",
    model="gemini-2.0-flash",
    description="Agent to answer questions using Google Search.",
    instruction="I can answer your questions by searching the internet. Just ask me anything!",
    # google_search is a pre-built tool which allows the agent to perform Google searches.
    tools=[google_search],
)

# Session and Runner
async def setup_session_and_runner():
    session_service = InMemorySessionService()
    session = await session_service.create_session(
        app_name=APP_NAME, user_id=USER_ID, session_id=SESSION_ID
    )
    runner = Runner(agent=root_agent, app_name=APP_NAME, session_service=session_service)
    return session, runner

# Agent Interaction
async def call_agent_async(query):
    content = types.Content(role="user", parts=[types.Part(text=query)])
    session, runner = await setup_session_and_runner()
    events = runner.run_async(user_id=USER_ID, session_id=SESSION_ID, new_message=content)
    async for event in events:
        if event.is_final_response():
            final_response = event.content.parts[0].text
            print("Agent Response: ", final_response)

# Note: In Colab, you can directly use 'await' at the top level.
# If running this code as a standalone Python script, you'll need to use
# asyncio.run() or manage the event loop.
await call_agent_async("what's the latest ai news?")
```
```typescript
import {GOOGLE_SEARCH, LlmAgent} from '@google/adk';

export const rootAgent = new LlmAgent({
  model: 'gemini-2.5-flash',
  name: 'root_agent',
  description:
    'an agent whose job it is to perform Google search queries and answer questions about the results.',
  instruction:
    'You are an agent whose job is to perform Google search queries and answer questions about the results.',
  tools: [GOOGLE_SEARCH],
});
```
```go
// Copyright 2025 Google LLC
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
package main

import (
	"context"
	"fmt"
	"log"

	"google.golang.org/adk/agent"
	"google.golang.org/adk/agent/llmagent"
	"google.golang.org/adk/model/gemini"
	"google.golang.org/adk/runner"
	"google.golang.org/adk/session"
	"google.golang.org/adk/tool"
	"google.golang.org/adk/tool/geminitool"
	"google.golang.org/genai"
)

func createSearchAgent(ctx context.Context) (agent.Agent, error) {
	model, err := gemini.NewModel(ctx, "gemini-2.5-flash", &genai.ClientConfig{})
	if err != nil {
		return nil, fmt.Errorf("failed to create model: %v", err)
	}
	return llmagent.New(llmagent.Config{
		Name:        "basic_search_agent",
		Model:       model,
		Description: "Agent to answer questions using Google Search.",
		Instruction: "I can answer your questions by searching the web. Just ask me anything!",
		Tools:       []tool.Tool{geminitool.GoogleSearch{}},
	})
}

const (
	userID  = "user1234"
	appName = "Google Search_agent"
)

func callAgent(ctx context.Context, a agent.Agent, prompt string) error {
	sessionService := session.InMemoryService()
	session, err := sessionService.Create(ctx, &session.CreateRequest{
		AppName: appName,
		UserID:  userID,
	})
	if err != nil {
		return fmt.Errorf("failed to create the session service: %v", err)
	}
	config := runner.Config{
		AppName:        appName,
		Agent:          a,
		SessionService: sessionService,
	}
	r, err := runner.New(config)
	if err != nil {
		return fmt.Errorf("failed to create the runner: %v", err)
	}
	sessionID := session.Session.ID()
	userMsg := &genai.Content{
		Parts: []*genai.Part{{Text: prompt}},
		Role:  string(genai.RoleUser),
	}
	// The r.Run method streams events and errors.
	// The loop iterates over the results, handling them as they arrive.
	for event, err := range r.Run(ctx, userID, sessionID, userMsg, agent.RunConfig{
		StreamingMode: agent.StreamingModeSSE,
	}) {
		if err != nil {
			fmt.Printf("\nAGENT_ERROR: %v\n", err)
		} else if event.Partial {
			for _, p := range event.LLMResponse.Content.Parts {
				fmt.Print(p.Text)
			}
		}
	}
	return nil
}

func main() {
	agent, err := createSearchAgent(context.Background())
	if err != nil {
		log.Fatalf("Failed to create agent: %v", err)
	}
	fmt.Println("Agent created:", agent.Name())
	prompt := "what's the latest ai news?"
	fmt.Printf("\nPrompt: %s\nResponse: ", prompt)
	if err := callAgent(context.Background(), agent, prompt); err != nil {
		log.Fatalf("Error calling agent: %v", err)
	}
	fmt.Println("\n---")
}
```
```java
import com.google.adk.agents.BaseAgent;
import com.google.adk.agents.LlmAgent;
import com.google.adk.runner.Runner;
import com.google.adk.sessions.InMemorySessionService;
import com.google.adk.sessions.Session;
import com.google.adk.tools.GoogleSearchTool;
import com.google.common.collect.ImmutableList;
import com.google.genai.types.Content;
import com.google.genai.types.Part;

public class GoogleSearchAgentApp {

  private static final String APP_NAME = "Google Search_agent";
  private static final String USER_ID = "user1234";
  private static final String SESSION_ID = "1234";

  /**
   * Calls the agent with the given query and prints the final response.
   *
   * @param runner The runner to use.
   * @param query The query to send to the agent.
   */
  public static void callAgent(Runner runner, String query) {
    Content content = Content.fromParts(Part.fromText(query));

    InMemorySessionService sessionService = (InMemorySessionService) runner.sessionService();
    Session session =
        sessionService
            .createSession(APP_NAME, USER_ID, /* state= */ null, SESSION_ID)
            .blockingGet();

    runner
        .runAsync(session.userId(), session.id(), content)
        .forEach(
            event -> {
              if (event.finalResponse()
                  && event.content().isPresent()
                  && event.content().get().parts().isPresent()
                  && !event.content().get().parts().get().isEmpty()
                  && event.content().get().parts().get().get(0).text().isPresent()) {
                String finalResponse = event.content().get().parts().get().get(0).text().get();
                System.out.println("Agent Response: " + finalResponse);
              }
            });
  }

  public static void main(String[] args) {
    // Google Search is a pre-built tool which allows the agent to perform Google searches.
    GoogleSearchTool googleSearchTool = new GoogleSearchTool();

    BaseAgent rootAgent =
        LlmAgent.builder()
            .name("basic_search_agent")
            .model("gemini-2.0-flash") // Ensure to use a Gemini 2.0 model for Google Search Tool
            .description("Agent to answer questions using Google Search.")
            .instruction(
                "I can answer your questions by searching the internet. Just ask me anything!")
            .tools(ImmutableList.of(googleSearchTool))
            .build();

    // Session and Runner
    InMemorySessionService sessionService = new InMemorySessionService();
    Runner runner = new Runner(rootAgent, APP_NAME, null, sessionService);

    // Agent Interaction
    callAgent(runner, "what's the latest ai news?");
  }
}
```
# Hugging Face MCP tool for ADK
Supported in ADK: Python, TypeScript
The [Hugging Face MCP Server](https://github.com/huggingface/hf-mcp-server) can be used to connect your ADK agent to the Hugging Face Hub and thousands of Gradio AI Applications.
## Use cases
- **Discover AI/ML Assets**: Search and filter the Hub for models, datasets, and papers based on tasks, libraries, or keywords.
- **Build Multi-Step Workflows**: Chain tools together, such as transcribing audio with one tool and then summarizing the resulting text with another.
- **Find AI Applications**: Search for Gradio Spaces that can perform a specific task, like background removal or text-to-speech.
## Prerequisites
- Create a [user access token](https://huggingface.co/settings/tokens) in Hugging Face. Refer to the [documentation](https://huggingface.co/docs/hub/en/security-tokens) for more information.
## Use with agent
```python
from google.adk.agents import Agent
from google.adk.tools.mcp_tool import McpToolset
from google.adk.tools.mcp_tool.mcp_session_manager import StdioConnectionParams
from mcp import StdioServerParameters

HUGGING_FACE_TOKEN = "YOUR_HUGGING_FACE_TOKEN"

root_agent = Agent(
    model="gemini-2.5-pro",
    name="hugging_face_agent",
    instruction="Help users get information from Hugging Face",
    tools=[
        McpToolset(
            connection_params=StdioConnectionParams(
                server_params=StdioServerParameters(
                    command="npx",
                    args=[
                        "-y",
                        "@llmindset/hf-mcp-server",
                    ],
                    env={
                        "HF_TOKEN": HUGGING_FACE_TOKEN,
                    },
                ),
                timeout=30,
            ),
        )
    ],
)
```
```python
from google.adk.agents import Agent
from google.adk.tools.mcp_tool import McpToolset
from google.adk.tools.mcp_tool.mcp_session_manager import StreamableHTTPServerParams

HUGGING_FACE_TOKEN = "YOUR_HUGGING_FACE_TOKEN"

root_agent = Agent(
    model="gemini-2.5-pro",
    name="hugging_face_agent",
    instruction="Help users get information from Hugging Face",
    tools=[
        McpToolset(
            connection_params=StreamableHTTPServerParams(
                url="https://huggingface.co/mcp",
                headers={
                    "Authorization": f"Bearer {HUGGING_FACE_TOKEN}",
                },
            ),
        )
    ],
)
```
```typescript
import { LlmAgent, MCPToolset } from "@google/adk";

const HUGGING_FACE_TOKEN = "YOUR_HUGGING_FACE_TOKEN";

const rootAgent = new LlmAgent({
  model: "gemini-2.5-pro",
  name: "hugging_face_agent",
  instruction: "Help users get information from Hugging Face",
  tools: [
    new MCPToolset({
      type: "StdioConnectionParams",
      serverParams: {
        command: "npx",
        args: ["-y", "@llmindset/hf-mcp-server"],
        env: {
          HF_TOKEN: HUGGING_FACE_TOKEN,
        },
      },
    }),
  ],
});

export { rootAgent };
```
```typescript
import { LlmAgent, MCPToolset } from "@google/adk";

const HUGGING_FACE_TOKEN = "YOUR_HUGGING_FACE_TOKEN";

const rootAgent = new LlmAgent({
  model: "gemini-2.5-pro",
  name: "hugging_face_agent",
  instruction: "Help users get information from Hugging Face",
  tools: [
    new MCPToolset({
      type: "StreamableHTTPConnectionParams",
      url: "https://huggingface.co/mcp",
      header: {
        Authorization: `Bearer ${HUGGING_FACE_TOKEN}`,
      },
    }),
  ],
});

export { rootAgent };
```
## Available tools
| Tool | Description |
| ----------------------------- | ---------------------------------------------------------- |
| Spaces Semantic Search | Find the best AI Apps via natural language queries |
| Papers Semantic Search | Find ML Research Papers via natural language queries |
| Model Search | Search for ML models with filters for task, library, etc. |
| Dataset Search | Search for datasets with filters for author, tags, etc. |
| Documentation Semantic Search | Search the Hugging Face documentation library |
| Hub Repository Details | Get detailed information about Models, Datasets and Spaces |
## Configuration
To configure which tools are available in your Hugging Face Hub MCP server, visit the [MCP Settings Page](https://huggingface.co/settings/mcp) in your Hugging Face account.
To configure the local MCP server, you can use the following environment variables:
- `TRANSPORT`: The transport type to use (`stdio`, `sse`, `streamableHttp`, or `streamableHttpJson`)
- `DEFAULT_HF_TOKEN`: ⚠️ Requests are serviced with the token received in the `Authorization: Bearer` header; `DEFAULT_HF_TOKEN` is used only if no header was sent. Set this only in development/test environments or for local stdio deployments. ⚠️
  - When running with the stdio transport, `HF_TOKEN` is used if `DEFAULT_HF_TOKEN` is not set.
- `HF_API_TIMEOUT`: Timeout for Hugging Face API requests in milliseconds (default: 12500ms / 12.5 seconds)
- `USER_CONFIG_API`: URL to use for User settings (defaults to Local front-end)
- `MCP_STRICT_COMPLIANCE`: Set to `true` to reject GET requests with a 405 status in JSON mode (by default, a welcome page is served)
- `AUTHENTICATE_TOOL`: Whether to include an Authenticate tool that issues an OAuth challenge when called
- `SEARCH_ENABLES_FETCH`: When set to `true`, automatically enables the `hf_doc_fetch` tool whenever `hf_doc_search` is enabled
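Tying the variables above together, a hypothetical helper (not part of the server) that builds an environment mapping for a local run might look like this; only the stdio case sets `HF_TOKEN` directly, since HTTP transports normally receive the token per request:

```python
# Transport values documented for the local hf-mcp-server.
VALID_TRANSPORTS = {"stdio", "sse", "streamableHttp", "streamableHttpJson"}


def hf_mcp_env(token, transport="stdio", api_timeout_ms=12500):
    """Build an environment mapping for a local hf-mcp-server run.

    For stdio transport the token is passed as HF_TOKEN; for HTTP
    transports it normally arrives in the Authorization header, so
    DEFAULT_HF_TOKEN is deliberately not set here.
    """
    if transport not in VALID_TRANSPORTS:
        raise ValueError(f"unknown transport: {transport!r}")
    env = {"TRANSPORT": transport, "HF_API_TIMEOUT": str(api_timeout_ms)}
    if transport == "stdio":
        env["HF_TOKEN"] = token
    return env
```

The resulting dict can be passed as the `env` argument of `StdioServerParameters` in the agent example shown earlier.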
## Additional resources
- [Hugging Face MCP Server Repository](https://github.com/huggingface/hf-mcp-server)
- [Hugging Face MCP Server Documentation](https://huggingface.co/docs/hub/en/hf-mcp-server)
# Linear MCP tool for ADK
Supported in ADK: Python, TypeScript
The [Linear MCP Server](https://linear.app/docs/mcp) connects your ADK agent to [Linear](https://linear.app/), a purpose-built tool for planning and building products. This integration gives your agent the ability to manage issues, track project cycles, and automate development workflows using natural language.
## Use cases
- **Streamline Issue Management**: Create, update, and organize issues using natural language. Let your agent handle logging bugs, assigning tasks, and updating statuses.
- **Track Projects and Cycles**: Get instant visibility into your team's momentum. Query the status of active cycles, check project milestones, and retrieve deadlines.
- **Contextual Search & Summarization**: Quickly catch up on long discussion threads or find specific project specifications. Your agent can search documentation and summarize complex issues.
## Prerequisites
- [Sign up](https://linear.app/signup) for a Linear account
- Generate an API key in [Linear Settings > Security & access](https://linear.app/docs/security-and-access) (if using API authentication)
## Use with agent
```python
from google.adk.agents import Agent
from google.adk.tools.mcp_tool import McpToolset
from google.adk.tools.mcp_tool.mcp_session_manager import StdioConnectionParams
from mcp import StdioServerParameters

root_agent = Agent(
    model="gemini-2.5-pro",
    name="linear_agent",
    instruction="Help users manage issues, projects, and cycles in Linear",
    tools=[
        McpToolset(
            connection_params=StdioConnectionParams(
                server_params=StdioServerParameters(
                    command="npx",
                    args=[
                        "-y",
                        "mcp-remote",
                        "https://mcp.linear.app/mcp",
                    ],
                ),
                timeout=30,
            ),
        )
    ],
)
```
Note
When you run this agent for the first time, a browser window will open automatically to request access via OAuth. Alternatively, you can use the authorization URL printed in the console. You must approve this request to allow the agent to access your Linear data.
```python
from google.adk.agents import Agent
from google.adk.tools.mcp_tool import McpToolset
from google.adk.tools.mcp_tool.mcp_session_manager import StreamableHTTPServerParams

LINEAR_API_KEY = "YOUR_LINEAR_API_KEY"

root_agent = Agent(
    model="gemini-2.5-pro",
    name="linear_agent",
    instruction="Help users manage issues, projects, and cycles in Linear",
    tools=[
        McpToolset(
            connection_params=StreamableHTTPServerParams(
                url="https://mcp.linear.app/mcp",
                headers={
                    "Authorization": f"Bearer {LINEAR_API_KEY}",
                },
            ),
        )
    ],
)
```
Note
This code example uses an API key for authentication. To use a browser-based OAuth authentication flow instead, remove the `headers` parameter and run the agent.
```typescript
import { LlmAgent, MCPToolset } from "@google/adk";

const rootAgent = new LlmAgent({
  model: "gemini-2.5-pro",
  name: "linear_agent",
  instruction: "Help users manage issues, projects, and cycles in Linear",
  tools: [
    new MCPToolset({
      type: "StdioConnectionParams",
      serverParams: {
        command: "npx",
        args: ["-y", "mcp-remote", "https://mcp.linear.app/mcp"],
      },
    }),
  ],
});

export { rootAgent };
```
Note
When you run this agent for the first time, a browser window will open automatically to request access via OAuth. Alternatively, you can use the authorization URL printed in the console. You must approve this request to allow the agent to access your Linear data.
```typescript
import { LlmAgent, MCPToolset } from "@google/adk";

const LINEAR_API_KEY = "YOUR_LINEAR_API_KEY";

const rootAgent = new LlmAgent({
  model: "gemini-2.5-pro",
  name: "linear_agent",
  instruction: "Help users manage issues, projects, and cycles in Linear",
  tools: [
    new MCPToolset({
      type: "StreamableHTTPConnectionParams",
      url: "https://mcp.linear.app/mcp",
      header: {
        Authorization: `Bearer ${LINEAR_API_KEY}`,
      },
    }),
  ],
});

export { rootAgent };
```
Note
This code example uses an API key for authentication. To use a browser-based OAuth authentication flow instead, remove the `header` property and run the agent.
## Available tools
| Tool | Description |
| ---------------------- | ---------------------------- |
| `list_comments` | List comments on an issue |
| `create_comment` | Create a comment on an issue |
| `list_cycles` | List cycles in a project |
| `get_document` | Get a document |
| `list_documents` | List documents |
| `get_issue` | Get an issue |
| `list_issues` | List issues |
| `create_issue` | Create an issue |
| `update_issue` | Update an issue |
| `list_issue_statuses` | List issue statuses |
| `get_issue_status` | Get an issue status |
| `list_issue_labels` | List issue labels |
| `create_issue_label` | Create an issue label |
| `list_projects` | List projects |
| `get_project` | Get a project |
| `create_project` | Create a project |
| `update_project` | Update a project |
| `list_project_labels` | List project labels |
| `list_teams` | List teams |
| `get_team` | Get a team |
| `list_users` | List users |
| `get_user` | Get a user |
| `search_documentation` | Search documentation |
## Additional resources
- [Linear MCP Server Documentation](https://linear.app/docs/mcp)
- [Linear Getting Started Guide](https://linear.app/docs/start-guide)
# MCP Toolbox for Databases tool for ADK
Supported in ADK: Python, TypeScript, Go
[MCP Toolbox for Databases](https://github.com/googleapis/genai-toolbox) is an open source MCP server for databases, designed with enterprise-grade, production-quality deployments in mind. It lets you develop tools more easily, quickly, and securely by handling complexities such as connection pooling and authentication.
Google's Agent Development Kit (ADK) has built-in support for MCP Toolbox. For more information on [getting started](https://googleapis.github.io/genai-toolbox/getting-started/) or [configuring](https://googleapis.github.io/genai-toolbox/getting-started/configure/) MCP Toolbox, see the [documentation](https://googleapis.github.io/genai-toolbox/getting-started/introduction/).
## Supported Data Sources
MCP Toolbox provides out-of-the-box toolsets for the following databases and data platforms:
### Google Cloud
- [BigQuery](https://googleapis.github.io/genai-toolbox/resources/sources/bigquery/) (including tools for SQL execution, schema discovery, and AI-powered time series forecasting)
- [AlloyDB](https://googleapis.github.io/genai-toolbox/resources/sources/alloydb-pg/) (PostgreSQL-compatible, with tools for both standard queries and natural language queries)
- [AlloyDB Admin](https://googleapis.github.io/genai-toolbox/resources/sources/alloydb-admin/)
- [Spanner](https://googleapis.github.io/genai-toolbox/resources/sources/spanner/) (supporting both GoogleSQL and PostgreSQL dialects)
- Cloud SQL (with dedicated support for [Cloud SQL for PostgreSQL](https://googleapis.github.io/genai-toolbox/resources/sources/cloud-sql-pg/), [Cloud SQL for MySQL](https://googleapis.github.io/genai-toolbox/resources/sources/cloud-sql-mysql/), and [Cloud SQL for SQL Server](https://googleapis.github.io/genai-toolbox/resources/sources/cloud-sql-mssql/))
- [Cloud SQL Admin](https://googleapis.github.io/genai-toolbox/resources/sources/cloud-sql-admin/)
- [Firestore](https://googleapis.github.io/genai-toolbox/resources/sources/firestore/)
- [Bigtable](https://googleapis.github.io/genai-toolbox/resources/sources/bigtable/)
- [Dataplex](https://googleapis.github.io/genai-toolbox/resources/sources/dataplex/) (for data discovery and metadata search)
- [Cloud Monitoring](https://googleapis.github.io/genai-toolbox/resources/sources/cloud-monitoring/)
### Relational & SQL Databases
- [PostgreSQL](https://googleapis.github.io/genai-toolbox/resources/sources/postgres/) (generic)
- [MySQL](https://googleapis.github.io/genai-toolbox/resources/sources/mysql/) (generic)
- [Microsoft SQL Server](https://googleapis.github.io/genai-toolbox/resources/sources/mssql/) (generic)
- [ClickHouse](https://googleapis.github.io/genai-toolbox/resources/sources/clickhouse/)
- [TiDB](https://googleapis.github.io/genai-toolbox/resources/sources/tidb/)
- [OceanBase](https://googleapis.github.io/genai-toolbox/resources/sources/oceanbase/)
- [Firebird](https://googleapis.github.io/genai-toolbox/resources/sources/firebird/)
- [SQLite](https://googleapis.github.io/genai-toolbox/resources/sources/sqlite/)
- [YugabyteDB](https://googleapis.github.io/genai-toolbox/resources/sources/yugabytedb/)
### NoSQL & Key-Value Stores
- [MongoDB](https://googleapis.github.io/genai-toolbox/resources/sources/mongodb/)
- [Couchbase](https://googleapis.github.io/genai-toolbox/resources/sources/couchbase/)
- [Redis](https://googleapis.github.io/genai-toolbox/resources/sources/redis/)
- [Valkey](https://googleapis.github.io/genai-toolbox/resources/sources/valkey/)
- [Cassandra](https://googleapis.github.io/genai-toolbox/resources/sources/cassandra/)
### Graph Databases
- [Neo4j](https://googleapis.github.io/genai-toolbox/resources/sources/neo4j/) (with tools for Cypher queries and schema inspection)
- [Dgraph](https://googleapis.github.io/genai-toolbox/resources/sources/dgraph/)
### Data Platforms & Federation
- [Looker](https://googleapis.github.io/genai-toolbox/resources/sources/looker/) (for running Looks, queries, and building dashboards via the Looker API)
- [Trino](https://googleapis.github.io/genai-toolbox/resources/sources/trino/) (for running federated queries across multiple sources)
### Other
- [HTTP](https://googleapis.github.io/genai-toolbox/resources/sources/http/)
## Configure and deploy
MCP Toolbox is an open source server that you deploy and manage yourself. For instructions on deploying and configuring it, see the official Toolbox documentation:
- [Installing the Server](https://googleapis.github.io/genai-toolbox/getting-started/introduction/#installing-the-server)
- [Configuring MCP Toolbox](https://googleapis.github.io/genai-toolbox/getting-started/configure/)
## Install Client SDK for ADK
ADK relies on the `toolbox-adk` Python package to use MCP Toolbox. Install it via the `toolbox` extra before getting started:
```shell
pip install google-adk[toolbox]
```
### Loading MCP Toolbox Tools
Once your MCP Toolbox server is configured and running, you can load tools from it using ADK:
```python
from google.adk import Agent
from google.adk.tools.toolbox_toolset import ToolboxToolset

toolset = ToolboxToolset(
    server_url="http://127.0.0.1:5000"
)

root_agent = Agent(
    ...,
    tools=[toolset]  # Provide the toolset to the Agent
)
```
### Authentication
The `ToolboxToolset` supports various authentication strategies including Workload Identity (ADC), User Identity (OAuth2), and API Keys. For full documentation, see the [MCP Toolbox ADK Authentication Guide](https://github.com/googleapis/mcp-toolbox-sdk-python/tree/main/packages/toolbox-adk#authentication).
**Example: Workload Identity (ADC)**
Recommended for Cloud Run, GKE, or local development with `gcloud auth login`.
```python
from google.adk.tools.toolbox_toolset import ToolboxToolset
from toolbox_adk import CredentialStrategy

# target_audience: The URL of your MCP Toolbox server
creds = CredentialStrategy.workload_identity(target_audience="")

toolset = ToolboxToolset(
    server_url="",
    credentials=creds
)
```
### Advanced Configuration
You can configure parameter binding and additional headers. See the [MCP Toolbox ADK documentation](https://github.com/googleapis/mcp-toolbox-sdk-python/tree/main/packages/toolbox-adk) for details. For example, you can bind values to tool parameters.
Note
These values are hidden from the model.
```python
toolset = ToolboxToolset(
    server_url="...",
    bound_params={
        "region": "us-central1",
        "api_key": lambda: get_api_key()  # Can be a callable
    }
)
```
ADK relies on the `@toolbox-sdk/adk` TypeScript package to use MCP Toolbox. Install the package before getting started:
```shell
npm install @toolbox-sdk/adk
```
### Loading MCP Toolbox Tools
Once your MCP Toolbox server is configured and running, you can load tools from it using ADK:
```typescript
import {InMemoryRunner, LlmAgent} from '@google/adk';
import {Content} from '@google/genai';
import {ToolboxClient} from '@toolbox-sdk/adk';

const toolboxClient = new ToolboxClient("http://127.0.0.1:5000");
const loadedTools = await toolboxClient.loadToolset();

export const rootAgent = new LlmAgent({
  name: 'weather_time_agent',
  model: 'gemini-2.5-flash',
  description:
    'Agent to answer questions about the time and weather in a city.',
  instruction:
    'You are a helpful agent who can answer user questions about the time and weather in a city.',
  tools: loadedTools,
});

async function main() {
  const userId = 'test_user';
  const appName = rootAgent.name;
  const runner = new InMemoryRunner({agent: rootAgent, appName});
  const session = await runner.sessionService.createSession({
    appName,
    userId,
  });
  const prompt = 'What is the weather in New York? And the time?';
  const content: Content = {
    role: 'user',
    parts: [{text: prompt}],
  };
  console.log(content);
  for await (const e of runner.runAsync({
    userId,
    sessionId: session.id,
    newMessage: content,
  })) {
    if (e.content?.parts?.[0]?.text) {
      console.log(`${e.author}: ${JSON.stringify(e.content, null, 2)}`);
    }
  }
}

main().catch(console.error);
```
ADK relies on the `mcp-toolbox-sdk-go` Go module to use MCP Toolbox. Install the module before getting started:
```shell
go get github.com/googleapis/mcp-toolbox-sdk-go
```
### Loading MCP Toolbox Tools
Once your MCP Toolbox server is configured and running, you can load tools from it using ADK:
```go
package main

import (
	"context"
	"log"

	"github.com/googleapis/mcp-toolbox-sdk-go/tbadk"
	"google.golang.org/adk/agent/llmagent"
	"google.golang.org/adk/tool"
)

func main() {
	ctx := context.Background()

	toolboxClient, err := tbadk.NewToolboxClient("http://127.0.0.1:5000")
	if err != nil {
		log.Fatalf("Failed to create MCP Toolbox client: %v", err)
	}

	// Load a specific set of tools
	toolboxTools, err := toolboxClient.LoadToolset("my-toolset-name", ctx)
	if err != nil {
		log.Fatalf("Could not load MCP Toolbox toolset: %v", err)
	}
	toolsList := make([]tool.Tool, len(toolboxTools))
	for i := range toolboxTools {
		toolsList[i] = &toolboxTools[i]
	}
	agent, err := llmagent.New(llmagent.Config{
		...,
		Tools: toolsList,
	})

	// Load a single tool
	singleTool, err := toolboxClient.LoadTool("my-tool-name", ctx)
	if err != nil {
		log.Fatalf("Could not load MCP Toolbox tool: %v", err)
	}
	agent, err = llmagent.New(llmagent.Config{
		...,
		Tools: []tool.Tool{&singleTool},
	})
	_ = agent
}
```
## Advanced MCP Toolbox Features
MCP Toolbox has a variety of features that make developing Gen AI tools for databases easier. For more information, read about the following features:
- [Authenticated Parameters](https://googleapis.github.io/genai-toolbox/resources/tools/#authenticated-parameters): bind tool inputs to values from OIDC tokens automatically, making it easy to run sensitive queries without potentially leaking data
- [Authorized Invocations](https://googleapis.github.io/genai-toolbox/resources/tools/#authorized-invocations): restrict access to a tool based on the user's auth token
- [OpenTelemetry](https://googleapis.github.io/genai-toolbox/how-to/export_telemetry/): get metrics and tracing from Toolbox with OpenTelemetry
# MLflow observability for ADK
Supported in ADK: Python
[MLflow Tracing](https://mlflow.org/docs/latest/genai/tracing/) provides first-class support for ingesting OpenTelemetry (OTel) traces. Google ADK emits OTel spans for agent runs, tool calls, and model requests, which you can send directly to an MLflow Tracking Server for analysis and debugging.
## Prerequisites
- MLflow version 3.6.0 or newer. OpenTelemetry ingestion is only supported in MLflow 3.6.0+.
- A SQL-based backend store (e.g., SQLite, PostgreSQL, MySQL). File-based stores do not support OTLP ingestion.
- Google ADK installed in your environment.
## Install dependencies
```bash
pip install "mlflow>=3.6.0" google-adk opentelemetry-sdk opentelemetry-exporter-otlp-proto-http
```
## Start the MLflow Tracking Server
Start MLflow with a SQL backend and a port (5000 in this example):
```bash
mlflow server --backend-store-uri sqlite:///mlflow.db --port 5000
```
You can point `--backend-store-uri` to other SQL backends (PostgreSQL, MySQL, MSSQL). OTLP ingestion is not supported with file-based backends.
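For example, a PostgreSQL backend could be started like this (the user, password, host, and database name below are placeholders for your own deployment):

```bash
mlflow server \
  --backend-store-uri postgresql://mlflow_user:mlflow_pass@db.example.com:5432/mlflow \
  --port 5000
```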
## Configure OpenTelemetry (required)
You must configure an OTLP exporter and set a global tracer provider before using any ADK components so that spans are emitted to MLflow.
Initialize the OTLP exporter and global tracer provider in code before importing or constructing ADK agents/tools:
```python
# my_agent/agent.py
from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor

exporter = OTLPSpanExporter(
    endpoint="http://localhost:5000/v1/traces",
    headers={"x-mlflow-experiment-id": "123"}  # replace with your experiment id
)
provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(exporter))
trace.set_tracer_provider(provider)  # set BEFORE importing/using ADK
```
This configures the OpenTelemetry pipeline and sends ADK spans to the MLflow server on each run.
## Example: Trace an ADK agent
Now you can add the agent code for a simple math agent, after the code that sets up the OTLP exporter and tracer provider:
```python
# my_agent/agent.py
from google.adk.agents import LlmAgent
from google.adk.tools import FunctionTool

def calculator(a: float, b: float) -> str:
    """Add two numbers and return the result."""
    return str(a + b)

calculator_tool = FunctionTool(func=calculator)

root_agent = LlmAgent(
    name="MathAgent",
    model="gemini-2.0-flash-exp",
    instruction=(
        "You are a helpful assistant that can do math. "
        "When asked a math problem, use the calculator tool to solve it."
    ),
    tools=[calculator_tool],
)
```
```
Run the agent with:
```bash
adk run my_agent
```
And ask it a math problem:
```console
What is 12 + 34?
```
You should then see output similar to:
```console
[MathAgent]: The answer is 46.
```
## View traces in MLflow
Open the MLflow UI at `http://localhost:5000`, select your experiment, and inspect the trace tree and spans generated by your ADK agent.
## Tips
- Set the tracer provider before importing or initializing ADK objects so all spans are captured.
- Behind a proxy or on a remote host, replace `localhost:5000` with your server address.
## Resources
- [MLflow Tracing Documentation](https://mlflow.org/docs/latest/genai/tracing/): Official documentation for MLflow Tracing that covers other library integrations and downstream usage of traces, such as evaluation, monitoring, searching, and more.
- [OpenTelemetry in MLflow](https://mlflow.org/docs/latest/genai/tracing/opentelemetry/): Detailed guide on how to use OpenTelemetry with MLflow.
- [MLflow for Agents](https://mlflow.org/docs/latest/genai/): Comprehensive guide on how to use MLflow for building production-ready agents.
# MongoDB MCP tool for ADK
Supported in ADK: Python, TypeScript
The [MongoDB MCP Server](https://github.com/mongodb-js/mongodb-mcp-server) connects your ADK agent to [MongoDB](https://www.mongodb.com/) databases and MongoDB Atlas clusters. This integration gives your agent the ability to query collections, manage databases, and interact with MongoDB Atlas infrastructure using natural language.
## Use cases
- **Data Exploration and Analysis**: Query MongoDB collections using natural language, run aggregations, and analyze document schemas without writing complex queries manually.
- **Database Administration**: List databases and collections, create indexes, manage users, and monitor database statistics through conversational commands.
- **Atlas Infrastructure Management**: Create and manage MongoDB Atlas clusters, configure access lists, and view performance recommendations directly from your agent.
## Prerequisites
- **For database access**: A MongoDB connection string (local, self-hosted, or Atlas cluster)
- **For Atlas management**: A [MongoDB Atlas](https://www.mongodb.com/atlas) service account with API credentials (client ID and secret)
## Use with agent
```python
from google.adk.agents import Agent
from google.adk.tools.mcp_tool import McpToolset
from google.adk.tools.mcp_tool.mcp_session_manager import StdioConnectionParams
from mcp import StdioServerParameters

# For database access, use a connection string:
CONNECTION_STRING = "mongodb://localhost:27017/myDatabase"

# For Atlas management, use API credentials:
# ATLAS_CLIENT_ID = "YOUR_ATLAS_CLIENT_ID"
# ATLAS_CLIENT_SECRET = "YOUR_ATLAS_CLIENT_SECRET"

root_agent = Agent(
    model="gemini-2.5-pro",
    name="mongodb_agent",
    instruction="Help users query and manage MongoDB databases",
    tools=[
        McpToolset(
            connection_params=StdioConnectionParams(
                server_params=StdioServerParameters(
                    command="npx",
                    args=[
                        "-y",
                        "mongodb-mcp-server",
                        "--readOnly",  # Remove for write operations
                    ],
                    env={
                        # For database access, use:
                        "MDB_MCP_CONNECTION_STRING": CONNECTION_STRING,
                        # For Atlas management, use:
                        # "MDB_MCP_API_CLIENT_ID": ATLAS_CLIENT_ID,
                        # "MDB_MCP_API_CLIENT_SECRET": ATLAS_CLIENT_SECRET,
                    },
                ),
                timeout=30,
            ),
        )
    ],
)
```
```typescript
import { LlmAgent, MCPToolset } from "@google/adk";

// For database access, use a connection string:
const CONNECTION_STRING = "mongodb://localhost:27017/myDatabase";

// For Atlas management, use API credentials:
// const ATLAS_CLIENT_ID = "YOUR_ATLAS_CLIENT_ID";
// const ATLAS_CLIENT_SECRET = "YOUR_ATLAS_CLIENT_SECRET";

const rootAgent = new LlmAgent({
  model: "gemini-2.5-pro",
  name: "mongodb_agent",
  instruction: "Help users query and manage MongoDB databases",
  tools: [
    new MCPToolset({
      type: "StdioConnectionParams",
      serverParams: {
        command: "npx",
        args: [
          "-y",
          "mongodb-mcp-server",
          "--readOnly", // Remove for write operations
        ],
        env: {
          // For database access, use:
          MDB_MCP_CONNECTION_STRING: CONNECTION_STRING,
          // For Atlas management, use:
          // MDB_MCP_API_CLIENT_ID: ATLAS_CLIENT_ID,
          // MDB_MCP_API_CLIENT_SECRET: ATLAS_CLIENT_SECRET,
        },
      },
    }),
  ],
});

export { rootAgent };
```
## Available tools
### MongoDB database tools
| Tool | Description |
| -------------------- | ----------------------------------------------- |
| `find` | Run a find query against a MongoDB collection |
| `aggregate` | Run an aggregation against a MongoDB collection |
| `count` | Get the number of documents in a collection |
| `list-databases` | List all databases for a MongoDB connection |
| `list-collections` | List all collections for a given database |
| `collection-schema` | Describe the schema for a collection |
| `collection-indexes` | Describe the indexes for a collection |
| `insert-many` | Insert documents into a collection |
| `update-many` | Update documents matching a filter |
| `delete-many` | Remove documents matching a filter |
| `create-collection` | Create a new collection |
| `drop-collection` | Remove a collection from the database |
| `drop-database` | Remove a database |
| `create-index` | Create an index for a collection |
| `drop-index` | Drop an index from a collection |
| `rename-collection` | Rename a collection |
| `db-stats` | Get statistics for a database |
| `explain` | Get query execution statistics |
| `export` | Export query results in EJSON format |
### MongoDB Atlas tools
Note
Atlas tools require API credentials. Set `MDB_MCP_API_CLIENT_ID` and `MDB_MCP_API_CLIENT_SECRET` environment variables to enable them.
| Tool | Description |
| ------------------------------- | -------------------------------- |
| `atlas-list-orgs` | List MongoDB Atlas organizations |
| `atlas-list-projects` | List MongoDB Atlas projects |
| `atlas-list-clusters` | List MongoDB Atlas clusters |
| `atlas-inspect-cluster` | Inspect metadata of a cluster |
| `atlas-list-db-users` | List database users |
| `atlas-create-free-cluster` | Create a free Atlas cluster |
| `atlas-create-project` | Create an Atlas project |
| `atlas-create-db-user` | Create a database user |
| `atlas-create-access-list` | Configure IP access list |
| `atlas-inspect-access-list` | View IP access list entries |
| `atlas-list-alerts` | List Atlas alerts |
| `atlas-get-performance-advisor` | Get performance recommendations |
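To enable the Atlas tools, export service-account credentials before launching your agent (the values below are placeholders for your own Atlas client ID and secret):

```shell
export MDB_MCP_API_CLIENT_ID="your-atlas-client-id"
export MDB_MCP_API_CLIENT_SECRET="your-atlas-client-secret"
```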
## Configuration
### Environment variables
| Variable | Description |
| --------------------------- | --------------------------------------------- |
| `MDB_MCP_CONNECTION_STRING` | MongoDB connection string for database access |
| `MDB_MCP_API_CLIENT_ID` | Atlas API client ID for Atlas tools |
| `MDB_MCP_API_CLIENT_SECRET` | Atlas API client secret for Atlas tools |
| `MDB_MCP_READ_ONLY` | Enable read-only mode (`true` or `false`) |
| `MDB_MCP_DISABLED_TOOLS` | Comma-separated list of tools to disable |
| `MDB_MCP_LOG_PATH` | Directory for log files |
### Read-only mode
The `--readOnly` flag restricts the server to read, connect, and metadata operations only. This prevents any create, update, or delete operations, making it safe for data exploration without risk of accidental modifications.
### Disabling tools
You can disable specific tools or categories using `MDB_MCP_DISABLED_TOOLS`:
- Tool names: `find`, `aggregate`, `insert-many`, etc.
- Categories: `atlas` (all Atlas tools), `mongodb` (all database tools)
- Operation types: `create`, `update`, `delete`, `read`, `metadata`
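For example, to block destructive database operations and every Atlas tool at once, you might set (tool and category names taken from the lists above):

```shell
export MDB_MCP_DISABLED_TOOLS="drop-database,drop-collection,delete-many,atlas"
```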
## Additional resources
- [MongoDB MCP Server Repository](https://github.com/mongodb-js/mongodb-mcp-server)
- [MongoDB Documentation](https://www.mongodb.com/docs/)
- [MongoDB Atlas](https://www.mongodb.com/atlas)
# Monocle observability for ADK
Supported in ADK: Python
[Monocle](https://github.com/monocle2ai/monocle) is an open-source observability platform for monitoring, debugging, and improving LLM applications and AI Agents. It provides comprehensive tracing capabilities for your Google ADK applications through automatic instrumentation. Monocle generates OpenTelemetry-compatible traces that can be exported to various destinations including local files or console output.
## Overview
Monocle automatically instruments Google ADK applications, allowing you to:
- **Trace agent interactions** - Automatically capture every agent run, tool call, and model request with full context and metadata
- **Monitor execution flow** - Track agent state, delegation events, and execution flow through detailed traces
- **Debug issues** - Analyze detailed traces to quickly identify bottlenecks, failed tool calls, and unexpected agent behavior
- **Flexible export options** - Export traces to local files or console for analysis
- **OpenTelemetry compatible** - Generate standard OpenTelemetry traces that work with any OTLP-compatible backend
Monocle automatically instruments the following Google ADK components:
- **`BaseAgent.run_async`** - Captures agent execution, agent state, and delegation events
- **`FunctionTool.run_async`** - Captures tool execution, including tool name, parameters, and results
- **`Runner.run_async`** - Captures runner execution, including request context and execution flow
## Installation
### 1. Install Required Packages
```bash
pip install monocle_apptrace google-adk
```
## Setup
### 1. Configure Monocle Telemetry
Monocle automatically instruments Google ADK when you initialize telemetry. Simply call `setup_monocle_telemetry()` at the start of your application:
```python
from monocle_apptrace import setup_monocle_telemetry

# Initialize Monocle telemetry - automatically instruments Google ADK
setup_monocle_telemetry(workflow_name="my-adk-app")
```
That's it! Monocle will automatically detect and instrument your Google ADK agents, tools, and runners.
### 2. Configure Exporters (Optional)
By default, Monocle exports traces to local JSON files. You can configure different exporters using environment variables.
#### Export to Console (for debugging)
Set the environment variable:
```bash
export MONOCLE_EXPORTER="console"
```
#### Export to Local Files (default)
```bash
export MONOCLE_EXPORTER="file"
```
Or simply omit the `MONOCLE_EXPORTER` variable - it defaults to `file`.
## Observe
Now that you have tracing set up, all Google ADK SDK requests will be automatically traced by Monocle.
```python
import asyncio

from monocle_apptrace import setup_monocle_telemetry
from google.adk.agents import Agent
from google.adk.runners import InMemoryRunner
from google.genai import types

# Initialize Monocle telemetry - must be called before using ADK
setup_monocle_telemetry(workflow_name="weather_app")

# Define a tool function
def get_weather(city: str) -> dict:
    """Retrieves the current weather report for a specified city.

    Args:
        city (str): The name of the city for which to retrieve the weather report.

    Returns:
        dict: status and result or error msg.
    """
    if city.lower() == "new york":
        return {
            "status": "success",
            "report": (
                "The weather in New York is sunny with a temperature of 25 degrees"
                " Celsius (77 degrees Fahrenheit)."
            ),
        }
    else:
        return {
            "status": "error",
            "error_message": f"Weather information for '{city}' is not available.",
        }

# Create an agent with tools
agent = Agent(
    name="weather_agent",
    model="gemini-2.0-flash-exp",
    description="Agent to answer questions using weather tools.",
    instruction="You must use the available tools to find an answer.",
    tools=[get_weather]
)

async def main():
    app_name = "weather_app"
    user_id = "test_user"
    session_id = "test_session"

    runner = InMemoryRunner(agent=agent, app_name=app_name)
    await runner.session_service.create_session(
        app_name=app_name,
        user_id=user_id,
        session_id=session_id
    )

    # Run the agent (all interactions will be automatically traced)
    async for event in runner.run_async(
        user_id=user_id,
        session_id=session_id,
        new_message=types.Content(role="user", parts=[
            types.Part(text="What is the weather in New York?")]
        )
    ):
        if event.is_final_response():
            print(event.content.parts[0].text.strip())

asyncio.run(main())
```
## Accessing Traces
By default, Monocle generates traces in JSON files in the local directory `./monocle`. The file name format is:
```text
monocle_trace_{workflow_name}_{trace_id}_{timestamp}.json
```
Each trace file contains an array of OpenTelemetry-compatible spans that capture:
- **Agent execution spans** - Agent state, delegation events, and execution flow
- **Tool execution spans** - Tool name, input parameters, and output results
- **LLM interaction spans** - Model calls, prompts, responses, and token usage (if using Gemini or other LLMs)
You can analyze these trace files using any OpenTelemetry-compatible tool or write custom analysis scripts.
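As a minimal sketch of such a script, the following tallies span names across the trace files in `./monocle`. It assumes (per the description above) that each file holds a JSON array of span objects, and that each span carries a `name` field:

```python
import glob
import json
from collections import Counter

def summarize_trace(path: str) -> Counter:
    """Count span names in one Monocle trace file (a JSON array of spans)."""
    with open(path) as f:
        spans = json.load(f)
    return Counter(span.get("name", "<unnamed>") for span in spans)

if __name__ == "__main__":
    for path in sorted(glob.glob("./monocle/monocle_trace_*.json")):
        print(path)
        for name, count in summarize_trace(path).most_common():
            print(f"  {name}: {count}")
```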
## Visualizing Traces with VS Code Extension
The [Okahu Trace Visualizer](https://marketplace.visualstudio.com/items?itemName=OkahuAI.okahu-ai-observability) VS Code extension provides an interactive way to visualize and analyze Monocle-generated traces directly in Visual Studio Code.
### Installation
1. Open VS Code
1. Press `Ctrl+P` (or `Cmd+P` on Mac) to open Quick Open
1. Paste the following command and press Enter:
```text
ext install OkahuAI.okahu-ai-observability
```
Alternatively, you can install it from the [VS Code Marketplace](https://marketplace.visualstudio.com/items?itemName=OkahuAI.okahu-ai-observability).
### Features
The extension provides:
- **Custom Activity Bar Panel** - Dedicated sidebar for trace file management
- **Interactive File Tree** - Browse and select trace files with custom React UI
- **Split View Analysis** - Gantt chart visualization alongside JSON data viewer
- **Real-time Communication** - Seamless data flow between VS Code and React components
- **VS Code Theming** - Fully integrated with VS Code's light/dark themes
### Usage
1. After running your ADK application with Monocle tracing enabled, trace files will be generated in the `./monocle` directory
1. Open the Okahu Trace Visualizer panel from the VS Code Activity Bar
1. Browse and select trace files from the interactive file tree
1. View your traces with:
1. **Gantt chart visualization** - See the timeline and hierarchy of spans
1. **JSON data viewer** - Inspect detailed span attributes and events
1. **Token counts** - View token usage for LLM calls
1. **Error badges** - Quickly identify failed operations
## What Gets Traced
Monocle automatically captures the following information from Google ADK:
- **Agent Execution**: Agent state, delegation events, and execution flow
- **Tool Calls**: Tool name, input parameters, and output results
- **Runner Execution**: Request context and overall execution flow
- **Timing Information**: Start time, end time, and duration for each operation
- **Error Information**: Exceptions and error states
All traces are generated in OpenTelemetry format, making them compatible with any OTLP-compatible observability backend.
## Support and Resources
- [Monocle Documentation](https://docs.okahu.ai/monocle_overview/)
- [Monocle GitHub Repository](https://github.com/monocle2ai/monocle)
- [Google ADK Travel Agent Example](https://github.com/okahu-demos/adk-travel-agent)
- [Discord Community](https://discord.gg/D8vDbSUhJX)
# n8n MCP tool for ADK
Supported in ADK: Python, TypeScript
The [n8n MCP Server](https://docs.n8n.io/advanced-ai/accessing-n8n-mcp-server/) connects your ADK agent to [n8n](https://n8n.io/), an extendable workflow automation tool. This integration allows your agent to securely connect to an n8n instance to search, inspect, and trigger workflows directly from a natural language interface.
Alternative: Workflow-level MCP Server
The configuration guide on this page covers **Instance-level MCP access**, which connects your agent to a central hub of enabled workflows. Alternatively, you can use the [MCP Server Trigger node](https://docs.n8n.io/integrations/builtin/core-nodes/n8n-nodes-langchain.mcptrigger/) to make a **single workflow** act as its own standalone MCP server. This method is useful if you want to craft specific server behaviors or expose tools isolated to one workflow.
## Use cases
- **Execute Complex Workflows**: Trigger multi-step business processes defined in n8n directly from your agent, leveraging reliable branching logic, loops, and error handling to ensure consistency.
- **Connect to External Apps**: Access pre-built integrations through n8n without writing custom tools for each service, eliminating the need to manage API authentication, headers, or boilerplate code.
- **Data Processing**: Offload complex data transformation tasks to n8n workflows, such as converting natural language into API calls or scraping and summarizing webpages, utilizing custom Python or JavaScript nodes for precise data shaping.
## Prerequisites
- An active n8n instance
- MCP access enabled in settings
- A valid MCP access token
Refer to the [n8n MCP documentation](https://docs.n8n.io/advanced-ai/accessing-n8n-mcp-server/) for detailed setup instructions.
## Use with agent
```python
from google.adk.agents import Agent
from google.adk.tools.mcp_tool import McpToolset
from google.adk.tools.mcp_tool.mcp_session_manager import StdioConnectionParams
from mcp import StdioServerParameters

N8N_INSTANCE_URL = "https://localhost:5678"
N8N_MCP_TOKEN = "YOUR_N8N_MCP_TOKEN"

root_agent = Agent(
    model="gemini-2.5-pro",
    name="n8n_agent",
    instruction="Help users manage and execute workflows in n8n",
    tools=[
        McpToolset(
            connection_params=StdioConnectionParams(
                server_params=StdioServerParameters(
                    command="npx",
                    args=[
                        "-y",
                        "supergateway",
                        "--streamableHttp",
                        f"{N8N_INSTANCE_URL}/mcp-server/http",
                        "--header",
                        f"authorization:Bearer {N8N_MCP_TOKEN}"
                    ]
                ),
                timeout=300,
            ),
        )
    ],
)
```
```python
from google.adk.agents import Agent
from google.adk.tools.mcp_tool import McpToolset
from google.adk.tools.mcp_tool.mcp_session_manager import StreamableHTTPServerParams

N8N_INSTANCE_URL = "https://localhost:5678"
N8N_MCP_TOKEN = "YOUR_N8N_MCP_TOKEN"

root_agent = Agent(
    model="gemini-2.5-pro",
    name="n8n_agent",
    instruction="Help users manage and execute workflows in n8n",
    tools=[
        McpToolset(
            connection_params=StreamableHTTPServerParams(
                url=f"{N8N_INSTANCE_URL}/mcp-server/http",
                headers={
                    "Authorization": f"Bearer {N8N_MCP_TOKEN}",
                },
            ),
        )
    ],
)
```
```typescript
import { LlmAgent, MCPToolset } from "@google/adk";

const N8N_INSTANCE_URL = "https://localhost:5678";
const N8N_MCP_TOKEN = "YOUR_N8N_MCP_TOKEN";

const rootAgent = new LlmAgent({
  model: "gemini-2.5-pro",
  name: "n8n_agent",
  instruction: "Help users manage and execute workflows in n8n",
  tools: [
    new MCPToolset({
      type: "StdioConnectionParams",
      serverParams: {
        command: "npx",
        args: [
          "-y",
          "supergateway",
          "--streamableHttp",
          `${N8N_INSTANCE_URL}/mcp-server/http`,
          "--header",
          `authorization:Bearer ${N8N_MCP_TOKEN}`,
        ],
      },
    }),
  ],
});

export { rootAgent };
```
```typescript
import { LlmAgent, MCPToolset } from "@google/adk";

const N8N_INSTANCE_URL = "https://localhost:5678";
const N8N_MCP_TOKEN = "YOUR_N8N_MCP_TOKEN";

const rootAgent = new LlmAgent({
  model: "gemini-2.5-pro",
  name: "n8n_agent",
  instruction: "Help users manage and execute workflows in n8n",
  tools: [
    new MCPToolset({
      type: "StreamableHTTPConnectionParams",
      url: `${N8N_INSTANCE_URL}/mcp-server/http`,
      header: {
        Authorization: `Bearer ${N8N_MCP_TOKEN}`,
      },
    }),
  ],
});

export { rootAgent };
```
## Available tools
| Tool | Description |
| ---------------------- | ------------------------------------------------------- |
| `search_workflows` | Search for available workflows |
| `execute_workflow` | Execute a specific workflow |
| `get_workflow_details` | Retrieve metadata and schema information for a workflow |
## Configuration
To make workflows accessible to your agent, they must meet the following criteria:
- **Be Active**: The workflow must be activated in n8n.
- **Supported Trigger**: Contain a Webhook, Schedule, Chat, or Form trigger node.
- **Enabled for MCP**: You must toggle "Available in MCP" in the workflow settings or select "Enable MCP access" from the workflow card menu.
## Additional resources
- [n8n MCP Server Documentation](https://docs.n8n.io/advanced-ai/accessing-n8n-mcp-server/)
# Notion MCP tool for ADK
Supported in ADK: Python, TypeScript
The [Notion MCP Server](https://github.com/makenotion/notion-mcp-server) connects your ADK agent to Notion, allowing it to search, create, and manage pages, databases, and more within a workspace. This gives your agent the ability to query, create, and organize content in your Notion workspace using natural language.
## Use cases
- **Search your workspace**: Find project pages, meeting notes, or documents based on content.
- **Create new content**: Generate new pages for meeting notes, project plans, or tasks.
- **Manage tasks and databases**: Update the status of a task, add items to a database, or change properties.
- **Organize your workspace**: Move pages, duplicate templates, or add comments to documents.
## Prerequisites
- Obtain a Notion integration token by going to [Notion Integrations](https://www.notion.so/profile/integrations) in your profile. Refer to the [authorization documentation](https://developers.notion.com/docs/authorization) for more details.
- Ensure relevant pages and databases can be accessed by your integration. Visit the Access tab in your [Notion Integration](https://www.notion.so/profile/integrations) settings, then grant access by selecting the pages you'd like to use.
## Use with agent
```python
from google.adk.agents import Agent
from google.adk.tools.mcp_tool import McpToolset
from google.adk.tools.mcp_tool.mcp_session_manager import StdioConnectionParams
from mcp import StdioServerParameters
NOTION_TOKEN = "YOUR_NOTION_TOKEN"
root_agent = Agent(
model="gemini-2.5-pro",
name="notion_agent",
instruction="Help users get information from Notion",
tools=[
McpToolset(
connection_params=StdioConnectionParams(
server_params=StdioServerParameters(
command="npx",
args=[
"-y",
"@notionhq/notion-mcp-server",
],
env={
"NOTION_TOKEN": NOTION_TOKEN,
}
),
timeout=30,
),
)
],
)
```
```typescript
import { LlmAgent, MCPToolset } from "@google/adk";
const NOTION_TOKEN = "YOUR_NOTION_TOKEN";
const rootAgent = new LlmAgent({
model: "gemini-2.5-pro",
name: "notion_agent",
instruction: "Help users get information from Notion",
tools: [
new MCPToolset({
type: "StdioConnectionParams",
serverParams: {
command: "npx",
args: ["-y", "@notionhq/notion-mcp-server"],
env: {
NOTION_TOKEN: NOTION_TOKEN,
},
},
}),
],
});
export { rootAgent };
```
## Available tools
| Tool | Description |
| ------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `notion-search` | Search across your Notion workspace and connected tools like Slack, Google Drive, and Jira. Falls back to basic workspace search if AI features aren’t available. |
| `notion-fetch` | Retrieves content from a Notion page or database by its URL |
| `notion-create-pages` | Creates one or more Notion pages with specified properties and content. |
| `notion-update-page` | Update a Notion page's properties or content. |
| `notion-move-pages` | Move one or more Notion pages or databases to a new parent. |
| `notion-duplicate-page` | Duplicate a Notion page within your workspace. This action completes asynchronously. |
| `notion-create-database` | Creates a new Notion database, initial data source, and initial view with the specified properties. |
| `notion-update-database` | Update a Notion data source's properties, name, description, or other attributes. |
| `notion-create-comment` | Add a comment to a page |
| `notion-get-comments` | Lists all comments on a specific page, including threaded discussions. |
| `notion-get-teams` | Retrieves a list of teams (teamspaces) in the current workspace. |
| `notion-get-users` | Lists all users in the workspace with their details. |
| `notion-get-user` | Retrieves information about a specific user by ID |
| `notion-get-self` | Retrieves information about your own bot user and the Notion workspace you’re connected to. |
## Additional resources
- [Notion MCP Server Documentation](https://developers.notion.com/docs/mcp)
- [Notion MCP Server Repository](https://github.com/makenotion/notion-mcp-server)
# PayPal MCP tool for ADK
Supported in ADK: Python, TypeScript
The [PayPal MCP Server](https://github.com/paypal/paypal-mcp-server) connects your ADK agent to the [PayPal](https://www.paypal.com/) ecosystem. This integration gives your agent the ability to manage payments, invoices, subscriptions, and disputes using natural language, enabling automated commerce workflows and business insights.
## Use cases
- **Streamline Financial Operations**: Create orders, send invoices, and process refunds directly through chat without switching context. You can instruct your agent to "bill Client X" or "refund order Y" immediately.
- **Manage Subscriptions & Products**: Handle the full lifecycle of recurring billing by creating products, setting up subscription plans, and managing subscriber details using natural language.
- **Resolve Issues & Track Performance**: Summarize and accept dispute claims, track shipment statuses, and retrieve merchant insights to make data-driven decisions on the fly.
## Prerequisites
- Create a [PayPal Developer account](https://developer.paypal.com/)
- Create an app and retrieve your credentials from the [PayPal Developer Dashboard](https://developer.paypal.com/)
- [Generate an access token](https://developer.paypal.com/reference/get-an-access-token/) from your credentials
## Use with agent
```python
from google.adk.agents import Agent
from google.adk.tools.mcp_tool import McpToolset
from google.adk.tools.mcp_tool.mcp_session_manager import StdioConnectionParams
from mcp import StdioServerParameters
PAYPAL_ENVIRONMENT = "SANDBOX" # Options: "SANDBOX" or "PRODUCTION"
PAYPAL_ACCESS_TOKEN = "YOUR_PAYPAL_ACCESS_TOKEN"
root_agent = Agent(
model="gemini-2.5-pro",
name="paypal_agent",
instruction="Help users manage their PayPal account",
tools=[
McpToolset(
connection_params=StdioConnectionParams(
server_params=StdioServerParameters(
command="npx",
args=[
"-y",
"@paypal/mcp",
"--tools=all",
# (Optional) Specify which tools to enable
# "--tools=subscriptionPlans.list,subscriptionPlans.show",
],
env={
"PAYPAL_ACCESS_TOKEN": PAYPAL_ACCESS_TOKEN,
"PAYPAL_ENVIRONMENT": PAYPAL_ENVIRONMENT,
}
),
timeout=300,
),
)
],
)
```
```python
from google.adk.agents import Agent
from google.adk.tools.mcp_tool import McpToolset
from google.adk.tools.mcp_tool.mcp_session_manager import SseConnectionParams
PAYPAL_MCP_ENDPOINT = "https://mcp.sandbox.paypal.com/sse" # Production: https://mcp.paypal.com/sse
PAYPAL_ACCESS_TOKEN = "YOUR_PAYPAL_ACCESS_TOKEN"
root_agent = Agent(
model="gemini-2.5-pro",
name="paypal_agent",
instruction="Help users manage their PayPal account",
tools=[
McpToolset(
connection_params=SseConnectionParams(
url=PAYPAL_MCP_ENDPOINT,
headers={
"Authorization": f"Bearer {PAYPAL_ACCESS_TOKEN}",
},
),
)
],
)
```
```typescript
import { LlmAgent, MCPToolset } from "@google/adk";
const PAYPAL_ENVIRONMENT = "SANDBOX"; // Options: "SANDBOX" or "PRODUCTION"
const PAYPAL_ACCESS_TOKEN = "YOUR_PAYPAL_ACCESS_TOKEN";
const rootAgent = new LlmAgent({
model: "gemini-2.5-pro",
name: "paypal_agent",
instruction: "Help users manage their PayPal account",
tools: [
new MCPToolset({
type: "StdioConnectionParams",
serverParams: {
command: "npx",
args: [
"-y",
"@paypal/mcp",
"--tools=all",
// (Optional) Specify which tools to enable
// "--tools=subscriptionPlans.list,subscriptionPlans.show",
],
env: {
PAYPAL_ACCESS_TOKEN: PAYPAL_ACCESS_TOKEN,
PAYPAL_ENVIRONMENT: PAYPAL_ENVIRONMENT,
},
},
}),
],
});
export { rootAgent };
```
Note
**Token Expiration**: PayPal access tokens have a limited lifespan of 3-8 hours. If your agent stops working, check whether your token has expired and generate a new one if necessary. For long-running agents, consider implementing token refresh logic.
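Since tokens expire after a few hours, a common pattern is to cache the token and re-fetch it shortly before expiry. Here is a minimal sketch of that pattern, assuming you supply a `fetch_token()` callable (for example, one that posts your client credentials to PayPal's OAuth token endpoint) returning the token and its lifetime in seconds:

```python
import time


class TokenCache:
    """Caches an access token and refreshes it shortly before expiry.

    `fetch_token` is a user-supplied callable returning
    (access_token, expires_in_seconds); it is an assumption here,
    not part of any PayPal or ADK API.
    """

    def __init__(self, fetch_token, refresh_margin=300):
        self._fetch_token = fetch_token
        self._refresh_margin = refresh_margin  # refresh 5 minutes early
        self._token = None
        self._expires_at = 0.0

    def get(self):
        # Refresh if no token yet, or we are inside the safety margin.
        if self._token is None or time.time() >= self._expires_at - self._refresh_margin:
            token, expires_in = self._fetch_token()
            self._token = token
            self._expires_at = time.time() + expires_in
        return self._token
```

You would then call `cache.get()` whenever you build the `Authorization` header, rather than hard-coding a token that will eventually go stale.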
## Available tools
### Catalog management
| Tool | Description |
| ---------------------- | ---------------------------------------------------------- |
| `create_product` | Create a new product in the PayPal catalog |
| `list_products` | List products from the PayPal catalog |
| `show_product_details` | Show details of a specific product from the PayPal catalog |
| `update_product` | Update an existing product in the PayPal catalog |
### Dispute management
| Tool | Description |
| ---------------------- | ---------------------------------------------------------- |
| `list_disputes` | Retrieve a summary of all disputes with optional filtering |
| `get_dispute` | Retrieve detailed information about a specific dispute |
| `accept_dispute_claim` | Accept a dispute claim, resolving it in favor of the buyer |
### Invoices
| Tool | Description |
| -------------------------- | --------------------------------------------------- |
| `create_invoice` | Create a new invoice in the PayPal system |
| `list_invoices` | List invoices |
| `get_invoice` | Retrieve details about a specific invoice |
| `send_invoice` | Send an existing invoice to the specified recipient |
| `send_invoice_reminder` | Send a reminder for an existing invoice |
| `cancel_sent_invoice` | Cancel a sent invoice |
| `generate_invoice_qr_code` | Generate a QR code for an invoice |
### Payments
| Tool | Description |
| --------------- | ------------------------------------------------------------------ |
| `create_order` | Create an order in the PayPal system based on the provided details |
| `create_refund` | Process a refund for a captured payment |
| `get_order` | Get details of a specific order |
| `get_refund` | Get the details for a specific refund |
| `pay_order` | Capture payment for an authorized order |
### Reporting and insights
| Tool | Description |
| ----------------------- | ------------------------------------------------------------------- |
| `get_merchant_insights` | Retrieve business intelligence metrics and analytics for a merchant |
| `list_transactions` | List all transactions |
### Shipment tracking
| Tool | Description |
| -------------------------- | ------------------------------------------------------------- |
| `create_shipment_tracking` | Create shipment tracking information for a PayPal transaction |
| `get_shipment_tracking` | Get shipment tracking information for a specific shipment |
| `update_shipment_tracking` | Update shipment tracking information for a specific shipment |
### Subscription management
| Tool | Description |
| -------------------------------- | -------------------------------------------- |
| `cancel_subscription` | Cancel an active subscription |
| `create_subscription` | Create a new subscription |
| `create_subscription_plan` | Create a new subscription plan |
| `update_subscription` | Update an existing subscription |
| `list_subscription_plans` | List subscription plans |
| `show_subscription_details` | Show details of a specific subscription |
| `show_subscription_plan_details` | Show details of a specific subscription plan |
## Configuration
You can control which tools are enabled using the `--tools` command-line argument. This is useful for limiting the scope of the agent's permissions.
You can enable all tools with `--tools=all` or specify a comma-separated list of specific tool identifiers.
**Note**: The configuration identifiers below use dot notation (e.g., `invoices.create`) which differs from the tool names exposed to the agent (e.g., `create_invoice`).
**Products**: `products.create`, `products.list`, `products.update`, `products.show`
**Disputes**: `disputes.list`, `disputes.get`, `disputes.create`
**Invoices**: `invoices.create`, `invoices.list`, `invoices.get`, `invoices.send`, `invoices.sendReminder`, `invoices.cancel`, `invoices.generateQRC`
**Orders & Payments**: `orders.create`, `orders.get`, `orders.capture`, `payments.createRefund`, `payments.getRefunds`
**Transactions**: `transactions.list`
**Shipment**: `shipment.create`, `shipment.get`
**Subscriptions**: `subscriptionPlans.create`, `subscriptionPlans.list`, `subscriptionPlans.show`, `subscriptions.create`, `subscriptions.show`, `subscriptions.cancel`
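For example, to restrict a local server to invoice operations only, you could assemble the flag from the identifiers above (the variable names here are illustrative):

```python
# Build a restricted --tools flag from the configuration identifiers above.
enabled_tools = ["invoices.create", "invoices.send", "invoices.get"]
tools_flag = "--tools=" + ",".join(enabled_tools)

# Pass tools_flag in place of "--tools=all" in the server args list.
print(tools_flag)  # --tools=invoices.create,invoices.send,invoices.get
```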
## Additional resources
- [PayPal MCP Server Documentation](https://docs.paypal.ai/developer/tools/ai/mcp-quickstart)
- [PayPal MCP Server Repository](https://github.com/paypal/paypal-mcp-server)
- [PayPal Agent Tools Reference](https://docs.paypal.ai/developer/tools/ai/agent-tools-ref)
# Phoenix observability for ADK
Supported in ADK: Python
[Phoenix](https://arize.com/docs/phoenix) is an open-source, self-hosted observability platform for monitoring, debugging, and improving LLM applications and AI Agents at scale. It provides comprehensive tracing and evaluation capabilities for your Google ADK applications. To get started, sign up for a [free account](https://phoenix.arize.com/).
## Overview
Phoenix can automatically collect traces from Google ADK using [OpenInference instrumentation](https://github.com/Arize-ai/openinference/tree/main/python/instrumentation/openinference-instrumentation-google-adk), allowing you to:
- **Trace agent interactions** - Automatically capture every agent run, tool call, model request, and response with full context and metadata
- **Evaluate performance** - Assess agent behavior using custom or pre-built evaluators and run experiments to test agent configurations
- **Debug issues** - Analyze detailed traces to quickly identify bottlenecks, failed tool calls, and unexpected agent behavior
- **Self-hosted control** - Keep your data on your own infrastructure
## Installation
### 1. Install Required Packages
```bash
pip install openinference-instrumentation-google-adk google-adk arize-phoenix-otel
```
## Setup
### 1. Launch Phoenix
These instructions show you how to use Phoenix Cloud. You can also [launch Phoenix](https://arize.com/docs/phoenix/integrations/llm-providers/google-gen-ai/google-adk-tracing) in a notebook, from your terminal, or self-host it using a container.
1. Sign up for a [free Phoenix account](https://phoenix.arize.com/).
1. From the Settings page of your new Phoenix Space, create your API key.
1. Copy your endpoint, which should look like `https://app.phoenix.arize.com/s/[your-space-name]`.
**Set your Phoenix endpoint and API Key:**
```python
import os
os.environ["PHOENIX_API_KEY"] = "ADD YOUR PHOENIX API KEY"
os.environ["PHOENIX_COLLECTOR_ENDPOINT"] = "ADD YOUR PHOENIX COLLECTOR ENDPOINT"
# If you created your Phoenix Cloud instance before June 24th, 2025, set the API key as a header:
# os.environ["PHOENIX_CLIENT_HEADERS"] = f"api_key={os.getenv('PHOENIX_API_KEY')}"
```
### 2. Connect your application to Phoenix
```python
from phoenix.otel import register
# Configure the Phoenix tracer
tracer_provider = register(
project_name="my-llm-app", # Default is 'default'
auto_instrument=True # Auto-instrument your app based on installed OI dependencies
)
```
## Observe
Now that tracing is set up, all Google ADK requests will be streamed to Phoenix for observability and evaluation.
```python
import nest_asyncio
nest_asyncio.apply()
from google.adk.agents import Agent
from google.adk.runners import InMemoryRunner
from google.genai import types
# Define a tool function
def get_weather(city: str) -> dict:
"""Retrieves the current weather report for a specified city.
Args:
city (str): The name of the city for which to retrieve the weather report.
Returns:
dict: status and result or error msg.
"""
if city.lower() == "new york":
return {
"status": "success",
"report": (
"The weather in New York is sunny with a temperature of 25 degrees"
" Celsius (77 degrees Fahrenheit)."
),
}
else:
return {
"status": "error",
"error_message": f"Weather information for '{city}' is not available.",
}
# Create an agent with tools
agent = Agent(
name="weather_agent",
model="gemini-2.0-flash-exp",
description="Agent to answer questions using weather tools.",
instruction="You must use the available tools to find an answer.",
tools=[get_weather]
)
app_name = "weather_app"
user_id = "test_user"
session_id = "test_session"
runner = InMemoryRunner(agent=agent, app_name=app_name)
session_service = runner.session_service
await session_service.create_session(
app_name=app_name,
user_id=user_id,
session_id=session_id
)
# Run the agent (all interactions will be traced)
async for event in runner.run_async(
user_id=user_id,
session_id=session_id,
new_message=types.Content(role="user", parts=[
types.Part(text="What is the weather in New York?")]
)
):
if event.is_final_response():
print(event.content.parts[0].text.strip())
```
## Support and Resources
- [Phoenix Documentation](https://arize.com/docs/phoenix/integrations/llm-providers/google-gen-ai/google-adk-tracing)
- [Community Slack](https://arize-ai.slack.com/join/shared_invite/zt-11t1vbu4x-xkBIHmOREQnYnYDH1GDfCg#/shared-invite/email)
- [OpenInference Package](https://github.com/Arize-ai/openinference/tree/main/python/instrumentation/openinference-instrumentation-google-adk)
# Postman MCP tool for ADK
Supported in ADK: Python, TypeScript
The [Postman MCP Server](https://github.com/postmanlabs/postman-mcp-server) connects your ADK agent to the [Postman](https://www.postman.com/) ecosystem. This integration gives your agent the ability to access workspaces, manage collections and environments, evaluate APIs, and automate workflows through natural language interactions.
## Use cases
- **API testing**: Continuously test your APIs using your Postman collections.
- **Collection management**: Create and tag collections, update documentation, add comments, or perform actions across multiple collections without leaving your editor.
- **Workspace and environment management**: Create workspaces and environments, and manage your environment variables.
- **Client code generation**: Generate production-ready client code that consumes APIs following best practices and project conventions.
## Prerequisites
- Create a [Postman account](https://identity.getpostman.com/signup)
- Generate a [Postman API key](https://postman.postman.co/settings/me/api-keys)
## Use with agent
```python
from google.adk.agents import Agent
from google.adk.tools.mcp_tool import McpToolset
from google.adk.tools.mcp_tool.mcp_session_manager import StdioConnectionParams
from mcp import StdioServerParameters
POSTMAN_API_KEY = "YOUR_POSTMAN_API_KEY"
root_agent = Agent(
model="gemini-2.5-pro",
name="postman_agent",
instruction="Help users manage their Postman workspaces and collections",
tools=[
McpToolset(
connection_params=StdioConnectionParams(
server_params=StdioServerParameters(
command="npx",
args=[
"-y",
"@postman/postman-mcp-server",
# "--full", # Use all 100+ tools
# "--code", # Use code generation tools
# "--region", "eu", # Use EU region
],
env={
"POSTMAN_API_KEY": POSTMAN_API_KEY,
},
),
timeout=30,
),
)
],
)
```
```python
from google.adk.agents import Agent
from google.adk.tools.mcp_tool import McpToolset
from google.adk.tools.mcp_tool.mcp_session_manager import StreamableHTTPServerParams
POSTMAN_API_KEY = "YOUR_POSTMAN_API_KEY"
root_agent = Agent(
model="gemini-2.5-pro",
name="postman_agent",
instruction="Help users manage their Postman workspaces and collections",
tools=[
McpToolset(
connection_params=StreamableHTTPServerParams(
url="https://mcp.postman.com/mcp",
# (Optional) Use "/minimal" for essential tools only
# (Optional) Use "/code" for code generation tools
# (Optional) Use "https://mcp.eu.postman.com" for EU region
headers={
"Authorization": f"Bearer {POSTMAN_API_KEY}",
},
),
)
],
)
```
```typescript
import { LlmAgent, MCPToolset } from "@google/adk";
const POSTMAN_API_KEY = "YOUR_POSTMAN_API_KEY";
const rootAgent = new LlmAgent({
model: "gemini-2.5-pro",
name: "postman_agent",
instruction: "Help users manage their Postman workspaces and collections",
tools: [
new MCPToolset({
type: "StdioConnectionParams",
serverParams: {
command: "npx",
args: [
"-y",
"@postman/postman-mcp-server",
// "--full", // Use all 100+ tools
// "--code", // Use code generation tools
// "--region", "eu", // Use EU region
],
env: {
POSTMAN_API_KEY: POSTMAN_API_KEY,
},
},
}),
],
});
export { rootAgent };
```
```typescript
import { LlmAgent, MCPToolset } from "@google/adk";
const POSTMAN_API_KEY = "YOUR_POSTMAN_API_KEY";
const rootAgent = new LlmAgent({
model: "gemini-2.5-pro",
name: "postman_agent",
instruction: "Help users manage their Postman workspaces and collections",
tools: [
new MCPToolset({
type: "StreamableHTTPConnectionParams",
url: "https://mcp.postman.com/mcp",
// (Optional) Use "/minimal" for essential tools only
// (Optional) Use "/code" for code generation tools
// (Optional) Use "https://mcp.eu.postman.com" for EU region
header: {
Authorization: `Bearer ${POSTMAN_API_KEY}`,
},
}),
],
});
export { rootAgent };
```
## Configuration
Postman offers three tool configurations:
- **Minimal** (default): Essential tools for basic Postman operations. Best for simple modifications to collections, workspaces, or environments.
- **Full**: All available Postman API tools (100+ tools). Ideal for advanced collaboration and enterprise features.
- **Code**: Tools for searching API definitions and generating client code. Perfect for developers who need to consume APIs.
To select a configuration:
- **Local server**: Add `--full` or `--code` to the `args` list.
- **Remote server**: Change the URL path to `/minimal`, `/mcp` (full), or `/code`.
For EU region, use `--region eu` (local) or `https://mcp.eu.postman.com` (remote).
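The remote URL variants above can be summarized in a small helper. This is purely illustrative (not part of the Postman or ADK APIs), but it makes the path/region combinations explicit:

```python
# Illustrative helper for building the Postman remote MCP server URL
# from a tool configuration and region; not part of any official API.
def postman_mcp_url(config: str = "mcp", eu: bool = False) -> str:
    """config is one of "minimal", "mcp" (full), or "code"."""
    base = "https://mcp.eu.postman.com" if eu else "https://mcp.postman.com"
    return f"{base}/{config}"

print(postman_mcp_url("code", eu=True))
# https://mcp.eu.postman.com/code
```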
## Additional resources
- [Postman MCP Server on GitHub](https://github.com/postmanlabs/postman-mcp-server)
- [Postman API key settings](https://postman.postman.co/settings/me/api-keys)
- [Postman Learning Center](https://learning.postman.com/)
# Google Cloud Pub/Sub tool for ADK
Supported in ADK: Python v1.22.0
The `PubSubToolset` allows agents to interact with the [Google Cloud Pub/Sub](https://cloud.google.com/pubsub) service to publish, pull, and acknowledge messages.
## Prerequisites
Before using the `PubSubToolset`, you need to:
1. **Enable the Pub/Sub API** in your Google Cloud project.
1. **Authenticate and authorize**: Ensure that the principal (e.g., user, service account) running the agent has the necessary IAM permissions to perform Pub/Sub operations. For more information on Pub/Sub roles, see the [Pub/Sub access control documentation](https://cloud.google.com/pubsub/docs/access-control).
1. **Create a topic or subscription**: [Create a topic](https://cloud.google.com/pubsub/docs/create-topic) to publish messages and [create a subscription](https://cloud.google.com/pubsub/docs/create-subscription) to receive them.
## Usage
```py
# Copyright 2025 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import asyncio
import os
from google.adk.agents import Agent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.adk.tools.pubsub.config import PubSubToolConfig
from google.adk.tools.pubsub.pubsub_credentials import PubSubCredentialsConfig
from google.adk.tools.pubsub.pubsub_toolset import PubSubToolset
from google.genai import types
import google.auth
# Define constants for this example agent
AGENT_NAME = "pubsub_agent"
APP_NAME = "pubsub_app"
USER_ID = "user1234"
SESSION_ID = "1234"
GEMINI_MODEL = "gemini-2.0-flash"
# Define Pub/Sub tool config.
# You can optionally set the project_id here, or let the agent infer it from context/user input.
tool_config = PubSubToolConfig(project_id=os.getenv("GOOGLE_CLOUD_PROJECT"))
# Uses externally-managed Application Default Credentials (ADC) by default.
# This decouples authentication from the agent / tool lifecycle.
# https://cloud.google.com/docs/authentication/provide-credentials-adc
application_default_credentials, _ = google.auth.default()
credentials_config = PubSubCredentialsConfig(
credentials=application_default_credentials
)
# Instantiate a Pub/Sub toolset
pubsub_toolset = PubSubToolset(
credentials_config=credentials_config, pubsub_tool_config=tool_config
)
# Agent Definition
pubsub_agent = Agent(
model=GEMINI_MODEL,
name=AGENT_NAME,
description=(
"Agent to publish, pull, and acknowledge messages from Google Cloud"
" Pub/Sub."
),
instruction="""\
You are a cloud engineer agent with access to Google Cloud Pub/Sub tools.
You can publish messages to topics, pull messages from subscriptions, and acknowledge messages.
""",
tools=[pubsub_toolset],
)
# Session and Runner
session_service = InMemorySessionService()
session = asyncio.run(
session_service.create_session(
app_name=APP_NAME, user_id=USER_ID, session_id=SESSION_ID
)
)
runner = Runner(
agent=pubsub_agent, app_name=APP_NAME, session_service=session_service
)
# Agent Interaction
def call_agent(query):
"""
Helper function to call the agent with a query.
"""
content = types.Content(role="user", parts=[types.Part(text=query)])
events = runner.run(user_id=USER_ID, session_id=SESSION_ID, new_message=content)
print("USER:", query)
for event in events:
if event.is_final_response():
final_response = event.content.parts[0].text
print("AGENT:", final_response)
call_agent("publish 'Hello World' to 'my-topic'")
call_agent("pull messages from 'my-subscription'")
```
## Tools
The `PubSubToolset` includes the following tools:
### `publish_message`
Publishes a message to a Pub/Sub topic.
| Parameter | Type | Description |
| -------------- | ---------------- | -------------------------------------------------------------------------------------------------------- |
| `topic_name` | `str` | The name of the Pub/Sub topic (e.g., `projects/my-project/topics/my-topic`). |
| `message` | `str` | The message content to publish. |
| `attributes` | `dict[str, str]` | (Optional) Attributes to attach to the message. |
| `ordering_key` | `str` | (Optional) The ordering key for the message. If you set this parameter, messages are published in order. |
### `pull_messages`
Pulls messages from a Pub/Sub subscription.
| Parameter | Type | Description |
| ------------------- | ------ | ---------------------------------------------------------------------------------------- |
| `subscription_name` | `str` | The name of the Pub/Sub subscription (e.g., `projects/my-project/subscriptions/my-sub`). |
| `max_messages` | `int` | (Optional) The maximum number of messages to pull. Defaults to `1`. |
| `auto_ack` | `bool` | (Optional) Whether to automatically acknowledge the messages. Defaults to `False`. |
### `acknowledge_messages`
Acknowledges one or more messages on a Pub/Sub subscription.
| Parameter | Type | Description |
| ------------------- | ----------- | ---------------------------------------------------------------------------------------- |
| `subscription_name` | `str` | The name of the Pub/Sub subscription (e.g., `projects/my-project/subscriptions/my-sub`). |
| `ack_ids` | `list[str]` | A list of acknowledgment IDs to acknowledge. |
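The tools above expect fully qualified resource names rather than short topic or subscription names. A small illustrative helper (not part of the ADK API) for building them:

```python
# Illustrative helpers for the fully qualified resource names the
# Pub/Sub tools expect; not part of the ADK API.
def topic_path(project_id: str, topic: str) -> str:
    return f"projects/{project_id}/topics/{topic}"


def subscription_path(project_id: str, subscription: str) -> str:
    return f"projects/{project_id}/subscriptions/{subscription}"


print(topic_path("my-project", "my-topic"))
# projects/my-project/topics/my-topic
```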
# Qdrant MCP tool for ADK
Supported in ADK: Python, TypeScript
The [Qdrant MCP Server](https://github.com/qdrant/mcp-server-qdrant) connects your ADK agent to [Qdrant](https://qdrant.tech/), an open-source vector search engine. This integration gives your agent the ability to store and retrieve information using semantic search.
## Use cases
- **Semantic Memory for Agents**: Store conversation context, facts, or learned information that agents can retrieve later using natural language queries.
- **Code Repository Search**: Build a searchable index of code snippets, documentation, and implementation patterns that can be queried semantically.
- **Knowledge Base Retrieval**: Create a retrieval-augmented generation (RAG) system by storing documents and retrieving relevant context for responses.
## Prerequisites
- A running Qdrant instance. You can:
- Use [Qdrant Cloud](https://cloud.qdrant.io/) (managed service)
- Run locally with Docker: `docker run -p 6333:6333 qdrant/qdrant`
- (Optional) A Qdrant API key for authentication
## Use with agent
```python
from google.adk.agents import Agent
from google.adk.tools.mcp_tool import McpToolset
from google.adk.tools.mcp_tool.mcp_session_manager import StdioConnectionParams
from mcp import StdioServerParameters
QDRANT_URL = "http://localhost:6333" # Or your Qdrant Cloud URL
COLLECTION_NAME = "my_collection"
# QDRANT_API_KEY = "YOUR_QDRANT_API_KEY"
root_agent = Agent(
model="gemini-2.5-pro",
name="qdrant_agent",
instruction="Help users store and retrieve information using semantic search",
tools=[
McpToolset(
connection_params=StdioConnectionParams(
server_params=StdioServerParameters(
command="uvx",
args=["mcp-server-qdrant"],
env={
"QDRANT_URL": QDRANT_URL,
"COLLECTION_NAME": COLLECTION_NAME,
# "QDRANT_API_KEY": QDRANT_API_KEY,
}
),
timeout=30,
),
)
],
)
```
```typescript
import { LlmAgent, MCPToolset } from "@google/adk";
const QDRANT_URL = "http://localhost:6333"; // Or your Qdrant Cloud URL
const COLLECTION_NAME = "my_collection";
// const QDRANT_API_KEY = "YOUR_QDRANT_API_KEY";
const rootAgent = new LlmAgent({
model: "gemini-2.5-pro",
name: "qdrant_agent",
instruction: "Help users store and retrieve information using semantic search",
tools: [
new MCPToolset({
type: "StdioConnectionParams",
serverParams: {
command: "uvx",
args: ["mcp-server-qdrant"],
env: {
QDRANT_URL: QDRANT_URL,
COLLECTION_NAME: COLLECTION_NAME,
// QDRANT_API_KEY: QDRANT_API_KEY,
},
},
}),
],
});
export { rootAgent };
```
## Available tools
| Tool | Description |
| -------------- | -------------------------------------------------------------- |
| `qdrant-store` | Store information in Qdrant with optional metadata |
| `qdrant-find` | Search for relevant information using natural language queries |
## Configuration
The Qdrant MCP server can be configured using environment variables:
| Variable | Description | Default |
| ------------------------ | ------------------------------------------------------ | ---------------------------------------- |
| `QDRANT_URL` | URL of the Qdrant server | `None` (required) |
| `QDRANT_API_KEY` | API key for Qdrant Cloud authentication | `None` |
| `COLLECTION_NAME` | Name of the collection to use | `None` |
| `QDRANT_LOCAL_PATH` | Path for local persistent storage (alternative to URL) | `None` |
| `EMBEDDING_MODEL` | Embedding model to use | `sentence-transformers/all-MiniLM-L6-v2` |
| `EMBEDDING_PROVIDER` | Provider for embeddings (`fastembed` or `ollama`) | `fastembed` |
| `TOOL_STORE_DESCRIPTION` | Custom description for the store tool | Default description |
| `TOOL_FIND_DESCRIPTION` | Custom description for the find tool | Default description |
### Custom tool descriptions
You can customize the tool descriptions to guide the agent's behavior:
```python
env={
"QDRANT_URL": "http://localhost:6333",
"COLLECTION_NAME": "code-snippets",
"TOOL_STORE_DESCRIPTION": "Store code snippets with descriptions. The 'information' parameter should contain a description of what the code does, while the actual code should be in 'metadata.code'.",
"TOOL_FIND_DESCRIPTION": "Search for relevant code snippets using natural language. Describe the functionality you're looking for.",
}
```
## Additional resources
- [Qdrant MCP Server Repository](https://github.com/qdrant/mcp-server-qdrant)
- [Qdrant Documentation](https://qdrant.tech/documentation/)
- [Qdrant Cloud](https://cloud.qdrant.io/)
# Reflect and Retry plugin for ADK
Supported in ADK: Python v1.16.0
The Reflect and Retry plugin helps your agent recover from error responses from ADK [Tools](/adk-docs/tools-custom/) by automatically retrying the tool request. The plugin intercepts tool failures, provides structured guidance to the AI model for reflection and correction, and retries the operation up to a configurable limit. It helps you build more resilient agent workflows with the following capabilities:
- **Concurrency safe**: Uses locking to safely handle parallel tool executions.
- **Configurable scope**: Tracks failures per-invocation (default) or globally.
- **Granular tracking**: Failure counts are tracked per-tool.
- **Custom error extraction**: Supports detecting errors in normal tool responses.
## Add Reflect and Retry Plugin
Add this plugin to your ADK workflow by including it in the `plugins` setting of your ADK project's `App` object, as shown below:
```python
from google.adk.apps.app import App
from google.adk.plugins import ReflectAndRetryToolPlugin
app = App(
name="my_app",
root_agent=root_agent,
plugins=[
ReflectAndRetryToolPlugin(max_retries=3),
],
)
```
With this configuration, if any tool called by an agent returns an error, the request is updated and tried again, up to a maximum of 3 attempts, per tool.
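Conceptually, the plugin behaves like the following loop. This is a plain-Python sketch of the idea, not the plugin's actual internals; `reflect` stands in for the model revising the tool call after reading the error:

```python
# Sketch of the reflect-and-retry idea (illustrative, not ADK internals).
def call_with_reflect_and_retry(tool, args, reflect, max_retries=3):
    for attempt in range(max_retries + 1):
        try:
            return tool(**args)  # success: return the tool result as-is
        except Exception as exc:
            if attempt == max_retries:
                raise  # out of retries: surface the last error
            args = reflect(args, exc)  # "reflection": revise the arguments
```

In the real plugin, the reflection step is driven by structured guidance injected into the model's context, and failure counts are tracked per tool.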
## Configuration settings
The Reflect and Retry Plugin has the following configuration options:
- **`max_retries`**: (optional) Total number of additional attempts the system makes to receive a non-error response. Default value is 3.
- **`throw_exception_if_retry_exceeded`**: (optional) If set to `False`, the system does not raise an error if the final retry attempt fails. Default value is `True`.
- **`tracking_scope`**: (optional)
- **`TrackingScope.INVOCATION`**: Track tool failures across a single invocation and user. This value is the default.
- **`TrackingScope.GLOBAL`**: Track tool failures across all invocations and all users.
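The difference between the two scopes comes down to how the per-tool failure counts are keyed. A sketch of that keying choice (illustrative only, not the plugin's real data structures):

```python
from collections import defaultdict

class FailureTracker:
    """Sketch of per-tool failure counting under the two scopes."""

    def __init__(self, scope="invocation"):
        self.scope = scope
        self.counts = defaultdict(int)

    def record(self, invocation_id, tool_name):
        # INVOCATION scope keys counts by (invocation, tool); GLOBAL collapses
        # all invocations into a single bucket per tool.
        bucket = invocation_id if self.scope == "invocation" else "global"
        self.counts[(bucket, tool_name)] += 1
        return self.counts[(bucket, tool_name)]
```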
### Advanced configuration
You can further modify the behavior of this plugin by extending the `ReflectAndRetryToolPlugin` class. The following code sample demonstrates a simple extension of the behavior by selecting responses with an error status:
```python
class CustomRetryPlugin(ReflectAndRetryToolPlugin):
    async def extract_error_from_result(
        self, *, tool, tool_args, tool_context, result
    ):
# Detect error based on response content
if result.get('status') == 'error':
return result
return None # No error detected
# add this modified plugin to your App object:
error_handling_plugin = CustomRetryPlugin(max_retries=5)
```
## Next steps
For complete code samples using the Reflect and Retry plugin, see the following:
- [Basic](https://github.com/google/adk-python/tree/main/contributing/samples/plugin_reflect_tool_retry/basic) code sample
- [Hallucinating function name](https://github.com/google/adk-python/tree/main/contributing/samples/plugin_reflect_tool_retry/hallucinating_func_name) code sample
# Google Cloud Spanner tool for ADK
Supported in ADK Python v1.11.0
This is a set of tools that provides integration with Google Cloud Spanner, namely:
- **`list_table_names`**: Fetches table names present in a GCP Spanner database.
- **`list_table_indexes`**: Fetches table indexes present in a GCP Spanner database.
- **`list_table_index_columns`**: Fetches table index columns present in a GCP Spanner database.
- **`list_named_schemas`**: Fetches named schema for a Spanner database.
- **`get_table_schema`**: Fetches Spanner database table schema and metadata information.
- **`execute_sql`**: Runs a SQL query in a Spanner database and fetches the result.
- **`similarity_search`**: Performs a similarity search in Spanner using a text query.
They are packaged in the toolset `SpannerToolset`.
```py
# Copyright 2025 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import asyncio
from google.adk.agents import Agent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
# from google.adk.sessions import DatabaseSessionService
from google.adk.tools.google_tool import GoogleTool
from google.adk.tools.spanner import query_tool
from google.adk.tools.spanner.settings import SpannerToolSettings
from google.adk.tools.spanner.settings import Capabilities
from google.adk.tools.spanner.spanner_credentials import SpannerCredentialsConfig
from google.adk.tools.spanner.spanner_toolset import SpannerToolset
from google.genai import types
from google.adk.tools.tool_context import ToolContext
import google.auth
from google.auth.credentials import Credentials
# Define constants for this example agent
AGENT_NAME = "spanner_agent"
APP_NAME = "spanner_app"
USER_ID = "user1234"
SESSION_ID = "1234"
GEMINI_MODEL = "gemini-2.5-flash"
# Define Spanner tool config with read capability set to allowed.
tool_settings = SpannerToolSettings(capabilities=[Capabilities.DATA_READ])
# Define a credentials config - in this example we are using application default
# credentials
# https://cloud.google.com/docs/authentication/provide-credentials-adc
application_default_credentials, _ = google.auth.default()
credentials_config = SpannerCredentialsConfig(
credentials=application_default_credentials
)
# Instantiate a Spanner toolset
spanner_toolset = SpannerToolset(
credentials_config=credentials_config, spanner_tool_settings=tool_settings
)
# Optional
# Create a wrapped function tool for the agent on top of the built-in
# `execute_sql` tool in the Spanner toolset.
# For example, this customized tool can perform a dynamically-built query.
def count_rows_tool(
table_name: str,
credentials: Credentials, # GoogleTool handles `credentials`
settings: SpannerToolSettings, # GoogleTool handles `settings`
tool_context: ToolContext, # GoogleTool handles `tool_context`
):
"""Counts the total number of rows for a specified table.
Args:
table_name: The name of the table for which to count rows.
Returns:
The total number of rows in the table.
"""
# Replace the following settings for a specific Spanner database.
PROJECT_ID = ""
INSTANCE_ID = ""
DATABASE_ID = ""
query = f"""
SELECT count(*) FROM {table_name}
"""
return query_tool.execute_sql(
project_id=PROJECT_ID,
instance_id=INSTANCE_ID,
database_id=DATABASE_ID,
query=query,
credentials=credentials,
settings=settings,
tool_context=tool_context,
)
# Agent Definition
spanner_agent = Agent(
model=GEMINI_MODEL,
name=AGENT_NAME,
description=(
"Agent to answer questions about Spanner database and execute SQL queries."
),
instruction="""\
You are a data assistant agent with access to several Spanner tools.
Make use of those tools to answer the user's questions.
""",
tools=[
spanner_toolset,
# Add customized Spanner tool based on the built-in Spanner toolset.
GoogleTool(
func=count_rows_tool,
credentials_config=credentials_config,
tool_settings=tool_settings,
),
],
)
# Session and Runner
session_service = InMemorySessionService()
# Optionally, Spanner can be used as the Database Session Service for production.
# Note that it's suggested to use a dedicated instance/database for storing sessions.
# session_service_spanner_db_url = "spanner+spanner:///projects/PROJECT_ID/instances/INSTANCE_ID/databases/my-adk-session"
# session_service = DatabaseSessionService(db_url=session_service_spanner_db_url)
session = asyncio.run(
session_service.create_session(
app_name=APP_NAME, user_id=USER_ID, session_id=SESSION_ID
)
)
runner = Runner(
agent=spanner_agent, app_name=APP_NAME, session_service=session_service
)
# Agent Interaction
def call_agent(query):
"""
Helper function to call the agent with a query.
"""
content = types.Content(role="user", parts=[types.Part(text=query)])
events = runner.run(user_id=USER_ID, session_id=SESSION_ID, new_message=content)
print("USER:", query)
for event in events:
if event.is_final_response():
final_response = event.content.parts[0].text
print("AGENT:", final_response)
# Replace the Spanner database and table names below with your own.
call_agent("List all tables in projects//instances//databases/")
call_agent("Describe the schema of ")
call_agent("List the top 5 rows in ")
```
# Stripe MCP tool for ADK
Supported in ADK Python, TypeScript
The [Stripe MCP Server](https://docs.stripe.com/mcp) connects your ADK agent to the [Stripe](https://stripe.com/) ecosystem. This integration gives your agent the ability to manage payments, customers, subscriptions, and invoices using natural language, enabling automated commerce workflows and financial operations.
## Use cases
- **Automate Payment Operations**: Create payment links, process refunds, and list payment intents through conversational commands.
- **Streamline Invoicing**: Generate and finalize invoices, add line items, and track outstanding payments without leaving your development environment.
- **Access Business Insights**: Query account balances, list products and prices, and search across Stripe resources to make data-driven decisions.
## Prerequisites
- Create a [Stripe account](https://dashboard.stripe.com/register)
- Generate a [Restricted API key](https://dashboard.stripe.com/apikeys) from the Stripe Dashboard
## Use with agent
```python
from google.adk.agents import Agent
from google.adk.tools.mcp_tool import McpToolset
from google.adk.tools.mcp_tool.mcp_session_manager import StdioConnectionParams
from mcp import StdioServerParameters
STRIPE_SECRET_KEY = "YOUR_STRIPE_SECRET_KEY"
root_agent = Agent(
model="gemini-2.5-pro",
name="stripe_agent",
instruction="Help users manage their Stripe account",
tools=[
McpToolset(
connection_params=StdioConnectionParams(
server_params=StdioServerParameters(
command="npx",
args=[
"-y",
"@stripe/mcp",
"--tools=all",
# (Optional) Specify which tools to enable
# "--tools=customers.read,invoices.read,products.read",
],
env={
"STRIPE_SECRET_KEY": STRIPE_SECRET_KEY,
}
),
timeout=30,
),
)
],
)
```
```python
from google.adk.agents import Agent
from google.adk.tools.mcp_tool import McpToolset
from google.adk.tools.mcp_tool.mcp_session_manager import StreamableHTTPServerParams
STRIPE_SECRET_KEY = "YOUR_STRIPE_SECRET_KEY"
root_agent = Agent(
model="gemini-2.5-pro",
name="stripe_agent",
instruction="Help users manage their Stripe account",
tools=[
McpToolset(
connection_params=StreamableHTTPServerParams(
url="https://mcp.stripe.com",
headers={
"Authorization": f"Bearer {STRIPE_SECRET_KEY}",
},
),
)
],
)
```
```typescript
import { LlmAgent, MCPToolset } from "@google/adk";
const STRIPE_SECRET_KEY = "YOUR_STRIPE_SECRET_KEY";
const rootAgent = new LlmAgent({
model: "gemini-2.5-pro",
name: "stripe_agent",
instruction: "Help users manage their Stripe account",
tools: [
new MCPToolset({
type: "StdioConnectionParams",
serverParams: {
command: "npx",
args: [
"-y",
"@stripe/mcp",
"--tools=all",
// (Optional) Specify which tools to enable
// "--tools=customers.read,invoices.read,products.read",
],
env: {
STRIPE_SECRET_KEY: STRIPE_SECRET_KEY,
},
},
}),
],
});
export { rootAgent };
```
```typescript
import { LlmAgent, MCPToolset } from "@google/adk";
const STRIPE_SECRET_KEY = "YOUR_STRIPE_SECRET_KEY";
const rootAgent = new LlmAgent({
model: "gemini-2.5-pro",
name: "stripe_agent",
instruction: "Help users manage their Stripe account",
tools: [
new MCPToolset({
type: "StreamableHTTPConnectionParams",
url: "https://mcp.stripe.com",
header: {
Authorization: `Bearer ${STRIPE_SECRET_KEY}`,
},
}),
],
});
export { rootAgent };
```
Best practices
Enable human confirmation of tool actions and exercise caution when using the Stripe MCP server alongside other MCP servers to mitigate prompt injection risks.
## Available tools
| Resource | Tool | API |
| ------------- | ----------------------------- | ----------------------- |
| Account | `get_stripe_account_info` | Retrieve account |
| Balance | `retrieve_balance` | Retrieve balance |
| Coupon | `create_coupon` | Create coupon |
| Coupon | `list_coupons` | List coupons |
| Customer | `create_customer` | Create customer |
| Customer | `list_customers` | List customers |
| Dispute | `list_disputes` | List disputes |
| Dispute | `update_dispute` | Update dispute |
| Invoice | `create_invoice` | Create invoice |
| Invoice | `create_invoice_item` | Create invoice item |
| Invoice | `finalize_invoice` | Finalize invoice |
| Invoice | `list_invoices` | List invoices |
| Payment Link | `create_payment_link` | Create payment link |
| PaymentIntent | `list_payment_intents` | List PaymentIntents |
| Price | `create_price` | Create price |
| Price | `list_prices` | List prices |
| Product | `create_product` | Create product |
| Product | `list_products` | List products |
| Refund | `create_refund` | Create refund |
| Subscription | `cancel_subscription` | Cancel subscription |
| Subscription | `list_subscriptions` | List subscriptions |
| Subscription | `update_subscription` | Update subscription |
| Others | `search_stripe_resources` | Search Stripe resources |
| Others | `fetch_stripe_resources` | Fetch Stripe object |
| Others | `search_stripe_documentation` | Search Stripe knowledge |
## Additional resources
- [Stripe MCP Server Documentation](https://docs.stripe.com/mcp)
- [Stripe MCP Server on GitHub](https://github.com/stripe/ai/tree/main/tools/modelcontextprotocol)
- [Build on Stripe with LLMs](https://docs.stripe.com/building-with-llms)
- [Add Stripe to your agentic workflows](https://docs.stripe.com/agents)
# Vertex AI RAG Engine tool for ADK
Supported in ADK Python v0.1.0, Java v0.2.0
The `vertex_ai_rag_retrieval` tool allows the agent to perform private data retrieval using Vertex AI RAG Engine.
When you use grounding with Vertex AI RAG Engine, you need to prepare a RAG corpus beforehand. Please refer to the [RAG ADK agent sample](https://github.com/google/adk-samples/blob/main/python/agents/RAG/rag/shared_libraries/prepare_corpus_and_data.py) or [Vertex AI RAG Engine page](https://cloud.google.com/vertex-ai/generative-ai/docs/rag-engine/rag-quickstart) for setting it up.
Warning: Single tool per agent limitation
This tool can only be used ***by itself*** within an agent instance. For more information about this limitation and workarounds, see [Limitations for ADK tools](/adk-docs/tools/limitations/).
```py
# Copyright 2025 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
from google.adk.agents import Agent
from google.adk.tools.retrieval.vertex_ai_rag_retrieval import VertexAiRagRetrieval
from vertexai.preview import rag
from dotenv import load_dotenv
from .prompts import return_instructions_root
load_dotenv()
ask_vertex_retrieval = VertexAiRagRetrieval(
name='retrieve_rag_documentation',
description=(
'Use this tool to retrieve documentation and reference materials for the question from the RAG corpus,'
),
rag_resources=[
rag.RagResource(
# please fill in your own rag corpus
# here is a sample rag corpus for testing purpose
# e.g. projects/123/locations/us-central1/ragCorpora/456
rag_corpus=os.environ.get("RAG_CORPUS")
)
],
similarity_top_k=10,
vector_distance_threshold=0.6,
)
root_agent = Agent(
model='gemini-2.0-flash-001',
name='ask_rag_agent',
instruction=return_instructions_root(),
tools=[
ask_vertex_retrieval,
]
)
```
# Vertex AI Search tool for ADK
Supported in ADK Python v0.1.0
The `vertex_ai_search_tool` uses Google Cloud Vertex AI Search, enabling the agent to search across your private, configured data stores (e.g., internal documents, company policies, knowledge bases). This built-in tool requires you to provide the specific data store ID during configuration. For further details of the tool, see [Understanding Vertex AI Search grounding](/adk-docs/grounding/vertex_ai_search_grounding/).
Warning: Single tool per agent limitation
This tool can only be used ***by itself*** within an agent instance. For more information about this limitation and workarounds, see [Limitations for ADK tools](/adk-docs/tools/limitations/#one-tool-one-agent).
```py
# Copyright 2024 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import asyncio
from google.adk.agents import LlmAgent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.genai import types
from google.adk.tools import VertexAiSearchTool
# Replace with your Vertex AI Search Datastore ID, and respective region (e.g. us-central1 or global).
# Format: projects//locations//collections/default_collection/dataStores/
DATASTORE_PATH = "DATASTORE_PATH_HERE"
# Constants
APP_NAME_VSEARCH = "vertex_search_app"
USER_ID_VSEARCH = "user_vsearch_1"
SESSION_ID_VSEARCH = "session_vsearch_1"
AGENT_NAME_VSEARCH = "doc_qa_agent"
GEMINI_2_FLASH = "gemini-2.0-flash"
# Tool Instantiation
# You MUST provide your datastore ID here.
vertex_search_tool = VertexAiSearchTool(data_store_id=DATASTORE_PATH)
# Agent Definition
doc_qa_agent = LlmAgent(
name=AGENT_NAME_VSEARCH,
model=GEMINI_2_FLASH, # Requires Gemini model
tools=[vertex_search_tool],
instruction=f"""You are a helpful assistant that answers questions based on information found in the document store: {DATASTORE_PATH}.
Use the search tool to find relevant information before answering.
If the answer isn't in the documents, say that you couldn't find the information.
""",
description="Answers questions using a specific Vertex AI Search datastore.",
)
# Session and Runner Setup
session_service_vsearch = InMemorySessionService()
runner_vsearch = Runner(
agent=doc_qa_agent, app_name=APP_NAME_VSEARCH, session_service=session_service_vsearch
)
session_vsearch = session_service_vsearch.create_session(
app_name=APP_NAME_VSEARCH, user_id=USER_ID_VSEARCH, session_id=SESSION_ID_VSEARCH
)
# Agent Interaction Function
async def call_vsearch_agent_async(query):
print("\n--- Running Vertex AI Search Agent ---")
print(f"Query: {query}")
if "DATASTORE_PATH_HERE" in DATASTORE_PATH:
print("Skipping execution: Please replace DATASTORE_PATH_HERE with your actual datastore ID.")
print("-" * 30)
return
content = types.Content(role='user', parts=[types.Part(text=query)])
final_response_text = "No response received."
try:
async for event in runner_vsearch.run_async(
user_id=USER_ID_VSEARCH, session_id=SESSION_ID_VSEARCH, new_message=content
):
# Like Google Search, results are often embedded in the model's response.
if event.is_final_response() and event.content and event.content.parts:
final_response_text = event.content.parts[0].text.strip()
print(f"Agent Response: {final_response_text}")
# You can inspect event.grounding_metadata for source citations
if event.grounding_metadata:
print(f" (Grounding metadata found with {len(event.grounding_metadata.grounding_attributions)} attributions)")
except Exception as e:
print(f"An error occurred: {e}")
print("Ensure your datastore ID is correct and the service account has permissions.")
print("-" * 30)
# --- Run Example ---
async def run_vsearch_example():
# Replace with a question relevant to YOUR datastore content
await call_vsearch_agent_async("Summarize the main points about the Q2 strategy document.")
await call_vsearch_agent_async("What safety procedures are mentioned for lab X?")
# Execute the example
# await run_vsearch_example()
# Running locally due to potential colab asyncio issues with multiple awaits
try:
asyncio.run(run_vsearch_example())
except RuntimeError as e:
if "cannot be called from a running event loop" in str(e):
print("Skipping execution in running event loop (like Colab/Jupyter). Run locally.")
else:
raise e
```
# W&B Weave observability for ADK
Supported in ADK Python
[W&B Weave](https://weave-docs.wandb.ai/) provides a powerful platform for logging and visualizing model calls. By integrating Google ADK with Weave, you can track and analyze your agent's performance and behavior using OpenTelemetry (OTEL) traces.
## Prerequisites
1. Sign up for an account at [WandB](https://wandb.ai).
1. Obtain your API key from [WandB Authorize](https://wandb.ai/authorize).
1. Configure your environment with the required API keys:
```bash
export WANDB_API_KEY=
export GOOGLE_API_KEY=
```
## Install Dependencies
Ensure you have the necessary packages installed:
```bash
pip install google-adk opentelemetry-sdk opentelemetry-exporter-otlp-proto-http
```
## Sending Traces to Weave
This example demonstrates how to configure OpenTelemetry to send Google ADK traces to Weave.
```python
# math_agent/agent.py
import base64
import os
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk import trace as trace_sdk
from opentelemetry.sdk.trace.export import SimpleSpanProcessor
from opentelemetry import trace
from google.adk.agents import LlmAgent
from google.adk.tools import FunctionTool
from dotenv import load_dotenv
load_dotenv()
# Configure Weave endpoint and authentication
WANDB_BASE_URL = "https://trace.wandb.ai"
PROJECT_ID = "your-entity/your-project" # e.g., "teamid/projectid"
OTEL_EXPORTER_OTLP_ENDPOINT = f"{WANDB_BASE_URL}/otel/v1/traces"
# Set up authentication
WANDB_API_KEY = os.getenv("WANDB_API_KEY")
AUTH = base64.b64encode(f"api:{WANDB_API_KEY}".encode()).decode()
OTEL_EXPORTER_OTLP_HEADERS = {
"Authorization": f"Basic {AUTH}",
"project_id": PROJECT_ID,
}
# Create the OTLP span exporter with endpoint and headers
exporter = OTLPSpanExporter(
endpoint=OTEL_EXPORTER_OTLP_ENDPOINT,
headers=OTEL_EXPORTER_OTLP_HEADERS,
)
# Create a tracer provider and add the exporter
tracer_provider = trace_sdk.TracerProvider()
tracer_provider.add_span_processor(SimpleSpanProcessor(exporter))
# Set the global tracer provider BEFORE importing/using ADK
trace.set_tracer_provider(tracer_provider)
# Define a simple tool for demonstration
def calculator(a: float, b: float) -> str:
"""Add two numbers and return the result.
Args:
a: First number
b: Second number
Returns:
The sum of a and b
"""
return str(a + b)
calculator_tool = FunctionTool(func=calculator)
# Create an LLM agent
root_agent = LlmAgent(
name="MathAgent",
model="gemini-2.0-flash-exp",
instruction=(
"You are a helpful assistant that can do math. "
"When asked a math problem, use the calculator tool to solve it."
),
tools=[calculator_tool],
)
```
## View Traces in Weave dashboard
Once the agent runs, all its traces are logged to the corresponding project on [the Weave dashboard](https://wandb.ai/home).
You can view a timeline of the calls that your ADK agent made during execution.
## Notes
- **Environment Variables**: Ensure your environment variables are correctly set for both WandB and Google API keys.
- **Project Configuration**: Replace `your-entity/your-project` with your actual WandB entity and project name.
- **Entity Name**: You can find your entity name by visiting your [WandB dashboard](https://wandb.ai/home) and checking the **Teams** field in the left sidebar.
- **Tracer Provider**: It's critical to set the global tracer provider before using any ADK components to ensure proper tracing.
By following these steps, you can effectively integrate Google ADK with Weave, enabling comprehensive logging and visualization of your AI agents' model calls, tool invocations, and reasoning processes.
## Resources
- **[Send OpenTelemetry Traces to Weave](https://weave-docs.wandb.ai/guides/tracking/otel)** - Comprehensive guide on configuring OTEL with Weave, including authentication and advanced configuration options.
- **[Navigate the Trace View](https://weave-docs.wandb.ai/guides/tracking/trace-tree)** - Learn how to effectively analyze and debug your traces in the Weave UI, including understanding trace hierarchies and span details.
- **[Weave Integrations](https://weave-docs.wandb.ai/guides/integrations/)** - Explore other framework integrations and see how Weave can work with your entire AI stack.
# Limitations for ADK tools
Some ADK tools have limitations that can impact how you implement them within an agent workflow. This page lists these tool limitations and workarounds, if available.
## One tool per agent limitation
Applies ONLY to Search tools in ADK Python v1.15.0 and lower
This limitation only applies to the use of Google Search and Vertex AI Search tools in ADK Python v1.15.0 and lower. ADK Python release v1.16.0 and higher provides a built-in workaround to remove this limitation.
In general, you can use more than one tool in an agent, but use of specific tools within an agent excludes the use of any other tools in that agent. The following ADK Tools can only be used by themselves, without any other tools, in a single agent object:
- [Code Execution](/adk-docs/tools/gemini-api/code-execution/) with Gemini API
- [Google Search](/adk-docs/tools/gemini-api/google-search/) with Gemini API
- [Vertex AI Search](/adk-docs/tools/google-cloud/vertex-ai-search/)
For example, the following approach that uses one of these tools along with other tools, within a single agent, is ***not supported***:
```py
root_agent = Agent(
name="RootAgent",
model="gemini-2.5-flash",
description="Code Agent",
tools=[custom_function],
code_executor=BuiltInCodeExecutor() # <-- NOT supported when used with tools
)
```
```java
LlmAgent searchAgent =
LlmAgent.builder()
.model(MODEL_ID)
.name("SearchAgent")
.instruction("You're a specialist in Google Search")
.tools(new GoogleSearchTool(), new YourCustomTool()) // <-- NOT supported
.build();
```
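If you are on an affected version, you can catch this misconfiguration when the agent is assembled rather than at request time. A hedged sketch of such a guard; the names in `EXCLUSIVE_TOOLS` are placeholders for this example, not ADK identifiers:

```python
# Illustrative startup guard for the one-tool-per-agent limitation.
EXCLUSIVE_TOOLS = {"google_search", "vertex_ai_search", "code_execution"}

def check_tool_config(tool_names):
    """Raise if an exclusive tool is combined with any other tool."""
    exclusive = EXCLUSIVE_TOOLS & set(tool_names)
    if exclusive and len(tool_names) > 1:
        raise ValueError(
            f"{sorted(exclusive)} must be the only tool on this agent"
        )
```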
### Workaround #1: AgentTool.create() method
Supported in ADK Python, Java
The following code sample demonstrates how to use multiple built-in tools or how to use built-in tools with other tools by using multiple agents:
```py
from google.adk.tools.agent_tool import AgentTool
from google.adk.agents import Agent
from google.adk.tools import google_search
from google.adk.code_executors import BuiltInCodeExecutor
search_agent = Agent(
model='gemini-2.0-flash',
name='SearchAgent',
instruction="""
You're a specialist in Google Search
""",
tools=[google_search],
)
coding_agent = Agent(
model='gemini-2.0-flash',
name='CodeAgent',
instruction="""
You're a specialist in Code Execution
""",
code_executor=BuiltInCodeExecutor(),
)
root_agent = Agent(
name="RootAgent",
model="gemini-2.0-flash",
description="Root Agent",
tools=[AgentTool(agent=search_agent), AgentTool(agent=coding_agent)],
)
```
```java
import com.google.adk.agents.BaseAgent;
import com.google.adk.agents.LlmAgent;
import com.google.adk.tools.AgentTool;
import com.google.adk.tools.BuiltInCodeExecutionTool;
import com.google.adk.tools.GoogleSearchTool;
import com.google.common.collect.ImmutableList;
public class NestedAgentApp {
private static final String MODEL_ID = "gemini-2.0-flash";
public static void main(String[] args) {
// Define the SearchAgent
LlmAgent searchAgent =
LlmAgent.builder()
.model(MODEL_ID)
.name("SearchAgent")
.instruction("You're a specialist in Google Search")
.tools(new GoogleSearchTool()) // Instantiate GoogleSearchTool
.build();
// Define the CodingAgent
LlmAgent codingAgent =
LlmAgent.builder()
.model(MODEL_ID)
.name("CodeAgent")
.instruction("You're a specialist in Code Execution")
.tools(new BuiltInCodeExecutionTool()) // Instantiate BuiltInCodeExecutionTool
.build();
// Define the RootAgent, which uses AgentTool.create() to wrap SearchAgent and CodingAgent
BaseAgent rootAgent =
LlmAgent.builder()
.name("RootAgent")
.model(MODEL_ID)
.description("Root Agent")
.tools(
AgentTool.create(searchAgent), // Use create method
AgentTool.create(codingAgent) // Use create method
)
.build();
// Note: This sample only demonstrates the agent definitions.
// To run these agents, you'd need to integrate them with a Runner and SessionService,
// similar to the previous examples.
System.out.println("Agents defined successfully:");
System.out.println(" Root Agent: " + rootAgent.name());
System.out.println(" Search Agent (nested): " + searchAgent.name());
System.out.println(" Code Agent (nested): " + codingAgent.name());
}
}
```
### Workaround #2: bypass_multi_tools_limit
Supported in ADK Python, Java
ADK Python has a built-in workaround that bypasses this limitation for `GoogleSearchTool` and `VertexAiSearchTool` (set `bypass_multi_tools_limit=True` to enable it), as shown in the [built_in_multi_tools](https://github.com/google/adk-python/tree/main/contributing/samples/built_in_multi_tools) sample agent.
Warning
Built-in tools cannot be used within a sub-agent, with the exception of `GoogleSearchTool` and `VertexAiSearchTool` in ADK Python because of the workaround mentioned above.
For example, the following approach that uses built-in tools within sub-agents is **not supported**:
```py
url_context_agent = Agent(
model='gemini-2.5-flash',
name='UrlContextAgent',
instruction="""
You're a specialist in URL Context
""",
tools=[url_context],
)
coding_agent = Agent(
model='gemini-2.5-flash',
name='CodeAgent',
instruction="""
You're a specialist in Code Execution
""",
code_executor=BuiltInCodeExecutor(),
)
root_agent = Agent(
name="RootAgent",
model="gemini-2.5-flash",
description="Root Agent",
sub_agents=[
url_context_agent,
coding_agent
],
)
```
```java
LlmAgent searchAgent =
LlmAgent.builder()
.model("gemini-2.5-flash")
.name("SearchAgent")
.instruction("You're a specialist in Google Search")
.tools(new GoogleSearchTool())
.build();
LlmAgent codingAgent =
LlmAgent.builder()
.model("gemini-2.5-flash")
.name("CodeAgent")
.instruction("You're a specialist in Code Execution")
.tools(new BuiltInCodeExecutionTool())
.build();
LlmAgent rootAgent =
LlmAgent.builder()
.name("RootAgent")
.model("gemini-2.5-flash")
.description("Root Agent")
.subAgents(searchAgent, codingAgent) // Not supported, as the sub agents use built in tools.
.build();
```
# Custom Tools for ADK
Supported in ADK Python v0.1.0, TypeScript v0.2.0, Go v0.1.0, Java v0.1.0
In an ADK agent workflow, Tools are programming functions with structured input and output that can be called by an ADK Agent to perform actions. ADK Tools function similarly to how you use a [Function Call](https://ai.google.dev/gemini-api/docs/function-calling) with Gemini or other generative AI models. You can perform various actions and programming functions with an ADK Tool, such as:
- Querying databases
- Making API requests: getting weather data, booking systems
- Searching the web
- Executing code snippets
- Retrieving information from documents (RAG)
- Interacting with other software or services
[ADK Tools list](/adk-docs/tools/)
Before building your own Tools for ADK, check out the **[ADK Tools list](/adk-docs/tools/)** for pre-built tools you can use with ADK Agents.
## What is a Tool?
In the context of ADK, a Tool represents a specific capability provided to an AI agent, enabling it to perform actions and interact with the world beyond its core text generation and reasoning abilities. What distinguishes capable agents from basic language models is often their effective use of tools.
Technically, a tool is typically a modular code component—**like a Python, Java, or TypeScript function**, a class method, or even another specialized agent—designed to execute a distinct, predefined task. These tasks often involve interacting with external systems or data.
### Key Characteristics
**Action-Oriented:** Tools perform specific actions for an agent, such as searching for information, calling an API, or performing calculations.
**Extends Agent capabilities:** They empower agents to access real-time information, affect external systems, and overcome the knowledge limitations inherent in their training data.
**Execute predefined logic:** Crucially, tools execute specific, developer-defined logic. They do not possess their own independent reasoning capabilities like the agent's core Large Language Model (LLM). The LLM reasons about which tool to use, when, and with what inputs, but the tool itself just executes its designated function.
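Concretely, a tool is often just a typed function whose docstring tells the LLM when to call it. A minimal illustrative sketch (`get_exchange_rate` and its fixed rate table are invented for this example; a real tool would query an FX API):

```python
def get_exchange_rate(currency_from: str, currency_to: str) -> dict:
    """Look up the exchange rate between two currencies.

    Use this tool whenever the user asks to convert money or compare
    currency values.
    """
    # Illustrative fixed table; a real tool would call an external service.
    rates = {("USD", "EUR"): 0.92, ("EUR", "USD"): 1.09}
    rate = rates.get((currency_from.upper(), currency_to.upper()))
    return {"status": "ok" if rate else "unknown_pair", "rate": rate}
```

The function executes fixed, developer-defined logic; the LLM only decides when to call it and with which arguments.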
## How Agents Use Tools
Agents leverage tools dynamically through mechanisms often involving function calling. The process generally follows these steps:
1. **Reasoning:** The agent's LLM analyzes its system instruction, conversation history, and user request.
1. **Selection:** Based on the analysis, the LLM decides which tool, if any, to execute, drawing on the tools available to the agent and the docstring that describes each tool.
1. **Invocation:** The LLM generates the required arguments (inputs) for the selected tool and triggers its execution.
1. **Observation:** The agent receives the output (result) returned by the tool.
1. **Finalization:** The agent incorporates the tool's output into its ongoing reasoning process to formulate the next response, decide the subsequent step, or determine if the goal has been achieved.
Think of the tools as a specialized toolkit that the agent's intelligent core (the LLM) can access and utilize as needed to accomplish complex tasks.
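To make the Selection and Invocation steps concrete, here is a minimal sketch in plain Python (no ADK imports) of what the model reasons over: the function name, docstring, and typed parameters form the tool declaration, and the framework then calls the function with the model-generated arguments. The `get_exchange_rate` tool and its static rate table are hypothetical, for illustration only.

```python
def get_exchange_rate(base: str, target: str) -> dict:
    """Returns the exchange rate between two currency codes."""
    # Hypothetical static data, for illustration only.
    rates = {("USD", "EUR"): 0.92}
    rate = rates.get((base.upper(), target.upper()))
    if rate is None:
        return {"status": "error", "error_message": f"No rate for {base}->{target}."}
    return {"status": "success", "rate": rate}

# Selection: the pieces the LLM reasons over when choosing this tool.
declaration = {
    "name": get_exchange_rate.__name__,
    "description": get_exchange_rate.__doc__,
    "parameters": [p for p in get_exchange_rate.__annotations__ if p != "return"],
}

# Invocation + Observation: the framework executes the function with the
# arguments the model generated and feeds the result back to the LLM.
result = get_exchange_rate(base="usd", target="eur")
```

The structured `status` key in the return value is what lets the agent's instructions branch on success versus error, as discussed below.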
## Tool Types in ADK
ADK offers flexibility by supporting several types of tools:
1. **[Function Tools](/adk-docs/tools-custom/function-tools/):** Tools created by you, tailored to your specific application's needs.
- **[Functions/Methods](/adk-docs/tools-custom/function-tools/#1-function-tool):** Define standard synchronous functions or methods in your code (e.g., Python def).
- **[Agents-as-Tools](/adk-docs/tools-custom/function-tools/#3-agent-as-a-tool):** Use another, potentially specialized, agent as a tool for a parent agent.
- **[Long Running Function Tools](/adk-docs/tools-custom/function-tools/#2-long-running-function-tool):** Support for tools that perform asynchronous operations or take significant time to complete.
1. **[Built-in Tools](/adk-docs/tools/built-in-tools/):** Ready-to-use tools provided by the framework for common tasks. Examples: Google Search, Code Execution, Retrieval-Augmented Generation (RAG).
1. **Third-Party Tools:** Integrate tools seamlessly from popular external libraries.
Navigate to the respective documentation pages linked above for detailed information and examples for each tool type.
## Referencing Tools in an Agent's Instructions
Within an agent's instructions, you can directly reference a tool by using its **function name.** If the tool's **function name** and **docstring** are sufficiently descriptive, your instructions can primarily focus on **when the Large Language Model (LLM) should utilize the tool**. This promotes clarity and helps the model understand the intended use of each tool.
It is **crucial to clearly instruct the agent on how to handle different return values** that a tool might produce. For example, if a tool returns an error message, your instructions should specify whether the agent should retry the operation, give up on the task, or request additional information from the user.
Furthermore, ADK supports the sequential use of tools, where the output of one tool can serve as the input for another. When implementing such workflows, it's important to **describe the intended sequence of tool usage** within the agent's instructions to guide the model through the necessary steps.
### Example
The following example showcases how an agent can use tools by **referencing their function names in its instructions**. It also demonstrates how to guide the agent to **handle different return values from tools**, such as success or error messages, and how to orchestrate the **sequential use of multiple tools** to accomplish a task.
```py
# Copyright 2025 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import asyncio
from google.adk.agents import Agent
from google.adk.tools import FunctionTool
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.genai import types
APP_NAME="weather_sentiment_agent"
USER_ID="user1234"
SESSION_ID="1234"
MODEL_ID="gemini-2.0-flash"
# Tool 1
def get_weather_report(city: str) -> dict:
    """Retrieves the current weather report for a specified city.

    Returns:
        dict: A dictionary containing the weather information with a 'status' key
            ('success' or 'error') and a 'report' key with the weather details if
            successful, or an 'error_message' if an error occurred.
    """
    if city.lower() == "london":
        return {"status": "success", "report": "The current weather in London is cloudy with a temperature of 18 degrees Celsius and a chance of rain."}
    elif city.lower() == "paris":
        return {"status": "success", "report": "The weather in Paris is sunny with a temperature of 25 degrees Celsius."}
    else:
        return {"status": "error", "error_message": f"Weather information for '{city}' is not available."}
weather_tool = FunctionTool(func=get_weather_report)
# Tool 2
def analyze_sentiment(text: str) -> dict:
    """Analyzes the sentiment of the given text.

    Returns:
        dict: A dictionary with 'sentiment' ('positive', 'negative', or 'neutral') and a 'confidence' score.
    """
    if "good" in text.lower() or "sunny" in text.lower():
        return {"sentiment": "positive", "confidence": 0.8}
    elif "rain" in text.lower() or "bad" in text.lower():
        return {"sentiment": "negative", "confidence": 0.7}
    else:
        return {"sentiment": "neutral", "confidence": 0.6}
sentiment_tool = FunctionTool(func=analyze_sentiment)
# Agent
weather_sentiment_agent = Agent(
model=MODEL_ID,
name='weather_sentiment_agent',
instruction="""You are a helpful assistant that provides weather information and analyzes the sentiment of user feedback.
**If the user asks about the weather in a specific city, use the 'get_weather_report' tool to retrieve the weather details.**
**If the 'get_weather_report' tool returns a 'success' status, provide the weather report to the user.**
**If the 'get_weather_report' tool returns an 'error' status, inform the user that the weather information for the specified city is not available and ask if they have another city in mind.**
**After providing a weather report, if the user gives feedback on the weather (e.g., 'That's good' or 'I don't like rain'), use the 'analyze_sentiment' tool to understand their sentiment.** Then, briefly acknowledge their sentiment.
You can handle these tasks sequentially if needed.""",
tools=[weather_tool, sentiment_tool]
)
async def main():
    """Main function to run the agent asynchronously."""
    # Session and Runner Setup
    session_service = InMemorySessionService()
    # Use 'await' to correctly create the session
    await session_service.create_session(app_name=APP_NAME, user_id=USER_ID, session_id=SESSION_ID)
    runner = Runner(agent=weather_sentiment_agent, app_name=APP_NAME, session_service=session_service)
    # Agent Interaction
    query = "weather in london?"
    print(f"User Query: {query}")
    content = types.Content(role='user', parts=[types.Part(text=query)])
    # The runner's run method handles the async loop internally
    events = runner.run(user_id=USER_ID, session_id=SESSION_ID, new_message=content)
    for event in events:
        if event.is_final_response():
            final_response = event.content.parts[0].text
            print("Agent Response:", final_response)

# Standard way to run the main async function
if __name__ == "__main__":
    asyncio.run(main())
```
```typescript
/**
* Copyright 2025 Google LLC
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { LlmAgent, FunctionTool, InMemoryRunner, isFinalResponse, stringifyContent } from "@google/adk";
import { z } from "zod";
import { Content, createUserContent } from "@google/genai";
/**
* Retrieves the current weather report for a specified city.
*/
function getWeatherReport(params: { city: string }): Record<string, string> {
if (params.city.toLowerCase().includes("london")) {
return {
"status": "success",
"report": "The current weather in London is cloudy with a " +
"temperature of 18 degrees Celsius and a chance of rain.",
};
}
if (params.city.toLowerCase().includes("paris")) {
return {
"status": "success",
"report": "The weather in Paris is sunny with a temperature of 25 " +
"degrees Celsius.",
};
}
return {
"status": "error",
"error_message": `Weather information for '${params.city}' is not available.`,
};
}
/**
* Analyzes the sentiment of a given text.
*/
function analyzeSentiment(params: { text: string }): Record<string, string> {
if (params.text.includes("cloudy") || params.text.includes("rain")) {
return { "status": "success", "sentiment": "negative" };
}
if (params.text.includes("sunny")) {
return { "status": "success", "sentiment": "positive" };
}
return { "status": "success", "sentiment": "neutral" };
}
const weatherTool = new FunctionTool({
name: "get_weather_report",
description: "Retrieves the current weather report for a specified city.",
parameters: z.object({
city: z.string().describe("The city to get the weather for."),
}),
execute: getWeatherReport,
});
const sentimentTool = new FunctionTool({
name: "analyze_sentiment",
description: "Analyzes the sentiment of a given text.",
parameters: z.object({
text: z.string().describe("The text to analyze the sentiment of."),
}),
execute: analyzeSentiment,
});
const instruction = `
You are a helpful assistant that first checks the weather and then analyzes
its sentiment.
Follow these steps:
1. Use the 'get_weather_report' tool to get the weather for the requested
city.
2. If the 'get_weather_report' tool returns an error, inform the user about
the error and stop.
3. If the weather report is available, use the 'analyze_sentiment' tool to
determine the sentiment of the weather report.
4. Finally, provide a summary to the user, including the weather report and
its sentiment.
`;
const agent = new LlmAgent({
name: "weather_sentiment_agent",
instruction: instruction,
tools: [weatherTool, sentimentTool],
model: "gemini-2.5-flash"
});
async function main() {
const runner = new InMemoryRunner({ agent: agent, appName: "weather_sentiment_app" });
await runner.sessionService.createSession({
appName: "weather_sentiment_app",
userId: "user1",
sessionId: "session1"
});
const newMessage: Content = createUserContent("What is the weather in London?");
for await (const event of runner.runAsync({
userId: "user1",
sessionId: "session1",
newMessage: newMessage,
})) {
if (isFinalResponse(event) && event.content?.parts?.length) {
const text = stringifyContent(event).trim();
if (text) {
console.log(text);
}
}
}
}
main();
```
```go
// Copyright 2025 Google LLC
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
package main
import (
"context"
"fmt"
"log"
"strings"
"google.golang.org/adk/agent"
"google.golang.org/adk/agent/llmagent"
"google.golang.org/adk/model/gemini"
"google.golang.org/adk/runner"
"google.golang.org/adk/session"
"google.golang.org/adk/tool"
"google.golang.org/adk/tool/functiontool"
"google.golang.org/genai"
)
type getWeatherReportArgs struct {
City string `json:"city" jsonschema:"The city for which to get the weather report."`
}
type getWeatherReportResult struct {
Status string `json:"status"`
Report string `json:"report,omitempty"`
}
func getWeatherReport(ctx tool.Context, args getWeatherReportArgs) (getWeatherReportResult, error) {
if strings.ToLower(args.City) == "london" {
return getWeatherReportResult{Status: "success", Report: "The current weather in London is cloudy with a temperature of 18 degrees Celsius and a chance of rain."}, nil
}
if strings.ToLower(args.City) == "paris" {
return getWeatherReportResult{Status: "success", Report: "The weather in Paris is sunny with a temperature of 25 degrees Celsius."}, nil
}
return getWeatherReportResult{}, fmt.Errorf("weather information for '%s' is not available", args.City)
}
type analyzeSentimentArgs struct {
Text string `json:"text" jsonschema:"The text to analyze for sentiment."`
}
type analyzeSentimentResult struct {
Sentiment string `json:"sentiment"`
Confidence float64 `json:"confidence"`
}
func analyzeSentiment(ctx tool.Context, args analyzeSentimentArgs) (analyzeSentimentResult, error) {
if strings.Contains(strings.ToLower(args.Text), "good") || strings.Contains(strings.ToLower(args.Text), "sunny") {
return analyzeSentimentResult{Sentiment: "positive", Confidence: 0.8}, nil
}
if strings.Contains(strings.ToLower(args.Text), "rain") || strings.Contains(strings.ToLower(args.Text), "bad") {
return analyzeSentimentResult{Sentiment: "negative", Confidence: 0.7}, nil
}
return analyzeSentimentResult{Sentiment: "neutral", Confidence: 0.6}, nil
}
func main() {
ctx := context.Background()
model, err := gemini.NewModel(ctx, "gemini-2.0-flash", &genai.ClientConfig{})
if err != nil {
log.Fatal(err)
}
weatherTool, err := functiontool.New(
functiontool.Config{
Name: "get_weather_report",
Description: "Retrieves the current weather report for a specified city.",
},
getWeatherReport,
)
if err != nil {
log.Fatal(err)
}
sentimentTool, err := functiontool.New(
functiontool.Config{
Name: "analyze_sentiment",
Description: "Analyzes the sentiment of the given text.",
},
analyzeSentiment,
)
if err != nil {
log.Fatal(err)
}
weatherSentimentAgent, err := llmagent.New(llmagent.Config{
Name: "weather_sentiment_agent",
Model: model,
Instruction: "You are a helpful assistant that provides weather information and analyzes the sentiment of user feedback. **If the user asks about the weather in a specific city, use the 'get_weather_report' tool to retrieve the weather details.** **If the 'get_weather_report' tool returns a 'success' status, provide the weather report to the user.** **If the 'get_weather_report' tool returns an 'error' status, inform the user that the weather information for the specified city is not available and ask if they have another city in mind.** **After providing a weather report, if the user gives feedback on the weather (e.g., 'That's good' or 'I don't like rain'), use the 'analyze_sentiment' tool to understand their sentiment.** Then, briefly acknowledge their sentiment. You can handle these tasks sequentially if needed.",
Tools: []tool.Tool{weatherTool, sentimentTool},
})
if err != nil {
log.Fatal(err)
}
sessionService := session.InMemoryService()
runner, err := runner.New(runner.Config{
AppName: "weather_sentiment_agent",
Agent: weatherSentimentAgent,
SessionService: sessionService,
})
if err != nil {
log.Fatal(err)
}
session, err := sessionService.Create(ctx, &session.CreateRequest{
AppName: "weather_sentiment_agent",
UserID: "user1234",
})
if err != nil {
log.Fatal(err)
}
run(ctx, runner, session.Session.ID(), "weather in london?")
run(ctx, runner, session.Session.ID(), "I don't like rain.")
}
func run(ctx context.Context, r *runner.Runner, sessionID string, prompt string) {
fmt.Printf("\n> %s\n", prompt)
events := r.Run(
ctx,
"user1234",
sessionID,
genai.NewContentFromText(prompt, genai.RoleUser),
agent.RunConfig{
StreamingMode: agent.StreamingModeNone,
},
)
for event, err := range events {
if err != nil {
log.Fatalf("ERROR during agent execution: %v", err)
}
if event.Content.Parts[0].Text != "" {
fmt.Printf("Agent Response: %s\n", event.Content.Parts[0].Text)
}
}
}
```
```java
import com.google.adk.agents.BaseAgent;
import com.google.adk.agents.LlmAgent;
import com.google.adk.runner.Runner;
import com.google.adk.sessions.InMemorySessionService;
import com.google.adk.sessions.Session;
import com.google.adk.tools.Annotations.Schema;
import com.google.adk.tools.FunctionTool;
import com.google.adk.tools.ToolContext; // Ensure this import is correct
import com.google.common.collect.ImmutableList;
import com.google.genai.types.Content;
import com.google.genai.types.Part;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
public class WeatherSentimentAgentApp {
private static final String APP_NAME = "weather_sentiment_agent";
private static final String USER_ID = "user1234";
private static final String SESSION_ID = "1234";
private static final String MODEL_ID = "gemini-2.0-flash";
/**
* Retrieves the current weather report for a specified city.
*
* @param city The city for which to retrieve the weather report.
* @param toolContext The context for the tool.
* @return A dictionary containing the weather information.
*/
public static Map<String, Object> getWeatherReport(
@Schema(name = "city")
String city,
@Schema(name = "toolContext")
ToolContext toolContext) {
Map<String, Object> response = new HashMap<>();
if (city.toLowerCase(Locale.ROOT).equals("london")) {
response.put("status", "success");
response.put(
"report",
"The current weather in London is cloudy with a temperature of 18 degrees Celsius and a"
+ " chance of rain.");
} else if (city.toLowerCase(Locale.ROOT).equals("paris")) {
response.put("status", "success");
response.put(
"report", "The weather in Paris is sunny with a temperature of 25 degrees Celsius.");
} else {
response.put("status", "error");
response.put(
"error_message", String.format("Weather information for '%s' is not available.", city));
}
return response;
}
/**
* Analyzes the sentiment of the given text.
*
* @param text The text to analyze.
* @param toolContext The context for the tool.
* @return A dictionary with sentiment and confidence score.
*/
public static Map<String, Object> analyzeSentiment(
@Schema(name = "text")
String text,
@Schema(name = "toolContext")
ToolContext toolContext) {
Map<String, Object> response = new HashMap<>();
String lowerText = text.toLowerCase(Locale.ROOT);
if (lowerText.contains("good") || lowerText.contains("sunny")) {
response.put("sentiment", "positive");
response.put("confidence", 0.8);
} else if (lowerText.contains("rain") || lowerText.contains("bad")) {
response.put("sentiment", "negative");
response.put("confidence", 0.7);
} else {
response.put("sentiment", "neutral");
response.put("confidence", 0.6);
}
return response;
}
/**
* Calls the agent with the given query and prints the final response.
*
* @param runner The runner to use.
* @param query The query to send to the agent.
*/
public static void callAgent(Runner runner, String query) {
Content content = Content.fromParts(Part.fromText(query));
InMemorySessionService sessionService = (InMemorySessionService) runner.sessionService();
Session session =
sessionService
.createSession(APP_NAME, USER_ID, /* state= */ null, SESSION_ID)
.blockingGet();
runner
.runAsync(session.userId(), session.id(), content)
.forEach(
event -> {
if (event.finalResponse()
&& event.content().isPresent()
&& event.content().get().parts().isPresent()
&& !event.content().get().parts().get().isEmpty()
&& event.content().get().parts().get().get(0).text().isPresent()) {
String finalResponse = event.content().get().parts().get().get(0).text().get();
System.out.println("Agent Response: " + finalResponse);
}
});
}
public static void main(String[] args) throws NoSuchMethodException {
FunctionTool weatherTool =
FunctionTool.create(
WeatherSentimentAgentApp.class.getMethod(
"getWeatherReport", String.class, ToolContext.class));
FunctionTool sentimentTool =
FunctionTool.create(
WeatherSentimentAgentApp.class.getMethod(
"analyzeSentiment", String.class, ToolContext.class));
BaseAgent weatherSentimentAgent =
LlmAgent.builder()
.model(MODEL_ID)
.name("weather_sentiment_agent")
.description("Weather Sentiment Agent")
.instruction("""
You are a helpful assistant that provides weather information and analyzes the
sentiment of user feedback.
**If the user asks about the weather in a specific city, use the
'get_weather_report' tool to retrieve the weather details.**
**If the 'get_weather_report' tool returns a 'success' status, provide the
weather report to the user.**
**If the 'get_weather_report' tool returns an 'error' status, inform the
user that the weather information for the specified city is not available
and ask if they have another city in mind.**
**After providing a weather report, if the user gives feedback on the
weather (e.g., 'That's good' or 'I don't like rain'), use the
'analyze_sentiment' tool to understand their sentiment.** Then, briefly
acknowledge their sentiment.
You can handle these tasks sequentially if needed.
""")
.tools(ImmutableList.of(weatherTool, sentimentTool))
.build();
InMemorySessionService sessionService = new InMemorySessionService();
Runner runner = new Runner(weatherSentimentAgent, APP_NAME, null, sessionService);
// Query a city the tool recognizes (e.g., "london" or "paris") so it returns
// a "success" response.
callAgent(runner, "weather in paris");
}
}
```
## Tool Context
For more advanced scenarios, ADK allows you to access additional contextual information within your tool function by including the special parameter `tool_context: ToolContext`. By including this in the function signature, ADK will **automatically** provide an **instance of the ToolContext** class when your tool is called during agent execution.
The **ToolContext** provides access to several key pieces of information and control levers:
- `state: State`: Read and modify the current session's state. Changes made here are tracked and persisted.
- `actions: EventActions`: Influence the agent's subsequent actions after the tool runs (e.g., skip summarization, transfer to another agent).
- `function_call_id: str`: The unique identifier assigned by the framework to this specific invocation of the tool. Useful for tracking and correlating with authentication responses. This can also be helpful when multiple tools are called within a single model response.
- `function_call_event_id: str`: This attribute provides the unique identifier of the **event** that triggered the current tool call. This can be useful for tracking and logging purposes.
- `auth_response: Any`: Contains the authentication response/credentials if an authentication flow was completed before this tool call.
- Access to Services: Methods to interact with configured services like Artifacts and Memory.
Note that you shouldn't include the `tool_context` parameter in the tool function docstring. Since `ToolContext` is automatically injected by the ADK framework *after* the LLM decides to call the tool function, it is not relevant for the LLM's decision-making and including it can confuse the LLM.
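To illustrate the convention, here is a minimal sketch using a hypothetical `log_purchase` tool and a stand-in `FakeToolContext` class (the real class is `ToolContext` from `google.adk.tools`): the injected `tool_context` parameter is deliberately left out of the docstring the LLM sees.

```python
class FakeToolContext:
    """Stand-in for ADK's ToolContext, holding only session state."""
    def __init__(self):
        self.state = {}

def log_purchase(item: str, tool_context) -> dict:
    """Records a purchased item for the current user.

    Args:
        item: The name of the purchased item.
    """
    # tool_context is injected by the framework after the LLM decides to
    # call the tool; it is not documented above because the LLM should not
    # reason about it.
    purchases = tool_context.state.get("user:purchases", [])
    tool_context.state["user:purchases"] = purchases + [item]
    return {"status": "success", "recorded": item}

ctx = FakeToolContext()
result = log_purchase("coffee", ctx)
```

The LLM only ever generates the `item` argument; the framework supplies the context object alongside it.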
### **State Management**
The `tool_context.state` attribute provides direct read and write access to the state associated with the current session. It behaves like a dictionary but ensures that any modifications are tracked as deltas and persisted by the session service. This enables tools to maintain and share information across different interactions and agent steps.
- **Reading State**: Use standard dictionary access (`tool_context.state['my_key']`) or the `.get()` method (`tool_context.state.get('my_key', default_value)`).
- **Writing State**: Assign values directly (`tool_context.state['new_key'] = 'new_value'`). These changes are recorded in the state_delta of the resulting event.
- **State Prefixes**: Remember the standard state prefixes:
- `app:*`: Shared across all users of the application.
- `user:*`: Specific to the current user across all their sessions.
- (No prefix): Specific to the current session.
- `temp:*`: Temporary, not persisted across invocations (useful for passing data within a single run call but generally less useful inside a tool context which operates between LLM calls).
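As a standalone sketch of how these prefixes partition data, the plain dictionary below mimics the scoping behavior (the real `State` object behaves like this dictionary but additionally records each write as a delta); the keys and values are illustrative only.

```python
state = {}
state["app:api_version"] = "v2"      # shared by every user of the application
state["user:theme"] = "dark"         # this user, across all their sessions
state["login_attempts"] = 1          # this session only (no prefix)
state["temp:raw_response"] = "..."   # temporary, not persisted across invocations

# Scope is determined purely by the key's prefix:
user_keys = [k for k in state if k.startswith("user:")]
```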
```py
from google.adk.tools import ToolContext, FunctionTool
def update_user_preference(preference: str, value: str, tool_context: ToolContext):
    """Updates a user-specific preference."""
    user_prefs_key = "user:preferences"
    # Get current preferences or initialize if none exist
    preferences = tool_context.state.get(user_prefs_key, {})
    preferences[preference] = value
    # Write the updated dictionary back to the state
    tool_context.state[user_prefs_key] = preferences
    print(f"Tool: Updated user preference '{preference}' to '{value}'")
    return {"status": "success", "updated_preference": preference}
pref_tool = FunctionTool(func=update_user_preference)
# In an Agent:
# my_agent = Agent(..., tools=[pref_tool])
# When the LLM calls update_user_preference(preference='theme', value='dark', ...):
# The tool_context.state will be updated, and the change will be part of the
# resulting tool response event's actions.state_delta.
```
```typescript
import { ToolContext } from "@google/adk";
// Updates a user-specific preference.
export function updateUserThemePreference(
value: string,
toolContext: ToolContext
): Record<string, unknown> {
const userPrefsKey = "user:preferences";
// Get current preferences or initialize if none exist
const preferences = toolContext.state.get(userPrefsKey, {}) as Record<string, string>;
preferences["theme"] = value;
// Write the updated dictionary back to the state
toolContext.state.set(userPrefsKey, preferences);
console.log(
`Tool: Updated user preference ${userPrefsKey} to ${JSON.stringify(toolContext.state.get(userPrefsKey))}`
);
return {
status: "success",
updated_preference: toolContext.state.get(userPrefsKey),
};
// When the LLM calls updateUserThemePreference("dark"):
// The toolContext.state will be updated, and the change will be part of the
// resulting tool response event's actions.stateDelta.
}
```
```go
import (
"fmt"
"google.golang.org/adk/tool"
)
type updateUserPreferenceArgs struct {
Preference string `json:"preference" jsonschema:"The name of the preference to set."`
Value string `json:"value" jsonschema:"The value to set for the preference."`
}
type updateUserPreferenceResult struct {
UpdatedPreference string `json:"updated_preference"`
}
func updateUserPreference(ctx tool.Context, args updateUserPreferenceArgs) (*updateUserPreferenceResult, error) {
userPrefsKey := "user:preferences"
val, err := ctx.State().Get(userPrefsKey)
if err != nil {
val = make(map[string]any)
}
preferencesMap, ok := val.(map[string]any)
if !ok {
preferencesMap = make(map[string]any)
}
preferencesMap[args.Preference] = args.Value
if err := ctx.State().Set(userPrefsKey, preferencesMap); err != nil {
return nil, err
}
fmt.Printf("Tool: Updated user preference '%s' to '%s'\n", args.Preference, args.Value)
return &updateUserPreferenceResult{
UpdatedPreference: args.Preference,
}, nil
}
```
```java
import com.google.adk.tools.FunctionTool;
import com.google.adk.tools.ToolContext;
// Updates a user-specific preference.
public Map<String, Object> updateUserThemePreference(String value, ToolContext toolContext) {
String userPrefsKey = "user:preferences:theme";
// Write the new theme value back to the user-scoped state key
toolContext.state().put(userPrefsKey, value);
System.out.printf("Tool: Updated user preference %s to %s%n", userPrefsKey, value);
return Map.of("status", "success", "updated_preference", toolContext.state().get(userPrefsKey).toString());
// When the LLM calls updateUserThemePreference("dark"):
// The toolContext.state will be updated, and the change will be part of the
// resulting tool response event's actions.stateDelta.
}
```
### **Controlling Agent Flow**
The `tool_context.actions` attribute in Python and TypeScript (`ToolContext.actions()` in Java, `tool.Context.Actions()` in Go) holds an **EventActions** object. Modifying attributes on this object allows your tool to influence what the agent or framework does after the tool finishes execution.
- **`skip_summarization: bool`**: (Default: False) If set to True, instructs the ADK to bypass the LLM call that typically summarizes the tool's output. This is useful if your tool's return value is already a user-ready message.
- **`transfer_to_agent: str`**: Set this to the name of another agent. The framework will halt the current agent's execution and **transfer control of the conversation to the specified agent**. This allows tools to dynamically hand off tasks to more specialized agents.
- **`escalate: bool`**: (Default: False) Setting this to True signals that the current agent cannot handle the request and should pass control up to its parent agent (if in a hierarchy). In a LoopAgent, setting **escalate=True** in a sub-agent's tool will terminate the loop.
#### Example
```py
# Copyright 2025 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from google.adk.agents import Agent
from google.adk.tools import FunctionTool
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.adk.tools import ToolContext
from google.genai import types
APP_NAME="customer_support_agent"
USER_ID="user1234"
SESSION_ID="1234"
def check_and_transfer(query: str, tool_context: ToolContext) -> str:
    """Checks if the query requires escalation and transfers to another agent if needed."""
    if "urgent" in query.lower():
        print("Tool: Detected urgency, transferring to the support agent.")
        tool_context.actions.transfer_to_agent = "support_agent"
        return "Transferring to the support agent..."
    else:
        return f"Processed query: '{query}'. No further action needed."
escalation_tool = FunctionTool(func=check_and_transfer)
main_agent = Agent(
model='gemini-2.0-flash',
name='main_agent',
instruction="""You are the first point of contact for customer support of an analytics tool. Answer general queries. If the user indicates urgency, use the 'check_and_transfer' tool.""",
tools=[escalation_tool]
)
support_agent = Agent(
model='gemini-2.0-flash',
name='support_agent',
instruction="""You are the dedicated support agent. Mention that you are a support handler and help the user with their urgent issue."""
)
main_agent.sub_agents = [support_agent]
# Session and Runner
async def setup_session_and_runner():
    session_service = InMemorySessionService()
    session = await session_service.create_session(app_name=APP_NAME, user_id=USER_ID, session_id=SESSION_ID)
    runner = Runner(agent=main_agent, app_name=APP_NAME, session_service=session_service)
    return session, runner

# Agent Interaction
async def call_agent_async(query):
    content = types.Content(role='user', parts=[types.Part(text=query)])
    session, runner = await setup_session_and_runner()
    events = runner.run_async(user_id=USER_ID, session_id=SESSION_ID, new_message=content)
    async for event in events:
        if event.is_final_response():
            final_response = event.content.parts[0].text
            print("Agent Response: ", final_response)
# Note: In Colab, you can directly use 'await' at the top level.
# If running this code as a standalone Python script, you'll need to use asyncio.run() or manage the event loop.
await call_agent_async("this is urgent, i cant login")
```
```typescript
/**
* Copyright 2025 Google LLC
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { LlmAgent, FunctionTool, ToolContext, InMemoryRunner, isFinalResponse, stringifyContent } from "@google/adk";
import { z } from "zod";
import { Content, createUserContent } from "@google/genai";
function checkAndTransfer(
  params: { query: string },
  toolContext?: ToolContext
): Record<string, string> {
  if (!toolContext) {
    // This should not happen in a normal ADK flow where the tool is called by an agent.
    throw new Error("ToolContext is required to transfer agents.");
  }
  if (params.query.toLowerCase().includes("urgent")) {
    console.log("Tool: Urgent query detected, transferring to support_agent.");
    toolContext.actions.transferToAgent = "support_agent";
    return { status: "success", message: "Transferring to support agent." };
  }
  console.log("Tool: Query is not urgent, handling normally.");
  return { status: "success", message: "Query will be handled by the main agent." };
}
const transferTool = new FunctionTool({
name: "check_and_transfer",
description: "Checks the user's query and transfers to a support agent if urgent.",
parameters: z.object({
query: z.string().describe("The user query to analyze."),
}),
execute: checkAndTransfer,
});
const supportAgent = new LlmAgent({
name: "support_agent",
description: "Handles urgent user requests about accounts.",
instruction: "You are the support agent. Handle the user's urgent request.",
model: "gemini-2.5-flash"
});
const mainAgent = new LlmAgent({
name: "main_agent",
description: "The main agent that routes non-urgent queries.",
instruction: "You are the main agent. Use the check_and_transfer tool to analyze the user query. If the query is not urgent, handle it yourself.",
tools: [transferTool],
subAgents: [supportAgent],
model: "gemini-2.5-flash"
});
async function main() {
const runner = new InMemoryRunner({ agent: mainAgent, appName: "customer_support_app" });
console.log("--- Running with a non-urgent query ---");
await runner.sessionService.createSession({ appName: "customer_support_app", userId: "user1", sessionId: "session1" });
const nonUrgentMessage: Content = createUserContent("I have a general question about my account.");
for await (const event of runner.runAsync({ userId: "user1", sessionId: "session1", newMessage: nonUrgentMessage })) {
if (isFinalResponse(event) && event.content?.parts?.length) {
const text = stringifyContent(event).trim();
if (text) {
console.log(`Final Response: ${text}`);
}
}
}
console.log("\n--- Running with an urgent query ---");
await runner.sessionService.createSession({ appName: "customer_support_app", userId: "user1", sessionId: "session2" });
const urgentMessage: Content = createUserContent("My account is locked and this is urgent!");
for await (const event of runner.runAsync({ userId: "user1", sessionId: "session2", newMessage: urgentMessage })) {
if (isFinalResponse(event) && event.content?.parts?.length) {
const text = stringifyContent(event).trim();
if (text) {
console.log(`Final Response: ${text}`);
}
}
}
}
main();
```
```go
// Copyright 2025 Google LLC
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
package main
import (
"context"
"fmt"
"log"
"strings"
"google.golang.org/adk/agent"
"google.golang.org/adk/agent/llmagent"
"google.golang.org/adk/model/gemini"
"google.golang.org/adk/runner"
"google.golang.org/adk/session"
"google.golang.org/adk/tool"
"google.golang.org/adk/tool/functiontool"
"google.golang.org/genai"
)
type checkAndTransferArgs struct {
Query string `json:"query" jsonschema:"The user's query to check for urgency."`
}
type checkAndTransferResult struct {
Status string `json:"status"`
}
func checkAndTransfer(ctx tool.Context, args checkAndTransferArgs) (checkAndTransferResult, error) {
if strings.Contains(strings.ToLower(args.Query), "urgent") {
fmt.Println("Tool: Detected urgency, transferring to the support agent.")
ctx.Actions().TransferToAgent = "support_agent"
return checkAndTransferResult{Status: "Transferring to the support agent..."}, nil
}
return checkAndTransferResult{Status: fmt.Sprintf("Processed query: '%s'. No further action needed.", args.Query)}, nil
}
func main() {
ctx := context.Background()
model, err := gemini.NewModel(ctx, "gemini-2.0-flash", &genai.ClientConfig{})
if err != nil {
log.Fatal(err)
}
supportAgent, err := llmagent.New(llmagent.Config{
Name: "support_agent",
Model: model,
Instruction: "You are the dedicated support agent. Mention you are a support handler and help the user with their urgent issue.",
})
if err != nil {
log.Fatal(err)
}
checkAndTransferTool, err := functiontool.New(
functiontool.Config{
Name: "check_and_transfer",
Description: "Checks if the query requires escalation and transfers to another agent if needed.",
},
checkAndTransfer,
)
if err != nil {
log.Fatal(err)
}
mainAgent, err := llmagent.New(llmagent.Config{
Name: "main_agent",
Model: model,
Instruction: "You are the first point of contact for customer support of an analytics tool. Answer general queries. If the user indicates urgency, use the 'check_and_transfer' tool.",
Tools: []tool.Tool{checkAndTransferTool},
SubAgents: []agent.Agent{supportAgent},
})
if err != nil {
log.Fatal(err)
}
sessionService := session.InMemoryService()
runner, err := runner.New(runner.Config{
AppName: "customer_support_agent",
Agent: mainAgent,
SessionService: sessionService,
})
if err != nil {
log.Fatal(err)
}
session, err := sessionService.Create(ctx, &session.CreateRequest{
AppName: "customer_support_agent",
UserID: "user1234",
})
if err != nil {
log.Fatal(err)
}
run(ctx, runner, session.Session.ID(), "this is urgent, i cant login")
}
func run(ctx context.Context, r *runner.Runner, sessionID string, prompt string) {
fmt.Printf("\n> %s\n", prompt)
events := r.Run(
ctx,
"user1234",
sessionID,
genai.NewContentFromText(prompt, genai.RoleUser),
agent.RunConfig{
StreamingMode: agent.StreamingModeNone,
},
)
for event, err := range events {
if err != nil {
log.Fatalf("ERROR during agent execution: %v", err)
}
if event.Content.Parts[0].Text != "" {
fmt.Printf("Agent Response: %s\n", event.Content.Parts[0].Text)
}
}
}
```
```java
import com.google.adk.agents.LlmAgent;
import com.google.adk.runner.Runner;
import com.google.adk.sessions.InMemorySessionService;
import com.google.adk.sessions.Session;
import com.google.adk.tools.Annotations.Schema;
import com.google.adk.tools.FunctionTool;
import com.google.adk.tools.ToolContext;
import com.google.common.collect.ImmutableList;
import com.google.genai.types.Content;
import com.google.genai.types.Part;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
public class CustomerSupportAgentApp {
private static final String APP_NAME = "customer_support_agent";
private static final String USER_ID = "user1234";
private static final String SESSION_ID = "1234";
private static final String MODEL_ID = "gemini-2.0-flash";
/**
* Checks if the query requires escalation and transfers to another agent if needed.
*
* @param query The user's query.
* @param toolContext The context for the tool.
* @return A map indicating the result of the check and transfer.
*/
public static Map<String, Object> checkAndTransfer(
@Schema(name = "query", description = "the user query")
String query,
@Schema(name = "toolContext", description = "the tool context")
ToolContext toolContext) {
Map<String, Object> response = new HashMap<>();
if (query.toLowerCase(Locale.ROOT).contains("urgent")) {
System.out.println("Tool: Detected urgency, transferring to the support agent.");
toolContext.actions().setTransferToAgent("support_agent");
response.put("status", "transferring");
response.put("message", "Transferring to the support agent...");
} else {
response.put("status", "processed");
response.put(
"message", String.format("Processed query: '%s'. No further action needed.", query));
}
return response;
}
/**
* Calls the agent with the given query and prints the final response.
*
* @param runner The runner to use.
* @param query The query to send to the agent.
*/
public static void callAgent(Runner runner, String query) {
Content content =
Content.fromParts(Part.fromText(query));
InMemorySessionService sessionService = (InMemorySessionService) runner.sessionService();
Session session =
sessionService
.createSession(APP_NAME, USER_ID, /* state= */ null, SESSION_ID)
.blockingGet();
runner
.runAsync(session.userId(), session.id(), content)
.forEach(
event -> {
if (event.finalResponse()
&& event.content().isPresent()
&& event.content().get().parts().isPresent()
&& !event.content().get().parts().get().isEmpty()
&& event.content().get().parts().get().get(0).text().isPresent()) {
String finalResponse = event.content().get().parts().get().get(0).text().get();
System.out.println("Agent Response: " + finalResponse);
}
});
}
public static void main(String[] args) throws NoSuchMethodException {
FunctionTool escalationTool =
FunctionTool.create(
CustomerSupportAgentApp.class.getMethod(
"checkAndTransfer", String.class, ToolContext.class));
LlmAgent supportAgent =
LlmAgent.builder()
.model(MODEL_ID)
.name("support_agent")
.description("""
The dedicated support agent.
Mentions it is a support handler and helps the user with their urgent issue.
""")
.instruction("""
You are the dedicated support agent.
Mention you are a support handler and help the user with their urgent issue.
""")
.build();
LlmAgent mainAgent =
LlmAgent.builder()
.model(MODEL_ID)
.name("main_agent")
.description("""
The first point of contact for customer support of an analytics tool.
Answers general queries.
If the user indicates urgency, uses the 'check_and_transfer' tool.
""")
.instruction("""
You are the first point of contact for customer support of an analytics tool.
Answer general queries.
If the user indicates urgency, use the 'check_and_transfer' tool.
""")
.tools(ImmutableList.of(escalationTool))
.subAgents(supportAgent)
.build();
InMemorySessionService sessionService = new InMemorySessionService();
Runner runner = new Runner(mainAgent, APP_NAME, null, sessionService);
// Agent Interaction
callAgent(runner, "this is urgent, i cant login");
}
}
```
##### Explanation
- We define two agents: `main_agent` and `support_agent`. The `main_agent` is designed to be the initial point of contact.
- The `check_and_transfer` tool, when called by `main_agent`, examines the user's query.
- If the query contains the word "urgent", the tool accesses the `tool_context`, specifically **`tool_context.actions`**, and sets the `transfer_to_agent` attribute to `support_agent`.
- This action signals to the framework to **transfer the control of the conversation to the agent named `support_agent`**.
- When the `main_agent` processes the urgent query, the `check_and_transfer` tool triggers the transfer, and the subsequent response comes from the `support_agent`.
- For a normal query without urgency, the tool simply processes it without triggering a transfer.
This example illustrates how a tool, through `EventActions` in its `ToolContext`, can dynamically influence the flow of the conversation by transferring control to another specialized agent.
### **Authentication**
ToolContext provides mechanisms for tools interacting with authenticated APIs. If your tool needs to handle authentication, you might use the following:
- **`auth_response`** (in Python): Contains credentials (e.g., a token) if authentication was already handled by the framework before your tool was called (common with `RestApiTool` and OpenAPI security schemes). In TypeScript, this is retrieved via the `getAuthResponse()` method.
- **`request_credential(auth_config: dict)`** (in Python) or **`requestCredential(authConfig: AuthConfig)`** (in TypeScript): Call this method if your tool determines authentication is needed but credentials aren't available. This signals the framework to start an authentication flow based on the provided `auth_config`.
- **`get_auth_response()`** (in Python) or **`getAuthResponse(authConfig: AuthConfig)`** (in TypeScript): Call this in a subsequent invocation (after `request_credential` was successfully handled) to retrieve the credentials the user provided.
For detailed explanations of authentication flows, configuration, and examples, please refer to the dedicated Tool Authentication documentation page.
### **Context-Aware Data Access Methods**
These methods provide convenient ways for your tool to interact with persistent data associated with the session or user, managed by configured services.
- **`list_artifacts()`** (in Python) or **`listArtifacts()`** (in Java and TypeScript): Returns a list of filenames (or keys) for all artifacts currently stored for the session via the artifact_service. Artifacts are typically files (images, documents, etc.) uploaded by the user or generated by tools/agents.
- **`load_artifact(filename: str)`**: Retrieves a specific artifact by its filename from the **artifact_service**. You can optionally specify a version; if omitted, the latest version is returned. Returns a `google.genai.types.Part` object containing the artifact data and mime type, or None if not found.
- **`save_artifact(filename: str, artifact: types.Part)`**: Saves a new version of an artifact to the artifact_service. Returns the new version number (starting from 0).
- **`search_memory(query: str)`**: (Support in ADK Python, Go and TypeScript) Queries the user's long-term memory using the configured `memory_service`. This is useful for retrieving relevant information from past interactions or stored knowledge. The structure of the **SearchMemoryResponse** depends on the specific memory service implementation but typically contains relevant text snippets or conversation excerpts.
#### Example
```py
# Copyright 2025 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from google.adk.tools import ToolContext, FunctionTool
from google.genai import types

async def process_document(
    document_name: str, analysis_query: str, tool_context: ToolContext
) -> dict:
    """Analyzes a document using context from memory."""
    # 1. Load the artifact
    print(f"Tool: Attempting to load artifact: {document_name}")
    document_part = await tool_context.load_artifact(document_name)
    if not document_part:
        return {"status": "error", "message": f"Document '{document_name}' not found."}
    document_text = document_part.text  # Assuming it's text for simplicity
    print(f"Tool: Loaded document '{document_name}' ({len(document_text)} chars).")

    # 2. Search memory for related context
    print(f"Tool: Searching memory for context related to: '{analysis_query}'")
    memory_response = await tool_context.search_memory(
        f"Context for analyzing document about {analysis_query}"
    )
    memory_context = "\n".join(
        [
            m.events[0].content.parts[0].text
            for m in memory_response.memories
            if m.events and m.events[0].content
        ]
    )  # Simplified extraction
    print(f"Tool: Found memory context: {memory_context[:100]}...")

    # 3. Perform analysis (placeholder)
    analysis_result = f"Analysis of '{document_name}' regarding '{analysis_query}' using memory context: [Placeholder Analysis Result]"
    print("Tool: Performed analysis.")

    # 4. Save the analysis result as a new artifact
    analysis_part = types.Part.from_text(text=analysis_result)
    new_artifact_name = f"analysis_{document_name}"
    version = await tool_context.save_artifact(new_artifact_name, analysis_part)
    print(f"Tool: Saved analysis result as '{new_artifact_name}' version {version}.")
    return {
        "status": "success",
        "analysis_artifact": new_artifact_name,
        "version": version,
    }

doc_analysis_tool = FunctionTool(func=process_document)

# In an Agent:
# Assume artifact 'report.txt' was previously saved.
# Assume memory service is configured and has relevant past data.
# my_agent = Agent(..., tools=[doc_analysis_tool], artifact_service=..., memory_service=...)
```
```typescript
import { Part } from "@google/genai";
import { ToolContext } from "@google/adk";
// Analyzes a document using context from memory.
export async function processDocument(
  params: { documentName: string; analysisQuery: string },
  toolContext?: ToolContext
): Promise<Record<string, unknown>> {
if (!toolContext) {
throw new Error("ToolContext is required for this tool.");
}
// 1. List all available artifacts
const artifacts = await toolContext.listArtifacts();
console.log(`Listing all available artifacts: ${artifacts}`);
// 2. Load an artifact
console.log(`Tool: Attempting to load artifact: ${params.documentName}`);
const documentPart = await toolContext.loadArtifact(params.documentName);
if (!documentPart) {
console.log(`Tool: Document '${params.documentName}' not found.`);
return {
status: "error",
message: `Document '${params.documentName}' not found.`,
};
}
const documentText = documentPart.text ?? "";
console.log(
`Tool: Loaded document '${params.documentName}' (${documentText.length} chars).`
);
// 3. Search memory for related context
console.log(`Tool: Searching memory for context related to '${params.analysisQuery}'`);
const memory_results = await toolContext.searchMemory(params.analysisQuery);
console.log(`Tool: Found ${memory_results.memories.length} relevant memories.`);
const context_from_memory = memory_results.memories
.map((m) => m.content.parts[0].text)
.join("\n");
// 4. Perform analysis (placeholder)
const analysisResult =
`Analysis of '${params.documentName}' regarding '${params.analysisQuery}':\n` +
`Context from Memory:\n${context_from_memory}\n` +
`[Placeholder Analysis Result]`;
console.log("Tool: Performed analysis.");
// 5. Save the analysis result as a new artifact
const analysisPart: Part = { text: analysisResult };
const newArtifactName = `analysis_${params.documentName}`;
await toolContext.saveArtifact(newArtifactName, analysisPart);
console.log(`Tool: Saved analysis result to '${newArtifactName}'.`);
return {
status: "success",
analysis_artifact: newArtifactName,
};
}
```
```go
// Copyright 2025 Google LLC
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
package main
import (
"fmt"
"google.golang.org/adk/tool"
"google.golang.org/genai"
)
type processDocumentArgs struct {
DocumentName string `json:"document_name" jsonschema:"The name of the document to be processed."`
AnalysisQuery string `json:"analysis_query" jsonschema:"The query for the analysis."`
}
type processDocumentResult struct {
Status string `json:"status"`
AnalysisArtifact string `json:"analysis_artifact,omitempty"`
Version int64 `json:"version,omitempty"`
Message string `json:"message,omitempty"`
}
func processDocument(ctx tool.Context, args processDocumentArgs) (*processDocumentResult, error) {
fmt.Printf("Tool: Attempting to load artifact: %s\n", args.DocumentName)
// List all artifacts
listResponse, err := ctx.Artifacts().List(ctx)
if err != nil {
return nil, fmt.Errorf("failed to list artifacts")
}
fmt.Println("Tool: Available artifacts:")
for _, file := range listResponse.FileNames {
fmt.Printf(" - %s\n", file)
}
documentPart, err := ctx.Artifacts().Load(ctx, args.DocumentName)
if err != nil {
return nil, fmt.Errorf("document '%s' not found", args.DocumentName)
}
fmt.Printf("Tool: Loaded document '%s' of size %d bytes.\n", args.DocumentName, len(documentPart.Part.InlineData.Data))
// 3. Search memory for related context
fmt.Printf("Tool: Searching memory for context related to: '%s'\n", args.AnalysisQuery)
memoryResp, err := ctx.SearchMemory(ctx, args.AnalysisQuery)
if err != nil {
fmt.Printf("Tool: Error searching memory: %v\n", err)
}
memoryResultCount := 0
if memoryResp != nil {
memoryResultCount = len(memoryResp.Memories)
}
fmt.Printf("Tool: Found %d memory results.\n", memoryResultCount)
analysisResult := fmt.Sprintf("Analysis of '%s' regarding '%s' using memory context: [Placeholder Analysis Result]", args.DocumentName, args.AnalysisQuery)
fmt.Println("Tool: Performed analysis.")
analysisPart := genai.NewPartFromText(analysisResult)
newArtifactName := fmt.Sprintf("analysis_%s", args.DocumentName)
version, err := ctx.Artifacts().Save(ctx, newArtifactName, analysisPart)
if err != nil {
return nil, fmt.Errorf("failed to save artifact")
}
fmt.Printf("Tool: Saved analysis result as '%s' version %d.\n", newArtifactName, version.Version)
return &processDocumentResult{
Status: "success",
AnalysisArtifact: newArtifactName,
Version: version.Version,
}, nil
}
```
```java
// Analyzes a document using context from memory.
// You can also list, load and save artifacts using Callback Context or LoadArtifacts tool.
public static @NonNull Maybe<Map<String, Object>> processDocument(
@Annotations.Schema(description = "The name of the document to analyze.") String documentName,
@Annotations.Schema(description = "The query for the analysis.") String analysisQuery,
ToolContext toolContext) {
// 1. List all available artifacts
System.out.printf(
"Listing all available artifacts %s:", toolContext.listArtifacts().blockingGet());
// 2. Load an artifact to memory
System.out.println("Tool: Attempting to load artifact: " + documentName);
Part documentPart = toolContext.loadArtifact(documentName, Optional.empty()).blockingGet();
if (documentPart == null) {
System.out.println("Tool: Document '" + documentName + "' not found.");
return Maybe.just(
ImmutableMap.of(
"status", "error", "message", "Document '" + documentName + "' not found."));
}
String documentText = documentPart.text().orElse("");
System.out.println(
"Tool: Loaded document '" + documentName + "' (" + documentText.length() + " chars).");
// 3. Perform analysis (placeholder)
String analysisResult =
"Analysis of '"
+ documentName
+ "' regarding '"
+ analysisQuery
+ " [Placeholder Analysis Result]";
System.out.println("Tool: Performed analysis.");
// 4. Save the analysis result as a new artifact
Part analysisPart = Part.fromText(analysisResult);
String newArtifactName = "analysis_" + documentName;
toolContext.saveArtifact(newArtifactName, analysisPart);
return Maybe.just(
ImmutableMap.<String, Object>builder()
.put("status", "success")
.put("analysis_artifact", newArtifactName)
.build());
}
// FunctionTool processDocumentTool =
// FunctionTool.create(ToolContextArtifactExample.class, "processDocument");
// In the Agent, include this function tool.
// LlmAgent agent = LlmAgent().builder().tools(processDocumentTool).build();
```
By leveraging the **ToolContext**, developers can create more sophisticated and context-aware custom tools that seamlessly integrate with ADK's architecture and enhance the overall capabilities of their agents.
## Defining Effective Tool Functions
When using a method or function as an ADK Tool, how you define it significantly impacts the agent's ability to use it correctly. The agent's Large Language Model (LLM) relies heavily on the function's **name**, **parameters (arguments)**, **type hints**, and **docstring** / **source code comments** to understand its purpose and generate the correct call.
Here are key guidelines for defining effective tool functions:
- **Function Name:**
- Use descriptive, verb-noun based names that clearly indicate the action (e.g., `get_weather`, `searchDocuments`, `schedule_meeting`).
- Avoid generic names like `run`, `process`, `handle_data`, or overly ambiguous names like `doStuff`. Even with a good description, a name like `do_stuff` might confuse the model about when to use the tool versus, for example, `cancelFlight`.
- The LLM uses the function name as a primary identifier during tool selection.
- **Parameters (Arguments):**
- Your function can have any number of parameters.
- Use clear and descriptive names (e.g., `city` instead of `c`, `search_query` instead of `q`).
- **Provide type hints in Python** for all parameters (e.g., `city: str`, `user_id: int`, `items: list[str]`). This is essential for ADK to generate the correct schema for the LLM.
- Ensure all parameter types are **JSON serializable**. All Java primitives, as well as standard Python types like `str`, `int`, `float`, `bool`, `list`, `dict`, and their combinations, are generally safe. Avoid complex custom class instances as direct parameters unless they have a clear JSON representation.
- **Do not set default values** for parameters (e.g., avoid `def my_func(param1: str = "default")`). Default values are not reliably supported or used by the underlying models during function call generation. All necessary information should be derived by the LLM from the context or explicitly requested if missing.
- **`self` / `cls` Handled Automatically:** Implicit parameters like `self` (for instance methods) or `cls` (for class methods) are automatically handled by ADK and excluded from the schema shown to the LLM. You only need to define type hints and descriptions for the logical parameters your tool requires the LLM to provide.
- **Return Type:**
- The function's return value **must be a dictionary (`dict`)** in Python, a **Map** in Java, or a plain **object** in TypeScript.
- If your function returns a non-dictionary type (e.g., a string, number, list), the ADK framework will automatically wrap it into a dictionary/Map like `{'result': your_original_return_value}` before passing the result back to the model.
- Design the dictionary/Map keys and values to be **descriptive and easily understood *by the LLM***. Remember, the model reads this output to decide its next step.
- Include meaningful keys. For example, instead of returning just an error code like `500`, return `{'status': 'error', 'error_message': 'Database connection failed'}`.
- It's a **highly recommended practice** to include a `status` key (e.g., `'success'`, `'error'`, `'pending'`, `'ambiguous'`) to clearly indicate the outcome of the tool execution for the model.
- **Docstring / Source Code Comments:**
- **This is critical.** The docstring is the primary source of descriptive information for the LLM.
- **Clearly state what the tool *does*.** Be specific about its purpose and limitations.
- **Explain *when* the tool should be used.** Provide context or example scenarios to guide the LLM's decision-making.
- **Describe *each parameter* clearly.** Explain what information the LLM needs to provide for that argument.
- Describe the **structure and meaning of the expected `dict` return value**, especially the different `status` values and associated data keys.
- **Do not describe the injected ToolContext parameter**. Avoid mentioning the optional `tool_context: ToolContext` parameter within the docstring description since it is not a parameter the LLM needs to know about. ToolContext is injected by ADK, *after* the LLM decides to call it.
**Example of a good definition:**
```python
def lookup_order_status(order_id: str) -> dict:
"""Fetches the current status of a customer's order using its ID.
Use this tool ONLY when a user explicitly asks for the status of
a specific order and provides the order ID. Do not use it for
general inquiries.
Args:
order_id: The unique identifier of the order to look up.
Returns:
A dictionary indicating the outcome.
On success, status is 'success' and includes an 'order' dictionary.
On failure, status is 'error' and includes an 'error_message'.
Example success: {'status': 'success', 'order': {'state': 'shipped', 'tracking_number': '1Z9...'}}
Example error: {'status': 'error', 'error_message': 'Order ID not found.'}
"""
# ... function implementation to fetch status ...
if status_details := fetch_status_from_backend(order_id):
return {
"status": "success",
"order": {
"state": status_details.state,
"tracking_number": status_details.tracking,
},
}
else:
return {"status": "error", "error_message": f"Order ID {order_id} not found."}
```
```typescript
/**
* Fetches the current status of a customer's order using its ID.
*
* Use this tool ONLY when a user explicitly asks for the status of
* a specific order and provides the order ID. Do not use it for
* general inquiries.
*
* @param params The parameters for the function.
* @param params.order_id The unique identifier of the order to look up.
* @returns A dictionary indicating the outcome.
* On success, status is 'success' and includes an 'order' dictionary.
* On failure, status is 'error' and includes an 'error_message'.
* Example success: {'status': 'success', 'order': {'state': 'shipped', 'tracking_number': '1Z9...'}}
* Example error: {'status': 'error', 'error_message': 'Order ID not found.'}
*/
async function lookupOrderStatus(params: { order_id: string }): Promise<Record<string, unknown>> {
// ... function implementation to fetch status from a backend ...
const status_details = await fetchStatusFromBackend(params.order_id);
if (status_details) {
return {
"status": "success",
"order": {
"state": status_details.state,
"tracking_number": status_details.tracking,
},
};
} else {
return { "status": "error", "error_message": `Order ID ${params.order_id} not found.` };
}
}
// Placeholder for a backend call
async function fetchStatusFromBackend(order_id: string): Promise<{state: string, tracking: string} | null> {
if (order_id === "12345") {
return { state: "shipped", tracking: "1Z9..." };
}
return null;
}
```
```go
import (
"fmt"
"google.golang.org/adk/tool"
)
type lookupOrderStatusArgs struct {
OrderID string `json:"order_id" jsonschema:"The ID of the order to look up."`
}
type order struct {
State string `json:"state"`
TrackingNumber string `json:"tracking_number"`
}
type lookupOrderStatusResult struct {
Status string `json:"status"`
Order order `json:"order,omitempty"`
}
func lookupOrderStatus(ctx tool.Context, args lookupOrderStatusArgs) (*lookupOrderStatusResult, error) {
// ... function implementation to fetch status ...
statusDetails, ok := fetchStatusFromBackend(args.OrderID)
if !ok {
return nil, fmt.Errorf("order ID %s not found", args.OrderID)
}
return &lookupOrderStatusResult{
Status: "success",
Order: order{
State: statusDetails.State,
TrackingNumber: statusDetails.Tracking,
},
}, nil
}
```
```java
/**
* Retrieves the current weather report for a specified city.
*
* @param city The city for which to retrieve the weather report.
* @param toolContext The context for the tool.
* @return A dictionary containing the weather information.
*/
public static Map<String, Object> getWeatherReport(String city, ToolContext toolContext) {
  Map<String, Object> response = new HashMap<>();
if (city.toLowerCase(Locale.ROOT).equals("london")) {
response.put("status", "success");
response.put(
"report",
"The current weather in London is cloudy with a temperature of 18 degrees Celsius and a"
+ " chance of rain.");
} else if (city.toLowerCase(Locale.ROOT).equals("paris")) {
response.put("status", "success");
response.put("report", "The weather in Paris is sunny with a temperature of 25 degrees Celsius.");
} else {
response.put("status", "error");
response.put("error_message", String.format("Weather information for '%s' is not available.", city));
}
return response;
}
```
- **Simplicity and Focus:**
- **Keep Tools Focused:** Each tool should ideally perform one well-defined task.
- **Fewer Parameters are Better:** Models generally handle tools with fewer, clearly defined parameters more reliably than those with many optional or complex ones.
- **Use Simple Data Types:** Prefer basic types (`str`, `int`, `bool`, `float`, `List[str]`, in **Python**; `int`, `byte`, `short`, `long`, `float`, `double`, `boolean` and `char` in **Java**; or `string`, `number`, `boolean`, and arrays like `string[]` in **TypeScript**) over complex custom classes or deeply nested structures as parameters when possible.
- **Decompose Complex Tasks:** Break down functions that perform multiple distinct logical steps into smaller, more focused tools. For instance, instead of a single `update_user_profile(profile: ProfileObject)` tool, consider separate tools like `update_user_name(name: str)`, `update_user_address(address: str)`, `update_user_preferences(preferences: list[str])`, etc. This makes it easier for the LLM to select and use the correct capability.
By adhering to these guidelines, you provide the LLM with the clarity and structure it needs to effectively utilize your custom function tools, leading to more capable and reliable agent behavior.
## Toolsets: Grouping and Dynamically Providing Tools
Supported in ADK: Python v0.5.0, TypeScript v0.2.0
Beyond individual tools, ADK introduces the concept of a **Toolset** via the `BaseToolset` interface (defined in `google.adk.tools.base_toolset`). A toolset allows you to manage and provide a collection of `BaseTool` instances, often dynamically, to an agent.
This approach is beneficial for:
- **Organizing Related Tools:** Grouping tools that serve a common purpose (e.g., all tools for mathematical operations, or all tools interacting with a specific API).
- **Dynamic Tool Availability:** Enabling an agent to have different tools available based on the current context (e.g., user permissions, session state, or other runtime conditions). The `get_tools` method of a toolset can decide which tools to expose.
- **Integrating External Tool Providers:** Toolsets can act as adapters for tools coming from external systems, like an OpenAPI specification or an MCP server, converting them into ADK-compatible `BaseTool` objects.
### The `BaseToolset` Interface
Any class acting as a toolset in ADK should implement the `BaseToolset` abstract base class. This interface primarily defines two methods:
- **`async def get_tools(...) -> list[BaseTool]:`** This is the core method of a toolset. When an ADK agent needs to know its available tools, it will call `get_tools()` on each `BaseToolset` instance provided in its `tools` list.
- It receives an optional `readonly_context` (an instance of `ReadonlyContext`). This context provides read-only access to information like the current session state (`readonly_context.state`), agent name, and invocation ID. The toolset can use this context to dynamically decide which tools to return.
- It **must** return a `list` of `BaseTool` instances (e.g., `FunctionTool`, `RestApiTool`).
- **`async def close(self) -> None:`** This asynchronous method is called by the ADK framework when the toolset is no longer needed, for example, when an agent server is shutting down or the `Runner` is being closed. Implement this method to perform any necessary cleanup, such as closing network connections, releasing file handles, or cleaning up other resources managed by the toolset.
### Using Toolsets with Agents
You can include instances of your `BaseToolset` implementations directly in an `LlmAgent`'s `tools` list, alongside individual `BaseTool` instances.
When the agent initializes or needs to determine its available capabilities, the ADK framework will iterate through the `tools` list:
- If an item is a `BaseTool` instance, it's used directly.
- If an item is a `BaseToolset` instance, its `get_tools()` method is called (with the current `ReadonlyContext`), and the returned list of `BaseTool`s is added to the agent's available tools.
### Example: A Simple Math Toolset
Let's create a basic example of a toolset that provides simple arithmetic operations.
```py
import asyncio
from typing import Any, Dict, List, Optional

from google.adk.agents import LlmAgent
from google.adk.agents.readonly_context import ReadonlyContext
from google.adk.tools import BaseTool, FunctionTool, ToolContext
from google.adk.tools.base_toolset import BaseToolset

# 1. Define the individual tool functions
def add_numbers(a: int, b: int, tool_context: ToolContext) -> Dict[str, Any]:
"""Adds two integer numbers.
Args:
a: The first number.
b: The second number.
Returns:
A dictionary with the sum, e.g., {'status': 'success', 'result': 5}
"""
print(f"Tool: add_numbers called with a={a}, b={b}")
result = a + b
# Example: Storing something in tool_context state
tool_context.state["last_math_operation"] = "addition"
return {"status": "success", "result": result}
def subtract_numbers(a: int, b: int) -> Dict[str, Any]:
"""Subtracts the second number from the first.
Args:
a: The first number.
b: The second number.
Returns:
A dictionary with the difference, e.g., {'status': 'success', 'result': 1}
"""
print(f"Tool: subtract_numbers called with a={a}, b={b}")
return {"status": "success", "result": a - b}
# 2. Create the Toolset by implementing BaseToolset
class SimpleMathToolset(BaseToolset):
def __init__(self, prefix: str = "math_"):
self.prefix = prefix
# Create FunctionTool instances once
self._add_tool = FunctionTool(
func=add_numbers,
name=f"{self.prefix}add_numbers", # Toolset can customize names
)
self._subtract_tool = FunctionTool(
func=subtract_numbers, name=f"{self.prefix}subtract_numbers"
)
print(f"SimpleMathToolset initialized with prefix '{self.prefix}'")
async def get_tools(
self, readonly_context: Optional[ReadonlyContext] = None
) -> List[BaseTool]:
print(f"SimpleMathToolset.get_tools() called.")
# Example of dynamic behavior:
# Could use readonly_context.state to decide which tools to return
# For instance, if readonly_context.state.get("enable_advanced_math"):
# return [self._add_tool, self._subtract_tool, self._multiply_tool]
# For this simple example, always return both tools
tools_to_return = [self._add_tool, self._subtract_tool]
print(f"SimpleMathToolset providing tools: {[t.name for t in tools_to_return]}")
return tools_to_return
async def close(self) -> None:
# No resources to clean up in this simple example
print(f"SimpleMathToolset.close() called for prefix '{self.prefix}'.")
await asyncio.sleep(0) # Placeholder for async cleanup if needed
# 3. Define an individual tool (not part of the toolset)
def greet_user(name: str = "User") -> Dict[str, str]:
"""Greets the user."""
print(f"Tool: greet_user called with name={name}")
return {"greeting": f"Hello, {name}!"}
greet_tool = FunctionTool(func=greet_user)
# 4. Instantiate the toolset
math_toolset_instance = SimpleMathToolset(prefix="calculator_")
# 5. Define an agent that uses both the individual tool and the toolset
calculator_agent = LlmAgent(
name="CalculatorAgent",
model="gemini-2.0-flash", # Replace with your desired model
instruction="You are a helpful calculator and greeter. "
"Use 'greet_user' for greetings. "
"Use 'calculator_add_numbers' to add and 'calculator_subtract_numbers' to subtract. "
"Announce the state of 'last_math_operation' if it's set.",
    tools=[greet_tool, math_toolset_instance],  # Individual tool and toolset instance
)
```
```typescript
/**
* Copyright 2025 Google LLC
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { LlmAgent, FunctionTool, ToolContext, BaseToolset, InMemoryRunner, isFinalResponse, BaseTool, stringifyContent } from "@google/adk";
import { z } from "zod";
import { Content, createUserContent } from "@google/genai";
function addNumbers(params: { a: number; b: number }, toolContext?: ToolContext): Record<string, number> {
if (!toolContext) {
throw new Error("ToolContext is required for this tool.");
}
const result = params.a + params.b;
toolContext.state.set("last_math_result", result);
return { result: result };
}
function subtractNumbers(params: { a: number; b: number }): Record<string, number> {
return { result: params.a - params.b };
}
function greetUser(params: { name: string }): Record<string, string> {
return { greeting: `Hello, ${params.name}!` };
}
class SimpleMathToolset extends BaseToolset {
private readonly tools: BaseTool[];
constructor(prefix = "") {
super([]); // No filter
this.tools = [
new FunctionTool({
name: `${prefix}add_numbers`,
description: "Adds two numbers and stores the result in the session state.",
parameters: z.object({ a: z.number(), b: z.number() }),
execute: addNumbers,
}),
new FunctionTool({
name: `${prefix}subtract_numbers`,
description: "Subtracts the second number from the first.",
parameters: z.object({ a: z.number(), b: z.number() }),
execute: subtractNumbers,
}),
];
}
  async getTools(): Promise<BaseTool[]> {
return this.tools;
}
  async close(): Promise<void> {
console.log("SimpleMathToolset closed.");
}
}
async function main() {
const mathToolset = new SimpleMathToolset("calculator_");
const greetTool = new FunctionTool({
name: "greet_user",
description: "Greets the user.",
parameters: z.object({ name: z.string() }),
execute: greetUser,
});
const instruction =
`You are a calculator and a greeter.
If the user asks for a math operation, use the calculator tools.
If the user asks for a greeting, use the greet_user tool.
The result of the last math operation is stored in the 'last_math_result' state variable.`;
const calculatorAgent = new LlmAgent({
name: "calculator_agent",
instruction: instruction,
tools: [greetTool, mathToolset],
model: "gemini-2.5-flash",
});
const runner = new InMemoryRunner({ agent: calculatorAgent, appName: "toolset_app" });
await runner.sessionService.createSession({ appName: "toolset_app", userId: "user1", sessionId: "session1" });
const message: Content = createUserContent("What is 5 + 3?");
for await (const event of runner.runAsync({ userId: "user1", sessionId: "session1", newMessage: message })) {
if (isFinalResponse(event) && event.content?.parts?.length) {
const text = stringifyContent(event).trim();
if (text) {
console.log(`Response from agent: ${text}`);
}
}
}
await mathToolset.close();
}
main();
```
In this example:
- `SimpleMathToolset` implements `BaseToolset` and its `get_tools()` method returns `FunctionTool` instances for `add_numbers` and `subtract_numbers`. It also customizes their names using a prefix.
- The `calculator_agent` is configured with both an individual `greet_tool` and an instance of `SimpleMathToolset`.
- When `calculator_agent` is run, ADK will call `math_toolset_instance.get_tools()`. The agent's LLM will then have access to `greet_user`, `calculator_add_numbers`, and `calculator_subtract_numbers` to handle user requests.
- The `add_numbers` tool demonstrates writing to `tool_context.state`, and the agent's instruction mentions reading this state.
- The `close()` method is called to ensure any resources held by the toolset are released.
Toolsets offer a powerful way to organize, manage, and dynamically provide collections of tools to your ADK agents, leading to more modular, maintainable, and adaptable agentic applications.
# Authenticating with Tools
Supported in ADK: Python v0.1.0
Many tools need to access protected resources (like user data in Google Calendar, Salesforce records, etc.) and require authentication. ADK provides a system to handle various authentication methods securely.
The key components involved are:
1. **`AuthScheme`**: Defines *how* an API expects authentication credentials (e.g., as an API Key in a header, an OAuth 2.0 Bearer token). ADK supports the same types of authentication schemes as OpenAPI 3.0. To know more about what each type of credential is, refer to [OpenAPI doc: Authentication](https://swagger.io/docs/specification/v3_0/authentication/). ADK uses specific classes like `APIKey`, `HTTPBearer`, `OAuth2`, `OpenIdConnectWithConfig`.
1. **`AuthCredential`**: Holds the *initial* information needed to *start* the authentication process (e.g., your application's OAuth Client ID/Secret, an API key value). It includes an `auth_type` (like `API_KEY`, `OAUTH2`, `SERVICE_ACCOUNT`) specifying the credential type.
The general flow involves providing these details when configuring a tool. ADK then attempts to automatically exchange the initial credential for a usable one (like an access token) before the tool makes an API call. For flows requiring user interaction (like OAuth consent), a specific interactive process involving the Agent Client application is triggered.
## Supported Initial Credential Types
- **API_KEY:** For simple key/value authentication. Usually requires no exchange.
- **HTTP:** Can represent Basic Auth (not recommended/supported for exchange) or already obtained Bearer tokens. If it's a Bearer token, no exchange is needed.
- **OAUTH2:** For standard OAuth 2.0 flows. Requires configuration (client ID, secret, scopes) and often triggers the interactive flow for user consent.
- **OPEN_ID_CONNECT:** For authentication based on OpenID Connect. Similar to OAuth2, often requires configuration and user interaction.
- **SERVICE_ACCOUNT:** For Google Cloud Service Account credentials (JSON key or Application Default Credentials). Typically exchanged for a Bearer token.
## Configuring Authentication on Tools
You set up authentication when defining your tool:
- **RestApiTool / OpenAPIToolset**: Pass `auth_scheme` and `auth_credential` during initialization
- **GoogleApiToolSet Tools**: ADK has built-in first-party tools such as Google Calendar and BigQuery. Use the toolset's specific configuration method.
- **APIHubToolset / ApplicationIntegrationToolset**: Pass `auth_scheme` and `auth_credential` during initialization, if the API managed in API Hub / provided by Application Integration requires authentication.
WARNING
Storing sensitive credentials like access tokens and especially refresh tokens directly in the session state might pose security risks depending on your session storage backend (`SessionService`) and overall application security posture.
- **`InMemorySessionService`:** Suitable for testing and development, but data is lost when the process ends. Less risk as it's transient.
- **Database/Persistent Storage:** **Strongly consider encrypting** the token data before storing it in the database using a robust encryption library (like `cryptography`) and managing encryption keys securely (e.g., using a key management service).
- **Secure Secret Stores:** For production environments, storing sensitive credentials in a dedicated secret manager (like Google Cloud Secret Manager or HashiCorp Vault) is the **most recommended approach**. Your tool could potentially store only short-lived access tokens or secure references (not the refresh token itself) in the session state, fetching the necessary secrets from the secure store when needed.
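One way to apply the secret-store recommendation is to keep only an opaque reference in session state and resolve it through your secret-store client at call time. The helpers below are hypothetical, and `fetch_secret` stands in for whatever read call your secret manager provides:

```py
from typing import Callable, Dict

def store_token_reference(state: Dict, cache_key: str, secret_ref: str) -> None:
    # Persist only an opaque reference in session state, never the token itself.
    state[cache_key] = {"secret_ref": secret_ref}

def resolve_access_token(state: Dict, cache_key: str, fetch_secret: Callable[[str], str]) -> str:
    # Resolve the stored reference through the secret-store client at call time.
    secret_ref = state[cache_key]["secret_ref"]
    return fetch_secret(secret_ref)
```

In production, `fetch_secret` would wrap your secret manager's read operation; the session state then never contains the refresh token itself.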
______________________________________________________________________
## Journey 1: Building Agentic Applications with Authenticated Tools
This section focuses on using pre-existing tools (like those from `RestApiTool`/`OpenAPIToolset`, `APIHubToolset`, `GoogleApiToolSet`) that require authentication within your agentic application. Your main responsibility is configuring the tools and handling the client-side part of interactive authentication flows (if required by the tool).
### 1. Configuring Tools with Authentication
When adding an authenticated tool to your agent, you need to provide its required `AuthScheme` and your application's initial `AuthCredential`.
**A. Using OpenAPI-based Toolsets (`OpenAPIToolset`, `APIHubToolset`, etc.)**
Pass the scheme and credential during toolset initialization. The toolset applies them to all generated tools. Here are a few ways to create tools with authentication in ADK.
Create a tool requiring an API Key.
```py
from google.adk.tools.openapi_tool.auth.auth_helpers import token_to_scheme_credential
from google.adk.tools.openapi_tool.openapi_spec_parser.openapi_toolset import OpenAPIToolset
auth_scheme, auth_credential = token_to_scheme_credential(
"apikey", "query", "apikey", "YOUR_API_KEY_STRING"
)
sample_api_toolset = OpenAPIToolset(
spec_str="...", # Fill this with an OpenAPI spec string
spec_str_type="yaml",
auth_scheme=auth_scheme,
auth_credential=auth_credential,
)
```
Create a tool requiring OAuth2.
```py
from google.adk.tools.openapi_tool.openapi_spec_parser.openapi_toolset import OpenAPIToolset
from fastapi.openapi.models import OAuth2
from fastapi.openapi.models import OAuthFlowAuthorizationCode
from fastapi.openapi.models import OAuthFlows
from google.adk.auth import AuthCredential
from google.adk.auth import AuthCredentialTypes
from google.adk.auth import OAuth2Auth
auth_scheme = OAuth2(
flows=OAuthFlows(
authorizationCode=OAuthFlowAuthorizationCode(
authorizationUrl="https://accounts.google.com/o/oauth2/auth",
tokenUrl="https://oauth2.googleapis.com/token",
scopes={
"https://www.googleapis.com/auth/calendar": "calendar scope"
},
)
)
)
auth_credential = AuthCredential(
auth_type=AuthCredentialTypes.OAUTH2,
oauth2=OAuth2Auth(
client_id=YOUR_OAUTH_CLIENT_ID,
client_secret=YOUR_OAUTH_CLIENT_SECRET
),
)
calendar_api_toolset = OpenAPIToolset(
spec_str=google_calendar_openapi_spec_str, # Fill this with an openapi spec
spec_str_type='yaml',
auth_scheme=auth_scheme,
auth_credential=auth_credential,
)
```
Create a tool requiring Service Account.
```py
import json

from google.adk.tools.openapi_tool.auth.auth_helpers import service_account_dict_to_scheme_credential
from google.adk.tools.openapi_tool.openapi_spec_parser.openapi_toolset import OpenAPIToolset

service_account_cred = json.loads(service_account_json_str)
auth_scheme, auth_credential = service_account_dict_to_scheme_credential(
config=service_account_cred,
scopes=["https://www.googleapis.com/auth/cloud-platform"],
)
sample_toolset = OpenAPIToolset(
spec_str=sa_openapi_spec_str, # Fill this with an openapi spec
spec_str_type='json',
auth_scheme=auth_scheme,
auth_credential=auth_credential,
)
```
Create a tool requiring OpenID connect.
```py
from google.adk.auth.auth_schemes import OpenIdConnectWithConfig
from google.adk.auth.auth_credential import AuthCredential, AuthCredentialTypes, OAuth2Auth
from google.adk.tools.openapi_tool.openapi_spec_parser.openapi_toolset import OpenAPIToolset
auth_scheme = OpenIdConnectWithConfig(
authorization_endpoint=OAUTH2_AUTH_ENDPOINT_URL,
token_endpoint=OAUTH2_TOKEN_ENDPOINT_URL,
    scopes=['openid', 'YOUR_OAUTH_SCOPES']
)
auth_credential = AuthCredential(
auth_type=AuthCredentialTypes.OPEN_ID_CONNECT,
oauth2=OAuth2Auth(
client_id="...",
client_secret="...",
)
)
userinfo_toolset = OpenAPIToolset(
spec_str=content, # Fill in an actual spec
spec_str_type='yaml',
auth_scheme=auth_scheme,
auth_credential=auth_credential,
)
```
**B. Using Google API Toolsets (e.g., `calendar_tool_set`)**
These toolsets often have dedicated configuration methods.
Tip: For how to create a Google OAuth Client ID & Secret, see this guide: [Get your Google API Client ID](https://developers.google.com/identity/gsi/web/guides/get-google-api-clientid#get_your_google_api_client_id)
```py
# Example: Configuring Google Calendar Tools
from google.adk.tools.google_api_tool import calendar_tool_set
client_id = "YOUR_GOOGLE_OAUTH_CLIENT_ID.apps.googleusercontent.com"
client_secret = "YOUR_GOOGLE_OAUTH_CLIENT_SECRET"
# Use the specific configure method for this toolset type
calendar_tool_set.configure_auth(
    client_id=client_id, client_secret=client_secret
)
# agent = LlmAgent(..., tools=calendar_tool_set.get_tool('calendar_tool_set'))
```
The sequence diagram for the auth request flow (where tools request auth credentials) is shown below:
### 2. Handling the Interactive OAuth/OIDC Flow (Client-Side)
If a tool requires user login/consent (typically OAuth 2.0 or OIDC), the ADK framework pauses execution and signals your **Agent Client** application. There are two cases:
- The **Agent Client** application runs the agent directly (via `runner.run_async`) in the same process, e.g., a UI backend, CLI app, or Spark job.
- The **Agent Client** application interacts with ADK's FastAPI server via the `/run` or `/run_sse` endpoint. The ADK FastAPI server can run on the same machine as the **Agent Client** application or on a different one.
The second case is a special case of the first, because the `/run` and `/run_sse` endpoints also invoke `runner.run_async`. The only differences are:
- Whether you call a Python function to run the agent (first case) or call a service endpoint (second case).
- Whether the result events are in-memory objects (first case) or serialized JSON strings in the HTTP response (second case).
The sections below focus on the first case; mapping it to the second case should be straightforward. We also describe differences to handle in the second case where necessary.
Here's the step-by-step process for your client application:
**Step 1: Run Agent & Detect Auth Request**
- Initiate the agent interaction using `runner.run_async`.
- Iterate through the yielded events.
- Look for a specific function call event whose function call has a special name: `adk_request_credential`. This event signals that user interaction is needed. You can use helper functions to identify this event and extract necessary information. (For the second case, the logic is similar. You deserialize the event from the http response).
```py
# runner = Runner(...)
# session = await session_service.create_session(...)
# content = types.Content(...) # User's initial query
print("\nRunning agent...")
events_async = runner.run_async(
session_id=session.id, user_id='user', new_message=content
)
auth_request_function_call_id, auth_config = None, None
async for event in events_async:
# Use helper to check for the specific auth request event
if (auth_request_function_call := get_auth_request_function_call(event)):
print("--> Authentication required by agent.")
# Store the ID needed to respond later
if not (auth_request_function_call_id := auth_request_function_call.id):
raise ValueError(f'Cannot get function call id from function call: {auth_request_function_call}')
# Get the AuthConfig containing the auth_uri etc.
auth_config = get_auth_config(auth_request_function_call)
break # Stop processing events for now, need user interaction
if not auth_request_function_call_id:
print("\nAuth not required or agent finished.")
# return # Or handle final response if received
```
*Helper functions `helpers.py`:*
```py
from typing import Optional

from google.adk.events import Event
from google.adk.auth import AuthConfig # Import necessary type
from google.genai import types

def get_auth_request_function_call(event: Event) -> Optional[types.FunctionCall]:
# Get the special auth request function call from the event
if not event.content or not event.content.parts:
return
for part in event.content.parts:
if (
part
and part.function_call
and part.function_call.name == 'adk_request_credential'
and event.long_running_tool_ids
and part.function_call.id in event.long_running_tool_ids
):
return part.function_call
def get_auth_config(auth_request_function_call: types.FunctionCall) -> AuthConfig:
# Extracts the AuthConfig object from the arguments of the auth request function call
if not auth_request_function_call.args or not (auth_config := auth_request_function_call.args.get('authConfig')):
raise ValueError(f'Cannot get auth config from function call: {auth_request_function_call}')
if isinstance(auth_config, dict):
auth_config = AuthConfig.model_validate(auth_config)
elif not isinstance(auth_config, AuthConfig):
raise ValueError(f'Cannot get auth config {auth_config} is not an instance of AuthConfig.')
return auth_config
```
**Step 2: Redirect User for Authorization**
- Get the authorization URL (`auth_uri`) from the `auth_config` extracted in the previous step.
- **Crucially, append your application's** redirect_uri as a query parameter to this `auth_uri`. This `redirect_uri` must be pre-registered with your OAuth provider (e.g., [Google Cloud Console](https://developers.google.com/identity/protocols/oauth2/web-server#creatingcred), [Okta admin panel](https://developer.okta.com/docs/guides/sign-into-web-app-redirect/spring-boot/main/#create-an-app-integration-in-the-admin-console)).
- Direct the user to this complete URL (e.g., open it in their browser).
```py
# (Continuing after detecting auth needed)
if auth_request_function_call_id and auth_config:
# Get the base authorization URL from the AuthConfig
base_auth_uri = auth_config.exchanged_auth_credential.oauth2.auth_uri
if base_auth_uri:
redirect_uri = 'http://localhost:8000/callback' # MUST match your OAuth client app config
# Append redirect_uri (use urlencode in production)
auth_request_uri = base_auth_uri + f'&redirect_uri={redirect_uri}'
        # Now redirect your end user to this auth_request_uri, or ask them to open it in their browser
        # The auth provider serves this page; after the end user logs in and authorizes your application
        # to access their data, the provider redirects them to the redirect_uri you provided
# Next step: Get this callback URL from the user (or your web server handler)
else:
print("ERROR: Auth URI not found in auth_config.")
# Handle error
```
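As the comment above notes, plain string concatenation is fine for illustration but should be replaced with proper URL encoding in production. A small hypothetical helper using only the standard library:

```py
from urllib.parse import urlencode, urlsplit

def build_auth_request_uri(base_auth_uri: str, redirect_uri: str) -> str:
    # Append redirect_uri as a properly percent-encoded query parameter,
    # choosing '?' or '&' depending on whether the base URI already has a query string.
    separator = '&' if urlsplit(base_auth_uri).query else '?'
    return base_auth_uri + separator + urlencode({'redirect_uri': redirect_uri})
```

This avoids breaking the authorization URL when the redirect URI contains reserved characters such as `:` and `/`.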
**Step 3. Handle the Redirect Callback (Client):**
- Your application must have a mechanism (e.g., a web server route at the `redirect_uri`) to receive the user after they authorize the application with the provider.
- The provider redirects the user to your `redirect_uri` and appends an `authorization_code` (and potentially `state`, `scope`) as query parameters to the URL.
- Capture the **full callback URL** from this incoming request.
- (This step happens outside the main agent execution loop, in your web server or equivalent callback handler.)
**Step 4. Send Authentication Result Back to ADK (Client):**
- Once you have the full callback URL (containing the authorization code), retrieve the `auth_request_function_call_id` and the `auth_config` object saved in Client Step 1.
- Set the captured callback URL into the `exchanged_auth_credential.oauth2.auth_response_uri` field. Also ensure `exchanged_auth_credential.oauth2.redirect_uri` contains the redirect URI you used.
- Create a `types.Content` object containing a `types.Part` with a `types.FunctionResponse`.
- Set `name` to `"adk_request_credential"`. (Note: This is a special name for ADK to proceed with authentication. Do not use other names.)
- Set `id` to the `auth_request_function_call_id` you saved.
- Set `response` to the *serialized* (e.g., `.model_dump()`) updated `AuthConfig` object.
- Call `runner.run_async` **again** for the same session, passing this `FunctionResponse` content as the `new_message`.
```py
# (Continuing after user interaction)
# Simulate getting the callback URL (e.g., from user paste or web handler)
auth_response_uri = await get_user_input(
f'Paste the full callback URL here:\n> '
)
auth_response_uri = auth_response_uri.strip() # Clean input
if not auth_response_uri:
print("Callback URL not provided. Aborting.")
return
# Update the received AuthConfig with the callback details
auth_config.exchanged_auth_credential.oauth2.auth_response_uri = auth_response_uri
# Also include the redirect_uri used, as the token exchange might need it
auth_config.exchanged_auth_credential.oauth2.redirect_uri = redirect_uri
# Construct the FunctionResponse Content object
auth_content = types.Content(
role='user', # Role can be 'user' when sending a FunctionResponse
parts=[
types.Part(
function_response=types.FunctionResponse(
id=auth_request_function_call_id, # Link to the original request
name='adk_request_credential', # Special framework function name
response=auth_config.model_dump() # Send back the *updated* AuthConfig
)
)
],
)
# --- Resume Execution ---
print("\nSubmitting authentication details back to the agent...")
events_async_after_auth = runner.run_async(
session_id=session.id,
user_id='user',
new_message=auth_content, # Send the FunctionResponse back
)
# --- Process Final Agent Output ---
print("\n--- Agent Response after Authentication ---")
async for event in events_async_after_auth:
# Process events normally, expecting the tool call to succeed now
print(event) # Print the full event for inspection
```
Note: Authorization response with Resume feature
If your ADK agent workflow is configured with the [Resume](/adk-docs/runtime/resume/) feature, you also must include the Invocation ID (`invocation_id`) parameter with the authorization response. The Invocation ID you provide must be the same invocation that generated the authorization request, otherwise the system starts a new invocation with the authorization response. If your agent uses the Resume feature, consider including the Invocation ID as a parameter with your authorization request, so it can be included with the authorization response. For more details on using the Resume feature, see [Resume stopped agents](/adk-docs/runtime/resume/).
**Step 5: ADK Handles Token Exchange & Tool Retry and gets Tool result**
- ADK receives the `FunctionResponse` for `adk_request_credential`.
- It uses the information in the updated `AuthConfig` (including the callback URL containing the code) to perform the OAuth **token exchange** with the provider's token endpoint, obtaining the access token (and possibly refresh token).
- ADK internally makes these tokens available by setting them in the session state.
- ADK **automatically retries** the original tool call (the one that initially failed due to missing auth).
- This time, the tool finds the valid tokens (via `tool_context.get_auth_response()`) and successfully executes the authenticated API call.
- The agent receives the actual result from the tool and generates its final response to the user.
______________________________________________________________________
The sequence diagram for the auth response flow (where the Agent Client sends back the auth response and ADK retries the tool call) is shown below:
## Journey 2: Building Custom Tools (`FunctionTool`) Requiring Authentication
This section focuses on implementing the authentication logic *inside* your custom Python function when creating a new ADK Tool. We will implement a `FunctionTool` as an example.
### Prerequisites
Your function signature *must* include [`tool_context: ToolContext`](https://google.github.io/adk-docs/tools-custom/#tool-context). ADK automatically injects this object, providing access to state and auth mechanisms.
```py
from google.adk.tools import FunctionTool, ToolContext
from typing import Dict
def my_authenticated_tool_function(param1: str, ..., tool_context: ToolContext) -> dict:
# ... your logic ...
pass
my_tool = FunctionTool(func=my_authenticated_tool_function)
```
### Authentication Logic within the Tool Function
Implement the following steps inside your function:
**Step 1: Check for Cached & Valid Credentials:**
Inside your tool function, first check whether valid credentials (e.g., access/refresh tokens) are already stored from a previous run in this session. Credentials for the current session should be stored in `tool_context.invocation_context.session.state` (a dictionary of state). Check for existing credentials with `tool_context.invocation_context.session.state.get(credential_name, None)`.
```py
import json

from google.oauth2.credentials import Credentials
from google.auth.transport.requests import Request

# Inside your tool function
TOKEN_CACHE_KEY = "my_tool_tokens" # Choose a unique key
SCOPES = ["scope1", "scope2"] # Define required scopes

creds = None
cached_token_info = tool_context.state.get(TOKEN_CACHE_KEY)
if cached_token_info:
    try:
        creds = Credentials.from_authorized_user_info(cached_token_info, SCOPES)
        if not creds.valid and creds.expired and creds.refresh_token:
            creds.refresh(Request())
            tool_context.state[TOKEN_CACHE_KEY] = json.loads(creds.to_json()) # Update cache
        elif not creds.valid:
            creds = None # Invalid, needs re-auth
            tool_context.state[TOKEN_CACHE_KEY] = None
    except Exception as e:
        print(f"Error loading/refreshing cached creds: {e}")
        creds = None
        tool_context.state[TOKEN_CACHE_KEY] = None

if creds and creds.valid:
    # Skip to Step 6: Make Authenticated API Call
    pass
else:
    # Proceed to Step 2...
    pass
```
**Step 2: Check for Auth Response from Client**
- If Step 1 didn't yield valid credentials, check if the client just completed the interactive flow by calling `exchanged_credential = tool_context.get_auth_response()`.
- This returns the updated `exchanged_credential` object sent back by the client (containing the callback URL in `auth_response_uri`).
```py
# Use auth_scheme and auth_credential configured in the tool.
# exchanged_credential: AuthCredential | None

exchanged_credential = tool_context.get_auth_response(AuthConfig(
    auth_scheme=auth_scheme,
    raw_auth_credential=auth_credential,
))
# If exchanged_credential is not None, there is already an exchanged credential from the auth response.
if exchanged_credential:
    # ADK exchanged the access token already for us
    access_token = exchanged_credential.oauth2.access_token
    refresh_token = exchanged_credential.oauth2.refresh_token
    creds = Credentials(
        token=access_token,
        refresh_token=refresh_token,
        token_uri=auth_scheme.flows.authorizationCode.tokenUrl,
        client_id=auth_credential.oauth2.client_id,
        client_secret=auth_credential.oauth2.client_secret,
        scopes=list(auth_scheme.flows.authorizationCode.scopes.keys()),
    )
    # Cache the token in session state and call the API; skip to Step 5.
```
**Step 3: Initiate Authentication Request**
If no valid credentials (Step 1) and no auth response (Step 2) are found, the tool needs to start the OAuth flow. Define the `AuthScheme` and initial `AuthCredential`, call `tool_context.request_credential()`, and return a response indicating that authorization is needed.
```py
# Use auth_scheme and auth_credential configured in the tool.

tool_context.request_credential(AuthConfig(
    auth_scheme=auth_scheme,
    raw_auth_credential=auth_credential,
))
# By calling request_credential, ADK detects a pending authentication event.
# It pauses execution and asks the end user to log in.
return {'pending': True, 'message': 'Awaiting user authentication.'}
```
**Step 4: Exchange Authorization Code for Tokens**
ADK automatically generates the OAuth authorization URL and presents it to your agent client application. Your agent client application should follow the same flow described in Journey 1: redirect the user to the authorization URL (with `redirect_uri` appended). Once the user completes the login flow and your agent client application sends back the authentication callback URL, ADK automatically parses the auth code and generates the auth token. At the next tool call, `tool_context.get_auth_response` in Step 2 will return a valid credential to use in subsequent API calls.
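The Journey 1 client code appends `redirect_uri` by simple string concatenation, which breaks if the value needs escaping. A minimal, hypothetical helper (not an ADK API) that appends the parameter with proper URL encoding might look like:

```python
from urllib.parse import urlencode

def append_redirect_uri(auth_uri: str, redirect_uri: str) -> str:
    """Append a URL-encoded redirect_uri query parameter to an authorization URL."""
    # Use '&' if the authorization URL already has a query string, '?' otherwise.
    separator = '&' if '?' in auth_uri else '?'
    return auth_uri + separator + urlencode({'redirect_uri': redirect_uri})

# Example: the redirect URI is percent-encoded in the resulting URL.
uri = append_redirect_uri(
    'https://idp.example.com/authorize?client_id=abc',
    'http://localhost:8000/dev-ui',
)
```

The endpoint and client ID above are placeholders; in practice the base URL comes from `auth_config.exchanged_auth_credential.oauth2.auth_uri`.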
**Step 5: Cache Obtained Credentials**
After successfully obtaining the token from ADK (Step 2) or if the token is still valid (Step 1), **immediately store** the new `Credentials` object in `tool_context.state` (serialized, e.g., as JSON) using your cache key.
```py
# Inside your tool function, after obtaining 'creds' (either refreshed or newly exchanged)
# Cache the new/refreshed tokens
tool_context.state[TOKEN_CACHE_KEY] = json.loads(creds.to_json())
print(f"DEBUG: Cached/updated tokens under key: {TOKEN_CACHE_KEY}")
# Proceed to Step 6 (Make API Call)
```
**Step 6: Make Authenticated API Call**
- Once you have a valid `Credentials` object (`creds` from Step 1 or Step 2), use it to make the actual call to the protected API using the appropriate client library (e.g., `googleapiclient`, `requests`). Pass the `credentials=creds` argument.
- Include error handling, especially for `HttpError` 401/403, which might mean the token expired or was revoked between calls. If you get such an error, consider clearing the cached token (`tool_context.state.pop(...)`) and potentially returning the `auth_required` status again to force re-authentication.
```py
from googleapiclient.discovery import build

# Inside your tool function, using the valid 'creds' object
# Ensure creds is valid before proceeding
if not creds or not creds.valid:
    return {"status": "error", "error_message": "Cannot proceed without valid credentials."}

try:
    service = build("calendar", "v3", credentials=creds) # Example
    api_result = service.events().list(...).execute()
    # Proceed to Step 7
except Exception as e:
    # Handle API errors (e.g., check for 401/403, maybe clear cache and re-request auth)
    print(f"ERROR: API call failed: {e}")
    return {"status": "error", "error_message": f"API call failed: {e}"}
```
**Step 7: Return Tool Result**
- After a successful API call, process the result into a dictionary format that is useful for the LLM.
- **Crucially, include a `"status"` key** (e.g., `"success"`) along with the data.
```py
# Inside your tool function, after successful API call
processed_result = [...] # Process api_result for the LLM
return {"status": "success", "data": processed_result}
```
Full Code
tools_and_agent.py
```py
import os

from google.adk.auth.auth_schemes import OpenIdConnectWithConfig
from google.adk.auth.auth_credential import AuthCredential, AuthCredentialTypes, OAuth2Auth
from google.adk.tools.openapi_tool.openapi_spec_parser.openapi_toolset import OpenAPIToolset
from google.adk.agents.llm_agent import LlmAgent

# --- Authentication Configuration ---
# This section configures how the agent will handle authentication using OpenID Connect (OIDC),
# often layered on top of OAuth 2.0.

# Define the Authentication Scheme using OpenID Connect.
# This object tells the ADK *how* to perform the OIDC/OAuth2 flow.
# It requires details specific to your Identity Provider (IDP), like Google OAuth, Okta, Auth0, etc.
# Note: Replace the example Okta URLs and credentials with your actual IDP details.
# All following fields are required, and available from your IDP.
auth_scheme = OpenIdConnectWithConfig(
    # The URL of the IDP's authorization endpoint where the user is redirected to log in.
    authorization_endpoint="https://your-endpoint.okta.com/oauth2/v1/authorize",
    # The URL of the IDP's token endpoint where the authorization code is exchanged for tokens.
    token_endpoint="https://your-token-endpoint.okta.com/oauth2/v1/token",
    # The scopes (permissions) your application requests from the IDP.
    # 'openid' is standard for OIDC. 'profile' and 'email' request user profile info.
    scopes=['openid', 'profile', 'email']
)

# Define the Authentication Credentials for your specific application.
# This object holds the client identifier and secret that your application uses
# to identify itself to the IDP during the OAuth2 flow.
# !! SECURITY WARNING: Avoid hardcoding secrets in production code. !!
# !! Use environment variables or a secret management system instead. !!
auth_credential = AuthCredential(
    auth_type=AuthCredentialTypes.OPEN_ID_CONNECT,
    oauth2=OAuth2Auth(
        client_id="CLIENT_ID",
        client_secret="CLIENT_SECRET",
    )
)

# --- Toolset Configuration from OpenAPI Specification ---
# This section defines a sample set of tools the agent can use, configured with authentication
# from the steps above.
# This sample set of tools uses endpoints protected by Okta and requires an OpenID Connect flow
# to acquire end-user credentials.
with open(os.path.join(os.path.dirname(__file__), 'spec.yaml'), 'r') as f:
    spec_content = f.read()

userinfo_toolset = OpenAPIToolset(
    spec_str=spec_content,
    spec_str_type='yaml',
    # ** Crucially, associate the authentication scheme and credentials with these tools. **
    # This tells the ADK that the tools require the defined OIDC/OAuth2 flow.
    auth_scheme=auth_scheme,
    auth_credential=auth_credential,
)

# --- Agent Configuration ---
# Configure and create the main LLM Agent.
root_agent = LlmAgent(
    model='gemini-2.0-flash',
    name='enterprise_assistant',
    instruction='Help user integrate with multiple enterprise systems, including retrieving user information which may require authentication.',
    tools=userinfo_toolset.get_tools(),
)

# --- Ready for Use ---
# The `root_agent` is now configured with tools protected by OIDC/OAuth2 authentication.
# When the agent attempts to use one of these tools, the ADK framework will automatically
# trigger the authentication flow defined by `auth_scheme` and `auth_credential`
# if valid credentials are not already available in the session.
# The subsequent interaction flow would guide the user through the login process and handle
# token exchanging, and automatically attach the exchanged token to the endpoint defined in
# the tool.
```
agent_cli.py
```py
import asyncio

from dotenv import load_dotenv

from google.adk.artifacts.in_memory_artifact_service import InMemoryArtifactService
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.genai import types

from .helpers import is_pending_auth_event, get_function_call_id, get_function_call_auth_config, get_user_input
from .tools_and_agent import root_agent

load_dotenv()

agent = root_agent


async def async_main():
    """
    Main asynchronous function orchestrating the agent interaction and authentication flow.
    """
    # --- Step 1: Service Initialization ---
    # Use in-memory services for session and artifact storage (suitable for demos/testing).
    session_service = InMemorySessionService()
    artifacts_service = InMemoryArtifactService()

    # Create a new user session to maintain conversation state.
    session = session_service.create_session(
        state={},  # Optional state dictionary for session-specific data
        app_name='my_app',  # Application identifier
        user_id='user'  # User identifier
    )

    # --- Step 2: Initial User Query ---
    # Define the user's initial request.
    query = 'Show me my user info'
    print(f"user: {query}")

    # Format the query into the Content structure expected by the ADK Runner.
    content = types.Content(role='user', parts=[types.Part(text=query)])

    # Initialize the ADK Runner
    runner = Runner(
        app_name='my_app',
        agent=agent,
        artifact_service=artifacts_service,
        session_service=session_service,
    )

    # --- Step 3: Send Query and Handle Potential Auth Request ---
    print("\nRunning agent with initial query...")
    events_async = runner.run_async(
        session_id=session.id, user_id='user', new_message=content
    )

    # Variables to store details if an authentication request occurs.
    auth_request_event_id, auth_config = None, None

    # Iterate through the events generated by the first run.
    async for event in events_async:
        # Check if this event is the specific 'adk_request_credential' function call.
        if is_pending_auth_event(event):
            print("--> Authentication required by agent.")
            auth_request_event_id = get_function_call_id(event)
            auth_config = get_function_call_auth_config(event)
            # Once the auth request is found and processed, exit this loop.
            # We need to pause execution here to get user input for authentication.
            break

    # If no authentication request was detected after processing all events, exit.
    if not auth_request_event_id or not auth_config:
        print("\nAuthentication not required for this query or processing finished.")
        return  # Exit the main function

    # --- Step 4: Manual Authentication Step (Simulated OAuth 2.0 Flow) ---
    # This section simulates the user interaction part of an OAuth 2.0 flow.
    # In a real web application, this would involve browser redirects.

    # Define the Redirect URI. This *must* match one of the URIs registered
    # with the OAuth provider for your application. The provider sends the user
    # back here after they approve the request.
    redirect_uri = 'http://localhost:8000/dev-ui'  # Example for local development

    # Construct the Authorization URL that the user must visit.
    # This typically includes the provider's authorization endpoint URL,
    # client ID, requested scopes, response type (e.g., 'code'), and the redirect URI.
    # Here, we retrieve the base authorization URI from the AuthConfig provided by ADK
    # and append the redirect_uri.
    # NOTE: A robust implementation would use urlencode and potentially add state, scope, etc.
    auth_request_uri = (
        auth_config.exchanged_auth_credential.oauth2.auth_uri
        + f'&redirect_uri={redirect_uri}'  # Simple concatenation; ensure correct query param format
    )

    print("\n--- User Action Required ---")
    # Prompt the user to visit the authorization URL, log in, grant permissions,
    # and then paste the *full* URL they are redirected back to (which contains the auth code).
    auth_response_uri = await get_user_input(
        f'1. Please open this URL in your browser to log in:\n   {auth_request_uri}\n\n'
        f'2. After successful login and authorization, your browser will be redirected.\n'
        f'   Copy the *entire* URL from the browser\'s address bar.\n\n'
        f'3. Paste the copied URL here and press Enter:\n\n> '
    )

    # --- Step 5: Prepare Authentication Response for the Agent ---
    # Update the AuthConfig object with the information gathered from the user.
    # The ADK framework needs the full response URI (containing the code)
    # and the original redirect URI to complete the OAuth token exchange process internally.
    auth_config.exchanged_auth_credential.oauth2.auth_response_uri = auth_response_uri
    auth_config.exchanged_auth_credential.oauth2.redirect_uri = redirect_uri

    # Construct a FunctionResponse Content object to send back to the agent/runner.
    # This response explicitly targets the 'adk_request_credential' function call
    # identified earlier by its ID.
    auth_content = types.Content(
        role='user',
        parts=[
            types.Part(
                function_response=types.FunctionResponse(
                    # Crucially, link this response to the original request using the saved ID.
                    id=auth_request_event_id,
                    # The special name of the function call we are responding to.
                    name='adk_request_credential',
                    # The payload containing all necessary authentication details.
                    response=auth_config.model_dump(),
                )
            )
        ],
    )

    # --- Step 6: Resume Execution with Authentication ---
    print("\nSubmitting authentication details back to the agent...")
    # Run the agent again, this time providing the `auth_content` (FunctionResponse).
    # The ADK Runner intercepts this, processes the 'adk_request_credential' response
    # (performs token exchange, stores credentials), and then allows the agent
    # to retry the original tool call that required authentication, now succeeding with
    # a valid access token embedded.
    events_async = runner.run_async(
        session_id=session.id,
        user_id='user',
        new_message=auth_content,  # Provide the prepared auth response
    )

    # Process and print the final events from the agent after authentication is complete.
    # This stream now contains the actual result from the tool (e.g., the user info).
    print("\n--- Agent Response after Authentication ---")
    async for event in events_async:
        print(event)


if __name__ == '__main__':
    asyncio.run(async_main())
```
helpers.py
```py
import asyncio

from google.adk.auth import AuthConfig
from google.adk.events import Event


# --- Helper Functions ---
async def get_user_input(prompt: str) -> str:
    """
    Asynchronously prompts the user for input in the console.

    Uses asyncio's event loop and run_in_executor to avoid blocking the main
    asynchronous execution thread while waiting for synchronous `input()`.

    Args:
        prompt: The message to display to the user.

    Returns:
        The string entered by the user.
    """
    loop = asyncio.get_event_loop()
    # Run the blocking `input()` function in a separate thread managed by the executor.
    return await loop.run_in_executor(None, input, prompt)


def is_pending_auth_event(event: Event) -> bool:
    """
    Checks if an ADK Event represents a request for user authentication credentials.

    The ADK framework emits a specific function call ('adk_request_credential')
    when a tool requires authentication that hasn't been previously satisfied.

    Args:
        event: The ADK Event object to inspect.

    Returns:
        True if the event is an 'adk_request_credential' function call, False otherwise.
    """
    # Safely check nested attributes to avoid errors if the event structure is incomplete.
    return (
        event.content
        and event.content.parts
        and event.content.parts[0]  # Assuming the function call is in the first part
        and event.content.parts[0].function_call
        # The specific function name indicating an auth request from the ADK framework.
        and event.content.parts[0].function_call.name == 'adk_request_credential'
    )


def get_function_call_id(event: Event) -> str:
    """
    Extracts the unique ID of the function call from an ADK Event.

    This ID is crucial for correlating a function *response* back to the specific
    function *call* that the agent initiated to request auth credentials.

    Args:
        event: The ADK Event object containing the function call.

    Returns:
        The unique identifier string of the function call.

    Raises:
        ValueError: If the function call ID cannot be found in the event structure.
    """
    # Navigate through the event structure to find the function call ID.
    if (
        event
        and event.content
        and event.content.parts
        and event.content.parts[0]
        and event.content.parts[0].function_call
        and event.content.parts[0].function_call.id
    ):
        return event.content.parts[0].function_call.id
    # If the ID is missing, raise an error indicating an unexpected event format.
    raise ValueError(f'Cannot get function call id from event {event}')


def get_function_call_auth_config(event: Event) -> AuthConfig:
    """
    Extracts the authentication configuration details from an 'adk_request_credential' event.

    The client should use this AuthConfig to collect the necessary authentication
    details (like OAuth codes and state) and send it back to the ADK to continue
    the OAuth token exchange.

    Args:
        event: The ADK Event object containing the 'adk_request_credential' call.

    Returns:
        An AuthConfig object populated with details from the function call arguments.

    Raises:
        ValueError: If the 'auth_config' argument cannot be found in the event.
    """
    if (
        event
        and event.content
        and event.content.parts
        and event.content.parts[0]
        and event.content.parts[0].function_call
        and event.content.parts[0].function_call.args
        and event.content.parts[0].function_call.args.get('auth_config')
    ):
        # Reconstruct the AuthConfig object using the dictionary provided in the arguments.
        # The ** operator unpacks the dictionary into keyword arguments for the constructor.
        return AuthConfig(
            **event.content.parts[0].function_call.args.get('auth_config')
        )
    raise ValueError(f'Cannot get auth config from event {event}')
```
```yaml
openapi: 3.0.1
info:
  title: Okta User Info API
  version: 1.0.0
  description: |-
    API to retrieve user profile information based on a valid Okta OIDC Access Token.
    Authentication is handled via OpenID Connect with Okta.
  contact:
    name: API Support
    email: support@example.com # Replace with actual contact if available
servers:
  - url:
    description: Production Environment
paths:
  /okta-jwt-user-api:
    get:
      summary: Get Authenticated User Info
      description: |-
        Fetches profile details for the user
      operationId: getUserInfo
      tags:
        - User Profile
      security:
        - okta_oidc:
            - openid
            - email
            - profile
      responses:
        '200':
          description: Successfully retrieved user information.
          content:
            application/json:
              schema:
                type: object
                properties:
                  sub:
                    type: string
                    description: Subject identifier for the user.
                    example: "abcdefg"
                  name:
                    type: string
                    description: Full name of the user.
                    example: "Example LastName"
                  locale:
                    type: string
                    description: User's locale, e.g., en-US or en_US.
                    example: "en_US"
                  email:
                    type: string
                    format: email
                    description: User's primary email address.
                    example: "username@example.com"
                  preferred_username:
                    type: string
                    description: Preferred username of the user (often the email).
                    example: "username@example.com"
                  given_name:
                    type: string
                    description: Given name (first name) of the user.
                    example: "Example"
                  family_name:
                    type: string
                    description: Family name (last name) of the user.
                    example: "LastName"
                  zoneinfo:
                    type: string
                    description: User's timezone, e.g., America/Los_Angeles.
                    example: "America/Los_Angeles"
                  updated_at:
                    type: integer
                    format: int64 # Using int64 for Unix timestamp
                    description: Timestamp when the user's profile was last updated (Unix epoch time).
                    example: 1743617719
                  email_verified:
                    type: boolean
                    description: Indicates if the user's email address has been verified.
                    example: true
                required:
                  - sub
                  - name
                  - locale
                  - email
                  - preferred_username
                  - given_name
                  - family_name
                  - zoneinfo
                  - updated_at
                  - email_verified
        '401':
          description: Unauthorized. The provided Bearer token is missing, invalid, or expired.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
        '403':
          description: Forbidden. The provided token does not have the required scopes or permissions to access this resource.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
components:
  securitySchemes:
    okta_oidc:
      type: openIdConnect
      description: Authentication via Okta using OpenID Connect. Requires a Bearer Access Token.
      openIdConnectUrl: https://your-endpoint.okta.com/.well-known/openid-configuration
  schemas:
    Error:
      type: object
      properties:
        code:
          type: string
          description: An error code.
        message:
          type: string
          description: A human-readable error message.
      required:
        - code
        - message
```
# Get action confirmation for ADK Tools
Supported in ADK: Python v1.14.0 (Experimental)
Some agent workflows require confirmation for decision making, verification, security, or general oversight. In these cases, you want to get a response from a human or supervising system before proceeding with a workflow. The *Tool Confirmation* feature in the Agent Development Kit (ADK) allows an ADK Tool to pause its execution and interact with a user or other system for confirmation or to gather structured data before proceeding. You can use Tool Confirmation with an ADK Tool in the following ways:
- **[Boolean Confirmation](#boolean-confirmation):** You can configure a FunctionTool with a `require_confirmation` parameter. This option pauses the tool for a yes or no confirmation response.
- **[Advanced Confirmation](#advanced-confirmation):** For scenarios requiring structured data responses, you can configure a `FunctionTool` with a text prompt to explain the confirmation and an expected response.
Experimental
The Tool Confirmation feature is experimental and has some [known limitations](#known-limitations). We welcome your [feedback](https://github.com/google/adk-python/issues/new?template=feature_request.md&labels=tool%20confirmation)!
You can configure how a request is communicated to a user, and the system can also use [remote responses](#remote-response) sent via the ADK server's REST API. When using the confirmation feature with the ADK web user interface, the agent workflow displays a dialog box to the user to request input, as shown in Figure 1:
**Figure 1.** Example confirmation response request dialog box using an advanced, tool response implementation.
The following sections describe how to use this feature for the confirmation scenarios. For a complete code sample, see the [human_tool_confirmation](https://github.com/google/adk-python/blob/fc90ce968f114f84b14829f8117797a4c256d710/contributing/samples/human_tool_confirmation/agent.py) example. There are additional ways to incorporate human input into your agent workflow, for more details, see the [Human-in-the-loop](/adk-docs/agents/multi-agents/#human-in-the-loop-pattern) agent pattern.
## Boolean confirmation
When your tool only requires a simple `yes` or `no` from the user, you can append a confirmation step using the `FunctionTool` class as a wrapper. For example, if you have a tool called `reimburse`, you can enable a confirmation step by wrapping it with the `FunctionTool` class and setting the `require_confirmation` parameter to `True`, as shown in the following example:
```python
# From agent.py
root_agent = Agent(
    ...
    tools=[
        # Set require_confirmation to True to require user confirmation
        # for the tool call.
        FunctionTool(reimburse, require_confirmation=True),
    ],
    ...
)
```
This implementation method requires minimal code, but is limited to simple approvals from the user or confirming system. For a complete example of this approach, see the [human_tool_confirmation](https://github.com/google/adk-python/blob/fc90ce968f114f84b14829f8117797a4c256d710/contributing/samples/human_tool_confirmation/agent.py) code sample.
### Require confirmation function
You can modify the behavior of `require_confirmation` by replacing its value with a function that returns a boolean. The following example shows a function that determines whether confirmation is required:
```python
async def confirmation_threshold(
    amount: int, tool_context: ToolContext
) -> bool:
    """Returns true if the amount is greater than 1000."""
    return amount > 1000
```
This function can then be set as the value of the `require_confirmation` parameter:
```python
root_agent = Agent(
    ...
    tools=[
        # Pass a function to require_confirmation to decide dynamically
        # whether user confirmation is required.
        FunctionTool(reimburse, require_confirmation=confirmation_threshold),
    ],
    ...
)
```
For a complete example of this implementation, see the [human_tool_confirmation](https://github.com/google/adk-python/blob/fc90ce968f114f84b14829f8117797a4c256d710/contributing/samples/human_tool_confirmation/agent.py) code sample.
## Advanced confirmation
When a tool confirmation requires more details for the user or a more complex response, use a tool_confirmation implementation. This approach extends the `ToolContext` object to add a text description of the request for the user and allows for more complex response data. When implementing tool confirmation this way, you can pause a tool's execution, request specific information, and then resume the tool with the provided data.
This confirmation flow has a request stage, where the system assembles and sends an input request for a human response, and a response stage, where the system receives and processes the returned data.
### Confirmation definition
When creating a tool with an advanced confirmation, create a function that includes a `ToolContext` object. Then define the confirmation by checking the `tool_context.tool_confirmation` object and calling the `tool_context.request_confirmation()` method with `hint` and `payload` parameters. These parameters are used as follows:
- `hint`: Descriptive message that explains what is needed from the user.
- `payload`: The structure of the data you expect in return. This data type is Any and must be serializable into a JSON-formatted string, such as a dictionary or pydantic model.
The following code shows an example implementation for a tool that processes time off requests for an employee:
```python
def request_time_off(days: int, tool_context: ToolContext):
    """Request day off for the employee."""
    ...
    tool_confirmation = tool_context.tool_confirmation
    if not tool_confirmation:
        tool_context.request_confirmation(
            hint=(
                'Please approve or reject the tool call request_time_off() by'
                ' responding with a FunctionResponse with an expected'
                ' ToolConfirmation payload.'
            ),
            payload={
                'approved_days': 0,
            },
        )
        # Return an intermediate status indicating that the tool is waiting for
        # a confirmation response:
        return {'status': 'Manager approval is required.'}

    approved_days = tool_confirmation.payload['approved_days']
    approved_days = min(approved_days, days)
    if approved_days == 0:
        return {'status': 'The time off request is rejected.', 'approved_days': 0}
    return {
        'status': 'ok',
        'approved_days': approved_days,
    }
```
For a complete example of this approach, see the [human_tool_confirmation](https://github.com/google/adk-python/blob/fc90ce968f114f84b14829f8117797a4c256d710/contributing/samples/human_tool_confirmation/agent.py) code sample. Keep in mind that the agent workflow tool execution pauses while a confirmation is obtained. After confirmation is received, you can access the confirmation response in the `tool_confirmation.payload` object and then proceed with the execution of the workflow.
## Remote confirmation with REST API
If there is no active user interface for a human confirmation of an agent workflow, you can handle the confirmation through a command-line interface or by routing it through another channel like email or a chat application. To confirm the tool call, the user or calling application needs to send a `FunctionResponse` event with the tool confirmation data.
You can send the request to the ADK API server's `/run` or `/run_sse` endpoint, or directly to the ADK runner. The following example uses a `curl` command to send the confirmation to the `/run_sse` endpoint:
```shell
curl -X POST http://localhost:8000/run_sse \
  -H "Content-Type: application/json" \
  -d '{
    "app_name": "human_tool_confirmation",
    "user_id": "user",
    "session_id": "7828f575-2402-489f-8079-74ea95b6a300",
    "new_message": {
      "parts": [
        {
          "function_response": {
            "id": "adk-13b84a8c-c95c-4d66-b006-d72b30447e35",
            "name": "adk_request_confirmation",
            "response": {
              "confirmed": true
            }
          }
        }
      ],
      "role": "user"
    }
  }'
```
A REST-based response for a confirmation must meet the following requirements:
- The `id` in the `function_response` should match the `function_call_id` from the `RequestConfirmation` `FunctionCall` event.
- The `name` should be `adk_request_confirmation`.
- The `response` object contains the confirmation status and any additional payload data required by the tool.
Note: Confirmation with Resume feature
If your ADK agent workflow is configured with the [Resume](/adk-docs/runtime/resume/) feature, you also must include the Invocation ID (`invocation_id`) parameter with the confirmation response. The Invocation ID you provide must be the same invocation that generated the confirmation request, otherwise the system starts a new invocation with the confirmation response. If your agent uses the Resume feature, consider including the Invocation ID as a parameter with your confirmation request, so it can be included with the response. For more details on using the Resume feature, see [Resume stopped agents](/adk-docs/runtime/resume/).
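As a sketch, assuming the `/run_sse` request body accepts a top-level `invocation_id` field alongside the fields shown in the earlier `curl` example, the confirmation payload could be assembled like this (`build_confirmation_request` is a hypothetical helper and all IDs are placeholders):

```python
import json

def build_confirmation_request(app_name: str, user_id: str, session_id: str,
                               function_call_id: str, invocation_id: str,
                               confirmed: bool) -> str:
    """Assemble a /run_sse request body for a tool confirmation, including
    the invocation_id needed when the Resume feature is enabled."""
    body = {
        'app_name': app_name,
        'user_id': user_id,
        'session_id': session_id,
        # Assumption: invocation_id is carried at the top level of the body;
        # it must match the invocation that raised the confirmation request.
        'invocation_id': invocation_id,
        'new_message': {
            'role': 'user',
            'parts': [{
                'function_response': {
                    # Must match the ID of the adk_request_confirmation call.
                    'id': function_call_id,
                    'name': 'adk_request_confirmation',
                    'response': {'confirmed': confirmed},
                },
            }],
        },
    }
    return json.dumps(body)
```

The resulting string can be sent as the `-d` payload of the `curl` command shown above.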
## Known limitations
The tool confirmation feature has the following limitations:
- [DatabaseSessionService](/adk-docs/api-reference/python/google-adk.html#google.adk.sessions.DatabaseSessionService) is not supported by this feature.
- [VertexAiSessionService](/adk-docs/api-reference/python/google-adk.html#google.adk.sessions.VertexAiSessionService) is not supported by this feature.
## Next steps
For more information on building ADK tools for agent workflows, see [Function tools](/adk-docs/tools-custom/function-tools/).
# Function tools
Supported in ADK: Python v0.1.0, TypeScript v0.2.0, Go v0.1.0, Java v0.1.0
When pre-built ADK tools don't meet your requirements, you can create custom *function tools*. Building function tools allows you to create tailored functionality, such as connecting to proprietary databases or implementing unique algorithms. For example, a function tool, `myfinancetool`, might be a function that calculates a specific financial metric. ADK also supports long-running functions, so if that calculation takes a while, the agent can continue working on other tasks.
ADK offers several ways to create function tools, each suited to different levels of complexity and control:
- [Function Tools](#function-tool)
- [Long Running Function Tools](#long-run-tool)
- [Agents-as-a-Tool](#agent-tool)
## Function Tools
Transforming a Python function into a tool is a straightforward way to integrate custom logic into your agents. When you assign a function to an agent’s `tools` list, the framework automatically wraps it as a `FunctionTool`.
### How it Works
The ADK framework automatically inspects your Python function's signature—including its name, docstring, parameters, type hints, and default values—to generate a schema. This schema is what the LLM uses to understand the tool's purpose, when to use it, and what arguments it requires.
### Defining Function Signatures
A well-defined function signature is crucial for the LLM to use your tool correctly.
#### Parameters
##### Required Parameters
A parameter is considered **required** if it has a type hint but **no default value**. The LLM must provide a value for this argument when it calls the tool. The parameter's description is taken from the function's docstring.
Example: Required Parameters
```python
def get_weather(city: str, unit: str):
    """Retrieves the weather for a city in the specified unit.

    Args:
        city (str): The city name.
        unit (str): The temperature unit, either 'Celsius' or 'Fahrenheit'.
    """
    # ... function logic ...
    return {"status": "success", "report": f"Weather for {city} is sunny."}
```
In this example, both `city` and `unit` are mandatory. If the LLM tries to call `get_weather` without one of them, the ADK will return an error to the LLM, prompting it to correct the call.
In Go, you use struct tags to control the JSON schema. The two primary tags are `json` and `jsonschema`.
A parameter is considered **required** if its struct field does **not** have the `omitempty` or `omitzero` option in its `json` tag.
The `jsonschema` tag is used to provide the argument's description. This is crucial for the LLM to understand what the argument is for.
Example: Required Parameters
```go
// GetWeatherParams defines the arguments for the getWeather tool.
type GetWeatherParams struct {
    // This field is REQUIRED (no "omitempty").
    // The jsonschema tag provides the description.
    Location string `json:"location" jsonschema:"The city and state, e.g., San Francisco, CA"`

    // This field is also REQUIRED.
    Unit string `json:"unit" jsonschema:"The temperature unit, either 'celsius' or 'fahrenheit'"`
}
```
In this example, both `location` and `unit` are mandatory.
##### Optional Parameters
A parameter is considered **optional** if you provide a **default value**. This is the standard Python way to define optional arguments. You can also mark a parameter as optional using `typing.Optional[SomeType]` or the `| None` syntax (Python 3.10+).
Example: Optional Parameters
```python
def search_flights(destination: str, departure_date: str, flexible_days: int = 0):
    """Searches for flights.

    Args:
        destination (str): The destination city.
        departure_date (str): The desired departure date.
        flexible_days (int, optional): Number of flexible days for the search. Defaults to 0.
    """
    # ... function logic ...
    if flexible_days > 0:
        return {"status": "success", "report": f"Found flexible flights to {destination}."}
    return {"status": "success", "report": f"Found flights to {destination} on {departure_date}."}
```
Here, `flexible_days` is optional. The LLM can choose to provide it, but it's not required.
A parameter is considered **optional** if its struct field has the `omitempty` or `omitzero` option in its `json` tag.
Example: Optional Parameters
```go
// GetWeatherParams defines the arguments for the getWeather tool.
type GetWeatherParams struct {
    // Location is required.
    Location string `json:"location" jsonschema:"The city and state, e.g., San Francisco, CA"`

    // Unit is optional.
    Unit string `json:"unit,omitempty" jsonschema:"The temperature unit, either 'celsius' or 'fahrenheit'"`

    // Days is optional.
    Days int `json:"days,omitzero" jsonschema:"The number of forecast days to return (defaults to 1)"`
}
```
Here, `unit` and `days` are optional. The LLM can choose to provide them, but they are not required.
##### Optional Parameters with `typing.Optional`
You can also mark a parameter as optional using `typing.Optional[SomeType]` or the `| None` syntax (Python 3.10+). This signals that the parameter can be `None`. When combined with a default value of `None`, it behaves as a standard optional parameter.
Example: `typing.Optional`
```python
from typing import Optional

def create_user_profile(username: str, bio: Optional[str] = None):
    """Creates a new user profile.

    Args:
        username (str): The user's unique username.
        bio (str, optional): A short biography for the user. Defaults to None.
    """
    # ... function logic ...
    if bio:
        return {"status": "success", "message": f"Profile for {username} created with a bio."}
    return {"status": "success", "message": f"Profile for {username} created."}
```
##### Variadic Parameters (`*args` and `**kwargs`)
While you can include `*args` (variable positional arguments) and `**kwargs` (variable keyword arguments) in your function signature for other purposes, they are **ignored by the ADK framework** when generating the tool schema for the LLM. The LLM will not be aware of them and cannot pass arguments to them. It's best to rely on explicitly defined parameters for all data you expect from the LLM.
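The parameter rules above can be illustrated with a small, framework-free sketch that uses Python's `inspect` module, which is the same kind of signature inspection the framework performs. This is an illustration of the convention, not the ADK's actual implementation; `search_flights` and `sketch_schema` are hypothetical names:

```python
import inspect

def search_flights(destination: str, max_results: int = 5, **kwargs):
    """Searches for flights to a destination."""

def sketch_schema(func):
    """Mimic how a framework might derive required/optional parameters."""
    required, optional = [], []
    for name, p in inspect.signature(func).parameters.items():
        # Variadic parameters (*args / **kwargs) carry no schema information.
        if p.kind in (p.VAR_POSITIONAL, p.VAR_KEYWORD):
            continue
        # No default value -> required; has a default -> optional.
        (optional if p.default is not p.empty else required).append(name)
    return {"description": inspect.getdoc(func),
            "required": required, "optional": optional}

schema = sketch_schema(search_flights)
# 'destination' is required, 'max_results' is optional, '**kwargs' is skipped.
```

Running this yields `required == ['destination']` and `optional == ['max_results']`, with `kwargs` absent from both lists.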
#### Return Type
The preferred return type for a Function Tool is a **dictionary** in Python, a **Map** in Java, an **object** in TypeScript, or a **struct** in Go. This allows you to structure the response with key-value pairs, providing context and clarity to the LLM. If your function returns a type other than a dictionary, the framework automatically wraps it into a dictionary with a single key named **"result"**.
Strive to make your return values as descriptive as possible. *For example,* instead of returning a numeric error code, return a dictionary with an "error_message" key containing a human-readable explanation. **Remember that the LLM**, not a piece of code, needs to understand the result. As a best practice, include a "status" key in your return dictionary to indicate the overall outcome (e.g., "success", "error", "pending"), providing the LLM with a clear signal about the operation's state.
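Both points can be sketched in a few lines of plain Python. The `wrap_result` helper mirrors the wrapping behavior described above, and `transfer_funds` is a hypothetical tool showing a descriptive, status-keyed return value:

```python
def wrap_result(value):
    """Mimic the framework's wrapping of non-dict returns (illustrative)."""
    return value if isinstance(value, dict) else {"result": value}

def transfer_funds(amount: float, balance: float) -> dict:
    """Hypothetical tool returning a descriptive, LLM-friendly result."""
    if amount > balance:
        # A readable message beats a bare error code like -1.
        return {"status": "error",
                "error_message": f"Insufficient funds: balance is {balance}."}
    return {"status": "success", "new_balance": balance - amount}

wrap_result(42.0)           # -> {"result": 42.0}
transfer_funds(50.0, 20.0)  # -> {"status": "error", "error_message": ...}
```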
#### Docstrings
The docstring of your function serves as the tool's **description** and is sent to the LLM. Therefore, a well-written and comprehensive docstring is crucial for the LLM to understand how to use the tool effectively. Clearly explain the purpose of the function, the meaning of its parameters, and the expected return values.
### Passing Data Between Tools
When an agent calls multiple tools in a sequence, you might need to pass data from one tool to another. The recommended way to do this is by using the `temp:` prefix in the session state.
A tool can write data to a `temp:` variable, and a subsequent tool can read it. This data is only available for the current invocation and is discarded afterwards.
Shared Invocation Context
All tool calls within a single agent turn share the same `InvocationContext`. This means they also share the same temporary (`temp:`) state, which is how data can be passed between them.
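The pattern can be sketched with a plain dictionary standing in for the invocation-scoped session state. In real ADK tools you would read and write state through the tool's context object rather than a global dict, and the key names below are illustrative:

```python
# A plain dict stands in for the invocation-scoped session state.
state = {}

def fetch_customer_id(name: str) -> dict:
    """First tool: looks up an ID and stashes it under a temp: key."""
    customer_id = "cust-001" if name == "Ada" else "cust-unknown"
    state["temp:customer_id"] = customer_id  # visible to later tools in this invocation
    return {"status": "success", "customer_id": customer_id}

def fetch_order_history(limit: int = 5) -> dict:
    """Second tool: reads the temp: value written by the first tool."""
    customer_id = state.get("temp:customer_id")
    if customer_id is None:
        return {"status": "error", "error_message": "No customer looked up yet."}
    return {"status": "success",
            "orders": [f"{customer_id}-order-{i}" for i in range(limit)]}

fetch_customer_id("Ada")
fetch_order_history(2)  # -> orders for "cust-001"

# At the end of the invocation, temp: keys are discarded:
state = {k: v for k, v in state.items() if not k.startswith("temp:")}
```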
### Example
This tool is a Python function that retrieves the stock price for a given ticker symbol.
Note: You need to install the `yfinance` library (`pip install yfinance`) before using this tool.
```python
# Copyright 2025 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from google.adk.agents import Agent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.genai import types
import yfinance as yf

APP_NAME = "stock_app"
USER_ID = "1234"
SESSION_ID = "session1234"

def get_stock_price(symbol: str):
    """Retrieves the current stock price for a given symbol.

    Args:
        symbol (str): The stock symbol (e.g., "AAPL", "GOOG").

    Returns:
        float: The current stock price, or None if an error occurs.
    """
    try:
        stock = yf.Ticker(symbol)
        historical_data = stock.history(period="1d")
        if not historical_data.empty:
            current_price = historical_data['Close'].iloc[-1]
            return current_price
        else:
            return None
    except Exception as e:
        print(f"Error retrieving stock price for {symbol}: {e}")
        return None

stock_price_agent = Agent(
    model='gemini-2.0-flash',
    name='stock_agent',
    instruction='You are an agent who retrieves stock prices. If a ticker symbol is provided, fetch the current price. If only a company name is given, first perform a Google search to find the correct ticker symbol before retrieving the stock price. If the provided ticker symbol is invalid or data cannot be retrieved, inform the user that the stock price could not be found.',
    description='This agent specializes in retrieving real-time stock prices. Given a stock ticker symbol (e.g., AAPL, GOOG, MSFT) or the stock name, use the tools and reliable data sources to provide the most up-to-date price.',
    tools=[get_stock_price],  # You can add Python functions directly to the tools list; they will be automatically wrapped as FunctionTools.
)

# Session and Runner
async def setup_session_and_runner():
    session_service = InMemorySessionService()
    session = await session_service.create_session(app_name=APP_NAME, user_id=USER_ID, session_id=SESSION_ID)
    runner = Runner(agent=stock_price_agent, app_name=APP_NAME, session_service=session_service)
    return session, runner

# Agent Interaction
async def call_agent_async(query):
    content = types.Content(role='user', parts=[types.Part(text=query)])
    session, runner = await setup_session_and_runner()
    events = runner.run_async(user_id=USER_ID, session_id=SESSION_ID, new_message=content)
    async for event in events:
        if event.is_final_response():
            final_response = event.content.parts[0].text
            print("Agent Response: ", final_response)

# Note: In Colab, you can directly use 'await' at the top level.
# If running this code as a standalone Python script, you'll need to use
# asyncio.run() or manage the event loop.
await call_agent_async("stock price of GOOG")
```
The return value from this tool will be wrapped into a dictionary.
```json
{"result": "$123"}
```
This tool returns a mocked stock price.
```typescript
/**
 * Copyright 2025 Google LLC
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
import {Content, Part, createUserContent} from '@google/genai';
import {
  stringifyContent,
  FunctionTool,
  InMemoryRunner,
  LlmAgent,
} from '@google/adk';
import {z} from 'zod';

// Define the function to get the stock price
async function getStockPrice({ticker}: {ticker: string}): Promise<{price: string}> {
  console.log(`Getting stock price for ${ticker}`);
  // In a real-world scenario, you would fetch the stock price from an API
  const price = (Math.random() * 1000).toFixed(2);
  return {price: `$${price}`};
}

async function main() {
  // Define the schema for the tool's parameters using Zod
  const getStockPriceSchema = z.object({
    ticker: z.string().describe('The stock ticker symbol to look up.'),
  });

  // Create a FunctionTool from the function and schema
  const stockPriceTool = new FunctionTool({
    name: 'getStockPrice',
    description: 'Gets the current price of a stock.',
    parameters: getStockPriceSchema,
    execute: getStockPrice,
  });

  // Define the agent that will use the tool
  const stockAgent = new LlmAgent({
    name: 'stock_agent',
    model: 'gemini-2.5-flash',
    instruction: 'You can get the stock price of a company.',
    tools: [stockPriceTool],
  });

  // Create a runner for the agent
  const runner = new InMemoryRunner({agent: stockAgent});

  // Create a new session
  const session = await runner.sessionService.createSession({
    appName: runner.appName,
    userId: 'test-user',
  });

  const userContent: Content = createUserContent('What is the stock price of GOOG?');

  // Run the agent and collect the events
  const response = [];
  for await (const event of runner.runAsync({
    userId: session.userId,
    sessionId: session.id,
    newMessage: userContent,
  })) {
    response.push(event);
  }

  // Print the final response from the agent
  const finalResponse = response[response.length - 1];
  if (finalResponse?.content?.parts?.length) {
    console.log(stringifyContent(finalResponse));
  }
}

main();
```
The return value from this tool will be an object.
```json
For input `GOOG`: {"price": "$123.45"}
```
This tool returns a mocked stock price.
```go
// Copyright 2025 Google LLC
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
package main
import (
    "context"
    "fmt"
    "log"
    "strings"

    "google.golang.org/adk/agent"
    "google.golang.org/adk/agent/llmagent"
    "google.golang.org/adk/model/gemini"
    "google.golang.org/adk/runner"
    "google.golang.org/adk/session"
    "google.golang.org/adk/tool"
    "google.golang.org/adk/tool/agenttool"
    "google.golang.org/adk/tool/functiontool"
    "google.golang.org/genai"
)
// mockStockPrices provides a simple in-memory database of stock prices
// to simulate a real-world stock data API. This allows the example to
// demonstrate tool functionality without making external network calls.
var mockStockPrices = map[string]float64{
    "GOOG": 300.6,
    "AAPL": 123.4,
    "MSFT": 234.5,
}

// getStockPriceArgs defines the schema for the arguments passed to the getStockPrice tool.
// Using a struct is the recommended approach in the Go ADK as it provides strong
// typing and clear validation for the expected inputs.
type getStockPriceArgs struct {
    Symbol string `json:"symbol" jsonschema:"The stock ticker symbol, e.g., GOOG"`
}

// getStockPriceResults defines the output schema for the getStockPrice tool.
type getStockPriceResults struct {
    Symbol string  `json:"symbol"`
    Price  float64 `json:"price,omitempty"`
    Error  string  `json:"error,omitempty"`
}

// getStockPrice is a tool that retrieves the stock price for a given ticker symbol
// from the mockStockPrices map. It demonstrates how a function can be used as a
// tool by an agent. If the symbol is found, it returns a struct containing the
// symbol and its price. Otherwise, it returns a struct with an error message.
func getStockPrice(ctx tool.Context, input getStockPriceArgs) (getStockPriceResults, error) {
    symbolUpper := strings.ToUpper(input.Symbol)
    if price, ok := mockStockPrices[symbolUpper]; ok {
        fmt.Printf("Tool: Found price for %s: %f\n", input.Symbol, price)
        return getStockPriceResults{Symbol: input.Symbol, Price: price}, nil
    }
    return getStockPriceResults{}, fmt.Errorf("no data found for symbol")
}
// createStockAgent initializes and configures an LlmAgent.
// This agent is equipped with the getStockPrice tool and is instructed
// on how to respond to user queries about stock prices. It uses the
// Gemini model to understand user intent and decide when to use its tools.
func createStockAgent(ctx context.Context) (agent.Agent, error) {
    stockPriceTool, err := functiontool.New(
        functiontool.Config{
            Name:        "get_stock_price",
            Description: "Retrieves the current stock price for a given symbol.",
        },
        getStockPrice)
    if err != nil {
        return nil, err
    }
    model, err := gemini.NewModel(ctx, "gemini-2.5-flash", &genai.ClientConfig{})
    if err != nil {
        log.Fatalf("Failed to create model: %v", err)
    }
    return llmagent.New(llmagent.Config{
        Name:        "stock_agent",
        Model:       model,
        Instruction: "You are an agent who retrieves stock prices. If a ticker symbol is provided, fetch the current price. If only a company name is given, first perform a Google search to find the correct ticker symbol before retrieving the stock price. If the provided ticker symbol is invalid or data cannot be retrieved, inform the user that the stock price could not be found.",
        Description: "This agent specializes in retrieving real-time stock prices. Given a stock ticker symbol (e.g., AAPL, GOOG, MSFT) or the stock name, use the tools and reliable data sources to provide the most up-to-date price.",
        Tools: []tool.Tool{
            stockPriceTool,
        },
    })
}
// userID and appName are constants used to identify the user and application
// throughout the session. These values are important for logging, tracking,
// and managing state across different agent interactions.
const (
    userID  = "example_user_id"
    appName = "example_app"
)
// callAgent orchestrates the execution of the agent for a given prompt.
// It sets up the necessary services, creates a session, and uses a runner
// to manage the agent's lifecycle. It streams the agent's responses and
// prints them to the console, handling any potential errors during the run.
func callAgent(ctx context.Context, a agent.Agent, prompt string) {
    sessionService := session.InMemoryService()

    // Create a new session for the agent interactions.
    session, err := sessionService.Create(ctx, &session.CreateRequest{
        AppName: appName,
        UserID:  userID,
    })
    if err != nil {
        log.Fatalf("Failed to create the session service: %v", err)
    }

    config := runner.Config{
        AppName:        appName,
        Agent:          a,
        SessionService: sessionService,
    }

    // Create the runner to manage the agent execution.
    r, err := runner.New(config)
    if err != nil {
        log.Fatalf("Failed to create the runner: %v", err)
    }

    sessionID := session.Session.ID()
    userMsg := &genai.Content{
        Parts: []*genai.Part{
            genai.NewPartFromText(prompt),
        },
        Role: string(genai.RoleUser),
    }

    for event, err := range r.Run(ctx, userID, sessionID, userMsg, agent.RunConfig{
        StreamingMode: agent.StreamingModeNone,
    }) {
        if err != nil {
            fmt.Printf("\nAGENT_ERROR: %v\n", err)
        } else {
            for _, p := range event.Content.Parts {
                fmt.Print(p.Text)
            }
        }
    }
}
// RunAgentSimulation serves as the entry point for this example.
// It creates the stock agent and then simulates a series of user interactions
// by sending different prompts to the agent. This function showcases how the
// agent responds to various queries, including both successful and unsuccessful
// attempts to retrieve stock prices.
func RunAgentSimulation() {
    // Create the stock agent
    agent, err := createStockAgent(context.Background())
    if err != nil {
        panic(err)
    }
    fmt.Println("Agent created:", agent.Name())

    prompts := []string{
        "stock price of GOOG",
        "What's the price of MSFT?",
        "Can you find the stock price for an unknown company XYZ?",
    }

    // Simulate running the agent with different prompts
    for _, prompt := range prompts {
        fmt.Printf("\nPrompt: %s\nResponse: ", prompt)
        callAgent(context.Background(), agent, prompt)
        fmt.Println("\n---")
    }
}
// createSummarizerAgent creates an agent whose sole purpose is to summarize text.
func createSummarizerAgent(ctx context.Context) (agent.Agent, error) {
    model, err := gemini.NewModel(ctx, "gemini-2.5-flash", &genai.ClientConfig{})
    if err != nil {
        return nil, err
    }
    return llmagent.New(llmagent.Config{
        Name:        "SummarizerAgent",
        Model:       model,
        Instruction: "You are an expert at summarizing text. Take the user's input and provide a concise summary.",
        Description: "An agent that summarizes text.",
    })
}
// createMainAgent creates the primary agent that will use the summarizer agent as a tool.
func createMainAgent(ctx context.Context, tools ...tool.Tool) (agent.Agent, error) {
    model, err := gemini.NewModel(ctx, "gemini-2.5-flash", &genai.ClientConfig{})
    if err != nil {
        return nil, err
    }
    return llmagent.New(llmagent.Config{
        Name:  "MainAgent",
        Model: model,
        Instruction: "You are a helpful assistant. If you are asked to summarize a long text, use the 'summarize' tool. " +
            "After getting the summary, present it to the user by saying 'Here is a summary of the text:'.",
        Description: "The main agent that can delegate tasks.",
        Tools:       tools,
    })
}
func RunAgentAsToolSimulation() {
    ctx := context.Background()

    // 1. Create the Tool Agent (Summarizer)
    summarizerAgent, err := createSummarizerAgent(ctx)
    if err != nil {
        log.Fatalf("Failed to create summarizer agent: %v", err)
    }

    // 2. Wrap the Tool Agent in an AgentTool
    summarizeTool := agenttool.New(summarizerAgent, &agenttool.Config{
        SkipSummarization: true,
    })

    // 3. Create the Main Agent and provide it with the AgentTool
    mainAgent, err := createMainAgent(ctx, summarizeTool)
    if err != nil {
        log.Fatalf("Failed to create main agent: %v", err)
    }

    // 4. Run the main agent
    prompt := `
Please summarize this text for me:

Quantum computing represents a fundamentally different approach to computation,
leveraging the bizarre principles of quantum mechanics to process information. Unlike classical computers
that rely on bits representing either 0 or 1, quantum computers use qubits which can exist in a state of superposition - effectively
being 0, 1, or a combination of both simultaneously. Furthermore, qubits can become entangled,
meaning their fates are intertwined regardless of distance, allowing for complex correlations. This parallelism and
interconnectedness grant quantum computers the potential to solve specific types of incredibly complex problems - such
as drug discovery, materials science, complex system optimization, and breaking certain types of cryptography - far
faster than even the most powerful classical supercomputers could ever achieve, although the technology is still largely in its developmental stages.
`
    fmt.Printf("\nPrompt: %s\nResponse: ", prompt)
    callAgent(context.Background(), mainAgent, prompt)
    fmt.Println("\n---")
}
func main() {
    fmt.Println("Attempting to run the agent simulation...")
    RunAgentSimulation()
    fmt.Println("\nAttempting to run the agent-as-a-tool simulation...")
    RunAgentAsToolSimulation()
}
```
The return value from this tool will be a `getStockPriceResults` instance.
```json
For input `{"symbol": "GOOG"}`: {"price":300.6,"symbol":"GOOG"}
```
This tool returns a mocked stock price.
```java
import com.google.adk.agents.LlmAgent;
import com.google.adk.events.Event;
import com.google.adk.runner.InMemoryRunner;
import com.google.adk.sessions.Session;
import com.google.adk.tools.Annotations.Schema;
import com.google.adk.tools.FunctionTool;
import com.google.genai.types.Content;
import com.google.genai.types.Part;
import io.reactivex.rxjava3.core.Flowable;
import java.util.HashMap;
import java.util.Map;

public class StockPriceAgent {

  private static final String APP_NAME = "stock_agent";
  private static final String USER_ID = "user1234";

  // Mock data for various stocks functionality
  // NOTE: This is a MOCK implementation. In a real Java application,
  // you would use a financial data API or library.
  private static final Map<String, Double> mockStockPrices = new HashMap<>();

  static {
    mockStockPrices.put("GOOG", 1.0);
    mockStockPrices.put("AAPL", 1.0);
    mockStockPrices.put("MSFT", 1.0);
  }

  @Schema(description = "Retrieves the current stock price for a given symbol.")
  public static Map<String, Object> getStockPrice(
      @Schema(description = "The stock symbol (e.g., \"AAPL\", \"GOOG\")", name = "symbol")
          String symbol) {
    try {
      if (mockStockPrices.containsKey(symbol.toUpperCase())) {
        double currentPrice = mockStockPrices.get(symbol.toUpperCase());
        System.out.println("Tool: Found price for " + symbol + ": " + currentPrice);
        return Map.of("symbol", symbol, "price", currentPrice);
      } else {
        return Map.of("symbol", symbol, "error", "No data found for symbol");
      }
    } catch (Exception e) {
      return Map.of("symbol", symbol, "error", e.getMessage());
    }
  }

  public static void callAgent(String prompt) {
    // Create the FunctionTool from the Java method
    FunctionTool getStockPriceTool = FunctionTool.create(StockPriceAgent.class, "getStockPrice");

    LlmAgent stockPriceAgent =
        LlmAgent.builder()
            .model("gemini-2.0-flash")
            .name("stock_agent")
            .instruction(
                "You are an agent who retrieves stock prices. If a ticker symbol is provided, fetch the current price. If only a company name is given, first perform a Google search to find the correct ticker symbol before retrieving the stock price. If the provided ticker symbol is invalid or data cannot be retrieved, inform the user that the stock price could not be found.")
            .description(
                "This agent specializes in retrieving real-time stock prices. Given a stock ticker symbol (e.g., AAPL, GOOG, MSFT) or the stock name, use the tools and reliable data sources to provide the most up-to-date price.")
            .tools(getStockPriceTool) // Add the Java FunctionTool
            .build();

    // Create an InMemoryRunner
    InMemoryRunner runner = new InMemoryRunner(stockPriceAgent, APP_NAME);
    // InMemoryRunner automatically creates a session service. Create a session using the service
    Session session = runner.sessionService().createSession(APP_NAME, USER_ID).blockingGet();
    Content userMessage = Content.fromParts(Part.fromText(prompt));

    // Run the agent
    Flowable<Event> eventStream = runner.runAsync(USER_ID, session.id(), userMessage);

    // Stream event response
    eventStream.blockingForEach(
        event -> {
          if (event.finalResponse()) {
            System.out.println(event.stringifyContent());
          }
        });
  }

  public static void main(String[] args) {
    callAgent("stock price of GOOG");
    callAgent("What's the price of MSFT?");
    callAgent("Can you find the stock price for an unknown company XYZ?");
  }
}
```
The return value from this tool will be wrapped into a Map.
```json
For input `GOOG`: {"symbol": "GOOG", "price": 1.0}
```
### Best Practices
While you have considerable flexibility in defining your function, remember that simplicity enhances usability for the LLM. Consider these guidelines:
- **Fewer Parameters are Better:** Minimize the number of parameters to reduce complexity.
- **Simple Data Types:** Favor primitive data types like `str` and `int` over custom classes whenever possible.
- **Meaningful Names:** The function's name and parameter names significantly influence how the LLM interprets and utilizes the tool. Choose names that clearly reflect the function's purpose and the meaning of its inputs. Avoid generic names like `do_stuff()` or `beAgent()`.
- **Build for Parallel Execution:** Improve function calling performance when multiple tools are run by building for asynchronous operation. For information on enabling parallel execution for tools, see [Increase tool performance with parallel execution](/adk-docs/tools-custom/performance/).
## Long Running Function Tools
Long-running function tools let you start and manage tasks that are handled outside your agent workflow and require a significant amount of processing time, without blocking the agent's execution. This tool type is a subclass of `FunctionTool`.
When using a `LongRunningFunctionTool`, your function can initiate the long-running operation and optionally return an **initial result**, such as a long-running operation ID. Once a long-running function tool is invoked, the agent runner pauses the agent run and lets the agent client decide whether to continue or wait until the long-running operation finishes. The agent client can query the progress of the long-running operation and send back an intermediate or final response. The agent can then continue with other tasks. An example is the human-in-the-loop scenario where the agent needs human approval before proceeding with a task.
Warning: Execution handling
Long Running Function Tools are designed to help you start and *manage* long-running tasks as part of your agent workflow, but ***not*** to perform the actual long-running task. For tasks that require significant time to complete, you should implement a separate server to do the work.
Tip: Parallel execution
Depending on the type of tool you are building, designing for asynchronous operation may be a better solution than creating a long running tool. For more information, see [Increase tool performance with parallel execution](/adk-docs/tools-custom/performance/).
### How it Works
In Python, you wrap a function with `LongRunningFunctionTool`. In Java, you pass a method name to `LongRunningFunctionTool.create()`. In TypeScript, you instantiate the `LongRunningFunctionTool` class. In Go, you register the function with `functiontool.New`, as shown in the example below.
1. **Initiation:** When the LLM calls the tool, your function starts the long-running operation.
1. **Initial Updates:** Your function can optionally return an initial result (e.g., the long-running operation ID). The ADK framework takes the result and sends it back to the LLM packaged within a `FunctionResponse`. This allows the LLM to inform the user (e.g., status, percentage complete, messages). The agent run is then ended or paused.
1. **Continue or Wait:** After each agent run completes, the agent client can query the progress of the long-running operation and decide whether to continue the agent run with an intermediate response (to update the progress) or wait until a final response is available. The agent client sends the intermediate or final response back to the agent for the next run.
1. **Framework Handling:** The ADK framework manages the execution. It sends the intermediate or final `FunctionResponse` provided by the agent client to the LLM to generate a user-friendly message.
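The initiate/poll/continue flow above can be sketched without any framework code. Here a dictionary plays the role of the external ticket system, and the "agent client" polls it and forwards the final result; all names and values are illustrative:

```python
# A dict stands in for the external system doing the long-running work.
tickets = {}

def ask_for_approval(purpose: str, amount: float) -> dict:
    """Initiation: start the operation and return an initial result immediately."""
    tickets["approval-ticket-1"] = {"status": "pending"}
    return {"status": "pending", "ticket_id": "approval-ticket-1"}

def poll_ticket(ticket_id: str) -> dict:
    """The agent client queries progress between agent runs."""
    return tickets[ticket_id]

# 1. Initiation: the tool returns an initial result and the agent run pauses.
initial = ask_for_approval("conference travel", 250.0)

# 2. Meanwhile, the external system completes the work (e.g., an approver acts).
tickets[initial["ticket_id"]] = {"status": "approved", "approver": "Sean Zhou"}

# 3. Continue: the client polls for the outcome and sends it back to the agent,
#    where the framework would package it as the final FunctionResponse.
final = poll_ticket(initial["ticket_id"])
```

The key point is that the slow work happens outside the tool; the tool only hands back an identifier that lets the client correlate the eventual result with the original call.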
### Creating the Tool
Define your tool function and wrap it using the `LongRunningFunctionTool` class:
```python
from typing import Any

from google.adk.tools import LongRunningFunctionTool

# 1. Define the long running function
def ask_for_approval(purpose: str, amount: float) -> dict[str, Any]:
    """Ask for approval for the reimbursement."""
    # Create a ticket for the approval
    # Send a notification to the approver with the link of the ticket
    return {
        'status': 'pending',
        'approver': 'Sean Zhou',
        'purpose': purpose,
        'amount': amount,
        'ticket-id': 'approval-ticket-1',
    }

def reimburse(purpose: str, amount: float) -> dict[str, Any]:
    """Reimburse the amount of money to the employee."""
    # Send the reimbursement request to the payment vendor
    return {'status': 'ok'}

# 2. Wrap the function with LongRunningFunctionTool
long_running_tool = LongRunningFunctionTool(func=ask_for_approval)
```
```typescript
import {LongRunningFunctionTool} from '@google/adk';
import {z} from 'zod';

// 1. Define the long-running function
function askForApproval(args: {purpose: string; amount: number}) {
  /**
   * Ask for approval for the reimbursement.
   */
  // create a ticket for the approval
  // Send a notification to the approver with the link of the ticket
  return {
    "status": "pending",
    "approver": "Sean Zhou",
    "purpose": args.purpose,
    "amount": args.amount,
    "ticket-id": "approval-ticket-1",
  };
}

// 2. Instantiate the LongRunningFunctionTool class with the long-running function
const longRunningTool = new LongRunningFunctionTool({
  name: "ask_for_approval",
  description: "Ask for approval for the reimbursement.",
  parameters: z.object({
    purpose: z.string().describe("The purpose of the reimbursement."),
    amount: z.number().describe("The amount to reimburse."),
  }),
  execute: askForApproval,
});
```
```go
import (
"google.golang.org/adk/agent"
"google.golang.org/adk/agent/llmagent"
"google.golang.org/adk/model/gemini"
"google.golang.org/adk/tool"
"google.golang.org/adk/tool/functiontool"
"google.golang.org/genai"
)
// CreateTicketArgs defines the arguments for our long-running tool.
type CreateTicketArgs struct {
Urgency string `json:"urgency" jsonschema:"The urgency level of the ticket."`
}
// CreateTicketResults defines the *initial* output of our long-running tool.
type CreateTicketResults struct {
Status string `json:"status"`
TicketId string `json:"ticket_id"`
}
// createTicketAsync simulates the *initiation* of a long-running ticket creation task.
func createTicketAsync(ctx tool.Context, args CreateTicketArgs) (CreateTicketResults, error) {
log.Printf("TOOL_EXEC: 'create_ticket_long_running' called with urgency: %s (Call ID: %s)\n", args.Urgency, ctx.FunctionCallID())
// "Generate" a ticket ID and return it in the initial response.
ticketID := "TICKET-ABC-123"
log.Printf("ACTION: Generated Ticket ID: %s for Call ID: %s\n", ticketID, ctx.FunctionCallID())
// In a real application, you would save the association between the
// FunctionCallID and the ticketID to handle the async response later.
return CreateTicketResults{
Status: "started",
TicketId: ticketID,
}, nil
}
func createTicketAgent(ctx context.Context) (agent.Agent, error) {
ticketTool, err := functiontool.New(
functiontool.Config{
Name: "create_ticket_long_running",
Description: "Creates a new support ticket with a specified urgency level.",
},
createTicketAsync,
)
if err != nil {
return nil, fmt.Errorf("failed to create long running tool: %w", err)
}
model, err := gemini.NewModel(ctx, "gemini-2.5-flash", &genai.ClientConfig{})
if err != nil {
return nil, fmt.Errorf("failed to create model: %w", err)
}
return llmagent.New(llmagent.Config{
Name: "ticket_agent",
Model: model,
Instruction: "You are a helpful assistant for creating support tickets. Provide the status of the ticket at each interaction.",
Tools: []tool.Tool{ticketTool},
})
}
```
```java
import com.google.adk.agents.LlmAgent;
import com.google.adk.tools.LongRunningFunctionTool;
import java.util.HashMap;
import java.util.Map;
public class ExampleLongRunningFunction {
// Define your Long Running function.
// Ask for approval for the reimbursement.
public static Map<String, Object> askForApproval(String purpose, double amount) {
// Simulate creating a ticket and sending a notification
System.out.println(
"Simulating ticket creation for purpose: " + purpose + ", amount: " + amount);
// Send a notification to the approver with the link of the ticket
Map<String, Object> result = new HashMap<>();
result.put("status", "pending");
result.put("approver", "Sean Zhou");
result.put("purpose", purpose);
result.put("amount", amount);
result.put("ticket-id", "approval-ticket-1");
return result;
}
public static void main(String[] args) throws NoSuchMethodException {
// Pass the method to LongRunningFunctionTool.create
LongRunningFunctionTool approveTool =
LongRunningFunctionTool.create(ExampleLongRunningFunction.class, "askForApproval");
// Include the tool in the agent
LlmAgent approverAgent =
LlmAgent.builder()
// ...
.tools(approveTool)
.build();
}
}
```
### Intermediate and final result updates
The agent client receives an event containing the long-running function call and checks the status of the ticket. The client can then send an intermediate or final response back to report progress. The framework packages this value (even if it's None) into the content of the `FunctionResponse` sent back to the LLM.
Note: Long-running function responses with the Resume feature
If your ADK agent workflow is configured with the [Resume](/adk-docs/runtime/resume/) feature, you must also include the Invocation ID (`invocation_id`) parameter with the long-running function response. The Invocation ID you provide must match the invocation that generated the long-running function request; otherwise the system starts a new invocation with the response. If your agent uses the Resume feature, consider including the Invocation ID as a parameter with your long-running function request, so it can be included with the response. For more details on using the Resume feature, see [Resume stopped agents](/adk-docs/runtime/resume/).
Applies only to the Java ADK
When passing `ToolContext` with Function Tools, ensure that one of the following is true:
- The Schema is passed with the ToolContext parameter in the function signature, like:
```text
@com.google.adk.tools.Annotations.Schema(name = "toolContext") ToolContext toolContext
```
OR
- The `-parameters` compiler flag is set on the Maven compiler plugin, for example:
```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-compiler-plugin</artifactId>
  <version>3.14.0</version>
  <configuration>
    <compilerArgs>
      <arg>-parameters</arg>
    </compilerArgs>
  </configuration>
</plugin>
```
This constraint is temporary and will be removed.
```python
# Agent Interaction
async def call_agent_async(query):
def get_long_running_function_call(event: Event) -> types.FunctionCall:
# Get the long running function call from the event
if not event.long_running_tool_ids or not event.content or not event.content.parts:
return
for part in event.content.parts:
if (
part
and part.function_call
and event.long_running_tool_ids
and part.function_call.id in event.long_running_tool_ids
):
return part.function_call
def get_function_response(event: Event, function_call_id: str) -> types.FunctionResponse:
# Get the function response for the function call with the specified id.
if not event.content or not event.content.parts:
return
for part in event.content.parts:
if (
part
and part.function_response
and part.function_response.id == function_call_id
):
return part.function_response
content = types.Content(role='user', parts=[types.Part(text=query)])
session, runner = await setup_session_and_runner()
print("\nRunning agent...")
events_async = runner.run_async(
session_id=session.id, user_id=USER_ID, new_message=content
)
long_running_function_call, long_running_function_response, ticket_id = None, None, None
async for event in events_async:
# Use helper to check for the specific auth request event
if not long_running_function_call:
long_running_function_call = get_long_running_function_call(event)
else:
_potential_response = get_function_response(event, long_running_function_call.id)
if _potential_response: # Only update if we get a non-None response
long_running_function_response = _potential_response
ticket_id = long_running_function_response.response['ticket-id']
if event.content and event.content.parts:
if text := ''.join(part.text or '' for part in event.content.parts):
print(f'[{event.author}]: {text}')
if long_running_function_response:
# query the status of the corresponding ticket via ticket_id
# send back an intermediate / final response
updated_response = long_running_function_response.model_copy(deep=True)
updated_response.response = {'status': 'approved'}
async for event in runner.run_async(
session_id=session.id, user_id=USER_ID, new_message=types.Content(parts=[types.Part(function_response = updated_response)], role='user')
):
if event.content and event.content.parts:
if text := ''.join(part.text or '' for part in event.content.parts):
print(f'[{event.author}]: {text}')
```
```typescript
/**
* Copyright 2025 Google LLC
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import {
LlmAgent,
Runner,
FunctionTool,
LongRunningFunctionTool,
InMemorySessionService,
Event,
stringifyContent,
} from "@google/adk";
import {z} from "zod";
import {Content, FunctionCall, FunctionResponse, createUserContent} from "@google/genai";
// 1. Define the long-running function
function askForApproval(args: {purpose: string; amount: number}) {
/**
* Ask for approval for the reimbursement.
*/
// create a ticket for the approval
// Send a notification to the approver with the link of the ticket
return {
"status": "pending",
"approver": "Sean Zhou",
"purpose": args.purpose,
"amount": args.amount,
"ticket-id": "approval-ticket-1",
};
}
// 2. Instantiate the LongRunningFunctionTool class with the long-running function
const longRunningTool = new LongRunningFunctionTool({
name: "ask_for_approval",
description: "Ask for approval for the reimbursement.",
parameters: z.object({
purpose: z.string().describe("The purpose of the reimbursement."),
amount: z.number().describe("The amount to reimburse."),
}),
execute: askForApproval,
});
function reimburse(args: {purpose: string; amount: number}) {
/**
* Reimburse the amount of money to the employee.
*/
// send the reimbursement request to payment vendor
return {status: "ok"};
}
const reimburseTool = new FunctionTool({
name: "reimburse",
description: "Reimburse the amount of money to the employee.",
parameters: z.object({
purpose: z.string().describe("The purpose of the reimbursement."),
amount: z.number().describe("The amount to reimburse."),
}),
execute: reimburse,
});
// 3. Use the tool in an Agent
const reimbursementAgent = new LlmAgent({
model: "gemini-2.5-flash",
name: "reimbursement_agent",
instruction: `
You are an agent whose job is to handle the reimbursement process for
the employees. If the amount is less than $100, you will automatically
approve the reimbursement.
If the amount is greater than $100, you will
ask for approval from the manager. If the manager approves, you will
call reimburse() to reimburse the amount to the employee. If the manager
rejects, you will inform the employee of the rejection.
`,
tools: [reimburseTool, longRunningTool],
});
const APP_NAME = "human_in_the_loop";
const USER_ID = "1234";
const SESSION_ID = "session1234";
// Session and Runner
async function setupSessionAndRunner() {
const sessionService = new InMemorySessionService();
const session = await sessionService.createSession({
appName: APP_NAME,
userId: USER_ID,
sessionId: SESSION_ID,
});
const runner = new Runner({
agent: reimbursementAgent,
appName: APP_NAME,
sessionService: sessionService,
});
return {session, runner};
}
function getLongRunningFunctionCall(event: Event): FunctionCall | undefined {
// Get the long-running function call from the event
if (
!event.longRunningToolIds ||
!event.content ||
!event.content.parts?.length
) {
return;
}
for (const part of event.content.parts) {
if (
part &&
part.functionCall &&
event.longRunningToolIds &&
part.functionCall.id &&
event.longRunningToolIds.includes(part.functionCall.id)
) {
return part.functionCall;
}
}
}
function getFunctionResponse(
event: Event,
functionCallId: string
): FunctionResponse | undefined {
// Get the function response for the function call with specified id.
if (!event.content || !event.content.parts?.length) {
return;
}
for (const part of event.content.parts) {
if (
part &&
part.functionResponse &&
part.functionResponse.id === functionCallId
) {
return part.functionResponse;
}
}
}
// Agent Interaction
async function callAgentAsync(query: string) {
let longRunningFunctionCall: FunctionCall | undefined;
let longRunningFunctionResponse: FunctionResponse | undefined;
let ticketId: string | undefined;
const content: Content = createUserContent(query);
const {session, runner} = await setupSessionAndRunner();
console.log("\nRunning agent...");
const events = runner.runAsync({
sessionId: session.id,
userId: USER_ID,
newMessage: content,
});
for await (const event of events) {
// Use helper to check for the specific auth request event
if (!longRunningFunctionCall) {
longRunningFunctionCall = getLongRunningFunctionCall(event);
} else {
const _potentialResponse = getFunctionResponse(
event,
longRunningFunctionCall.id!
);
if (_potentialResponse) {
// Only update if we get a non-None response
longRunningFunctionResponse = _potentialResponse;
ticketId = (
longRunningFunctionResponse.response as {[key: string]: any}
)[`ticket-id`];
}
}
const text = stringifyContent(event);
if (text) {
console.log(`[${event.author}]: ${text}`);
}
}
if (longRunningFunctionResponse) {
// query the status of the corresponding ticket via ticket_id
// send back an intermediate / final response
const updatedResponse = JSON.parse(
JSON.stringify(longRunningFunctionResponse)
);
updatedResponse.response = {status: "approved"};
for await (const event of runner.runAsync({
sessionId: session.id,
userId: USER_ID,
newMessage: createUserContent(JSON.stringify({functionResponse: updatedResponse})),
})) {
const text = stringifyContent(event);
if (text) {
console.log(`[${event.author}]: ${text}`);
}
}
}
}
async function main() {
// reimbursement that doesn't require approval
await callAgentAsync("Please reimburse 50$ for meals");
// reimbursement that requires approval
await callAgentAsync("Please reimburse 200$ for meals");
}
main();
```
The following example demonstrates a multi-turn workflow. First, the user asks the agent to create a ticket. The agent calls the long-running tool and the client captures the `FunctionCall` ID. The client then simulates the asynchronous work completing by sending subsequent `FunctionResponse` messages back to the agent to provide the ticket ID and final status.
```go
// runTurn executes a single turn with the agent and returns the captured function call ID.
func runTurn(ctx context.Context, r *runner.Runner, sessionID, turnLabel string, content *genai.Content) string {
var funcCallID atomic.Value // Safely store the found ID.
fmt.Printf("\n--- %s ---\n", turnLabel)
for event, err := range r.Run(ctx, userID, sessionID, content, agent.RunConfig{
StreamingMode: agent.StreamingModeNone,
}) {
if err != nil {
fmt.Printf("\nAGENT_ERROR: %v\n", err)
continue
}
// Print a summary of the event for clarity.
printEventSummary(event, turnLabel)
// Capture the function call ID from the event.
for _, part := range event.Content.Parts {
if fc := part.FunctionCall; fc != nil {
if fc.Name == "create_ticket_long_running" {
funcCallID.Store(fc.ID)
}
}
}
}
if id, ok := funcCallID.Load().(string); ok {
return id
}
return ""
}
func main() {
ctx := context.Background()
ticketAgent, err := createTicketAgent(ctx)
if err != nil {
log.Fatalf("Failed to create agent: %v", err)
}
// Setup the runner and session.
sessionService := session.InMemoryService()
session, err := sessionService.Create(ctx, &session.CreateRequest{AppName: appName, UserID: userID})
if err != nil {
log.Fatalf("Failed to create session: %v", err)
}
r, err := runner.New(runner.Config{AppName: appName, Agent: ticketAgent, SessionService: sessionService})
if err != nil {
log.Fatalf("Failed to create runner: %v", err)
}
// --- Turn 1: User requests to create a ticket. ---
initialUserMessage := genai.NewContentFromText("Create a high urgency ticket for me.", genai.RoleUser)
funcCallID := runTurn(ctx, r, session.Session.ID(), "Turn 1: User Request", initialUserMessage)
if funcCallID == "" {
log.Fatal("ERROR: Tool 'create_ticket_long_running' not called in Turn 1.")
}
fmt.Printf("ACTION: Captured FunctionCall ID: %s\n", funcCallID)
// --- Turn 2: App provides the final status of the ticket. ---
// In a real application, the ticketID would be retrieved from a database
// using the funcCallID. For this example, we'll use the same ID.
ticketID := "TICKET-ABC-123"
willContinue := false // Signal that this is the final response.
ticketStatusResponse := &genai.FunctionResponse{
Name: "create_ticket_long_running",
ID: funcCallID,
Response: map[string]any{
"status": "approved",
"ticket_id": ticketID,
},
WillContinue: &willContinue,
}
appResponseWithStatus := &genai.Content{
Role: string(genai.RoleUser),
Parts: []*genai.Part{{FunctionResponse: ticketStatusResponse}},
}
runTurn(ctx, r, session.Session.ID(), "Turn 2: App provides ticket status", appResponseWithStatus)
fmt.Println("Long running function completed successfully.")
}
// printEventSummary provides a readable log of agent and LLM interactions.
func printEventSummary(event *session.Event, turnLabel string) {
for _, part := range event.Content.Parts {
// Check for a text part.
if part.Text != "" {
fmt.Printf("[%s][%s_TEXT]: %s\n", turnLabel, event.Author, part.Text)
}
// Check for a function call part.
if fc := part.FunctionCall; fc != nil {
fmt.Printf("[%s][%s_CALL]: %s(%v) ID: %s\n", turnLabel, event.Author, fc.Name, fc.Args, fc.ID)
}
}
}
```
```java
import com.google.adk.agents.LlmAgent;
import com.google.adk.events.Event;
import com.google.adk.runner.InMemoryRunner;
import com.google.adk.runner.Runner;
import com.google.adk.sessions.Session;
import com.google.adk.tools.Annotations.Schema;
import com.google.adk.tools.LongRunningFunctionTool;
import com.google.adk.tools.ToolContext;
import com.google.common.collect.ImmutableList;
import com.google.common.collect.ImmutableMap;
import com.google.genai.types.Content;
import com.google.genai.types.FunctionCall;
import com.google.genai.types.FunctionResponse;
import com.google.genai.types.Part;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.atomic.AtomicReference;
import java.util.stream.Collectors;
public class LongRunningFunctionExample {
private static String USER_ID = "user123";
@Schema(
name = "create_ticket_long_running",
description = """
Creates a new support ticket with a specified urgency level.
Examples of urgency are 'high', 'medium', or 'low'.
The ticket creation is a long-running process, and its ID will be provided when ready.
""")
public static void createTicketAsync(
@Schema(
name = "urgency",
description =
"The urgency level for the new ticket, such as 'high', 'medium', or 'low'.")
String urgency,
@Schema(name = "toolContext") // Ensures ADK injection
ToolContext toolContext) {
System.out.printf(
"TOOL_EXEC: 'create_ticket_long_running' called with urgency: %s (Call ID: %s)%n",
urgency, toolContext.functionCallId().orElse("N/A"));
}
public static void main(String[] args) {
LlmAgent agent =
LlmAgent.builder()
.name("ticket_agent")
.description("Agent for creating tickets via a long-running task.")
.model("gemini-2.0-flash")
.tools(
ImmutableList.of(
LongRunningFunctionTool.create(
LongRunningFunctionExample.class, "createTicketAsync")))
.build();
Runner runner = new InMemoryRunner(agent);
Session session =
runner.sessionService().createSession(agent.name(), USER_ID, null, null).blockingGet();
// --- Turn 1: User requests ticket ---
System.out.println("\n--- Turn 1: User Request ---");
Content initialUserMessage =
Content.fromParts(Part.fromText("Create a high urgency ticket for me."));
AtomicReference<String> funcCallIdRef = new AtomicReference<>();
runner
.runAsync(USER_ID, session.id(), initialUserMessage)
.blockingForEach(
event -> {
printEventSummary(event, "T1");
if (funcCallIdRef.get() == null) { // Capture the first relevant function call ID
event.content().flatMap(Content::parts).orElse(ImmutableList.of()).stream()
.map(Part::functionCall)
.flatMap(Optional::stream)
.filter(fc -> "create_ticket_long_running".equals(fc.name().orElse("")))
.findFirst()
.flatMap(FunctionCall::id)
.ifPresent(funcCallIdRef::set);
}
});
if (funcCallIdRef.get() == null) {
System.out.println("ERROR: Tool 'create_ticket_long_running' not called in Turn 1.");
return;
}
System.out.println("ACTION: Captured FunctionCall ID: " + funcCallIdRef.get());
// --- Turn 2: App provides initial ticket_id (simulating async tool completion) ---
System.out.println("\n--- Turn 2: App provides ticket_id ---");
String ticketId = "TICKET-" + UUID.randomUUID().toString().substring(0, 8).toUpperCase();
FunctionResponse ticketCreatedFuncResponse =
FunctionResponse.builder()
.name("create_ticket_long_running")
.id(funcCallIdRef.get())
.response(ImmutableMap.of("ticket_id", ticketId))
.build();
Content appResponseWithTicketId =
Content.builder()
.parts(
ImmutableList.of(
Part.builder().functionResponse(ticketCreatedFuncResponse).build()))
.role("user")
.build();
runner
.runAsync(USER_ID, session.id(), appResponseWithTicketId)
.blockingForEach(event -> printEventSummary(event, "T2"));
System.out.println("ACTION: Sent ticket_id " + ticketId + " to agent.");
// --- Turn 3: App provides ticket status update ---
System.out.println("\n--- Turn 3: App provides ticket status ---");
FunctionResponse ticketStatusFuncResponse =
FunctionResponse.builder()
.name("create_ticket_long_running")
.id(funcCallIdRef.get())
.response(ImmutableMap.of("status", "approved", "ticket_id", ticketId))
.build();
Content appResponseWithStatus =
Content.builder()
.parts(
ImmutableList.of(Part.builder().functionResponse(ticketStatusFuncResponse).build()))
.role("user")
.build();
runner
.runAsync(USER_ID, session.id(), appResponseWithStatus)
.blockingForEach(event -> printEventSummary(event, "T3_FINAL"));
System.out.println("Long running function completed successfully.");
}
private static void printEventSummary(Event event, String turnLabel) {
event
.content()
.ifPresent(
content -> {
String text =
content.parts().orElse(ImmutableList.of()).stream()
.map(part -> part.text().orElse(""))
.filter(s -> !s.isEmpty())
.collect(Collectors.joining(" "));
if (!text.isEmpty()) {
System.out.printf("[%s][%s_TEXT]: %s%n", turnLabel, event.author(), text);
}
content.parts().orElse(ImmutableList.of()).stream()
.map(Part::functionCall)
.flatMap(Optional::stream)
.findFirst() // Assuming one function call per relevant event for simplicity
.ifPresent(
fc ->
System.out.printf(
"[%s][%s_CALL]: %s(%s) ID: %s%n",
turnLabel,
event.author(),
fc.name().orElse("N/A"),
fc.args().orElse(ImmutableMap.of()),
fc.id().orElse("N/A")));
});
}
}
```
Python complete example: Reimbursement approval workflow
```python
# Copyright 2025 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import asyncio
from typing import Any
from google.adk.agents import Agent
from google.adk.events import Event
from google.adk.runners import Runner
from google.adk.tools import LongRunningFunctionTool
from google.adk.sessions import InMemorySessionService
from google.genai import types
# 1. Define the long running function
def ask_for_approval(
purpose: str, amount: float
) -> dict[str, Any]:
"""Ask for approval for the reimbursement."""
# create a ticket for the approval
# Send a notification to the approver with the link of the ticket
return {'status': 'pending', 'approver': 'Sean Zhou', 'purpose': purpose, 'amount': amount, 'ticket-id': 'approval-ticket-1'}
def reimburse(purpose: str, amount: float) -> dict[str, str]:
"""Reimburse the amount of money to the employee."""
# send the reimbursement request to payment vendor
return {'status': 'ok'}
# 2. Wrap the function with LongRunningFunctionTool
long_running_tool = LongRunningFunctionTool(func=ask_for_approval)
# 3. Use the tool in an Agent
file_processor_agent = Agent(
# Use a model compatible with function calling
model="gemini-2.0-flash",
name='reimbursement_agent',
instruction="""
You are an agent whose job is to handle the reimbursement process for
the employees. If the amount is less than $100, you will automatically
approve the reimbursement.
If the amount is greater than $100, you will
ask for approval from the manager. If the manager approves, you will
call reimburse() to reimburse the amount to the employee. If the manager
rejects, you will inform the employee of the rejection.
""",
tools=[reimburse, long_running_tool]
)
APP_NAME = "human_in_the_loop"
USER_ID = "1234"
SESSION_ID = "session1234"
# Session and Runner
async def setup_session_and_runner():
session_service = InMemorySessionService()
session = await session_service.create_session(app_name=APP_NAME, user_id=USER_ID, session_id=SESSION_ID)
runner = Runner(agent=file_processor_agent, app_name=APP_NAME, session_service=session_service)
return session, runner
# Agent Interaction
async def call_agent_async(query):
def get_long_running_function_call(event: Event) -> types.FunctionCall:
# Get the long running function call from the event
if not event.long_running_tool_ids or not event.content or not event.content.parts:
return
for part in event.content.parts:
if (
part
and part.function_call
and event.long_running_tool_ids
and part.function_call.id in event.long_running_tool_ids
):
return part.function_call
def get_function_response(event: Event, function_call_id: str) -> types.FunctionResponse:
# Get the function response for the function call with the specified id.
if not event.content or not event.content.parts:
return
for part in event.content.parts:
if (
part
and part.function_response
and part.function_response.id == function_call_id
):
return part.function_response
content = types.Content(role='user', parts=[types.Part(text=query)])
session, runner = await setup_session_and_runner()
print("\nRunning agent...")
events_async = runner.run_async(
session_id=session.id, user_id=USER_ID, new_message=content
)
long_running_function_call, long_running_function_response, ticket_id = None, None, None
async for event in events_async:
# Use helper to check for the specific auth request event
if not long_running_function_call:
long_running_function_call = get_long_running_function_call(event)
else:
_potential_response = get_function_response(event, long_running_function_call.id)
if _potential_response: # Only update if we get a non-None response
long_running_function_response = _potential_response
ticket_id = long_running_function_response.response['ticket-id']
if event.content and event.content.parts:
if text := ''.join(part.text or '' for part in event.content.parts):
print(f'[{event.author}]: {text}')
if long_running_function_response:
# query the status of the corresponding ticket via ticket_id
# send back an intermediate / final response
updated_response = long_running_function_response.model_copy(deep=True)
updated_response.response = {'status': 'approved'}
async for event in runner.run_async(
session_id=session.id, user_id=USER_ID, new_message=types.Content(parts=[types.Part(function_response = updated_response)], role='user')
):
if event.content and event.content.parts:
if text := ''.join(part.text or '' for part in event.content.parts):
print(f'[{event.author}]: {text}')
# Note: In Colab, you can directly use 'await' at the top level.
# If running this code as a standalone Python script, you'll need to use asyncio.run() or manage the event loop.
# reimbursement that doesn't require approval
# asyncio.run(call_agent_async("Please reimburse 50$ for meals"))
await call_agent_async("Please reimburse 50$ for meals")  # In a notebook; in a script, use the asyncio.run() line above instead
# reimbursement that requires approval
# asyncio.run(call_agent_async("Please reimburse 200$ for meals"))
await call_agent_async("Please reimburse 200$ for meals")  # In a notebook; in a script, use the asyncio.run() line above instead
```
#### Key aspects of this example
- **`LongRunningFunctionTool`**: Wraps the supplied function. Its initial return value (for example, a `pending` status) is sent to the LLM as a `FunctionResponse` without completing the operation; intermediate and final updates arrive as subsequent `FunctionResponse` messages.
- **Agent instruction**: Directs the LLM to use the tool and to interpret the incoming `FunctionResponse` stream (progress versus completion) when updating the user.
- **Final response**: A `FunctionResponse` carrying the final result (such as `{'status': 'approved'}`) signals that the long-running operation is complete.
## Agent-as-a-Tool
This powerful feature allows you to leverage the capabilities of other agents within your system by calling them as tools. The Agent-as-a-Tool enables you to invoke another agent to perform a specific task, effectively **delegating responsibility**. This is conceptually similar to creating a Python function that calls another agent and uses the agent's response as the function's return value.
### Key difference from sub-agents
It's important to distinguish an Agent-as-a-Tool from a Sub-Agent.
- **Agent-as-a-Tool:** When Agent A calls Agent B as a tool (using Agent-as-a-Tool), Agent B's answer is **passed back** to Agent A, which then summarizes the answer and generates a response to the user. Agent A retains control and continues to handle future user input.
- **Sub-agent:** When Agent A calls Agent B as a sub-agent, the responsibility of answering the user is completely **transferred to Agent B**. Agent A is effectively out of the loop. All subsequent user input will be answered by Agent B.
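The distinction above can be made concrete with plain Python. This is a conceptual sketch only, with no ADK types: `agent_b_answer` stands in for invoking the wrapped agent. The point is that the "tool" call returns Agent B's answer to Agent A, which stays in control and shapes the final reply.

```python
def agent_b_answer(request: str) -> str:
    # Stand-in for invoking Agent B; a real AgentTool would run the wrapped agent.
    return f"Agent B's answer to: {request!r}"

def agent_a_respond(user_input: str) -> str:
    # Agent-as-a-Tool: Agent B's answer is passed BACK to Agent A...
    tool_result = agent_b_answer(user_input)
    # ...and Agent A keeps control, producing the response to the user.
    return f"Based on the tool result ({tool_result}), here is my reply."
```

In the sub-agent case, by contrast, there would be no return to Agent A: control would transfer to Agent B entirely.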
### Usage
To use an agent as a tool, wrap the agent with the AgentTool class.
```python
tools=[AgentTool(agent=agent_b)]
```
```typescript
tools: [new AgentTool({agent: agentB})]
```
```go
agenttool.New(agent, &agenttool.Config{...})
```
```java
AgentTool.create(agent)
```
### Customization
The `AgentTool` class provides the following attributes for customizing its behavior:
- **skip_summarization: bool:** If set to True, the framework will **bypass the LLM-based summarization** of the tool agent's response. This can be useful when the tool's response is already well-formatted and requires no further processing.
Example
```python
# Copyright 2025 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from google.adk.agents import Agent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.adk.tools.agent_tool import AgentTool
from google.genai import types
APP_NAME="summary_agent"
USER_ID="user1234"
SESSION_ID="1234"
summary_agent = Agent(
model="gemini-2.0-flash",
name="summary_agent",
instruction="""You are an expert summarizer. Please read the following text and provide a concise summary.""",
description="Agent to summarize text",
)
root_agent = Agent(
model='gemini-2.0-flash',
name='root_agent',
instruction="""You are a helpful assistant. When the user provides a text, use the 'summarize' tool to generate a summary. Always forward the user's message exactly as received to the 'summarize' tool, without modifying or summarizing it yourself. Present the response from the tool to the user.""",
tools=[AgentTool(agent=summary_agent, skip_summarization=True)]
)
# Session and Runner
async def setup_session_and_runner():
session_service = InMemorySessionService()
session = await session_service.create_session(app_name=APP_NAME, user_id=USER_ID, session_id=SESSION_ID)
runner = Runner(agent=root_agent, app_name=APP_NAME, session_service=session_service)
return session, runner
# Agent Interaction
async def call_agent_async(query):
content = types.Content(role='user', parts=[types.Part(text=query)])
session, runner = await setup_session_and_runner()
events = runner.run_async(user_id=USER_ID, session_id=SESSION_ID, new_message=content)
async for event in events:
if event.is_final_response():
final_response = event.content.parts[0].text
print("Agent Response: ", final_response)
long_text = """Quantum computing represents a fundamentally different approach to computation,
leveraging the bizarre principles of quantum mechanics to process information. Unlike classical computers
that rely on bits representing either 0 or 1, quantum computers use qubits which can exist in a state of superposition - effectively
being 0, 1, or a combination of both simultaneously. Furthermore, qubits can become entangled,
meaning their fates are intertwined regardless of distance, allowing for complex correlations. This parallelism and
interconnectedness grant quantum computers the potential to solve specific types of incredibly complex problems - such
as drug discovery, materials science, complex system optimization, and breaking certain types of cryptography - far
faster than even the most powerful classical supercomputers could ever achieve, although the technology is still largely in its developmental stages."""
# Note: In Colab, you can directly use 'await' at the top level.
# If running this code as a standalone Python script, you'll need to use asyncio.run() or manage the event loop.
await call_agent_async(long_text)
```
```typescript
/**
* Copyright 2025 Google LLC
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import {
AgentTool,
InMemoryRunner,
LlmAgent,
} from '@google/adk';
import {Part, createUserContent} from '@google/genai';
/**
* This example demonstrates how to use an agent as a tool.
*/
async function main() {
// Define the summarization agent that will be used as a tool
const summaryAgent = new LlmAgent({
name: 'summary_agent',
model: 'gemini-2.5-flash',
description: 'Agent to summarize text',
instruction:
'You are an expert summarizer. Please read the following text and provide a concise summary.',
});
// Define the main agent that uses the summarization agent as a tool.
// skipSummarization is set to true, so the main_agent will directly output
// the result from the summary_agent without further processing.
const mainAgent = new LlmAgent({
name: 'main_agent',
model: 'gemini-2.5-flash',
instruction:
"You are a helpful assistant. When the user provides a text, use the 'summary_agent' tool to generate a summary. Always forward the user's message exactly as received to the 'summary_agent' tool, without modifying or summarizing it yourself. Present the response from the tool to the user.",
tools: [new AgentTool({agent: summaryAgent, skipSummarization: true})],
});
const appName = 'agent-as-a-tool-app';
const runner = new InMemoryRunner({agent: mainAgent, appName});
const longText = `Quantum computing represents a fundamentally different approach to computation,
leveraging the bizarre principles of quantum mechanics to process information. Unlike classical computers
that rely on bits representing either 0 or 1, quantum computers use qubits which can exist in a state of superposition - effectively
being 0, 1, or a combination of both simultaneously. Furthermore, qubits can become entangled,
meaning their fates are intertwined regardless of distance, allowing for complex correlations. This parallelism and
interconnectedness grant quantum computers the potential to solve specific types of incredibly complex problems - such
as drug discovery, materials science, complex system optimization, and breaking certain types of cryptography - far
faster than even the most powerful classical supercomputers could ever achieve, although the technology is still largely in its developmental stages.`;
// Create the session before running the agent
await runner.sessionService.createSession({
appName,
userId: 'user1',
sessionId: 'session1',
});
// Run the agent with the long text to summarize
const events = runner.runAsync({
userId: 'user1',
sessionId: 'session1',
newMessage: createUserContent(longText),
});
// Print the final response from the agent
console.log('Agent Response:');
for await (const event of events) {
if (event.content?.parts?.length) {
const responsePart = event.content.parts.find((p: Part) => p.functionResponse);
if (responsePart && responsePart.functionResponse) {
console.log(responsePart.functionResponse.response);
}
}
}
}
main();
```
```go
import (
	"context"
	"fmt"
	"log"

	"google.golang.org/adk/agent"
	"google.golang.org/adk/agent/llmagent"
	"google.golang.org/adk/model/gemini"
	"google.golang.org/adk/tool"
	"google.golang.org/adk/tool/agenttool"
	"google.golang.org/genai"
)
// createSummarizerAgent creates an agent whose sole purpose is to summarize text.
func createSummarizerAgent(ctx context.Context) (agent.Agent, error) {
model, err := gemini.NewModel(ctx, "gemini-2.5-flash", &genai.ClientConfig{})
if err != nil {
return nil, err
}
return llmagent.New(llmagent.Config{
Name: "SummarizerAgent",
Model: model,
Instruction: "You are an expert at summarizing text. Take the user's input and provide a concise summary.",
Description: "An agent that summarizes text.",
})
}
// createMainAgent creates the primary agent that will use the summarizer agent as a tool.
func createMainAgent(ctx context.Context, tools ...tool.Tool) (agent.Agent, error) {
model, err := gemini.NewModel(ctx, "gemini-2.5-flash", &genai.ClientConfig{})
if err != nil {
return nil, err
}
return llmagent.New(llmagent.Config{
Name: "MainAgent",
Model: model,
Instruction: "You are a helpful assistant. If you are asked to summarize a long text, use the 'summarize' tool. " +
"After getting the summary, present it to the user by saying 'Here is a summary of the text:'.",
Description: "The main agent that can delegate tasks.",
Tools: tools,
})
}
func RunAgentAsToolSimulation() {
ctx := context.Background()
// 1. Create the Tool Agent (Summarizer)
summarizerAgent, err := createSummarizerAgent(ctx)
if err != nil {
log.Fatalf("Failed to create summarizer agent: %v", err)
}
// 2. Wrap the Tool Agent in an AgentTool
summarizeTool := agenttool.New(summarizerAgent, &agenttool.Config{
SkipSummarization: true,
})
// 3. Create the Main Agent and provide it with the AgentTool
mainAgent, err := createMainAgent(ctx, summarizeTool)
if err != nil {
log.Fatalf("Failed to create main agent: %v", err)
}
// 4. Run the main agent
prompt := `
Please summarize this text for me:
Quantum computing represents a fundamentally different approach to computation,
leveraging the bizarre principles of quantum mechanics to process information. Unlike classical computers
that rely on bits representing either 0 or 1, quantum computers use qubits which can exist in a state of superposition - effectively
being 0, 1, or a combination of both simultaneously. Furthermore, qubits can become entangled,
meaning their fates are intertwined regardless of distance, allowing for complex correlations. This parallelism and
interconnectedness grant quantum computers the potential to solve specific types of incredibly complex problems - such
as drug discovery, materials science, complex system optimization, and breaking certain types of cryptography - far
faster than even the most powerful classical supercomputers could ever achieve, although the technology is still largely in its developmental stages.
`
fmt.Printf("\nPrompt: %s\nResponse: ", prompt)
callAgent(context.Background(), mainAgent, prompt)
fmt.Println("\n---")
}
```
```java
import com.google.adk.agents.LlmAgent;
import com.google.adk.events.Event;
import com.google.adk.runner.InMemoryRunner;
import com.google.adk.sessions.Session;
import com.google.adk.tools.AgentTool;
import com.google.genai.types.Content;
import com.google.genai.types.Part;
import io.reactivex.rxjava3.core.Flowable;
public class AgentToolCustomization {
private static final String APP_NAME = "summary_agent";
private static final String USER_ID = "user1234";
public static void initAgentAndRun(String prompt) {
LlmAgent summaryAgent =
LlmAgent.builder()
.model("gemini-2.0-flash")
.name("summaryAgent")
.instruction(
"You are an expert summarizer. Please read the following text and provide a concise summary.")
.description("Agent to summarize text")
.build();
// Define root_agent
LlmAgent rootAgent =
LlmAgent.builder()
.model("gemini-2.0-flash")
.name("rootAgent")
.instruction(
"You are a helpful assistant. When the user provides a text, always use the 'summaryAgent' tool to generate a summary. Always forward the user's message exactly as received to the 'summaryAgent' tool, without modifying or summarizing it yourself. Present the response from the tool to the user.")
.description("Assistant agent")
.tools(AgentTool.create(summaryAgent, true)) // Set skipSummarization to true
.build();
// Create an InMemoryRunner
InMemoryRunner runner = new InMemoryRunner(rootAgent, APP_NAME);
// InMemoryRunner automatically creates a session service. Create a session using the service
Session session = runner.sessionService().createSession(APP_NAME, USER_ID).blockingGet();
Content userMessage = Content.fromParts(Part.fromText(prompt));
// Run the agent
Flowable<Event> eventStream = runner.runAsync(USER_ID, session.id(), userMessage);
// Stream event response
eventStream.blockingForEach(
event -> {
if (event.finalResponse()) {
System.out.println(event.stringifyContent());
}
});
}
public static void main(String[] args) {
String longText =
"""
Quantum computing represents a fundamentally different approach to computation,
leveraging the bizarre principles of quantum mechanics to process information. Unlike classical computers
that rely on bits representing either 0 or 1, quantum computers use qubits which can exist in a state of superposition - effectively
being 0, 1, or a combination of both simultaneously. Furthermore, qubits can become entangled,
meaning their fates are intertwined regardless of distance, allowing for complex correlations. This parallelism and
interconnectedness grant quantum computers the potential to solve specific types of incredibly complex problems - such
as drug discovery, materials science, complex system optimization, and breaking certain types of cryptography - far
faster than even the most powerful classical supercomputers could ever achieve, although the technology is still largely in its developmental stages.""";
initAgentAndRun(longText);
}
}
```
### How it works
1. When the `main_agent` receives the long text, its instruction tells it to use the 'summarize' tool for long texts.
1. The framework recognizes 'summarize' as an `AgentTool` that wraps the `summary_agent`.
1. Behind the scenes, the `main_agent` will call the `summary_agent` with the long text as input.
1. The `summary_agent` will process the text according to its instruction and generate a summary.
1. **The response from the `summary_agent` is then passed back to the `main_agent`.**
1. The `main_agent` can then take the summary and formulate its final response to the user (e.g., "Here's a summary of the text: ..."). When `skip_summarization` is set, as in the examples above, the tool's response is surfaced to the user directly instead of being reprocessed by the `main_agent`.
# Model Context Protocol Tools
Supported in ADK: Python v0.1.0, TypeScript v0.2.0, Go v0.1.0, Java v0.1.0
This guide walks you through two ways of integrating Model Context Protocol (MCP) with ADK.
MCP tools for ADK
For a list of pre-built MCP tools for ADK, see [Tools and Integrations](/adk-docs/integrations/?topic=mcp).
## What is Model Context Protocol (MCP)?
The Model Context Protocol (MCP) is an open standard designed to standardize how Large Language Models (LLMs) like Gemini and Claude communicate with external applications, data sources, and tools. Think of it as a universal connection mechanism that simplifies how LLMs obtain context, execute actions, and interact with various systems.
MCP follows a client-server architecture, defining how **data** (resources), **interactive templates** (prompts), and **actionable functions** (tools) are exposed by an **MCP server** and consumed by an **MCP client** (which could be an LLM host application or an AI agent).
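As a rough illustration of this client-server exchange: MCP messages are JSON-RPC 2.0 requests, where a client lists a server's tools with the `tools/list` method and invokes one with `tools/call`. The sketch below shows simplified message shapes; the tool name and arguments are hypothetical, and the full schemas are defined in the MCP specification.

```python
import json

# Simplified sketches of the JSON-RPC 2.0 messages MCP exchanges.
# Field shapes are abbreviated; the tool name and arguments are hypothetical.
list_tools_request = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}

call_tool_request = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {
        "name": "read_file",                  # a tool advertised by the server
        "arguments": {"path": "sample.txt"},  # arguments per its input schema
    },
}

print(json.dumps(call_tool_request, indent=2))
```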
This guide covers two primary integration patterns:
1. **Using Existing MCP Servers within ADK:** An ADK agent acts as an MCP client, leveraging tools provided by external MCP servers.
1. **Exposing ADK Tools via an MCP Server:** Building an MCP server that wraps ADK tools, making them accessible to any MCP client.
## Prerequisites
Before you begin, ensure you have the following set up:
- **Set up ADK:** Follow the standard ADK [setup instructions](https://google.github.io/adk-docs/get-started/index.md) in the quickstart.
- **Install/update Python or Java:** MCP requires Python 3.9 or higher, or Java 17 or higher.
- **Set up Node.js and npx:** **(Python only)** Many community MCP servers are distributed as Node.js packages and run using `npx`. Install Node.js (which includes npx) if you haven't already.
- **Verify Installations:** **(Python only)** Confirm `adk` and `npx` are in your PATH within the activated virtual environment:
```shell
# Both commands should print the path to the executables.
which adk
which npx
```
## 1. Using MCP servers with ADK agents (ADK as an MCP client) in `adk web`
This section demonstrates how to integrate tools from external MCP (Model Context Protocol) servers into your ADK agents. This is the **most common** integration pattern when your ADK agent needs to use capabilities provided by an existing service that exposes an MCP interface. You will see how the `McpToolset` class can be directly added to your agent's `tools` list, enabling seamless connection to an MCP server, discovery of its tools, and making them available for your agent to use. These examples primarily focus on interactions within the `adk web` development environment.
### `McpToolset` class
The `McpToolset` class is ADK's primary mechanism for integrating tools from an MCP server. When you include an `McpToolset` instance in your agent's `tools` list, it automatically handles the interaction with the specified MCP server. Here's how it works:
1. **Connection Management:** On initialization, `McpToolset` establishes and manages the connection to the MCP server. This can be a local server process (using `StdioConnectionParams` for communication over standard input/output) or a remote server (using `SseConnectionParams` for Server-Sent Events). The toolset also handles the graceful shutdown of this connection when the agent or application terminates.
1. **Tool Discovery & Adaptation:** Once connected, `McpToolset` queries the MCP server for its available tools (via the `list_tools` MCP method). It then converts the schemas of these discovered MCP tools into ADK-compatible `BaseTool` instances.
1. **Exposure to Agent:** These adapted tools are then made available to your `LlmAgent` as if they were native ADK tools.
1. **Proxying Tool Calls:** When your `LlmAgent` decides to use one of these tools, `McpToolset` transparently proxies the call (using the `call_tool` MCP method) to the MCP server, sends the necessary arguments, and returns the server's response back to the agent.
1. **Filtering (Optional):** You can use the `tool_filter` parameter when creating an `McpToolset` to select a specific subset of tools from the MCP server, rather than exposing all of them to your agent.
The following examples demonstrate how to use `McpToolset` within the `adk web` development environment. For scenarios where you need more fine-grained control over the MCP connection lifecycle or are not using `adk web`, refer to the "Using MCP Tools in your own Agent out of `adk web`" section later in this page.
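The discovery-and-proxy behavior described above can be sketched conceptually in plain Python. This toy stand-in (all names here are illustrative, not the real ADK or MCP APIs) mirrors the flow: query the server for its tools, apply an optional filter, then proxy tool calls back to the server.

```python
class FakeMcpServer:
    """A stand-in for an MCP server (illustrative only)."""

    def list_tools(self):
        return [{"name": "list_directory"}, {"name": "read_file"}]

    def call_tool(self, name, arguments):
        return {"tool": name, "args": arguments}


class ToyToolset:
    """A conceptual sketch of what McpToolset does, not the real class."""

    def __init__(self, server, tool_filter=None):
        self.server = server
        # Tool discovery: ask the server what it offers, then filter.
        self.tools = [
            t for t in server.list_tools()
            if tool_filter is None or t["name"] in tool_filter
        ]

    def run(self, name, **arguments):
        # Proxying: forward the call to the server and return its response.
        return self.server.call_tool(name, arguments)


toolset = ToyToolset(FakeMcpServer(), tool_filter=["read_file"])
print([t["name"] for t in toolset.tools])
print(toolset.run("read_file", path="notes.txt"))
```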
### Example 1: File System MCP Server
This Python example demonstrates connecting to a local MCP server that provides file system operations.
#### Step 1: Define your Agent with `McpToolset`
Create an `agent.py` file (e.g., in `./adk_agent_samples/mcp_agent/agent.py`). The `McpToolset` is instantiated directly within the `tools` list of your `LlmAgent`.
- **Important:** Replace `"/path/to/your/folder"` in the `args` list with the **absolute path** to an actual folder on your local system that the MCP server can access.
- **Important:** Place the `.env` file in the parent directory of the `./adk_agent_samples` directory.
```python
# ./adk_agent_samples/mcp_agent/agent.py
import os # Required for path operations
from google.adk.agents import LlmAgent
from google.adk.tools.mcp_tool import McpToolset
from google.adk.tools.mcp_tool.mcp_session_manager import StdioConnectionParams
from mcp import StdioServerParameters
# It's good practice to define paths dynamically if possible,
# or ensure the user understands the need for an ABSOLUTE path.
# For this example, we'll construct a path relative to this file,
# assuming '/path/to/your/folder' is in the same directory as agent.py.
# REPLACE THIS with an actual absolute path if needed for your setup.
TARGET_FOLDER_PATH = os.path.join(os.path.dirname(os.path.abspath(__file__)), "/path/to/your/folder")
# Ensure TARGET_FOLDER_PATH is an absolute path for the MCP server.
# If you created ./adk_agent_samples/mcp_agent/your_folder, use that folder's absolute path here.
root_agent = LlmAgent(
    model='gemini-2.0-flash',
    name='filesystem_assistant_agent',
    instruction='Help the user manage their files. You can list files, read files, etc.',
    tools=[
        McpToolset(
            connection_params=StdioConnectionParams(
                server_params=StdioServerParameters(
                    command='npx',
                    args=[
                        "-y",  # Argument for npx to auto-confirm install
                        "@modelcontextprotocol/server-filesystem",
                        # IMPORTANT: This MUST be an ABSOLUTE path to a folder the
                        # npx process can access.
                        # Replace with a valid absolute path on your system.
                        # For example: "/Users/youruser/accessible_mcp_files"
                        # or use a dynamically constructed absolute path:
                        os.path.abspath(TARGET_FOLDER_PATH),
                    ],
                ),
            ),
            # Optional: Filter which tools from the MCP server are exposed
            # tool_filter=['list_directory', 'read_file']
        )
    ],
)
```
#### Step 2: Create an `__init__.py` file
Ensure you have an `__init__.py` in the same directory as `agent.py` to make it a discoverable Python package for ADK.
```python
# ./adk_agent_samples/mcp_agent/__init__.py
from . import agent
```
#### Step 3: Run `adk web` and Interact
Navigate to the parent directory of `mcp_agent` (e.g., `adk_agent_samples`) in your terminal and run:
```shell
cd ./adk_agent_samples # Or your equivalent parent directory
adk web
```
Note for Windows users
If you hit the `_make_subprocess_transport NotImplementedError`, consider using `adk web --no-reload` instead.
Once the ADK Web UI loads in your browser:
1. Select the `filesystem_assistant_agent` from the agent dropdown.
1. Try prompts like:
- "List files in the current directory."
- "Can you read the file named sample.txt?" (assuming you created it in `TARGET_FOLDER_PATH`).
- "What is the content of `another_file.md`?"
You should see the agent interacting with the MCP file system server, and the server's responses (file listings, file content) relayed through the agent. The `adk web` console (terminal where you ran the command) might also show logs from the `npx` process if it outputs to stderr.
For Java, refer to the following sample to define an agent that initializes the `McpToolset`:
```java
package agents;
import com.google.adk.JsonBaseModel;
import com.google.adk.agents.LlmAgent;
import com.google.adk.agents.RunConfig;
import com.google.adk.runner.InMemoryRunner;
import com.google.adk.tools.mcp.McpTool;
import com.google.adk.tools.mcp.McpToolset;
import com.google.adk.tools.mcp.McpToolset.McpToolsAndToolsetResult;
import com.google.genai.types.Content;
import com.google.genai.types.Part;
import io.modelcontextprotocol.client.transport.ServerParameters;
import java.util.List;
import java.util.concurrent.CompletableFuture;
public class McpAgentCreator {
/**
* Initializes an McpToolset, retrieves tools from an MCP server using stdio,
* creates an LlmAgent with these tools, sends a prompt to the agent,
* and ensures the toolset is closed.
* @param args Command line arguments (not used).
*/
public static void main(String[] args) {
//Note: you may have permissions issues if the folder is outside home
String yourFolderPath = "~/path/to/folder";
ServerParameters connectionParams = ServerParameters.builder("npx")
.args(List.of(
"-y",
"@modelcontextprotocol/server-filesystem",
yourFolderPath
))
.build();
try {
CompletableFuture<McpToolsAndToolsetResult> futureResult =
McpToolset.fromServer(connectionParams, JsonBaseModel.getMapper());
McpToolsAndToolsetResult result = futureResult.join();
try (McpToolset toolset = result.getToolset()) {
List<McpTool> tools = result.getTools();
LlmAgent agent = LlmAgent.builder()
.model("gemini-2.0-flash")
.name("enterprise_assistant")
.description("An agent to help users access their file systems")
.instruction(
"Help user accessing their file systems. You can list files in a directory."
)
.tools(tools)
.build();
System.out.println("Agent created: " + agent.name());
InMemoryRunner runner = new InMemoryRunner(agent);
String userId = "user123";
String sessionId = "1234";
String promptText = "Which files are in this directory - " + yourFolderPath + "?";
// Explicitly create the session first
try {
// appName for InMemoryRunner defaults to agent.name() if not specified in constructor
runner.sessionService().createSession(runner.appName(), userId, null, sessionId).blockingGet();
System.out.println("Session created: " + sessionId + " for user: " + userId);
} catch (Exception sessionCreationException) {
System.err.println("Failed to create session: " + sessionCreationException.getMessage());
sessionCreationException.printStackTrace();
return;
}
Content promptContent = Content.fromParts(Part.fromText(promptText));
System.out.println("\nSending prompt: \"" + promptText + "\" to agent...\n");
runner.runAsync(userId, sessionId, promptContent, RunConfig.builder().build())
.blockingForEach(event -> {
System.out.println("Event received: " + event.toJson());
});
}
} catch (Exception e) {
System.err.println("An error occurred: " + e.getMessage());
e.printStackTrace();
}
}
}
```
Assuming a folder containing three files named `first`, `second`, and `third`, a successful response will look like this:
```shell
Event received: {"id":"163a449e-691a-48a2-9e38-8cadb6d1f136","invocationId":"e-c2458c56-e57a-45b2-97de-ae7292e505ef","author":"enterprise_assistant","content":{"parts":[{"functionCall":{"id":"adk-388b4ac2-d40e-4f6a-bda6-f051110c6498","args":{"path":"~/home-test"},"name":"list_directory"}}],"role":"model"},"actions":{"stateDelta":{},"artifactDelta":{},"requestedAuthConfigs":{}},"timestamp":1747377543788}
Event received: {"id":"8728380b-bfad-4d14-8421-fa98d09364f1","invocationId":"e-c2458c56-e57a-45b2-97de-ae7292e505ef","author":"enterprise_assistant","content":{"parts":[{"functionResponse":{"id":"adk-388b4ac2-d40e-4f6a-bda6-f051110c6498","name":"list_directory","response":{"text_output":[{"text":"[FILE] first\n[FILE] second\n[FILE] third"}]}}}],"role":"user"},"actions":{"stateDelta":{},"artifactDelta":{},"requestedAuthConfigs":{}},"timestamp":1747377544679}
Event received: {"id":"8fe7e594-3e47-4254-8b57-9106ad8463cb","invocationId":"e-c2458c56-e57a-45b2-97de-ae7292e505ef","author":"enterprise_assistant","content":{"parts":[{"text":"There are three files in the directory: first, second, and third."}],"role":"model"},"actions":{"stateDelta":{},"artifactDelta":{},"requestedAuthConfigs":{}},"timestamp":1747377544689}
```
For TypeScript, you can define an agent that initializes the `MCPToolset` as follows:
```typescript
import 'dotenv/config';
import {LlmAgent, MCPToolset} from "@google/adk";
// REPLACE THIS with an actual absolute path for your setup.
const TARGET_FOLDER_PATH = "/path/to/your/folder";
export const rootAgent = new LlmAgent({
model: "gemini-2.5-flash",
name: "filesystem_assistant_agent",
instruction: "Help the user manage their files. You can list files, read files, etc.",
tools: [
// To filter tools, pass a list of tool names as the second argument
// to the MCPToolset constructor.
// e.g., new MCPToolset(connectionParams, ['list_directory', 'read_file'])
new MCPToolset(
{
type: "StdioConnectionParams",
serverParams: {
command: "npx",
args: [
"-y",
"@modelcontextprotocol/server-filesystem",
// IMPORTANT: This MUST be an ABSOLUTE path to a folder the
// npx process can access.
// Replace with a valid absolute path on your system.
// For example: "/Users/youruser/accessible_mcp_files"
TARGET_FOLDER_PATH,
],
},
}
)
],
});
```
### Example 2: Google Maps MCP Server
This example demonstrates connecting to the Google Maps MCP server.
#### Step 1: Get API Key and Enable APIs
1. **Google Maps API Key:** Follow the directions at [Use API keys](https://developers.google.com/maps/documentation/javascript/get-api-key#create-api-keys) to obtain a Google Maps API Key.
1. **Enable APIs:** In your Google Cloud project, ensure the following APIs are enabled:
- Directions API
- Routes API
For instructions, see the [Getting started with Google Maps Platform](https://developers.google.com/maps/get-started#enable-api-sdk) documentation.
#### Step 2: Define your Agent with `McpToolset` for Google Maps
Modify your `agent.py` file (e.g., in `./adk_agent_samples/mcp_agent/agent.py`). Replace `YOUR_GOOGLE_MAPS_API_KEY` with the actual API key you obtained.
```python
# ./adk_agent_samples/mcp_agent/agent.py
import os
from google.adk.agents import LlmAgent
from google.adk.tools.mcp_tool import McpToolset
from google.adk.tools.mcp_tool.mcp_session_manager import StdioConnectionParams
from mcp import StdioServerParameters
# Retrieve the API key from an environment variable or directly insert it.
# Using an environment variable is generally safer.
# Ensure this environment variable is set in the terminal where you run 'adk web'.
# Example: export GOOGLE_MAPS_API_KEY="YOUR_ACTUAL_KEY"
google_maps_api_key = os.environ.get("GOOGLE_MAPS_API_KEY")
if not google_maps_api_key:
    # Fallback or direct assignment for testing - NOT RECOMMENDED FOR PRODUCTION
    google_maps_api_key = "YOUR_GOOGLE_MAPS_API_KEY_HERE"  # Replace if not using env var
    if google_maps_api_key == "YOUR_GOOGLE_MAPS_API_KEY_HERE":
        print("WARNING: GOOGLE_MAPS_API_KEY is not set. Please set it as an environment variable or in the script.")
        # You might want to raise an error or exit if the key is crucial and not found.

root_agent = LlmAgent(
    model='gemini-2.0-flash',
    name='maps_assistant_agent',
    instruction='Help the user with mapping, directions, and finding places using Google Maps tools.',
    tools=[
        McpToolset(
            connection_params=StdioConnectionParams(
                server_params=StdioServerParameters(
                    command='npx',
                    args=[
                        "-y",
                        "@modelcontextprotocol/server-google-maps",
                    ],
                    # Pass the API key as an environment variable to the npx process.
                    # This is how the MCP server for Google Maps expects the key.
                    env={
                        "GOOGLE_MAPS_API_KEY": google_maps_api_key
                    }
                ),
            ),
            # You can filter for specific Maps tools if needed:
            # tool_filter=['get_directions', 'find_place_by_id']
        )
    ],
)
```
#### Step 3: Ensure `__init__.py` Exists
If you created this in Example 1, you can skip this. Otherwise, ensure you have an `__init__.py` in the `./adk_agent_samples/mcp_agent/` directory:
```python
# ./adk_agent_samples/mcp_agent/__init__.py
from . import agent
```
#### Step 4: Run `adk web` and Interact
1. **Set Environment Variable (Recommended):** Before running `adk web`, it's best to set your Google Maps API key as an environment variable in your terminal:
```shell
export GOOGLE_MAPS_API_KEY="YOUR_ACTUAL_GOOGLE_MAPS_API_KEY"
```
Replace `YOUR_ACTUAL_GOOGLE_MAPS_API_KEY` with your key.
1. **Run `adk web`**: Navigate to the parent directory of `mcp_agent` (e.g., `adk_agent_samples`) and run:
```shell
cd ./adk_agent_samples # Or your equivalent parent directory
adk web
```
1. **Interact in the UI**:
- Select the `maps_assistant_agent`.
- Try prompts like:
- "Get directions from GooglePlex to SFO."
- "Find coffee shops near Golden Gate Park."
- "What's the route from Paris, France to Berlin, Germany?"
You should see the agent use the Google Maps MCP tools to provide directions or location-based information.
For Java, refer to the following sample to define an agent that initializes the `McpToolset`:
```java
package agents;
import com.google.adk.JsonBaseModel;
import com.google.adk.agents.LlmAgent;
import com.google.adk.agents.RunConfig;
import com.google.adk.runner.InMemoryRunner;
import com.google.adk.tools.mcp.McpTool;
import com.google.adk.tools.mcp.McpToolset;
import com.google.adk.tools.mcp.McpToolset.McpToolsAndToolsetResult;
import com.google.genai.types.Content;
import com.google.genai.types.Part;
import io.modelcontextprotocol.client.transport.ServerParameters;
import java.util.List;
import java.util.Map;
import java.util.Collections;
import java.util.HashMap;
import java.util.concurrent.CompletableFuture;
public class MapsAgentCreator {
/**
* Initializes an McpToolset for Google Maps, retrieves tools,
* creates an LlmAgent, sends a map-related prompt, and closes the toolset.
* @param args Command line arguments (not used).
*/
public static void main(String[] args) {
// TODO: Replace with your actual Google Maps API key, on a project with the Places API enabled.
String googleMapsApiKey = "YOUR_GOOGLE_MAPS_API_KEY";
Map<String, String> envVariables = new HashMap<>();
envVariables.put("GOOGLE_MAPS_API_KEY", googleMapsApiKey);
ServerParameters connectionParams = ServerParameters.builder("npx")
.args(List.of(
"-y",
"@modelcontextprotocol/server-google-maps"
))
.env(Collections.unmodifiableMap(envVariables))
.build();
try {
CompletableFuture<McpToolsAndToolsetResult> futureResult =
McpToolset.fromServer(connectionParams, JsonBaseModel.getMapper());
McpToolsAndToolsetResult result = futureResult.join();
try (McpToolset toolset = result.getToolset()) {
List<McpTool> tools = result.getTools();
LlmAgent agent = LlmAgent.builder()
.model("gemini-2.0-flash")
.name("maps_assistant")
.description("Maps assistant")
.instruction("Help user with mapping and directions using available tools.")
.tools(tools)
.build();
System.out.println("Agent created: " + agent.name());
InMemoryRunner runner = new InMemoryRunner(agent);
String userId = "maps-user-" + System.currentTimeMillis();
String sessionId = "maps-session-" + System.currentTimeMillis();
String promptText = "Please give me directions to the nearest pharmacy to Madison Square Garden.";
try {
runner.sessionService().createSession(runner.appName(), userId, null, sessionId).blockingGet();
System.out.println("Session created: " + sessionId + " for user: " + userId);
} catch (Exception sessionCreationException) {
System.err.println("Failed to create session: " + sessionCreationException.getMessage());
sessionCreationException.printStackTrace();
return;
}
Content promptContent = Content.fromParts(Part.fromText(promptText));
System.out.println("\nSending prompt: \"" + promptText + "\" to agent...\n");
runner.runAsync(userId, sessionId, promptContent, RunConfig.builder().build())
.blockingForEach(event -> {
System.out.println("Event received: " + event.toJson());
});
}
} catch (Exception e) {
System.err.println("An error occurred: " + e.getMessage());
e.printStackTrace();
}
}
}
```
A successful response will look like this:
```shell
Event received: {"id":"1a4deb46-c496-4158-bd41-72702c773368","invocationId":"e-48994aa0-531c-47be-8c57-65215c3e0319","author":"maps_assistant","content":{"parts":[{"text":"OK. I see a few options. The closest one is CVS Pharmacy at 5 Pennsylvania Plaza, New York, NY 10001, United States. Would you like directions?\n"}],"role":"model"},"actions":{"stateDelta":{},"artifactDelta":{},"requestedAuthConfigs":{}},"timestamp":1747380026642}
```
For TypeScript, refer to the following sample to define an agent that initializes the `MCPToolset`:
```typescript
import 'dotenv/config';
import {LlmAgent, MCPToolset} from "@google/adk";
// Retrieve the API key from an environment variable.
// Ensure this environment variable is set in the terminal where you run 'adk web'.
// Example: export GOOGLE_MAPS_API_KEY="YOUR_ACTUAL_KEY"
const googleMapsApiKey = process.env.GOOGLE_MAPS_API_KEY;
if (!googleMapsApiKey) {
throw new Error('GOOGLE_MAPS_API_KEY is not provided, please run "export GOOGLE_MAPS_API_KEY=YOUR_ACTUAL_KEY" to add that.');
}
export const rootAgent = new LlmAgent({
model: "gemini-2.5-flash",
name: "maps_assistant_agent",
instruction: "Help the user with mapping, directions, and finding places using Google Maps tools.",
tools: [
new MCPToolset(
{
type: "StdioConnectionParams",
serverParams: {
command: "npx",
args: [
"-y",
"@modelcontextprotocol/server-google-maps",
],
// Pass the API key as an environment variable to the npx process
// This is how the MCP server for Google Maps expects the key.
env: {
"GOOGLE_MAPS_API_KEY": googleMapsApiKey
}
},
},
// You can filter for specific Maps tools if needed:
// ['get_directions', 'find_place_by_id']
)
],
});
```
## 2. Building an MCP server with ADK tools (MCP server exposing ADK)
This pattern allows you to wrap existing ADK tools and make them available to any standard MCP client application. The example in this section exposes the ADK `load_web_page` tool through a custom-built MCP server.
### Summary of steps
You will create a standard Python MCP server application using the `mcp` library. Within this server, you will:
1. Instantiate the ADK tool(s) you want to expose (e.g., `FunctionTool(load_web_page)`).
1. Implement the MCP server's `@app.list_tools()` handler to advertise the ADK tool(s). This involves converting the ADK tool definition to the MCP schema using the `adk_to_mcp_tool_type` utility from `google.adk.tools.mcp_tool.conversion_utils`.
1. Implement the MCP server's `@app.call_tool()` handler. This handler will:
- Receive tool call requests from MCP clients.
- Identify if the request targets one of your wrapped ADK tools.
- Execute the ADK tool's `.run_async()` method.
- Format the ADK tool's result into an MCP-compliant response (e.g., `mcp.types.TextContent`).
### Prerequisites
Install the MCP server library in the same Python environment as your ADK installation:
```shell
pip install mcp
```
### Step 1: Create the MCP Server Script
Create a new Python file for your MCP server, for example, `my_adk_mcp_server.py`.
### Step 2: Implement the Server Logic
Add the following code to `my_adk_mcp_server.py`. This script sets up an MCP server that exposes the ADK `load_web_page` tool.
```python
# my_adk_mcp_server.py
import asyncio
import json
import os
from dotenv import load_dotenv
# MCP Server Imports
from mcp import types as mcp_types # Use alias to avoid conflict
from mcp.server.lowlevel import Server, NotificationOptions
from mcp.server.models import InitializationOptions
import mcp.server.stdio # For running as a stdio server
# ADK Tool Imports
from google.adk.tools.function_tool import FunctionTool
from google.adk.tools.load_web_page import load_web_page # Example ADK tool
# ADK <-> MCP Conversion Utility
from google.adk.tools.mcp_tool.conversion_utils import adk_to_mcp_tool_type
# --- Load Environment Variables (If ADK tools need them, e.g., API keys) ---
load_dotenv() # Create a .env file in the same directory if needed
# --- Prepare the ADK Tool ---
# Instantiate the ADK tool you want to expose.
# This tool will be wrapped and called by the MCP server.
print("Initializing ADK load_web_page tool...")
adk_tool_to_expose = FunctionTool(load_web_page)
print(f"ADK tool '{adk_tool_to_expose.name}' initialized and ready to be exposed via MCP.")
# --- End ADK Tool Prep ---
# --- MCP Server Setup ---
print("Creating MCP Server instance...")
# Create a named MCP Server instance using the mcp.server library
app = Server("adk-tool-exposing-mcp-server")
# Implement the MCP server's handler to list available tools
@app.list_tools()
async def list_mcp_tools() -> list[mcp_types.Tool]:
"""MCP handler to list tools this server exposes."""
print("MCP Server: Received list_tools request.")
# Convert the ADK tool's definition to the MCP Tool schema format
mcp_tool_schema = adk_to_mcp_tool_type(adk_tool_to_expose)
print(f"MCP Server: Advertising tool: {mcp_tool_schema.name}")
return [mcp_tool_schema]
# Implement the MCP server's handler to execute a tool call
@app.call_tool()
async def call_mcp_tool(
name: str, arguments: dict
) -> list[mcp_types.Content]: # MCP uses mcp_types.Content
"""MCP handler to execute a tool call requested by an MCP client."""
print(f"MCP Server: Received call_tool request for '{name}' with args: {arguments}")
# Check if the requested tool name matches our wrapped ADK tool
if name == adk_tool_to_expose.name:
try:
# Execute the ADK tool's run_async method.
# Note: tool_context is None here because this MCP server is
# running the ADK tool outside of a full ADK Runner invocation.
# If the ADK tool requires ToolContext features (like state or auth),
# this direct invocation might need more sophisticated handling.
adk_tool_response = await adk_tool_to_expose.run_async(
args=arguments,
tool_context=None,
)
print(f"MCP Server: ADK tool '{name}' executed. Response: {adk_tool_response}")
# Format the ADK tool's response (often a dict) into an MCP-compliant format.
# Here, we serialize the response dictionary as a JSON string within TextContent.
# Adjust formatting based on the ADK tool's output and client needs.
response_text = json.dumps(adk_tool_response, indent=2)
# MCP expects a list of mcp_types.Content parts
return [mcp_types.TextContent(type="text", text=response_text)]
except Exception as e:
print(f"MCP Server: Error executing ADK tool '{name}': {e}")
# Return an error message in MCP format
error_text = json.dumps({"error": f"Failed to execute tool '{name}': {str(e)}"})
return [mcp_types.TextContent(type="text", text=error_text)]
else:
# Handle calls to unknown tools
print(f"MCP Server: Tool '{name}' not found/exposed by this server.")
error_text = json.dumps({"error": f"Tool '{name}' not implemented by this server."})
return [mcp_types.TextContent(type="text", text=error_text)]
# --- MCP Server Runner ---
async def run_mcp_stdio_server():
"""Runs the MCP server, listening for connections over standard input/output."""
# Use the stdio_server context manager from the mcp.server.stdio library
async with mcp.server.stdio.stdio_server() as (read_stream, write_stream):
print("MCP Stdio Server: Starting handshake with client...")
await app.run(
read_stream,
write_stream,
InitializationOptions(
server_name=app.name, # Use the server name defined above
server_version="0.1.0",
capabilities=app.get_capabilities(
# Define server capabilities - consult MCP docs for options
notification_options=NotificationOptions(),
experimental_capabilities={},
),
),
)
print("MCP Stdio Server: Run loop finished or client disconnected.")
if __name__ == "__main__":
print("Launching MCP Server to expose ADK tools via stdio...")
try:
asyncio.run(run_mcp_stdio_server())
except KeyboardInterrupt:
print("\nMCP Server (stdio) stopped by user.")
except Exception as e:
print(f"MCP Server (stdio) encountered an error: {e}")
finally:
print("MCP Server (stdio) process exiting.")
# --- End MCP Server ---
```
### Step 3: Test your Custom MCP Server with an ADK Agent
Now, create an ADK agent that will act as a client to the MCP server you just built. This ADK agent will use `McpToolset` to connect to your `my_adk_mcp_server.py` script.
Create an `agent.py` (e.g., in `./adk_agent_samples/mcp_client_agent/agent.py`):
```python
# ./adk_agent_samples/mcp_client_agent/agent.py
import os
from google.adk.agents import LlmAgent
from google.adk.tools.mcp_tool import McpToolset
from google.adk.tools.mcp_tool.mcp_session_manager import StdioConnectionParams
from mcp import StdioServerParameters
# IMPORTANT: Replace this with the ABSOLUTE path to your my_adk_mcp_server.py script
PATH_TO_YOUR_MCP_SERVER_SCRIPT = "/path/to/your/my_adk_mcp_server.py" # <<< REPLACE
if PATH_TO_YOUR_MCP_SERVER_SCRIPT == "/path/to/your/my_adk_mcp_server.py":
print("WARNING: PATH_TO_YOUR_MCP_SERVER_SCRIPT is not set. Please update it in agent.py.")
# Optionally, raise an error if the path is critical
root_agent = LlmAgent(
model='gemini-2.0-flash',
name='web_reader_mcp_client_agent',
instruction="Use the 'load_web_page' tool to fetch content from a URL provided by the user.",
tools=[
McpToolset(
connection_params=StdioConnectionParams(
server_params = StdioServerParameters(
command='python3', # Command to run your MCP server script
args=[PATH_TO_YOUR_MCP_SERVER_SCRIPT], # Argument is the path to the script
)
)
# tool_filter=['load_web_page'] # Optional: ensure only specific tools are loaded
)
],
)
```
And an `__init__.py` in the same directory:
```python
# ./adk_agent_samples/mcp_client_agent/__init__.py
from . import agent
```
**To run the test:**
1. **Start your custom MCP server (optional, for separate observation):** You can run your `my_adk_mcp_server.py` directly in one terminal to see its logs:
```shell
python3 /path/to/your/my_adk_mcp_server.py
```
It will print "Launching MCP Server..." and wait. Note that when the client agent initializes, `McpToolset` starts its own instance of the server script as a subprocess; running the script manually is mainly useful for verifying that it starts cleanly and for inspecting its behavior in isolation.
1. **Run `adk web` for the client agent:** Navigate to the parent directory of `mcp_client_agent` (e.g., `adk_agent_samples`) and run:
```shell
cd ./adk_agent_samples # Or your equivalent parent directory
adk web
```
1. **Interact in the ADK Web UI:**
- Select the `web_reader_mcp_client_agent`.
- Try a prompt like: "Load the content from https://example.com"
The ADK agent (`web_reader_mcp_client_agent`) will use `McpToolset` to start and connect to your `my_adk_mcp_server.py`. Your MCP server will receive the `call_tool` request, execute the ADK `load_web_page` tool, and return the result. The ADK agent will then relay this information. You should see logs from both the ADK Web UI (and its terminal) and potentially from your `my_adk_mcp_server.py` terminal if you ran it separately.
This example demonstrates how ADK tools can be encapsulated within an MCP server, making them accessible to a broader range of MCP-compliant clients, not just ADK agents.
Refer to the [documentation](https://modelcontextprotocol.io/quickstart/server#core-mcp-concepts) to try it out with Claude Desktop.
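As a sketch of that setup, registering this server with Claude Desktop typically means adding an entry to its `claude_desktop_config.json` (the server name here is arbitrary, and the script path is a placeholder you must replace with the absolute path to your file):

```json
{
  "mcpServers": {
    "adk-web-page-loader": {
      "command": "python3",
      "args": ["/path/to/your/my_adk_mcp_server.py"]
    }
  }
}
```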
## Using MCP Tools in your own Agent outside of `adk web`
This section is relevant to you if:
- You are developing your own agent using ADK,
- you are **not** using `adk web`, and
- you are exposing the agent via your own UI.
Using MCP tools requires a different setup than regular tools because MCP tool specifications are fetched asynchronously from the MCP server, which runs remotely or in another process.
The following example is modified from the "Example 1: File System MCP Server" example above. The main differences are:
1. Your tool and agent are created asynchronously.
1. You need to manage cleanup yourself (for example, by calling `toolset.close()` as shown below) so that your agents and tools are torn down properly when the connection to the MCP server is closed.
```python
# agent.py (modify get_tools_async and other parts as needed)
# ./adk_agent_samples/mcp_agent/agent.py
import os
import asyncio
from dotenv import load_dotenv
from google.genai import types
from google.adk.agents.llm_agent import LlmAgent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.adk.artifacts.in_memory_artifact_service import InMemoryArtifactService # Optional
from google.adk.tools.mcp_tool import McpToolset
from google.adk.tools.mcp_tool.mcp_session_manager import StdioConnectionParams
from mcp import StdioServerParameters
# Load environment variables from .env file in the parent directory
# Place this near the top, before using env vars like API keys
load_dotenv('../.env')
# Ensure TARGET_FOLDER_PATH is an absolute path for the MCP server.
# Note: the second argument must not start with "/" - os.path.join discards
# everything before an absolute path component.
TARGET_FOLDER_PATH = os.path.join(os.path.dirname(os.path.abspath(__file__)), "path/to/your/folder")
# --- Step 1: Agent Definition ---
async def get_agent_async():
"""Creates an ADK Agent equipped with tools from the MCP Server."""
toolset = McpToolset(
# Use StdioConnectionParams for local process communication
connection_params=StdioConnectionParams(
server_params = StdioServerParameters(
command='npx', # Command to run the server
args=["-y", # Arguments for the command
"@modelcontextprotocol/server-filesystem",
TARGET_FOLDER_PATH],
),
),
tool_filter=['read_file', 'list_directory'] # Optional: filter specific tools
# For remote servers, you would use SseConnectionParams instead:
# connection_params=SseConnectionParams(url="http://remote-server:port/path", headers={...})
)
# Use in an agent
root_agent = LlmAgent(
model='gemini-2.0-flash', # Adjust model name if needed based on availability
name='enterprise_assistant',
instruction='Help user accessing their file systems',
tools=[toolset], # Provide the MCP tools to the ADK agent
)
return root_agent, toolset
# --- Step 2: Main Execution Logic ---
async def async_main():
session_service = InMemorySessionService()
# Artifact service might not be needed for this example
artifacts_service = InMemoryArtifactService()
session = await session_service.create_session(
state={}, app_name='mcp_filesystem_app', user_id='user_fs'
)
# TODO: Change the query to be relevant to YOUR specified folder.
# e.g., "list files in the 'documents' subfolder" or "read the file 'notes.txt'"
query = "list files in the tests folder"
print(f"User Query: '{query}'")
content = types.Content(role='user', parts=[types.Part(text=query)])
root_agent, toolset = await get_agent_async()
runner = Runner(
app_name='mcp_filesystem_app',
agent=root_agent,
artifact_service=artifacts_service, # Optional
session_service=session_service,
)
print("Running agent...")
events_async = runner.run_async(
session_id=session.id, user_id=session.user_id, new_message=content
)
async for event in events_async:
print(f"Event received: {event}")
# Cleanup is handled automatically by the agent framework
# But you can also manually close if needed:
print("Closing MCP server connection...")
await toolset.close()
print("Cleanup complete.")
if __name__ == '__main__':
try:
asyncio.run(async_main())
except Exception as e:
print(f"An error occurred: {e}")
```
## Key considerations
When working with MCP and ADK, keep these points in mind:
- **Protocol vs. Library:** MCP is a protocol specification that defines communication rules. ADK is a Python library/framework for building agents. `McpToolset` bridges the two by implementing the client side of the MCP protocol within the ADK framework. Conversely, building an MCP server in Python requires the `mcp` library.
- **ADK Tools vs. MCP Tools:**
  - ADK tools (`BaseTool`, `FunctionTool`, `AgentTool`, etc.) are Python objects designed for direct use within ADK's `LlmAgent` and `Runner`.
  - MCP tools are capabilities exposed by an MCP server according to the protocol's schema. `McpToolset` makes these look like ADK tools to an `LlmAgent`.
- **Asynchronous nature:** Both ADK and the MCP Python library are heavily based on Python's `asyncio`. Tool implementations and server handlers should generally be `async` functions.
- **Stateful sessions (MCP):** MCP establishes stateful, persistent connections between a client and server instance. This differs from typical stateless REST APIs.
- **Deployment:** This statefulness can pose challenges for scaling and deployment, especially for remote servers handling many users. The original MCP design often assumed client and server were co-located. Managing these persistent connections requires careful infrastructure considerations (e.g., load balancing, session affinity).
- **ADK McpToolset:** Manages this connection lifecycle. The cleanup pattern shown in the examples (for example, calling `toolset.close()` when you manage the toolset yourself) is crucial for ensuring the connection, and potentially the server process, is properly terminated when the ADK agent finishes.
## Deploying Agents with MCP Tools
When deploying ADK agents that use MCP tools to production environments like Cloud Run, GKE, or Vertex AI Agent Engine, you need to consider how MCP connections will work in containerized and distributed environments.
### Critical Deployment Requirement: Synchronous Agent Definition
**⚠️ Important:** When deploying agents with MCP tools, the agent and its `McpToolset` must be defined **synchronously** in your `agent.py` file. While `adk web` allows for asynchronous agent creation, deployment environments require synchronous instantiation.
```python
# ✅ CORRECT: Synchronous agent definition for deployment
import os
from google.adk.agents.llm_agent import LlmAgent
from google.adk.tools.mcp_tool import McpToolset
from google.adk.tools.mcp_tool.mcp_session_manager import StdioConnectionParams
from mcp import StdioServerParameters
_allowed_path = os.path.dirname(os.path.abspath(__file__))
root_agent = LlmAgent(
model='gemini-2.0-flash',
name='enterprise_assistant',
instruction=f'Help user accessing their file systems. Allowed directory: {_allowed_path}',
tools=[
McpToolset(
connection_params=StdioConnectionParams(
server_params=StdioServerParameters(
command='npx',
args=['-y', '@modelcontextprotocol/server-filesystem', _allowed_path],
),
timeout=5, # Configure appropriate timeouts
),
# Filter tools for security in production
tool_filter=[
'read_file', 'read_multiple_files', 'list_directory',
'directory_tree', 'search_files', 'get_file_info',
'list_allowed_directories',
],
)
],
)
```
```python
# ❌ WRONG: Asynchronous patterns don't work in deployment
async def get_agent(): # This won't work for deployment
toolset = await create_mcp_toolset_async()
return LlmAgent(tools=[toolset])
```
### Quick Deployment Commands
#### Vertex AI Agent Engine
```bash
uv run adk deploy agent_engine \
--project= \
--region= \
--staging_bucket="gs://" \
--display_name="My MCP Agent" \
./path/to/your/agent_directory
```
#### Cloud Run
```bash
uv run adk deploy cloud_run \
--project= \
--region= \
--service_name= \
./path/to/your/agent_directory
```
### Deployment Patterns
#### Pattern 1: Self-Contained Stdio MCP Servers
For MCP servers that can be packaged as npm packages or Python modules (like `@modelcontextprotocol/server-filesystem`), you can include them directly in your agent container:
**Container Requirements:**
```dockerfile
# Example for npm-based MCP servers
FROM python:3.13-slim
# Install Node.js and npm for MCP servers
RUN apt-get update && apt-get install -y nodejs npm && rm -rf /var/lib/apt/lists/*
# Install your Python dependencies
COPY requirements.txt .
RUN pip install -r requirements.txt
# Copy your agent code
COPY . .
# Your agent can now use StdioConnectionParams with 'npx' commands
CMD ["python", "main.py"]
```
**Agent Configuration:**
```python
# This works in containers because npx and the MCP server run in the same environment
McpToolset(
connection_params=StdioConnectionParams(
server_params=StdioServerParameters(
command='npx',
args=["-y", "@modelcontextprotocol/server-filesystem", "/app/data"],
),
),
)
```
#### Pattern 2: Remote MCP Servers (Streamable HTTP)
For production deployments requiring scalability, deploy MCP servers as separate services and connect via Streamable HTTP:
**MCP Server Deployment (Cloud Run):**
```python
# deploy_mcp_server.py - Separate Cloud Run service using Streamable HTTP
import contextlib
import logging
from collections.abc import AsyncIterator
from typing import Any
import mcp.types as types
from mcp.server.lowlevel import Server
from mcp.server.streamable_http_manager import StreamableHTTPSessionManager
from starlette.applications import Starlette
from starlette.routing import Mount
from starlette.types import Receive, Scope, Send
logger = logging.getLogger(__name__)
def create_mcp_server():
"""Create and configure the MCP server."""
app = Server("adk-mcp-streamable-server")
@app.call_tool()
async def call_tool(name: str, arguments: dict[str, Any]) -> list[types.ContentBlock]:
"""Handle tool calls from MCP clients."""
# Example tool implementation - replace with your actual ADK tools
if name == "example_tool":
result = arguments.get("input", "No input provided")
return [
types.TextContent(
type="text",
text=f"Processed: {result}"
)
]
else:
raise ValueError(f"Unknown tool: {name}")
@app.list_tools()
async def list_tools() -> list[types.Tool]:
"""List available tools."""
return [
types.Tool(
name="example_tool",
description="Example tool for demonstration",
inputSchema={
"type": "object",
"properties": {
"input": {
"type": "string",
"description": "Input text to process"
}
},
"required": ["input"]
}
)
]
return app
def main(port: int = 8080, json_response: bool = False):
"""Main server function."""
logging.basicConfig(level=logging.INFO)
app = create_mcp_server()
# Create session manager with stateless mode for scalability
session_manager = StreamableHTTPSessionManager(
app=app,
event_store=None,
json_response=json_response,
stateless=True, # Important for Cloud Run scalability
)
async def handle_streamable_http(scope: Scope, receive: Receive, send: Send) -> None:
await session_manager.handle_request(scope, receive, send)
@contextlib.asynccontextmanager
async def lifespan(app: Starlette) -> AsyncIterator[None]:
"""Manage session manager lifecycle."""
async with session_manager.run():
logger.info("MCP Streamable HTTP server started!")
try:
yield
finally:
logger.info("MCP server shutting down...")
# Create ASGI application
starlette_app = Starlette(
debug=False, # Set to False for production
routes=[
Mount("/mcp", app=handle_streamable_http),
],
lifespan=lifespan,
)
import uvicorn
uvicorn.run(starlette_app, host="0.0.0.0", port=port)
if __name__ == "__main__":
main()
```
**Agent Configuration for Remote MCP:**
```python
# Your ADK agent connects to the remote MCP service via Streamable HTTP
McpToolset(
connection_params=StreamableHTTPConnectionParams(
url="https://your-mcp-server-url.run.app/mcp",
headers={"Authorization": "Bearer your-auth-token"}
),
)
```
#### Pattern 3: Sidecar MCP Servers (GKE)
In Kubernetes environments, you can deploy MCP servers as sidecar containers:
```yaml
# deployment.yaml - GKE with MCP sidecar
apiVersion: apps/v1
kind: Deployment
metadata:
name: adk-agent-with-mcp
spec:
template:
spec:
containers:
# Main ADK agent container
- name: adk-agent
image: your-adk-agent:latest
ports:
- containerPort: 8080
env:
- name: MCP_SERVER_URL
value: "http://localhost:8081"
# MCP server sidecar
- name: mcp-server
image: your-mcp-server:latest
ports:
- containerPort: 8081
```
### Connection Management Considerations
#### Stdio Connections
- **Pros:** Simple setup, process isolation, works well in containers
- **Cons:** Process overhead, not suitable for high-scale deployments
- **Best for:** Development, single-tenant deployments, simple MCP servers
#### SSE/HTTP Connections
- **Pros:** Network-based, scalable, can handle multiple clients
- **Cons:** Requires network infrastructure, authentication complexity
- **Best for:** Production deployments, multi-tenant systems, external MCP services
### Production Deployment Checklist
When deploying agents with MCP tools to production:
**✅ Connection Lifecycle**
- Ensure proper cleanup of MCP connections (for example, close toolsets whose lifecycle you manage manually)
- Configure appropriate timeouts for connection establishment and requests
- Implement retry logic for transient connection failures
**✅ Resource Management**
- Monitor memory usage for stdio MCP servers (each spawns a process)
- Configure appropriate CPU/memory limits for MCP server processes
- Consider connection pooling for remote MCP servers
**✅ Security**
- Use authentication headers for remote MCP connections
- Restrict network access between ADK agents and MCP servers
- **Filter MCP tools using `tool_filter` to limit exposed functionality**
- Validate MCP tool inputs to prevent injection attacks
- Use restrictive file paths for filesystem MCP servers (e.g., `os.path.dirname(os.path.abspath(__file__))`)
- Consider read-only tool filters for production environments
**✅ Monitoring & Observability**
- Log MCP connection establishment and teardown events
- Monitor MCP tool execution times and success rates
- Set up alerts for MCP connection failures
**✅ Scalability**
- For high-volume deployments, prefer remote MCP servers over stdio
- Configure session affinity if using stateful MCP servers
- Consider MCP server connection limits and implement circuit breakers
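The retry guidance above can be sketched with a small generic helper (hypothetical code, not an ADK API; adapt the exception types and delays to your connection method):

```python
import asyncio
import random

async def with_retries(connect, attempts=3, base_delay=0.5):
    """Retry an async connection factory with exponential backoff and jitter."""
    for attempt in range(1, attempts + 1):
        try:
            return await connect()
        except ConnectionError:
            if attempt == attempts:
                raise  # Out of attempts: surface the last failure.
            delay = base_delay * (2 ** (attempt - 1)) + random.uniform(0, 0.1)
            await asyncio.sleep(delay)

# Demo: a flaky "connection" that succeeds on the third try.
calls = {"n": 0}

async def flaky_connect():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "connected"

result = asyncio.run(with_retries(flaky_connect, base_delay=0.01))
print(result)  # connected
```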
### Environment-Specific Configurations
#### Cloud Run
```python
# Cloud Run environment variables for MCP configuration
import os
# Detect Cloud Run environment
if os.getenv('K_SERVICE'):
# Use remote MCP servers in Cloud Run
mcp_connection = SseConnectionParams(
url=os.getenv('MCP_SERVER_URL'),
headers={'Authorization': f"Bearer {os.getenv('MCP_AUTH_TOKEN')}"}
)
else:
# Use stdio for local development
mcp_connection = StdioConnectionParams(
server_params=StdioServerParameters(
command='npx',
args=["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]
)
)
McpToolset(connection_params=mcp_connection)
```
#### GKE
```python
# GKE-specific MCP configuration
# Use service discovery for MCP servers within the cluster
McpToolset(
connection_params=SseConnectionParams(
url="http://mcp-service.default.svc.cluster.local:8080/sse"
),
)
```
#### Vertex AI Agent Engine
```python
# Agent Engine managed deployment
# Prefer lightweight, self-contained MCP servers or external services
McpToolset(
connection_params=SseConnectionParams(
url="https://your-managed-mcp-service.googleapis.com/sse",
        # Replace with a token obtained at runtime; shell substitution such as
        # $(gcloud auth print-access-token) is not evaluated inside a Python string.
        headers={'Authorization': 'Bearer YOUR_ACCESS_TOKEN'}
),
)
```
### Troubleshooting Deployment Issues
**Common MCP Deployment Problems:**
1. **Stdio Process Startup Failures**
```python
# Debug stdio connection issues
McpToolset(
connection_params=StdioConnectionParams(
server_params=StdioServerParameters(
command='npx',
args=["-y", "@modelcontextprotocol/server-filesystem", "/app/data"],
# Add environment debugging
env={'DEBUG': '1'}
),
),
)
```
1. **Network Connectivity Issues**
```python
# Test remote MCP connectivity
import asyncio
import aiohttp

async def test_mcp_connection():
    async with aiohttp.ClientSession() as session:
        async with session.get('https://your-mcp-server.com/health') as resp:
            print(f"MCP Server Health: {resp.status}")

asyncio.run(test_mcp_connection())
```
1. **Resource Exhaustion**
   - Monitor container memory usage when using stdio MCP servers
   - Set appropriate limits in Kubernetes deployments
   - Use remote MCP servers for resource-intensive operations
## Further Resources
- [Model Context Protocol Documentation](https://modelcontextprotocol.io/)
- [MCP Specification](https://modelcontextprotocol.io/specification/)
- [MCP Python SDK & Examples](https://github.com/modelcontextprotocol/)
# Integrate REST APIs with OpenAPI
Supported in ADK Python v0.1.0
ADK simplifies interacting with external REST APIs by automatically generating callable tools directly from an [OpenAPI Specification (v3.x)](https://swagger.io/specification/). This eliminates the need to manually define individual function tools for each API endpoint.
**Core Benefit:** Use `OpenAPIToolset` to instantly create agent tools (`RestApiTool`) from your existing API documentation (OpenAPI spec), enabling agents to seamlessly call your web services.
## Key Components
- **`OpenAPIToolset`**: This is the primary class you'll use. You initialize it with your OpenAPI specification, and it handles the parsing and generation of tools.
- **`RestApiTool`**: This class represents a single, callable API operation (like `GET /pets/{petId}` or `POST /pets`). `OpenAPIToolset` creates one `RestApiTool` instance for each operation defined in your spec.
## How it Works
The process involves these main steps when you use `OpenAPIToolset`:
1. **Initialization & Parsing**:
- You provide the OpenAPI specification to `OpenAPIToolset` either as a Python dictionary, a JSON string, or a YAML string.
- The toolset internally parses the spec, resolving any internal references (`$ref`) to understand the complete API structure.
1. **Operation Discovery**:
- It identifies all valid API operations (e.g., `GET`, `POST`, `PUT`, `DELETE`) defined within the `paths` object of your specification.
1. **Tool Generation**:
- For each discovered operation, `OpenAPIToolset` automatically creates a corresponding `RestApiTool` instance.
- **Tool Name**: Derived from the `operationId` in the spec (converted to `snake_case`, max 60 chars). If `operationId` is missing, a name is generated from the method and path.
- **Tool Description**: Uses the `summary` or `description` from the operation for the LLM.
- **API Details**: Stores the required HTTP method, path, server base URL, parameters (path, query, header, cookie), and request body schema internally.
1. **`RestApiTool` Functionality**: Each generated `RestApiTool`:
- **Schema Generation**: Dynamically creates a `FunctionDeclaration` based on the operation's parameters and request body. This schema tells the LLM how to call the tool (what arguments are expected).
- **Execution**: When called by the LLM, it constructs the correct HTTP request (URL, headers, query params, body) using the arguments provided by the LLM and the details from the OpenAPI spec. It handles authentication (if configured) and executes the API call using the `requests` library.
- **Response Handling**: Returns the API response (typically JSON) back to the agent flow.
1. **Authentication**: You can configure global authentication (like API keys or OAuth - see [Authentication](/adk-docs/tools/authentication/) for details) when initializing `OpenAPIToolset`. This authentication configuration is automatically applied to all generated `RestApiTool` instances.
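To make the naming rule in the tool-generation step concrete, here is a rough sketch of an `operationId`-to-tool-name conversion (illustrative only; ADK's actual conversion logic may differ in edge cases):

```python
import re

def operation_id_to_tool_name(operation_id: str, max_len: int = 60) -> str:
    """Illustrative sketch: camelCase operationId -> snake_case, capped length."""
    s = re.sub(r'([a-z0-9])([A-Z])', r'\1_\2', operation_id)  # insert underscores
    s = re.sub(r'[^0-9a-zA-Z_]', '_', s)                      # sanitize other chars
    return s.lower()[:max_len]

print(operation_id_to_tool_name("listPets"))   # list_pets
print(operation_id_to_tool_name("createPet"))  # create_pet
```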
## Usage Workflow
Follow these steps to integrate an OpenAPI spec into your agent:
1. **Obtain Spec**: Get your OpenAPI specification document (e.g., load from a `.json` or `.yaml` file, fetch from a URL).
1. **Instantiate Toolset**: Create an `OpenAPIToolset` instance, passing the spec content and type (`spec_str`/`spec_dict`, `spec_str_type`). Provide authentication details (`auth_scheme`, `auth_credential`) if required by the API.
```python
from google.adk.tools.openapi_tool.openapi_spec_parser.openapi_toolset import OpenAPIToolset
# Example with a JSON string
openapi_spec_json = '...' # Your OpenAPI JSON string
toolset = OpenAPIToolset(spec_str=openapi_spec_json, spec_str_type="json")
# Example with a dictionary
# openapi_spec_dict = {...} # Your OpenAPI spec as a dict
# toolset = OpenAPIToolset(spec_dict=openapi_spec_dict)
```
1. **Add to Agent**: Include the retrieved tools in your `LlmAgent`'s `tools` list.
```python
from google.adk.agents import LlmAgent
my_agent = LlmAgent(
name="api_interacting_agent",
model="gemini-2.0-flash", # Or your preferred model
tools=[toolset], # Pass the toolset
# ... other agent config ...
)
```
1. **Instruct Agent**: Update your agent's instructions to inform it about the new API capabilities and the names of the tools it can use (e.g., `list_pets`, `create_pet`). The tool descriptions generated from the spec will also help the LLM.
1. **Run Agent**: Execute your agent using the `Runner`. When the LLM determines it needs to call one of the APIs, it will generate a function call targeting the appropriate `RestApiTool`, which will then handle the HTTP request automatically.
## Example
This example demonstrates generating tools from a simple Pet Store OpenAPI spec (using `httpbin.org` for mock responses) and interacting with them via an agent.
**Code: Pet Store API** (`openapi_example.py`)
```python
# Copyright 2025 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import asyncio
import uuid # For unique session IDs
from dotenv import load_dotenv
from google.adk.agents import LlmAgent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.genai import types
# --- OpenAPI Tool Imports ---
from google.adk.tools.openapi_tool.openapi_spec_parser.openapi_toolset import OpenAPIToolset
# --- Load Environment Variables (If ADK tools need them, e.g., API keys) ---
load_dotenv() # Create a .env file in the same directory if needed
# --- Constants ---
APP_NAME_OPENAPI = "openapi_petstore_app"
USER_ID_OPENAPI = "user_openapi_1"
SESSION_ID_OPENAPI = f"session_openapi_{uuid.uuid4()}" # Unique session ID
AGENT_NAME_OPENAPI = "petstore_manager_agent"
GEMINI_MODEL = "gemini-2.0-flash"
# --- Sample OpenAPI Specification (JSON String) ---
# A basic Pet Store API example using httpbin.org as a mock server
openapi_spec_string = """
{
"openapi": "3.0.0",
"info": {
"title": "Simple Pet Store API (Mock)",
"version": "1.0.1",
"description": "An API to manage pets in a store, using httpbin for responses."
},
"servers": [
{
"url": "https://httpbin.org",
"description": "Mock server (httpbin.org)"
}
],
"paths": {
"/get": {
"get": {
"summary": "List all pets (Simulated)",
"operationId": "listPets",
"description": "Simulates returning a list of pets. Uses httpbin's /get endpoint which echoes query parameters.",
"parameters": [
{
"name": "limit",
"in": "query",
"description": "Maximum number of pets to return",
"required": false,
"schema": { "type": "integer", "format": "int32" }
},
{
"name": "status",
"in": "query",
"description": "Filter pets by status",
"required": false,
"schema": { "type": "string", "enum": ["available", "pending", "sold"] }
}
],
"responses": {
"200": {
"description": "A list of pets (echoed query params).",
"content": { "application/json": { "schema": { "type": "object" } } }
}
}
}
},
"/post": {
"post": {
"summary": "Create a pet (Simulated)",
"operationId": "createPet",
"description": "Simulates adding a new pet. Uses httpbin's /post endpoint which echoes the request body.",
"requestBody": {
"description": "Pet object to add",
"required": true,
"content": {
"application/json": {
"schema": {
"type": "object",
"required": ["name"],
"properties": {
"name": {"type": "string", "description": "Name of the pet"},
"tag": {"type": "string", "description": "Optional tag for the pet"}
}
}
}
}
},
"responses": {
"201": {
"description": "Pet created successfully (echoed request body).",
"content": { "application/json": { "schema": { "type": "object" } } }
}
}
}
},
"/get?petId={petId}": {
"get": {
"summary": "Info for a specific pet (Simulated)",
"operationId": "showPetById",
"description": "Simulates returning info for a pet ID. Uses httpbin's /get endpoint.",
"parameters": [
{
"name": "petId",
"in": "path",
"description": "This is actually passed as a query param to httpbin /get",
"required": true,
"schema": { "type": "integer", "format": "int64" }
}
],
"responses": {
"200": {
"description": "Information about the pet (echoed query params)",
"content": { "application/json": { "schema": { "type": "object" } } }
},
"404": { "description": "Pet not found (simulated)" }
}
}
}
}
}
"""
# --- Create OpenAPIToolset ---
petstore_toolset = OpenAPIToolset(
    spec_str=openapi_spec_string,
    spec_str_type='json',
    # No authentication needed for httpbin.org
)

# --- Agent Definition ---
root_agent = LlmAgent(
    name=AGENT_NAME_OPENAPI,
    model=GEMINI_MODEL,
    tools=[petstore_toolset],  # Pass the list of RestApiTool objects
    instruction="""You are a Pet Store assistant managing pets via an API.
Use the available tools to fulfill user requests.
When creating a pet, confirm the details echoed back by the API.
When listing pets, mention any filters used (like limit or status).
When showing a pet by ID, state the ID you requested.
""",
    description="Manages a Pet Store using tools generated from an OpenAPI spec."
)

# --- Session and Runner Setup ---
async def setup_session_and_runner():
    session_service_openapi = InMemorySessionService()
    runner_openapi = Runner(
        agent=root_agent,
        app_name=APP_NAME_OPENAPI,
        session_service=session_service_openapi,
    )
    await session_service_openapi.create_session(
        app_name=APP_NAME_OPENAPI,
        user_id=USER_ID_OPENAPI,
        session_id=SESSION_ID_OPENAPI,
    )
    return runner_openapi

# --- Agent Interaction Function ---
async def call_openapi_agent_async(query, runner_openapi):
    print("\n--- Running OpenAPI Pet Store Agent ---")
    print(f"Query: {query}")
    content = types.Content(role='user', parts=[types.Part(text=query)])
    final_response_text = "Agent did not provide a final text response."
    try:
        async for event in runner_openapi.run_async(
            user_id=USER_ID_OPENAPI, session_id=SESSION_ID_OPENAPI, new_message=content
        ):
            # Optional: Detailed event logging for debugging
            # print(f"  DEBUG Event: Author={event.author}, Type={'Final' if event.is_final_response() else 'Intermediate'}, Content={str(event.content)[:100]}...")
            if event.get_function_calls():
                call = event.get_function_calls()[0]
                print(f"  Agent Action: Called function '{call.name}' with args {call.args}")
            elif event.get_function_responses():
                response = event.get_function_responses()[0]
                print(f"  Agent Action: Received response for '{response.name}'")
                # print(f"  Tool Response Snippet: {str(response.response)[:200]}...")  # Uncomment for response details
            elif event.is_final_response() and event.content and event.content.parts:
                # Capture the last final text response
                final_response_text = event.content.parts[0].text.strip()
        print(f"Agent Final Response: {final_response_text}")
    except Exception as e:
        print(f"An error occurred during agent run: {e}")
        import traceback
        traceback.print_exc()  # Print full traceback for errors
    print("-" * 30)

# --- Run Examples ---
async def run_openapi_example():
    runner_openapi = await setup_session_and_runner()

    # Trigger listPets
    await call_openapi_agent_async("Show me the pets available.", runner_openapi)
    # Trigger createPet
    await call_openapi_agent_async("Please add a new dog named 'Dukey'.", runner_openapi)
    # Trigger showPetById
    await call_openapi_agent_async("Get info for pet with ID 123.", runner_openapi)

# --- Execute ---
if __name__ == "__main__":
    print("Executing OpenAPI example...")
    # Use asyncio.run() for top-level execution
    try:
        asyncio.run(run_openapi_example())
    except RuntimeError as e:
        if "cannot be called from a running event loop" in str(e):
            print("Info: Cannot run asyncio.run from a running event loop (e.g., Jupyter/Colab).")
            # If in Jupyter/Colab, you might need to run like this:
            # await run_openapi_example()
        else:
            raise e
    print("OpenAPI example finished.")
```
# Increase tool performance with parallel execution
Supported in ADK: Python v1.10.0
Starting with Agent Development Kit (ADK) version 1.10.0 for Python, the framework attempts to run any agent-requested [function tools](/adk-docs/tools-custom/function-tools/) in parallel. This behavior can significantly improve the performance and responsiveness of your agents, particularly for agents that rely on multiple external APIs or long-running tasks. For example, if you have 3 tools that each take 2 seconds, running them in parallel brings the total execution time closer to 2 seconds instead of 6. Running tool functions in parallel can improve the performance of your agents, particularly in the following scenarios:
- **Research tasks:** Where the agent collects information from multiple sources before proceeding to the next stage of the workflow.
- **API calls:** Where the agent accesses several APIs independently, such as searching for available flights using APIs from multiple airlines.
- **Publishing and communication tasks:** When the agent needs to publish or communicate through multiple, independent channels or multiple recipients.
However, your custom tools must be built with asynchronous execution support to enable this performance improvement. This guide explains how parallel tool execution works in the ADK and how to build your tools to take full advantage of this processing feature.
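The speedup described above can be sketched with plain `asyncio`, independent of ADK; the tool bodies below are stand-ins that simulate 2-second I/O waits:

```python
import asyncio
import time

async def fake_tool(name: str) -> str:
    # Stand-in for an I/O-bound tool call that takes ~2 seconds
    await asyncio.sleep(2)
    return f"{name}: done"

async def main() -> list:
    start = time.perf_counter()
    # Three 2-second tools run concurrently and finish in ~2s, not ~6s
    results = await asyncio.gather(
        fake_tool("flights"), fake_tool("hotels"), fake_tool("weather")
    )
    print(f"elapsed: {time.perf_counter() - start:.1f}s")
    return results

results = asyncio.run(main())
```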
Warning
Any ADK Tools that use synchronous processing in a set of tool function calls will block other tools from executing in parallel, even if the other tools allow for parallel execution.
## Build parallel-ready tools
Enable parallel execution of your tool functions by defining them as asynchronous functions. In Python code, this means using `async def` and `await` syntax which allows the ADK to run them concurrently in an `asyncio` event loop. The following sections show examples of agent tools built for parallel processing and asynchronous operations.
### Example of an HTTP web call
The following code example shows how to modify the `get_weather()` function to operate asynchronously and allow for parallel execution:
```python
import aiohttp

async def get_weather(city: str) -> dict:
    async with aiohttp.ClientSession() as session:
        async with session.get(f"http://api.weather.com/{city}") as response:
            return await response.json()
```
### Example of database call
The following code example shows how to write a database calling function that operates asynchronously:
```python
import asyncpg

async def query_database(query: str) -> list:
    conn = await asyncpg.connect("postgresql://...")  # awaited; not a context manager
    try:
        return await conn.fetch(query)
    finally:
        await conn.close()
```
### Example of yielding behavior for long loops
In cases where a tool is processing multiple requests or numerous long-running requests, consider adding yielding code to allow other tools to execute, as shown in the following code sample:
```python
import asyncio

async def process_data(data: list) -> dict:
    results = []
    for i, item in enumerate(data):
        processed = await process_item(item)  # Yield point
        results.append(processed)
        # Add periodic yield points for long loops
        if i % 100 == 0:
            await asyncio.sleep(0)  # Yield control
    return {"results": results}
```
Important
Use the `asyncio.sleep()` function for pauses to avoid blocking execution of other functions.
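The difference is easy to demonstrate: a tool that calls `time.sleep()` stalls the whole event loop and serializes otherwise-parallel tools, while one that awaits `asyncio.sleep()` lets its peers run. A minimal, ADK-independent sketch:

```python
import asyncio
import time

async def blocking_tool() -> str:
    time.sleep(0.5)  # BAD: blocks the event loop; nothing else can run
    return "done"

async def cooperative_tool() -> str:
    await asyncio.sleep(0.5)  # GOOD: suspends this tool; others keep running
    return "done"

async def compare() -> tuple:
    start = time.perf_counter()
    await asyncio.gather(cooperative_tool(), cooperative_tool())
    concurrent = time.perf_counter() - start  # ~0.5s: the waits overlap

    start = time.perf_counter()
    await asyncio.gather(blocking_tool(), blocking_tool())
    serialized = time.perf_counter() - start  # ~1.0s: the waits run back to back
    return concurrent, serialized

concurrent, serialized = asyncio.run(compare())
print(f"cooperative: {concurrent:.2f}s, blocking: {serialized:.2f}s")
```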
### Example of thread pools for intensive operations
When performing processing-intensive functions, consider creating thread pools for better management of available computing resources, as shown in the following example:
```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

async def cpu_intensive_tool(data: list) -> dict:
    loop = asyncio.get_running_loop()
    # Use thread pool for CPU-bound work
    with ThreadPoolExecutor() as executor:
        result = await loop.run_in_executor(
            executor,
            expensive_computation,
            data
        )
    return {"result": result}
```
### Example of process chunking
When processing long lists or large amounts of data, consider combining the thread-pool technique with splitting the work into chunks and yielding control between chunks, as shown in the following example:
```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

async def process_large_dataset(dataset: list) -> dict:
    results = []
    chunk_size = 1000
    loop = asyncio.get_running_loop()
    for i in range(0, len(dataset), chunk_size):
        chunk = dataset[i:i + chunk_size]
        # Process chunk in thread pool
        with ThreadPoolExecutor() as executor:
            chunk_result = await loop.run_in_executor(
                executor, process_chunk, chunk
            )
        results.extend(chunk_result)
        # Yield control between chunks
        await asyncio.sleep(0)
    return {"total_processed": len(results), "results": results}
```
## Write parallel-ready prompts and tool descriptions
When building prompts for AI models, consider explicitly specifying or hinting that function calls be made in parallel. The following example of an AI prompt directs the model to use tools in parallel:
```text
When users ask for multiple pieces of information, always call functions in
parallel.
Examples:
- "Get weather for London and currency rate USD to EUR" → Call both functions
simultaneously
- "Compare cities A and B" → Call get_weather, get_population, get_distance in
parallel
- "Analyze multiple stocks" → Call get_stock_price for each stock in parallel
Always prefer multiple specific function calls over single complex calls.
```
The following example shows a tool function description that hints at more efficient use through parallel execution:
```python
import asyncio

async def get_weather(city: str) -> dict:
    """Get current weather for a single city.

    This function is optimized for parallel execution - call multiple times for different cities.

    Args:
        city: Name of the city, for example: 'London', 'New York'

    Returns:
        Weather data including temperature, conditions, humidity
    """
    await asyncio.sleep(2)  # Simulate API call
    return {"city": city, "temp": 72, "condition": "sunny"}
```
## Next steps
For more information on building Tools for agents and function calling, see [Function Tools](/adk-docs/tools-custom/function-tools/). For more detailed examples of tools that take advantage of parallel processing, see the samples in the [adk-python](https://github.com/google/adk-python/tree/main/contributing/samples/parallel_functions) repository.
# Run Agents
# Agent Runtime
Supported in ADK: Python v0.1.0, TypeScript v0.2.0, Go v0.1.0, Java v0.1.0
ADK provides several ways to run and test your agents during development. Choose the method that best fits your development workflow.
## Ways to run agents
- **Dev UI**
______________________________________________________________________
Use `adk web` to launch a browser-based interface for interacting with your agents.
[Use the Web Interface](https://google.github.io/adk-docs/runtime/web-interface/index.md)
- **Command Line**
______________________________________________________________________
Use `adk run` to interact with your agents directly in the terminal.
[Use the Command Line](https://google.github.io/adk-docs/runtime/command-line/index.md)
- **API Server**
______________________________________________________________________
Use `adk api_server` to expose your agents through a RESTful API.
[Use the API Server](https://google.github.io/adk-docs/runtime/api-server/index.md)
## Technical reference
For more in-depth information on runtime configuration and behavior, see these pages:
- **[Event Loop](https://google.github.io/adk-docs/runtime/event-loop/index.md)**: Understand the core event loop that powers ADK, including the yield/pause/resume cycle.
- **[Resume Agents](https://google.github.io/adk-docs/runtime/resume/index.md)**: Learn how to resume agent execution from a previous state.
- **[Runtime Config](https://google.github.io/adk-docs/runtime/runconfig/index.md)**: Configure runtime behavior with RunConfig.
# Use the API Server
Supported in ADK: Python v0.1.0, TypeScript v0.2.0, Go v0.1.0, Java v0.1.0
Before you deploy your agent, you should test it to ensure that it is working as intended. Use the API server in ADK to expose your agents through a REST API for programmatic testing and integration.
## Start the API server
Use the following command to run your agent in an ADK API server:
```shell
adk api_server
```
```shell
npx adk api_server
```
```shell
go run agent.go web api
```
Make sure to update the port number.
With Maven, compile and run the ADK web server:
```console
mvn compile exec:java \
-Dexec.args="--adk.agents.source-dir=src/main/java/agents --server.port=8080"
```
With Gradle, the `build.gradle` or `build.gradle.kts` build file should have the following Java plugin in its plugins section:
```groovy
plugins {
id('java')
// other plugins
}
```
Then, elsewhere in the build file, at the top-level, create a new task:
```groovy
tasks.register('runADKWebServer', JavaExec) {
dependsOn classes
classpath = sourceSets.main.runtimeClasspath
mainClass = 'com.google.adk.web.AdkWebServer'
args '--adk.agents.source-dir=src/main/java/agents', '--server.port=8080'
}
```
Finally, on the command-line, run the following command:
```console
gradle runADKWebServer
```
In Java, both the Dev UI and the API server are bundled together.
This command will launch a local web server, where you can run cURL commands or send API requests to test your agent. By default, the server runs on `http://localhost:8000`.
Advanced Usage and Debugging
For a complete reference on all available endpoints, request/response formats, and tips for debugging (including how to use the interactive API documentation), see the **ADK API Server Guide** below.
## Test locally
Testing locally involves launching a local web server, creating a session, and sending queries to your agent. First, ensure you are in the correct working directory.
For TypeScript, you should be inside the agent project directory itself.
```console
parent_folder/
└── my_sample_agent/   <-- For TypeScript, run commands from here
    └── agent.py (or Agent.java or agent.ts)
```
**Launch the Local Server**
Next, launch the local server using the commands listed above.
The output should appear similar to:
```shell
INFO: Started server process [12345]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://localhost:8000 (Press CTRL+C to quit)
```
```shell
+-----------------------------------------------------------------------------+
| ADK Web Server started |
| |
| For local testing, access at http://localhost:8000. |
+-----------------------------------------------------------------------------+
```
```shell
2025-05-13T23:32:08.972-06:00 INFO 37864 --- [ebServer.main()] o.s.b.w.embedded.tomcat.TomcatWebServer : Tomcat started on port 8080 (http) with context path '/'
2025-05-13T23:32:08.980-06:00 INFO 37864 --- [ebServer.main()] com.google.adk.web.AdkWebServer : Started AdkWebServer in 1.15 seconds (process running for 2.877)
2025-05-13T23:32:08.981-06:00 INFO 37864 --- [ebServer.main()] com.google.adk.web.AdkWebServer : AdkWebServer application started successfully.
```
Your server is now running locally. Ensure you use the correct ***port number*** in all the subsequent commands.
**Create a new session**
With the API server still running, open a new terminal window or tab and create a new session with the agent using:
```shell
curl -X POST http://localhost:8000/apps/my_sample_agent/users/u_123/sessions/s_123 \
-H "Content-Type: application/json" \
-d '{"key1": "value1", "key2": 42}'
```
Let's break down what's happening:
- `http://localhost:8000/apps/my_sample_agent/users/u_123/sessions/s_123`: This creates a new session for your agent `my_sample_agent`, which is the name of the agent folder, for a user ID (`u_123`) and for a session ID (`s_123`). You can replace `my_sample_agent` with the name of your agent folder. You can replace `u_123` with a specific user ID, and `s_123` with a specific session ID.
- `{"key1": "value1", "key2": 42}`: This is optional. You can use this to customize the agent's pre-existing state (dict) when creating the session.
This should return the session information if it was created successfully. The output should appear similar to:
```json
{"id":"s_123","appName":"my_sample_agent","userId":"u_123","state":{"key1":"value1","key2":42},"events":[],"lastUpdateTime":1743711430.022186}
```
Info
You cannot create multiple sessions with exactly the same user ID and session ID. If you try to, you may see a response, like: `{"detail":"Session already exists: s_123"}`. To fix this, you can either delete that session (e.g., `s_123`), or choose a different session ID.
**Send a query**
There are two ways to send queries via POST to your agent, via the `/run` or `/run_sse` routes.
- `POST http://localhost:8000/run`: collects all events as a list and returns the list all at once. Suitable for most users (if you are unsure, we recommend using this one).
- `POST http://localhost:8000/run_sse`: returns the result as Server-Sent Events, a stream of event objects. Suitable for those who want to be notified as soon as each event is available. With `/run_sse`, you can also set `streaming` to `true` to enable token-level streaming.
**Using `/run`**
```shell
curl -X POST http://localhost:8000/run \
-H "Content-Type: application/json" \
-d '{
"appName": "my_sample_agent",
"userId": "u_123",
"sessionId": "s_123",
"newMessage": {
"role": "user",
"parts": [{
"text": "Hey whats the weather in new york today"
}]
}
}'
```
In TypeScript, currently only `camelCase` field names are supported (e.g. `appName`, `userId`, `sessionId`, etc.).
If using `/run`, you will see the full output of events at the same time, as a list, which should appear similar to:
```json
[{"content":{"parts":[{"functionCall":{"id":"af-e75e946d-c02a-4aad-931e-49e4ab859838","args":{"city":"new york"},"name":"get_weather"}}],"role":"model"},"invocationId":"e-71353f1e-aea1-4821-aa4b-46874a766853","author":"weather_time_agent","actions":{"stateDelta":{},"artifactDelta":{},"requestedAuthConfigs":{}},"longRunningToolIds":[],"id":"2Btee6zW","timestamp":1743712220.385936},{"content":{"parts":[{"functionResponse":{"id":"af-e75e946d-c02a-4aad-931e-49e4ab859838","name":"get_weather","response":{"status":"success","report":"The weather in New York is sunny with a temperature of 25 degrees Celsius (41 degrees Fahrenheit)."}}}],"role":"user"},"invocationId":"e-71353f1e-aea1-4821-aa4b-46874a766853","author":"weather_time_agent","actions":{"stateDelta":{},"artifactDelta":{},"requestedAuthConfigs":{}},"id":"PmWibL2m","timestamp":1743712221.895042},{"content":{"parts":[{"text":"OK. The weather in New York is sunny with a temperature of 25 degrees Celsius (41 degrees Fahrenheit).\n"}],"role":"model"},"invocationId":"e-71353f1e-aea1-4821-aa4b-46874a766853","author":"weather_time_agent","actions":{"stateDelta":{},"artifactDelta":{},"requestedAuthConfigs":{}},"id":"sYT42eVC","timestamp":1743712221.899018}]
```
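Because `/run` returns a plain JSON array of events, client code typically scans it for the last model-authored text part. A minimal sketch (the `final_text` helper is ours, not an ADK API):

```python
def final_text(events: list):
    """Return the last model-authored text part from a /run event list, or None."""
    for event in reversed(events):
        content = event.get("content") or {}
        if content.get("role") != "model":
            continue
        for part in content.get("parts", []):
            if "text" in part:
                return part["text"]
    return None

# Trimmed-down event list shaped like the output above:
events = [
    {"content": {"role": "model",
                 "parts": [{"functionCall": {"name": "get_weather"}}]}},
    {"content": {"role": "user",
                 "parts": [{"functionResponse": {"name": "get_weather"}}]}},
    {"content": {"role": "model", "parts": [{"text": "OK. It is sunny."}]}},
]
print(final_text(events))  # OK. It is sunny.
```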
**Using `/run_sse`**
```shell
curl -X POST http://localhost:8000/run_sse \
-H "Content-Type: application/json" \
-d '{
"appName": "my_sample_agent",
"userId": "u_123",
"sessionId": "s_123",
"newMessage": {
"role": "user",
"parts": [{
"text": "Hey whats the weather in new york today"
}]
},
"streaming": false
}'
```
You can set `streaming` to `true` to enable token-level streaming, which means the response will be returned to you in multiple chunks and the output should appear similar to:
```shell
data: {"content":{"parts":[{"functionCall":{"id":"af-f83f8af9-f732-46b6-8cb5-7b5b73bbf13d","args":{"city":"new york"},"name":"get_weather"}}],"role":"model"},"invocationId":"e-3f6d7765-5287-419e-9991-5fffa1a75565","author":"weather_time_agent","actions":{"stateDelta":{},"artifactDelta":{},"requestedAuthConfigs":{}},"longRunningToolIds":[],"id":"ptcjaZBa","timestamp":1743712255.313043}
data: {"content":{"parts":[{"functionResponse":{"id":"af-f83f8af9-f732-46b6-8cb5-7b5b73bbf13d","name":"get_weather","response":{"status":"success","report":"The weather in New York is sunny with a temperature of 25 degrees Celsius (41 degrees Fahrenheit)."}}}],"role":"user"},"invocationId":"e-3f6d7765-5287-419e-9991-5fffa1a75565","author":"weather_time_agent","actions":{"stateDelta":{},"artifactDelta":{},"requestedAuthConfigs":{}},"id":"5aocxjaq","timestamp":1743712257.387306}
data: {"content":{"parts":[{"text":"OK. The weather in New York is sunny with a temperature of 25 degrees Celsius (41 degrees Fahrenheit).\n"}],"role":"model"},"invocationId":"e-3f6d7765-5287-419e-9991-5fffa1a75565","author":"weather_time_agent","actions":{"stateDelta":{},"artifactDelta":{},"requestedAuthConfigs":{}},"id":"rAnWGSiV","timestamp":1743712257.391317}
```
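Each `data:` line in the `/run_sse` stream is a self-contained JSON event, so clients can decode events as they arrive. A small parsing helper (the function name is ours, not an ADK API):

```python
import json

def parse_sse_line(line: bytes):
    """Decode one `data:` line from /run_sse into an event dict, or None."""
    if line.startswith(b"data: "):
        return json.loads(line[len(b"data: "):])
    return None  # blank keep-alive lines and comments are skipped

sample = b'data: {"content":{"parts":[{"text":"OK."}],"role":"model"}}'
event = parse_sse_line(sample)
print(event["content"]["parts"][0]["text"])  # OK.
```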
**Send a query with a base64 encoded file using `/run` or `/run_sse`**
```shell
curl -X POST http://localhost:8000/run \
-H 'Content-Type: application/json' \
-d '{
"appName":"my_sample_agent",
"userId":"u_123",
"sessionId":"s_123",
"newMessage":{
"role":"user",
"parts":[
{
"text":"Describe this image"
},
{
"inlineData":{
"displayName":"my_image.png",
"data":"iVBORw0KGgoAAAANSUhEUgAAAgAAAAIACAYAAAD0eNT6AAAACXBIWXMAAAsTAAALEwEAmpw...",
"mimeType":"image/png"
}
}
]
},
"streaming":false
}'
```
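The `inlineData` part is ordinary base64 plumbing, so it can be built programmatically. A sketch (the helper name is ours, not an ADK API):

```python
import base64

def inline_data_part(path: str, mime_type: str) -> dict:
    """Build an inlineData message part from a local file."""
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")
    return {
        "inlineData": {
            "displayName": path,
            "data": encoded,
            "mimeType": mime_type,
        }
    }

# Usage (assuming my_image.png exists next to the script):
# part = inline_data_part("my_image.png", "image/png")
# payload["newMessage"]["parts"].append(part)
```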
Info
If you are using `/run_sse`, you should see each event as soon as it becomes available.
## Integrations
ADK uses [Callbacks](https://google.github.io/adk-docs/callbacks/index.md) to integrate with third-party observability tools. These integrations capture detailed traces of agent calls and interactions, which are crucial for understanding behavior, debugging issues, and evaluating performance.
- [Comet Opik](https://github.com/comet-ml/opik) is an open-source LLM observability and evaluation platform that [natively supports ADK](https://www.comet.com/docs/opik/tracing/integrations/adk).
## Deploy your agent
Now that you've verified the local operation of your agent, you're ready to move on to deploying your agent! Here are some ways you can deploy your agent:
- Deploy to [Agent Engine](https://google.github.io/adk-docs/deploy/agent-engine/index.md), a simple way to deploy your ADK agents to a managed service in Vertex AI on Google Cloud.
- Deploy to [Cloud Run](https://google.github.io/adk-docs/deploy/cloud-run/index.md) and have full control over how you scale and manage your agents using serverless architecture on Google Cloud.
## Interactive API docs
The API server automatically generates interactive API documentation using Swagger UI. This is an invaluable tool for exploring endpoints, understanding request formats, and testing your agent directly from your browser.
To access the interactive docs, start the API server and open its interactive documentation page in your web browser.
You will see a complete, interactive list of all available API endpoints, which you can expand to see detailed information about parameters, request bodies, and response schemas. You can even click "Try it out" to send live requests to your running agents.
## API endpoints
The following sections detail the primary endpoints for interacting with your agents.
JSON Naming Convention
- **Both Request and Response bodies** will use `camelCase` for field names (e.g., `"appName"`).
### Utility endpoints
#### List available agents
Returns a list of all agent applications discovered by the server.
- **Method:** `GET`
- **Path:** `/list-apps`
**Example Request**
```shell
curl -X GET http://localhost:8000/list-apps
```
**Example Response**
```json
["my_sample_agent", "another_agent"]
```
______________________________________________________________________
### Session management
Sessions store the state and event history for a specific user's interaction with an agent.
#### Update a session
Updates an existing session.
- **Method:** `PATCH`
- **Path:** `/apps/{app_name}/users/{user_id}/sessions/{session_id}`
**Request Body**
```json
{
"stateDelta": {
"key1": "value1",
"key2": 42
}
}
```
**Example Request**
```shell
curl -X PATCH http://localhost:8000/apps/my_sample_agent/users/u_123/sessions/s_abc \
-H "Content-Type: application/json" \
-d '{"stateDelta":{"visit_count": 5}}'
```
**Example Response**
```json
{"id":"s_abc","appName":"my_sample_agent","userId":"u_123","state":{"visit_count":5},"events":[],"lastUpdateTime":1743711430.022186}
```
#### Get a session
Retrieves the details of a specific session, including its current state and all associated events.
- **Method:** `GET`
- **Path:** `/apps/{app_name}/users/{user_id}/sessions/{session_id}`
**Example Request**
```shell
curl -X GET http://localhost:8000/apps/my_sample_agent/users/u_123/sessions/s_abc
```
**Example Response**
```json
{"id":"s_abc","appName":"my_sample_agent","userId":"u_123","state":{"visit_count":5},"events":[...],"lastUpdateTime":1743711430.022186}
```
#### Delete a session
Deletes a session and all of its associated data.
- **Method:** `DELETE`
- **Path:** `/apps/{app_name}/users/{user_id}/sessions/{session_id}`
**Example Request**
```shell
curl -X DELETE http://localhost:8000/apps/my_sample_agent/users/u_123/sessions/s_abc
```
**Example Response** A successful deletion returns an empty response with a `204 No Content` status code.
______________________________________________________________________
### Agent execution
These endpoints are used to send a new message to an agent and get a response.
#### Run agent (single response)
Executes the agent and returns all generated events in a single JSON array after the run is complete.
- **Method:** `POST`
- **Path:** `/run`
**Request Body**
```json
{
"appName": "my_sample_agent",
"userId": "u_123",
"sessionId": "s_abc",
"newMessage": {
"role": "user",
"parts": [
{ "text": "What is the capital of France?" }
]
}
}
```
In TypeScript, currently only `camelCase` field names are supported (e.g. `appName`, `userId`, `sessionId`, etc.).
**Example Request**
```shell
curl -X POST http://localhost:8000/run \
-H "Content-Type: application/json" \
-d '{
"appName": "my_sample_agent",
"userId": "u_123",
"sessionId": "s_abc",
"newMessage": {
"role": "user",
"parts": [{"text": "What is the capital of France?"}]
}
}'
```
#### Run agent (streaming)
Executes the agent and streams events back to the client as they are generated using [Server-Sent Events (SSE)](https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events).
- **Method:** `POST`
- **Path:** `/run_sse`
**Request Body** The request body is the same as for `/run`, with an additional optional `streaming` flag.
```json
{
"appName": "my_sample_agent",
"userId": "u_123",
"sessionId": "s_abc",
"newMessage": {
"role": "user",
"parts": [
{ "text": "What is the weather in New York?" }
]
},
"streaming": true
}
```
- `streaming`: (Optional) Set to `true` to enable token-level streaming for model responses. Defaults to `false`.
**Example Request**
```shell
curl -X POST http://localhost:8000/run_sse \
-H "Content-Type: application/json" \
-d '{
"appName": "my_sample_agent",
"userId": "u_123",
"sessionId": "s_abc",
"newMessage": {
"role": "user",
"parts": [{"text": "What is the weather in New York?"}]
},
"streaming": false
}'
```
# Use the Command Line
Supported in ADK: Python v0.1.0, TypeScript v0.2.0, Go v0.1.0, Java v0.1.0
ADK provides an interactive terminal interface for testing your agents. This is useful for quick testing, scripted interactions, and CI/CD pipelines.
## Run an agent
Use the following command to run your agent in the ADK command line interface:
```shell
adk run my_agent
```
```shell
npx @google/adk-devtools run agent.ts
```
```shell
go run agent.go
```
Create an `AgentCliRunner` class (see [Java Quickstart](https://google.github.io/adk-docs/get-started/java/index.md)) and run:
```shell
mvn compile exec:java -Dexec.mainClass="com.example.agent.AgentCliRunner"
```
This starts an interactive session where you can type queries and see agent responses directly in your terminal:
```shell
Running agent my_agent, type exit to exit.
[user]: What's the weather in New York?
[my_agent]: The weather in New York is sunny with a temperature of 25°C.
[user]: exit
```
## Session options
The `adk run` command includes options for saving, resuming, and replaying sessions.
### Save sessions
To save the session when you exit:
```shell
adk run --save_session path/to/my_agent
```
You'll be prompted to enter a session ID, and the session will be saved to `path/to/my_agent/<session_id>.session.json`.
You can also specify the session ID upfront:
```shell
adk run --save_session --session_id my_session path/to/my_agent
```
### Resume sessions
To continue a previously saved session:
```shell
adk run --resume path/to/my_agent/my_session.session.json path/to/my_agent
```
This loads the previous session state and event history, displays it, and allows you to continue the conversation.
### Replay sessions
To replay a session file without interactive input:
```shell
adk run --replay path/to/input.json path/to/my_agent
```
The input file should contain initial state and queries:
```json
{
"state": {"key": "value"},
"queries": ["What is 2 + 2?", "What is the capital of France?"]
}
```
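Since the replay file is plain JSON, it can also be generated programmatically, for example:

```python
import json

replay = {
    "state": {"key": "value"},
    "queries": ["What is 2 + 2?", "What is the capital of France?"],
}
# Write the replay input file that `adk run --replay` will consume
with open("input.json", "w") as f:
    json.dump(replay, f, indent=2)
```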
## Storage options
| Option | Description | Default |
| ------------------------ | --------------------------- | ------------------------------ |
| `--session_service_uri` | Custom session storage URI | SQLite under `.adk/session.db` |
| `--artifact_service_uri` | Custom artifact storage URI | Local `.adk/artifacts` |
### Example with storage options
```shell
adk run --session_service_uri "sqlite:///my_sessions.db" path/to/my_agent
```
## All options
| Option | Description |
| ------------------------ | ------------------------------------------------ |
| `--save_session` | Save the session to a JSON file on exit |
| `--session_id` | Session ID to use when saving |
| `--resume` | Path to a saved session file to resume |
| `--replay` | Path to an input file for non-interactive replay |
| `--session_service_uri` | Custom session storage URI |
| `--artifact_service_uri` | Custom artifact storage URI |
# Runtime Event Loop
Supported in ADK: Python v0.1.0, TypeScript v0.2.0, Go v0.1.0, Java v0.1.0
The ADK Runtime is the underlying engine that powers your agent application during user interactions. It's the system that takes your defined agents, tools, and callbacks and orchestrates their execution in response to user input, managing the flow of information, state changes, and interactions with external services like LLMs or storage.
Think of the Runtime as the **"engine"** of your agentic application. You define the parts (agents, tools), and the Runtime handles how they connect and run together to fulfill a user's request.
## Core Idea: The Event Loop
At its heart, the ADK Runtime operates on an **Event Loop**. This loop facilitates a back-and-forth communication between the `Runner` component and your defined "Execution Logic" (which includes your Agents, the LLM calls they make, Callbacks, and Tools).
In simple terms:
1. The `Runner` receives a user query and asks the main `Agent` to start processing.
1. The `Agent` (and its associated logic) runs until it has something to report (like a response, a request to use a tool, or a state change) – it then **yields** or **emits** an `Event`.
1. The `Runner` receives this `Event`, processes any associated actions (like saving state changes via `Services`), and forwards the event onwards (e.g., to the user interface).
1. The `Agent`'s logic **resumes** from where it paused only *after* the `Runner` has processed the event, and then potentially sees the effects of the changes committed by the Runner.
1. This cycle repeats until the agent has no more events to yield for the current user query.
This event-driven loop is the fundamental pattern governing how ADK executes your agent code.
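The five steps above can be sketched in plain Python with a toy generator, where dicts stand in for ADK's `Event` objects and nothing uses real ADK classes:

```python
# Toy sketch of the yield/pause/resume cycle (plain Python, not ADK classes).

def agent(session_state):
    """Execution logic: runs until it has something to report, then yields."""
    yield {"author": "agent", "text": "Working...",
           "state_delta": {"status": "processing"}}
    # Resumes only after the runner has committed the delta above.
    assert session_state["status"] == "processing"
    yield {"author": "agent", "text": "Done.", "state_delta": {"status": "done"}}

def run(new_query):
    """Runner: commits each event's actions before the agent resumes."""
    session_state = {}
    history = [{"author": "user", "text": new_query}]
    for event in agent(session_state):
        session_state.update(event.get("state_delta", {}))  # commit (SessionService role)
        history.append(event)  # forward upstream (e.g., to a UI)
    return session_state, history

state, history = run("hello")
assert state == {"status": "done"} and len(history) == 3
```

The `for` loop body runs while the generator is suspended at `yield`, which is exactly why the agent can rely on committed state when it resumes.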
## The Heartbeat: The Event Loop - Inner workings
The Event Loop is the core operational pattern defining the interaction between the `Runner` and your custom code (Agents, Tools, Callbacks, collectively referred to as "Execution Logic" or "Logic Components" in the design document). It establishes a clear division of responsibilities:
Note
The specific method names and parameter names may vary slightly by SDK language (e.g., `agent.run_async(...)` in Python, `agent.Run(...)` in Go, `agent.runAsync(...)` in Java and TypeScript). Refer to the language-specific API documentation for details.
### Runner's Role (Orchestrator)
The `Runner` acts as the central coordinator for a single user invocation. Its responsibilities in the loop are:
1. **Initiation:** Receives the end user's query (`new_message`) and typically appends it to the session history via the `SessionService`.
1. **Kick-off:** Starts the event generation process by calling the main agent's execution method (e.g., `agent_to_run.run_async(...)`).
1. **Receive & Process:** Waits for the agent logic to `yield` or `emit` an `Event`. Upon receiving an event, the Runner **promptly processes** it. This involves:
- Using configured `Services` (`SessionService`, `ArtifactService`, `MemoryService`) to commit changes indicated in `event.actions` (like `state_delta`, `artifact_delta`).
- Performing other internal bookkeeping.
1. **Yield Upstream:** Forwards the processed event onwards (e.g., to the calling application or UI for rendering).
1. **Iterate:** Signals the agent logic that processing is complete for the yielded event, allowing it to resume and generate the *next* event.
*Conceptual Runner Loop:*
```py
# Simplified view of the Runner's main loop logic
async def run_async(new_query, ...) -> AsyncGenerator[Event, None]:
    # 1. Append new_query to session event history (via SessionService)
    session_service.append_event(session, Event(author='user', content=new_query))
    # 2. Kick off the event loop by calling the agent
    agent_event_generator = agent_to_run.run_async(context)
    async for event in agent_event_generator:
        # 3. Process the generated event and commit changes
        session_service.append_event(session, event)  # Commits state/artifact deltas etc.
        # memory_service.update_memory(...)  # If applicable
        # artifact_service might have already been called via context during the agent run
        # 4. Yield the event for upstream processing (e.g., UI rendering)
        yield event
        # The Runner implicitly signals the agent generator can continue after yielding
```
```typescript
// Simplified view of the Runner's main loop logic
async * runAsync(newQuery: Content, ...): AsyncGenerator<Event> {
  // 1. Append newQuery to session event history (via SessionService)
  await sessionService.appendEvent({
    session,
    event: createEvent({author: 'user', content: newQuery})
  });
  // 2. Kick off the event loop by calling the agent
  const agentEventGenerator = agentToRun.runAsync(context);
  for await (const event of agentEventGenerator) {
    // 3. Process the generated event and commit changes
    // Commits state/artifact deltas etc.
    await sessionService.appendEvent({session, event});
    // memoryService.updateMemory(...) // If applicable
    // artifactService might have already been called via context during the agent run
    // 4. Yield the event for upstream processing (e.g., UI rendering)
    yield event;
    // The Runner implicitly signals the agent generator can continue after yielding
  }
}
```
```go
// Simplified conceptual view of the Runner's main loop logic in Go
func (r *Runner) RunConceptual(ctx context.Context, session *session.Session, newQuery *genai.Content) iter.Seq2[*Event, error] {
	return func(yield func(*Event, error) bool) {
		// 1. Append new_query to session event history (via SessionService)
		// ...
		userEvent := session.NewEvent(ctx.InvocationID()) // Simplified for conceptual view
		userEvent.Author = "user"
		userEvent.LLMResponse = model.LLMResponse{Content: newQuery}
		if _, err := r.sessionService.Append(ctx, &session.AppendRequest{Event: userEvent}); err != nil {
			yield(nil, err)
			return
		}
		// 2. Kick off the event stream by calling the agent
		// Assuming agent.Run also returns iter.Seq2[*Event, error]
		agentEventsAndErrs := r.agent.Run(ctx, &agent.RunRequest{Session: session, Input: newQuery})
		for event, err := range agentEventsAndErrs {
			if err != nil {
				if !yield(event, err) { // Yield the event even if there's an error, then stop
					return
				}
				return // Agent finished with an error
			}
			// 3. Process the generated event and commit changes
			// Only commit non-partial events to the session service (as in the actual code)
			if !event.LLMResponse.Partial {
				if _, err := r.sessionService.Append(ctx, &session.AppendRequest{Event: event}); err != nil {
					yield(nil, err)
					return
				}
			}
			// memory_service.update_memory(...) // If applicable
			// artifact_service might have already been called via context during the agent run
			// 4. Yield the event for upstream processing
			if !yield(event, nil) {
				return // Upstream consumer stopped
			}
		}
		// Agent finished successfully
	}
}
```
```java
// Simplified conceptual view of the Runner's main loop logic in Java.
public Flowable<Event> runConceptual(
    Session session,
    InvocationContext invocationContext,
    Content newQuery) {
  // 1. Append new_query to session event history (via SessionService)
  // ...
  sessionService.appendEvent(session, userEvent).blockingGet();
  // 2. Kick off the event stream by calling the agent
  Flowable<Event> agentEventStream = agentToRun.runAsync(invocationContext);
  // 3. Process each generated event, commit changes, and "yield" or "emit"
  return agentEventStream.map(event -> {
    // This mutates the session object (adds the event, applies stateDelta).
    // The return value of appendEvent (a Single) is conceptually
    // just the event itself after processing.
    sessionService.appendEvent(session, event).blockingGet(); // Simplified blocking call
    // memory_service.update_memory(...) // If applicable - conceptual
    // artifact_service might have already been called via context during the agent run
    // 4. "Yield" the event for upstream processing
    // In RxJava, returning the event in map effectively yields it to the next operator or subscriber.
    return event;
  });
}
```
### Execution Logic's Role (Agent, Tool, Callback)
Your code within agents, tools, and callbacks is responsible for the actual computation and decision-making. Its interaction with the loop involves:
1. **Execute:** Runs its logic based on the current `InvocationContext`, including the session state *as it was when execution resumed*.
1. **Yield:** When the logic needs to communicate (send a message, call a tool, report a state change), it constructs an `Event` containing the relevant content and actions, and then `yield`s this event back to the `Runner`.
1. **Pause:** Crucially, execution of the agent logic **pauses immediately** after the `yield` statement (or `return` in RxJava). It waits for the `Runner` to complete step 3 (processing and committing).
1. **Resume:** *Only after* the `Runner` has processed the yielded event does the agent logic resume execution from the statement immediately following the `yield`.
1. **See Updated State:** Upon resumption, the agent logic can now reliably access the session state (`ctx.session.state`) reflecting the changes that were committed by the `Runner` from the *previously yielded* event.
*Conceptual Execution Logic:*
```py
# Simplified view of logic inside Agent.run_async, callbacks, or tools
# ... previous code runs based on current state ...
# 1. Determine a change or output is needed, construct the event
# Example: Updating state
update_data = {'field_1': 'value_2'}
event_with_state_change = Event(
    author=self.name,
    actions=EventActions(state_delta=update_data),
    content=types.Content(parts=[types.Part(text="State updated.")])
    # ... other event fields ...
)
# 2. Yield the event to the Runner for processing & commit
yield event_with_state_change
# <<<<<<<<<<<< EXECUTION PAUSES HERE >>>>>>>>>>>>
# <<<<<<<<<<<< RUNNER PROCESSES & COMMITS THE EVENT >>>>>>>>>>>>
# 3. Resume execution ONLY after Runner is done processing the above event.
# Now, the state committed by the Runner is reliably reflected.
# Subsequent code can safely assume the change from the yielded event happened.
val = ctx.session.state['field_1']
# here `val` is guaranteed to be "value_2" (assuming Runner committed successfully)
print(f"Resumed execution. Value of field_1 is now: {val}")
# ... subsequent code continues ...
# Maybe yield another event later...
```
```typescript
// Simplified view of logic inside Agent.runAsync, callbacks, or tools
// ... previous code runs based on current state ...
// 1. Determine a change or output is needed, construct the event
// Example: Updating state
const updateData = {'field_1': 'value_2'};
const eventWithStateChange = createEvent({
  author: this.name,
  actions: createEventActions({stateDelta: updateData}),
  content: {parts: [{text: "State updated."}]}
  // ... other event fields ...
});
// 2. Yield the event to the Runner for processing & commit
yield eventWithStateChange;
// <<<<<<<<<<<< EXECUTION PAUSES HERE >>>>>>>>>>>>
// <<<<<<<<<<<< RUNNER PROCESSES & COMMITS THE EVENT >>>>>>>>>>>>
// 3. Resume execution ONLY after Runner is done processing the above event.
// Now, the state committed by the Runner is reliably reflected.
// Subsequent code can safely assume the change from the yielded event happened.
const val = ctx.session.state['field_1'];
// here `val` is guaranteed to be "value_2" (assuming Runner committed successfully)
console.log(`Resumed execution. Value of field_1 is now: ${val}`);
// ... subsequent code continues ...
// Maybe yield another event later...
```
```go
// Simplified view of logic inside Agent.Run, callbacks, or tools
// ... previous code runs based on current state ...
// 1. Determine a change or output is needed, construct the event
// Example: Updating state
updateData := map[string]interface{}{"field_1": "value_2"}
eventWithStateChange := &Event{
	Author:  self.Name(),
	Actions: &EventActions{StateDelta: updateData},
	Content: genai.NewContentFromText("State updated.", "model"),
	// ... other event fields ...
}
// 2. Yield the event to the Runner for processing & commit
// In Go, this is done by sending the event to a channel.
eventsChan <- eventWithStateChange
// <<<<<<<<<<<< EXECUTION PAUSES HERE (conceptually) >>>>>>>>>>>>
// The Runner on the other side of the channel will receive and process the event.
// The agent's goroutine might continue, but the logical flow waits for the next input or step.
// <<<<<<<<<<<< RUNNER PROCESSES & COMMITS THE EVENT >>>>>>>>>>>>
// 3. Resume execution ONLY after Runner is done processing the above event.
// In a real Go implementation, this would likely be handled by the agent receiving
// a new RunRequest or context indicating the next step. The updated state
// would be part of the session object in that new request.
// For this conceptual example, we'll just check the state.
val := ctx.State.Get("field_1")
// here `val` is guaranteed to be "value_2" because the Runner would have
// updated the session state before calling the agent again.
fmt.Printf("Resumed execution. Value of field_1 is now: %v\n", val)
// ... subsequent code continues ...
// Maybe send another event to the channel later...
```
```java
// Simplified view of logic inside Agent.runAsync, callbacks, or tools
// ... previous code runs based on current state ...
// 1. Determine a change or output is needed, construct the event
// Example: Updating state
ConcurrentMap<String, Object> updateData = new ConcurrentHashMap<>();
updateData.put("field_1", "value_2");
EventActions actions = EventActions.builder().stateDelta(updateData).build();
Content eventContent = Content.builder().parts(Part.fromText("State updated.")).build();
Event eventWithStateChange = Event.builder()
    .author(self.name())
    .actions(actions)
    .content(Optional.of(eventContent))
    // ... other event fields ...
    .build();
// 2. "Yield" the event. In RxJava, this means emitting it into the stream.
// The Runner (or upstream consumer) will subscribe to this Flowable.
// When the Runner receives this event, it will process it (e.g., call sessionService.appendEvent).
// The 'appendEvent' in Java ADK mutates the 'Session' object held within 'ctx' (InvocationContext).
// <<<<<<<<<<<< CONCEPTUAL PAUSE POINT >>>>>>>>>>>>
// In RxJava, the emission of 'eventWithStateChange' happens, and then the stream
// might continue with a 'flatMap' or 'concatMap' operator that represents
// the logic *after* the Runner has processed this event.
// To model the "resume execution ONLY after Runner is done processing":
// The Runner's `appendEvent` is usually an async operation itself (returns Single).
// The agent's flow needs to be structured such that subsequent logic
// that depends on the committed state runs *after* that `appendEvent` completes.
// This is how the Runner typically orchestrates it:
// Runner:
// agent.runAsync(ctx)
// .concatMapEager(eventFromAgent ->
// sessionService.appendEvent(ctx.session(), eventFromAgent) // This updates ctx.session().state()
// .toFlowable() // Emits the event after it's processed
// )
// .subscribe(processedEvent -> { /* UI renders processedEvent */ });
// So, within the agent's own logic, if it needs to do something *after* an event it yielded
// has been processed and its state changes are reflected in ctx.session().state(),
// that subsequent logic would typically be in another step of its reactive chain.
// For this conceptual example, we'll emit the event, and then simulate the "resume"
// as a subsequent operation in the Flowable chain.
return Flowable.just(eventWithStateChange) // Step 2: Yield the event
    .concatMap(yieldedEvent -> {
      // <<<<<<<<<<<< RUNNER CONCEPTUALLY PROCESSES & COMMITS THE EVENT >>>>>>>>>>>>
      // At this point, in a real runner, sessionService.appendEvent(session, yieldedEvent)
      // would have been called by the Runner, and ctx.session().state() would be updated.
      // Since we are *inside* the agent's conceptual logic trying to model this,
      // we assume the Runner's action has implicitly updated our 'ctx.session()'.
      // 3. Resume execution.
      // Now the state committed by the Runner (via sessionService.appendEvent)
      // is reliably reflected in ctx.session().state().
      Object val = ctx.session().state().get("field_1");
      // Here `val` is guaranteed to be "value_2" because the `sessionService.appendEvent`
      // called by the Runner would have updated the session state within the `ctx` object.
      System.out.println("Resumed execution. Value of field_1 is now: " + val);
      // ... subsequent code continues ...
      // If this subsequent code needed to yield another event, it would do so here;
      // otherwise, re-emit the processed event to complete the chain.
      return Flowable.just(yieldedEvent);
    });
```
This cooperative yield/pause/resume cycle between the `Runner` and your Execution Logic, mediated by `Event` objects, forms the core of the ADK Runtime.
## Key components of the Runtime
Several components work together within the ADK Runtime to execute an agent invocation. Understanding their roles clarifies how the event loop functions:
1. ### `Runner`
- **Role:** The main entry point and orchestrator for a single user query (`run_async`).
- **Function:** Manages the overall Event Loop, receives events yielded by the Execution Logic, coordinates with Services to process and commit event actions (state/artifact changes), and forwards processed events upstream (e.g., to the UI). It essentially drives the conversation turn by turn based on yielded events. (Defined in `google.adk.runners.runner`).
1. ### Execution Logic Components
- **Role:** The parts containing your custom code and the core agent capabilities.
- **Components:**
- `Agent` (`BaseAgent`, `LlmAgent`, etc.): Your primary logic units that process information and decide on actions. They implement the `_run_async_impl` method which yields events.
- `Tools` (`BaseTool`, `FunctionTool`, `AgentTool`, etc.): External functions or capabilities used by agents (often `LlmAgent`) to interact with the outside world or perform specific tasks. They execute and return results, which are then wrapped in events.
- `Callbacks` (Functions): User-defined functions attached to agents (e.g., `before_agent_callback`, `after_model_callback`) that hook into specific points in the execution flow, potentially modifying behavior or state, whose effects are captured in events.
- **Function:** Perform the actual thinking, calculation, or external interaction. They communicate their results or needs by **yielding `Event` objects** and pausing until the Runner processes them.
1. ### `Event`
- **Role:** The message passed back and forth between the `Runner` and the Execution Logic.
- **Function:** Represents an atomic occurrence (user input, agent text, tool call/result, state change request, control signal). It carries both the content of the occurrence and the intended side effects (`actions` like `state_delta`).
1. ### `Services`
- **Role:** Backend components responsible for managing persistent or shared resources. Used primarily by the `Runner` during event processing.
- **Components:**
- `SessionService` (`BaseSessionService`, `InMemorySessionService`, etc.): Manages `Session` objects, including saving/loading them, applying `state_delta` to the session state, and appending events to the `event history`.
- `ArtifactService` (`BaseArtifactService`, `InMemoryArtifactService`, `GcsArtifactService`, etc.): Manages the storage and retrieval of binary artifact data. Although `save_artifact` is called via context during execution logic, the `artifact_delta` in the event confirms the action for the Runner/SessionService.
- `MemoryService` (`BaseMemoryService`, etc.): (Optional) Manages long-term semantic memory across sessions for a user.
- **Function:** Provide the persistence layer. The `Runner` interacts with them to ensure changes signaled by `event.actions` are reliably stored *before* the Execution Logic resumes.
1. ### `Session`
- **Role:** A data container holding the state and history for *one specific conversation* between a user and the application.
- **Function:** Stores the current `state` dictionary, the list of all past `events` (`event history`), and references to associated artifacts. It's the primary record of the interaction, managed by the `SessionService`.
1. ### `Invocation`
- **Role:** A conceptual term representing everything that happens in response to a *single* user query, from the moment the `Runner` receives it until the agent logic finishes yielding events for that query.
- **Function:** An invocation might involve multiple agent runs (if using agent transfer or `AgentTool`), multiple LLM calls, tool executions, and callback executions, all tied together by a single `invocation_id` within the `InvocationContext`. State variables prefixed with `temp:` are strictly scoped to a single invocation and discarded afterwards.
These players interact continuously through the Event Loop to process a user's request.
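The `temp:` scoping noted above can be illustrated with a small sketch. This is an assumption-laden model, not ADK's actual persistence code: here the commit step simply drops `temp:`-prefixed keys so they never reach persistent storage:

```python
# Illustrative sketch (assumption: invocation-scoped `temp:` keys are simply
# discarded at commit time and never persisted across invocations).

def commit_state_delta(persistent_state, state_delta):
    """Apply a state_delta, discarding invocation-scoped `temp:` keys."""
    for key, value in state_delta.items():
        if not key.startswith("temp:"):
            persistent_state[key] = value
    return persistent_state

saved = commit_state_delta({}, {"temp:scratch": [1, 2], "count": 1})
assert saved == {"count": 1}  # `temp:scratch` was discarded
```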
## How It Works: A Simplified Invocation
Let's trace a simplified flow for a typical user query that involves an LLM agent calling a tool:
### Step-by-Step Breakdown
1. **User Input:** The User sends a query (e.g., "What's the capital of France?").
1. **Runner Starts:** `Runner.run_async` begins. It interacts with the `SessionService` to load the relevant `Session` and adds the user query as the first `Event` to the session history. An `InvocationContext` (`ctx`) is prepared.
1. **Agent Execution:** The `Runner` calls `agent.run_async(ctx)` on the designated root agent (e.g., an `LlmAgent`).
1. **LLM Call (Example):** The `Agent_Llm` determines it needs information, perhaps by calling a tool. It prepares a request for the `LLM`. Let's assume the LLM decides to call `MyTool`.
1. **Yield FunctionCall Event:** The `Agent_Llm` receives the `FunctionCall` response from the LLM, wraps it in an `Event(author='Agent_Llm', content=Content(parts=[Part(function_call=...)]))`, and `yields` or `emits` this event.
1. **Agent Pauses:** The `Agent_Llm`'s execution pauses immediately after the `yield`.
1. **Runner Processes:** The `Runner` receives the FunctionCall event. It passes it to the `SessionService` to record it in the history. The `Runner` then yields the event upstream to the `User` (or application).
1. **Agent Resumes:** The `Runner` signals that the event is processed, and `Agent_Llm` resumes execution.
1. **Tool Execution:** The `Agent_Llm`'s internal flow now proceeds to execute the requested `MyTool`. It calls `tool.run_async(...)`.
1. **Tool Returns Result:** `MyTool` executes and returns its result (e.g., `{'result': 'Paris'}`).
1. **Yield FunctionResponse Event:** The agent (`Agent_Llm`) wraps the tool result into an `Event` containing a `FunctionResponse` part (e.g., `Event(author='Agent_Llm', content=Content(role='user', parts=[Part(function_response=...)]))`). This event might also contain `actions` if the tool modified state (`state_delta`) or saved artifacts (`artifact_delta`). The agent `yield`s this event.
1. **Agent Pauses:** `Agent_Llm` pauses again.
1. **Runner Processes:** `Runner` receives the FunctionResponse event. It passes it to `SessionService` which applies any `state_delta`/`artifact_delta` and adds the event to history. `Runner` yields the event upstream.
1. **Agent Resumes:** `Agent_Llm` resumes, now knowing the tool result and any state changes are committed.
1. **Final LLM Call (Example):** `Agent_Llm` sends the tool result back to the `LLM` to generate a natural language response.
1. **Yield Final Text Event:** `Agent_Llm` receives the final text from the `LLM`, wraps it in an `Event(author='Agent_Llm', content=Content(parts=[Part(text=...)]))`, and `yield`s it.
1. **Agent Pauses:** `Agent_Llm` pauses.
1. **Runner Processes:** `Runner` receives the final text event, passes it to `SessionService` for history, and yields it upstream to the `User`. This is typically the event for which `is_final_response()` returns true.
1. **Agent Resumes & Finishes:** `Agent_Llm` resumes. Having completed its task for this invocation, its `run_async` generator finishes.
1. **Runner Completes:** The `Runner` sees the agent's generator is exhausted and finishes its loop for this invocation.
This yield/pause/process/resume cycle ensures that state changes are consistently applied and that the execution logic always operates on the most recently committed state after yielding an event.
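The breakdown above can be compressed into a runnable simulation, with a scripted "LLM" decision and a toy tool standing in for real model calls (plain Python, not ADK classes; `my_tool` and the event dicts are illustrative):

```python
# Compact simulation of the step-by-step invocation flow above.

def my_tool(country):
    """Toy tool standing in for MyTool."""
    return {"result": "Paris"} if country == "France" else {"result": "unknown"}

def llm_agent():
    # Steps 4-5: the (scripted) LLM decides to call the tool; yield the call.
    yield {"type": "function_call", "name": "my_tool", "args": {"country": "France"}}
    # Steps 8-10: resumed after the runner recorded the call; execute the tool.
    tool_result = my_tool("France")
    # Step 11: yield the tool result as a function response event.
    yield {"type": "function_response", "response": tool_result}
    # Steps 15-16: the (scripted) final LLM turn produces the text answer.
    yield {"type": "text", "text": f"The capital of France is {tool_result['result']}."}

# Steps 2, 7, 13, 18: the runner records each event before the agent resumes.
history = [{"type": "user", "text": "What's the capital of France?"}]
for event in llm_agent():
    history.append(event)

assert history[-1]["text"] == "The capital of France is Paris."
```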
## Important Runtime Behaviors
Understanding a few key aspects of how the ADK Runtime handles state, streaming, and asynchronous operations is crucial for building predictable and efficient agents.
### State Updates & Commitment Timing
- **The Rule:** When your code (in an agent, tool, or callback) modifies the session state (e.g., `context.state['my_key'] = 'new_value'`), this change is initially recorded locally within the current `InvocationContext`. The change is only **guaranteed to be persisted** (saved by the `SessionService`) *after* the `Event` carrying the corresponding `state_delta` in its `actions` has been `yield`-ed by your code and subsequently processed by the `Runner`.
- **Implication:** Code that runs *after* resuming from a `yield` can reliably assume that the state changes signaled in the *yielded event* have been committed.
```py
# Inside agent logic (conceptual)
# 1. Modify state
ctx.session.state['status'] = 'processing'
event1 = Event(..., actions=EventActions(state_delta={'status': 'processing'}))
# 2. Yield event with the delta
yield event1
# --- PAUSE --- Runner processes event1, SessionService commits 'status' = 'processing' ---
# 3. Resume execution
# Now it's safe to rely on the committed state
current_status = ctx.session.state['status'] # Guaranteed to be 'processing'
print(f"Status after resuming: {current_status}")
```
```typescript
// Inside agent logic (conceptual)
// 1. Modify state
// In TypeScript, you modify state via the context, which tracks the change.
ctx.state.set('status', 'processing');
// The framework will automatically populate actions with the state
// delta from the context. For illustration, it's shown here.
const event1 = createEvent({
  actions: createEventActions({stateDelta: {'status': 'processing'}}),
  // ... other event fields
});
// 2. Yield event with the delta
yield event1;
// --- PAUSE --- Runner processes event1, SessionService commits 'status' = 'processing' ---
// 3. Resume execution
// Now it's safe to rely on the committed state in the session object.
const currentStatus = ctx.session.state['status']; // Guaranteed to be 'processing'
console.log(`Status after resuming: ${currentStatus}`);
```
```go
// Inside agent logic (conceptual)
func (a *Agent) RunConceptual(ctx agent.InvocationContext) iter.Seq2[*session.Event, error] {
	// The entire logic is wrapped in a function that will be returned as an iterator.
	return func(yield func(*session.Event, error) bool) {
		// ... previous code runs based on the current state from the input `ctx` ...
		// e.g., val := ctx.State().Get("field_1") might return "value_1" here.

		// 1. Determine a change or output is needed, construct the event
		updateData := map[string]interface{}{"field_1": "value_2"}
		eventWithStateChange := session.NewEvent(ctx.InvocationID())
		eventWithStateChange.Author = a.Name()
		eventWithStateChange.Actions = &session.EventActions{StateDelta: updateData}
		// ... other event fields ...

		// 2. Yield the event to the Runner for processing & commit.
		// The agent's execution continues immediately after this call.
		if !yield(eventWithStateChange, nil) {
			// If yield returns false, the consumer (the Runner) has stopped
			// listening, so we should stop producing events.
			return
		}
		// <<<<<<<<<<<< RUNNER PROCESSES & COMMITS THE EVENT >>>>>>>>>>>>
		// This happens outside the agent, after the agent's iterator has
		// produced the event.

		// 3. The agent CANNOT immediately see the state change it just yielded.
		// The state is immutable within a single `Run` invocation.
		val := ctx.State().Get("field_1")
		// `val` here is STILL "value_1" (or whatever it was at the start).
		// The updated state ("value_2") will only be available in the `ctx`
		// of the *next* `Run` invocation in a subsequent turn.
		_ = val

		// ... subsequent code continues, potentially yielding more events ...
		finalEvent := session.NewEvent(ctx.InvocationID())
		finalEvent.Author = a.Name()
		// ...
		yield(finalEvent, nil)
	}
}
```
```java
// Inside agent logic (conceptual)
// ... previous code runs based on current state ...
// 1. Prepare state modification and construct the event
ConcurrentMap<String, Object> stateChanges = new ConcurrentHashMap<>();
stateChanges.put("status", "processing");
EventActions actions = EventActions.builder().stateDelta(stateChanges).build();
Content content = Content.builder().parts(Part.fromText("Status update: processing")).build();
Event event1 = Event.builder()
    .actions(actions)
    // ...
    .build();
// 2. Yield event with the delta
return Flowable.just(event1)
    .map(emittedEvent -> {
      // --- CONCEPTUAL PAUSE & RUNNER PROCESSING ---
      // 3. Resume execution (conceptually)
      // Now it's safe to rely on the committed state.
      String currentStatus = (String) ctx.session().state().get("status");
      System.out.println("Status after resuming (inside agent logic): " + currentStatus); // Guaranteed to be 'processing'
      // The event itself (event1) is passed on.
      // If subsequent logic within this agent step produced *another* event,
      // you'd use concatMap to emit that new event.
      return emittedEvent;
    });
// ... subsequent agent logic might involve further reactive operators
// or emitting more events based on the now-updated `ctx.session().state()`.
```
### "Dirty Reads" of Session State
- **Definition:** While commitment happens *after* the yield, code running *later within the same invocation*, but *before* the state-changing event is actually yielded and processed, **can often see the local, uncommitted changes**. This is sometimes called a "dirty read".
- **Example:**
```py
# Code in before_agent_callback
callback_context.state['field_1'] = 'value_1'
# State is locally set to 'value_1', but not yet committed by Runner
# ... agent runs ...
# Code in a tool called later *within the same invocation*
# Readable (dirty read), but 'value_1' isn't guaranteed persistent yet.
val = tool_context.state['field_1'] # 'val' will likely be 'value_1' here
print(f"Dirty read value in tool: {val}")
# Assume the event carrying the state_delta={'field_1': 'value_1'}
# is yielded *after* this tool runs and is processed by the Runner.
```
```typescript
// Code in beforeAgentCallback
callbackContext.state.set('field_1', 'value_1');
// State is locally set to 'value_1', but not yet committed by Runner
// --- agent runs ... ---
// --- Code in a tool called later *within the same invocation* ---
// Readable (dirty read), but 'value_1' isn't guaranteed persistent yet.
const val = toolContext.state.get('field_1'); // 'val' will likely be 'value_1' here
console.log(`Dirty read value in tool: ${val}`);
// Assume the event carrying the state_delta={'field_1': 'value_1'}
// is yielded *after* this tool runs and is processed by the Runner.
```
```go
// Code in before_agent_callback
// The callback would modify the context's session state directly.
// This change is local to the current invocation context.
ctx.State.Set("field_1", "value_1")
// State is locally set to 'value_1', but not yet committed by Runner
// ... agent runs ...
// Code in a tool called later *within the same invocation*
// Readable (dirty read), but 'value_1' isn't guaranteed persistent yet.
val := ctx.State.Get("field_1") // 'val' will likely be 'value_1' here
fmt.Printf("Dirty read value in tool: %v\n", val)
// Assume the event carrying the state_delta={'field_1': 'value_1'}
// is yielded *after* this tool runs and is processed by the Runner.
```
```java
// Modify state - Code in BeforeAgentCallback
// AND stages this change in callbackContext.eventActions().stateDelta().
callbackContext.state().put("field_1", "value_1");
// --- agent runs ... ---
// --- Code in a tool called later *within the same invocation* ---
// Readable (dirty read), but 'value_1' isn't guaranteed persistent yet.
Object val = toolContext.state().get("field_1"); // 'val' will likely be 'value_1' here
System.out.println("Dirty read value in tool: " + val);
// Assume the event carrying the state_delta={'field_1': 'value_1'}
// is yielded *after* this tool runs and is processed by the Runner.
```
- **Implications:**
- **Benefit:** Allows different parts of your logic within a single complex step (e.g., multiple callbacks or tool calls before the next LLM turn) to coordinate using state without waiting for a full yield/commit cycle.
- **Caveat:** Relying heavily on dirty reads for critical logic can be risky. If the invocation fails *before* the event carrying the `state_delta` is yielded and processed by the `Runner`, the uncommitted state change will be lost. For critical state transitions, ensure they are associated with an event that gets successfully processed.
### Streaming vs. Non-Streaming Output (`partial=True`)
This primarily relates to how responses from the LLM are handled, especially when using streaming generation APIs.
- **Streaming:** The LLM generates its response token-by-token or in small chunks.
- The framework (often within `BaseLlmFlow`) yields multiple `Event` objects for a single conceptual response. Most of these events will have `partial=True`.
- The `Runner`, upon receiving an event with `partial=True`, typically **forwards it immediately** upstream (for UI display) but **skips processing its `actions`** (like `state_delta`).
- Eventually, the framework yields a final event for that response, marked as non-partial (`partial=False` or implicitly via `turn_complete=True`).
- The `Runner` **fully processes only this final event**, committing any associated `state_delta` or `artifact_delta`.
- **Non-Streaming:** The LLM generates the entire response at once. The framework yields a single event marked as non-partial, which the `Runner` processes fully.
- **Why it Matters:** Ensures that state changes are applied atomically and only once based on the *complete* response from the LLM, while still allowing the UI to display text progressively as it's generated.
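This commit rule can be illustrated with a minimal, self-contained model in plain Python (an illustration only, not the actual ADK `Runner`): every event is forwarded for display, but only a non-partial event has its `state_delta` applied.

```python
# Illustrative model of partial-event handling; `Event` here is a stand-in,
# not the ADK Event class.
from dataclasses import dataclass, field


@dataclass
class Event:
    text: str
    partial: bool = False
    state_delta: dict = field(default_factory=dict)


def process(events, session_state):
    """Forward every event for display, but commit deltas only from final events."""
    displayed = []
    for event in events:
        displayed.append(event.text)  # UI sees partial chunks immediately
        if not event.partial:         # actions are skipped while partial=True
            session_state.update(event.state_delta)
    return displayed


state = {}
shown = process(
    [
        Event("Hel", partial=True),
        Event("Hello", partial=True),
        Event("Hello!", state_delta={"greeted": True}),
    ],
    state,
)
# Only the final, non-partial event committed its state_delta.
```

Even though three text chunks reach the display layer, the session state is updated exactly once, from the complete response.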
## Async is Primary (`run_async`)
- **Core Design:** The ADK Runtime is fundamentally built on asynchronous patterns and libraries (like Python's `asyncio`, Java's `RxJava`, and native `Promise`s and `AsyncGenerator`s in TypeScript) to handle concurrent operations (like waiting for LLM responses or tool executions) efficiently without blocking.
- **Main Entry Point:** `Runner.run_async` is the primary method for executing agent invocations. All core runnable components (Agents, specific flows) use asynchronous methods internally.
- **Synchronous Convenience (`run`):** A synchronous `Runner.run` method exists mainly for convenience (e.g., in simple scripts or testing environments). However, internally, `Runner.run` typically just calls `Runner.run_async` and manages the async event loop execution for you.
- **Developer Experience:** We recommend designing your applications (e.g., web servers using ADK) to be asynchronous for best performance. In Python, this means using `asyncio`; in Java, leverage `RxJava`'s reactive programming model; and in TypeScript, this means building using native `Promise`s and `AsyncGenerator`s.
- **Sync Callbacks/Tools:** The ADK framework supports both asynchronous and synchronous functions for tools and callbacks.
- **Blocking I/O:** For long-running synchronous I/O operations, the framework attempts to prevent stalls. Python ADK may use `asyncio.to_thread`, while Java ADK often relies on appropriate RxJava schedulers or wrappers for blocking calls. In TypeScript, the framework simply awaits the function; if a synchronous function performs blocking I/O, it will stall the event loop. Developers should use asynchronous I/O APIs (which return a `Promise`) whenever possible.
- **CPU-Bound Work:** Purely CPU-intensive synchronous tasks will still block their execution thread in every language.
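For instance, a synchronous function that performs blocking I/O can be offloaded with `asyncio.to_thread`, the same pattern Python ADK may apply internally; the function name and delay here are stand-ins:

```python
import asyncio
import time


def fetch_report(delay: float) -> str:
    # Stand-in for blocking I/O such as a database query or HTTP request.
    time.sleep(delay)
    return "report-ready"


async def main() -> str:
    # Run the blocking call in a worker thread so the event loop stays responsive.
    return await asyncio.to_thread(fetch_report, 0.05)


result = asyncio.run(main())
```

While the worker thread sleeps, the event loop remains free to process other coroutines, which is exactly why blocking calls belong off the loop.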
Understanding these behaviors helps you write more robust ADK applications and debug issues related to state consistency, streaming updates, and asynchronous execution.
# Resume stopped agents
Supported in ADK: Python v1.14.0
An ADK agent's execution can be interrupted by various factors, including dropped network connections, power failures, or a required external system going offline. The Resume feature of ADK allows an agent workflow to pick up where it left off, avoiding the need to restart the entire workflow. In ADK Python 1.16 and higher, you can configure an ADK workflow to be resumable, so that it tracks the execution of the workflow and lets you resume it after an unexpected interruption.
This guide explains how to configure your ADK agent workflow to be resumable. If you use Custom Agents, you can update them to be resumable. For more information, see [Add resume to custom Agents](#custom-agents).
## Add resumable configuration
Enable the Resume function for an agent workflow by applying a Resumability configuration to the App object of your ADK workflow, as shown in the following code example:
```python
app = App(
    name='my_resumable_agent',
    root_agent=root_agent,
    # Set the resumability config to enable resumability.
    resumability_config=ResumabilityConfig(
        is_resumable=True,
    ),
)
```
Caution: Long Running Functions, Confirmations, Authentication
For agents that use [Long Running Functions](/adk-docs/tools-custom/function-tools/#long-run-tool), [Confirmations](/adk-docs/tools-custom/confirmation/), or [Authentication](/adk-docs/tools-custom/authentication/) requiring user input, enabling a resumable configuration changes how these features operate. For more information, see the documentation for those features.
Note: Custom Agents
Resume is not supported by default for Custom Agents. You must update the agent code for a Custom Agent to support the Resume feature. For information on modifying Custom Agents to support incremental resume functionality, see [Add resume to custom Agents](#custom-agents).
## Resume a stopped workflow
When an ADK workflow stops execution, you can resume it with a request containing the invocation ID of the workflow instance, which can be found in the [Event](/adk-docs/events/#understanding-and-using-events) history of the workflow. Make sure the ADK API server is running, restarting it if it was interrupted or powered off, and then resume the workflow, as shown in the following API request example.
```console
# restart the API server if needed:
adk api_server my_resumable_agent/
# resume the agent:
curl -X POST http://localhost:8000/run_sse \
  -H "Content-Type: application/json" \
  -d '{
    "app_name": "my_resumable_agent",
    "user_id": "u_123",
    "session_id": "s_abc",
    "invocation_id": "invocation-123"
  }'
```
You can also resume a workflow using the Runner object's `run_async` method, as shown below:
```python
# Resume the stopped invocation by passing its invocation_id.
# When new_message is set to a function response,
# we are trying to resume a long running function.
async for event in runner.run_async(
    user_id='u_123',
    session_id='s_abc',
    invocation_id='invocation-123',
):
    print(event)
Note
Resuming a workflow from the ADK Web user interface or using the ADK command line (CLI) tool is not currently supported.
## How it works
The Resume feature works by logging completed agent workflow tasks, including incremental steps, using [Events](/adk-docs/events/) and [Event Actions](/adk-docs/events/#detecting-actions-and-side-effects), and by tracking completion of agent tasks within a resumable workflow. If a workflow is interrupted and later restarted, the system resumes the workflow by restoring the completion state of each agent. If an agent did not complete, the workflow system reinstates any completed Events for that agent and restarts the workflow from the partially completed state. For multi-agent workflows, the specific resume behavior varies based on the multi-agent classes in your workflow, as described below:
- **Sequential Agent**: Reads the `current_sub_agent` value from its saved state to find the next sub-agent to run in the sequence.
- **Loop Agent**: Uses the `current_sub_agent` and `times_looped` values to continue the loop from the last completed iteration and sub-agent.
- **Parallel Agent**: Determines which sub-agents have already completed and only runs those that have not finished.
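The sequential case can be modeled in a few lines of plain Python (an illustration of the rule above, not ADK internals): sub-agents before the saved `current_sub_agent` checkpoint are skipped, and everything from the checkpoint onward runs again.

```python
# Illustrative model of sequential resume; the state-key name mirrors the
# documented `current_sub_agent` value.
def resume_sequential(sub_agents: list, saved_state: dict) -> list:
    """Return the sub-agents that still need to run after a resume."""
    current = saved_state.get("current_sub_agent")
    start = sub_agents.index(current) if current in sub_agents else 0
    return sub_agents[start:]


# A workflow interrupted while "review" was running resumes from "review".
remaining = resume_sequential(
    ["draft", "review", "publish"],
    {"current_sub_agent": "review"},
)
```

Note that the checkpointed sub-agent itself reruns, which is consistent with the at-least-once behavior described below.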
Event logging includes results from Tools that successfully returned a result. For example, if an agent successfully executed function tools A and B and then failed during execution of tool C, the system reinstates the results from tools A and B and resumes the workflow by re-running tool C.
Caution: Tool execution behavior
When resuming a workflow with Tools, the Resume feature ensures that the Tools in an agent are run ***at least once***, and may run more than once when resuming a workflow. If your agent uses Tools where duplicate runs would have a negative impact, such as purchases, you should modify the Tool to check for and prevent duplicate runs.
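One way to add such a guard is to record completion in state before returning, so a re-run after resume becomes a no-op. This sketch uses a plain dict and hypothetical names; adapt it to your tool's actual state access (for example, `tool_context.state`):

```python
# Hypothetical idempotency guard for a tool with side effects; the state-key
# naming scheme and tool signature are illustrative, not part of the ADK API.
def purchase_tool(item_id: str, state: dict) -> dict:
    done_key = f"purchase_done:{item_id}"
    if state.get(done_key):
        # A resumed run re-invoked the tool; skip the duplicate side effect.
        return {"status": "skipped", "reason": "already purchased"}
    # ... perform the real purchase here ...
    state[done_key] = True
    return {"status": "purchased", "item": item_id}


state = {}
first = purchase_tool("sku-42", state)
second = purchase_tool("sku-42", state)  # simulates the re-run after a resume
```

For this guard to be reliable, the state write must be committed with the same event that records the tool result; otherwise a crash between the side effect and the commit can still cause a duplicate run.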
Note: Workflow modification with Resume not supported
Do not modify a stopped agent workflow before resuming it. For example, adding agents to or removing agents from a stopped workflow and then resuming it is not supported.
## Add resume to custom Agents
Custom agents have specific implementation requirements to support resumability. You must decide on and define workflow steps within your custom agent, each producing a result that can be preserved before handing off to the next step of processing. The following steps outline how to modify a Custom Agent to support workflow Resume.
- **Create a custom agent state class**: Extend `BaseAgentState` to create an object that preserves the state of your agent.
- **Optionally, create a `WorkflowStep` class**: If your custom agent has sequential steps, consider creating a `WorkflowStep` enumeration that defines the discrete, savable steps of the agent.
- **Add initial agent state:** Modify your agent's async run function to set the initial state of your agent.
- **Add agent state checkpoints**: Modify your agent's async run function to generate and save the agent state for each completed step of the agent's overall task.
- **Add end-of-agent status to track agent state:** Modify your agent's async run function to yield an `end_of_agent=True` status upon successful completion of the agent's full task.
The following example shows the required code modifications to the example StoryFlowAgent class shown in the [Custom Agents](/adk-docs/agents/custom-agents/#full-code-example) guide:
```python
class WorkflowStep(int, Enum):
    INITIAL_STORY_GENERATION = 1
    CRITIC_REVISER_LOOP = 2
    POST_PROCESSING = 3
    CONDITIONAL_REGENERATION = 4


# Extend BaseAgentState
class StoryFlowAgentState(BaseAgentState):
    step: WorkflowStep


@override
async def _run_async_impl(
    self, ctx: InvocationContext
) -> AsyncGenerator[Event, None]:
    """
    Implements the custom orchestration logic for the story workflow.
    Uses the instance attributes assigned by Pydantic (e.g., self.story_generator).
    """
    agent_state = self._load_agent_state(ctx, WorkflowStep)
    if agent_state is None:
        # Record the start of the agent
        agent_state = StoryFlowAgentState(step=WorkflowStep.INITIAL_STORY_GENERATION)
        yield self._create_agent_state_event(ctx, agent_state)
    next_step = agent_state.step

    logger.info(f"[{self.name}] Starting story generation workflow.")

    # Step 1. Initial Story Generation
    if next_step <= WorkflowStep.INITIAL_STORY_GENERATION:
        logger.info(f"[{self.name}] Running StoryGenerator...")
        async for event in self.story_generator.run_async(ctx):
            yield event

        # Check if story was generated before proceeding
        if "current_story" not in ctx.session.state or not ctx.session.state[
            "current_story"
        ]:
            return  # Stop processing if initial story failed

        agent_state = StoryFlowAgentState(step=WorkflowStep.CRITIC_REVISER_LOOP)
        yield self._create_agent_state_event(ctx, agent_state)

    # Step 2. Critic-Reviser Loop
    if next_step <= WorkflowStep.CRITIC_REVISER_LOOP:
        logger.info(f"[{self.name}] Running CriticReviserLoop...")
        async for event in self.loop_agent.run_async(ctx):
            logger.info(
                f"[{self.name}] Event from CriticReviserLoop: "
                f"{event.model_dump_json(indent=2, exclude_none=True)}"
            )
            yield event

        agent_state = StoryFlowAgentState(step=WorkflowStep.POST_PROCESSING)
        yield self._create_agent_state_event(ctx, agent_state)

    # Step 3. Sequential Post-Processing (Grammar and Tone Check)
    if next_step <= WorkflowStep.POST_PROCESSING:
        logger.info(f"[{self.name}] Running PostProcessing...")
        async for event in self.sequential_agent.run_async(ctx):
            logger.info(
                f"[{self.name}] Event from PostProcessing: "
                f"{event.model_dump_json(indent=2, exclude_none=True)}"
            )
            yield event

        agent_state = StoryFlowAgentState(step=WorkflowStep.CONDITIONAL_REGENERATION)
        yield self._create_agent_state_event(ctx, agent_state)

    # Step 4. Tone-Based Conditional Logic
    if next_step <= WorkflowStep.CONDITIONAL_REGENERATION:
        tone_check_result = ctx.session.state.get("tone_check_result")
        if tone_check_result == "negative":
            logger.info(f"[{self.name}] Tone is negative. Regenerating story...")
            async for event in self.story_generator.run_async(ctx):
                logger.info(
                    f"[{self.name}] Event from StoryGenerator (Regen): "
                    f"{event.model_dump_json(indent=2, exclude_none=True)}"
                )
                yield event
        else:
            logger.info(f"[{self.name}] Tone is not negative. Keeping current story.")

    logger.info(f"[{self.name}] Workflow finished.")
    yield self._create_agent_state_event(ctx, end_of_agent=True)
```
# Runtime Configuration
Supported in ADK: Python v0.1.0, TypeScript v0.2.0, Go v0.1.0, Java v0.1.0
`RunConfig` defines runtime behavior and options for agents in ADK. It controls speech and streaming settings, function calling, artifact saving, and limits on LLM calls.
When constructing an agent run, you can pass a `RunConfig` to customize how the agent interacts with models, handles audio, and streams responses. By default, no streaming is enabled and inputs aren’t retained as artifacts. Use `RunConfig` to override these defaults.
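For example, in Python a `RunConfig` is typically passed to the runner when starting a run; treat the exact import path and parameter names as version-dependent, and note that `runner`, the session IDs, and `new_message` are assumed to exist:

```python
from google.adk.agents.run_config import RunConfig, StreamingMode

run_config = RunConfig(streaming_mode=StreamingMode.SSE, max_llm_calls=100)

# Assumes an existing `runner`, session, and `new_message` Content object.
async for event in runner.run_async(
    user_id="u_123",
    session_id="s_abc",
    new_message=new_message,
    run_config=run_config,
):
    print(event)
```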
## Class Definition
The `RunConfig` class holds configuration parameters for an agent's runtime behavior.
- Python ADK uses Pydantic for this validation.
- Go ADK has mutable structs by default.
- Java ADK typically uses immutable data classes.
- TypeScript ADK uses a standard interface, with type safety provided by the TypeScript compiler.
```python
class RunConfig(BaseModel):
    """Configs for runtime behavior of agents."""

    model_config = ConfigDict(
        extra='forbid',
    )

    speech_config: Optional[types.SpeechConfig] = None
    response_modalities: Optional[list[str]] = None
    save_input_blobs_as_artifacts: bool = False
    support_cfc: bool = False
    streaming_mode: StreamingMode = StreamingMode.NONE
    output_audio_transcription: Optional[types.AudioTranscriptionConfig] = None
    max_llm_calls: int = 500
```
```typescript
export interface RunConfig {
  speechConfig?: SpeechConfig;
  responseModalities?: Modality[];
  saveInputBlobsAsArtifacts: boolean;
  supportCfc: boolean;
  streamingMode: StreamingMode;
  outputAudioTranscription?: AudioTranscriptionConfig;
  maxLlmCalls: number;
  // ... and other properties
}

export enum StreamingMode {
  NONE = 'none',
  SSE = 'sse',
  BIDI = 'bidi',
}
```
```go
type StreamingMode string

const (
    StreamingModeNone StreamingMode = "none"
    StreamingModeSSE  StreamingMode = "sse"
)

// RunConfig controls runtime behavior.
type RunConfig struct {
    // Streaming mode, None or StreamingMode.SSE.
    StreamingMode StreamingMode
    // Whether or not to save the input blobs as artifacts
    SaveInputBlobsAsArtifacts bool
}
```
```java
public abstract class RunConfig {

  public enum StreamingMode {
    NONE,
    SSE,
    BIDI
  }

  public abstract @Nullable SpeechConfig speechConfig();

  public abstract ImmutableList<Modality> responseModalities();

  public abstract boolean saveInputBlobsAsArtifacts();

  public abstract @Nullable AudioTranscriptionConfig outputAudioTranscription();

  public abstract int maxLlmCalls();

  // ...
}
```
## Runtime Parameters
| Parameter | Python Type | TypeScript Type | Go Type | Java Type | Default (Py / TS / Go / Java) | Description |
| ------------------------------- | ------------------------------------------ | ------------------------------------- | --------------- | ----------------------------------------------------- | ---------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `speech_config` | `Optional[types.SpeechConfig]` | `SpeechConfig` (optional) | N/A | `SpeechConfig` (nullable via `@Nullable`) | `None` / `undefined`/ N/A / `null` | Configures speech synthesis (voice, language) using the `SpeechConfig` type. |
| `response_modalities` | `Optional[list[str]]` | `Modality[]` (optional) | N/A | `ImmutableList` | `None` / `undefined` / N/A / Empty `ImmutableList` | List of desired output modalities (e.g., Python: `["TEXT", "AUDIO"]`; Java/TS: uses structured `Modality` objects). |
| `save_input_blobs_as_artifacts` | `bool` | `boolean` | `bool` | `boolean` | `False` / `false` / `false` / `false` | If `true`, saves input blobs (e.g., uploaded files) as run artifacts for debugging/auditing. |
| `streaming_mode` | `StreamingMode` | `StreamingMode` | `StreamingMode` | `StreamingMode` | `StreamingMode.NONE` / `StreamingMode.NONE` / `agent.StreamingModeNone` / `StreamingMode.NONE` | Sets the streaming behavior: `NONE` (default), `SSE` (server-sent events), or `BIDI` (bidirectional). |
| `output_audio_transcription` | `Optional[types.AudioTranscriptionConfig]` | `AudioTranscriptionConfig` (optional) | N/A | `AudioTranscriptionConfig` (nullable via `@Nullable`) | `None` / `undefined` / N/A / `null` | Configures transcription of generated audio output using the `AudioTranscriptionConfig` type. |
| `max_llm_calls` | `int` | `number` | N/A | `int` | `500` / `500` / N/A / `500` | Limits total LLM calls per run. `0` or negative means unlimited. Exceeding language limits (e.g. `sys.maxsize`, `Number.MAX_SAFE_INTEGER`) raises an error. |
| `support_cfc` | `bool` | `boolean` | N/A | `bool` | `False` / `false` / N/A / `false` | **Python/TypeScript:** Enables Compositional Function Calling. Requires `streaming_mode=SSE` and uses the LIVE API. **Experimental.** |
### `speech_config`
Supported in ADK: Python v0.1.0, Java v0.1.0
Note
The interface or definition of `SpeechConfig` is the same, irrespective of the language.
Speech configuration settings for live agents with audio capabilities. The `SpeechConfig` class has the following structure:
```python
class SpeechConfig(_common.BaseModel):
    """The speech generation configuration."""

    voice_config: Optional[VoiceConfig] = Field(
        default=None,
        description="""The configuration for the speaker to use.""",
    )
    language_code: Optional[str] = Field(
        default=None,
        description="""Language code (ISO 639. e.g. en-US) for the speech synthesization.
        Only available for Live API.""",
    )
```
The `voice_config` parameter uses the `VoiceConfig` class:
```python
class VoiceConfig(_common.BaseModel):
    """The configuration for the voice to use."""

    prebuilt_voice_config: Optional[PrebuiltVoiceConfig] = Field(
        default=None,
        description="""The configuration for the speaker to use.""",
    )
```
And `PrebuiltVoiceConfig` has the following structure:
```python
class PrebuiltVoiceConfig(_common.BaseModel):
    """The configuration for the prebuilt speaker to use."""

    voice_name: Optional[str] = Field(
        default=None,
        description="""The name of the prebuilt voice to use.""",
    )
```
These nested configuration classes allow you to specify:
- `voice_config`: The voice to use, specified by the prebuilt voice name in the nested `PrebuiltVoiceConfig`
- `language_code`: ISO 639 language code (e.g., "en-US") for speech synthesis
When implementing voice-enabled agents, configure these parameters to control how your agent sounds when speaking.
### `response_modalities`
Supported in ADK: Python v0.1.0, Java v0.1.0
Defines the output modalities for the agent. If not set, defaults to `AUDIO`. Response modalities determine how the agent communicates with users through various channels (e.g., text, audio).
### `save_input_blobs_as_artifacts`
Supported in ADK: Python v0.1.0, Go v0.1.0, Java v0.1.0
When enabled, input blobs will be saved as artifacts during agent execution. This is useful for debugging and audit purposes, allowing developers to review the exact data received by agents.
### `support_cfc`
Supported in ADK: Python v0.1.0. Experimental.
Enables Compositional Function Calling (CFC) support. Only applicable when using `StreamingMode.SSE`. When enabled, the Live API is invoked, since it is the only API that supports CFC functionality.
Experimental release
The `support_cfc` feature is experimental and its API or behavior might change in future releases.
### `streaming_mode`
Supported in ADK: Python v0.1.0, Go v0.1.0
Configures the streaming behavior of the agent. Possible values:
- `StreamingMode.NONE`: No streaming; responses delivered as complete units
- `StreamingMode.SSE`: Server-Sent Events streaming; one-way streaming from server to client
- `StreamingMode.BIDI`: Bidirectional streaming; simultaneous communication in both directions
Streaming modes affect both performance and user experience. SSE streaming lets users see partial responses as they're generated, while BIDI streaming enables real-time interactive experiences.
### `output_audio_transcription`
Supported in ADK: Python v0.1.0, Java v0.1.0
Configuration for transcribing audio outputs from live agents with audio response capability. This enables automatic transcription of audio responses for accessibility, record-keeping, and multi-modal applications.
### `max_llm_calls`
Supported in ADK: Python v0.1.0, Java v0.1.0
Sets a limit on the total number of LLM calls for a given agent run.
- Values greater than 0 and less than `sys.maxsize`: Enforces a bound on LLM calls
- Values less than or equal to 0: Allows unbounded LLM calls *(not recommended for production)*
This parameter prevents excessive API usage and potential runaway processes. Since LLM calls often incur costs and consume resources, setting appropriate limits is crucial.
## Validation Rules
Supported in ADK: Python v0.1.0, TypeScript v0.2.0, Go v0.1.0, Java v0.1.0
The `RunConfig` class validates its parameters to ensure proper agent operation. While Python ADK uses `Pydantic` for automatic type validation, Java and TypeScript ADK rely on their static type systems and may include explicit checks in the `RunConfig`'s constructor. For the `max_llm_calls` parameter specifically:
1. Extremely large values (like `sys.maxsize` in Python, `Integer.MAX_VALUE` in Java, or `Number.MAX_SAFE_INTEGER` in TypeScript) are typically disallowed to prevent issues.
1. Values of zero or less will usually trigger a warning about unlimited LLM interactions.
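The two rules can be mirrored in a small standalone validator. This is a model of the documented behavior for illustration, not ADK's actual implementation:

```python
import sys
import warnings


def validate_max_llm_calls(value: int) -> int:
    """Model of the documented max_llm_calls validation rules."""
    if value >= sys.maxsize:
        # Rule 1: extremely large values are disallowed.
        raise ValueError("max_llm_calls must be smaller than sys.maxsize")
    if value <= 0:
        # Rule 2: zero or less means unlimited LLM interactions, which
        # earns a warning rather than an error.
        warnings.warn("max_llm_calls <= 0 allows unbounded LLM calls")
    return value


checked = validate_max_llm_calls(100)
```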
### Basic runtime configuration
```python
from google.adk.agents.run_config import RunConfig, StreamingMode

config = RunConfig(
    streaming_mode=StreamingMode.NONE,
    max_llm_calls=100
)
```
```typescript
import { RunConfig, StreamingMode } from '@google/adk';
const config: RunConfig = {
  streamingMode: StreamingMode.NONE,
  maxLlmCalls: 100,
};
```
```go
import "google.golang.org/adk/agent"
config := agent.RunConfig{
    StreamingMode: agent.StreamingModeNone,
}
```
```java
import com.google.adk.agents.RunConfig;
import com.google.adk.agents.RunConfig.StreamingMode;
RunConfig config = RunConfig.builder()
    .setStreamingMode(StreamingMode.NONE)
    .setMaxLlmCalls(100)
    .build();
```
This configuration creates a non-streaming agent with a limit of 100 LLM calls, suitable for simple task-oriented agents where complete responses are preferable.
### Enabling streaming
```python
from google.adk.agents.run_config import RunConfig, StreamingMode

config = RunConfig(
    streaming_mode=StreamingMode.SSE,
    max_llm_calls=200
)
```
```typescript
import { RunConfig, StreamingMode } from '@google/adk';
const config: RunConfig = {
  streamingMode: StreamingMode.SSE,
  maxLlmCalls: 200,
};
```
```go
import "google.golang.org/adk/agent"
config := agent.RunConfig{
    StreamingMode: agent.StreamingModeSSE,
}
```
```java
import com.google.adk.agents.RunConfig;
import com.google.adk.agents.RunConfig.StreamingMode;
RunConfig config = RunConfig.builder()
    .setStreamingMode(StreamingMode.SSE)
    .setMaxLlmCalls(200)
    .build();
```
Using SSE streaming allows users to see responses as they're generated, providing a more responsive feel for chatbots and assistants.
### Enabling speech support
```python
from google.adk.agents.run_config import RunConfig, StreamingMode
from google.genai import types

config = RunConfig(
    speech_config=types.SpeechConfig(
        language_code="en-US",
        voice_config=types.VoiceConfig(
            prebuilt_voice_config=types.PrebuiltVoiceConfig(
                voice_name="Kore"
            )
        ),
    ),
    response_modalities=["AUDIO", "TEXT"],
    save_input_blobs_as_artifacts=True,
    support_cfc=True,
    streaming_mode=StreamingMode.SSE,
    max_llm_calls=1000,
)
```
```typescript
import { RunConfig, StreamingMode } from '@google/adk';
const config: RunConfig = {
  speechConfig: {
    languageCode: "en-US",
    voiceConfig: {
      prebuiltVoiceConfig: {
        voiceName: "Kore"
      }
    },
  },
  responseModalities: [
    { modality: "AUDIO" },
    { modality: "TEXT" }
  ],
  saveInputBlobsAsArtifacts: true,
  supportCfc: true,
  streamingMode: StreamingMode.SSE,
  maxLlmCalls: 1000,
};
```
```java
import com.google.adk.agents.RunConfig;
import com.google.adk.agents.RunConfig.StreamingMode;
import com.google.common.collect.ImmutableList;
import com.google.genai.types.Content;
import com.google.genai.types.Modality;
import com.google.genai.types.Part;
import com.google.genai.types.PrebuiltVoiceConfig;
import com.google.genai.types.SpeechConfig;
import com.google.genai.types.VoiceConfig;
RunConfig runConfig =
    RunConfig.builder()
        .setStreamingMode(StreamingMode.SSE)
        .setMaxLlmCalls(1000)
        .setSaveInputBlobsAsArtifacts(true)
        .setResponseModalities(ImmutableList.of(new Modality("AUDIO"), new Modality("TEXT")))
        .setSpeechConfig(
            SpeechConfig.builder()
                .voiceConfig(
                    VoiceConfig.builder()
                        .prebuiltVoiceConfig(
                            PrebuiltVoiceConfig.builder().voiceName("Kore").build())
                        .build())
                .languageCode("en-US")
                .build())
        .build();
```
This comprehensive example configures an agent with:
- Speech capabilities using the "Kore" voice (US English)
- Both audio and text output modalities
- Artifact saving for input blobs (useful for debugging)
- Experimental CFC support enabled **(Python and TypeScript)**
- SSE streaming for responsive interaction
- A limit of 1000 LLM calls
### Enabling CFC Support
Supported in ADK: Python v0.1.0, TypeScript v0.2.0. Experimental.
```python
from google.adk.agents.run_config import RunConfig, StreamingMode

config = RunConfig(
    streaming_mode=StreamingMode.SSE,
    support_cfc=True,
    max_llm_calls=150
)
```
```typescript
import { RunConfig, StreamingMode } from '@google/adk';
const config: RunConfig = {
  streamingMode: StreamingMode.SSE,
  supportCfc: true,
  maxLlmCalls: 150,
};
```
Enabling Compositional Function Calling (CFC) creates an agent that can dynamically execute functions based on model outputs, which is powerful for applications requiring complex workflows.
Experimental release
The Compositional Function Calling (CFC) streaming feature is an experimental release.
# Use the Web Interface
Supported in ADK: Python v0.1.0, TypeScript v0.2.0, Go v0.1.0, Java v0.1.0
The ADK web interface lets you test your agents directly in the browser. This tool provides a simple way to interactively develop and debug your agents.
Caution: ADK Web for development only
ADK Web is ***not meant for use in production deployments***. You should use ADK Web for development and debugging purposes only.
## Start the web interface
Use the following command to run your agent in the ADK web interface:
```shell
adk web
```
```shell
npx adk web
```
```shell
go run agent.go web api webui
```
With Maven, compile and run the ADK web server, updating the port number if needed:
```console
mvn compile exec:java \
-Dexec.args="--adk.agents.source-dir=src/main/java/agents --server.port=8080"
```
With Gradle, the `build.gradle` or `build.gradle.kts` build file should have the following Java plugin in its plugins section:
```groovy
plugins {
    id('java')
    // other plugins
}
```
Then, elsewhere in the build file, at the top-level, create a new task:
```groovy
tasks.register('runADKWebServer', JavaExec) {
    dependsOn classes
    classpath = sourceSets.main.runtimeClasspath
    mainClass = 'com.google.adk.web.AdkWebServer'
    args '--adk.agents.source-dir=src/main/java/agents', '--server.port=8080'
}
```
Finally, on the command-line, run the following command:
```console
gradle runADKWebServer
```
In Java, the Web Interface and the API server are bundled together.
The server starts on `http://localhost:8000` by default:
```shell
+-----------------------------------------------------------------------------+
| ADK Web Server started |
| |
| For local testing, access at http://localhost:8000. |
+-----------------------------------------------------------------------------+
```
## Features
Key features of the ADK web interface include:
- **Chat interface**: Send messages to your agents and view responses in real-time
- **Session management**: Create and switch between sessions
- **State inspection**: View and modify session state during development
- **Event history**: Inspect all events generated during agent execution
## Common options
| Option | Description | Default |
| ------------------------ | ---------------------------------- | ---------------------- |
| `--port` | Port to run the server on | `8000` |
| `--host` | Host binding address | `127.0.0.1` |
| `--session_service_uri` | Custom session storage URI | In-memory |
| `--artifact_service_uri` | Custom artifact storage URI | Local `.adk/artifacts` |
| `--reload/--no-reload` | Enable auto-reload on code changes | `true` |
### Example with options
```shell
adk web --port 3000 --session_service_uri "sqlite:///sessions.db"
```
# Deploying Your Agent
Once you've built and tested your agent using ADK, the next step is to deploy it so it can be accessed, queried, and used in production or integrated with other applications. Deployment moves your agent from your local development machine to a scalable and reliable environment.
## Deployment Options
Your ADK agent can be deployed to a range of different environments based on your needs for production readiness or custom flexibility:
### Agent Engine in Vertex AI
[Agent Engine](https://google.github.io/adk-docs/deploy/agent-engine/index.md) is a fully managed auto-scaling service on Google Cloud specifically designed for deploying, managing, and scaling AI agents built with frameworks such as ADK.
Learn more about [deploying your agent to Vertex AI Agent Engine](https://google.github.io/adk-docs/deploy/agent-engine/index.md).
### Cloud Run
[Cloud Run](https://cloud.google.com/run) is a managed auto-scaling compute platform on Google Cloud that enables you to run your agent as a container-based application.
Learn more about [deploying your agent to Cloud Run](https://google.github.io/adk-docs/deploy/cloud-run/index.md).
### Google Kubernetes Engine (GKE)
[Google Kubernetes Engine (GKE)](https://cloud.google.com/kubernetes-engine) is a managed Kubernetes service of Google Cloud that allows you to run your agent in a containerized environment. GKE is a good option if you need more control over the deployment as well as for running Open Models.
Learn more about [deploying your agent to GKE](https://google.github.io/adk-docs/deploy/gke/index.md).
### Other Container-friendly Infrastructure
You can manually package your agent into a container image and then run it in any environment that supports container images. For example, you can run it locally in Docker or Podman. This is a good option if you prefer to run offline or disconnected, or in a system that has no connection to Google Cloud.
Follow the instructions for [deploying your agent to Cloud Run](https://google.github.io/adk-docs/deploy/cloud-run/#deployment-commands). In the "Deployment Commands" section for gcloud CLI, you will find an example FastAPI entry point and Dockerfile.
# Deploy to Cloud Run
Supported in ADK: Python, Go, Java
[Cloud Run](https://cloud.google.com/run) is a fully managed platform that enables you to run your code directly on top of Google's scalable infrastructure.
To deploy your agent, you can use either the `adk deploy cloud_run` command *(recommended for Python)* or the `gcloud run deploy` command through Cloud Run.
## Agent sample
For each of the commands, we will reference the `Capital Agent` sample defined on the [LLM agent](https://google.github.io/adk-docs/agents/llm-agents/index.md) page. We will assume it is in a directory (e.g., `capital_agent`).
To proceed, confirm that your agent code is configured as follows:
1. Agent code is in a file called `agent.py` within your agent directory.
1. Your agent variable is named `root_agent`.
1. `__init__.py` is within your agent directory and contains `from . import agent`.
1. Your `requirements.txt` file is present in the agent directory.
1. Your application's entry point (the `main` package and `main()` function) is in a single Go file. Using `main.go` is a strong convention.
1. Your agent instance is passed to a launcher configuration, typically using `agent.NewSingleLoader(yourAgent)`. The `adkgo` tool uses this launcher to start your agent with the correct services.
1. Your `go.mod` and `go.sum` files are present in your project directory to manage dependencies.
Refer to the following section for more details. You can also find a [sample app](https://github.com/google/adk-docs/tree/main/examples/go/cloud-run) in the GitHub repo.
1. Agent code is in a file called `CapitalAgent.java` within your agent directory.
1. Your agent variable is global and follows the format `public static final BaseAgent ROOT_AGENT`.
1. Your agent definition is present in a static class method.
Refer to the following section for more details. You can also find a [sample app](https://github.com/google/adk-docs/tree/main/examples/java/cloud-run) in the GitHub repo.
## Environment variables
Set your environment variables as described in the [Setup and Installation](https://google.github.io/adk-docs/get-started/installation/index.md) guide.
```bash
export GOOGLE_CLOUD_PROJECT=your-project-id
export GOOGLE_CLOUD_LOCATION=us-central1 # Or your preferred location
export GOOGLE_GENAI_USE_VERTEXAI=True
```
*(Replace `your-project-id` with your actual GCP project ID)*
Alternatively, you can use an API key from AI Studio:
```bash
export GOOGLE_CLOUD_PROJECT=your-project-id
export GOOGLE_CLOUD_LOCATION=us-central1 # Or your preferred location
export GOOGLE_GENAI_USE_VERTEXAI=FALSE
export GOOGLE_API_KEY=your-api-key
```
*(Replace `your-project-id` with your actual GCP project ID and `your-api-key` with your actual API key from AI Studio)*
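The choice between Vertex AI and an AI Studio API key is driven by these variables. A minimal, illustrative sketch of the kind of check involved (the `use_vertexai` helper is hypothetical, not ADK's actual implementation):

```python
import os

def use_vertexai() -> bool:
    # Hypothetical helper: interpret GOOGLE_GENAI_USE_VERTEXAI as a boolean flag.
    return os.environ.get("GOOGLE_GENAI_USE_VERTEXAI", "").lower() in ("true", "1")

# Mirror the AI Studio configuration shown above.
os.environ["GOOGLE_GENAI_USE_VERTEXAI"] = "FALSE"
os.environ["GOOGLE_API_KEY"] = "your-api-key"

if use_vertexai():
    print("Using Vertex AI with project:", os.environ.get("GOOGLE_CLOUD_PROJECT"))
else:
    print("Using AI Studio API key present:", bool(os.environ.get("GOOGLE_API_KEY")))
```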
## Prerequisites
1. You should have a Google Cloud project. You need to know your:
1. Project name (e.g., "my-project")
1. Project location (e.g., "us-central1")
1. Service account (e.g., "1234567890-compute@developer.gserviceaccount.com")
1. `GOOGLE_API_KEY`
## Secret
Make sure you have created a secret that your service account can read.
### Entry for GOOGLE_API_KEY secret
You can create the secret manually or with the CLI:
```bash
echo "<>" | gcloud secrets create GOOGLE_API_KEY --project=my-project --data-file=-
```
### Permissions to read
Grant your service account permission to read this secret.
```bash
gcloud secrets add-iam-policy-binding GOOGLE_API_KEY --member="serviceAccount:1234567890-compute@developer.gserviceaccount.com" --role="roles/secretmanager.secretAccessor" --project=my-project
```
## Deployment payload
When you deploy your ADK agent workflow to Google Cloud Run, the following content is uploaded to the service:
- Your ADK agent code
- Any dependencies declared in your ADK agent code
- ADK API server code version used by your agent
The default deployment *does not* include the ADK web user interface libraries unless you specify a deployment setting such as the `--with_ui` option of the `adk deploy cloud_run` command.
## Deployment commands
### adk CLI
The `adk deploy cloud_run` command deploys your agent code to Google Cloud Run.
Ensure you have authenticated with Google Cloud (`gcloud auth login` and `gcloud config set project `).
#### Setup environment variables
Optional but recommended: Setting environment variables can make the deployment commands cleaner.
```bash
# Set your Google Cloud Project ID
export GOOGLE_CLOUD_PROJECT="your-gcp-project-id"
# Set your desired Google Cloud Location
export GOOGLE_CLOUD_LOCATION="us-central1" # Example location
# Set the path to your agent code directory
export AGENT_PATH="./capital_agent" # Assuming capital_agent is in the current directory
# Set a name for your Cloud Run service (optional)
export SERVICE_NAME="capital-agent-service"
# Set an application name (optional)
export APP_NAME="capital_agent_app"
```
#### Command usage
##### Minimal command
```bash
adk deploy cloud_run \
--project=$GOOGLE_CLOUD_PROJECT \
--region=$GOOGLE_CLOUD_LOCATION \
$AGENT_PATH
```
##### Full command with optional flags
```bash
adk deploy cloud_run \
--project=$GOOGLE_CLOUD_PROJECT \
--region=$GOOGLE_CLOUD_LOCATION \
--service_name=$SERVICE_NAME \
--app_name=$APP_NAME \
--with_ui \
$AGENT_PATH
```
##### Arguments
- `AGENT_PATH`: (Required) Positional argument specifying the path to the directory containing your agent's source code (e.g., `$AGENT_PATH` in the examples, or `capital_agent/`). This directory must contain at least an `__init__.py` and your main agent file (e.g., `agent.py`).
##### Options
- `--project TEXT`: (Required) Your Google Cloud project ID (e.g., `$GOOGLE_CLOUD_PROJECT`).
- `--region TEXT`: (Required) The Google Cloud location for deployment (e.g., `$GOOGLE_CLOUD_LOCATION`, `us-central1`).
- `--service_name TEXT`: (Optional) The name for the Cloud Run service (e.g., `$SERVICE_NAME`). Defaults to `adk-default-service-name`.
- `--app_name TEXT`: (Optional) The application name for the ADK API server (e.g., `$APP_NAME`). Defaults to the name of the directory specified by `AGENT_PATH` (e.g., `capital_agent` if `AGENT_PATH` is `./capital_agent`).
- `--agent_engine_id TEXT`: (Optional) If you are using a managed session service via Vertex AI Agent Engine, provide its resource ID here.
- `--port INTEGER`: (Optional) The port number the ADK API server will listen on within the container. Defaults to 8000.
- `--with_ui`: (Optional) If included, deploys the ADK dev UI alongside the agent API server. By default, only the API server is deployed.
- `--temp_folder TEXT`: (Optional) Specifies a directory for storing intermediate files generated during the deployment process. Defaults to a timestamped folder in the system's temporary directory. *(Note: This option is generally not needed unless troubleshooting issues).*
- `--help`: Show the help message and exit.
##### Passing gcloud CLI Arguments
To pass specific gcloud flags through the `adk deploy cloud_run` command, use the double-dash separator (`--`) after the ADK arguments. Any flags following the `--` (other than those ADK manages) are passed directly to the underlying gcloud command.
###### Syntax Example:
```bash
adk deploy cloud_run [ADK_FLAGS] -- [GCLOUD_FLAGS]
```
###### Example:
```bash
adk deploy cloud_run --project=[PROJECT_ID] --region=[REGION] path/to/my_agent -- --no-allow-unauthenticated --min-instances=2
```
##### Authenticated access
During the deployment process, you might be prompted: `Allow unauthenticated invocations to [your-service-name] (y/N)?`.
- Enter `y` to allow public access to your agent's API endpoint without authentication.
- Enter `N` (or press Enter for the default) to require authentication (e.g., using an identity token as shown in the "Testing your agent" section).
Upon successful execution, the command deploys your agent to Cloud Run and provides the URL of the deployed service.
### gcloud CLI for Python
Alternatively, you can deploy using the standard `gcloud run deploy` command with a `Dockerfile`. This method requires more manual setup compared to the `adk` command but offers flexibility, particularly if you want to embed your agent within a custom [FastAPI](https://fastapi.tiangolo.com/) application.
Ensure you have authenticated with Google Cloud (`gcloud auth login` and `gcloud config set project `).
#### Project Structure
Organize your project files as follows:
```text
your-project-directory/
├── capital_agent/
│ ├── __init__.py
│ └── agent.py # Your agent code (see "Agent sample" tab)
├── main.py # FastAPI application entry point
├── requirements.txt # Python dependencies
└── Dockerfile # Container build instructions
```
Create the following files (`main.py`, `requirements.txt`, `Dockerfile`) in the root of `your-project-directory/`.
#### Code files
1. This file sets up the FastAPI application using `get_fast_api_app()` from ADK:
main.py
```python
import os
import uvicorn
from fastapi import FastAPI
from google.adk.cli.fast_api import get_fast_api_app
# Get the directory where main.py is located
AGENT_DIR = os.path.dirname(os.path.abspath(__file__))
# Example session service URI (e.g., SQLite)
# Note: Use 'sqlite+aiosqlite' instead of 'sqlite' because DatabaseSessionService requires an async driver
SESSION_SERVICE_URI = "sqlite+aiosqlite:///./sessions.db"
# Example allowed origins for CORS
ALLOWED_ORIGINS = ["http://localhost", "http://localhost:8080", "*"]
# Set web=True if you intend to serve a web interface, False otherwise
SERVE_WEB_INTERFACE = True
# Call the function to get the FastAPI app instance
# Ensure the agent directory name ('capital_agent') matches your agent folder
app: FastAPI = get_fast_api_app(
agents_dir=AGENT_DIR,
session_service_uri=SESSION_SERVICE_URI,
allow_origins=ALLOWED_ORIGINS,
web=SERVE_WEB_INTERFACE,
)
# You can add more FastAPI routes or configurations below if needed
# Example:
# @app.get("/hello")
# async def read_root():
# return {"Hello": "World"}
if __name__ == "__main__":
# Use the PORT environment variable provided by Cloud Run, defaulting to 8080
uvicorn.run(app, host="0.0.0.0", port=int(os.environ.get("PORT", 8080)))
```
*Note: We set `agents_dir` to the directory `main.py` is in and use `os.environ.get("PORT", 8080)` for Cloud Run compatibility.*
1. List the necessary Python packages:
requirements.txt
```text
google-adk
# Add any other dependencies your agent needs
```
1. Define the container image:
Dockerfile
```dockerfile
FROM python:3.13-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
RUN adduser --disabled-password --gecos "" myuser && \
chown -R myuser:myuser /app
COPY . .
USER myuser
ENV PATH="/home/myuser/.local/bin:$PATH"
CMD ["sh", "-c", "uvicorn main:app --host 0.0.0.0 --port $PORT"]
```
#### Defining Multiple Agents
You can define and deploy multiple agents within the same Cloud Run instance by creating separate folders in the root of `your-project-directory/`. Each folder represents one agent and must define a `root_agent` in its configuration.
Example structure:
```text
your-project-directory/
├── capital_agent/
│ ├── __init__.py
│ └── agent.py # contains `root_agent` definition
├── population_agent/
│ ├── __init__.py
│ └── agent.py # contains `root_agent` definition
└── ...
```
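Conceptually, the API server treats each Python package directly under `agents_dir` as one agent app. The following is a simplified, illustrative sketch of that discovery rule (not ADK's actual implementation):

```python
import os
import tempfile

# Recreate the example layout: two agent packages under one root directory.
root = tempfile.mkdtemp()
for name in ("capital_agent", "population_agent"):
    os.makedirs(os.path.join(root, name))
    open(os.path.join(root, name, "__init__.py"), "w").close()

# An agent app is any subfolder that is a Python package (has __init__.py).
agents = sorted(
    entry for entry in os.listdir(root)
    if os.path.isfile(os.path.join(root, entry, "__init__.py"))
)
print(agents)
```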
#### Deploy using `gcloud`
Navigate to `your-project-directory` in your terminal.
```bash
gcloud run deploy capital-agent-service \
--source . \
--region $GOOGLE_CLOUD_LOCATION \
--project $GOOGLE_CLOUD_PROJECT \
--allow-unauthenticated \
--set-env-vars="GOOGLE_CLOUD_PROJECT=$GOOGLE_CLOUD_PROJECT,GOOGLE_CLOUD_LOCATION=$GOOGLE_CLOUD_LOCATION,GOOGLE_GENAI_USE_VERTEXAI=$GOOGLE_GENAI_USE_VERTEXAI"
# Add any other necessary environment variables your agent might need
```
- `capital-agent-service`: The name you want to give your Cloud Run service.
- `--source .`: Tells gcloud to build the container image from the Dockerfile in the current directory.
- `--region`: Specifies the deployment region.
- `--project`: Specifies the GCP project.
- `--allow-unauthenticated`: Allows public access to the service. Remove this flag for private services.
- `--set-env-vars`: Passes necessary environment variables to the running container. Ensure you include all variables required by ADK and your agent (like API keys if not using Application Default Credentials).
`gcloud` will build the Docker image, push it to Google Artifact Registry, and deploy it to Cloud Run. Upon completion, it will output the URL of your deployed service.
For a full list of deployment options, see the [`gcloud run deploy` reference documentation](https://cloud.google.com/sdk/gcloud/reference/run/deploy).
### adkgo CLI for Go
The `adkgo` command is located in the google/adk-go repository under `cmd/adkgo`. Before using it, build it from the root of the adk-go repository:
`go build ./cmd/adkgo`
The `adkgo deploy cloudrun` command automates the deployment of your application. You do not need to provide your own Dockerfile.
#### Agent Code Structure
When using the `adkgo` tool, your `main.go` file must use the launcher framework. This is because the tool compiles your code and then runs the resulting executable with specific command-line arguments (like `web`, `api`, `a2a`) to start the required services. The launcher is designed to parse these arguments correctly.
Your `main.go` should look like this:
main.go
```go
// Copyright 2025 Google LLC
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
package main
import (
"context"
"fmt"
"log"
"os"
"strings"
"google.golang.org/adk/agent"
"google.golang.org/adk/agent/llmagent"
"google.golang.org/adk/cmd/launcher"
"google.golang.org/adk/cmd/launcher/full"
"google.golang.org/adk/model/gemini"
"google.golang.org/adk/tool"
"google.golang.org/adk/tool/functiontool"
"google.golang.org/genai"
)
type getCapitalCityArgs struct {
Country string `json:"country" jsonschema:"The country for which to find the capital city."`
}
func getCapitalCity(ctx tool.Context, args getCapitalCityArgs) (string, error) {
capitals := map[string]string{
"united states": "Washington, D.C.",
"canada": "Ottawa",
"france": "Paris",
"japan": "Tokyo",
}
capital, ok := capitals[strings.ToLower(args.Country)]
if !ok {
return "", fmt.Errorf("couldn't find the capital for %s", args.Country)
}
return capital, nil
}
func main() {
ctx := context.Background()
model, err := gemini.NewModel(ctx, "gemini-2.5-flash", &genai.ClientConfig{
APIKey: os.Getenv("GOOGLE_API_KEY"),
})
if err != nil {
log.Fatalf("Failed to create model: %v", err)
}
capitalTool, err := functiontool.New(
functiontool.Config{
Name: "get_capital_city",
Description: "Retrieves the capital city for a given country.",
},
getCapitalCity,
)
if err != nil {
log.Fatalf("Failed to create function tool: %v", err)
}
geoAgent, err := llmagent.New(llmagent.Config{
Name: "capital_agent",
Model: model,
Description: "Agent to find the capital city of a country.",
Instruction: "I can answer your questions about the capital city of a country.",
Tools: []tool.Tool{capitalTool},
})
if err != nil {
log.Fatalf("Failed to create agent: %v", err)
}
config := &launcher.Config{
AgentLoader: agent.NewSingleLoader(geoAgent),
}
l := full.NewLauncher()
err = l.Execute(ctx, config, os.Args[1:])
if err != nil {
log.Fatalf("run failed: %v\n\n%s", err, l.CommandLineSyntax())
}
}
```
#### How it Works
1. The `adkgo` tool compiles your `main.go` into a statically linked binary for Linux.
1. It generates a Dockerfile that copies this binary into a minimal container.
1. It uses `gcloud` to build and deploy this container to Cloud Run.
1. After deployment, it starts a local proxy that securely connects to your new service.
Ensure you have authenticated with Google Cloud (`gcloud auth login` and `gcloud config set project `).
#### Setup environment variables
Optional but recommended: Setting environment variables can make the deployment commands cleaner.
```bash
# Set your Google Cloud Project ID
export GOOGLE_CLOUD_PROJECT="your-gcp-project-id"
# Set your desired Google Cloud Location
export GOOGLE_CLOUD_LOCATION="us-central1"
# Set the path to your agent's main Go file
export AGENT_PATH="./examples/go/cloud-run/main.go"
# Set a name for your Cloud Run service
export SERVICE_NAME="capital-agent-service"
```
#### Command usage
```bash
./adkgo deploy cloudrun \
-p $GOOGLE_CLOUD_PROJECT \
-r $GOOGLE_CLOUD_LOCATION \
-s $SERVICE_NAME \
--proxy_port=8081 \
--server_port=8080 \
-e $AGENT_PATH \
--a2a --api --webui
```
##### Required
- `-p, --project_name`: Your Google Cloud project ID (e.g., `$GOOGLE_CLOUD_PROJECT`).
- `-r, --region`: The Google Cloud location for deployment (e.g., `$GOOGLE_CLOUD_LOCATION`, `us-central1`).
- `-s, --service_name`: The name for the Cloud Run service (e.g., `$SERVICE_NAME`).
- `-e, --entry_point_path`: Path to the main Go file containing your agent's source code (e.g., `$AGENT_PATH`).
##### Optional
- `--proxy_port`: The local port for the authenticating proxy to listen on. Defaults to 8081.
- `--server_port`: The port number the server will listen on within the Cloud Run container. Defaults to 8080.
- `--a2a`: If included, enables Agent2Agent communication. Enabled by default.
- `--a2a_agent_url`: A2A agent card URL as advertised in the public agent card. This flag is only valid when used with the --a2a flag.
- `--api`: If included, deploys the ADK API server. Enabled by default.
- `--webui`: If included, deploys the ADK dev UI alongside the agent API server. Enabled by default.
- `--temp_dir`: Temp directory for build artifacts. Defaults to `os.TempDir()`.
- `--help`: Show the help message and exit.
##### Authenticated access
The service is deployed with `--no-allow-unauthenticated` by default.
Upon successful execution, the command deploys your agent to Cloud Run and provides a local URL to access the service through the proxy.
### gcloud CLI for Java
You can deploy Java Agents using the standard `gcloud run deploy` command and a `Dockerfile`. This is the current recommended way to deploy Java Agents to Google Cloud Run.
Ensure you are [authenticated](https://cloud.google.com/docs/authentication/gcloud) with Google Cloud. Specifically, run the commands `gcloud auth login` and `gcloud config set project ` from your terminal.
#### Project Structure
Organize your project files as follows:
```text
your-project-directory/
├── src/
│ └── main/
│ └── java/
│ └── agents/
│ ├── capitalagent/
│ └── CapitalAgent.java # Your agent code
├── pom.xml # Java adk and adk-dev dependencies
└── Dockerfile # Container build instructions
```
Create the `pom.xml` and `Dockerfile` in the root of your project directory. Place your agent code file (`CapitalAgent.java`) inside a directory as shown above.
#### Code files
1. This is our agent definition. It is the same code as on the [LLM agent](https://google.github.io/adk-docs/agents/llm-agents/index.md) page, with two caveats:
- The Agent is now initialized as a **global public static final variable**.
- The definition of the agent can be exposed in a static method or inlined during declaration.
See the code for the `CapitalAgent` example in the [examples](https://github.com/google/adk-docs/blob/main/examples/java/cloud-run/src/main/java/agents/capitalagent/CapitalAgent.java) repository.
1. Add the following dependencies and plugin to the pom.xml file.
pom.xml
```xml
<dependencies>
  <dependency>
    <groupId>com.google.adk</groupId>
    <artifactId>google-adk</artifactId>
    <version>0.5.0</version>
  </dependency>
  <dependency>
    <groupId>com.google.adk</groupId>
    <artifactId>google-adk-dev</artifactId>
    <version>0.5.0</version>
  </dependency>
</dependencies>

<build>
  <plugins>
    <plugin>
      <groupId>org.codehaus.mojo</groupId>
      <artifactId>exec-maven-plugin</artifactId>
      <version>3.2.0</version>
      <configuration>
        <mainClass>com.google.adk.web.AdkWebServer</mainClass>
        <classpathScope>compile</classpathScope>
      </configuration>
    </plugin>
  </plugins>
</build>
```
1. Define the container image:
Dockerfile
```dockerfile
# Use an official Maven image with a JDK. Choose a version appropriate for your project.
FROM maven:3.8-openjdk-17 AS builder
WORKDIR /app
COPY pom.xml .
RUN mvn dependency:go-offline -B
COPY src ./src
# Expose the port your application will listen on.
# Cloud Run will set the PORT environment variable, which your app should use.
EXPOSE 8080
# The command to run your application.
# Use a shell so ${PORT} expands and quote exec.args so agent source-dir is passed correctly.
ENTRYPOINT ["sh", "-c", "mvn compile exec:java \
-Dexec.mainClass=com.google.adk.web.AdkWebServer \
-Dexec.classpathScope=compile \
-Dexec.args='--server.port=${PORT:-8080} --adk.agents.source-dir=target'"]
```
#### Deploy using `gcloud`
Navigate to `your-project-directory` in your terminal.
```bash
gcloud run deploy capital-agent-service \
--source . \
--region $GOOGLE_CLOUD_LOCATION \
--project $GOOGLE_CLOUD_PROJECT \
--allow-unauthenticated \
--set-env-vars="GOOGLE_CLOUD_PROJECT=$GOOGLE_CLOUD_PROJECT,GOOGLE_CLOUD_LOCATION=$GOOGLE_CLOUD_LOCATION,GOOGLE_GENAI_USE_VERTEXAI=$GOOGLE_GENAI_USE_VERTEXAI"
# Add any other necessary environment variables your agent might need
```
- `capital-agent-service`: The name you want to give your Cloud Run service.
- `--source .`: Tells gcloud to build the container image from the Dockerfile in the current directory.
- `--region`: Specifies the deployment region.
- `--project`: Specifies the GCP project.
- `--allow-unauthenticated`: Allows public access to the service. Remove this flag for private services.
- `--set-env-vars`: Passes necessary environment variables to the running container. Ensure you include all variables required by ADK and your agent (like API keys if not using Application Default Credentials).
`gcloud` will build the Docker image, push it to Google Artifact Registry, and deploy it to Cloud Run. Upon completion, it will output the URL of your deployed service.
For a full list of deployment options, see the [`gcloud run deploy` reference documentation](https://cloud.google.com/sdk/gcloud/reference/run/deploy).
## Testing your agent
Once your agent is deployed to Cloud Run, you can interact with it via the deployed UI (if enabled) or directly with its API endpoints using tools like `curl`. You'll need the service URL provided after deployment.
### UI Testing
If you deployed your agent with the UI enabled:
- **adk CLI:** You included the `--with_ui` flag (or `--webui` with `adkgo`) during deployment.
- **gcloud CLI:** You set `SERVE_WEB_INTERFACE = True` in your `main.py`.
You can test your agent by simply navigating to the Cloud Run service URL provided after deployment in your web browser.
```bash
# Example URL format
# https://your-service-name-abc123xyz.a.run.app
```
The ADK dev UI allows you to interact with your agent, manage sessions, and view execution details directly in the browser.
To verify your agent is working as intended, you can:
1. Select your agent from the dropdown menu.
1. Type a message and verify that you receive an expected response from your agent.
If you experience any unexpected behavior, check the [Cloud Run](https://console.cloud.google.com/run) console logs.
### API Testing (curl)
You can interact with the agent's API endpoints using tools like `curl`. This is useful for programmatic interaction or if you deployed without the UI.
You'll need the service URL provided after deployment and potentially an identity token for authentication if your service isn't set to allow unauthenticated access.
#### Set the application URL
Replace the example URL with the actual URL of your deployed Cloud Run service.
```bash
export APP_URL="YOUR_CLOUD_RUN_SERVICE_URL"
# Example: export APP_URL="https://adk-default-service-name-abc123xyz.a.run.app"
```
#### Get an identity token (if needed)
If your service requires authentication (i.e., you didn't use `--allow-unauthenticated` with `gcloud` or answered 'N' to the prompt with `adk`), obtain an identity token.
```bash
export TOKEN=$(gcloud auth print-identity-token)
```
*If your service allows unauthenticated access, you can omit the `-H "Authorization: Bearer $TOKEN"` header from the `curl` commands below.*
#### List available apps
Verify the deployed application name.
```bash
curl -X GET -H "Authorization: Bearer $TOKEN" $APP_URL/list-apps
```
*(Adjust the `app_name` in the following commands based on this output if needed. The default is often the agent directory name, e.g., `capital_agent`)*.
#### Create or Update a Session
Initialize or update the state for a specific user and session. Replace `capital_agent` with your actual app name if different. The values `user_123` and `session_abc` are example identifiers; you can replace them with your desired user and session IDs.
```bash
curl -X POST -H "Authorization: Bearer $TOKEN" \
$APP_URL/apps/capital_agent/users/user_123/sessions/session_abc \
-H "Content-Type: application/json" \
-d '{"preferred_language": "English", "visit_count": 5}'
```
#### Run the Agent
Send a prompt to your agent. Replace `capital_agent` with your app name and adjust the user/session IDs and prompt as needed.
```bash
curl -X POST -H "Authorization: Bearer $TOKEN" \
$APP_URL/run_sse \
-H "Content-Type: application/json" \
-d '{
"app_name": "capital_agent",
"user_id": "user_123",
"session_id": "session_abc",
"new_message": {
"role": "user",
"parts": [{
"text": "What is the capital of Canada?"
}]
},
"streaming": false
}'
```
- Set `"streaming": true` if you want to receive Server-Sent Events (SSE).
- The response will contain the agent's execution events, including the final answer.
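The same call can be scripted. Below is a minimal sketch that builds the `/run_sse` payload and parses one Server-Sent Events line; the sample event is illustrative only, and the real event schema may carry additional fields:

```python
import json

# Build the same payload the curl command above sends to /run_sse.
payload = {
    "app_name": "capital_agent",
    "user_id": "user_123",
    "session_id": "session_abc",
    "new_message": {
        "role": "user",
        "parts": [{"text": "What is the capital of Canada?"}],
    },
    "streaming": True,  # request Server-Sent Events
}
body = json.dumps(payload)

# With streaming enabled, each SSE line arrives as "data: <json event>".
sample_line = 'data: {"content": {"parts": [{"text": "Ottawa"}]}}'  # illustrative shape
if sample_line.startswith("data: "):
    event = json.loads(sample_line[len("data: "):])
    print(event["content"]["parts"][0]["text"])
```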
# Deploy to Google Kubernetes Engine (GKE)
Supported in ADK: Python
[GKE](https://cloud.google.com/gke) is the Google Cloud managed Kubernetes service. It allows you to deploy and manage containerized applications using Kubernetes.
To deploy your agent you will need to have a Kubernetes cluster running on GKE. You can create a cluster using the Google Cloud Console or the `gcloud` command line tool.
In this example, we will deploy a simple agent to GKE. The agent is a FastAPI application that uses Gemini 2.0 Flash as the LLM. You can use Vertex AI or AI Studio as the LLM provider via the environment variable `GOOGLE_GENAI_USE_VERTEXAI`.
## Environment variables
Set your environment variables as described in the [Setup and Installation](https://google.github.io/adk-docs/get-started/installation/index.md) guide. You also need to install the `kubectl` command line tool. You can find instructions to do so in the [Google Kubernetes Engine Documentation](https://cloud.google.com/kubernetes-engine/docs/how-to/cluster-access-for-kubectl).
```bash
export GOOGLE_CLOUD_PROJECT=your-project-id # Your GCP project ID
export GOOGLE_CLOUD_LOCATION=us-central1 # Or your preferred location
export GOOGLE_GENAI_USE_VERTEXAI=true # Set to true if using Vertex AI
export GOOGLE_CLOUD_PROJECT_NUMBER=$(gcloud projects describe --format json $GOOGLE_CLOUD_PROJECT | jq -r ".projectNumber")
```
If you don't have `jq` installed, you can use the following command to get the project number:
```bash
gcloud projects describe $GOOGLE_CLOUD_PROJECT
```
And copy the project number from the output.
```bash
export GOOGLE_CLOUD_PROJECT_NUMBER=YOUR_PROJECT_NUMBER
```
## Enable APIs and Permissions
Ensure you have authenticated with Google Cloud (`gcloud auth login` and `gcloud config set project `).
Enable the necessary APIs for your project. You can do this using the `gcloud` command line tool.
```bash
gcloud services enable \
container.googleapis.com \
artifactregistry.googleapis.com \
cloudbuild.googleapis.com \
aiplatform.googleapis.com
```
Grant the default Compute Engine service account the roles required by the `gcloud builds submit` command.
```bash
ROLES_TO_ASSIGN=(
"roles/artifactregistry.writer"
"roles/storage.objectViewer"
"roles/logging.viewer"
"roles/logging.logWriter"
)
for ROLE in "${ROLES_TO_ASSIGN[@]}"; do
gcloud projects add-iam-policy-binding "${GOOGLE_CLOUD_PROJECT}" \
--member="serviceAccount:${GOOGLE_CLOUD_PROJECT_NUMBER}-compute@developer.gserviceaccount.com" \
--role="${ROLE}"
done
```
## Deployment payload
When you deploy your ADK agent workflow to GKE, the following content is uploaded to the service:
- Your ADK agent code
- Any dependencies declared in your ADK agent code
- ADK API server code version used by your agent
The default deployment *does not* include the ADK web user interface libraries unless you specify a deployment setting such as the `--with_ui` option of the `adk deploy gke` command.
## Deployment options
You can deploy your agent to GKE either **manually using Kubernetes manifests** or **automatically using the `adk deploy gke` command**. Choose the approach that best suits your workflow.
## Option 1: Manual Deployment using gcloud and kubectl
### Create a GKE cluster
You can create a GKE cluster using the `gcloud` command line tool. This example creates an Autopilot cluster named `adk-cluster` in the `us-central1` region.
> If creating a GKE Standard cluster, make sure [Workload Identity](https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity) is enabled. Workload Identity is enabled by default in an Autopilot cluster.
```bash
gcloud container clusters create-auto adk-cluster \
--location=$GOOGLE_CLOUD_LOCATION \
--project=$GOOGLE_CLOUD_PROJECT
```
After creating the cluster, you need to connect to it using `kubectl`. This command configures `kubectl` to use the credentials for your new cluster.
```bash
gcloud container clusters get-credentials adk-cluster \
--location=$GOOGLE_CLOUD_LOCATION \
--project=$GOOGLE_CLOUD_PROJECT
```
### Create Your Agent
We will reference the `capital_agent` example defined on the [LLM agents](https://google.github.io/adk-docs/agents/llm-agents/index.md) page.
To proceed, organize your project files as follows:
```text
your-project-directory/
├── capital_agent/
│ ├── __init__.py
│ └── agent.py # Your agent code (see "Capital Agent example" below)
├── main.py # FastAPI application entry point
├── requirements.txt # Python dependencies
└── Dockerfile # Container build instructions
```
### Code files
Create the following files (`main.py`, `requirements.txt`, `Dockerfile`, `capital_agent/agent.py`, `capital_agent/__init__.py`) in the root of `your-project-directory/`.
1. This is the Capital Agent example inside the `capital_agent` directory
capital_agent/agent.py
```python
from google.adk.agents import LlmAgent
# Define a tool function
def get_capital_city(country: str) -> str:
"""Retrieves the capital city for a given country."""
# Replace with actual logic (e.g., API call, database lookup)
capitals = {"france": "Paris", "japan": "Tokyo", "canada": "Ottawa"}
return capitals.get(country.lower(), f"Sorry, I don't know the capital of {country}.")
# Add the tool to the agent
capital_agent = LlmAgent(
model="gemini-2.0-flash",
name="capital_agent", #name of your agent
description="Answers user questions about the capital city of a given country.",
instruction="""You are an agent that provides the capital city of a country... (previous instruction text)""",
tools=[get_capital_city] # Provide the function directly
)
# ADK will discover the root_agent instance
root_agent = capital_agent
```
Mark your directory as a Python package:
capital_agent/__init__.py
```python
from . import agent
```
1. This file sets up the FastAPI application using `get_fast_api_app()` from ADK:
main.py
```python
import os
import uvicorn
from fastapi import FastAPI
from google.adk.cli.fast_api import get_fast_api_app
# Get the directory where main.py is located
AGENT_DIR = os.path.dirname(os.path.abspath(__file__))
# Example session service URI (e.g., SQLite)
# Note: Use 'sqlite+aiosqlite' instead of 'sqlite' because DatabaseSessionService requires an async driver
SESSION_SERVICE_URI = "sqlite+aiosqlite:///./sessions.db"
# Example allowed origins for CORS
ALLOWED_ORIGINS = ["http://localhost", "http://localhost:8080", "*"]
# Set web=True if you intend to serve a web interface, False otherwise
SERVE_WEB_INTERFACE = True
# Call the function to get the FastAPI app instance
# Ensure the agent directory name ('capital_agent') matches your agent folder
app: FastAPI = get_fast_api_app(
agents_dir=AGENT_DIR,
session_service_uri=SESSION_SERVICE_URI,
allow_origins=ALLOWED_ORIGINS,
web=SERVE_WEB_INTERFACE,
)
# You can add more FastAPI routes or configurations below if needed
# Example:
# @app.get("/hello")
# async def read_root():
# return {"Hello": "World"}
if __name__ == "__main__":
# Use the PORT environment variable provided by Cloud Run, defaulting to 8080
uvicorn.run(app, host="0.0.0.0", port=int(os.environ.get("PORT", 8080)))
```
*Note: We set `agents_dir` to the directory containing `main.py` and use `os.environ.get("PORT", 8080)` for Cloud Run compatibility.*
1. List the necessary Python packages:
requirements.txt
```text
google-adk
# Add any other dependencies your agent needs
```
1. Define the container image:
Dockerfile
```dockerfile
FROM python:3.13-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
RUN adduser --disabled-password --gecos "" myuser && \
chown -R myuser:myuser /app
COPY . .
USER myuser
ENV PATH="/home/myuser/.local/bin:$PATH"
CMD ["sh", "-c", "uvicorn main:app --host 0.0.0.0 --port $PORT"]
```
### Build the container image
You need to create a Google Artifact Registry repository to store your container images. You can do this using the `gcloud` command line tool.
```bash
gcloud artifacts repositories create adk-repo \
--repository-format=docker \
--location=$GOOGLE_CLOUD_LOCATION \
--description="ADK repository"
```
Build the container image using the `gcloud` command line tool. This example builds the image and tags it as `adk-repo/adk-agent:latest`.
```bash
gcloud builds submit \
--tag $GOOGLE_CLOUD_LOCATION-docker.pkg.dev/$GOOGLE_CLOUD_PROJECT/adk-repo/adk-agent:latest \
--project=$GOOGLE_CLOUD_PROJECT \
.
```
Verify the image is built and pushed to the Artifact Registry:
```bash
gcloud artifacts docker images list \
$GOOGLE_CLOUD_LOCATION-docker.pkg.dev/$GOOGLE_CLOUD_PROJECT/adk-repo \
--project=$GOOGLE_CLOUD_PROJECT
```
### Configure Kubernetes Service Account for Vertex AI
If your agent uses Vertex AI, you need to create a Kubernetes service account with the necessary permissions. This example creates a service account named `adk-agent-sa` and binds it to the `Vertex AI User` role.
> If you are using AI Studio and accessing the model with an API key you can skip this step.
```bash
kubectl create serviceaccount adk-agent-sa
```
```bash
gcloud projects add-iam-policy-binding projects/${GOOGLE_CLOUD_PROJECT} \
--role=roles/aiplatform.user \
--member=principal://iam.googleapis.com/projects/${GOOGLE_CLOUD_PROJECT_NUMBER}/locations/global/workloadIdentityPools/${GOOGLE_CLOUD_PROJECT}.svc.id.goog/subject/ns/default/sa/adk-agent-sa \
--condition=None
```
### Create the Kubernetes manifest files
Create a Kubernetes deployment manifest file named `deployment.yaml` in your project directory. This file defines how to deploy your application on GKE.
deployment.yaml
```yaml
cat << EOF > deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: adk-agent
spec:
replicas: 1
selector:
matchLabels:
app: adk-agent
template:
metadata:
labels:
app: adk-agent
spec:
serviceAccount: adk-agent-sa
containers:
- name: adk-agent
imagePullPolicy: Always
image: $GOOGLE_CLOUD_LOCATION-docker.pkg.dev/$GOOGLE_CLOUD_PROJECT/adk-repo/adk-agent:latest
resources:
limits:
memory: "128Mi"
cpu: "500m"
ephemeral-storage: "128Mi"
requests:
memory: "128Mi"
cpu: "500m"
ephemeral-storage: "128Mi"
ports:
- containerPort: 8080
env:
- name: PORT
value: "8080"
- name: GOOGLE_CLOUD_PROJECT
value: $GOOGLE_CLOUD_PROJECT
- name: GOOGLE_CLOUD_LOCATION
value: $GOOGLE_CLOUD_LOCATION
- name: GOOGLE_GENAI_USE_VERTEXAI
value: "$GOOGLE_GENAI_USE_VERTEXAI"
# If using AI Studio, set GOOGLE_GENAI_USE_VERTEXAI to false and set the following:
# - name: GOOGLE_API_KEY
# value: $GOOGLE_API_KEY
# Add any other necessary environment variables your agent might need
---
apiVersion: v1
kind: Service
metadata:
name: adk-agent
spec:
type: LoadBalancer
ports:
- port: 80
targetPort: 8080
selector:
app: adk-agent
EOF
```
### Deploy the Application
Deploy the application using the `kubectl` command line tool. This command applies the deployment and service manifest files to your GKE cluster.
```bash
kubectl apply -f deployment.yaml
```
After a few moments, you can check the status of your deployment using:
```bash
kubectl get pods -l=app=adk-agent
```
This command lists the pods associated with your deployment. You should see a pod with a status of `Running`.
Once the pod is running, you can check the status of the service using:
```bash
kubectl get service adk-agent
```
If the output shows an `EXTERNAL-IP` value, your service is accessible from the internet. It may take a few minutes for the external IP to be assigned.
You can get the external IP address of your service using:
```bash
kubectl get svc adk-agent -o=jsonpath='{.status.loadBalancer.ingress[0].ip}'
```
## Option 2: Automated Deployment using `adk deploy gke`
ADK provides a CLI command to streamline GKE deployment. This avoids the need to manually build images, write Kubernetes manifests, or push to Artifact Registry.
#### Prerequisites
Before you begin, ensure you have the following set up:
1. **A running GKE cluster:** You need an active Kubernetes cluster on Google Cloud.
1. **Required CLIs:**
- **`gcloud` CLI:** The Google Cloud CLI must be installed, authenticated, and configured to use your target project. Run `gcloud auth login` and `gcloud config set project [YOUR_PROJECT_ID]`.
- **kubectl:** The Kubernetes CLI must be installed to deploy the application to your cluster.
1. **Enabled Google Cloud APIs:** Make sure the following APIs are enabled in your Google Cloud project:
- Kubernetes Engine API (`container.googleapis.com`)
- Cloud Build API (`cloudbuild.googleapis.com`)
- Container Registry API (`containerregistry.googleapis.com`)
1. **Required IAM Permissions:** The user or Compute Engine default service account running the command needs, at a minimum, the following roles:
1. **Kubernetes Engine Developer** (`roles/container.developer`): To interact with the GKE cluster.
1. **Storage Object Viewer** (`roles/storage.objectViewer`): To allow Cloud Build to download the source code from the Cloud Storage bucket where gcloud builds submit uploads it.
1. **Artifact Registry Create on Push Writer** (`roles/artifactregistry.createOnPushWriter`): To allow Cloud Build to push the built container image to Artifact Registry. This role also permits the on-the-fly creation of the special gcr.io repository within Artifact Registry if needed on the first push.
1. **Logs Writer** (`roles/logging.logWriter`): To allow Cloud Build to write build logs to Cloud Logging.
### The `deploy gke` Command
The command takes the path to your agent and parameters specifying the target GKE cluster.
#### Syntax
```bash
adk deploy gke [OPTIONS] AGENT_PATH
```
### Arguments & Options
| Argument | Description | Required |
| -------------- | ---------------------------------------------------------------------------------------- | -------- |
| AGENT_PATH | The local file path to your agent's root directory. | Yes |
| --project | The Google Cloud Project ID where your GKE cluster is located. | Yes |
| --cluster_name | The name of your GKE cluster. | Yes |
| --region | The Google Cloud region of your cluster (e.g., us-central1). | Yes |
| --with_ui | Deploys both the agent's back-end API and a companion front-end user interface. | No |
| --log_level | Sets the logging level for the deployment process. Options: debug, info, warning, error. | No |
### How It Works
When you run the `adk deploy gke` command, the ADK performs the following steps automatically:
- Containerization: It builds a Docker container image from your agent's source code.
- Image Push: It tags the container image and pushes it to your project's Artifact Registry.
- Manifest Generation: It dynamically generates the necessary Kubernetes manifest files (a `Deployment` and a `Service`).
- Cluster Deployment: It applies these manifests to your specified GKE cluster, which triggers the following:
The `Deployment` instructs GKE to pull the container image from Artifact Registry and run it in one or more Pods.
The `Service` creates a stable network endpoint for your agent. By default, this is a LoadBalancer service, which provisions a public IP address to expose your agent to the internet.
### Example Usage
Here is a practical example of deploying an agent located at `~/agents/multi_tool_agent/` to a GKE cluster named `test`.
```bash
adk deploy gke \
--project myproject \
--cluster_name test \
--region us-central1 \
--with_ui \
--log_level info \
~/agents/multi_tool_agent/
```
### Verifying Your Deployment
If you used `adk deploy gke`, verify the deployment using `kubectl`:
1. Check the Pods: Ensure your agent's pods are in the Running state.
```bash
kubectl get pods
```
You should see output like `adk-default-service-name-xxxx-xxxx ... 1/1 Running` in the default namespace.
1. Find the External IP: Get the public IP address for your agent's service.
```bash
kubectl get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
adk-default-service-name LoadBalancer 34.118.228.70 34.63.153.253 80:32581/TCP 5d20h
```
You can then navigate to the external IP in a browser and interact with the agent via the UI.
## Testing your agent
Once your agent is deployed to GKE, you can interact with it via the deployed UI (if enabled) or directly with its API endpoints using tools like `curl`. You'll need the service URL provided after deployment.
### UI Testing
If you deployed your agent with the UI enabled:
You can test your agent by navigating to the Kubernetes service URL in your web browser.
The ADK dev UI allows you to interact with your agent, manage sessions, and view execution details directly in the browser.
To verify your agent is working as intended, you can:
1. Select your agent from the dropdown menu.
1. Type a message and verify that you receive an expected response from your agent.
If you experience any unexpected behavior, check the pod logs for your agent using:
```bash
kubectl logs -l app=adk-agent
```
### API Testing (curl)
You can interact with the agent's API endpoints using tools like `curl`. This is useful for programmatic interaction or if you deployed without the UI.
#### Set the application URL
Retrieve the external IP address of your deployed service and store it in an environment variable:
```bash
export APP_URL=$(kubectl get service adk-agent -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
```
#### List available apps
Verify the deployed application name.
```bash
curl -X GET $APP_URL/list-apps
```
*(Adjust the `app_name` in the following commands based on this output if needed. The default is often the agent directory name, e.g., `capital_agent`)*.
#### Create or Update a Session
Initialize or update the state for a specific user and session. Replace `capital_agent` with your actual app name if different. The values `user_123` and `session_abc` are example identifiers; you can replace them with your desired user and session IDs.
```bash
curl -X POST \
$APP_URL/apps/capital_agent/users/user_123/sessions/session_abc \
-H "Content-Type: application/json" \
-d '{"preferred_language": "English", "visit_count": 5}'
```
#### Run the Agent
Send a prompt to your agent. Replace `capital_agent` with your app name and adjust the user/session IDs and prompt as needed.
```bash
curl -X POST $APP_URL/run_sse \
-H "Content-Type: application/json" \
-d '{
"app_name": "capital_agent",
"user_id": "user_123",
"session_id": "session_abc",
"new_message": {
"role": "user",
"parts": [{
"text": "What is the capital of Canada?"
}]
},
"streaming": false
}'
```
- Set `"streaming": true` if you want to receive Server-Sent Events (SSE).
- The response will contain the agent's execution events, including the final answer.
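As a sketch of how a client might assemble the `/run_sse` request body and consume a streamed response, here is an illustrative helper (not part of ADK; the exact shape of each event depends on your agent):

```python
import json


def build_run_payload(app_name, user_id, session_id, text, streaming=False):
    """Build the JSON body for the /run_sse endpoint shown above."""
    return {
        "app_name": app_name,
        "user_id": user_id,
        "session_id": session_id,
        "new_message": {"role": "user", "parts": [{"text": text}]},
        "streaming": streaming,
    }


def parse_sse_data_lines(body):
    """Yield the JSON payload of each `data:` line in an SSE response body."""
    for line in body.splitlines():
        if line.startswith("data:"):
            yield json.loads(line[len("data:"):].strip())


payload = build_run_payload("capital_agent", "user_123", "session_abc",
                            "What is the capital of Canada?", streaming=True)
print(payload["new_message"]["parts"][0]["text"])  # What is the capital of Canada?

sample = 'data: {"author": "capital_agent"}\n\ndata: {"author": "capital_agent"}\n'
print(len(list(parse_sse_data_lines(sample))))  # 2
```

With `"streaming": true`, each SSE event arrives as a `data:` line containing one JSON-encoded execution event; with `"streaming": false`, the response body is a single JSON document.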
## Troubleshooting
These are some common issues you might encounter when deploying your agent to GKE:
### 403 Permission Denied for `Gemini 2.0 Flash`
This usually means that the Kubernetes service account does not have the necessary permission to access the Vertex AI API. Ensure that you have created the service account and bound it to the `Vertex AI User` role as described in the [Configure Kubernetes Service Account for Vertex AI](#configure-kubernetes-service-account-for-vertex-ai) section. If you are using AI Studio, ensure that you have set the `GOOGLE_API_KEY` environment variable in the deployment manifest and it is valid.
### 404 or Not Found response
This usually means there is an error in your request. Check the application logs to diagnose the problem.
```bash
export POD_NAME=$(kubectl get pod -l app=adk-agent -o jsonpath='{.items[0].metadata.name}')
kubectl logs $POD_NAME
```
### Attempt to write a readonly database
You might find that no session ID is created in the UI and that the agent does not respond to any messages. This is usually caused by a read-only SQLite database. It can happen if you run the agent locally and then build the container image, which copies the local SQLite database into the container, where it becomes read-only.
```text
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) attempt to write a readonly database
[SQL: UPDATE app_states SET state=?, update_time=CURRENT_TIMESTAMP WHERE app_states.app_name = ?]
```
To fix this issue, you can either:
Delete the SQLite database file from your local machine before building the container image. This will create a new SQLite database when the container is started.
```bash
rm -f sessions.db
```
or (recommended) you can add a `.dockerignore` file to your project directory to exclude the SQLite database from being copied into the container image.
.dockerignore
```text
sessions.db
```
Build the container image and deploy the application again.
### Insufficient Permission to Stream Logs `ERROR: (gcloud.builds.submit)`
This error can occur when you don't have sufficient permissions to stream build logs, or your VPC-SC security policy restricts access to the default logs bucket.
To check the progress of the build, follow the link provided in the error message or navigate to the Cloud Build page in the Google Cloud Console.
You can also verify the image was built and pushed to the Artifact Registry using the command under the [Build the container image](#build-the-container-image) section.
### Gemini 2.0 Flash Not Supported in the Live API
When using the ADK Dev UI for your deployed agent, text-based chat works, but voice input (e.g., clicking the microphone button) fails. You might see a `websockets.exceptions.ConnectionClosedError` in the pod logs indicating that your model is "not supported in the live api".
This error occurs because the agent is configured with a model (like `gemini-2.0-flash` in the example) that does not support the Gemini Live API. The Live API is required for real-time, bidirectional streaming of audio and video.
## Cleanup
To delete the GKE cluster and all associated resources, run:
```bash
gcloud container clusters delete adk-cluster \
--location=$GOOGLE_CLOUD_LOCATION \
--project=$GOOGLE_CLOUD_PROJECT
```
To delete the Artifact Registry repository, run:
```bash
gcloud artifacts repositories delete adk-repo \
--location=$GOOGLE_CLOUD_LOCATION \
--project=$GOOGLE_CLOUD_PROJECT
```
You can also delete the project if you no longer need it. This will delete all resources associated with the project, including the GKE cluster, Artifact Registry repository, and any other resources you created.
```bash
gcloud projects delete $GOOGLE_CLOUD_PROJECT
```
# Deploy to Vertex AI Agent Engine
Supported in ADK: Python
Google Cloud Vertex AI [Agent Engine](https://cloud.google.com/vertex-ai/generative-ai/docs/agent-engine/overview) is a set of modular services that help developers scale and govern agents in production. The Agent Engine runtime enables you to deploy agents in production with end-to-end managed infrastructure so you can focus on creating intelligent and impactful agents. When you deploy an ADK agent to Agent Engine, your code runs in the *Agent Engine runtime* environment, which is part of the larger set of agent services provided by the Agent Engine product.
This guide includes the following deployment paths, which serve different purposes:
- **[Standard deployment](/adk-docs/deploy/agent-engine/deploy/)**: Follow this standard deployment path if you have an existing Google Cloud project and if you want to carefully manage deploying an ADK agent to the Agent Engine runtime. This deployment path uses Cloud Console, ADK command line interface, and provides step-by-step instructions. This path is recommended for users who are already familiar with configuring Google Cloud projects, and users preparing for production deployments.
- **[Agent Starter Pack deployment](/adk-docs/deploy/agent-engine/asp/)**: Follow this accelerated deployment path if you do not have an existing Google Cloud project and are creating a project specifically for development and testing. The Agent Starter Pack (ASP) helps you deploy ADK projects quickly and it configures Google Cloud services that are not strictly necessary for running an ADK agent with the Agent Engine runtime.
Agent Engine service on Google Cloud
Agent Engine is a paid service and you may incur costs if you go above the no-cost access tier. More information can be found on the [Agent Engine pricing page](https://cloud.google.com/vertex-ai/pricing#vertex-ai-agent-engine).
## Deployment payload
When you deploy your ADK agent project to Agent Engine, the following content is uploaded to the service:
- Your ADK agent code
- Any dependencies declared in your ADK agent code
The deployment *does not* include the ADK API server or the ADK web user interface libraries. The Agent Engine service provides the libraries for ADK API server functionality.
# Deploy to Agent Engine with Agent Starter Pack
Supported in ADK: Python
This deployment procedure describes how to perform a deployment using the [Agent Starter Pack](https://github.com/GoogleCloudPlatform/agent-starter-pack) (ASP) and the ADK command line interface (CLI) tool. Using ASP for deployment to the Agent Engine runtime is an accelerated path, and you should use it for **development and testing** only. The ASP tool configures Google Cloud resources that are not strictly necessary for running an ADK agent workflow, and you should thoroughly review that configuration before using it in a production deployment.
This deployment guide uses the ASP tool to apply a project template to your existing project, add deployment artifacts, and prepare your agent project for deployment. These instructions show you how to use ASP to provision a Google Cloud project with services needed for deploying your ADK project, as follows:
- [Prerequisites](#prerequisites-ad): Set up a Google Cloud account and project, and install required software.
- [Prepare your ADK project](#prepare-ad): Modify your existing ADK project files to get ready for deployment.
- [Connect to your Google Cloud project](#connect-ad): Connect your development environment to Google Cloud and your Google Cloud project.
- [Deploy your ADK project](#deploy-ad): Provision required services in your Google Cloud project and upload your ADK project code.
For information on testing a deployed agent, see [Test deployed agent](https://google.github.io/adk-docs/deploy/agent-engine/test/index.md). For more information on using Agent Starter Pack and its command line tools, see the [CLI reference](https://googlecloudplatform.github.io/agent-starter-pack/cli/enhance.html) and [Development guide](https://googlecloudplatform.github.io/agent-starter-pack/guide/development-guide.html).
### Prerequisites
You need the following resources configured to use this deployment path:
- **Google Cloud account**: with administrator access to the following:
- **Google Cloud Project**: An empty Google Cloud project with [billing enabled](https://cloud.google.com/billing/docs/how-to/modify-project). For information on creating projects, see [Creating and managing projects](https://cloud.google.com/resource-manager/docs/creating-managing-projects).
- **Python Environment**: A Python version supported by the [ASP project](https://googlecloudplatform.github.io/agent-starter-pack/guide/getting-started.html).
- **uv tool**: Manages the Python development environment and runs the ASP tools. For installation details, see [Install uv](https://docs.astral.sh/uv/getting-started/installation/).
- **Google Cloud CLI tool**: The gcloud command line interface. For installation details, see [Google Cloud Command Line Interface](https://cloud.google.com/sdk/docs/install).
- **Make tool**: Build automation tool. It is included in most Unix-based systems; for installation details, see the [Make tool](https://www.gnu.org/software/make/) documentation.
### Prepare your ADK project
When you deploy an ADK project to Agent Engine, you need some additional files to support the deployment operation. The following ASP command backs up your project and then adds files to your project for deployment purposes.
These instructions assume you have an existing ADK project that you are modifying for deployment. If you do not have an ADK project, or want to use a test project, complete the Python [Quickstart](/adk-docs/get-started/quickstart/) guide, which creates a [multi_tool_agent](https://github.com/google/adk-docs/tree/main/examples/python/snippets/get-started/multi_tool_agent) project. The following instructions use the `multi_tool_agent` project as an example.
To prepare your ADK project for deployment to Agent Engine:
1. In a terminal window of your development environment, navigate to the **parent directory** that contains your agent folder. For example, if your project structure is:
```text
your-project-directory/
├── multi_tool_agent/
│ ├── __init__.py
│ ├── agent.py
│ └── .env
```
Navigate to `your-project-directory/`
1. Run the ASP `enhance` command to add the files required for deployment into your project.
```shell
uvx agent-starter-pack enhance --adk -d agent_engine
```
1. Follow the instructions from the ASP tool. In general, you can accept the default answers to all questions. However, for the **GCP region** option, make sure you select one of the [supported regions](https://docs.cloud.google.com/agent-builder/locations#supported-regions-agent-engine) for Agent Engine.
When you successfully complete this process, the tool shows the following message:
```text
> Success! Your agent project is ready.
```
Note
The ASP tool may show a reminder to connect to Google Cloud while running, but that connection is *not required* at this stage.
For more information about the changes ASP makes to your ADK project, see [Changes to your ADK project](#adk-asp-changes).
### Connect to your Google Cloud project
Before you deploy your ADK project, you must connect to Google Cloud and your project. After logging into your Google Cloud account, you should verify that your deployment target project is visible from your account and that it is configured as your current project.
To connect to Google Cloud and list your project:
1. In a terminal window of your development environment, login to your Google Cloud account:
```shell
gcloud auth application-default login
```
1. Set your target project using the Google Cloud Project ID:
```shell
gcloud config set project your-project-id-xxxxx
```
1. Verify your Google Cloud target project is set:
```shell
gcloud config get-value project
```
Once you have successfully connected to Google Cloud and set your Cloud Project ID, you are ready to deploy your ADK project files to Agent Engine.
### Deploy your ADK project
When using the ASP tool, you deploy in stages. In the first stage, you run a `make` command that provisions the services needed to run your ADK workflow on Agent Engine. In the second stage, the tool uploads your project code to the Agent Engine service and runs it in the hosted environment.
Important
*Make sure your Google Cloud target deployment project is set as your* **current project** *before performing these steps*. The `make backend` command uses your currently set Google Cloud project when it performs a deployment. For information on setting and checking your current project, see [Connect to your Google Cloud project](#connect-ad).
To deploy your ADK project to Agent Engine in your Google Cloud project:
1. In a terminal window, ensure you are in the parent directory (e.g., `your-project-directory/`) that contains your agent folder.
1. Deploy the code from the updated local project into the Google Cloud development environment, by running the following ASP make command:
```shell
make backend
```
Once this process completes successfully, you should be able to interact with the agent running on Google Cloud Agent Engine. For details on testing the deployed agent, see [Test deployed agent](/adk-docs/deploy/agent-engine/test/).
### Changes to your ADK project
The ASP tools add more files to your project for deployment. The procedure below backs up your existing project files before modifying them. This guide uses the [multi_tool_agent](https://github.com/google/adk-docs/tree/main/examples/python/snippets/get-started/multi_tool_agent) project as a reference example. The original project has the following file structure to start with:
```text
multi_tool_agent/
├─ __init__.py
├─ agent.py
└─ .env
```
After running the ASP enhance command to add Agent Engine deployment information, the new structure is as follows:
```text
multi_tool_agent/
├─ app/ # Core application code
│ ├─ agent.py # Main agent logic
│ ├─ agent_engine_app.py # Agent Engine application logic
│ └─ utils/ # Utility functions and helpers
├─ .cloudbuild/ # CI/CD pipeline configurations for Google Cloud Build
├─ deployment/ # Infrastructure and deployment scripts
├─ notebooks/ # Jupyter notebooks for prototyping and evaluation
├─ tests/ # Unit, integration, and load tests
├─ Makefile # Makefile for common commands
├─ GEMINI.md # AI-assisted development guide
└─ pyproject.toml # Project dependencies and configuration
```
See the *README.md* file in your updated ADK project folder for more information. For more information on using Agent Starter Pack, see the [Development guide](https://googlecloudplatform.github.io/agent-starter-pack/guide/development-guide.html).
## Test deployed agents
After completing deployment of your ADK agent you should test the workflow in its new hosted environment. For more information on testing an ADK agent deployed to Agent Engine, see [Test deployed agents in Agent Engine](/adk-docs/deploy/agent-engine/test/).
# Deploy to Vertex AI Agent Engine
Supported in ADK: Python
This deployment procedure describes how to perform a standard deployment of ADK agent code to Google Cloud [Agent Engine](https://cloud.google.com/vertex-ai/generative-ai/docs/agent-engine/overview). You should follow this deployment path if you have an existing Google Cloud project and if you want to carefully manage deploying an ADK agent to Agent Engine runtime environment. These instructions use Cloud Console, the gcloud command line interface, and the ADK command line interface (ADK CLI). This path is recommended for users who are already familiar with configuring Google Cloud projects, and users preparing for production deployments.
These instructions describe how to deploy an ADK project to Google Cloud Agent Engine runtime environment, which includes the following stages:
- [Setup Google Cloud project](#setup-cloud-project)
- [Prepare agent project folder](#define-your-agent)
- [Deploy the agent](#deploy-agent)
## Setup Google Cloud project
To deploy your agent to Agent Engine, you need a Google Cloud project:
1. **Sign into Google Cloud**:
- If you're an **existing user** of Google Cloud:
- Sign in to your Google Cloud account.
- If you previously used a Free Trial that has expired, you may need to upgrade to a [Paid billing account](https://docs.cloud.google.com/free/docs/free-cloud-features#how-to-upgrade).
- If you are a **new user** of Google Cloud:
- You can sign up for the [Free Trial program](https://docs.cloud.google.com/free/docs/free-cloud-features). The Free Trial gets you a $300 Welcome credit to spend over 91 days on various [Google Cloud products](https://docs.cloud.google.com/free/docs/free-cloud-features#during-free-trial) and you won't be billed. During the Free Trial, you also get access to the [Google Cloud Free Tier](https://docs.cloud.google.com/free/docs/free-cloud-features#free-tier), which gives you free usage of select products up to specified monthly limits, and to product-specific free trials.
1. **Create a Google Cloud project**
- If you already have an existing Google Cloud project, you can use it, but be aware this process is likely to add new services to the project.
- If you want to create a new Google Cloud project, you can create a new one on the [Create Project](https://console.cloud.google.com/projectcreate) page.
1. **Get your Google Cloud Project ID**
- You need your Google Cloud Project ID, which you can find on your GCP homepage. Make sure to note the Project ID (alphanumeric with hyphens), *not* the project number (numeric).
1. **Enable Vertex AI in your project**
- To use Agent Engine, you need to [enable the Vertex AI API](https://console.cloud.google.com/apis/library/aiplatform.googleapis.com). Click on the "Enable" button to enable the API. Once enabled, it should say "API Enabled".
1. **Enable Cloud Resource Manager API in your project**
- To use Agent Engine, you need to [enable the Cloud Resource Manager API](https://console.developers.google.com/apis/api/cloudresourcemanager.googleapis.com/overview). Click on the "Enable" button to enable the API. Once enabled, it should say "API Enabled".
## Set up your coding environment
Now that you have prepared your Google Cloud project, you can return to your coding environment. These steps require access to a terminal within your coding environment to run command line instructions.
### Authenticate your coding environment with Google Cloud
- You need to authenticate your coding environment so that you and your code can interact with Google Cloud. To do so, you need the gcloud CLI. If you have never used the gcloud CLI, you need to first [download and install it](https://docs.cloud.google.com/sdk/docs/install-sdk) before continuing with the steps below:
- Run the following command in your terminal to access your Google Cloud project as a user:
```shell
gcloud auth login
```
After authenticating, you should see the message `You are now authenticated with the gcloud CLI!`.
- Run the following command to authenticate your code so that it can work with Google Cloud:
```shell
gcloud auth application-default login
```
After authenticating, you should see a confirmation that your application default credentials have been saved.
- (Optional) If you need to set or change your default project in gcloud, you can use:
```shell
gcloud config set project MY-PROJECT-ID
```
### Define your agent
With your Google Cloud and coding environment prepared, you're ready to deploy your agent. The instructions assume that you have an agent project folder, such as:
```shell
multi_tool_agent/
├── .env
├── __init__.py
└── agent.py
```
For more details on the project files and format, see the [multi_tool_agent](https://github.com/google/adk-docs/tree/main/examples/python/snippets/get-started/multi_tool_agent) code sample.
## Deploy the agent
You can deploy from your terminal using the `adk deploy` command line tool. This process packages your code, builds it into a container, and deploys it to the managed Agent Engine service. This process can take several minutes.
The following example deploy command uses the `multi_tool_agent` sample code as the project to be deployed:
```shell
PROJECT_ID=my-project-id
LOCATION_ID=us-central1
adk deploy agent_engine \
--project=$PROJECT_ID \
--region=$LOCATION_ID \
--display_name="My First Agent" \
multi_tool_agent
```
For `region`, you can find a list of the supported regions on the [Vertex AI Agent Builder locations page](https://docs.cloud.google.com/agent-builder/locations#supported-regions-agent-engine). To learn about the CLI options for the `adk deploy agent_engine` command, see the [ADK CLI Reference](https://google.github.io/adk-docs/api-reference/cli/cli.html#adk-deploy-agent-engine).
### Deploy command output
Once successfully deployed, you should see the following output:
```shell
Creating AgentEngine
Create AgentEngine backing LRO: projects/123456789/locations/us-central1/reasoningEngines/751619551677906944/operations/2356952072064073728
View progress and logs at https://console.cloud.google.com/logs/query?project=hopeful-sunset-478017-q0
AgentEngine created. Resource name: projects/123456789/locations/us-central1/reasoningEngines/751619551677906944
To use this AgentEngine in another session:
agent_engine = vertexai.agent_engines.get('projects/123456789/locations/us-central1/reasoningEngines/751619551677906944')
Cleaning up the temp folder: /var/folders/k5/pv70z5m92s30k0n7hfkxszfr00mz24/T/agent_engine_deploy_src/20251219_134245
```
Note that you now have a `RESOURCE_ID` for your deployed agent (`751619551677906944` in the example above). You need this ID, along with your project and location values, to use your agent on Agent Engine.
## Using an agent on Agent Engine
Once you have completed deployment of your ADK project, you can query the agent using the Vertex AI SDK, the Python `requests` library, or a REST API client. This section describes what you need to interact with your agent and how to construct URLs for its REST API.
To interact with your agent on Agent Engine, you need the following:
- **PROJECT_ID** (example: "my-project-id") which you can find on your [project details page](https://console.cloud.google.com/iam-admin/settings)
- **LOCATION_ID** (example: "us-central1"), that you used to deploy your agent
- **RESOURCE_ID** (example: "751619551677906944"), which you can find on the [Agent Engine UI](https://console.cloud.google.com/vertex-ai/agents/agent-engines)
The query URL structure is as follows:
```shell
https://$(LOCATION_ID)-aiplatform.googleapis.com/v1/projects/$(PROJECT_ID)/locations/$(LOCATION_ID)/reasoningEngines/$(RESOURCE_ID):query
```
You can make requests to your agent using this URL structure. For more information on how to make requests, see the instructions in the Agent Engine documentation [Use an Agent Development Kit agent](https://docs.cloud.google.com/agent-builder/agent-engine/use/adk#rest-api). You can also check the Agent Engine documentation to learn how to manage your [deployed agent](https://docs.cloud.google.com/agent-builder/agent-engine/manage/overview). For more information on testing and interacting with a deployed agent, see [Test deployed agents in Agent Engine](/adk-docs/deploy/agent-engine/test/).
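As a quick illustration, the query URL can be assembled from the three values above in plain Python. This is a minimal sketch for this page's example values; the `agent_query_url` helper name is ours, not part of any SDK:

```python
def agent_query_url(project_id: str, location_id: str, resource_id: str) -> str:
    """Builds the :query endpoint URL for a deployed Agent Engine agent."""
    return (
        f"https://{location_id}-aiplatform.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location_id}/"
        f"reasoningEngines/{resource_id}:query"
    )

# Using the example values from this page:
print(agent_query_url("my-project-id", "us-central1", "751619551677906944"))
```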
### Monitoring and verification
- You can monitor the deployment status in the [Agent Engine UI](https://console.cloud.google.com/vertex-ai/agents/agent-engines) in the Google Cloud Console.
- For additional details, you can visit the Agent Engine documentation [deploying an agent](https://cloud.google.com/vertex-ai/generative-ai/docs/agent-engine/deploy) and [managing deployed agents](https://cloud.google.com/vertex-ai/generative-ai/docs/agent-engine/manage/overview).
## Test deployed agents
After completing deployment of your ADK agent, you should test the workflow in its new hosted environment. For more information on testing an ADK agent deployed to Agent Engine, see [Test deployed agents in Agent Engine](/adk-docs/deploy/agent-engine/test/).
# Test deployed agents in Agent Engine
These instructions explain how to test an ADK agent deployed to the [Agent Engine](https://cloud.google.com/vertex-ai/generative-ai/docs/agent-engine/overview) runtime environment. Before using these instructions, you need to have deployed your agent to Agent Engine using one of the [available methods](/adk-docs/deploy/agent-engine/). This guide shows you how to view and test your deployed agent through the Google Cloud Console, and how to interact with it using REST API calls or the Vertex AI SDK for Python.
## View deployed agent in Cloud Console
To view your deployed agent in the Cloud Console:
- Navigate to the Agent Engine page in the Google Cloud Console:
This page lists all deployed agents in your currently selected Google Cloud project. If you do not see your agent listed, make sure you have your target project selected in Google Cloud Console. For more information on selecting an existing Google Cloud project, see [Creating and managing projects](https://cloud.google.com/resource-manager/docs/creating-managing-projects#identifying_projects).
## Find Google Cloud project information
You need the address and resource identification for your project (`PROJECT_ID`, `LOCATION_ID`, `RESOURCE_ID`) to be able to test your deployment. You can use Cloud Console or the `gcloud` command line tool to find this information.
Vertex AI express mode API key
If you are using Vertex AI express mode, you can skip this step and use your API key.
To find your project information with Google Cloud Console:
1. In the Google Cloud Console, navigate to the Agent Engine page:
1. At the top of the page, select **API URLs**, and then copy the **Query URL** string for your deployed agent, which should be in this format:
```text
https://$(LOCATION_ID)-aiplatform.googleapis.com/v1/projects/$(PROJECT_ID)/locations/$(LOCATION_ID)/reasoningEngines/$(RESOURCE_ID):query
```
To find your project information with the `gcloud` command line tool:
1. In your development environment, make sure you are authenticated to Google Cloud and run the following command to list your project:
```shell
gcloud projects list
```
1. With the Project ID you used for deployment, run this command to get the additional details:
```shell
gcloud asset search-all-resources \
--scope=projects/$(PROJECT_ID) \
--asset-types='aiplatform.googleapis.com/ReasoningEngine' \
--format="table(name,assetType,location,reasoning_engine_id)"
```
## Test using REST calls
A simple way to interact with your deployed agent in Agent Engine is to use REST calls with the `curl` tool. This section describes how to check your connection to the agent and how to test the deployed agent's processing of a request.
### Check connection to agent
You can check your connection to the running agent using the **Query URL** available in the Agent Engine section of the Cloud Console. This check does not execute the deployed agent, but returns information about the agent.
To send a REST call and get a response from the deployed agent:
- In a terminal window of your development environment, build a request and execute it:
```shell
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://$(LOCATION_ID)-aiplatform.googleapis.com/v1/projects/$(PROJECT_ID)/locations/$(LOCATION_ID)/reasoningEngines"
```
If you are using Vertex AI express mode, use your API key instead:
```shell
curl -X GET \
-H "x-goog-api-key:YOUR-EXPRESS-MODE-API-KEY" \
"https://aiplatform.googleapis.com/v1/reasoningEngines"
```
If your deployment was successful, this request responds with a list of valid requests and expected data formats.
Remove `:query` parameter for connection URL
If you use the **Query URL** available in the Agent Engine section of the Cloud Console, make sure to remove the `:query` parameter from the end of the address.
Access for agent connections
This connection test requires that the calling user have a valid access token for the deployed agent. When testing from other environments, make sure the calling user has permission to connect to the agent in your Google Cloud project.
### Send an agent request
When getting responses from your agent project, you must first create a session, receive a Session ID, and then send your requests using that Session ID. This process is described in the following instructions.
To test interaction with the deployed agent via REST:
1. In a terminal window of your development environment, create a session by building a request using this template:
```shell
curl \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://$(LOCATION_ID)-aiplatform.googleapis.com/v1/projects/$(PROJECT_ID)/locations/$(LOCATION_ID)/reasoningEngines/$(RESOURCE_ID):query \
-d '{"class_method": "async_create_session", "input": {"user_id": "u_123"}}'
```
If you are using Vertex AI express mode, use your API key instead:
```shell
curl \
-H "x-goog-api-key:YOUR-EXPRESS-MODE-API-KEY" \
-H "Content-Type: application/json" \
https://aiplatform.googleapis.com/v1/reasoningEngines/$(RESOURCE_ID):query \
-d '{"class_method": "async_create_session", "input": {"user_id": "u_123"}}'
```
1. In the response from the previous command, extract the created **Session ID** from the **id** field:
```json
{
"output": {
"userId": "u_123",
"lastUpdateTime": 1757690426.337745,
"state": {},
"id": "4857885913439920384", # Session ID
"appName": "9888888855577777776",
"events": []
}
}
```
1. In a terminal window of your development environment, send a message to your agent by building a request using this template and the Session ID created in the previous step:
```shell
curl \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://$(LOCATION_ID)-aiplatform.googleapis.com/v1/projects/$(PROJECT_ID)/locations/$(LOCATION_ID)/reasoningEngines/$(RESOURCE_ID):query?alt=sse -d '{
"class_method": "async_stream_query",
"input": {
"user_id": "u_123",
"session_id": "4857885913439920384",
"message": "Hey whats the weather in new york today?"
}
}'
```
If you are using Vertex AI express mode, use your API key instead:
```shell
curl \
-H "x-goog-api-key:YOUR-EXPRESS-MODE-API-KEY" \
-H "Content-Type: application/json" \
https://aiplatform.googleapis.com/v1/reasoningEngines/$(RESOURCE_ID):query?alt=sse -d '{
"class_method": "async_stream_query",
"input": {
"user_id": "u_123",
"session_id": "4857885913439920384",
"message": "Hey whats the weather in new york today?"
}
}'
```
This request should generate a response from your deployed agent code in JSON format. For more information about interacting with a deployed ADK agent in Agent Engine using REST calls, see [Manage deployed agents](https://cloud.google.com/vertex-ai/generative-ai/docs/agent-engine/manage/overview#console) and [Use an Agent Development Kit agent](https://cloud.google.com/vertex-ai/generative-ai/docs/agent-engine/use/adk) in the Agent Engine documentation.
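The Session ID extraction in step 2 can be scripted rather than done by eye. A minimal sketch using the response body shown earlier (the `json` parsing here is ours, not an ADK API):

```python
import json

# Extract the Session ID from the create-session response shown above,
# so later requests can reuse it.
create_session_response = """
{
  "output": {
    "userId": "u_123",
    "state": {},
    "id": "4857885913439920384",
    "appName": "9888888855577777776",
    "events": []
  }
}
"""
session_id = json.loads(create_session_response)["output"]["id"]
print(session_id)  # 4857885913439920384
```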
## Test using Python
You can use Python code for more sophisticated and repeatable testing of your agent deployed in Agent Engine. These instructions describe how to create a session with the deployed agent, and then send a request to the agent for processing.
### Create a remote session
Use the `remote_app` object representing your deployed agent to create a remote session:
```py
# If you are in a new script or used the ADK CLI to deploy, you can connect like this:
# from vertexai import agent_engines
# remote_app = agent_engines.get("your-agent-resource-name")
remote_session = await remote_app.async_create_session(user_id="u_456")
print(remote_session)
```
Expected output for `async_create_session` (remote):
```console
{'events': [],
'user_id': 'u_456',
'state': {},
'id': '7543472750996750336',
'app_name': '7917477678498709504',
'last_update_time': 1743683353.030133}
```
The `id` value is the session ID, and `app_name` is the resource ID of the deployed agent on Agent Engine.
#### Send queries to your remote agent
```py
async for event in remote_app.async_stream_query(
user_id="u_456",
session_id=remote_session["id"],
message="whats the weather in new york",
):
print(event)
```
Expected output for `async_stream_query` (remote):
```console
{'parts': [{'function_call': {'id': 'af-f1906423-a531-4ecf-a1ef-723b05e85321', 'args': {'city': 'new york'}, 'name': 'get_weather'}}], 'role': 'model'}
{'parts': [{'function_response': {'id': 'af-f1906423-a531-4ecf-a1ef-723b05e85321', 'name': 'get_weather', 'response': {'status': 'success', 'report': 'The weather in New York is sunny with a temperature of 25 degrees Celsius (41 degrees Fahrenheit).'}}}], 'role': 'user'}
{'parts': [{'text': 'The weather in New York is sunny with a temperature of 25 degrees Celsius (41 degrees Fahrenheit).'}], 'role': 'model'}
```
For more information about interacting with a deployed ADK agent in Agent Engine, see [Manage deployed agents](https://cloud.google.com/vertex-ai/generative-ai/docs/agent-engine/manage/overview) and [Use an Agent Development Kit agent](https://cloud.google.com/vertex-ai/generative-ai/docs/agent-engine/use/adk) in the Agent Engine documentation.
### Sending Multimodal Queries
To send multimodal queries (e.g., including images) to your agent, you can construct the `message` parameter of `async_stream_query` with a list of `types.Part` objects. Each part can be text or an image.
To include an image, you can use `types.Part.from_uri`, providing a Google Cloud Storage (GCS) URI for the image.
```python
from google.genai import types
image_part = types.Part.from_uri(
file_uri="gs://cloud-samples-data/generative-ai/image/scones.jpg",
mime_type="image/jpeg",
)
text_part = types.Part.from_text(
text="What is in this image?",
)
async for event in remote_app.async_stream_query(
user_id="u_456",
session_id=remote_session["id"],
message=[text_part, image_part],
):
print(event)
```
Note
While the underlying communication with the model may involve Base64 encoding for images, the recommended and supported method for sending image data to an agent deployed on Agent Engine is by providing a GCS URI.
## Clean up deployments
If you have performed deployments as tests, it is a good practice to clean up your cloud resources after you have finished. You can delete the deployed Agent Engine instance to avoid any unexpected charges on your Google Cloud account.
```python
remote_app.delete(force=True)
```
The `force=True` parameter also deletes any child resources that were generated from the deployed agent, such as sessions. You can also delete your deployed agent via the [Agent Engine UI](https://console.cloud.google.com/vertex-ai/agents/agent-engines) on Google Cloud.
# Observability for agents
Observability for agents enables measurement of a system's internal state, including reasoning traces, tool calls, and intermediate model outputs, by analyzing its external telemetry and structured logs. When building agents, you need these capabilities to debug and diagnose in-process behavior; basic input and output monitoring is typically insufficient for agents of any significant complexity.
Agent Development Kit (ADK) provides configurable [logging](/adk-docs/observability/logging/) functionality for monitoring and debugging agents. For more advanced monitoring and analysis, consider the [observability integrations for ADK](/adk-docs/integrations/?topic=observability).
ADK Integrations for observability
For a list of pre-built observability libraries for ADK, see [Tools and Integrations](/adk-docs/integrations/?topic=observability).
# Agent activity logging
Supported in ADK: Python v0.1.0, TypeScript v0.2.0, Go v0.1.0, Java v0.1.0
Agent Development Kit (ADK) uses Python's standard `logging` module to provide flexible and powerful logging capabilities. Understanding how to configure and interpret these logs is crucial for monitoring agent behavior and debugging issues effectively.
## Logging Philosophy
ADK's approach to logging is to provide detailed diagnostic information without being overly verbose by default. It is designed to be configured by the application developer, allowing you to tailor the log output to your specific needs, whether in a development or production environment.
- **Standard Library:** It uses the standard `logging` library, so any configuration or handler that works with it will work with ADK.
- **Hierarchical Loggers:** Loggers are named hierarchically based on the module path (e.g., `google_adk.google.adk.agents.llm_agent`), allowing for fine-grained control over which parts of the framework produce logs.
- **User-Configured:** The framework does not configure logging itself. It is the responsibility of the developer using the framework to set up the desired logging configuration in their application's entry point.
## How to Configure Logging
You can configure logging in your main application script (e.g., `main.py`) before you initialize and run your agent. The simplest way is to use `logging.basicConfig`.
### Example Configuration
To enable detailed logging, including `DEBUG` level messages, add the following to the top of your script:
```python
import logging
logging.basicConfig(
level=logging.DEBUG,
format='%(asctime)s - %(levelname)s - %(name)s - %(message)s'
)
# Your ADK agent code follows...
# from google.adk.agents import LlmAgent
# ...
```
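Because loggers are named hierarchically, you can also keep most output quiet and raise verbosity for just one part of the framework. A sketch using the logger names that appear in the sample log entries later on this page:

```python
import logging

# Keep third-party noise at WARNING, the ADK framework at INFO, and enable
# full DEBUG output only for the module that logs LLM requests.
logging.basicConfig(level=logging.WARNING)
logging.getLogger("google_adk").setLevel(logging.INFO)
logging.getLogger("google_adk.google.adk.models").setLevel(logging.DEBUG)

# Child loggers inherit the nearest configured ancestor's level:
llm_logger = logging.getLogger("google_adk.google.adk.models.google_llm")
print(llm_logger.getEffectiveLevel() == logging.DEBUG)  # True
```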
### Configuring Logging with the ADK CLI
When running agents using the ADK's built-in web or API servers, you can easily control the log verbosity directly from the command line. The `adk web`, `adk api_server`, and `adk deploy cloud_run` commands all accept a `--log_level` option.
This provides a convenient way to set the logging level without modifying your agent's source code.
> **Note:** The command-line setting always takes precedence over the programmatic configuration (like `logging.basicConfig`) for ADK's loggers. It's recommended to use `INFO` or `WARNING` in production and enable `DEBUG` only when troubleshooting.
**Example using `adk web`:**
To start the web server with `DEBUG` level logging, run:
```bash
adk web --log_level DEBUG path/to/your/agents_dir
```
The available log levels for the `--log_level` option are:
- `DEBUG`
- `INFO` (default)
- `WARNING`
- `ERROR`
- `CRITICAL`
> You can also use `-v` or `--verbose` as a shortcut for `--log_level DEBUG`.
>
> ```bash
> adk web -v path/to/your/agents_dir
> ```
### Log Levels
ADK uses standard log levels to categorize messages. The configured level determines what information gets logged.
| Level | Description | Type of Information Logged |
| ------------- | ---------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **`DEBUG`** | **Crucial for debugging.** The most verbose level for fine-grained diagnostic information. | - **Full LLM Prompts:** The complete request sent to the language model, including system instructions, history, and tools. - Detailed API responses from services. - Internal state transitions and variable values. |
| **`INFO`** | General information about the agent's lifecycle. | - Agent initialization and startup. - Session creation and deletion events. - Execution of a tool, including its name and arguments. |
| **`WARNING`** | Indicates a potential issue or deprecated feature use. The agent continues to function, but attention may be required. | - Use of deprecated methods or parameters. - Non-critical errors that the system recovered from. |
| **`ERROR`** | A serious error that prevented an operation from completing. | - Failed API calls to external services (e.g., LLM, Session Service). - Unhandled exceptions during agent execution. - Configuration errors. |
> **Note:** It is recommended to use `INFO` or `WARNING` in production environments. Only enable `DEBUG` when actively troubleshooting an issue, as `DEBUG` logs can be very verbose and may contain sensitive information.
## Reading and Understanding the Logs
The `format` string in the `basicConfig` example determines the structure of each log message.
Here’s a sample log entry:
```text
2025-07-08 11:22:33,456 - DEBUG - google_adk.google.adk.models.google_llm - LLM Request: contents { ... }
```
| Log Segment | Format Specifier | Meaning |
| ------------------------------- | ---------------- | ---------------------------------------------- |
| `2025-07-08 11:22:33,456` | `%(asctime)s` | Timestamp |
| `DEBUG` | `%(levelname)s` | Severity level |
| `google_adk.google.adk.models.google_llm` | `%(name)s` | Logger name (the module that produced the log) |
| `LLM Request: contents { ... }` | `%(message)s` | The actual log message |
By reading the logger name, you can immediately pinpoint the source of the log and understand its context within the agent's architecture.
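Since the format string joins four segments with `" - "`, a captured log line can be split back into its parts for quick filtering. A small illustrative sketch:

```python
# Split a log line produced by the format string above into its segments.
line = ("2025-07-08 11:22:33,456 - DEBUG - "
        "google_adk.google.adk.models.google_llm - LLM Request: contents { ... }")
timestamp, level, logger_name, message = line.split(" - ", 3)
print(logger_name)  # google_adk.google.adk.models.google_llm
```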
## Debugging with Logs: A Practical Example
**Scenario:** Your agent is not producing the expected output, and you suspect the prompt being sent to the LLM is incorrect or missing information.
**Steps:**
1. **Enable DEBUG Logging:** In your `main.py`, set the logging level to `DEBUG` as shown in the configuration example.
```python
logging.basicConfig(
level=logging.DEBUG,
format='%(asctime)s - %(levelname)s - %(name)s - %(message)s'
)
```
1. **Run Your Agent:** Execute your agent's task as you normally would.
1. **Inspect the Logs:** Look through the console output for a message from the `google_adk.google.adk.models.google_llm` logger that starts with `LLM Request:`.
```text
...
2025-07-10 15:26:13,778 - DEBUG - google_adk.google.adk.models.google_llm - Sending out request, model: gemini-2.0-flash, backend: GoogleLLMVariant.GEMINI_API, stream: False
2025-07-10 15:26:13,778 - DEBUG - google_adk.google.adk.models.google_llm -
LLM Request:
-----------------------------------------------------------
System Instruction:
You roll dice and answer questions about the outcome of the dice rolls.
You can roll dice of different sizes.
You can use multiple tools in parallel by calling functions in parallel(in one request and in one round).
It is ok to discuss previous dice roles, and comment on the dice rolls.
When you are asked to roll a die, you must call the roll_die tool with the number of sides. Be sure to pass in an integer. Do not pass in a string.
You should never roll a die on your own.
When checking prime numbers, call the check_prime tool with a list of integers. Be sure to pass in a list of integers. You should never pass in a string.
You should not check prime numbers before calling the tool.
When you are asked to roll a die and check prime numbers, you should always make the following two function calls:
1. You should first call the roll_die tool to get a roll. Wait for the function response before calling the check_prime tool.
2. After you get the function response from roll_die tool, you should call the check_prime tool with the roll_die result.
2.1 If user asks you to check primes based on previous rolls, make sure you include the previous rolls in the list.
3. When you respond, you must include the roll_die result from step 1.
You should always perform the previous 3 steps when asking for a roll and checking prime numbers.
You should not rely on the previous history on prime results.
You are an agent. Your internal name is "hello_world_agent".
The description about you is "hello world agent that can roll a dice of 8 sides and check prime numbers."
-----------------------------------------------------------
Contents:
{"parts":[{"text":"Roll a 6 sided dice"}],"role":"user"}
{"parts":[{"function_call":{"args":{"sides":6},"name":"roll_die"}}],"role":"model"}
{"parts":[{"function_response":{"name":"roll_die","response":{"result":2}}}],"role":"user"}
-----------------------------------------------------------
Functions:
roll_die: {'sides': {'type': <Type.INTEGER: 'INTEGER'>}}
check_prime: {'nums': {'items': {'type': <Type.INTEGER: 'INTEGER'>}, 'type': <Type.ARRAY: 'ARRAY'>}}
-----------------------------------------------------------
2025-07-10 15:26:13,779 - INFO - google_genai.models - AFC is enabled with max remote calls: 10.
2025-07-10 15:26:14,309 - INFO - google_adk.google.adk.models.google_llm -
LLM Response:
-----------------------------------------------------------
Text:
I have rolled a 6 sided die, and the result is 2.
...
```
1. **Analyze the Prompt:** By examining the `System Instruction`, `Contents`, and `Functions` sections of the logged request, you can verify:
- Is the system instruction correct?
- Is the conversation history (`user` and `model` turns) accurate?
- Is the most recent user query included?
- Are the correct tools being provided to the model?
- Are the tools correctly called by the model?
- How long the model takes to respond (compare the request and response timestamps)?
This detailed output allows you to diagnose a wide range of issues, from incorrect prompt engineering to problems with tool definitions, directly from the log files.
# Why Evaluate Agents
Supported in ADK: Python
In traditional software development, unit tests and integration tests provide confidence that code functions as expected and remains stable through changes. These tests provide a clear "pass/fail" signal, guiding further development. However, LLM agents introduce a level of variability that makes traditional testing approaches insufficient.
Due to the probabilistic nature of models, deterministic "pass/fail" assertions are often unsuitable for evaluating agent performance. Instead, we need qualitative evaluations of both the final output and the agent's trajectory - the sequence of steps taken to reach the solution. This involves assessing the quality of the agent's decisions, its reasoning process, and the final result.
This may seem like a lot of extra work to set up, but the investment in automating evaluations pays off quickly. If you intend to progress beyond a prototype, this is a highly recommended best practice.
## Preparing for Agent Evaluations
Before automating agent evaluations, define clear objectives and success criteria:
- **Define Success:** What constitutes a successful outcome for your agent?
- **Identify Critical Tasks:** What are the essential tasks your agent must accomplish?
- **Choose Relevant Metrics:** What metrics will you track to measure performance?
These considerations will guide the creation of evaluation scenarios and enable effective monitoring of agent behavior in real-world deployments.
## What to Evaluate?
To bridge the gap between a proof-of-concept and a production-ready AI agent, a robust and automated evaluation framework is essential. Unlike evaluating generative models, where the focus is primarily on the final output, agent evaluation requires a deeper understanding of the decision-making process. Agent evaluation can be broken down into two components:
1. **Evaluating Trajectory and Tool Use:** Analyzing the steps an agent takes to reach a solution, including its choice of tools, strategies, and the efficiency of its approach.
1. **Evaluating the Final Response:** Assessing the quality, relevance, and correctness of the agent's final output.
The trajectory is just a list of steps the agent took before it returned to the user. We can compare that against the list of steps we expect the agent to have taken.
### Evaluating trajectory and tool use
Before responding to a user, an agent typically performs a series of actions, which we refer to as a 'trajectory.' It might compare the user input with session history to disambiguate a term, look up a policy document, search a knowledge base, or invoke an API to save a ticket. Evaluating an agent's performance requires comparing its actual trajectory to an expected, or ideal, one. This comparison can reveal errors and inefficiencies in the agent's process. The expected trajectory represents the ground truth: the list of steps we anticipate the agent should take.
For example:
```python
# Trajectory evaluation will compare
expected_steps = ["determine_intent", "use_tool", "review_results", "report_generation"]
actual_steps = ["determine_intent", "use_tool", "review_results", "report_generation"]
```
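A comparison like this can be made concrete with a simple exact-match score. The helper below is our own sketch, not an ADK API: it returns the fraction of positions where the actual step matches the expected one, normalized by the longer list.

```python
# Sketch of an exact-match trajectory score (our own helper, not an ADK API).
def trajectory_match(expected: list[str], actual: list[str]) -> float:
    """Fraction of positions where actual matches expected, 1.0 for a perfect match."""
    if not expected and not actual:
        return 1.0
    matches = sum(e == a for e, a in zip(expected, actual))
    return matches / max(len(expected), len(actual))

expected_steps = ["determine_intent", "use_tool", "review_results", "report_generation"]
actual_steps = ["determine_intent", "use_tool", "review_results", "report_generation"]
print(trajectory_match(expected_steps, actual_steps))  # 1.0
```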
ADK provides both ground-truth-based and rubric-based tool-use evaluation metrics. To select the appropriate metric for your agent's specific requirements and goals, refer to our [recommendations](#recommendations-on-criteria).
## How Evaluation works with the ADK
The ADK offers two methods for evaluating agent performance against predefined datasets and evaluation criteria. While conceptually similar, they differ in the amount of data they can process, which typically dictates the appropriate use case for each.
### First approach: Using a test file
This approach involves creating individual test files, each representing a single, simple agent-model interaction (a session). It's most effective during active agent development, serving as a form of unit testing. These tests are designed for rapid execution and should focus on sessions of low complexity. Each test file contains a single session, which may consist of multiple turns. A turn represents a single interaction between the user and the agent. Each turn includes:
- `User Content`: The user-issued query.
- `Expected Intermediate Tool Use Trajectory`: The tool calls we expect the agent to make in order to respond correctly to the user query.
- `Expected Intermediate Agent Responses`: The natural language responses that the agent (or its sub-agents) generates on the way to a final answer. These responses are usually an artifact of a multi-agent system, where your root agent depends on sub-agents to achieve a goal. They may or may not be of interest to the end user, but for the developer or owner of the system they are critical, because they give you confidence that the agent took the right path to generate the final response.
- `Final Response`: The expected final response from the agent.
You can give the file any name, for example `evaluation.test.json`. The framework only checks for the `.test.json` suffix; the preceding part of the filename is not constrained. The test files are backed by a formal Pydantic data model. The two key schema files are [Eval Set](https://github.com/google/adk-python/blob/main/src/google/adk/evaluation/eval_set.py) and [Eval Case](https://github.com/google/adk-python/blob/main/src/google/adk/evaluation/eval_case.py). Here is a test file with a few examples:
*(Note: Comments are included for explanatory purposes and should be removed for the JSON to be valid.)*
```json
# Do note that some fields are removed for sake of making this doc readable.
{
"eval_set_id": "home_automation_agent_light_on_off_set",
"name": "",
"description": "This is an eval set that is used for unit testing `x` behavior of the Agent",
"eval_cases": [
{
"eval_id": "eval_case_id",
"conversation": [
{
"invocation_id": "b7982664-0ab6-47cc-ab13-326656afdf75", # Unique identifier for the invocation.
"user_content": { # Content provided by the user in this invocation. This is the query.
"parts": [
{
"text": "Turn off device_2 in the Bedroom."
}
],
"role": "user"
},
"final_response": { # Final response from the agent that acts as a reference of benchmark.
"parts": [
{
"text": "I have set the device_2 status to off."
}
],
"role": "model"
},
"intermediate_data": {
"tool_uses": [ # Tool use trajectory in chronological order.
{
"args": {
"location": "Bedroom",
"device_id": "device_2",
"status": "OFF"
},
"name": "set_device_info"
}
],
"intermediate_responses": [] # Any intermediate sub-agent responses.
}
}
],
"session_input": { # Initial session input.
"app_name": "home_automation_agent",
"user_id": "test_user",
"state": {}
}
}
]
}
```
Test files can be organized into folders. Optionally, a folder can also include a `test_config.json` file that specifies the evaluation criteria.
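For example, a `test_config.json` can set thresholds for built-in criteria such as trajectory matching and response matching. Treat the exact values below as placeholders for your own thresholds:

```json
{
  "criteria": {
    "tool_trajectory_avg_score": 1.0,
    "response_match_score": 0.8
  }
}
```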
#### How to migrate test files not backed by the Pydantic schema?
NOTE: If your test files don't adhere to [EvalSet](https://github.com/google/adk-python/blob/main/src/google/adk/evaluation/eval_set.py) schema file, then this section is relevant to you.
Please use `AgentEvaluator.migrate_eval_data_to_new_schema` to migrate your existing `*.test.json` files to the Pydantic backed schema.
The utility takes your current test data file and an optional initial session file, and generates a single output JSON file with the data serialized in the new format. Because the new schema is more cohesive, both the old test data file and the initial session file can be ignored (or removed).
### Second approach: Using An Evalset File
The evalset approach utilizes a dedicated dataset called an "evalset" for evaluating agent-model interactions. Similar to a test file, the evalset contains example interactions. However, an evalset can contain multiple, potentially lengthy sessions, making it ideal for simulating complex, multi-turn conversations. Due to its ability to represent complex sessions, the evalset is well-suited for integration tests. These tests are typically run less frequently than unit tests due to their more extensive nature.
An evalset file contains multiple "evals," each representing a distinct session. Each eval consists of one or more "turns," which include the user query, expected tool use, expected intermediate agent responses, and a reference response. These fields have the same meaning as they do in the test file approach. Alternatively, an eval can define a *conversation scenario* which is used to [dynamically simulate](https://google.github.io/adk-docs/evaluate/user-sim/index.md) a user interaction with the agent. Each eval is identified by a unique name. Furthermore, each eval includes an associated initial session state.
Creating evalsets manually can be complex, therefore UI tools are provided to help capture relevant sessions and easily convert them into evals within your evalset. Learn more about using the web UI for evaluation below. Here is an example evalset containing two sessions. The eval set files are backed by a formal Pydantic data model. The two key schema files are [Eval Set](https://github.com/google/adk-python/blob/main/src/google/adk/evaluation/eval_set.py) and [Eval Case](https://github.com/google/adk-python/blob/main/src/google/adk/evaluation/eval_case.py).
Warning
This evalset evaluation method requires the use of a paid service, [Vertex Gen AI Evaluation Service API](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/evaluation).
*(Note: Comments are included for explanatory purposes and should be removed for the JSON to be valid.)*
```json
# Note that some fields are removed for the sake of readability.
{
"eval_set_id": "eval_set_example_with_multiple_sessions",
"name": "Eval set with multiple sessions",
"description": "This eval set is an example that shows that an eval set can have more than one session.",
"eval_cases": [
{
"eval_id": "session_01",
"conversation": [
{
"invocation_id": "e-0067f6c4-ac27-4f24-81d7-3ab994c28768",
"user_content": {
"parts": [
{
"text": "What can you do?"
}
],
"role": "user"
},
"final_response": {
"parts": [
{
"text": "I can roll dice of different sizes and check if numbers are prime."
}
],
"role": null
},
"intermediate_data": {
"tool_uses": [],
"intermediate_responses": []
}
}
],
"session_input": {
"app_name": "hello_world",
"user_id": "user",
"state": {}
}
},
{
"eval_id": "session_02",
"conversation": [
{
"invocation_id": "e-92d34c6d-0a1b-452a-ba90-33af2838647a",
"user_content": {
"parts": [
{
"text": "Roll a 19 sided dice"
}
],
"role": "user"
},
"final_response": {
"parts": [
{
"text": "I rolled a 17."
}
],
"role": null
},
"intermediate_data": {
"tool_uses": [],
"intermediate_responses": []
}
},
{
"invocation_id": "e-bf8549a1-2a61-4ecc-a4ee-4efbbf25a8ea",
"user_content": {
"parts": [
{
"text": "Roll a 10 sided dice twice and then check if 9 is a prime or not"
}
],
"role": "user"
},
"final_response": {
"parts": [
{
"text": "I got 4 and 7 from the dice roll, and 9 is not a prime number.\n"
}
],
"role": null
},
"intermediate_data": {
"tool_uses": [
{
"id": "adk-1a3f5a01-1782-4530-949f-07cf53fc6f05",
"args": {
"sides": 10
},
"name": "roll_die"
},
{
"id": "adk-52fc3269-caaf-41c3-833d-511e454c7058",
"args": {
"sides": 10
},
"name": "roll_die"
},
{
"id": "adk-5274768e-9ec5-4915-b6cf-f5d7f0387056",
"args": {
"nums": [
9
]
},
"name": "check_prime"
}
],
"intermediate_responses": [
[
"data_processing_agent",
[
{
"text": "I have rolled a 10 sided die twice. The first roll is 5 and the second roll is 3.\n"
}
]
]
]
}
}
],
"session_input": {
"app_name": "hello_world",
"user_id": "user",
"state": {}
}
}
]
}
```
#### How to migrate eval set files not backed by the Pydantic schema?
NOTE: If your eval set files don't adhere to the [EvalSet](https://github.com/google/adk-python/blob/main/src/google/adk/evaluation/eval_set.py) schema, this section is relevant to you.
Based on who is maintaining the eval set data, there are two routes:
1. **Eval set data maintained by the ADK UI:** If you use the ADK UI to maintain your eval set data, then *no action is needed* from you.
1. **Eval set data developed and maintained manually and used with the ADK eval CLI:** A migration tool is in the works; until then, the ADK eval CLI command will continue to support data in the old format.
### Evaluation Criteria
ADK provides several built-in criteria for evaluating agent performance, ranging from tool trajectory matching to LLM-based response quality assessment. For a detailed list of available criteria and guidance on when to use them, please see [Evaluation Criteria](https://google.github.io/adk-docs/evaluate/criteria/index.md).
Here is a summary of all the available criteria:
- **tool_trajectory_avg_score**: Exact match of tool call trajectory.
- **response_match_score**: ROUGE-1 similarity to reference response.
- **final_response_match_v2**: LLM-judged semantic match to a reference response.
- **rubric_based_final_response_quality_v1**: LLM-judged final response quality based on custom rubrics.
- **rubric_based_tool_use_quality_v1**: LLM-judged tool usage quality based on custom rubrics.
- **hallucinations_v1**: LLM-judged groundedness of agent response against context.
- **safety_v1**: Safety/harmlessness of agent response.
If no evaluation criteria are provided, the following default configuration is used:
- `tool_trajectory_avg_score`: Defaults to 1.0, requiring a 100% match in the tool usage trajectory.
- `response_match_score`: Defaults to 0.8, allowing for a small margin of error in the agent's natural language responses.
Here is an example of a `test_config.json` file specifying custom evaluation criteria:
```json
{
"criteria": {
"tool_trajectory_avg_score": 1.0,
"response_match_score": 0.8
}
}
```
#### Recommendations on Criteria
Choose criteria based on your evaluation goals:
- **Enable tests in CI/CD pipelines or regression testing:** Use `tool_trajectory_avg_score` and `response_match_score`. These criteria are fast, predictable, and suitable for frequent automated checks.
- **Evaluate trusted reference responses:** Use `final_response_match_v2` to evaluate semantic equivalence. This LLM-based check is more flexible than exact matching and better captures whether the agent's response means the same thing as the reference response.
- **Evaluate response quality without a reference response:** Use `rubric_based_final_response_quality_v1`. This is useful when you don't have a trusted reference, but you can define attributes of a good response (e.g., "The response is concise," "The response has a helpful tone").
- **Evaluate the correctness of tool usage:** Use `rubric_based_tool_use_quality_v1`. This allows you to validate the agent's reasoning process by checking, for example, that a specific tool was called or that tools were called in the correct order (e.g., "Tool A must be called before Tool B").
- **Check if responses are grounded in context:** Use `hallucinations_v1` to detect if the agent makes claims that are unsupported by or contradictory to the information available to it (e.g., tool outputs).
- **Check for harmful content:** Use `safety_v1` to ensure that agent responses are safe and do not violate safety policies.
Note that criteria which require information about expected agent tool use and/or responses are not supported in combination with [User Simulation](https://google.github.io/adk-docs/evaluate/user-sim/index.md). Currently, only the `hallucinations_v1` and `safety_v1` criteria support such evals.
### User Simulation
When evaluating conversational agents, it is not always practical to use a fixed set of user prompts, as the conversation can proceed in unexpected ways. For example, if the agent needs the user to supply two values to perform a task, it may ask for those values one at a time or both at once. To resolve this issue, ADK allows you to test the behavior of the agent in a specific *conversation scenario* with user prompts that are dynamically generated by an AI model. For details on how to set up an eval with user simulation, see [User Simulation](https://google.github.io/adk-docs/evaluate/user-sim/index.md).
## How to run Evaluation with the ADK
As a developer, you can evaluate your agents using the ADK in the following ways:
1. **Web-based UI (**`adk web`**):** Evaluate agents interactively through a web-based interface.
1. **Programmatically (**`pytest`**)**: Integrate evaluation into your testing pipeline using `pytest` and test files.
1. **Command Line Interface (**`adk eval`**):** Run evaluations on an existing evaluation set file directly from the command line.
### 1. `adk web` - Run Evaluations via the Web UI
The web UI provides an interactive way to evaluate agents, generate evaluation datasets, and inspect agent behavior in detail.
#### Step 1: Create and Save a Test Case
1. Start the web server by running: `adk web`
1. In the web interface, select an agent and interact with it to create a session.
1. Navigate to the **Eval** tab on the right side of the interface.
1. Create a new eval set or select an existing one.
1. Click **"Add current session"** to save the conversation as a new evaluation case.
#### Step 2: View and Edit Your Test Case
Once a case is saved, you can click its ID in the list to inspect it. To make changes, click the **Edit current eval case** icon (pencil). This interactive view allows you to:
- **Modify** agent text responses to refine test scenarios.
- **Delete** individual agent messages from the conversation.
- **Delete** the entire evaluation case if it's no longer needed.
#### Step 3: Run the Evaluation with Custom Metrics
1. Select one or more test cases from your evalset.
1. Click **Run Evaluation**. An **EVALUATION METRIC** dialog will appear.
1. In the dialog, use the sliders to configure the thresholds for:
- **Tool trajectory avg score**
- **Response match score**
1. Click **Start** to run the evaluation using your custom criteria. The evaluation history will record the metrics used for each run.
#### Step 4: Analyze Results
After the run completes, you can analyze the results:
- **Analyze Run Failures**: Click on any **Pass** or **Fail** result. For failures, you can hover over the `Fail` label to see a side-by-side comparison of the **Actual vs. Expected Output** and the scores that caused the failure.
### Debugging with the Trace View
The ADK web UI includes a powerful **Trace** tab for debugging agent behavior. This feature is available for any agent session, not just during evaluation.
The **Trace** tab provides a detailed and interactive way to inspect your agent's execution flow. Traces are automatically grouped by user message, making it easy to follow the chain of events.
Each trace row is interactive:
- **Hovering** over a trace row highlights the corresponding message in the chat window.
- **Clicking** on a trace row opens a detailed inspection panel with four tabs:
- **Event**: The raw event data.
- **Request**: The request sent to the model.
- **Response**: The response received from the model.
- **Graph**: A visual representation of the tool calls and agent logic flow.
Blue rows in the trace view indicate that an event was generated from that interaction. Clicking on these blue rows will open the bottom event detail panel, providing deeper insights into the agent's execution flow.
### 2. `pytest` - Run Tests Programmatically
You can also use **`pytest`** to run test files as part of your integration tests.
#### Example Command
```shell
pytest tests/integration/
```
#### Example Test Code
Here is an example of a `pytest` test case that runs a single test file:
```py
from google.adk.evaluation.agent_evaluator import AgentEvaluator
import pytest
@pytest.mark.asyncio
async def test_with_single_test_file():
"""Test the agent's basic ability via a session file."""
await AgentEvaluator.evaluate(
agent_module="home_automation_agent",
eval_dataset_file_path_or_dir="tests/integration/fixture/home_automation_agent/simple_test.test.json",
)
```
This approach allows you to integrate agent evaluations into your CI/CD pipelines or larger test suites. If you want to specify the initial session state for your tests, you can do that by storing the session details in a file and passing it to the `AgentEvaluator.evaluate` method.
### 3. `adk eval` - Run Evaluations via the CLI
You can also run an evaluation of an eval set file through the command line interface (CLI). This runs the same evaluation that the web UI runs, but it helps with automation: for example, you can add this command to your regular build generation and verification process.
Here is the command:
```shell
adk eval \
    AGENT_MODULE_FILE_PATH \
    EVAL_SET_FILE_PATH \
    [--config_file_path=CONFIG_FILE_PATH] \
    [--print_detailed_results]
```
For example:
```shell
adk eval \
samples_for_testing/hello_world \
samples_for_testing/hello_world/hello_world_eval_set_001.evalset.json
```
Here are the details for each command line argument:
- `AGENT_MODULE_FILE_PATH`: The path to the `__init__.py` file that contains a module named "agent". The "agent" module contains a `root_agent`.
- `EVAL_SET_FILE_PATH`: The path to the evaluation file(s). You can specify one or more eval set file paths. For each file, all evals are run by default. If you want to run only specific evals from an eval set, first create a comma-separated list of eval names and then add it as a suffix to the eval set file name, demarcated by a colon `:`.
  - For example: `sample_eval_set_file.json:eval_1,eval_2,eval_3` runs only `eval_1`, `eval_2`, and `eval_3` from `sample_eval_set_file.json`.
- `CONFIG_FILE_PATH`: The path to the config file.
- `PRINT_DETAILED_RESULTS`: Prints detailed results on the console.
# Evaluation Criteria
Supported in ADK Python
This page outlines the evaluation criteria provided by ADK to assess agent performance, including tool use trajectory, response quality, and safety.
| Criterion | Description | Reference-Based | Requires Rubrics | LLM-as-a-Judge | Supports [User Simulation](https://google.github.io/adk-docs/evaluate/user-sim/index.md) |
| ---------------------------------------- | --------------------------------------------------------- | --------------- | ---------------- | -------------- | ---------------------------------------------------------------------------------------- |
| `tool_trajectory_avg_score` | Exact match of tool call trajectory | Yes | No | No | No |
| `response_match_score` | ROUGE-1 similarity to reference response | Yes | No | No | No |
| `final_response_match_v2` | LLM-judged semantic match to reference response | Yes | No | Yes | No |
| `rubric_based_final_response_quality_v1` | LLM-judged final response quality based on custom rubrics | No | Yes | Yes | Yes |
| `rubric_based_tool_use_quality_v1` | LLM-judged tool usage quality based on custom rubrics | No | Yes | Yes | Yes |
| `hallucinations_v1` | LLM-judged groundedness of agent response against context | No | No | Yes | Yes |
| `safety_v1` | Safety/harmlessness of agent response | No | No | Yes | Yes |
| `per_turn_user_simulator_quality_v1` | LLM-judged user simulator quality | No | No | Yes | Yes |
## tool_trajectory_avg_score
This criterion compares the sequence of tools called by the agent against a list of expected calls and computes an average score based on one of the match types: `EXACT`, `IN_ORDER`, or `ANY_ORDER`.
#### When To Use This Criterion?
This criterion is ideal for scenarios where agent correctness depends on tool calls. Depending on how strictly tool calls need to be followed, you can choose from one of three match types: `EXACT`, `IN_ORDER`, and `ANY_ORDER`.
This metric is particularly valuable for:
- **Regression testing:** Ensuring that agent updates do not unintentionally alter tool call behavior for established test cases.
- **Workflow validation:** Verifying that agents correctly follow predefined workflows that require specific API calls in a specific order.
- **High-precision tasks:** Evaluating tasks where slight deviations in tool parameters or call order can lead to significantly different or incorrect outcomes.
Use `EXACT` match when you need to enforce a specific tool execution path and consider any deviation—whether in tool name, arguments, or order—as a failure.
Use `IN_ORDER` match when you want to ensure certain key tool calls occur in a specific order, while allowing other tool calls to happen in between.
Use `ANY_ORDER` match when you want to ensure certain key tool calls occur, but do not care about their order, and allow for other tool calls to happen in between. This criterion is helpful when multiple tool calls about the same concept occur, for example when your agent issues five search queries: you don't care about the order in which the queries are issued, as long as they all occur.
#### Details
For each invocation that is being evaluated, this criterion compares the list of tool calls produced by the agent against the list of expected tool calls using one of three match types. If the tool calls match based on the selected match type, a score of 1.0 is awarded for that invocation, otherwise the score is 0.0. The final value is the average of these scores across all invocations in the eval case.
The comparison can be done using one of following match types:
- **`EXACT`**: Requires a perfect match between the actual and expected tool calls, with no extra or missing tool calls.
- **`IN_ORDER`**: Requires all tool calls from the expected list to be present in the actual list, in the same order, but allows for other tool calls to appear in between.
- **`ANY_ORDER`**: Requires all tool calls from the expected list to be present in the actual list, in any order, and allows for other tool calls to appear in between.
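To make the three match types concrete, here is a minimal illustrative sketch (not ADK's actual implementation) that scores invocations with each tool call modeled as a `(name, args)` tuple:

```python
def matches(actual, expected, match_type="EXACT"):
    """Illustrative per-invocation check; tool calls are (name, args) tuples."""
    if match_type == "EXACT":
        # Perfect match: no extra, missing, or reordered calls.
        return actual == expected
    if match_type == "IN_ORDER":
        # Expected calls must appear as a subsequence of the actual calls;
        # other calls may occur in between.
        it = iter(actual)
        return all(call in it for call in expected)
    if match_type == "ANY_ORDER":
        # Every expected call must appear somewhere in the actual calls.
        remaining = list(actual)
        for call in expected:
            if call not in remaining:
                return False
            remaining.remove(call)
        return True
    raise ValueError(f"unknown match type: {match_type}")

def tool_trajectory_avg_score(invocations, match_type="EXACT"):
    """Average of per-invocation 1.0/0.0 scores across an eval case."""
    scores = [1.0 if matches(a, e, match_type) else 0.0 for a, e in invocations]
    return sum(scores) / len(scores)
```

Note how an extra interleaved call fails `EXACT` but can still pass `IN_ORDER` and `ANY_ORDER`.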
#### How To Use This Criterion?
By default, `tool_trajectory_avg_score` uses the `EXACT` match type. For `EXACT` match, you can specify just a threshold for this criterion in `EvalConfig` under the `criteria` dictionary. The value should be a float between 0.0 and 1.0, which represents the minimum acceptable score for the eval case to pass. If you expect tool trajectories to match exactly in all invocations, you should set the threshold to 1.0.
Example `EvalConfig` entry for `EXACT` match:
```json
{
"criteria": {
"tool_trajectory_avg_score": 1.0
}
}
```
Or you could specify the `match_type` explicitly:
```json
{
"criteria": {
"tool_trajectory_avg_score": {
"threshold": 1.0,
"match_type": "EXACT"
}
}
}
```
If you want to use `IN_ORDER` or `ANY_ORDER` match type, you can specify it via `match_type` field along with threshold.
Example `EvalConfig` entry for `IN_ORDER` match:
```json
{
"criteria": {
"tool_trajectory_avg_score": {
"threshold": 1.0,
"match_type": "IN_ORDER"
}
}
}
```
Example `EvalConfig` entry for `ANY_ORDER` match:
```json
{
"criteria": {
"tool_trajectory_avg_score": {
"threshold": 1.0,
"match_type": "ANY_ORDER"
}
}
}
```
#### Output And How To Interpret
The output is a score between 0.0 and 1.0, where 1.0 indicates a perfect match between actual and expected tool trajectories for all invocations, and 0.0 indicates a complete mismatch for all invocations. Higher scores are better. A score below 1.0 means that for at least one invocation, the agent's tool call trajectory deviated from the expected one.
## response_match_score
This criterion evaluates whether the agent's final response matches a golden/expected final response using ROUGE-1.
### When To Use This Criterion?
Use this criterion when you need a quantitative measure of how closely the agent's output matches the expected output in terms of content overlap.
### Details
ROUGE-1 specifically measures the overlap of unigrams (single words) between the system-generated text (candidate) and a reference text. It essentially checks how many individual words from the reference text are present in the candidate text. To learn more, see the details on [ROUGE-1](https://github.com/google-research/google-research/tree/master/rouge).
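As a rough illustration of the idea (ADK delegates the actual computation to a ROUGE implementation, and the exact variant and tokenization may differ), a unigram F-measure can be sketched as:

```python
from collections import Counter

def rouge_1(candidate: str, reference: str) -> float:
    """Illustrative ROUGE-1 F-measure over lowercase whitespace tokens."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

For example, a candidate sharing two of three words with the reference scores roughly 0.67, while an identical response scores 1.0.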
### How To Use This Criterion?
You can specify a threshold for this criterion in `EvalConfig` under the `criteria` dictionary. The value should be a float between 0.0 and 1.0, which represents the minimum acceptable score for the eval case to pass.
Example `EvalConfig` entry:
```json
{
"criteria": {
"response_match_score": 0.8
}
}
```
### Output And How To Interpret
The value range for this criterion is [0, 1], with values closer to 1 being more desirable.
## final_response_match_v2
This criterion evaluates if the agent's final response matches a golden/expected final response using LLM as a judge.
### When To Use This Criterion?
Use this criterion when you need to evaluate the correctness of an agent's final response against a reference, but require flexibility in how the answer is presented. It is suitable for cases where different phrasings or formats are acceptable, as long as the core meaning and information match the reference. This criterion is a good choice for evaluating question-answering, summarization, or other generative tasks where semantic equivalence is more important than exact lexical overlap, making it a more sophisticated alternative to `response_match_score`.
### Details
This criterion uses a Large Language Model (LLM) as a judge to determine if the agent's final response is semantically equivalent to the provided reference response. It is designed to be more flexible than lexical matching metrics (like `response_match_score`), as it focuses on whether the agent's response contains the correct information, while tolerating differences in formatting, phrasing, or the inclusion of additional correct details.
For each invocation, the criterion prompts a judge LLM to rate the agent's response as "valid" or "invalid" compared to the reference. This is repeated multiple times for robustness (configurable via `num_samples`), and a majority vote determines if the invocation receives a score of 1.0 (valid) or 0.0 (invalid). The final criterion score is the fraction of invocations deemed valid across the entire eval case.
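The aggregation described above can be sketched as follows. In ADK the per-sample verdicts come from the judge LLM; here they are supplied directly for illustration:

```python
def invocation_score(verdicts):
    """Majority vote over repeated judge samples for one invocation."""
    valid = sum(1 for v in verdicts if v == "valid")
    return 1.0 if valid > len(verdicts) / 2 else 0.0

def final_response_match_score(per_invocation_verdicts):
    """Fraction of invocations whose majority verdict is 'valid'."""
    scores = [invocation_score(v) for v in per_invocation_verdicts]
    return sum(scores) / len(scores)
```

The eval case passes when this fraction meets the configured threshold.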
### How To Use This Criterion?
This criterion uses `LlmAsAJudgeCriterion`, allowing you to configure the evaluation threshold, the judge model, and the number of samples per invocation.
Example `EvalConfig` entry:
```json
{
"criteria": {
"final_response_match_v2": {
"threshold": 0.8,
"judge_model_options": {
"judge_model": "gemini-2.5-flash",
"num_samples": 5
}
}
}
}
```
### Output And How To Interpret
The criterion returns a score between 0.0 and 1.0. A score of 1.0 means the LLM judge considered the agent's final response to be valid for all invocations, while a score closer to 0.0 indicates that many responses were judged as invalid when compared to the reference responses. Higher values are better.
## rubric_based_final_response_quality_v1
This criterion assesses the quality of an agent's final response against a user-defined set of rubrics using LLM as a judge.
### When To Use This Criterion?
Use this criterion when you need to evaluate aspects of response quality that go beyond simple correctness or semantic equivalence with a reference. It is ideal for assessing nuanced attributes like tone, style, helpfulness, or adherence to specific conversational guidelines defined in your rubrics. This criterion is particularly useful when no single reference response exists, or when quality depends on multiple subjective factors.
### Details
This criterion provides a flexible way to evaluate response quality based on specific criteria that you define as rubrics. For example, you could define rubrics to check if a response is concise, if it correctly infers user intent, or if it avoids jargon.
The criterion uses an LLM-as-a-judge to evaluate the agent's final response against each rubric, producing a `yes` (1.0) or `no` (0.0) verdict for each. Like other LLM-based metrics, it samples the judge model multiple times per invocation and uses a majority vote to determine the score for each rubric in that invocation. The overall score for an invocation is the average of its rubric scores. The final criterion score for the eval case is the average of these overall scores across all invocations.
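A simplified sketch of this aggregation, with the per-rubric majority-vote verdicts already computed (in ADK these come from sampling the judge model):

```python
def rubric_based_score(case):
    """case: list of invocations, each mapping rubric_id -> 1.0/0.0 verdict.

    The invocation score is the average of its rubric verdicts; the final
    criterion score is the average of invocation scores across the case.
    """
    per_invocation = [sum(rubrics.values()) / len(rubrics) for rubrics in case]
    return sum(per_invocation) / len(per_invocation)
```

For instance, satisfying both rubrics in one invocation and one of two in another yields (1.0 + 0.5) / 2 = 0.75.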
### How To Use This Criterion?
This criterion uses `RubricsBasedCriterion`, which requires a list of rubrics to be provided in the `EvalConfig`. Each rubric should be defined with a unique ID and its content.
Example `EvalConfig` entry:
```json
{
"criteria": {
"rubric_based_final_response_quality_v1": {
"threshold": 0.8,
"judge_model_options": {
"judge_model": "gemini-2.5-flash",
"num_samples": 5
},
"rubrics": [
{
"rubric_id": "conciseness",
"rubric_content": {
"text_property": "The agent's response is direct and to the point."
}
},
{
"rubric_id": "intent_inference",
"rubric_content": {
"text_property": "The agent's response accurately infers the user's underlying goal from ambiguous queries."
}
}
]
}
}
}
```
### Output And How To Interpret
The criterion outputs an overall score between 0.0 and 1.0, where 1.0 indicates that the agent's responses satisfied all rubrics across all invocations, and 0.0 indicates that no rubrics were satisfied. The results also include detailed per-rubric scores for each invocation. Higher values are better.
## rubric_based_tool_use_quality_v1
This criterion assesses the quality of an agent's tool usage against a user-defined set of rubrics using LLM as a judge.
### When To Use This Criterion?
Use this criterion when you need to evaluate *how* an agent uses tools, rather than just *if* the final response is correct. It is ideal for assessing whether the agent selected the right tool, used the correct parameters, or followed a specific sequence of tool calls. This is useful for validating agent reasoning processes, debugging tool-use errors, and ensuring adherence to prescribed workflows, especially in cases where multiple tool-use paths could lead to a similar final answer but only one path is considered correct.
### Details
This criterion provides a flexible way to evaluate tool usage based on specific rules that you define as rubrics. For example, you could define rubrics to check if a specific tool was called, if its parameters were correct, or if tools were called in a particular order.
The criterion uses an LLM-as-a-judge to evaluate the agent's tool calls and responses against each rubric, producing a `yes` (1.0) or `no` (0.0) verdict for each. Like other LLM-based metrics, it samples the judge model multiple times per invocation and uses a majority vote to determine the score for each rubric in that invocation. The overall score for an invocation is the average of its rubric scores. The final criterion score for the eval case is the average of these overall scores across all invocations.
### How To Use This Criterion?
This criterion uses `RubricsBasedCriterion`, which requires a list of rubrics to be provided in the `EvalConfig`. Each rubric should be defined with a unique ID and its content, describing a specific aspect of tool use to evaluate.
Example `EvalConfig` entry:
```json
{
"criteria": {
"rubric_based_tool_use_quality_v1": {
"threshold": 1.0,
"judge_model_options": {
"judge_model": "gemini-2.5-flash",
"num_samples": 5
},
"rubrics": [
{
"rubric_id": "geocoding_called",
"rubric_content": {
"text_property": "The agent calls the GeoCoding tool before calling the GetWeather tool."
}
},
{
"rubric_id": "getweather_called",
"rubric_content": {
"text_property": "The agent calls the GetWeather tool with coordinates derived from the user's location."
}
}
]
}
}
}
```
### Output And How To Interpret
The criterion outputs an overall score between 0.0 and 1.0, where 1.0 indicates that the agent's tool usage satisfied all rubrics across all invocations, and 0.0 indicates that no rubrics were satisfied. The results also include detailed per-rubric scores for each invocation. Higher values are better.
## hallucinations_v1
This criterion assesses whether a model response contains any false, contradictory, or unsupported claims.
### When To Use This Criterion?
Use this criterion to ensure the agent's response is grounded in the provided context (e.g., tool outputs, user query, instructions) and does not contain hallucinations.
### Details
This criterion assesses whether a model response contains any false, contradictory, or unsupported claims based on context that includes developer instructions, user prompt, tool definitions, and tool invocations and their results. It uses LLM-as-a-judge and follows a two-step process:
1. **Segmenter**: Segments the agent response into individual sentences.
1. **Sentence Validator**: Evaluates each segmented sentence against the provided context for grounding. Each sentence is labeled as `supported`, `unsupported`, `contradictory`, `disputed` or `not_applicable`.
The metric computes an Accuracy Score: the percentage of sentences that are `supported` or `not_applicable`. By default, only the final response is evaluated. If `evaluate_intermediate_nl_responses` is set to true in the criterion, intermediate natural language responses from agents are also evaluated.
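The Accuracy Score computation can be sketched as follows (in ADK the sentence labels come from the sentence validator; here they are supplied directly):

```python
def accuracy_score(sentence_labels):
    """Fraction of sentences labeled 'supported' or 'not_applicable'."""
    grounded = {"supported", "not_applicable"}
    hits = sum(1 for label in sentence_labels if label in grounded)
    return hits / len(sentence_labels)
```

A response with one `unsupported` sentence out of four, for example, scores 0.75.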
### How To Use This Criterion?
This criterion uses `HallucinationsCriterion`, allowing you to configure the evaluation threshold, the judge model, the number of samples per invocation and whether to evaluate intermediate natural language responses.
Example `EvalConfig` entry:
```json
{
"criteria": {
"hallucinations_v1": {
"threshold": 0.8,
"judge_model_options": {
"judge_model": "gemini-2.5-flash",
},
"evaluate_intermediate_nl_responses": true
}
}
}
```
### Output And How To Interpret
The criterion returns a score between 0.0 and 1.0. A score of 1.0 means all sentences in agent's response are grounded in the context, while a score closer to 0.0 indicates that many sentences are false, contradictory, or unsupported. Higher values are better.
## safety_v1
This criterion evaluates the safety (harmlessness) of an Agent's Response.
### When To Use This Criterion?
This criterion should be used when you need to ensure that agent responses comply with safety guidelines and do not produce harmful or inappropriate content. It is essential for user-facing applications or any system where response safety is a priority.
### Details
This criterion assesses whether the agent's response contains any harmful content, such as hate speech, harassment, or dangerous information. Unlike other metrics implemented natively within ADK, `safety_v1` delegates the evaluation to the Vertex AI General AI Eval SDK.
### How To Use This Criterion?
Using this criterion requires a Google Cloud Project. You must have `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` environment variables set, typically in an `.env` file in your agent's directory, for the Vertex AI SDK to function correctly.
You can specify a threshold for this criterion in `EvalConfig` under the `criteria` dictionary. The value should be a float between 0.0 and 1.0, representing the minimum safety score for a response to be considered passing.
Example `EvalConfig` entry:
```json
{
"criteria": {
"safety_v1": 0.8
}
}
```
### Output And How To Interpret
The criterion returns a score between 0.0 and 1.0. Scores closer to 1.0 indicate that the response is safe, while scores closer to 0.0 indicate potential safety issues.
## per_turn_user_simulator_quality_v1
This criterion evaluates whether a user simulator is faithful to a conversation plan.
#### When To Use This Criterion?
Use this criterion when you need to evaluate a user simulator in a multi-turn conversation. It is designed to assess whether the simulator follows the conversation plan defined in the `ConversationScenario`.
#### Details
This criterion determines whether the user simulator follows a defined `ConversationScenario` in a multi-turn conversation.
For the first turn, this criterion checks if user simulator response matches the `starting_prompt` in the `ConversationScenario`. For subsequent turns, it uses LLM-as-a-judge to evaluate if the user response follows the `conversation_plan` in the `ConversationScenario`.
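A simplified sketch of the per-turn scoring, with the LLM judge's verdicts for non-initial turns supplied directly for illustration:

```python
def simulator_quality_score(turns, starting_prompt):
    """turns: list of (user_text, judged_valid) pairs.

    The first turn is checked against the fixed starting_prompt; later
    turns use the judge's verdict on whether the response follows the
    conversation plan. Returns the fraction of valid turns.
    """
    valid = []
    for i, (text, judged_valid) in enumerate(turns):
        if i == 0:
            valid.append(text == starting_prompt)
        else:
            valid.append(bool(judged_valid))
    return sum(valid) / len(valid)
```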
### How To Use This Criterion?
This criterion allows you to configure the evaluation threshold, the judge model, and the number of samples per invocation. It also lets you specify a `stop_signal`, which tells the LLM judge that the conversation is complete. For best results, use the same stop signal in `LlmBackedUserSimulator`.
Example `EvalConfig` entry:
```json
{
"criteria": {
"per_turn_user_simulator_quality_v1": {
"threshold": 1.0,
"judge_model_options": {
"judge_model": "gemini-2.5-flash",
"num_samples": 5
},
"stop_signal": ""
}
}
}
```
### Output And How To Interpret
The criterion returns a score between 0.0 and 1.0, representing the fraction of turns in which the user simulator's response was judged to be valid according to the conversation scenario. A score of 1.0 indicates that the simulator behaved as expected in all turns, while a score closer to 0.0 indicates that the simulator deviated in many turns. Higher values are better.
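As an illustration, the per-turn score and pass/fail decision can be computed as in this minimal sketch (the turn verdicts here are made up; in practice, each verdict comes from the first-turn match or the LLM judge):

```python
# Illustrative computation of the per_turn_user_simulator_quality_v1 score.
# Each entry is a judge verdict for one simulated user turn (hypothetical values).
turn_verdicts = [True, True, False, True]

score = sum(turn_verdicts) / len(turn_verdicts)  # fraction of valid turns
threshold = 1.0  # from EvalConfig; 1.0 requires every turn to be valid
passed = score >= threshold

print(f"score={score:.2f} passed={passed}")
```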
# User Simulation
Supported in ADK Python v1.18.0
When evaluating conversational agents, it is not always practical to use a fixed set of user prompts, as the conversation can proceed in unexpected ways. For example, if the agent needs the user to supply two values to perform a task, it may ask for those values one at a time or both at once. To resolve this issue, ADK can dynamically generate user prompts using a generative AI model.
To use this feature, you must specify a [`ConversationScenario`](https://github.com/google/adk-python/blob/main/src/google/adk/evaluation/conversation_scenarios.py) which dictates the user's goals in their conversation with the agent. A sample conversation scenario for the [`hello_world`](https://github.com/google/adk-python/tree/main/contributing/samples/hello_world) agent is shown below:
```json
{
"starting_prompt": "What can you do for me?",
"conversation_plan": "Ask the agent to roll a 20-sided die. After you get the result, ask the agent to check if it is prime."
}
```
The `starting_prompt` in a conversation scenario specifies a fixed initial prompt that the user should use to start the conversation with the agent. Specifying such fixed prompts for subsequent interactions with the agent is not practical as the agent may respond in different ways. Instead, the `conversation_plan` provides a guideline for how the rest of the conversation with the agent should proceed. An LLM uses this conversation plan, along with the conversation history, to dynamically generate user prompts until it judges that the conversation is complete.
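The loop described above can be sketched as follows. This is a conceptual illustration, not the actual ADK implementation: `call_llm` is a hypothetical stand-in for the simulator's model call, and the stop signal is an arbitrary placeholder string.

```python
# Conceptual sketch of the user-simulation loop (NOT the actual ADK code).
STOP_SIGNAL = "<conversation complete>"

def call_llm(conversation_plan: str, history: list[str]) -> str:
    # Stub: a real implementation prompts an LLM with the plan and the
    # conversation history, returning the next user prompt or the stop signal.
    return STOP_SIGNAL if len(history) >= 4 else f"user turn {len(history) // 2 + 1}"

def simulate(scenario: dict, agent_reply, max_allowed_invocations: int = 20) -> list[str]:
    # The first user turn is always the fixed starting prompt.
    history = [scenario["starting_prompt"], agent_reply(scenario["starting_prompt"])]
    for _ in range(max_allowed_invocations - 1):
        next_prompt = call_llm(scenario["conversation_plan"], history)
        if next_prompt == STOP_SIGNAL:  # the LLM judged the plan complete
            break
        history.extend([next_prompt, agent_reply(next_prompt)])
    return history
```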
Try it in Colab
Test this entire workflow yourself in an interactive notebook on [Simulating User Conversations to Dynamically Evaluate ADK Agents](https://github.com/google/adk-samples/blob/main/python/notebooks/evaluation/user_simulation_in_adk_evals.ipynb). You'll define a conversation scenario, run a "dry run" to check the dialogue, and then perform a full evaluation to score the agent's responses.
## Example: Evaluating the [`hello_world`](https://github.com/google/adk-python/tree/main/contributing/samples/hello_world) agent with conversation scenarios
To add evaluation cases containing conversation scenarios to a new or existing [`EvalSet`](https://github.com/google/adk-python/blob/main/src/google/adk/evaluation/eval_set.py), you need to first create a list of conversation scenarios to test the agent in.
Try saving the following to `contributing/samples/hello_world/conversation_scenarios.json`:
```json
{
"scenarios": [
{
"starting_prompt": "What can you do for me?",
"conversation_plan": "Ask the agent to roll a 20-sided die. After you get the result, ask the agent to check if it is prime."
},
{
"starting_prompt": "Hi, I'm running a tabletop RPG in which prime numbers are bad!",
"conversation_plan": "Say that you don't care about the value; you just want the agent to tell you if a roll is good or bad. Once the agent agrees, ask it to roll a 6-sided die. Finally, ask the agent to do the same with 2 20-sided dice."
}
]
}
```
You will also need a session input file containing information used during evaluation. Try saving the following to `contributing/samples/hello_world/session_input.json`:
```json
{
"app_name": "hello_world",
"user_id": "user"
}
```
Then, you can add the conversation scenarios to an `EvalSet`:
```bash
# (optional) create a new EvalSet
adk eval_set create \
contributing/samples/hello_world \
eval_set_with_scenarios
# add conversation scenarios to the EvalSet as new eval cases
adk eval_set add_eval_case \
contributing/samples/hello_world \
eval_set_with_scenarios \
--scenarios_file contributing/samples/hello_world/conversation_scenarios.json \
--session_input_file contributing/samples/hello_world/session_input.json
```
By default, ADK runs evaluations with metrics that require the agent's expected response to be specified. Since that is not the case for a dynamic conversation scenario, we will use an [`EvalConfig`](https://github.com/google/adk-python/blob/main/src/google/adk/evaluation/eval_config.py) with some alternate supported metrics.
Try saving the following to `contributing/samples/hello_world/eval_config.json`:
```json
{
"criteria": {
"hallucinations_v1": {
"threshold": 0.5,
"evaluate_intermediate_nl_responses": true
},
"safety_v1": {
"threshold": 0.8
}
}
}
```
Finally, you can use the `adk eval` command to run the evaluation:
```bash
adk eval \
contributing/samples/hello_world \
--config_file_path contributing/samples/hello_world/eval_config.json \
eval_set_with_scenarios \
--print_detailed_results
```
## User simulator configuration
You can override the default user simulator configuration to change the model, internal model behavior, and the maximum number of user-agent interactions. The `EvalConfig` below shows the default user simulator configuration:
```json
{
"criteria": {
# same as before
},
"user_simulator_config": {
"model": "gemini-2.5-flash",
"model_configuration": {
"thinking_config": {
"include_thoughts": true,
"thinking_budget": 10240
}
},
"max_allowed_invocations": 20
}
}
```
- `model`: The model backing the user simulator.
- `model_configuration`: A [`GenerateContentConfig`](https://github.com/googleapis/python-genai/blob/6196b1b4251007e33661bb5d7dc27bafee3feefe/google/genai/types.py#L4295) which controls the model behavior.
- `max_allowed_invocations`: The maximum number of user-agent interactions allowed before the conversation is forcefully terminated. Set this to a value greater than the longest reasonable user-agent interaction in your `EvalSet`.
- `custom_instructions`: Optional. Overrides the default instructions for the user simulator. The instruction string must contain the following formatting placeholders exactly as shown below (*do not substitute values in advance!*):
- `{stop_signal}`: the text to be generated when the user simulator decides that the conversation is over.
- `{conversation_plan}`: the overall plan for the conversation that the user simulator must follow.
- `{conversation_history}`: the conversation between the user and the agent so far.
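For example, a `user_simulator_config` with hypothetical custom instructions might look like the following. Note that the three placeholders appear verbatim, with no values substituted:

```json
{
  "user_simulator_config": {
    "model": "gemini-2.5-flash",
    "custom_instructions": "You are simulating a user. Follow this plan: {conversation_plan}. The conversation so far: {conversation_history}. When the plan is complete, reply exactly with {stop_signal}."
  }
}
```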
# Safety and Security for AI Agents
Supported in ADK Python, TypeScript, Go, and Java
As AI agents grow in capability, ensuring they operate safely, securely, and align with your brand values is paramount. Uncontrolled agents can pose risks, including executing misaligned or harmful actions, such as data exfiltration, and generating inappropriate content that can impact your brand’s reputation. **Sources of risk include vague instructions, model hallucination, jailbreaks and prompt injections from adversarial users, and indirect prompt injections via tool use.**
[Google Cloud Vertex AI](https://cloud.google.com/vertex-ai/generative-ai/docs/overview) provides a multi-layered approach to mitigate these risks, enabling you to build powerful *and* trustworthy agents. It offers several mechanisms to establish strict boundaries, ensuring agents only perform actions you've explicitly allowed:
1. **Identity and Authorization**: Control who the agent **acts as** by defining agent and user auth.
1. **Guardrails to screen inputs and outputs:** Control your model and tool calls precisely.
- *In-Tool Guardrails:* Design tools defensively, using developer-set tool context to enforce policies (e.g., allowing queries only on specific tables).
- *Built-in Gemini Safety Features:* If using Gemini models, benefit from content filters to block harmful outputs and system instructions to guide the model's behavior in line with your safety guidelines.
- *Callbacks and Plugins:* Validate model and tool calls before or after execution, checking parameters against agent state or external policies.
- *Using Gemini as a safety guardrail:* Implement an additional safety layer using a cheap and fast model (like Gemini Flash Lite) configured via callbacks to screen inputs and outputs.
1. **Sandboxed code execution:** Prevent model-generated code from causing security issues by sandboxing the execution environment.
1. **Evaluation and tracing**: Use evaluation tools to assess the quality, relevance, and correctness of the agent's final output. Use tracing to gain visibility into agent actions to analyze the steps an agent takes to reach a solution, including its choice of tools, strategies, and the efficiency of its approach.
1. **Network Controls and VPC-SC:** Confine agent activity within secure perimeters (like VPC Service Controls) to prevent data exfiltration and limit the potential impact radius.
## Safety and Security Risks
Before implementing safety measures, perform a thorough risk assessment specific to your agent's capabilities, domain, and deployment context.
**Sources of risk** include:
- Ambiguous agent instructions
- Prompt injection and jailbreak attempts from adversarial users
- Indirect prompt injections via tool use
**Risk categories** include:
- **Misalignment & goal corruption**
- Pursuing unintended or proxy goals that lead to harmful outcomes ("reward hacking")
- Misinterpreting complex or ambiguous instructions
- **Harmful content generation, including brand safety**
- Generating toxic, hateful, biased, sexually explicit, discriminatory, or illegal content
- Brand safety risks, such as using language that goes against the brand’s values or engaging in off-topic conversations
- **Unsafe actions**
- Executing commands that damage systems
- Making unauthorized purchases or financial transactions
- Leaking sensitive personal data (PII)
- Data exfiltration
## Best practices
### Identity and Authorization
The identity that a *tool* uses to perform actions on external systems is a crucial design consideration from a security perspective. Different tools in the same agent can be configured with different strategies, so reason about each tool's configuration individually rather than about a single agent-wide setting.
#### Agent-Auth
The **tool interacts with external systems using the agent's own identity** (e.g., a service account). The agent identity must be explicitly authorized in the external system's access policies, such as adding the agent's service account to a database's IAM policy for read access. Such policies constrain the agent to only the actions the developer intended: by granting read-only permissions to a resource, no matter what the model decides, the tool is prohibited from performing write actions.
This approach is simple to implement, and it is **appropriate for agents where all users share the same level of access.** If not all users have the same level of access, this approach alone doesn't provide enough protection and must be complemented with the other techniques below. When implementing tools, ensure that logs are created to maintain attribution of actions to users, as all actions will otherwise appear to come from the agent.
#### User Auth
The tool interacts with an external system using the **identity of the "controlling user"** (e.g., the human interacting with the frontend in a web application). In ADK, this is typically implemented using OAuth: the agent interacts with the frontend to acquire an OAuth token, and the tool then uses that token when performing external actions; the external system authorizes the action if the controlling user is authorized to perform it on their own.
User auth has the advantage that agents only perform actions that the user could have performed themselves. This greatly reduces the risk that a malicious user could abuse the agent to obtain access to additional data. However, most common implementations of delegation have a fixed set of permissions to delegate (i.e., OAuth scopes). Often, such scopes are broader than the access that the agent actually requires, and the techniques below are needed to further constrain agent actions.
### Guardrails to screen inputs and outputs
#### In-tool guardrails
Tools can be designed with security in mind: we can create tools that expose the actions we want the model to take and nothing else. By limiting the range of actions we provide to the agents, we can deterministically eliminate classes of rogue actions that we never want the agent to take.
In-tool guardrails are an approach for creating common, reusable tools that expose deterministic controls, which developers can use to set limits on each tool instantiation.
This approach relies on the fact that tools receive two types of input: arguments, which are set by the model, and [**`Tool Context`**](https://google.github.io/adk-docs/tools-custom/#tool-context), which can be set deterministically by the agent developer. We can rely on the deterministically set information to validate that the model is behaving as expected.
For example, a query tool can be designed to expect a policy to be read from the Tool Context.
```py
# Conceptual example: Setting policy data intended for tool context
# In a real ADK app, this might be set in InvocationContext.session.state
# or passed during tool initialization, then retrieved via ToolContext.
policy = {} # Assuming policy is a dictionary
policy['select_only'] = True
policy['tables'] = ['mytable1', 'mytable2']
# Conceptual: Storing policy where the tool can access it via ToolContext later.
# This specific line might look different in practice.
# For example, storing in session state:
invocation_context.session.state["query_tool_policy"] = policy
# Or maybe passing during tool init:
query_tool = QueryTool(policy=policy)
# For this example, we'll assume it gets stored somewhere accessible.
```
```typescript
// Conceptual example: Setting policy data intended for tool context
// In a real ADK app, this might be set in InvocationContext.session.state
// or passed during tool initialization, then retrieved via ToolContext.
const policy: {[key: string]: any} = {}; // Assuming policy is an object
policy['select_only'] = true;
policy['tables'] = ['mytable1', 'mytable2'];
// Conceptual: Storing policy where the tool can access it via ToolContext later.
// This specific line might look different in practice.
// For example, storing in session state:
invocationContext.session.state["query_tool_policy"] = policy;
// Or maybe passing during tool init:
const queryTool = new QueryTool({policy: policy});
// For this example, we'll assume it gets stored somewhere accessible.
```
```go
// Conceptual example: Setting policy data intended for tool context
// In a real ADK app, this might be set using the session state service.
// `ctx` is an `agent.Context` available in callbacks or custom agents.
policy := map[string]interface{}{
"select_only": true,
"tables": []string{"mytable1", "mytable2"},
}
// Conceptual: Storing policy where the tool can access it via ToolContext later.
// This specific line might look different in practice.
// For example, storing in session state:
if err := ctx.Session().State().Set("query_tool_policy", policy); err != nil {
// Handle error, e.g., log it.
}
// Or maybe passing during tool init:
// queryTool := NewQueryTool(policy)
// For this example, we'll assume it gets stored somewhere accessible.
```
```java
// Conceptual example: Setting policy data intended for tool context
// In a real ADK app, this might be set in InvocationContext.session.state
// or passed during tool initialization, then retrieved via ToolContext.
Map<String, Object> policy = new HashMap<>(); // Assuming policy is a Map
policy.put("select_only", true);
policy.put("tables", new ArrayList<>(Arrays.asList("mytable1", "mytable2")));
// Conceptual: Storing policy where the tool can access it via ToolContext later.
// This specific line might look different in practice.
// For example, storing in session state:
invocationContext.session().state().put("query_tool_policy", policy);
// Or maybe passing during tool init:
QueryTool queryTool = new QueryTool(policy);
// For this example, we'll assume it gets stored somewhere accessible.
```
During the tool execution, [**`Tool Context`**](https://google.github.io/adk-docs/tools-custom/#tool-context) will be passed to the tool:
```py
def query(query: str, tool_context: ToolContext) -> str | dict:
# Assume 'policy' is retrieved from context, e.g., via session state:
# policy = tool_context.invocation_context.session.state.get('query_tool_policy', {})
# --- Placeholder Policy Enforcement ---
policy = tool_context.invocation_context.session.state.get('query_tool_policy', {}) # Example retrieval
actual_tables = explainQuery(query) # Hypothetical function call
if not set(actual_tables).issubset(set(policy.get('tables', []))):
# Return an error message for the model
allowed = ", ".join(policy.get('tables', ['(None defined)']))
return f"Error: Query targets unauthorized tables. Allowed: {allowed}"
if policy.get('select_only', False):
if not query.strip().upper().startswith("SELECT"):
return "Error: Policy restricts queries to SELECT statements only."
# --- End Policy Enforcement ---
print(f"Executing validated query (hypothetical): {query}")
return {"status": "success", "results": [...]} # Example successful return
```
```typescript
function query(query: string, toolContext: ToolContext): string | object {
// Assume 'policy' is retrieved from context, e.g., via session state:
const policy = toolContext.state.get('query_tool_policy', {}) as {[key: string]: any};
// --- Placeholder Policy Enforcement ---
const actual_tables = explainQuery(query); // Hypothetical function call
const policyTables = new Set(policy['tables'] || []);
const isSubset = actual_tables.every(table => policyTables.has(table));
if (!isSubset) {
// Return an error message for the model
const allowed = (policy['tables'] || ['(None defined)']).join(', ');
return `Error: Query targets unauthorized tables. Allowed: ${allowed}`;
}
if (policy['select_only']) {
if (!query.trim().toUpperCase().startsWith("SELECT")) {
return "Error: Policy restricts queries to SELECT statements only.";
}
}
// --- End Policy Enforcement ---
console.log(`Executing validated query (hypothetical): ${query}`);
return { "status": "success", "results": [] }; // Example successful return
}
```
```go
import (
"fmt"
"strings"
"google.golang.org/adk/tool"
)
func query(query string, toolContext *tool.Context) (any, error) {
// Assume 'policy' is retrieved from context, e.g., via session state:
policyAny, err := toolContext.State().Get("query_tool_policy")
if err != nil {
return nil, fmt.Errorf("could not retrieve policy: %w", err)
}
policy, _ := policyAny.(map[string]interface{})
actualTables := explainQuery(query) // Hypothetical function call
// --- Placeholder Policy Enforcement ---
if tables, ok := policy["tables"].([]string); ok {
if !isSubset(actualTables, tables) {
// Return an error to signal failure
allowed := strings.Join(tables, ", ")
if allowed == "" {
allowed = "(None defined)"
}
return nil, fmt.Errorf("query targets unauthorized tables. Allowed: %s", allowed)
}
}
if selectOnly, _ := policy["select_only"].(bool); selectOnly {
if !strings.HasPrefix(strings.ToUpper(strings.TrimSpace(query)), "SELECT") {
return nil, fmt.Errorf("policy restricts queries to SELECT statements only")
}
}
// --- End Policy Enforcement ---
fmt.Printf("Executing validated query (hypothetical): %s\n", query)
return map[string]interface{}{"status": "success", "results": []string{"..."}}, nil
}
// Helper function to check if a is a subset of b
func isSubset(a, b []string) bool {
set := make(map[string]bool)
for _, item := range b {
set[item] = true
}
for _, item := range a {
if _, found := set[item]; !found {
return false
}
}
return true
}
```
```java
import com.google.adk.tools.ToolContext;
import java.util.*;
class ToolContextQuery {
public Object query(String query, ToolContext toolContext) {
// Assume 'policy' is retrieved from context, e.g., via session state:
Map<String, Object> queryToolPolicy =
(Map<String, Object>)
toolContext.invocationContext.session().state().getOrDefault("query_tool_policy", new HashMap<>());
List<String> actualTables = explainQuery(query);
// --- Placeholder Policy Enforcement ---
List<String> allowedPolicyTables =
(List<String>) queryToolPolicy.getOrDefault("tables", new ArrayList<String>());
if (!allowedPolicyTables.containsAll(actualTables)) {
String allowedTablesString =
allowedPolicyTables.isEmpty() ? "(None defined)" : String.join(", ", allowedPolicyTables);
return String.format(
"Error: Query targets unauthorized tables. Allowed: %s", allowedTablesString);
}
if (Boolean.TRUE.equals(queryToolPolicy.get("select_only"))) {
if (!query.trim().toUpperCase().startsWith("SELECT")) {
return "Error: Policy restricts queries to SELECT statements only.";
}
}
// --- End Policy Enforcement ---
System.out.printf("Executing validated query (hypothetical): %s%n", query);
Map successResult = new HashMap<>();
successResult.put("status", "success");
successResult.put("results", Arrays.asList("result_item1", "result_item2"));
return successResult;
}
}
```
#### Built-in Gemini Safety Features
Gemini models come with built-in safety mechanisms that can be leveraged to improve content and brand safety.
- **Content safety filters**: [Content filters](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/configure-safety-attributes) can help block the output of harmful content. They function independently from Gemini models as part of a layered defense against threat actors who attempt to jailbreak the model. Gemini models on Vertex AI use two types of content filters:
- **Non-configurable safety filters** automatically block outputs containing prohibited content, such as child sexual abuse material (CSAM) and personally identifiable information (PII).
- **Configurable content filters** allow you to define blocking thresholds in four harm categories (hate speech, harassment, sexually explicit, and dangerous content) based on probability and severity scores. These filters are off by default, but you can configure them according to your needs.
- **System instructions for safety**: [System instructions](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/safety-system-instructions) for Gemini models in Vertex AI provide direct guidance to the model on how to behave and what type of content to generate. By providing specific instructions, you can proactively steer the model away from generating undesirable content to meet your organization’s unique needs. You can craft system instructions to define content safety guidelines, such as prohibited and sensitive topics, and disclaimer language, as well as brand safety guidelines to ensure the model's outputs align with your brand's voice, tone, values, and target audience.
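As an illustration of the configurable filters, a request-level `safetySettings` payload might look like the following sketch. Category and threshold names follow the Vertex AI content-filter documentation; the particular thresholds chosen here are arbitrary:

```json
{
  "safetySettings": [
    { "category": "HARM_CATEGORY_HATE_SPEECH", "threshold": "BLOCK_LOW_AND_ABOVE" },
    { "category": "HARM_CATEGORY_DANGEROUS_CONTENT", "threshold": "BLOCK_MEDIUM_AND_ABOVE" }
  ]
}
```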
While these measures are robust for content safety, you need additional checks to reduce agent misalignment, unsafe actions, and brand safety risks.
#### Callbacks and Plugins for Security Guardrails
Callbacks provide a simple, agent-specific method for adding pre-validation to tool and model I/O, whereas plugins offer a reusable solution for implementing general security policies across multiple agents.
When modifications to the tools to add guardrails aren't possible, the [**`Before Tool Callback`**](https://google.github.io/adk-docs/callbacks/types-of-callbacks/#before-tool-callback) function can be used to add pre-validation of calls. The callback has access to the agent's state, the requested tool, and its parameters. This approach is very general and can even be used to build a common library of reusable tool policies. However, it might not be applicable to all tools if the information needed to enforce the guardrails isn't directly visible in the parameters.
```py
# Hypothetical callback function
def validate_tool_params(
callback_context: CallbackContext, # Correct context type
tool: BaseTool,
args: Dict[str, Any],
tool_context: ToolContext
) -> Optional[Dict]: # Correct return type for before_tool_callback
print(f"Callback triggered for tool: {tool.name}, args: {args}")
# Example validation: Check if a required user ID from state matches an arg
expected_user_id = callback_context.state.get("session_user_id")
actual_user_id_in_args = args.get("user_id_param") # Assuming tool takes 'user_id_param'
if actual_user_id_in_args != expected_user_id:
print("Validation Failed: User ID mismatch!")
# Return a dictionary to prevent tool execution and provide feedback
return {"error": f"Tool call blocked: User ID mismatch."}
# Return None to allow the tool call to proceed if validation passes
print("Callback validation passed.")
return None
# Hypothetical Agent setup
root_agent = LlmAgent( # Use specific agent type
model='gemini-2.0-flash',
name='root_agent',
instruction="...",
before_tool_callback=validate_tool_params, # Assign the callback
tools = [
# ... list of tool functions or Tool instances ...
# e.g., query_tool_instance
]
)
```
```typescript
// Hypothetical callback function
function validateToolParams(
{tool, args, context}: {
tool: BaseTool,
args: {[key: string]: any},
context: ToolContext
}
): {[key: string]: any} | undefined {
console.log(`Callback triggered for tool: ${tool.name}, args: ${JSON.stringify(args)}`);
// Example validation: Check if a required user ID from state matches an arg
const expectedUserId = context.state.get("session_user_id");
const actualUserIdInArgs = args["user_id_param"]; // Assuming tool takes 'user_id_param'
if (actualUserIdInArgs !== expectedUserId) {
console.log("Validation Failed: User ID mismatch!");
// Return a dictionary to prevent tool execution and provide feedback
return {"error": `Tool call blocked: User ID mismatch.`};
}
// Return undefined to allow the tool call to proceed if validation passes
console.log("Callback validation passed.");
return undefined;
}
// Hypothetical Agent setup
const rootAgent = new LlmAgent({
model: 'gemini-2.5-flash',
name: 'root_agent',
instruction: "...",
beforeToolCallback: validateToolParams, // Assign the callback
tools: [
// ... list of tool functions or Tool instances ...
// e.g., queryToolInstance
]
});
```
```go
import (
"fmt"
// "google.golang.org/adk/agent/llmagent" // needed if you uncomment the agent setup below
"google.golang.org/adk/tool"
)
// Hypothetical callback function
func validateToolParams(
ctx tool.Context,
t tool.Tool,
args map[string]any,
) (map[string]any, error) {
fmt.Printf("Callback triggered for tool: %s, args: %v\n", t.Name(), args)
// Example validation: Check if a required user ID from state matches an arg
expectedUserIDVal, err := ctx.State().Get("session_user_id")
if err != nil {
// This is an unexpected failure, return an error.
return nil, fmt.Errorf("internal error: session_user_id not found in state: %w", err)
}
expectedUserID, ok := expectedUserIDVal.(string)
if !ok {
return nil, fmt.Errorf("internal error: session_user_id in state is not a string, got %T", expectedUserIDVal)
}
actualUserIDInArgs, exists := args["user_id_param"]
if !exists {
// Handle case where user_id_param is not in args
fmt.Println("Validation Failed: user_id_param missing from arguments!")
return map[string]any{"error": "Tool call blocked: user_id_param missing from arguments."}, nil
}
actualUserID, ok := actualUserIDInArgs.(string)
if !ok {
// Handle case where user_id_param is not a string
fmt.Println("Validation Failed: user_id_param is not a string!")
return map[string]any{"error": "Tool call blocked: user_id_param is not a string."}, nil
}
if actualUserID != expectedUserID {
fmt.Println("Validation Failed: User ID mismatch!")
// Return a map to prevent tool execution and provide feedback to the model.
// This is not a Go error, but a message for the agent.
return map[string]any{"error": "Tool call blocked: User ID mismatch."}, nil
}
// Return nil, nil to allow the tool call to proceed if validation passes
fmt.Println("Callback validation passed.")
return nil, nil
}
// Hypothetical Agent setup
// rootAgent, err := llmagent.New(llmagent.Config{
// Model: "gemini-2.0-flash",
// Name: "root_agent",
// Instruction: "...",
// BeforeToolCallbacks: []llmagent.BeforeToolCallback{validateToolParams},
// Tools: []tool.Tool{queryToolInstance},
// })
```
```java
// Hypothetical callback function
public Optional