Tutorial: Manual Workflow¶
This tutorial is for developers who prefer to type every command themselves. It walks you through building, testing, evaluating, and deploying an ADK agent, with no coding agent required.
Tip
Prefer to let your coding agent do the work? See Tutorial: Build Your First Agent instead.
What You'll Build¶
You'll start with the default agent template — an assistant that can look up weather and tell the time — then customize it with a new persona and a custom tool.
Prerequisites¶
- Python 3.11+ and uv installed
- Authentication set up — either a Gemini API key or Google Cloud credentials
1. Create the Project¶
- --prototype skips Terraform and CI/CD, leaving just agent code, tests, and eval sets.
- --yes auto-accepts defaults (ADK template, in-memory session storage).
- agents-cli install installs all Python dependencies via uv sync.
2. Explore the Project¶
Your project contains:
my-first-agent/
├── app/
│   ├── __init__.py                    # Registers the app
│   ├── agent.py                       # Agent definition: this is where your logic lives
│   └── app_utils/                     # Telemetry and utility code
├── tests/
│   ├── eval/
│   │   ├── evalsets/
│   │   │   └── basic.evalset.json     # Test cases for evaluation
│   │   └── eval_config.json           # LLM-as-judge criteria
│   ├── integration/
│   │   └── test_agent.py
│   └── unit/
│       └── test_dummy.py
├── pyproject.toml                     # Project config and dependencies
└── GEMINI.md                          # Guidance file for coding agents
The important file is app/agent.py. Open it and you'll see two tool functions (get_weather, get_current_time) and an agent definition:
root_agent = Agent(
    name="root_agent",
    model=Gemini(model="gemini-flash-latest"),
    instruction="You are a helpful AI assistant designed to provide accurate and useful information.",
    tools=[get_weather, get_current_time],
)
For a full breakdown of every file, see Project Structure.
3. Run the Agent Locally¶
Start the ADK web playground by running agents-cli playground.
Open http://localhost:8080 in your browser. You'll see a chat interface. Try sending:
What's the weather in San Francisco?
The agent calls the get_weather tool and responds with something like: "It's 60 degrees and foggy in San Francisco."
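To make the tool call concrete, here is a minimal sketch of what a weather tool of this kind could look like. It is a hypothetical stand-in that returns canned data, not the code generated in app/agent.py, so the function body and the fallback reply are assumptions.

```python
def get_weather(query: str) -> str:
    """Return a canned weather report for the location named in the query.

    Args:
        query: Free text containing a location, e.g. "San Francisco".

    Returns:
        A short weather description.
    """
    # Hypothetical canned data; a real tool would call a weather service.
    if "san francisco" in query.lower():
        return "It's 60 degrees and foggy."
    return "No weather data is available for that location."


print(get_weather("What's the weather in San Francisco?"))
```

The model receives the tool's return value and phrases the final answer itself, which is why the reply in the playground may be worded differently from the raw string.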
Tip
The playground has hot reload — save changes to app/agent.py and they take effect immediately.
4. Test from the Terminal¶
You can also test without the browser by running agents-cli run with your prompt in quotes.
This sends a single prompt and prints the agent's response.
5. Customize the Agent¶
Let's give the agent a personality. Open app/agent.py and change the instruction:
root_agent = Agent(
    name="root_agent",
    model=Gemini(
        model="gemini-flash-latest",
        retry_options=types.HttpRetryOptions(attempts=3),
    ),
    instruction="""You are a cheerful weather reporter who speaks in short,
punchy sentences. Always include a fun weather-related pun in your responses.
When asked about time, relate it back to weather somehow.""",
    tools=[get_weather, get_current_time],
)
Save the file. If the playground is still running, it reloads automatically. Try the same question again — the response should now have a different tone.
6. Add a Custom Tool¶
Let's add a tool that counts words. Add this function above the root_agent definition in app/agent.py:
def count_words(text: str) -> str:
    """Count the number of words in the given text.

    Args:
        text: The text to count words in.

    Returns:
        A string with the word count.
    """
    word_count = len(text.split())
    return f"The text contains {word_count} words."
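Then register it by adding count_words to the agent's tools list, so that it reads tools=[get_weather, get_current_time, count_words].
Test it by asking the agent to count the words in a sentence, either in the playground or with agents-cli run.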
The agent calls count_words and responds with the word count.
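Because the tool is a plain function, you can also check its behavior directly, without involving the model. The sketch below repeats the function so that it runs on its own; the sample inputs are illustrative.

```python
def count_words(text: str) -> str:
    """Count the number of words in the given text."""
    word_count = len(text.split())
    return f"The text contains {word_count} words."


# str.split() with no arguments splits on any run of whitespace,
# so extra spaces and newlines do not inflate the count.
assert count_words("the quick brown fox") == "The text contains 4 words."
assert count_words("  spaced   out\nwords ") == "The text contains 3 words."
assert count_words("") == "The text contains 0 words."
```

Note that a single-word input produces "1 words"; adjust the message if that wording matters to you.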
Tip
ADK tools are plain Python functions. The docstring becomes the description the LLM sees, so write it clearly — it tells the model when and how to use the tool.
For more on adding tools, see the ADK Tools documentation.
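To get a feel for what the function exposes, the following sketch uses only the standard library's inspect module to pull out the name, docstring, and typed parameters of count_words. The describe_tool helper is hypothetical and is not part of ADK; how ADK actually builds the tool description it sends to the model is not shown here.

```python
import inspect


def count_words(text: str) -> str:
    """Count the number of words in the given text.

    Args:
        text: The text to count words in.

    Returns:
        A string with the word count.
    """
    return f"The text contains {len(text.split())} words."


# describe_tool is a hypothetical helper for illustration, not an ADK function.
def describe_tool(func) -> dict:
    """Collect the metadata a plain Python function carries about itself."""
    signature = inspect.signature(func)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func),
        "parameters": {
            name: param.annotation.__name__
            for name, param in signature.parameters.items()
        },
    }


info = describe_tool(count_words)
print(info["name"])
print(info["parameters"])
print(info["description"].splitlines()[0])
```

A vague docstring or missing type hints would leave this metadata thin, which is the practical reason to write both carefully.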
7. Run an Evaluation¶
Evaluations validate that your agent behaves correctly. Your project comes with a default eval set at tests/eval/evalsets/basic.evalset.json:
{
  "eval_set_id": "basic_eval",
  "name": "Basic Agent Evaluation",
  "eval_cases": [
    {
      "eval_id": "greeting",
      "conversation": [
        {
          "user_content": {
            "parts": [{"text": "Hello, what can you help me with?"}]
          }
        }
      ],
      "session_input": {
        "app_name": "app",
        "user_id": "eval_user",
        "state": {}
      }
    }
  ]
}
Each eval case defines a user message and session context. The eval system sends the message to your agent and scores the response using LLM-as-judge criteria defined in eval_config.json.
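As a quick way to review what an eval set will send to the agent, the sketch below parses the same JSON and lists each case with its first user message. The JSON is inlined so the example runs on its own; in a project you would read the file from tests/eval/evalsets/ instead. The first_user_messages helper is a hypothetical name used only for illustration.

```python
import json

# Same content as the default eval set shown above, inlined for the sketch.
EVALSET = """
{
  "eval_set_id": "basic_eval",
  "name": "Basic Agent Evaluation",
  "eval_cases": [
    {
      "eval_id": "greeting",
      "conversation": [
        {"user_content": {"parts": [{"text": "Hello, what can you help me with?"}]}}
      ],
      "session_input": {"app_name": "app", "user_id": "eval_user", "state": {}}
    }
  ]
}
"""

eval_set = json.loads(EVALSET)


def first_user_messages(eval_set: dict) -> dict:
    """Map each eval_id to the text of its first user turn."""
    result = {}
    for case in eval_set["eval_cases"]:
        parts = case["conversation"][0]["user_content"]["parts"]
        result[case["eval_id"]] = " ".join(part["text"] for part in parts)
    return result


for eval_id, text in first_user_messages(eval_set).items():
    print(f"{eval_id}: {text}")
```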
Run it with agents-cli eval run.
The output shows scores for each eval case against the configured rubrics (relevance, helpfulness). A score above the threshold (default: 0.8) passes.
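The pass/fail rule can be sketched in a few lines. The scores below are made up for illustration and do not come from a real run, and the strict comparison follows the wording "above the threshold"; whether a score exactly equal to the threshold passes is an assumption here.

```python
THRESHOLD = 0.8  # default threshold mentioned above

# Made-up scores for illustration only; real values come from the eval run.
example_scores = {"relevance": 0.92, "helpfulness": 0.75}


def passes(score: float, threshold: float = THRESHOLD) -> bool:
    """Return True when the score is strictly above the threshold."""
    return score > threshold


results = {rubric: passes(score) for rubric, score in example_scores.items()}
print(results)
```

In this made-up example the relevance rubric passes and the helpfulness rubric fails, so the case as a whole would need another look.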
For the full evaluation workflow — writing test cases, adding metrics, the eval-fix loop — see the Evaluation Guide.
8. Deploy to Google Cloud¶
Once your agent passes evals, deploy it. First, add a deployment target (prototype projects don't include one):
Set your Google Cloud project, then deploy with agents-cli deploy.
Verify it's running:
Note
Deployment requires Google Cloud credentials. See the Deployment Guide for Agent Runtime, GKE, and other options.
9. Observe Your Agent¶
Cloud Trace is enabled by default — no configuration needed. Send a few requests to your agent, then open the Trace explorer in the Google Cloud Console. You'll see spans for each LLM call and tool execution, with latency breakdowns.
View content logs¶
To inspect the actual prompts and responses your agent handles in production, provision the observability infrastructure:
This runs Terraform to create a dedicated service account, GCS bucket, and BigQuery dataset — and updates your deployed service to use them.
See the Observability Guide for verification steps, full content capture, and BigQuery Agent Analytics.
What You've Done¶
| Step | What happened |
|---|---|
| agents-cli create --prototype --yes | Created a project with agent code, tests, and eval sets |
| agents-cli playground | Started the ADK playground for interactive testing |
| agents-cli run "..." | Tested the agent from the terminal |
| Edited agent.py | Customized the persona and added a tool |
| agents-cli eval run | Validated agent behavior with structured evaluations |
| agents-cli deploy | Deployed the agent to Google Cloud |
| Trace explorer + content logs | Verified tracing and set up prompt-response logging |
Next Steps¶
- ADK Custom Tools — more tool patterns and advanced usage
- Evaluation Guide — write better evals, understand metrics
- Deployment Guide — Agent Runtime, GKE, secrets, and CI/CD
- Observability Guide — BigQuery Agent Analytics, third-party integrations