Tutorial: Build Your First Agent¶
For beginners who want to build, evaluate, and deploy an agent using a coding agent.
This tutorial shows the full Agents CLI in Agent Platform experience — you talk to your coding agent, and it builds, evaluates, and deploys an ADK agent for you.
You'll build a caveman compressor: an agent that takes verbose text and grunts it down to terse, caveman-style summaries. Inspired by caveman.
Here's what it looks like end to end:

Setup¶
This is the only command you run yourself; everything else goes through your coding agent.
Then open your coding agent — Gemini CLI, Claude Code, Codex, or any other.
1. Scaffold¶
Tell your coding agent:
"Use agents-cli to build a caveman-style agent that compresses verbose text into terse, technical grunts"
Your coding agent activates the google-agents-cli-workflow and google-agents-cli-scaffold skills. It will:
- Ask clarifying questions (deployment target, safety constraints, etc.)
- Write a `DESIGN_SPEC.md` capturing the agent's purpose
- Scaffold the project:
You now have a working project with boilerplate agent code, tests, and eval sets.
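The scaffold's exact layout depends on the template your coding agent picks, but based on the files this tutorial touches, it looks roughly like this (project name is hypothetical):

```
caveman-agent/
├── DESIGN_SPEC.md            # agent purpose, captured during scaffolding
├── app/
│   └── agent.py              # the agent definition you'll edit next
└── tests/
    └── eval/
        ├── eval_config.json          # LLM-as-judge criteria
        └── evalsets/
            └── caveman.evalset.json  # eval test cases
```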
2. Build¶
Your coding agent edits app/agent.py — replacing the default agent with your caveman compressor. It uses the google-agents-cli-adk-code skill for ADK patterns.
The agent definition ends up looking something like:
```python
# Imports as exposed by the google-adk Python package.
from google.adk.agents import Agent
from google.adk.models import Gemini

root_agent = Agent(
    name="caveman_agent",
    model=Gemini(model="gemini-flash-latest"),
    instruction="""You caveman compressor. Human give long words,
you make short. Rules:
- No articles. No filler. No fluff.
- Short grunts. Simple words.
- Keep technical terms but grunt around them.
- Funny but meaning stays.
Example input: "I would like to deploy the application to production"
Example output: "Me deploy. Production. Now."
""",
)
```
Your coding agent then smoke-tests it:
Output:
3. Evaluate¶
Tell your coding agent:
"Write evals for the caveman agent and run them"
Your coding agent activates the google-agents-cli-eval skill and:
- Creates `tests/eval/evalsets/caveman.evalset.json` with test cases (compression quality, technical term preservation, caveman tone)
- Configures LLM-as-judge criteria in `tests/eval/eval_config.json`
- Runs the evaluation:
If cases fail, tell your coding agent what to fix:
"The response to the greeting test is too polite. Make it more caveman."
Your coding agent adjusts the instruction, re-runs agents-cli eval run, and iterates until quality thresholds pass.
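For intuition, the criteria file tends to map metric names to pass thresholds. The keys and values below are a hypothetical sketch in the style of ADK eval configs; the scaffold generates the real file:

```json
{
  "criteria": {
    "response_match_score": 0.8,
    "tool_trajectory_avg_score": 1.0
  }
}
```

When a run scores below a threshold, the case fails and shows up in the report your coding agent iterates against.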
4. Deploy¶
Tell your coding agent:
"Deploy this to Cloud Run"
Your coding agent activates the google-agents-cli-deploy skill and:
- Adds deployment infrastructure:
- Deploys:
Your caveman agent is now live; the Cloud Run URL is in the command output.
5. Observe¶
Cloud Trace is enabled by default — no setup needed. Open the Trace explorer in the Google Cloud Console and send a few requests to your agent. You'll see spans for each LLM call and tool execution.
To go further and inspect the actual prompts and responses your agent handles in production, tell your coding agent:
"Set up observability infrastructure for my agent"
Your coding agent runs infra single-project, which provisions the service account, GCS bucket, and BigQuery dataset — and updates the deployed service to use them. See the Observability Guide for verification steps and advanced options.
What happened¶
Here's what each prompt triggered under the hood:
| You said | Your coding agent did |
|---|---|
| "Build a caveman compressor agent" | Scaffolded project, wrote agent code, tested locally |
| "Write evals and run them" | Created evalset, configured LLM-as-judge, ran agents-cli eval run |
| "Deploy this to Cloud Run" | Added deployment target, deployed to Cloud Run |
| "Set up observability" | Provisioned service account, GCS bucket, and BigQuery dataset |
The skills gave your coding agent the context to make the right decisions at each step — which ADK patterns to use, how to structure evals, which deploy target flags to pass.
Next steps¶
Try building something more complex:
- Add tools — "Add a Google Search tool so the caveman can grunt about current events"
- Multi-agent — "Create an A2A agent that other agents can talk to" (use the `adk_a2a` template)
- RAG — "Build an agent that answers questions from our docs" (use the `agentic_rag` template)
See Agent Templates for all options, or jump to the Development Guide for the full workflow.