Tutorial: Build Your First Agent¶

For beginners who want to build, evaluate, and deploy an agent using a coding agent.

This tutorial shows the full Agents CLI in Agent Platform experience — you talk to your coding agent, and it builds, evaluates, and deploys an ADK agent for you.

You'll build a caveman compressor: an agent that takes verbose text and grunts it down to terse, caveman-style summaries. Inspired by caveman.

Here's what it looks like end to end:

agents-cli demo

Setup¶

The only command you run yourself. Everything else goes through your coding agent.

uvx google-agents-cli setup

Then open your coding agent — Antigravity CLI, Claude Code, Codex, or any other.

1. Scaffold¶

Tell your coding agent:

"Use agents-cli to build a caveman-style agent that compresses verbose text into terse, technical grunts"

Your coding agent activates the google-agents-cli-workflow and google-agents-cli-scaffold skills. It will:

Ask clarifying questions (deployment target, safety constraints, etc.)
Save a spec at .agents-cli-spec.md capturing the agent's purpose
Scaffold the project:

agents-cli create caveman-agent --prototype --yes
cd caveman-agent && agents-cli install

You now have a working project with boilerplate agent code, tests, and eval sets.

2. Build¶

Your coding agent edits app/agent.py — replacing the default agent with your caveman compressor. It uses the google-agents-cli-adk-code skill for ADK patterns.

The agent definition ends up looking something like:

app/agent.py

root_agent = Agent(
    name="caveman_agent",
    model=Gemini(model="gemini-flash-latest"),
    instruction="""You caveman compressor. Human give long words,
    you make short. Rules:
    - No articles. No filler. No fluff.
    - Short grunts. Simple words.
    - Keep technical terms but grunt around them.
    - Funny but meaning stays.

    Example input:  "I would like to deploy the application to production"
    Example output: "Me deploy. Production. Now."
    """,
)

Your coding agent then smoke-tests it:

agents-cli run "Please help me understand the deployment options available for my project"

Output:

Deploy options: Agent Runtime, Cloud Run, GKE. Pick one. Ship.

3. Evaluate¶

Tell your coding agent:

"Write evals for the caveman agent and run them"

Your coding agent activates the google-agents-cli-eval skill and:

Creates tests/eval/datasets/caveman-dataset.json with test cases (compression quality, technical term preservation, caveman tone)
Runs the evaluation:

agents-cli eval generate
agents-cli eval grade

If cases fail, tell your coding agent what to fix:

"The response to the greeting test is too polite. Make it more caveman."

Your coding agent adjusts the instruction, re-runs the evaluation, and iterates until quality thresholds pass.

The eval surface goes beyond generate and grade — eval dataset synthesize, eval compare, eval analyze, and eval optimize cover synthetic case generation, regression diffing, failure clustering, and prompt auto-tuning. See the Evaluation Guide for the full surface.

4. Deploy¶

Tell your coding agent:

"Deploy this to Cloud Run"

Your coding agent activates the google-agents-cli-deploy skill and:

Adds deployment infrastructure:

agents-cli scaffold enhance --deployment-target cloud_run

Deploys:

agents-cli deploy

Your caveman agent is now live. Cloud Run URL in the output.

5. Observe¶

Cloud Trace is enabled by default — no setup needed. Open the Trace explorer in the Google Cloud Console and send a few requests to your agent. You'll see spans for each LLM call and tool execution.

To go further and inspect the actual prompts and responses your agent handles in production, tell your coding agent:

"Set up observability infrastructure for my agent"

Your coding agent runs infra single-project, which provisions the service account, GCS bucket, and BigQuery dataset — and updates the deployed service to use them. See the Observability Guide for verification steps and advanced options.

What happened¶

Here's what each prompt triggered under the hood:

You said	Your coding agent did
"Build a caveman compressor agent"	Scaffolded project, wrote agent code, tested locally
"Write evals and run them"	Created dataset, ran evaluation using `generate` and `grade`
"Deploy this to Cloud Run"	Added deployment target, deployed to Cloud Run
"Set up observability"	Provisioned service account, GCS bucket, and BigQuery dataset

The skills gave your coding agent the context to make the right decisions at each step — which ADK patterns to use, how to structure evals, which deploy target flags to pass.

Next steps¶

Try building something more complex:

Add tools — "Add a Google Search tool so the caveman can grunt about current events"
Multi-agent — "Create an A2A agent that other agents can talk to" (use the adk template — A2A is built in)
RAG — "Build an agent that answers questions from our docs" (clone-and-study a RAG sample — see Agent Templates)

See Agent Templates for all options, or jump to the Development Guide for the full workflow.