vLLM model host for ADK agents
Supported in ADK Python v0.1.0
Tools such as vLLM allow you to host models efficiently and serve them as an OpenAI-compatible API endpoint. You can use vLLM models through the LiteLLM library for Python.
Setup
- Deploy Model: Deploy your chosen model using vLLM (or a similar tool). Note the API base URL (e.g., https://your-vllm-endpoint.run.app/v1).
    - Important for ADK Tools: When deploying, ensure the serving tool supports and enables OpenAI-compatible tool/function calling. For vLLM, this might involve flags like --enable-auto-tool-choice and potentially a specific --tool-call-parser, depending on the model. Refer to the vLLM documentation on Tool Use.
- Authentication: Determine how your endpoint handles authentication (e.g., API key, bearer token).
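Before pointing an agent at the endpoint, it can help to confirm that the server responds and to see which model IDs it reports. The following is a minimal sketch, assuming the deployment exposes the OpenAI-compatible `GET /v1/models` route and accepts a bearer token. The helper names (`models_url`, `parse_model_ids`, `list_model_ids`) are illustrative and are not part of ADK, LiteLLM, or vLLM.

```python
import json
import urllib.request
from typing import List, Optional


def models_url(api_base: str) -> str:
    """Return the model-listing URL for an OpenAI-compatible base URL."""
    return api_base.rstrip("/") + "/models"


def parse_model_ids(payload: dict) -> List[str]:
    """Extract model IDs from an OpenAI-style {"data": [{"id": ...}]} body."""
    return [entry["id"] for entry in payload.get("data", [])]


def list_model_ids(api_base: str, token: Optional[str] = None) -> List[str]:
    """Query the endpoint and return the model IDs it serves (network call)."""
    # Assumption: the endpoint accepts a bearer token; adapt to your auth scheme.
    headers = {"Authorization": f"Bearer {token}"} if token else {}
    request = urllib.request.Request(models_url(api_base), headers=headers)
    with urllib.request.urlopen(request, timeout=10) as response:
        return parse_model_ids(json.load(response))
```

Calling `list_model_ids("https://your-vllm-endpoint.run.app/v1")` against a live deployment should return the IDs the server was started with. Prefixing one of them with `hosted_vllm/` gives the model name that LiteLLM expects for a vLLM endpoint.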
Integration Example
The following example shows how to use a vLLM endpoint with ADK agents.
import subprocess
from google.adk.agents import LlmAgent
from google.adk.models.lite_llm import LiteLlm
# --- Example Agent using a model hosted on a vLLM endpoint ---
# Endpoint URL provided by your vLLM deployment
api_base_url = "https://your-vllm-endpoint.run.app/v1"
# Model name as recognized by *your* vLLM endpoint configuration
model_name_at_endpoint = "hosted_vllm/google/gemma-3-4b-it" # Example from vllm_test.py
# Authentication (Example: using gcloud identity token for a Cloud Run deployment)
# Adapt this based on your endpoint's security
try:
    gcloud_token = subprocess.check_output(
        ["gcloud", "auth", "print-identity-token", "-q"]
    ).decode().strip()
    auth_headers = {"Authorization": f"Bearer {gcloud_token}"}
except Exception as e:
    print(f"Warning: Could not get gcloud token - {e}. Endpoint might be unsecured or require different auth.")
    auth_headers = None  # Or handle error appropriately

agent_vllm = LlmAgent(
    model=LiteLlm(
        model=model_name_at_endpoint,
        api_base=api_base_url,
        # Pass authentication headers if needed
        extra_headers=auth_headers,
        # Alternatively, if endpoint uses an API key:
        # api_key="YOUR_ENDPOINT_API_KEY"
    ),
    name="vllm_agent",
    instruction="You are a helpful assistant running on a self-hosted vLLM endpoint.",
    # ... other agent parameters
)
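The two authentication options in the example (custom headers or an API key) can be collected in one small helper. This is a sketch only, assuming `LiteLlm` accepts `api_base`, `extra_headers`, and `api_key` as in the example above. `build_endpoint_kwargs` is a hypothetical name, not an ADK or LiteLLM function.

```python
from typing import Optional


def build_endpoint_kwargs(
    api_base: str,
    bearer_token: Optional[str] = None,
    api_key: Optional[str] = None,
) -> dict:
    """Build keyword arguments for LiteLlm(...) using one auth scheme."""
    # Assumption: both credentials would end up in the Authorization header,
    # so passing both is treated as a configuration error in this sketch.
    if bearer_token and api_key:
        raise ValueError("Pass either bearer_token or api_key, not both.")
    kwargs = {"api_base": api_base}
    if bearer_token:
        kwargs["extra_headers"] = {"Authorization": f"Bearer {bearer_token}"}
    elif api_key:
        kwargs["api_key"] = api_key
    return kwargs
```

With this helper, the model could be constructed as `LiteLlm(model=model_name_at_endpoint, **build_endpoint_kwargs(api_base_url, bearer_token=gcloud_token))`. Omitting both credentials yields only `api_base`, which suits an unsecured endpoint.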