# LiteLLM Agent with Hot-Swap Support

A flexible AI agent powered by LiteLLM that supports runtime hot-swapping of models and system prompts. Compatible with ADK and A2A protocols.

## Features

- 🔄 **Hot-Swap Models**: Change LLM models on the fly without restarting
- 📝 **Dynamic Prompts**: Update system prompts during a conversation
- 🌐 **Multi-Provider Support**: Works with OpenAI, Anthropic, Google, OpenRouter, and more
- 🔌 **A2A Compatible**: Can be served as an A2A agent
- 🛠️ **ADK Integration**: Run with `adk web`, `adk run`, or `adk api_server`

## Architecture

```
task_agent/
├── __init__.py          # Exposes root_agent for ADK
├── a2a_hot_swap.py      # JSON-RPC helper for hot-swapping
├── README.md            # This guide
├── QUICKSTART.md        # Quick-start walkthrough
├── .env                 # Active environment (gitignored)
├── .env.example         # Environment template
└── litellm_agent/
    ├── __init__.py
    ├── agent.py         # Main agent implementation
    ├── agent.json       # A2A agent card
    ├── callbacks.py     # ADK callbacks
    ├── config.py        # Defaults and state keys
    ├── control.py       # HOTSWAP message helpers
    ├── prompts.py       # Base instruction
    ├── state.py         # Session state utilities
    └── tools.py         # set_model / set_prompt / get_config
```

## Setup

### 1. Environment Configuration

Copying the example file is optional; the repository already ships with a root-level `.env` seeded with defaults. Adjust the values at the package root:

```bash
cd task_agent

# Optionally refresh from the template
# cp .env.example .env
```

Edit `.env` (or `.env.example`) and add your proxy and API keys. The agent must be restarted after changes so the values are picked up:

```bash
# Route every request through the proxy container (use http://localhost:10999 from the host)
FF_LLM_PROXY_BASE_URL=http://llm-proxy:4000

# Default model + provider the agent boots with
LITELLM_MODEL=openai/gpt-4o-mini
LITELLM_PROVIDER=openai

# Virtual key issued by the proxy to the task agent (bootstrap replaces the placeholder)
OPENAI_API_KEY=sk-proxy-default

# Upstream keys stay inside the proxy. Store real secrets under the LiteLLM
# aliases; the bootstrapper mirrors them into .env.litellm for the proxy container.
LITELLM_OPENAI_API_KEY=your_real_openai_api_key
LITELLM_ANTHROPIC_API_KEY=your_real_anthropic_key
LITELLM_GEMINI_API_KEY=your_real_gemini_key
LITELLM_MISTRAL_API_KEY=your_real_mistral_key
LITELLM_OPENROUTER_API_KEY=your_real_openrouter_key
```

> When running the agent outside of Docker, swap `FF_LLM_PROXY_BASE_URL` to the host port (default `http://localhost:10999`).

The bootstrap container provisions LiteLLM, copies provider secrets into `volumes/env/.env.litellm`, and rewrites `volumes/env/.env` with the virtual key. Populate the `LITELLM_*_API_KEY` values before the first launch so the proxy can reach your upstream providers as soon as the bootstrap script runs.

### 2. Install Dependencies

```bash
pip install "google-adk" "a2a-sdk[all]" "python-dotenv" "litellm"
```

### 3. Run in Docker

Build the container (the image can be pushed to any registry or run locally):

```bash
docker build -t litellm-hot-swap:latest task_agent
```

Provide environment configuration at runtime, either by passing variables individually or by mounting a file:

```bash
docker run \
  -p 8000:8000 \
  --env-file task_agent/.env \
  litellm-hot-swap:latest
```

The container starts Uvicorn with the ADK app (`main.py`) listening on port 8000.
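The entry point is not shown in the architecture tree above, so for orientation, here is a minimal sketch of what a `main.py` along these lines could look like. It assumes the `get_fast_api_app` helper shipped with google-adk, whose exact signature varies between ADK releases; treat it as illustrative rather than the project's actual entry point:

```python
# main.py - illustrative sketch, not necessarily the project's real entry point.
# get_fast_api_app's signature varies between google-adk releases.
import os

import uvicorn
from google.adk.cli.fast_api import get_fast_api_app

# Serve every agent package found next to this file (e.g. litellm_agent/)
AGENT_DIR = os.path.dirname(os.path.abspath(__file__))

app = get_fast_api_app(agents_dir=AGENT_DIR, web=True)

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000)
```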
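Before moving on to the run modes below, it can help to confirm the proxy wiring with a direct LiteLLM call. This is a minimal sketch that assumes the variable names from the `.env` example above:

```python
# check_proxy.py - quick sanity check that the proxy and virtual key work.
# Variable names match the .env example above; adjust if yours differ.
import os

import litellm
from dotenv import load_dotenv

load_dotenv()

response = litellm.completion(
    model=os.environ["LITELLM_MODEL"],             # e.g. openai/gpt-4o-mini
    api_base=os.environ["FF_LLM_PROXY_BASE_URL"],  # llm-proxy:4000, or localhost:10999 from the host
    api_key=os.environ["OPENAI_API_KEY"],          # virtual key issued by the proxy
    messages=[{"role": "user", "content": "ping"}],
)
print(response.choices[0].message.content)
```

If this prints a completion, the virtual key and proxy routing are working.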
## Running the Agent

### Option 1: ADK Web UI (Recommended for Testing)

Start the web interface:

```bash
adk web task_agent
```

> **Tip:** before launching `adk web`/`adk run`/`adk api_server`, ensure the root-level `.env` contains valid API keys for any provider you plan to hot-swap to (e.g. set `OPENAI_API_KEY` before switching to `openai/gpt-4o`).

Open http://localhost:8000 in your browser and interact with the agent.

### Option 2: ADK Terminal

Run in terminal mode:

```bash
adk run task_agent
```

### Option 3: A2A API Server

Start as an A2A-compatible API server:

```bash
adk api_server --a2a --port 8000 task_agent
```

The agent will be available at: `http://localhost:8000/a2a/litellm_agent`

### Command-Line Helper

Use the bundled script to drive hot-swaps and user messages over A2A:

```bash
python task_agent/a2a_hot_swap.py \
  --url http://127.0.0.1:8000/a2a/litellm_agent \
  --model openai gpt-4o \
  --prompt "You are concise." \
  --config \
  --context demo-session
```

To send a follow-up prompt in the same session (with a larger timeout for long answers):

```bash
python task_agent/a2a_hot_swap.py \
  --url http://127.0.0.1:8000/a2a/litellm_agent \
  --model openai gpt-4o \
  --prompt "You are concise." \
  --message "Give me a fuzzing harness." \
  --context demo-session \
  --timeout 120
```

> Ensure the corresponding provider keys are present in `.env` (or passed via environment variables) before issuing model swaps.

## Hot-Swap Tools

The agent provides three special tools:

### 1. `set_model` - Change the LLM Model

Change the model during conversation:

```
User: Use the set_model tool to change to gpt-4o with openai provider
Agent: ✅ Model configured to: openai/gpt-4o
       This change is active now!
```

**Parameters:**
- `model`: Model name (e.g., "gpt-4o", "claude-3-sonnet-20240229")
- `custom_llm_provider`: Optional provider prefix (e.g., "openai", "anthropic", "openrouter")

**Examples:**
- OpenAI: `set_model(model="gpt-4o", custom_llm_provider="openai")`
- Anthropic: `set_model(model="claude-3-sonnet-20240229", custom_llm_provider="anthropic")`
- Google: `set_model(model="gemini-2.0-flash-001", custom_llm_provider="gemini")`

### 2. `set_prompt` - Change System Prompt

Update the system instructions:

```
User: Use set_prompt to change my behavior to "You are a helpful coding assistant"
Agent: ✅ System prompt updated: You are a helpful coding assistant
       This change is active now!
```

### 3. `get_config` - View Configuration

Check the current model and prompt:

```
User: Use get_config to show me your configuration
Agent: 📊 Current Configuration:
       ━━━━━━━━━━━━━━━━━━━━━━
       Model: openai/gpt-4o
       System Prompt: You are a helpful coding assistant
       ━━━━━━━━━━━━━━━━━━━━━━
```

## Testing

### Basic A2A Client Test

```bash
python task_agent/test_a2a_client.py
```

### Hot-Swap Functionality Test

```bash
python task_agent/test_hotswap.py
```

This will:
1. Check the initial configuration
2. Query with the default model
3. Hot-swap to GPT-4o
4. Verify the model changed
5. Change the system prompt
6. Test the new prompt behavior
7. Hot-swap to Claude
8. Verify the final configuration

### Command-Line Hot-Swap Helper

You can trigger model and prompt changes directly against the A2A endpoint without the interactive CLI:

```bash
# Start the agent first (in another terminal):
adk api_server --a2a --port 8000 task_agent

# Apply swaps via pure A2A calls
python task_agent/a2a_hot_swap.py --model openai gpt-4o --prompt "You are concise." --config
python task_agent/a2a_hot_swap.py --model anthropic claude-3-sonnet-20240229 --context shared-session --config
python task_agent/a2a_hot_swap.py --prompt "" --context shared-session --config  # Clear the prompt and show current state
```

`--model` accepts either `provider/model` or a provider/model pair. Add `--context` if you want to reuse the same conversation across invocations. Use `--config` to dump the agent's configuration after the changes are applied.

## Supported Models

### OpenAI
- `openai/gpt-4o`
- `openai/gpt-4-turbo`
- `openai/gpt-3.5-turbo`

### Anthropic
- `anthropic/claude-3-opus-20240229`
- `anthropic/claude-3-sonnet-20240229`
- `anthropic/claude-3-haiku-20240307`

### Google
- `gemini/gemini-2.0-flash-001`
- `gemini/gemini-2.5-pro-exp-03-25`
- `vertex_ai/gemini-2.0-flash-001`

### OpenRouter
- `openrouter/anthropic/claude-3-opus`
- `openrouter/openai/gpt-4`
- Any model from the OpenRouter catalog

## How It Works

### Session State

- Model and prompt settings are stored in session state
- Each session maintains its own configuration
- Settings persist across messages in the same session

### Hot-Swap Mechanism

1. Tools update session state with the new model/prompt (see the sketches after this list)
2. `before_agent_callback` checks for changes
3. If the model changed, it directly updates the agent: `agent.model = LiteLlm(model=new_model)`
4. A dynamic instruction function reads the custom prompt from session state
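Step 1 is plain session-state bookkeeping. The sketch below shows how tools along the lines of `set_model`, `set_prompt`, and `get_config` can be written against ADK's `ToolContext`; the state key names here are hypothetical (the real ones live in `litellm_agent/config.py`):

```python
# tools.py sketch - the state key names are illustrative;
# the real ones are defined in litellm_agent/config.py.
from google.adk.tools import ToolContext

MODEL_KEY = "litellm_model"    # hypothetical key name
PROMPT_KEY = "litellm_prompt"  # hypothetical key name

async def set_model(tool_ctx: ToolContext, model: str, custom_llm_provider: str = "") -> str:
    """Store the requested model in session state; the callback applies it."""
    full_name = f"{custom_llm_provider}/{model}" if custom_llm_provider else model
    tool_ctx.state[MODEL_KEY] = full_name
    return f"✅ Model configured to: {full_name}\nThis change is active now!"

async def set_prompt(tool_ctx: ToolContext, prompt: str) -> str:
    """Store a new system prompt in session state."""
    tool_ctx.state[PROMPT_KEY] = prompt
    return f"✅ System prompt updated: {prompt}\nThis change is active now!"

async def get_config(tool_ctx: ToolContext) -> str:
    """Report the model and prompt currently stored in session state."""
    return (
        "📊 Current Configuration:\n"
        f"Model: {tool_ctx.state.get(MODEL_KEY, 'default')}\n"
        f"System Prompt: {tool_ctx.state.get(PROMPT_KEY, 'default')}"
    )
```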
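Steps 2 and 3 happen in the `before_agent_callback`, which rebinds the agent's model object in place. A sketch with the same hypothetical key names; note that reaching the agent object from the callback goes through a private attribute in current ADK releases, so that access path is an assumption:

```python
# callbacks.py sketch - apply a requested hot-swap before each agent turn.
from typing import Optional

from google.adk.agents.callback_context import CallbackContext
from google.adk.models.lite_llm import LiteLlm
from google.genai import types

MODEL_KEY = "litellm_model"  # hypothetical key name, as in the tools sketch

async def before_agent_callback(callback_context: CallbackContext) -> Optional[types.Content]:
    """If a new model was requested via set_model, rebind the agent to it."""
    requested = callback_context.state.get(MODEL_KEY)
    # Assumed access path: the public API does not expose the agent object here.
    agent = callback_context._invocation_context.agent
    if requested and getattr(agent.model, "model", None) != requested:
        agent.model = LiteLlm(model=requested)
        print(f"🔄 Hot-swapped model to: {requested}")
    return None  # returning None lets the agent run normally
```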
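Step 4 leans on ADK's support for callable instructions: `instruction` can be a function that is re-evaluated on each turn instead of a fixed string. A sketch, again with the hypothetical prompt key:

```python
# prompts.py sketch - instruction provider that prefers the hot-swapped prompt.
from google.adk.agents.readonly_context import ReadonlyContext

PROMPT_KEY = "litellm_prompt"  # hypothetical key name
BASE_INSTRUCTION = "You are a helpful, general-purpose assistant."

def dynamic_instruction(context: ReadonlyContext) -> str:
    """Return the custom prompt from session state, falling back to the base one."""
    return context.state.get(PROMPT_KEY) or BASE_INSTRUCTION
```

Wiring it up is then a matter of passing the function to the agent, e.g. `LlmAgent(instruction=dynamic_instruction, ...)`.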
### A2A Compatibility

- Agent card at `agent.json` defines A2A metadata
- Served at the `/a2a/litellm_agent` endpoint
- Compatible with the A2A client protocol

## Example Usage

### Interactive Session

```python
import asyncio

from a2a.client import A2AClient

async def chat():
    # Simplified client API for illustration; see the a2a-sdk docs for exact usage.
    client = A2AClient("http://localhost:8000/a2a/litellm_agent")
    context_id = "my-session-123"

    # Start with the default model
    async for msg in client.send_message("Hello!", context_id=context_id):
        print(msg)

    # Switch to GPT-4o
    async for msg in client.send_message(
        "Use set_model with model gpt-4o and provider openai",
        context_id=context_id,
    ):
        print(msg)

    # Continue with the new model
    async for msg in client.send_message(
        "Help me write a function",
        context_id=context_id,
    ):
        print(msg)

asyncio.run(chat())
```

## Troubleshooting

### Model Not Found

- Ensure the API key for the provider is set in `.env`
- Check that the model name is correct for the provider
- Verify LiteLLM supports the model (https://docs.litellm.ai/docs/providers)

### Connection Refused

- Ensure the agent is running (`adk api_server --a2a task_agent`)
- Check that the port matches (default: 8000)
- Verify no firewall is blocking localhost

### Hot-Swap Not Working

- Check that you're using the same `context_id` across messages
- Ensure the tool is actually being called (not just asked to switch)
- Look for `🔄 Hot-swapped model to:` in the server logs

## Development

### Adding New Tools

```python
from google.adk.agents import LlmAgent
from google.adk.tools import ToolContext

async def my_tool(tool_ctx: ToolContext, param: str) -> str:
    """Your tool description."""
    # Access session state
    tool_ctx.state["my_key"] = "my_value"
    return "Tool result"

# Add to agent
root_agent = LlmAgent(
    # ...
    tools=[set_model, set_prompt, get_config, my_tool],
)
```

### Modifying Callbacks

```python
from typing import Optional

from google.adk.agents.callback_context import CallbackContext
from google.adk.models import LlmResponse

async def after_model_callback(
    callback_context: CallbackContext,
    llm_response: LlmResponse,
) -> Optional[LlmResponse]:
    """Modify the response after the model generates it."""
    # Your logic here
    return llm_response
```

## License

Apache 2.0