fuzzforge_ai/ai/agents/task_agent
Commit 02b877d23d by Songbird99: Feature/litellm proxy (#27)
* feat: seed governance config and responses routing

* Add env-configurable timeout for proxy providers

* Integrate LiteLLM OTEL collector and update docs

* Make .env.litellm optional for LiteLLM proxy

* Add LiteLLM proxy integration with model-agnostic virtual keys

Changes:
- Bootstrap generates 3 virtual keys with individual budgets (CLI: $100, Task-Agent: $25, Cognee: $50)
- Task-agent loads config at runtime via entrypoint script to wait for bootstrap completion
- All keys are model-agnostic by default (no LITELLM_DEFAULT_MODELS restrictions)
- Bootstrap handles database/env mismatch after docker prune by deleting stale aliases
- CLI and Cognee configured to use LiteLLM proxy with virtual keys
- Added comprehensive documentation in volumes/env/README.md

Technical details:
- task-agent entrypoint waits for keys in .env file before starting uvicorn
- Bootstrap creates/updates TASK_AGENT_API_KEY, COGNEE_API_KEY, and OPENAI_API_KEY
- Removed hardcoded API keys from docker-compose.yml
- All services route through http://localhost:10999 proxy

Generated with Claude Code https://claude.com/claude-code

Co-Authored-By: Claude <noreply@anthropic.com>

* Fix CLI not loading virtual keys from global .env

Project .env files with empty OPENAI_API_KEY values were overriding
the global virtual keys. Updated _load_env_file_if_exists to only
override with non-empty values.

Generated with Claude Code https://claude.com/claude-code

Co-Authored-By: Claude <noreply@anthropic.com>

* Fix agent executor not passing API key to LiteLLM

The agent was initializing LiteLlm without api_key or api_base,
causing authentication errors when using the LiteLLM proxy. Now
reads from OPENAI_API_KEY/LLM_API_KEY and LLM_ENDPOINT environment
variables and passes them to LiteLlm constructor.

Generated with Claude Code https://claude.com/claude-code

Co-Authored-By: Claude <noreply@anthropic.com>

* Auto-populate project .env with virtual key from global config

When running 'ff init', the command now checks for a global
volumes/env/.env file and automatically uses the OPENAI_API_KEY
virtual key if found. This ensures projects work with LiteLLM
proxy out of the box without manual key configuration.

Generated with Claude Code https://claude.com/claude-code

Co-Authored-By: Claude <noreply@anthropic.com>

* docs: Update README with LiteLLM configuration instructions

Add a note about LITELLM_GEMINI_API_KEY configuration and clarify that the OPENAI_API_KEY default value should not be changed, as it is used for the LLM proxy.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Refactor workflow parameters to use JSON Schema defaults

Consolidates parameter defaults into JSON Schema format, removing the separate default_parameters field. Adds extract_defaults_from_json_schema() helper to extract defaults from the standard schema structure. Updates LiteLLM proxy config to use LITELLM_OPENAI_API_KEY environment variable.

* Remove .env.example from task_agent

* Fix MDX syntax error in llm-proxy.md

* fix: apply default parameters from metadata.yaml automatically

Fixed TemporalManager.run_workflow() to correctly apply default parameter
values from workflow metadata.yaml files when parameters are not provided
by the caller.

Previous behavior:
- When workflow_params was empty {}, the condition
  `if workflow_params and 'parameters' in metadata` would evaluate to false
- Parameters would not be extracted from schema, resulting in workflows
  receiving only target_id with no other parameters

New behavior:
- Removed the `workflow_params and` requirement from the condition
- Now explicitly checks for defaults in parameter spec
- Applies defaults from metadata.yaml automatically when param not provided
- Workflows receive all parameters with proper fallback:
  provided value > metadata default > None

This makes metadata.yaml the single source of truth for parameter defaults,
removing the need for workflows to implement defensive default handling.

Affected workflows:
- llm_secret_detection (was failing with KeyError)
- All other workflows now benefit from automatic default application

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: tduhamel42 <tduhamel@fuzzinglabs.com>
2025-10-26 12:51:53 +01:00

LiteLLM Agent with Hot-Swap Support

A flexible AI agent powered by LiteLLM that supports runtime hot-swapping of models and system prompts. Compatible with ADK and A2A protocols.

Features

  • 🔄 Hot-Swap Models: Change LLM models on-the-fly without restarting
  • 📝 Dynamic Prompts: Update system prompts during conversation
  • 🌐 Multi-Provider Support: Works with OpenAI, Anthropic, Google, OpenRouter, and more
  • 🔌 A2A Compatible: Can be served as an A2A agent
  • 🛠️ ADK Integration: Run with adk web, adk run, or adk api_server

Architecture

task_agent/
├── __init__.py              # Exposes root_agent for ADK
├── a2a_hot_swap.py          # JSON-RPC helper for hot-swapping
├── README.md                # This guide
├── QUICKSTART.md            # Quick-start walkthrough
├── .env                     # Active environment (gitignored)
├── .env.example             # Environment template
└── litellm_agent/
    ├── __init__.py
    ├── agent.py             # Main agent implementation
    ├── agent.json           # A2A agent card
    ├── callbacks.py         # ADK callbacks
    ├── config.py            # Defaults and state keys
    ├── control.py           # HOTSWAP message helpers
    ├── prompts.py           # Base instruction
    ├── state.py             # Session state utilities
    └── tools.py             # set_model / set_prompt / get_config

Setup

1. Environment Configuration

Copying the example file is optional: a .env seeded with defaults already ships at the package root. Adjust its values as needed:

cd task_agent
# Optionally refresh from the template
# cp .env.example .env

Edit .env (or .env.example) and add your proxy + API keys. The agent must be restarted after changes so the values are picked up:

# Route every request through the proxy container (use http://localhost:10999 from the host)
FF_LLM_PROXY_BASE_URL=http://llm-proxy:4000

# Default model + provider the agent boots with
LITELLM_MODEL=openai/gpt-4o-mini
LITELLM_PROVIDER=openai

# Virtual key issued by the proxy to the task agent (bootstrap replaces the placeholder)
OPENAI_API_KEY=sk-proxy-default

# Upstream keys stay inside the proxy. Store the real secrets under the LITELLM_*
# aliases; the bootstrapper mirrors them into .env.litellm for the proxy container.
LITELLM_OPENAI_API_KEY=your_real_openai_api_key
LITELLM_ANTHROPIC_API_KEY=your_real_anthropic_key
LITELLM_GEMINI_API_KEY=your_real_gemini_key
LITELLM_MISTRAL_API_KEY=your_real_mistral_key
LITELLM_OPENROUTER_API_KEY=your_real_openrouter_key

When running the agent outside of Docker, swap FF_LLM_PROXY_BASE_URL to the host port (default http://localhost:10999).

The bootstrap container provisions LiteLLM, copies provider secrets into volumes/env/.env.litellm, and rewrites volumes/env/.env with the virtual key. Populate the LITELLM_*_API_KEY values before the first launch so the proxy can reach your upstream providers as soon as the bootstrap script runs.
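
These variables are consumed when the agent process starts. As a rough sketch (not the exact code in litellm_agent/agent.py), the model wrapper ends up being built along these lines, with the virtual key authenticating against the proxy while the real provider keys stay inside it:

import os
from google.adk.models.lite_llm import LiteLlm

# Assumed variable names match the .env shown above; adjust if your setup differs.
proxy_base = os.environ.get("FF_LLM_PROXY_BASE_URL", "http://llm-proxy:4000")
virtual_key = os.environ.get("OPENAI_API_KEY")          # virtual key issued by the proxy
model_name = os.environ.get("LITELLM_MODEL", "openai/gpt-4o-mini")

# api_base/api_key are forwarded by LiteLlm to litellm's completion calls,
# so every request goes through the proxy using the virtual key.
llm = LiteLlm(model=model_name, api_base=proxy_base, api_key=virtual_key)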

2. Install Dependencies

pip install "google-adk" "a2a-sdk[all]" "python-dotenv" "litellm"

3. Run in Docker

Build the container (this image can be pushed to any registry or run locally):

docker build -t litellm-hot-swap:latest task_agent

Provide environment configuration at runtime (either pass variables individually or mount a file):

docker run \
  -p 8000:8000 \
  --env-file task_agent/.env \
  litellm-hot-swap:latest

The container starts Uvicorn with the ADK app (main.py) listening on port 8000.

Running the Agent

Option 1: ADK Web UI

Start the web interface:

adk web task_agent

Tip: before launching adk web/adk run/adk api_server, ensure the root-level .env contains valid API keys for any provider you plan to hot-swap to (e.g. set OPENAI_API_KEY before switching to openai/gpt-4o).

Open http://localhost:8000 in your browser and interact with the agent.

Option 2: ADK Terminal

Run in terminal mode:

adk run task_agent

Option 3: A2A API Server

Start as an A2A-compatible API server:

adk api_server --a2a --port 8000 task_agent

The agent will be available at: http://localhost:8000/a2a/litellm_agent

Command-line helper

Use the bundled script to drive hot-swaps and user messages over A2A:

python task_agent/a2a_hot_swap.py \
  --url http://127.0.0.1:8000/a2a/litellm_agent \
  --model openai gpt-4o \
  --prompt "You are concise." \
  --config \
  --context demo-session

To send a follow-up prompt in the same session (with a larger timeout for long answers):

python task_agent/a2a_hot_swap.py \
  --url http://127.0.0.1:8000/a2a/litellm_agent \
  --model openai gpt-4o \
  --prompt "You are concise." \
  --message "Give me a fuzzing harness." \
  --context demo-session \
  --timeout 120

Ensure the corresponding provider keys are present in .env (or passed via environment variables) before issuing model swaps.

Hot-Swap Tools

The agent provides three special tools (a rough implementation sketch follows their descriptions below):

1. set_model - Change the LLM Model

Change the model during conversation:

User: Use the set_model tool to change to gpt-4o with openai provider
Agent: ✅ Model configured to: openai/gpt-4o
       This change is active now!

Parameters:

  • model: Model name (e.g., "gpt-4o", "claude-3-sonnet-20240229")
  • custom_llm_provider: Optional provider prefix (e.g., "openai", "anthropic", "openrouter")

Examples:

  • OpenAI: set_model(model="gpt-4o", custom_llm_provider="openai")
  • Anthropic: set_model(model="claude-3-sonnet-20240229", custom_llm_provider="anthropic")
  • Google: set_model(model="gemini-2.0-flash-001", custom_llm_provider="gemini")

2. set_prompt - Change System Prompt

Update the system instructions:

User: Use set_prompt to change my behavior to "You are a helpful coding assistant"
Agent: ✅ System prompt updated:
       You are a helpful coding assistant

       This change is active now!

3. get_config - View Configuration

Check current model and prompt:

User: Use get_config to show me your configuration
Agent: 📊 Current Configuration:
       ━━━━━━━━━━━━━━━━━━━━━━
       Model: openai/gpt-4o
       System Prompt: You are a helpful coding assistant
       ━━━━━━━━━━━━━━━━━━━━━━
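
The real implementations live in litellm_agent/tools.py; the sketch below only illustrates the general shape of tools built on session state. The state-key names here are hypothetical (the actual ones are defined in config.py):

from google.adk.tools import ToolContext

# Hypothetical state keys; the real names live in litellm_agent/config.py.
MODEL_KEY = "litellm_model"
PROMPT_KEY = "custom_prompt"

async def set_model(model: str, custom_llm_provider: str, tool_context: ToolContext) -> str:
    """Store the requested model in session state; a callback applies it."""
    full_model = f"{custom_llm_provider}/{model}" if custom_llm_provider else model
    tool_context.state[MODEL_KEY] = full_model
    return f"✅ Model configured to: {full_model}"

async def get_config(tool_context: ToolContext) -> str:
    """Report the model and prompt currently stored in session state."""
    model = tool_context.state.get(MODEL_KEY, "default")
    prompt = tool_context.state.get(PROMPT_KEY, "(base instruction)")
    return f"Model: {model}\nSystem Prompt: {prompt}"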

Testing

Basic A2A Client Test

python agent/test_a2a_client.py

Hot-Swap Functionality Test

python agent/test_hotswap.py

This will:

  1. Check initial configuration
  2. Query with default model
  3. Hot-swap to GPT-4o
  4. Verify model changed
  5. Change system prompt
  6. Test new prompt behavior
  7. Hot-swap to Claude
  8. Verify final configuration

Command-Line Hot-Swap Helper

You can trigger model and prompt changes directly against the A2A endpoint without the interactive CLI:

# Start the agent first (in another terminal):
adk api_server --a2a --port 8000 task_agent

# Apply swaps via pure A2A calls
python task_agent/a2a_hot_swap.py --model openai gpt-4o --prompt "You are concise." --config
python task_agent/a2a_hot_swap.py --model anthropic claude-3-sonnet-20240229 --context shared-session --config
python task_agent/a2a_hot_swap.py --prompt "" --context shared-session --config  # Clear the prompt and show current state

--model accepts either a single provider/model string or a separate provider and model pair. Add --context if you want to reuse the same conversation across invocations. Use --config to dump the agent's configuration after the changes are applied.

Supported Models

OpenAI

  • openai/gpt-4o
  • openai/gpt-4-turbo
  • openai/gpt-3.5-turbo

Anthropic

  • anthropic/claude-3-opus-20240229
  • anthropic/claude-3-sonnet-20240229
  • anthropic/claude-3-haiku-20240307

Google

  • gemini/gemini-2.0-flash-001
  • gemini/gemini-2.5-pro-exp-03-25
  • vertex_ai/gemini-2.0-flash-001

OpenRouter

  • openrouter/anthropic/claude-3-opus
  • openrouter/openai/gpt-4
  • Any model from OpenRouter catalog

How It Works

Session State

  • Model and prompt settings are stored in session state
  • Each session maintains its own configuration
  • Settings persist across messages in the same session

Hot-Swap Mechanism

  1. Tools update session state with new model/prompt
  2. before_agent_callback checks for changes
  3. If model changed, directly updates: agent.model = LiteLlm(model=new_model)
  4. Dynamic instruction function reads custom prompt from session state

A2A Compatibility

  • Agent card at agent.json defines A2A metadata
  • Served at /a2a/litellm_agent endpoint
  • Compatible with A2A client protocol

Example Usage

Interactive Session

from a2a.client import A2AClient
import asyncio

async def chat():
    client = A2AClient("http://localhost:8000/a2a/litellm_agent")
    context_id = "my-session-123"

    # Start with default model
    async for msg in client.send_message("Hello!", context_id=context_id):
        print(msg)

    # Switch to GPT-4
    async for msg in client.send_message(
        "Use set_model with model gpt-4o and provider openai",
        context_id=context_id
    ):
        print(msg)

    # Continue with new model
    async for msg in client.send_message(
        "Help me write a function",
        context_id=context_id
    ):
        print(msg)

asyncio.run(chat())

Troubleshooting

Model Not Found

  • Check the model name and provider prefix (e.g. openai/gpt-4o, anthropic/claude-3-sonnet-20240229)
  • Ensure the matching provider key (or the proxy virtual key) is set in .env before swapping to that provider

Connection Refused

  • Ensure the agent is running (adk api_server --a2a task_agent)
  • Check the port matches (default: 8000)
  • Verify no firewall is blocking localhost

Hot-Swap Not Working

  • Check that you're using the same context_id across messages
  • Ensure the tool is being called (not just asked to switch)
  • Look for 🔄 Hot-swapped model to: in server logs

Development

Adding New Tools

from google.adk.agents import LlmAgent
from google.adk.tools import ToolContext

async def my_tool(param: str, tool_context: ToolContext) -> str:
    """Your tool description."""
    # ADK injects the ToolContext via the parameter named tool_context;
    # use it to access session state
    tool_context.state["my_key"] = "my_value"
    return "Tool result"

# Add to agent
root_agent = LlmAgent(
    # ...
    tools=[set_model, set_prompt, get_config, my_tool],
)

Modifying Callbacks

from typing import Optional

from google.adk.agents.callback_context import CallbackContext
from google.adk.models import LlmResponse

async def after_model_callback(
    callback_context: CallbackContext,
    llm_response: LlmResponse
) -> Optional[LlmResponse]:
    """Modify response after model generates it."""
    # Your logic here; return a new LlmResponse to replace the model output
    return llm_response

License

Apache 2.0