mirror of https://github.com/FuzzingLabs/fuzzforge_ai.git synced 2026-02-13 07:52:45 +00:00

Files

Songbird99 f77c3ff1e9 Feature/litellm proxy (#27)

* feat: seed governance config and responses routing

* Add env-configurable timeout for proxy providers

* Integrate LiteLLM OTEL collector and update docs

* Make .env.litellm optional for LiteLLM proxy

* Add LiteLLM proxy integration with model-agnostic virtual keys

Changes:
- Bootstrap generates 3 virtual keys with individual budgets (CLI: $100, Task-Agent: $25, Cognee: $50)
- Task-agent loads config at runtime via entrypoint script to wait for bootstrap completion
- All keys are model-agnostic by default (no LITELLM_DEFAULT_MODELS restrictions)
- Bootstrap handles database/env mismatch after docker prune by deleting stale aliases
- CLI and Cognee configured to use LiteLLM proxy with virtual keys
- Added comprehensive documentation in volumes/env/README.md

Technical details:
- task-agent entrypoint waits for keys in .env file before starting uvicorn
- Bootstrap creates/updates TASK_AGENT_API_KEY, COGNEE_API_KEY, and OPENAI_API_KEY
- Removed hardcoded API keys from docker-compose.yml
- All services route through http://localhost:10999 proxy

* Fix CLI not loading virtual keys from global .env

Project .env files with empty OPENAI_API_KEY values were overriding
the global virtual keys. Updated _load_env_file_if_exists to only
override with non-empty values.

* Fix agent executor not passing API key to LiteLLM

The agent was initializing LiteLlm without api_key or api_base,
causing authentication errors when using the LiteLLM proxy. Now
reads from OPENAI_API_KEY/LLM_API_KEY and LLM_ENDPOINT environment
variables and passes them to LiteLlm constructor.

* Auto-populate project .env with virtual key from global config

When running 'ff init', the command now checks for a global
volumes/env/.env file and automatically uses the OPENAI_API_KEY
virtual key if found. This ensures projects work with LiteLLM
proxy out of the box without manual key configuration.

* docs: Update README with LiteLLM configuration instructions

Add note about LITELLM_GEMINI_API_KEY configuration and clarify that OPENAI_API_KEY default value should not be changed as it's used for the LLM proxy.

* Refactor workflow parameters to use JSON Schema defaults

Consolidates parameter defaults into JSON Schema format, removing the separate default_parameters field. Adds extract_defaults_from_json_schema() helper to extract defaults from the standard schema structure. Updates LiteLLM proxy config to use LITELLM_OPENAI_API_KEY environment variable.

* Remove .env.example from task_agent

* Fix MDX syntax error in llm-proxy.md

* fix: apply default parameters from metadata.yaml automatically

Fixed TemporalManager.run_workflow() to correctly apply default parameter
values from workflow metadata.yaml files when parameters are not provided
by the caller.

Previous behavior:
- When workflow_params was empty {}, the condition
  `if workflow_params and 'parameters' in metadata` would fail
- Parameters would not be extracted from schema, resulting in workflows
  receiving only target_id with no other parameters

New behavior:
- Removed the `workflow_params and` requirement from the condition
- Now explicitly checks for defaults in parameter spec
- Applies defaults from metadata.yaml automatically when param not provided
- Workflows receive all parameters with proper fallback:
  provided value > metadata default > None

This makes metadata.yaml the single source of truth for parameter defaults,
removing the need for workflows to implement defensive default handling.

Affected workflows:
- llm_secret_detection (was failing with KeyError)
- All other workflows now benefit from automatic default application

Co-authored-by: tduhamel42 <tduhamel@fuzzinglabs.com>

2025-11-04 14:04:10 +01:00

benchmarks

fix: Add benchmark results files to git

2025-10-17 09:56:09 +02:00

src

Feature/litellm proxy (#27)

2025-11-04 14:04:10 +01:00

tests

CI/CD Integration with Ephemeral Deployment Model (#14)

2025-10-14 10:13:45 +02:00

toolbox

Feature/litellm proxy (#27)

2025-11-04 14:04:10 +01:00

Dockerfile

CI/CD Integration with Ephemeral Deployment Model (#14)

2025-10-14 10:13:45 +02:00

mcp-config.json

fix: resolve live monitoring bug, remove deprecated parameters, and auto-start Python worker

2025-10-22 16:26:58 +02:00

pyproject.toml

chore: Bump version to 0.7.0

2025-10-16 12:23:56 +02:00

README.md

docs: Remove obsolete volume_mode references from documentation

2025-10-16 11:36:53 +02:00

uv.lock

CI/CD Integration with Ephemeral Deployment Model (#14)

2025-10-14 10:13:45 +02:00

README.md

FuzzForge Backend

A stateless API server for security testing workflow orchestration using Temporal. This system dynamically discovers workflows, executes them in isolated worker environments, and returns findings in SARIF format.

Architecture Overview

Core Components

Workflow Discovery System: Automatically discovers workflows at startup
Module System: Reusable components (scanner, analyzer, reporter) with a common interface
Temporal Integration: Handles workflow orchestration, execution, and monitoring with vertical workers
File Upload & Storage: HTTP multipart upload to MinIO for target files
SARIF Output: Standardized security findings format

Key Features

Stateless: No persistent data, fully scalable
Generic: No hardcoded workflows, automatic discovery
Isolated: Each workflow runs in specialized vertical workers
Extensible: Easy to add new workflows and modules
Secure: File upload with MinIO storage, automatic cleanup via lifecycle policies
Observable: Comprehensive logging and status tracking

Quick Start

Prerequisites

Docker and Docker Compose

Installation

From the project root, start all services:

docker-compose -f docker-compose.temporal.yaml up -d

This will start:

Temporal server (Web UI at http://localhost:8233, gRPC at :7233)
MinIO (S3 storage at http://localhost:9000, Console at http://localhost:9001)
PostgreSQL database (for Temporal state)
Vertical workers (worker-rust, worker-android, worker-web, etc.)
FuzzForge backend API (port 8000)

Note: MinIO console login: fuzzforge / fuzzforge123

API Endpoints

Workflows

GET /workflows - List all discovered workflows
GET /workflows/{name}/metadata - Get workflow metadata and parameters
GET /workflows/{name}/parameters - Get workflow parameter schema
GET /workflows/metadata/schema - Get metadata.yaml schema
POST /workflows/{name}/submit - Submit a workflow for execution (path-based, legacy)
POST /workflows/{name}/upload-and-submit - Upload local files and submit workflow (recommended)

Runs

GET /runs/{run_id}/status - Get run status
GET /runs/{run_id}/findings - Get SARIF findings from completed run
GET /runs/{workflow_name}/findings/{run_id} - Alternative findings endpoint with workflow name

Workflow Structure

Each workflow must have:

toolbox/workflows/{workflow_name}/
   workflow.py       # Temporal workflow definition
   metadata.yaml     # Mandatory metadata (parameters, version, vertical, etc.)
   requirements.txt  # Optional Python dependencies (installed in vertical worker)

Note: With Temporal architecture, workflows run in pre-built vertical workers (e.g., worker-rust, worker-android), not individual Docker containers. The workflow code is mounted as a volume and discovered at runtime.
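
Before restarting a worker, it can help to confirm that each workflow directory has the mandatory files. The sketch below is a hypothetical pre-flight check, not the backend's own discovery code; it only tests for the presence of workflow.py and metadata.yaml and does not parse either file.

```python
from pathlib import Path

# Files every workflow directory must contain, per the layout above.
REQUIRED_FILES = ("workflow.py", "metadata.yaml")


def check_workflow_layout(workflows_root: Path) -> dict[str, list[str]]:
    """Map each workflow directory name to the required files it is missing."""
    report: dict[str, list[str]] = {}
    for candidate in sorted(workflows_root.iterdir()):
        if not candidate.is_dir():
            continue  # stray files next to the workflow directories are ignored
        report[candidate.name] = [
            name for name in REQUIRED_FILES if not (candidate / name).is_file()
        ]
    return report
```

Calling `check_workflow_layout(Path("toolbox/workflows"))` returns an empty list for every complete workflow, so any non-empty value points at a directory the worker is unlikely to register.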

Example metadata.yaml

name: security_assessment
version: "1.0.0"
description: "Comprehensive security analysis workflow"
author: "FuzzForge Team"
category: "comprehensive"
vertical: "rust"  # Routes to worker-rust
tags:
  - "security"
  - "analysis"
  - "comprehensive"

requirements:
  tools:
    - "file_scanner"
    - "security_analyzer"
    - "sarif_reporter"
  resources:
    memory: "512Mi"
    cpu: "500m"
    timeout: 1800

has_docker: true

parameters:
  type: object
  properties:
    target_path:
      type: string
      default: "/workspace"
      description: "Path to analyze"
    scanner_config:
      type: object
      description: "Scanner configuration"
      properties:
        max_file_size:
          type: integer
          description: "Maximum file size to scan (bytes)"

output_schema:
  type: object
  properties:
    sarif:
      type: object
      description: "SARIF-formatted security findings"
    summary:
      type: object
      description: "Scan execution summary"

Metadata Field Descriptions

name: Workflow identifier (must match directory name)
version: Semantic version (x.y.z format)
description: Human-readable description of the workflow
author: Workflow author/maintainer
category: Workflow category (comprehensive, specialized, fuzzing, focused)
vertical: Vertical worker that runs the workflow (e.g., "rust" routes to worker-rust)
tags: Array of descriptive tags for categorization
requirements.tools: Required security tools that the workflow uses
requirements.resources: Resource requirements enforced at runtime:
- memory: Memory limit (e.g., "512Mi", "1Gi")
- cpu: CPU limit (e.g., "500m" for 0.5 cores, "1" for 1 core)
- timeout: Maximum execution time in seconds
parameters: JSON Schema object defining workflow parameters
output_schema: Expected output format (typically SARIF)
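
Because parameters is a JSON Schema object, defaults live next to each property (target_path above defaults to "/workspace"). The sketch below illustrates one way such defaults could be combined with caller-supplied values, following the fallback order given in the commit notes (provided value > metadata default > None). The function names are hypothetical and this is not the backend's implementation; it only looks at top-level properties.

```python
from typing import Any


def extract_schema_defaults(schema: dict[str, Any]) -> dict[str, Any]:
    """Collect `default` values from the top-level properties of a JSON Schema object."""
    defaults: dict[str, Any] = {}
    for name, spec in schema.get("properties", {}).items():
        if isinstance(spec, dict) and "default" in spec:
            defaults[name] = spec["default"]
    return defaults


def resolve_parameters(schema: dict[str, Any], provided: dict[str, Any]) -> dict[str, Any]:
    """Apply the order: provided value, then schema default, then None."""
    resolved: dict[str, Any] = {name: None for name in schema.get("properties", {})}
    resolved.update(extract_schema_defaults(schema))  # defaults override None
    resolved.update(provided)  # caller-supplied values override defaults
    return resolved


# The `parameters` block from the example metadata.yaml, written as a dict.
EXAMPLE_SCHEMA = {
    "type": "object",
    "properties": {
        "target_path": {"type": "string", "default": "/workspace"},
        "scanner_config": {"type": "object"},
    },
}
```

With no caller input, `resolve_parameters(EXAMPLE_SCHEMA, {})` yields the default for target_path and None for scanner_config, which has no default in the example.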

Resource Requirements

Resource requirements defined in workflow metadata are automatically enforced. Users can override defaults when submitting workflows:

curl -X POST "http://localhost:8000/workflows/security_assessment/submit" \
  -H "Content-Type: application/json" \
  -d '{
    "target_path": "/tmp/project",
    "resource_limits": {
      "memory_limit": "1Gi",
      "cpu_limit": "1"
    }
  }'

Resource precedence: User limits > Workflow requirements > System defaults
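
The precedence rule can be pictured as a layered lookup in which the first layer that defines a key wins. The sketch below is illustrative only: the system default values and the mapping from request keys (memory_limit, cpu_limit) to metadata keys (memory, cpu) are assumptions, not values taken from the backend.

```python
from collections import ChainMap

# Hypothetical system defaults; the real values are not given in this document.
SYSTEM_DEFAULTS = {"memory": "256Mi", "cpu": "250m", "timeout": 600}

# Assumed mapping from request field names to metadata.yaml resource names.
USER_KEY_MAP = {"memory_limit": "memory", "cpu_limit": "cpu"}


def effective_limits(user_limits: dict, workflow_resources: dict,
                     system_defaults: dict = SYSTEM_DEFAULTS) -> dict:
    """Resolve limits in the order: user limits, workflow requirements, system defaults."""
    normalized = {
        USER_KEY_MAP.get(key, key): value
        for key, value in user_limits.items()
        if value is not None  # an unset user limit falls through to the next layer
    }
    # ChainMap returns the value from the first mapping that contains the key.
    return dict(ChainMap(normalized, workflow_resources, system_defaults))
```

For the request shown above against the example metadata.yaml, this would give 1Gi of memory and 1 CPU from the user, with the 1800-second timeout still coming from the workflow.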

File Upload and Target Access

Upload Endpoint

The backend provides an upload endpoint for submitting workflows with local files:

POST /workflows/{workflow_name}/upload-and-submit
Content-Type: multipart/form-data

Parameters:
  file: File upload (supports .tar.gz for directories)
  parameters: JSON string of workflow parameters (optional)
  timeout: Execution timeout in seconds (optional)

Example using curl:

# Upload a directory (create tarball first)
tar -czf project.tar.gz /path/to/project
curl -X POST "http://localhost:8000/workflows/security_assessment/upload-and-submit" \
  -F "file=@project.tar.gz" \
  -F "parameters={\"check_secrets\":true}"

# Upload a single file
curl -X POST "http://localhost:8000/workflows/security_assessment/upload-and-submit" \
  -F "file=@binary.elf"
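
When the upload is driven from a script rather than the shell, the tarball and the parameters field can be prepared in Python first. This is a sketch with hypothetical helper names; it stores the directory under its own name inside the archive, which is an assumption about layout, since this section does not say how the backend expects the archive to be structured.

```python
import json
import tarfile
from pathlib import Path


def make_tarball(source_dir: Path, archive_path: Path) -> Path:
    """Write source_dir into a gzip-compressed tarball using relative member names."""
    with tarfile.open(archive_path, "w:gz") as archive:
        # arcname keeps absolute host paths out of the archive.
        archive.add(source_dir, arcname=source_dir.name)
    return archive_path


def encode_parameters(parameters: dict) -> str:
    """Serialize workflow parameters for the `parameters` form field."""
    return json.dumps(parameters, separators=(",", ":"))
```

The resulting file and string correspond to the `file` and `parameters` form fields in the curl examples above. Write the archive outside the directory being archived so it does not end up containing itself.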

Storage Flow

CLI/API uploads file via HTTP multipart
Backend receives file and streams to temporary location (max 10GB)
Backend uploads to MinIO with generated target_id
Workflow is submitted to Temporal with target_id
Worker downloads target from MinIO to local cache
Workflow processes target from cache
MinIO lifecycle policy deletes files after 7 days

Advantages

No host filesystem access required - workers can run anywhere
Automatic cleanup - lifecycle policies prevent disk exhaustion
Caching - repeated workflows reuse cached targets
Multi-host ready - targets accessible from any worker
Secure - isolated storage, no arbitrary host path access

Module Development

Modules implement the BaseModule interface:

from pathlib import Path
from typing import Dict

from src.toolbox.modules.base import BaseModule, ModuleMetadata, ModuleResult


class MyModule(BaseModule):
    def get_metadata(self) -> ModuleMetadata:
        return ModuleMetadata(
            name="my_module",
            version="1.0.0",
            description="Module description",
            category="scanner",
            # ... remaining metadata fields
        )

    async def execute(self, config: Dict, workspace: Path) -> ModuleResult:
        # Module logic here; collect findings as the module runs
        findings = []
        return self.create_result(findings=findings)

    def validate_config(self, config: Dict) -> bool:
        # Validate configuration
        return True

Submitting a Workflow

With File Upload (Recommended)

# Create a tarball, then upload it
tar -czf project.tar.gz /home/user/project
curl -X POST "http://localhost:8000/workflows/security_assessment/upload-and-submit" \
  -F "file=@project.tar.gz" \
  -F "parameters={\"scanner_config\":{\"patterns\":[\"*.py\"]},\"analyzer_config\":{\"check_secrets\":true}}"

Legacy Path-Based Submission

# Only works if backend and target are on same machine
curl -X POST "http://localhost:8000/workflows/security_assessment/submit" \
  -H "Content-Type: application/json" \
  -d '{
    "target_path": "/home/user/project",
    "parameters": {
      "scanner_config": {"patterns": ["*.py"]},
      "analyzer_config": {"check_secrets": true}
    }
  }'

Getting Findings

curl "http://localhost:8000/runs/{run_id}/findings"

Returns SARIF-formatted findings:

{
  "workflow": "security_assessment",
  "run_id": "abc-123",
  "sarif": {
    "version": "2.1.0",
    "runs": [{
      "tool": {...},
      "results": [...]
    }]
  }
}
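
Once the response is decoded, the results nested under sarif.runs can be summarized in a few lines. The sketch below counts results by their SARIF level field; the function name is hypothetical, and counting a missing level as "warning" follows SARIF's usual default rather than anything stated in this document.

```python
from collections import Counter


def summarize_findings(payload: dict) -> Counter:
    """Count SARIF results by `level` across all runs in a findings response."""
    counts: Counter = Counter()
    for run in payload.get("sarif", {}).get("runs", []):
        for result in run.get("results", []):
            # Results without an explicit level are counted as "warning".
            counts[result.get("level", "warning")] += 1
    return counts
```

A response with no runs or no results produces an empty counter, so the caller can treat `not summarize_findings(payload)` as "nothing found".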

Security Considerations

File Upload Security: Uploaded files are stored in MinIO, isolated from the host filesystem
Read-Only Default: Target files are accessed read-only unless explicitly configured otherwise
Worker Isolation: Each workflow runs in isolated vertical workers
Resource Limits: CPU and memory limits can be set per worker
Automatic Cleanup: MinIO lifecycle policies delete old files after 7 days

Development

Adding a New Workflow

Create directory: toolbox/workflows/my_workflow/
Add workflow.py with a Temporal workflow (using @workflow.defn)
Add mandatory metadata.yaml with vertical field
Restart the appropriate worker: docker-compose -f docker-compose.temporal.yaml restart worker-rust
Worker will automatically discover and register the new workflow

Adding a New Module

Create module in toolbox/modules/{category}/
Implement BaseModule interface
Use in workflows via import

Adding a New Vertical Worker

Create worker directory: workers/{vertical}/
Create Dockerfile with required tools
Add worker to docker-compose.temporal.yaml
Worker will automatically discover workflows with matching vertical in metadata