# FuzzForge Backend

A stateless API server for security testing workflow orchestration using Prefect. This system dynamically discovers workflows, executes them in isolated Docker containers with volume mounting, and returns findings in SARIF format.

## Architecture Overview

### Core Components

1. **Workflow Discovery System**: Automatically discovers workflows at startup
2. **Module System**: Reusable components (scanner, analyzer, reporter) with a common interface
3. **Prefect Integration**: Handles container orchestration, workflow execution, and monitoring
4. **Volume Mounting**: Secure file access with configurable permissions (ro/rw)
5. **SARIF Output**: Standardized security findings format

### Key Features

- **Stateless**: No persistent data, fully scalable
- **Generic**: No hardcoded workflows, automatic discovery
- **Isolated**: Each workflow runs in its own Docker container
- **Extensible**: Easy to add new workflows and modules
- **Secure**: Read-only volume mounts by default, path validation
- **Observable**: Comprehensive logging and status tracking

## Quick Start

### Prerequisites

- Docker and Docker Compose

### Installation

From the project root, start all services:

```bash
docker-compose up -d
```

This will start:

- Prefect server (API at http://localhost:4200/api)
- PostgreSQL database
- Redis cache
- Docker registry (port 5001)
- Prefect worker (for running workflows)
- FuzzForge backend API (port 8000)
- FuzzForge MCP server (port 8010)

**Note**: The Prefect UI at http://localhost:4200 is not currently accessible from the host because the API is configured for inter-container communication. Use the REST API or MCP interface instead.

## API Endpoints

### Workflows

- `GET /workflows` - List all discovered workflows
- `GET /workflows/{name}/metadata` - Get workflow metadata and parameters
- `GET /workflows/{name}/parameters` - Get workflow parameter schema
- `GET /workflows/metadata/schema` - Get the metadata.yaml schema
- `POST /workflows/{name}/submit` - Submit a workflow for execution

### Runs

- `GET /runs/{run_id}/status` - Get run status
- `GET /runs/{run_id}/findings` - Get SARIF findings from a completed run
- `GET /runs/{workflow_name}/findings/{run_id}` - Alternative findings endpoint with workflow name
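For quick experimentation, the sketch below drives the submit and status endpoints from Python. This is a minimal illustration, not part of the codebase: the payload mirrors the submit example later in this README, and the `run_id` and `status` response fields (plus the `PENDING`/`RUNNING` state names) are assumptions about the response schema.

```python
"""Minimal client sketch for the endpoints above (assumes `requests` is installed).

The `run_id` and `status` response fields and the PENDING/RUNNING state names
are assumptions; adjust them to the actual response schema.
"""
import time

import requests

BASE_URL = "http://localhost:8000"

# Submit a workflow run; the payload mirrors the submit example below
resp = requests.post(
    f"{BASE_URL}/workflows/security_assessment/submit",
    json={"target_path": "/tmp/project", "volume_mode": "ro"},
    timeout=30,
)
resp.raise_for_status()
run_id = resp.json()["run_id"]  # assumed response field
print(f"Submitted run {run_id}")

# Poll the status endpoint until the run leaves a running state
while True:
    status = requests.get(f"{BASE_URL}/runs/{run_id}/status", timeout=30).json()
    print(f"Run state: {status}")
    if status.get("status") not in ("PENDING", "RUNNING"):  # assumed state names
        break
    time.sleep(5)
```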
## Workflow Structure

Each workflow must have:

```
toolbox/workflows/{workflow_name}/
    workflow.py          # Prefect flow definition
    metadata.yaml        # Mandatory metadata (parameters, version, etc.)
    Dockerfile           # Optional custom container definition
    requirements.txt     # Optional Python dependencies
```

### Example metadata.yaml

```yaml
name: security_assessment
version: "1.0.0"
description: "Comprehensive security analysis workflow"
author: "FuzzForge Team"
category: "comprehensive"
tags:
  - "security"
  - "analysis"
  - "comprehensive"
supported_volume_modes:
  - "ro"
  - "rw"
requirements:
  tools:
    - "file_scanner"
    - "security_analyzer"
    - "sarif_reporter"
  resources:
    memory: "512Mi"
    cpu: "500m"
    timeout: 1800
has_docker: true
parameters:
  type: object
  properties:
    target_path:
      type: string
      default: "/workspace"
      description: "Path to analyze"
    volume_mode:
      type: string
      enum: ["ro", "rw"]
      default: "ro"
      description: "Volume mount mode"
    scanner_config:
      type: object
      description: "Scanner configuration"
      properties:
        max_file_size:
          type: integer
          description: "Maximum file size to scan (bytes)"
output_schema:
  type: object
  properties:
    sarif:
      type: object
      description: "SARIF-formatted security findings"
    summary:
      type: object
      description: "Scan execution summary"
```

### Metadata Field Descriptions

- **name**: Workflow identifier (must match directory name)
- **version**: Semantic version (x.y.z format)
- **description**: Human-readable description of the workflow
- **author**: Workflow author/maintainer
- **category**: Workflow category (comprehensive, specialized, fuzzing, focused)
- **tags**: Array of descriptive tags for categorization
- **requirements.tools**: Required security tools that the workflow uses
- **requirements.resources**: Resource requirements enforced at runtime:
  - `memory`: Memory limit (e.g., "512Mi", "1Gi")
  - `cpu`: CPU limit (e.g., "500m" for 0.5 cores, "1" for 1 core)
  - `timeout`: Maximum execution time in seconds
- **parameters**: JSON Schema object defining workflow parameters
- **output_schema**: Expected output format (typically SARIF)

### Resource Requirements

Resource requirements defined in workflow metadata are automatically enforced. Users can override the defaults when submitting a workflow:

```bash
curl -X POST "http://localhost:8000/workflows/security_assessment/submit" \
  -H "Content-Type: application/json" \
  -d '{
    "target_path": "/tmp/project",
    "volume_mode": "ro",
    "resource_limits": {
      "memory_limit": "1Gi",
      "cpu_limit": "1"
    }
  }'
```

Resource precedence: User limits > Workflow requirements > System defaults

## Module Development

Modules implement the `BaseModule` interface:

```python
from pathlib import Path
from typing import Dict

from src.toolbox.modules.base import BaseModule, ModuleMetadata, ModuleResult


class MyModule(BaseModule):
    def get_metadata(self) -> ModuleMetadata:
        return ModuleMetadata(
            name="my_module",
            version="1.0.0",
            description="Module description",
            category="scanner",
            ...
        )

    async def execute(self, config: Dict, workspace: Path) -> ModuleResult:
        # Module logic here
        findings = [...]
        return self.create_result(findings=findings)

    def validate_config(self, config: Dict) -> bool:
        # Validate configuration
        return True
```

## Submitting a Workflow

```bash
curl -X POST "http://localhost:8000/workflows/security_assessment/submit" \
  -H "Content-Type: application/json" \
  -d '{
    "target_path": "/home/user/project",
    "volume_mode": "ro",
    "parameters": {
      "scanner_config": {"patterns": ["*.py"]},
      "analyzer_config": {"check_secrets": true}
    }
  }'
```

## Getting Findings

```bash
curl "http://localhost:8000/runs/{run_id}/findings"
```

Returns SARIF-formatted findings:

```json
{
  "workflow": "security_assessment",
  "run_id": "abc-123",
  "sarif": {
    "version": "2.1.0",
    "runs": [{
      "tool": {...},
      "results": [...]
    }]
  }
}
```
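Because the `sarif` payload follows the standard SARIF 2.1.0 layout, the findings can be post-processed with a few lines of Python. The sketch below is illustrative only; it assumes the response has been saved to a hypothetical `findings.json` and that each result carries the usual `ruleId`, `message.text`, and `locations` fields.

```python
"""Sketch: summarize a findings response saved as findings.json."""
import json
from pathlib import Path

response = json.loads(Path("findings.json").read_text())

for run in response["sarif"].get("runs", []):
    tool = run.get("tool", {}).get("driver", {}).get("name", "unknown tool")
    for result in run.get("results", []):
        rule = result.get("ruleId", "unknown-rule")
        message = result.get("message", {}).get("text", "")
        # Locations are optional in SARIF; fall back gracefully when absent
        uri = ""
        locations = result.get("locations", [])
        if locations:
            uri = (
                locations[0]
                .get("physicalLocation", {})
                .get("artifactLocation", {})
                .get("uri", "")
            )
        print(f"[{tool}] {rule}: {message} ({uri})")
```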
## Security Considerations

1. **Volume Mounting**: Only allowed directories can be mounted
2. **Read-Only Default**: Volumes are mounted read-only unless explicitly set to `rw`
3. **Container Isolation**: Each workflow runs in an isolated container
4. **Resource Limits**: CPU/memory limits can be set via Prefect
5. **Network Isolation**: Containers use bridge networking

## Development

### Adding a New Workflow

1. Create a directory: `toolbox/workflows/my_workflow/`
2. Add `workflow.py` with a Prefect flow (see the sketch below)
3. Add the mandatory `metadata.yaml`
4. Restart the backend: `docker-compose restart fuzzforge-backend`

### Adding a New Module

1. Create the module in `toolbox/modules/{category}/`
2. Implement the `BaseModule` interface
3. Import and use it in workflows
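To make step 2 of "Adding a New Workflow" concrete, here is a minimal sketch of what a `workflow.py` might contain. It is an assumption-laden illustration rather than the project's actual flow contract: the parameter names mirror the metadata example above, and the SARIF envelope is assembled by hand instead of by a reporter module from the toolbox.

```python
"""Sketch of toolbox/workflows/my_workflow/workflow.py (illustrative only)."""
from pathlib import Path
from typing import Dict, List

from prefect import flow, get_run_logger, task


@task
def scan_for_todos(workspace: Path) -> List[Dict]:
    """Toy scan: flag TODO markers in Python files under the workspace."""
    findings = []
    for path in workspace.rglob("*.py"):
        for lineno, line in enumerate(path.read_text(errors="ignore").splitlines(), 1):
            if "TODO" in line:
                findings.append({"file": str(path), "line": lineno, "text": line.strip()})
    return findings


@flow(name="my_workflow")
def my_workflow(target_path: str = "/workspace", volume_mode: str = "ro") -> Dict:
    logger = get_run_logger()
    logger.info("Scanning %s (volume_mode=%s)", target_path, volume_mode)

    findings = scan_for_todos(Path(target_path))

    # Hand-rolled SARIF envelope matching the output_schema above; a real
    # workflow would normally delegate this to a reporter module.
    return {
        "sarif": {
            "version": "2.1.0",
            "runs": [{
                "tool": {"driver": {"name": "my_workflow"}},
                "results": [
                    {
                        "ruleId": "todo-marker",
                        "message": {"text": f["text"]},
                        "locations": [{
                            "physicalLocation": {
                                "artifactLocation": {"uri": f["file"]},
                                "region": {"startLine": f["line"]},
                            }
                        }],
                    }
                    for f in findings
                ],
            }],
        },
        "summary": {"finding_count": len(findings)},
    }
```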