* feat: Complete migration from Prefect to Temporal

BREAKING CHANGE: Replaces Prefect workflow orchestration with Temporal

## Major Changes
- Replace Prefect with Temporal for workflow orchestration
- Implement vertical worker architecture (rust, android)
- Replace Docker registry with MinIO for unified storage
- Refactor activities to be co-located with workflows
- Update all API endpoints for Temporal compatibility

## Infrastructure
- New: docker-compose.temporal.yaml (Temporal + MinIO + workers)
- New: workers/ directory with rust and android vertical workers
- New: backend/src/temporal/ (manager, discovery)
- New: backend/src/storage/ (S3-cached storage with MinIO)
- New: backend/toolbox/common/ (shared storage activities)
- Deleted: docker-compose.yaml (old Prefect setup)
- Deleted: backend/src/core/prefect_manager.py
- Deleted: backend/src/services/prefect_stats_monitor.py
- Deleted: Docker registry and insecure-registries requirement

## Workflows
- Migrated: security_assessment workflow to Temporal
- New: rust_test workflow (example/test workflow)
- Deleted: secret_detection_scan (Prefect-based, to be reimplemented)
- Activities now co-located with workflows for independent testing

## API Changes
- Updated: backend/src/api/workflows.py (Temporal submission)
- Updated: backend/src/api/runs.py (Temporal status/results)
- Updated: backend/src/main.py (727 lines, TemporalManager integration)
- Updated: All 16 MCP tools to use TemporalManager

## Testing
- ✅ All services healthy (Temporal, PostgreSQL, MinIO, workers, backend)
- ✅ All API endpoints functional
- ✅ End-to-end workflow test passed (72 findings from vulnerable_app)
- ✅ MinIO storage integration working (target upload/download, results)
- ✅ Worker activity discovery working (6 activities registered)
- ✅ Tarball extraction working
- ✅ SARIF report generation working

## Documentation
- ARCHITECTURE.md: Complete Temporal architecture documentation
- QUICKSTART_TEMPORAL.md: Getting started guide
- MIGRATION_DECISION.md: Why we chose Temporal over Prefect
- IMPLEMENTATION_STATUS.md: Migration progress tracking
- workers/README.md: Worker development guide

## Dependencies
- Added: temporalio>=1.6.0
- Added: boto3>=1.34.0 (MinIO S3 client)
- Removed: prefect>=3.4.18

* feat: Add Python fuzzing vertical with Atheris integration

This commit implements a complete Python fuzzing workflow using Atheris:

## Python Worker (workers/python/)
- Dockerfile with Python 3.11, Atheris, and build tools
- Generic worker.py for dynamic workflow discovery
- requirements.txt with temporalio, boto3, atheris dependencies
- Added to docker-compose.temporal.yaml with dedicated cache volume

## AtherisFuzzer Module (backend/toolbox/modules/fuzzer/)
- Reusable module extending BaseModule
- Auto-discovers fuzz targets (fuzz_*.py, *_fuzz.py, fuzz_target.py)
- Recursive search to find targets in nested directories
- Dynamically loads TestOneInput() function
- Configurable max_iterations and timeout
- Real-time stats callback support for live monitoring
- Returns findings as ModuleFinding objects

## Atheris Fuzzing Workflow (backend/toolbox/workflows/atheris_fuzzing/)
- Temporal workflow for orchestrating fuzzing
- Downloads user code from MinIO
- Executes AtherisFuzzer module
- Uploads results to MinIO
- Cleans up cache after execution
- metadata.yaml with vertical: python for routing

## Test Project (test_projects/python_fuzz_waterfall/)
- Demonstrates stateful waterfall vulnerability
- main.py with check_secret() that leaks progress
- fuzz_target.py with Atheris TestOneInput() harness
- Complete README with usage instructions

## Backend Fixes
- Fixed parameter merging in REST API endpoints (workflows.py)
- Changed workflow parameter passing from positional args to kwargs (manager.py)
- Default parameters now properly merged with user parameters

## Testing
- ✅ Worker discovered AtherisFuzzingWorkflow
- ✅ Workflow executed end-to-end successfully
- ✅ Fuzz target auto-discovered in nested directories
- ✅ Atheris ran 100,000 iterations
- ✅ Results uploaded and cache cleaned

* chore: Complete Temporal migration with updated CLI/SDK/docs

This commit includes all remaining Temporal migration changes:

## CLI Updates (cli/)
- Updated workflow execution commands for Temporal
- Enhanced error handling and exceptions
- Updated dependencies in uv.lock

## SDK Updates (sdk/)
- Client methods updated for Temporal workflows
- Updated models for new workflow execution
- Updated dependencies in uv.lock

## Documentation Updates (docs/)
- Architecture documentation for Temporal
- Workflow concept documentation
- Resource management documentation (new)
- Debugging guide (new)
- Updated tutorials and how-to guides
- Troubleshooting updates

## README Updates
- Main README with Temporal instructions
- Backend README
- CLI README
- SDK README

## Other
- Updated IMPLEMENTATION_STATUS.md
- Removed old vulnerable_app.tar.gz

These changes complete the Temporal migration and ensure the CLI/SDK work correctly with the new backend.

* fix: Use positional args instead of kwargs for Temporal workflows

The Temporal Python SDK's start_workflow() method doesn't accept a 'kwargs' parameter. Workflows must receive parameters as positional arguments via the 'args' parameter.

Changed to:

    args=workflow_args  # Positional arguments

This fixes the error:

    TypeError: Client.start_workflow() got an unexpected keyword argument 'kwargs'

Workflows now correctly receive parameters in order:
- security_assessment: [target_id, scanner_config, analyzer_config, reporter_config]
- atheris_fuzzing: [target_id, target_file, max_iterations, timeout_seconds]
- rust_test: [target_id, test_message]

* fix: Filter metadata-only parameters from workflow arguments

SecurityAssessmentWorkflow was receiving 7 arguments instead of 2-5. The issue was that target_path and volume_mode from default_parameters were being passed to the workflow, when they should only be used by the system for configuration.

Now filters out metadata-only parameters (target_path, volume_mode) before passing arguments to workflow execution.

* refactor: Remove Prefect leftovers and volume mounting legacy

Complete cleanup of Prefect migration artifacts:

Backend:
- Delete registry.py and workflow_discovery.py (Prefect-specific files)
- Remove Docker validation from setup.py (no longer needed)
- Remove ResourceLimits and VolumeMount models
- Remove target_path and volume_mode from WorkflowSubmission
- Remove supported_volume_modes from API and discovery
- Clean up metadata.yaml files (remove volume/path fields)
- Simplify parameter filtering in manager.py

SDK:
- Remove volume_mode parameter from client methods
- Remove ResourceLimits and VolumeMount models
- Remove Prefect error patterns from docker_logs.py
- Clean up WorkflowSubmission and WorkflowMetadata models

CLI:
- Remove Volume Modes display from workflow info

All removed features are Prefect-specific or Docker volume mounting artifacts. Temporal workflows use MinIO storage exclusively.
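For reference, the corrected call pattern from the positional-args fix above, sketched against the Temporal Python SDK (the workflow name, argument names, and task queue here are illustrative, not the exact FuzzForge code):

```python
from temporalio.client import Client

async def submit(run_id: str, target_id: str, test_message: str) -> None:
    client = await Client.connect("localhost:7233")
    # start_workflow() has no 'kwargs' parameter: workflow arguments
    # must be passed positionally through 'args', in declaration order.
    await client.start_workflow(
        "RustTestWorkflow",              # illustrative workflow name
        args=[target_id, test_message],  # e.g. rust_test's [target_id, test_message]
        id=run_id,
        task_queue="rust",
    )
```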
* feat: Add comprehensive test suite and benchmark infrastructure

- Add 68 unit tests for fuzzer, scanner, and analyzer modules
- Implement pytest-based test infrastructure with fixtures
- Add 6 performance benchmarks with category-specific thresholds
- Configure GitHub Actions for automated testing and benchmarking
- Add test and benchmark documentation

Test coverage:
- AtherisFuzzer: 8 tests
- CargoFuzzer: 14 tests
- FileScanner: 22 tests
- SecurityAnalyzer: 24 tests

All tests passing (68/68)
All benchmarks passing (6/6)

* fix: Resolve all ruff linting violations across codebase

Fixed 27 ruff violations in 12 files:
- Removed unused imports (Depends, Dict, Any, Optional, etc.)
- Fixed undefined workflow_info variable in workflows.py
- Removed dead code with undefined variables in atheris_fuzzer.py
- Changed f-string to regular string where no placeholders used

All files now pass ruff checks for CI/CD compliance.

* fix: Configure CI for unit tests only

- Renamed docker-compose.temporal.yaml → docker-compose.yml for CI compatibility
- Commented out integration-tests job (no integration tests yet)
- Updated test-summary to only depend on lint and unit-tests

CI will now run successfully with 68 unit tests. Integration tests can be added later.

* feat: Add CI/CD integration with ephemeral deployment model

Implements comprehensive CI/CD support for FuzzForge with on-demand worker management:

**Worker Management (v0.7.0)**
- Add WorkerManager for automatic worker lifecycle control
- Auto-start workers from stopped state when workflows execute
- Auto-stop workers after workflow completion
- Health checks and startup timeout handling (90s default)

**CI/CD Features**
- `--fail-on` flag: Fail builds based on SARIF severity levels (error/warning/note/info)
- `--export-sarif` flag: Export findings in SARIF 2.1.0 format
- `--auto-start`/`--auto-stop` flags: Control worker lifecycle
- Exit code propagation: Returns 1 on blocking findings, 0 on success

**Exit Code Fix**
- Add `except typer.Exit: raise` handlers at 3 critical locations
- Move worker cleanup to finally block for guaranteed execution
- Exit codes now propagate correctly even when build fails

**CI Scripts & Examples**
- ci-start.sh: Start FuzzForge services with health checks
- ci-stop.sh: Clean shutdown with volume preservation option
- GitHub Actions workflow example (security-scan.yml)
- GitLab CI pipeline example (.gitlab-ci.example.yml)
- docker-compose.ci.yml: CI-optimized compose file with profiles

**OSS-Fuzz Integration**
- New ossfuzz_campaign workflow for running OSS-Fuzz projects
- OSS-Fuzz worker with Docker-in-Docker support
- Configurable campaign duration and project selection

**Documentation**
- Comprehensive CI/CD integration guide (docs/how-to/cicd-integration.md)
- Updated architecture docs with worker lifecycle details
- Updated workspace isolation documentation
- CLI README with worker management examples

**SDK Enhancements**
- Add get_workflow_worker_info() endpoint
- Worker vertical metadata in workflow responses

**Testing**
- All workflows tested: security_assessment, atheris_fuzzing, secret_detection, cargo_fuzzing
- All monitoring commands tested: stats, crashes, status, finding
- Full CI pipeline simulation verified
- Exit codes verified for success/failure scenarios

Ephemeral CI/CD model: ~3-4GB RAM, ~60-90s startup, runs entirely in CI containers.
* fix: Resolve ruff linting violations in CI/CD code

- Remove unused variables (run_id, defaults, result)
- Remove unused imports
- Fix f-string without placeholders

All CI/CD integration files now pass ruff checks.
# FuzzForge Backend
A stateless API server for security testing workflow orchestration using Temporal. This system dynamically discovers workflows, executes them in isolated worker environments, and returns findings in SARIF format.
## Architecture Overview

### Core Components
- Workflow Discovery System: Automatically discovers workflows at startup
- Module System: Reusable components (scanner, analyzer, reporter) with a common interface
- Temporal Integration: Handles workflow orchestration, execution, and monitoring with vertical workers
- File Upload & Storage: HTTP multipart upload to MinIO for target files
- SARIF Output: Standardized security findings format
### Key Features
- Stateless: No persistent data, fully scalable
- Generic: No hardcoded workflows, automatic discovery
- Isolated: Each workflow runs in specialized vertical workers
- Extensible: Easy to add new workflows and modules
- Secure: File upload with MinIO storage, automatic cleanup via lifecycle policies
- Observable: Comprehensive logging and status tracking
## Quick Start

### Prerequisites
- Docker and Docker Compose
### Installation
From the project root, start all services:
```bash
docker-compose -f docker-compose.temporal.yaml up -d
```
This will start:
- Temporal server (Web UI at http://localhost:8233, gRPC at :7233)
- MinIO (S3 storage at http://localhost:9000, Console at http://localhost:9001)
- PostgreSQL database (for Temporal state)
- Vertical workers (worker-rust, worker-android, worker-web, etc.)
- FuzzForge backend API (port 8000)
Note: MinIO console login: fuzzforge / fuzzforge123
## API Endpoints

### Workflows

- `GET /workflows` - List all discovered workflows
- `GET /workflows/{name}/metadata` - Get workflow metadata and parameters
- `GET /workflows/{name}/parameters` - Get workflow parameter schema
- `GET /workflows/metadata/schema` - Get metadata.yaml schema
- `POST /workflows/{name}/submit` - Submit a workflow for execution (path-based, legacy)
- `POST /workflows/{name}/upload-and-submit` - Upload local files and submit workflow (recommended)

### Runs

- `GET /runs/{run_id}/status` - Get run status
- `GET /runs/{run_id}/findings` - Get SARIF findings from completed run
- `GET /runs/{workflow_name}/findings/{run_id}` - Alternative findings endpoint with workflow name
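A typical client interaction chains these endpoints: submit, poll status, fetch findings. A minimal sketch using `requests` (the response field names `run_id` and `status`, and the status values, are assumptions about the API's payloads):

```python
import time

import requests

BASE = "http://localhost:8000"

# Upload a tarball and start the workflow.
with open("project.tar.gz", "rb") as f:
    resp = requests.post(
        f"{BASE}/workflows/security_assessment/upload-and-submit",
        files={"file": f},
    )
run_id = resp.json()["run_id"]

# Poll run status until the workflow finishes.
while requests.get(f"{BASE}/runs/{run_id}/status").json()["status"] not in ("completed", "failed"):
    time.sleep(5)

# Fetch the SARIF findings for the completed run.
findings = requests.get(f"{BASE}/runs/{run_id}/findings").json()
print(findings["sarif"]["version"])
```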
## Workflow Structure
Each workflow must have:
```
toolbox/workflows/{workflow_name}/
├── workflow.py        # Temporal workflow definition
├── metadata.yaml      # Mandatory metadata (parameters, version, vertical, etc.)
└── requirements.txt   # Optional Python dependencies (installed in vertical worker)
```
Note: With Temporal architecture, workflows run in pre-built vertical workers (e.g., worker-rust, worker-android), not individual Docker containers. The workflow code is mounted as a volume and discovered at runtime.
### Example metadata.yaml

```yaml
name: security_assessment
version: "1.0.0"
description: "Comprehensive security analysis workflow"
author: "FuzzForge Team"
category: "comprehensive"
vertical: "rust"  # Routes to worker-rust
tags:
  - "security"
  - "analysis"
  - "comprehensive"
supported_volume_modes:
  - "ro"
  - "rw"
requirements:
  tools:
    - "file_scanner"
    - "security_analyzer"
    - "sarif_reporter"
  resources:
    memory: "512Mi"
    cpu: "500m"
    timeout: 1800
    has_docker: true
parameters:
  type: object
  properties:
    target_path:
      type: string
      default: "/workspace"
      description: "Path to analyze"
    volume_mode:
      type: string
      enum: ["ro", "rw"]
      default: "ro"
      description: "Volume mount mode"
    scanner_config:
      type: object
      description: "Scanner configuration"
      properties:
        max_file_size:
          type: integer
          description: "Maximum file size to scan (bytes)"
output_schema:
  type: object
  properties:
    sarif:
      type: object
      description: "SARIF-formatted security findings"
    summary:
      type: object
      description: "Scan execution summary"
```
### Metadata Field Descriptions
- name: Workflow identifier (must match directory name)
- version: Semantic version (x.y.z format)
- description: Human-readable description of the workflow
- author: Workflow author/maintainer
- category: Workflow category (comprehensive, specialized, fuzzing, focused)
- tags: Array of descriptive tags for categorization
- requirements.tools: Required security tools that the workflow uses
- requirements.resources: Resource requirements enforced at runtime:
  - memory: Memory limit (e.g., "512Mi", "1Gi")
  - cpu: CPU limit (e.g., "500m" for 0.5 cores, "1" for 1 core)
  - timeout: Maximum execution time in seconds
- parameters: JSON Schema object defining workflow parameters
- output_schema: Expected output format (typically SARIF)
## Resource Requirements
Resource requirements defined in workflow metadata are automatically enforced. Users can override defaults when submitting workflows:
```bash
curl -X POST "http://localhost:8000/workflows/security_assessment/submit" \
  -H "Content-Type: application/json" \
  -d '{
    "target_path": "/tmp/project",
    "volume_mode": "ro",
    "resource_limits": {
      "memory_limit": "1Gi",
      "cpu_limit": "1"
    }
  }'
```
Resource precedence: User limits > Workflow requirements > System defaults
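That precedence amounts to a layered dictionary merge, sketched below (illustrative only; the backend's actual merge logic may differ):

```python
SYSTEM_DEFAULTS = {"memory_limit": "512Mi", "cpu_limit": "500m", "timeout": 1800}

def effective_limits(workflow_requirements: dict, user_limits: dict) -> dict:
    # Later dicts win: user limits override workflow requirements,
    # which override system defaults.
    return {**SYSTEM_DEFAULTS, **workflow_requirements, **user_limits}

print(effective_limits({"memory_limit": "512Mi"}, {"memory_limit": "1Gi", "cpu_limit": "1"}))
# {'memory_limit': '1Gi', 'cpu_limit': '1', 'timeout': 1800}
```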
## File Upload and Target Access

### Upload Endpoint

The backend provides an upload endpoint for submitting workflows with local files:

```
POST /workflows/{workflow_name}/upload-and-submit
Content-Type: multipart/form-data
```

Parameters:

- `file`: File upload (supports .tar.gz for directories)
- `parameters`: JSON string of workflow parameters (optional)
- `volume_mode`: "ro" or "rw" (default: "ro")
- `timeout`: Execution timeout in seconds (optional)
Example using curl:
```bash
# Upload a directory (create tarball first)
tar -czf project.tar.gz /path/to/project
curl -X POST "http://localhost:8000/workflows/security_assessment/upload-and-submit" \
  -F "file=@project.tar.gz" \
  -F "parameters={\"check_secrets\":true}" \
  -F "volume_mode=ro"

# Upload a single file
curl -X POST "http://localhost:8000/workflows/security_assessment/upload-and-submit" \
  -F "file=@binary.elf" \
  -F "volume_mode=ro"
```
### Storage Flow

1. CLI/API uploads file via HTTP multipart
2. Backend receives the file and streams it to a temporary location (max 10GB)
3. Backend uploads to MinIO with a generated `target_id`
4. Workflow is submitted to Temporal with the `target_id`
5. Worker downloads the target from MinIO to a local cache
6. Workflow processes the target from the cache
7. MinIO lifecycle policy deletes files after 7 days
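On the worker side, steps 5-6 reduce to an S3 GetObject against MinIO plus a cache check. A hedged sketch with boto3 (the bucket name, cache path, and credentials are placeholders, not the actual FuzzForge configuration):

```python
from pathlib import Path

import boto3

# MinIO speaks the S3 API, so a plain S3 client works.
s3 = boto3.client(
    "s3",
    endpoint_url="http://minio:9000",
    aws_access_key_id="fuzzforge",
    aws_secret_access_key="fuzzforge123",
)

def fetch_target(target_id: str, cache_dir: Path = Path("/cache")) -> Path:
    # Reuse the cached copy if this target was downloaded before.
    local = cache_dir / target_id
    if not local.exists():
        s3.download_file("targets", target_id, str(local))
    return local
```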
### Advantages
- No host filesystem access required - workers can run anywhere
- Automatic cleanup - lifecycle policies prevent disk exhaustion
- Caching - repeated workflows reuse cached targets
- Multi-host ready - targets accessible from any worker
- Secure - isolated storage, no arbitrary host path access
## Module Development
Modules implement the BaseModule interface:
```python
from pathlib import Path
from typing import Dict

from src.toolbox.modules.base import BaseModule, ModuleMetadata, ModuleResult

class MyModule(BaseModule):
    def get_metadata(self) -> ModuleMetadata:
        return ModuleMetadata(
            name="my_module",
            version="1.0.0",
            description="Module description",
            category="scanner",
            # ... remaining metadata fields
        )

    async def execute(self, config: Dict, workspace: Path) -> ModuleResult:
        # Module logic here
        findings = [...]
        return self.create_result(findings=findings)

    def validate_config(self, config: Dict) -> bool:
        # Validate configuration
        return True
```
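A module built this way can be exercised directly, which is handy for unit tests. A usage sketch, assuming `MyModule` needs no constructor arguments and that `ModuleResult` exposes a `findings` list:

```python
import asyncio
from pathlib import Path

module = MyModule()
config = {"max_file_size": 1_000_000}

if module.validate_config(config):
    result = asyncio.run(module.execute(config, workspace=Path("/workspace")))
    print(f"{len(result.findings)} findings")
```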
## Submitting a Workflow

### With File Upload (Recommended)

```bash
# Create a tarball and upload it
tar -czf project.tar.gz /home/user/project
curl -X POST "http://localhost:8000/workflows/security_assessment/upload-and-submit" \
  -F "file=@project.tar.gz" \
  -F "parameters={\"scanner_config\":{\"patterns\":[\"*.py\"]},\"analyzer_config\":{\"check_secrets\":true}}" \
  -F "volume_mode=ro"
```
### Legacy Path-Based Submission

```bash
# Only works if backend and target are on the same machine
curl -X POST "http://localhost:8000/workflows/security_assessment/submit" \
  -H "Content-Type: application/json" \
  -d '{
    "target_path": "/home/user/project",
    "volume_mode": "ro",
    "parameters": {
      "scanner_config": {"patterns": ["*.py"]},
      "analyzer_config": {"check_secrets": true}
    }
  }'
```
## Getting Findings

```bash
curl "http://localhost:8000/runs/{run_id}/findings"
```
Returns SARIF-formatted findings:
```json
{
  "workflow": "security_assessment",
  "run_id": "abc-123",
  "sarif": {
    "version": "2.1.0",
    "runs": [{
      "tool": {...},
      "results": [...]
    }]
  }
}
```
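Because the payload is standard SARIF 2.1.0, a CI gate can walk `runs[].results[]` and fail the build by severity, much like the CLI's `--fail-on` flag. A minimal sketch (the response shape beyond the fields shown above is assumed):

```python
import requests

FAIL_ON = {"error"}  # fail the build only on error-level results

payload = requests.get("http://localhost:8000/runs/abc-123/findings").json()

blocking = [
    result
    for run in payload["sarif"]["runs"]
    for result in run["results"]
    # SARIF results carry an optional "level"; "warning" is the spec default.
    if result.get("level", "warning") in FAIL_ON
]

raise SystemExit(1 if blocking else 0)
```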
## Security Considerations
- File Upload Security: Files uploaded to MinIO with isolated storage
- Read-Only Default: Target files accessed as read-only unless explicitly set
- Worker Isolation: Each workflow runs in isolated vertical workers
- Resource Limits: Can set CPU/memory limits per worker
- Automatic Cleanup: MinIO lifecycle policies delete old files after 7 days
## Development

### Adding a New Workflow

1. Create directory: `toolbox/workflows/my_workflow/`
2. Add `workflow.py` with a Temporal workflow (using `@workflow.defn`); see the skeleton below
3. Add mandatory `metadata.yaml` with a `vertical` field
4. Restart the appropriate worker: `docker-compose -f docker-compose.temporal.yaml restart worker-rust`
5. The worker will automatically discover and register the new workflow
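A minimal `workflow.py` skeleton for step 2 (the activity name is a placeholder; the real workflows also download the target from MinIO and upload results):

```python
from datetime import timedelta

from temporalio import workflow

@workflow.defn
class MyWorkflow:
    @workflow.run
    async def run(self, target_id: str, scanner_config: dict) -> dict:
        # Activities (referenced here by name) do the real work;
        # the workflow itself only orchestrates and stays deterministic.
        return await workflow.execute_activity(
            "run_file_scanner",  # hypothetical activity name
            args=[target_id, scanner_config],
            start_to_close_timeout=timedelta(minutes=30),
        )
```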
### Adding a New Module

1. Create the module in `toolbox/modules/{category}/`
2. Implement the `BaseModule` interface
3. Use it in workflows via import
### Adding a New Vertical Worker

1. Create a worker directory: `workers/{vertical}/`
2. Create a `Dockerfile` with the required tools
3. Add the worker to `docker-compose.temporal.yaml`
4. The worker will automatically discover workflows with a matching `vertical` in metadata; a sketch of the generic worker loop follows
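At its core, the generic `worker.py` each vertical runs is a Temporal worker bound to that vertical's task queue. A sketch under the assumption that the task queue is named after the vertical (the real worker discovers workflows and activities dynamically instead of importing them):

```python
import asyncio

from temporalio.client import Client
from temporalio.worker import Worker

from toolbox.workflows.my_workflow.workflow import MyWorkflow

async def main() -> None:
    client = await Client.connect("temporal:7233")
    worker = Worker(
        client,
        task_queue="rust",       # assumption: task queue named after the vertical
        workflows=[MyWorkflow],  # discovered dynamically in the real worker.py
        activities=[],           # discovered activities registered here
    )
    await worker.run()

if __name__ == "__main__":
    asyncio.run(main())
```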