CI/CD Integration with Ephemeral Deployment Model (#14)

* feat: Complete migration from Prefect to Temporal

BREAKING CHANGE: Replaces Prefect workflow orchestration with Temporal

## Major Changes
- Replace Prefect with Temporal for workflow orchestration
- Implement vertical worker architecture (rust, android)
- Replace Docker registry with MinIO for unified storage
- Refactor activities to be co-located with workflows
- Update all API endpoints for Temporal compatibility

## Infrastructure
- New: docker-compose.temporal.yaml (Temporal + MinIO + workers)
- New: workers/ directory with rust and android vertical workers
- New: backend/src/temporal/ (manager, discovery)
- New: backend/src/storage/ (S3-cached storage with MinIO)
- New: backend/toolbox/common/ (shared storage activities)
- Deleted: docker-compose.yaml (old Prefect setup)
- Deleted: backend/src/core/prefect_manager.py
- Deleted: backend/src/services/prefect_stats_monitor.py
- Deleted: Docker registry and insecure-registries requirement

## Workflows
- Migrated: security_assessment workflow to Temporal
- New: rust_test workflow (example/test workflow)
- Deleted: secret_detection_scan (Prefect-based, to be reimplemented)
- Activities now co-located with workflows for independent testing

## API Changes
- Updated: backend/src/api/workflows.py (Temporal submission)
- Updated: backend/src/api/runs.py (Temporal status/results)
- Updated: backend/src/main.py (727 lines, TemporalManager integration)
- Updated: All 16 MCP tools to use TemporalManager

## Testing
- All services healthy (Temporal, PostgreSQL, MinIO, workers, backend)
- All API endpoints functional
- End-to-end workflow test passed (72 findings from vulnerable_app)
- MinIO storage integration working (target upload/download, results)
- Worker activity discovery working (6 activities registered)
- Tarball extraction working
- SARIF report generation working

## Documentation
- ARCHITECTURE.md: Complete Temporal architecture documentation
- QUICKSTART_TEMPORAL.md: Getting started guide
- MIGRATION_DECISION.md: Why we chose Temporal over Prefect
- IMPLEMENTATION_STATUS.md: Migration progress tracking
- workers/README.md: Worker development guide

## Dependencies
- Added: temporalio>=1.6.0
- Added: boto3>=1.34.0 (MinIO S3 client)
- Removed: prefect>=3.4.18

* feat: Add Python fuzzing vertical with Atheris integration

This commit implements a complete Python fuzzing workflow using Atheris:

## Python Worker (workers/python/)
- Dockerfile with Python 3.11, Atheris, and build tools
- Generic worker.py for dynamic workflow discovery
- requirements.txt with temporalio, boto3, atheris dependencies
- Added to docker-compose.temporal.yaml with dedicated cache volume

## AtherisFuzzer Module (backend/toolbox/modules/fuzzer/)
- Reusable module extending BaseModule
- Auto-discovers fuzz targets (fuzz_*.py, *_fuzz.py, fuzz_target.py)
- Recursive search to find targets in nested directories (see the sketch below)
- Dynamically loads TestOneInput() function
- Configurable max_iterations and timeout
- Real-time stats callback support for live monitoring
- Returns findings as ModuleFinding objects
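
A minimal sketch of that discovery step, assuming the glob patterns listed above (the helper name is hypothetical, not the module's actual API):

```python
from pathlib import Path

FUZZ_PATTERNS = ("fuzz_*.py", "*_fuzz.py", "fuzz_target.py")

def discover_fuzz_targets(root: Path) -> list[Path]:
    """Recursively collect files matching the fuzz-target naming patterns."""
    targets: list[Path] = []
    for pattern in FUZZ_PATTERNS:
        # rglob handles the recursive search into nested directories
        targets.extend(root.rglob(pattern))
    return sorted(set(targets))
```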

## Atheris Fuzzing Workflow (backend/toolbox/workflows/atheris_fuzzing/)
- Temporal workflow for orchestrating fuzzing
- Downloads user code from MinIO
- Executes AtherisFuzzer module
- Uploads results to MinIO
- Cleans up cache after execution
- metadata.yaml with vertical: python for routing

## Test Project (test_projects/python_fuzz_waterfall/)
- Demonstrates stateful waterfall vulnerability
- main.py with check_secret() that leaks progress
- fuzz_target.py with Atheris TestOneInput() harness
- Complete README with usage instructions
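
For illustration, a harness in the shape the test project describes; the exact signature of `check_secret` is an assumption:

```python
import sys

import atheris

with atheris.instrument_imports():
    from main import check_secret  # the test project's leaky checker (assumed signature)

def TestOneInput(data: bytes) -> None:
    fdp = atheris.FuzzedDataProvider(data)
    # Feed a bounded string guess; the leaked progress guides Atheris stage by stage
    check_secret(fdp.ConsumeUnicodeNoSurrogates(32))

if __name__ == "__main__":
    atheris.Setup(sys.argv, TestOneInput)
    atheris.Fuzz()
```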

## Backend Fixes
- Fixed parameter merging in REST API endpoints (workflows.py)
- Changed workflow parameter passing from positional args to kwargs (manager.py)
- Default parameters now properly merged with user parameters
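
The merge itself is plain dict precedence, sketched here with user values overriding defaults:

```python
def merge_parameters(defaults: dict, user_params: dict) -> dict:
    """Later keys win, so user-supplied values override workflow defaults."""
    return {**defaults, **user_params}

assert merge_parameters({"timeout": 600}, {"timeout": 120}) == {"timeout": 120}
```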

## Testing
- Worker discovered AtherisFuzzingWorkflow
- Workflow executed end-to-end successfully
- Fuzz target auto-discovered in nested directories
- Atheris ran 100,000 iterations
- Results uploaded and cache cleaned

* chore: Complete Temporal migration with updated CLI/SDK/docs

This commit includes all remaining Temporal migration changes:

## CLI Updates (cli/)
- Updated workflow execution commands for Temporal
- Enhanced error handling and exceptions
- Updated dependencies in uv.lock

## SDK Updates (sdk/)
- Client methods updated for Temporal workflows
- Updated models for new workflow execution
- Updated dependencies in uv.lock

## Documentation Updates (docs/)
- Architecture documentation for Temporal
- Workflow concept documentation
- Resource management documentation (new)
- Debugging guide (new)
- Updated tutorials and how-to guides
- Troubleshooting updates

## README Updates
- Main README with Temporal instructions
- Backend README
- CLI README
- SDK README

## Other
- Updated IMPLEMENTATION_STATUS.md
- Removed old vulnerable_app.tar.gz

These changes complete the Temporal migration and ensure the
CLI/SDK work correctly with the new backend.

* fix: Use positional args instead of kwargs for Temporal workflows

The Temporal Python SDK's start_workflow() method doesn't accept
a 'kwargs' parameter. Workflows must receive parameters as positional
arguments via the 'args' parameter.

Changed from passing a 'kwargs' keyword argument to:
  args=workflow_args  # Positional arguments

This fixes the error:
  TypeError: Client.start_workflow() got an unexpected keyword argument 'kwargs'

Workflows now correctly receive parameters in order:
- security_assessment: [target_id, scanner_config, analyzer_config, reporter_config]
- atheris_fuzzing: [target_id, target_file, max_iterations, timeout_seconds]
- rust_test: [target_id, test_message]
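
A sketch of the corrected submission call using the Temporal Python SDK's `args` keyword; the workflow ID and task-queue name are illustrative:

```python
from temporalio.client import Client

async def submit(client: Client, run_id: str, workflow_args: list):
    # Parameters go in positionally via `args`; start_workflow has no `kwargs` parameter
    return await client.start_workflow(
        "SecurityAssessmentWorkflow",
        args=workflow_args,  # e.g. [target_id, scanner_config, analyzer_config, reporter_config]
        id=run_id,
        task_queue="python-queue",  # illustrative queue name
    )
```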

* fix: Filter metadata-only parameters from workflow arguments

SecurityAssessmentWorkflow was receiving 7 arguments instead of 2-5.
The issue was that target_path and volume_mode from default_parameters
were being passed to the workflow, when they should only be used by
the system for configuration.

Now filters out metadata-only parameters (target_path, volume_mode)
before passing arguments to workflow execution.
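
A minimal sketch of that filtering, assuming the two metadata-only keys named above (the helper name is hypothetical):

```python
METADATA_ONLY_PARAMS = {"target_path", "volume_mode"}

def build_workflow_params(merged: dict) -> dict:
    """Drop system-only configuration keys before invoking the workflow."""
    return {k: v for k, v in merged.items() if k not in METADATA_ONLY_PARAMS}
```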

* refactor: Remove Prefect leftovers and volume mounting legacy

Complete cleanup of Prefect migration artifacts:

Backend:
- Delete registry.py and workflow_discovery.py (Prefect-specific files)
- Remove Docker validation from setup.py (no longer needed)
- Remove ResourceLimits and VolumeMount models
- Remove target_path and volume_mode from WorkflowSubmission
- Remove supported_volume_modes from API and discovery
- Clean up metadata.yaml files (remove volume/path fields)
- Simplify parameter filtering in manager.py

SDK:
- Remove volume_mode parameter from client methods
- Remove ResourceLimits and VolumeMount models
- Remove Prefect error patterns from docker_logs.py
- Clean up WorkflowSubmission and WorkflowMetadata models

CLI:
- Remove Volume Modes display from workflow info

All removed features are Prefect-specific or Docker volume mounting
artifacts. Temporal workflows use MinIO storage exclusively.

* feat: Add comprehensive test suite and benchmark infrastructure

- Add 68 unit tests for fuzzer, scanner, and analyzer modules
- Implement pytest-based test infrastructure with fixtures
- Add 6 performance benchmarks with category-specific thresholds
- Configure GitHub Actions for automated testing and benchmarking
- Add test and benchmark documentation

Test coverage:
- AtherisFuzzer: 8 tests
- CargoFuzzer: 14 tests
- FileScanner: 22 tests
- SecurityAnalyzer: 24 tests

All tests passing (68/68)
All benchmarks passing (6/6)

* fix: Resolve all ruff linting violations across codebase

Fixed 27 ruff violations in 12 files:
- Removed unused imports (Depends, Dict, Any, Optional, etc.)
- Fixed undefined workflow_info variable in workflows.py
- Removed dead code with undefined variables in atheris_fuzzer.py
- Changed f-string to regular string where no placeholders used

All files now pass ruff checks for CI/CD compliance.

* fix: Configure CI for unit tests only

- Renamed docker-compose.temporal.yaml → docker-compose.yml for CI compatibility
- Commented out integration-tests job (no integration tests yet)
- Updated test-summary to only depend on lint and unit-tests

CI will now run successfully with 68 unit tests. Integration tests can be added later.

* feat: Add CI/CD integration with ephemeral deployment model

Implements comprehensive CI/CD support for FuzzForge with on-demand worker management:

**Worker Management (v0.7.0)**
- Add WorkerManager for automatic worker lifecycle control
- Auto-start workers from stopped state when workflows execute
- Auto-stop workers after workflow completion
- Health checks and startup timeout handling (90s default)
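
A sketch of what that lifecycle control could look like with the Docker SDK for Python; the function name and health-wait loop are assumptions, not WorkerManager's actual code:

```python
import time

import docker  # Docker SDK for Python

def ensure_worker_running(name: str, timeout: int = 90) -> None:
    """Start a stopped worker container and wait for its healthcheck to pass."""
    client = docker.from_env()
    container = client.containers.get(name)
    if container.status != "running":
        container.start()
    deadline = time.time() + timeout
    while time.time() < deadline:
        container.reload()
        health = container.attrs["State"].get("Health", {}).get("Status")
        # Containers without a healthcheck report no Health block at all
        if container.status == "running" and health in (None, "healthy"):
            return
        time.sleep(2)
    raise TimeoutError(f"Worker {name} did not become healthy within {timeout}s")
```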

**CI/CD Features**
- `--fail-on` flag: Fail builds based on SARIF severity levels (error/warning/note/info); see the sketch after this list
- `--export-sarif` flag: Export findings in SARIF 2.1.0 format
- `--auto-start`/`--auto-stop` flags: Control worker lifecycle
- Exit code propagation: Returns 1 on blocking findings, 0 on success
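
A sketch of how a `--fail-on` gate over SARIF result levels might work; the severity ordering and helper name are assumptions (SARIF 2.1.0 defines `note`, `warning`, and `error`; `info` is treated here as a low level):

```python
import json

import typer

SEVERITY_ORDER = ["info", "note", "warning", "error"]  # ascending; assumed ordering

def fail_on_findings(sarif_path: str, threshold: str) -> None:
    """Raise typer.Exit(1) if any result meets or exceeds the threshold level."""
    with open(sarif_path) as f:
        sarif = json.load(f)
    floor = SEVERITY_ORDER.index(threshold)
    blocking = [
        result
        for run in sarif.get("runs", [])
        for result in run.get("results", [])
        if SEVERITY_ORDER.index(result.get("level", "warning")) >= floor
    ]
    if blocking:
        typer.echo(f"{len(blocking)} finding(s) at or above '{threshold}'")
        raise typer.Exit(code=1)
```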

**Exit Code Fix**
- Add `except typer.Exit: raise` handlers at 3 critical locations
- Move worker cleanup to finally block for guaranteed execution
- Exit codes now propagate correctly even when build fails
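
The pattern behind the fix, sketched with placeholder helpers standing in for the real command body:

```python
import typer

def start_worker_if_needed() -> bool:  # placeholder for WorkerManager auto-start
    return True

def stop_worker() -> None:  # placeholder for WorkerManager auto-stop
    pass

def execute_workflow() -> None:  # placeholder: submit, wait, apply --fail-on
    raise typer.Exit(code=1)  # e.g. blocking findings were found

def run_command() -> None:
    worker_started = False
    try:
        worker_started = start_worker_if_needed()
        execute_workflow()
    except typer.Exit:
        raise  # re-raise so a broad `except Exception` cannot swallow the exit code
    except Exception as exc:
        typer.echo(f"Error: {exc}", err=True)
        raise typer.Exit(code=1)
    finally:
        if worker_started:
            stop_worker()  # cleanup runs even when the build fails
```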

**CI Scripts & Examples**
- ci-start.sh: Start FuzzForge services with health checks
- ci-stop.sh: Clean shutdown with volume preservation option
- GitHub Actions workflow example (security-scan.yml)
- GitLab CI pipeline example (.gitlab-ci.example.yml)
- docker-compose.ci.yml: CI-optimized compose file with profiles

**OSS-Fuzz Integration**
- New ossfuzz_campaign workflow for running OSS-Fuzz projects
- OSS-Fuzz worker with Docker-in-Docker support
- Configurable campaign duration and project selection

**Documentation**
- Comprehensive CI/CD integration guide (docs/how-to/cicd-integration.md)
- Updated architecture docs with worker lifecycle details
- Updated workspace isolation documentation
- CLI README with worker management examples

**SDK Enhancements**
- Add get_workflow_worker_info() endpoint
- Worker vertical metadata in workflow responses

**Testing**
- All workflows tested: security_assessment, atheris_fuzzing, secret_detection, cargo_fuzzing
- All monitoring commands tested: stats, crashes, status, finding
- Full CI pipeline simulation verified
- Exit codes verified for success/failure scenarios

Ephemeral CI/CD model: ~3-4GB RAM, ~60-90s startup, runs entirely in CI containers.

* fix: Resolve ruff linting violations in CI/CD code

- Remove unused variables (run_id, defaults, result)
- Remove unused imports
- Fix f-string without placeholders

All CI/CD integration files now pass ruff checks.
Commit 60ca088ecf (parent 987c49569c) by tduhamel42, 2025-10-14 10:13:45 +02:00, committed by GitHub.
167 changed files with 26101 additions and 5703 deletions.


@@ -93,51 +93,61 @@ graph TB
### Orchestration Layer
- **Temporal Server:** Schedules and tracks workflows, backed by PostgreSQL.
- **Vertical Workers:** Long-lived workers pre-built with domain-specific toolchains (Android, Rust, Web, etc.). Can be scaled horizontally.
- **Task Queues:** Route workflows to appropriate vertical workers based on workflow metadata.
### Execution Layer
- **Vertical Workers:** Long-lived processes with pre-installed security tools for specific domains.
- **MinIO Storage:** S3-compatible storage for uploaded targets and results.
- **Worker Cache:** Local cache for downloaded targets, with LRU eviction.
### Storage Layer
- **PostgreSQL Database:** Stores Temporal workflow state and metadata.
- **MinIO (S3):** Persistent storage for uploaded targets and workflow results.
- **Worker Cache:** Local filesystem cache for downloaded targets with workspace isolation:
- **Isolated mode**: Each run gets `/cache/{target_id}/{run_id}/workspace/`
- **Shared mode**: All runs share `/cache/{target_id}/workspace/`
- **Copy-on-write mode**: Download once, copy per run
- **LRU eviction** when cache exceeds configured size
## How Does Data Flow Through the System?
### Submitting a Workflow
1. **User submits a workflow** via CLI or API client (with optional file upload).
2. **If file provided, API uploads** to MinIO and gets a `target_id`.
3. **API validates** the request and submits to Temporal.
4. **Temporal routes** the workflow to the appropriate vertical worker queue.
5. **Worker downloads target** from MinIO to local cache (if needed).
6. **Worker executes workflow** with pre-installed tools.
7. **Results are stored** in MinIO and metadata in PostgreSQL.
8. **Status updates** flow back through Temporal and the API to the user.
```mermaid
sequenceDiagram
participant User
participant API
participant MinIO
participant Temporal
participant Worker
participant Cache
User->>API: Submit workflow + file
API->>API: Validate parameters
API->>MinIO: Upload target file
MinIO-->>API: Return target_id
API->>Temporal: Submit workflow(target_id)
Temporal->>Worker: Route to vertical queue
Worker->>MinIO: Download target
MinIO-->>Worker: Stream file
Worker->>Cache: Store in local cache
Worker->>Worker: Execute security tools
Worker->>MinIO: Upload SARIF results
Worker->>Temporal: Update status
Temporal->>API: Workflow complete
API->>User: Return results
```
@@ -149,25 +159,27 @@ sequenceDiagram
## How Do Services Communicate?
- **Internally:** FastAPI talks to Temporal via gRPC; Temporal coordinates with workers over gRPC; workers access MinIO via S3 API. All core services use pooled connections to PostgreSQL.
- **Externally:** Users interact via CLI or API clients (HTTP REST).
## How Is Security Enforced?
- **Worker Isolation:** Each workflow runs in isolated vertical workers with pre-defined toolchains.
- **Storage Security:** Uploaded files stored in MinIO with lifecycle policies; read-only access by default.
- **API Security:** All endpoints validate inputs, enforce rate limits, and log requests for auditing.
- **No Host Access:** Workers access targets via MinIO, not host filesystem.
## How Does FuzzForge Scale?
- **Horizontally:** Add more vertical workers to handle more workflows in parallel. Scale specific worker types based on demand.
- **Vertically:** Adjust CPU and memory limits for workers and adjust concurrent activity limits.
Example Docker Compose scaling:
```yaml
services:
worker-rust:
deploy:
replicas: 3 # Scale rust workers
resources:
limits:
memory: 4G
@@ -179,21 +191,22 @@ services:
## How Is It Deployed?
- **Development:** All services run via Docker Compose—backend, Temporal, vertical workers, database, and MinIO.
- **Production:** Add load balancers, Temporal clustering, database replication, and multiple worker instances for high availability. Health checks, metrics, and centralized logging support monitoring and troubleshooting.
## How Is Configuration Managed?
- **Environment Variables:** Control core settings like database URLs, MinIO endpoints, and Temporal addresses.
- **Service Discovery:** Docker Compose's internal DNS lets services find each other by name, with consistent port mapping and health check endpoints.
Example configuration:
```bash
COMPOSE_PROJECT_NAME=fuzzforge
DATABASE_URL=postgresql://postgres:postgres@postgres:5432/fuzzforge
TEMPORAL_ADDRESS=temporal:7233
S3_ENDPOINT=http://minio:9000
S3_ACCESS_KEY=fuzzforge
S3_SECRET_KEY=fuzzforge123
```
## How Are Failures Handled?
@@ -203,9 +216,9 @@ DOCKER_INSECURE_REGISTRY=true
## Implementation Details
- **Tech Stack:** FastAPI (Python async), Temporal, MinIO, Docker, Docker Compose, PostgreSQL (asyncpg), and boto3 (S3 client).
- **Performance:** Workflows start immediately (workers are long-lived); results are retrieved quickly thanks to MinIO caching and database indexing.
- **Extensibility:** Add new workflows by mounting code; add new vertical workers with specialized toolchains; extend the API with new endpoints.
---


@@ -22,58 +22,62 @@ FuzzForge relies on Docker containers for several key reasons:
Every workflow in FuzzForge is executed inside a Docker container. Here's what that means in practice:
- **Vertical worker containers** are built from language-specific base images with domain-specific security toolchains pre-installed (Android, Rust, Web, etc.).
- **Infrastructure containers** (API server, Temporal, MinIO, database) use official images and are configured for the platform's needs.
### Worker Lifecycle: From Build to Long-Running
The lifecycle of a vertical worker looks like this:
1. **Image Build:** A Docker image is built with all required toolchains for the vertical.
2. **Worker Start:** The worker container starts as a long-lived process.
3. **Workflow Discovery:** Worker scans mounted `/app/toolbox` for workflows matching its vertical.
4. **Registration:** Workflows are registered with Temporal on the worker's task queue.
5. **Execution:** When a workflow is submitted, the worker downloads the target from MinIO and executes.
6. **Continuous Running:** Worker remains running, ready for the next workflow.
```mermaid
graph TB
Build[Build Worker Image] --> Start[Start Worker Container]
Start --> Mount[Mount Toolbox Volume]
Mount --> Discover[Discover Workflows]
Discover --> Register[Register with Temporal]
Register --> Ready[Worker Ready]
Ready --> Workflow[Workflow Submitted]
Workflow --> Download[Download Target from MinIO]
Download --> Execute[Execute Workflow]
Execute --> Upload[Upload Results to MinIO]
Upload --> Ready
```
---
## What's Inside a Vertical Worker Container?
A typical vertical worker container is structured like this:
- **Base Image:** Language-specific image (e.g., `python:3.11-slim`).
- **System Dependencies:** Installed as needed (e.g., `git`, `curl`).
- **Domain-Specific Toolchains:** Pre-installed (e.g., Rust: `AFL++`, `cargo-fuzz`; Android: `apktool`, `Frida`).
- **Temporal Python SDK:** For workflow execution.
- **Boto3:** For MinIO/S3 access.
- **Worker Script:** Discovers and registers workflows.
- **Non-root User:** Created for execution.
- **Entrypoint:** Runs the worker discovery and registration loop.
Example Dockerfile snippet for Rust worker:
```dockerfile
FROM python:3.11-slim
RUN apt-get update && apt-get install -y git curl build-essential && rm -rf /var/lib/apt/lists/*
# Install AFL++, cargo, etc.
RUN pip install temporalio boto3 pydantic
COPY worker.py /app/
WORKDIR /app
RUN useradd -m -u 1000 fuzzforge
USER fuzzforge
CMD ["python", "-m", "toolbox.main"]
# Toolbox will be mounted as volume at /app/toolbox
CMD ["python", "worker.py"]
```
---
@@ -102,37 +106,42 @@ networks:
### Volume Types
- **Toolbox Volume:** Mounts the workflow code directory, read-only, for dynamic discovery.
- **Worker Cache:** Local cache for downloaded MinIO targets, with LRU eviction.
- **MinIO Data:** Persistent storage for uploaded targets and results (S3-compatible).
Example volume mount:
```yaml
volumes:
- "/host/path/to/code:/app/target:ro"
- "fuzzforge_prefect_storage:/app/prefect"
- "./toolbox:/app/toolbox:ro" # Workflow code
- "worker_cache:/cache" # Local cache
- "minio_data:/data" # MinIO storage
```
### Volume Security
- **Read-only Toolbox:** Workflows cannot modify the mounted toolbox code.
- **Isolated Storage:** Each workflow's target is stored with a unique `target_id` in MinIO.
- **No Host Filesystem Access:** Workers access targets via MinIO, not host paths.
- **Automatic Cleanup:** MinIO lifecycle policies delete old targets after 7 days.
---
## How Are Worker Images Built and Managed?
- **Automated Builds:** Vertical worker images are built with specialized toolchains.
- **Build Optimization:** Use layer caching, multi-stage builds, and minimal base images.
- **Versioning:** Use tags (`latest`, semantic versions) to track worker images.
- **Long-Lived:** Workers run continuously, not ephemeral per-workflow.
Example build:
```bash
cd workers/rust
docker build -t fuzzforge-worker-rust:latest .
# Or via docker-compose
docker-compose -f docker-compose.temporal.yaml build worker-rust
```
---
@@ -147,7 +156,7 @@ Example resource config:
```yaml
services:
worker-rust:
deploy:
resources:
limits:
@@ -156,6 +165,8 @@ services:
reservations:
memory: 1G
cpus: '0.5'
environment:
MAX_CONCURRENT_ACTIVITIES: 5
```
---
@@ -172,7 +183,7 @@ Example security options:
```yaml
services:
worker-rust:
security_opt:
- no-new-privileges:true
cap_drop:
@@ -188,8 +199,9 @@ services:
## How Is Performance Optimized?
- **Image Layering:** Structure Dockerfiles for efficient caching.
- **Pre-installed Toolchains:** All tools installed in worker image, zero setup time per workflow.
- **Long-Lived Workers:** Eliminate container startup overhead entirely.
- **Local Caching:** MinIO targets cached locally for repeated workflows.
- **Horizontal Scaling:** Scale worker containers to handle more workflows in parallel.
---
@@ -205,10 +217,10 @@ services:
## How Does This All Fit Into FuzzForge?
- **Temporal Workers:** Long-lived vertical workers execute workflows with pre-installed toolchains.
- **API Integration:** Exposes workflow status, logs, and resource metrics via Temporal.
- **MinIO Storage:** Ensures targets and results are stored, cached, and cleaned up automatically.
- **Security and Resource Controls:** Enforced automatically for every worker and workflow.
---


@@ -0,0 +1,594 @@
# Resource Management in FuzzForge
FuzzForge uses a multi-layered approach to manage CPU, memory, and concurrency for workflow execution. This ensures stable operation, prevents resource exhaustion, and allows predictable performance.
---
## Overview
Resource limiting in FuzzForge operates at three levels:
1. **Docker Container Limits** (Primary Enforcement) - Hard limits enforced by Docker
2. **Worker Concurrency Limits** - Controls parallel workflow execution
3. **Workflow Metadata** (Advisory) - Documents resource requirements
---
## Worker Lifecycle Management (On-Demand Startup)
**New in v0.7.0**: Workers now support on-demand startup/shutdown for optimal resource usage.
### Architecture
Workers are **pre-built** but **not auto-started**:
```
┌─────────────┐
│ docker-     │  Pre-built worker images
│ compose     │  with profiles: ["workers", "ossfuzz"]
│ build       │  restart: "no"
└─────────────┘
       ↓
┌─────────────┐
│ Workers     │  Status: Exited (not running)
│ Pre-built   │  RAM Usage: 0 MB
└─────────────┘
       ↓
┌─────────────┐
│ ff workflow │  CLI detects required worker
│ run         │  via /workflows/{name}/worker-info API
└─────────────┘
       ↓
┌─────────────┐
│ docker      │  docker start fuzzforge-worker-ossfuzz
│ start       │  Wait for healthy status
└─────────────┘
       ↓
┌─────────────┐
│ Worker      │  Status: Up
│ Running     │  RAM Usage: ~1-2 GB
└─────────────┘
```
### Resource Savings
| State | Services Running | RAM Usage |
|-------|-----------------|-----------|
| **Idle** (no workflows) | Temporal, PostgreSQL, MinIO, Backend | ~1.2 GB |
| **Active** (1 workflow) | Core + 1 worker | ~3-5 GB |
| **Legacy** (all workers) | Core + all 5 workers | ~8 GB |
**Savings: ~6-7GB RAM when idle**
### Configuration
Control via `.fuzzforge/config.yaml`:
```yaml
workers:
auto_start_workers: true # Auto-start when needed
auto_stop_workers: false # Auto-stop after completion
worker_startup_timeout: 60 # Startup timeout (seconds)
docker_compose_file: null # Custom compose file path
```
Or via CLI flags:
```bash
# Auto-start disabled
ff workflow run ossfuzz_campaign . --no-auto-start
# Auto-stop enabled
ff workflow run ossfuzz_campaign . --wait --auto-stop
```
### Backend API
New endpoint: `GET /workflows/{workflow_name}/worker-info`
**Response**:
```json
{
"workflow": "ossfuzz_campaign",
"vertical": "ossfuzz",
"worker_container": "fuzzforge-worker-ossfuzz",
"task_queue": "ossfuzz-queue",
"required": true
}
```
### SDK Integration
```python
from fuzzforge_sdk import FuzzForgeClient
client = FuzzForgeClient()
worker_info = client.get_workflow_worker_info("ossfuzz_campaign")
# Returns: {"vertical": "ossfuzz", "worker_container": "fuzzforge-worker-ossfuzz", ...}
```
### Manual Control
```bash
# Start worker manually
docker start fuzzforge-worker-ossfuzz
# Stop worker manually
docker stop fuzzforge-worker-ossfuzz
# Check all worker statuses
docker ps -a --filter "name=fuzzforge-worker"
```
---
## Level 1: Docker Container Limits (Primary)
Docker container limits are the **primary enforcement mechanism** for CPU and memory resources. These are configured in `docker-compose.temporal.yaml` and enforced by the Docker runtime.
### Configuration
```yaml
services:
worker-rust:
deploy:
resources:
limits:
cpus: '2.0' # Maximum 2 CPU cores
memory: 2G # Maximum 2GB RAM
reservations:
cpus: '0.5' # Minimum 0.5 CPU cores reserved
memory: 512M # Minimum 512MB RAM reserved
```
### How It Works
- **CPU Limit**: Docker throttles CPU usage when the container exceeds the limit
- **Memory Limit**: Docker kills the container (OOM) if it exceeds the memory limit
- **Reservations**: Guarantees minimum resources are available to the worker
### Example Configuration by Vertical
Different verticals have different resource needs:
**Rust Worker** (CPU-intensive fuzzing):
```yaml
worker-rust:
deploy:
resources:
limits:
cpus: '4.0'
memory: 4G
```
**Android Worker** (Memory-intensive emulation):
```yaml
worker-android:
deploy:
resources:
limits:
cpus: '2.0'
memory: 8G
```
**Web Worker** (Lightweight analysis):
```yaml
worker-web:
deploy:
resources:
limits:
cpus: '1.0'
memory: 1G
```
### Monitoring Container Resources
Check real-time resource usage:
```bash
# Monitor all workers
docker stats
# Monitor specific worker
docker stats fuzzforge-worker-rust
# Output:
# CONTAINER CPU % MEM USAGE / LIMIT MEM %
# fuzzforge-worker-rust 85% 1.5GiB / 2GiB 75%
```
---
## Level 2: Worker Concurrency Limits
The `MAX_CONCURRENT_ACTIVITIES` environment variable controls how many workflows can execute **simultaneously** on a single worker.
### Configuration
```yaml
services:
worker-rust:
environment:
MAX_CONCURRENT_ACTIVITIES: 5
deploy:
resources:
limits:
memory: 2G
```
### How It Works
- **Total Container Memory**: 2GB
- **Concurrent Workflows**: 5
- **Memory per Workflow**: ~400MB (2GB ÷ 5)
If a 6th workflow is submitted, it **waits in the Temporal queue** until one of the 5 running workflows completes.
### Calculating Concurrency
Use this formula to determine `MAX_CONCURRENT_ACTIVITIES`:
```
MAX_CONCURRENT_ACTIVITIES = Container Memory Limit / Estimated Workflow Memory
```
**Example:**
- Container limit: 4GB
- Workflow memory: ~800MB
- Concurrency: 4GB ÷ 800MB = **5 concurrent workflows**
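For reference, a sketch of how a worker process might apply this setting with the Temporal Python SDK; the queue name and discovery lists are assumptions, not the actual worker.py:
```python
import os

from temporalio.client import Client
from temporalio.worker import Worker

discovered_workflows: list = []   # assumed: filled by the discovery step (must be non-empty)
discovered_activities: list = []  # assumed

async def run_worker() -> None:
    client = await Client.connect(os.environ.get("TEMPORAL_ADDRESS", "temporal:7233"))
    worker = Worker(
        client,
        task_queue="rust",  # illustrative queue name
        workflows=discovered_workflows,
        activities=discovered_activities,
        # Caps parallel activity executions; excess workflows wait in the Temporal queue
        max_concurrent_activities=int(os.environ.get("MAX_CONCURRENT_ACTIVITIES", "5")),
    )
    await worker.run()
```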
### Configuration Examples
**High Concurrency (Lightweight Workflows)**:
```yaml
worker-web:
environment:
MAX_CONCURRENT_ACTIVITIES: 10 # Many small workflows
deploy:
resources:
limits:
memory: 2G # ~200MB per workflow
```
**Low Concurrency (Heavy Workflows)**:
```yaml
worker-rust:
environment:
MAX_CONCURRENT_ACTIVITIES: 2 # Few large workflows
deploy:
resources:
limits:
memory: 4G # ~2GB per workflow
```
### Monitoring Concurrency
Check how many workflows are running:
```bash
# View worker logs
docker-compose -f docker-compose.temporal.yaml logs worker-rust | grep "Starting"
# Check Temporal UI
# Open http://localhost:8233
# Navigate to "Task Queues" → "rust" → See pending/running counts
```
---
## Level 3: Workflow Metadata (Advisory)
Workflow metadata in `metadata.yaml` documents resource requirements, but these are **advisory only** (except for timeout).
### Configuration
```yaml
# backend/toolbox/workflows/security_assessment/metadata.yaml
requirements:
resources:
memory: "512Mi" # Estimated memory usage (advisory)
cpu: "500m" # Estimated CPU usage (advisory)
timeout: 1800 # Execution timeout in seconds (ENFORCED)
```
### What's Enforced vs Advisory
| Field | Enforcement | Description |
|-------|-------------|-------------|
| `timeout` | ✅ **Enforced by Temporal** | Workflow killed if exceeds timeout |
| `memory` | ⚠️ Advisory only | Documents expected memory usage |
| `cpu` | ⚠️ Advisory only | Documents expected CPU usage |
### Why Metadata Is Useful
Even though `memory` and `cpu` are advisory, they're valuable for:
1. **Capacity Planning**: Determine appropriate container limits
2. **Concurrency Tuning**: Calculate `MAX_CONCURRENT_ACTIVITIES`
3. **Documentation**: Communicate resource needs to users
4. **Scheduling Hints**: Future horizontal scaling logic
### Timeout Enforcement
The `timeout` field is **enforced by Temporal**:
```python
# Temporal automatically cancels workflow after timeout
@workflow.defn
class SecurityAssessmentWorkflow:
@workflow.run
async def run(self, target_id: str):
# If this takes longer than metadata.timeout (1800s),
# Temporal will cancel the workflow
...
```
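On the client side, one plausible place the metadata timeout is applied is the `execution_timeout` argument to `start_workflow` (workflow name and queue are illustrative):
```python
from datetime import timedelta

from temporalio.client import Client

async def start_with_timeout(client: Client, target_id: str, timeout_s: int):
    # timeout_s would be read from metadata.yaml (e.g. 1800)
    return await client.start_workflow(
        "SecurityAssessmentWorkflow",
        args=[target_id],
        id=f"security-assessment-{target_id}",
        task_queue="python-queue",  # illustrative
        execution_timeout=timedelta(seconds=timeout_s),  # enforced by Temporal
    )
```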
**Check timeout in Temporal UI:**
1. Open http://localhost:8233
2. Navigate to workflow execution
3. See "Timeout" in workflow details
4. If exceeded, status shows "TIMED_OUT"
---
## Resource Management Best Practices
### 1. Set Conservative Container Limits
Start with lower limits and increase based on actual usage:
```yaml
# Start conservative
worker-rust:
deploy:
resources:
limits:
cpus: '2.0'
memory: 2G
# Monitor with: docker stats
# Increase if consistently hitting limits
```
### 2. Calculate Concurrency from Profiling
Profile a single workflow first:
```bash
# Run single workflow and monitor
docker stats fuzzforge-worker-rust
# Note peak memory usage (e.g., 800MB)
# Calculate concurrency: 4GB ÷ 800MB = 5
```
### 3. Set Realistic Timeouts
Base timeouts on actual workflow duration:
```yaml
# Static analysis: 5-10 minutes
timeout: 600
# Fuzzing: 1-24 hours
timeout: 86400
# Quick scans: 1-2 minutes
timeout: 120
```
### 4. Monitor Resource Exhaustion
Watch for these warning signs:
```bash
# Check for OOM kills
docker-compose -f docker-compose.temporal.yaml logs worker-rust | grep -i "oom\|killed"
# Check for CPU throttling
docker stats fuzzforge-worker-rust
# If CPU% consistently at limit → increase cpus
# Check for memory pressure
docker stats fuzzforge-worker-rust
# If MEM% consistently >90% → increase memory
```
### 5. Use Vertical-Specific Configuration
Different verticals have different needs:
| Vertical | CPU Priority | Memory Priority | Typical Config |
|----------|--------------|-----------------|----------------|
| Rust Fuzzing | High | Medium | 4 CPUs, 4GB RAM |
| Android Analysis | Medium | High | 2 CPUs, 8GB RAM |
| Web Scanning | Low | Low | 1 CPU, 1GB RAM |
| Static Analysis | Medium | Medium | 2 CPUs, 2GB RAM |
---
## Horizontal Scaling
To handle more workflows, scale worker containers horizontally:
```bash
# Scale rust worker to 3 instances
docker-compose -f docker-compose.temporal.yaml up -d --scale worker-rust=3
# Now you can run:
# - 3 workers × 5 concurrent activities = 15 workflows simultaneously
```
**How it works:**
- Temporal load balances across all workers on the same task queue
- Each worker has independent resource limits
- No shared state between workers
---
## Troubleshooting Resource Issues
### Issue: Workflows Stuck in "Running" State
**Symptom:** Workflow shows RUNNING but makes no progress
**Diagnosis:**
```bash
# Check worker is alive
docker-compose -f docker-compose.temporal.yaml ps worker-rust
# Check worker resource usage
docker stats fuzzforge-worker-rust
# Check for OOM kills
docker-compose -f docker-compose.temporal.yaml logs worker-rust | grep -i oom
```
**Solution:**
- Increase memory limit if worker was killed
- Reduce `MAX_CONCURRENT_ACTIVITIES` if overloaded
- Check worker logs for errors
### Issue: "Too Many Pending Tasks"
**Symptom:** Temporal shows many queued workflows
**Diagnosis:**
```bash
# Check concurrent activities setting
docker exec fuzzforge-worker-rust env | grep MAX_CONCURRENT_ACTIVITIES
# Check current workload
docker-compose -f docker-compose.temporal.yaml logs worker-rust | grep "Starting"
```
**Solution:**
- Increase `MAX_CONCURRENT_ACTIVITIES` if resources allow
- Add more worker instances (horizontal scaling)
- Increase container resource limits
### Issue: Workflow Timeout
**Symptom:** Workflow shows "TIMED_OUT" in Temporal UI
**Diagnosis:**
1. Check `metadata.yaml` timeout setting
2. Check Temporal UI for execution duration
3. Determine if timeout is appropriate
**Solution:**
```yaml
# Increase timeout in metadata.yaml
requirements:
resources:
timeout: 3600 # Increased from 1800
```
---
## Workspace Isolation and Cache Management
FuzzForge uses workspace isolation to prevent concurrent workflows from interfering with each other. Each workflow run can have its own isolated workspace or share a common workspace based on the isolation mode.
### Cache Directory Structure
Workers cache downloaded targets locally to avoid repeated downloads:
```
/cache/
├── {target_id_1}/
│ ├── {run_id_1}/ # Isolated mode
│ │ ├── target # Downloaded tarball
│ │ └── workspace/ # Extracted files
│ ├── {run_id_2}/
│ │ ├── target
│ │ └── workspace/
│ └── workspace/ # Shared mode (no run_id)
│ └── ...
├── {target_id_2}/
│ └── shared/ # Copy-on-write shared download
│ ├── target
│ └── workspace/
```
### Isolation Modes
**Isolated Mode** (default for fuzzing):
- Each run gets `/cache/{target_id}/{run_id}/workspace/`
- Safe for concurrent execution
- Cleanup removes entire run directory
**Shared Mode** (for read-only workflows):
- All runs share `/cache/{target_id}/workspace/`
- Efficient (downloads once)
- No cleanup (cache persists)
**Copy-on-Write Mode**:
- Downloads to `/cache/{target_id}/shared/`
- Copies to `/cache/{target_id}/{run_id}/` per run
- Balances performance and isolation
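A sketch of the copy-on-write preparation under the cache layout above; the helper is hypothetical:
```python
import shutil
from pathlib import Path

def prepare_cow_workspace(target_id: str, run_id: str,
                          cache_root: Path = Path("/cache")) -> Path:
    """Copy the shared download into a run-private workspace."""
    shared = cache_root / target_id / "shared" / "workspace"
    run_ws = cache_root / target_id / run_id / "workspace"
    if not run_ws.exists():
        # One download into shared/, then one local copy per run
        shutil.copytree(shared, run_ws)
    return run_ws
```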
### Cache Limits
Configure cache limits via environment variables:
```yaml
worker-rust:
environment:
CACHE_DIR: /cache
CACHE_MAX_SIZE: 10GB # Maximum cache size before LRU eviction
CACHE_TTL: 7d # Time-to-live for cached files
```
### LRU Eviction
When cache exceeds `CACHE_MAX_SIZE`, the least-recently-used files are automatically evicted:
1. Worker tracks last access time for each cached target
2. When cache is full, oldest accessed files are removed first
3. Eviction runs periodically (every 30 minutes)
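A simplified sketch of such an eviction pass (access-time based, per target directory); the worker's real implementation may differ:
```python
import shutil
from pathlib import Path

def evict_lru(cache_root: Path, max_bytes: int) -> None:
    """Remove least-recently-accessed target directories until under the cap."""
    def cache_size() -> int:
        return sum(f.stat().st_size for f in cache_root.rglob("*") if f.is_file())

    entries = [p for p in cache_root.iterdir() if p.is_dir()]
    entries.sort(key=lambda p: p.stat().st_atime)  # oldest access first
    for entry in entries:
        if cache_size() <= max_bytes:
            break
        shutil.rmtree(entry, ignore_errors=True)
```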
### Monitoring Cache Usage
Check cache size and cleanup logs:
```bash
# Check cache size
docker exec fuzzforge-worker-rust du -sh /cache
# Monitor cache evictions
docker-compose -f docker-compose.temporal.yaml logs worker-rust | grep "Evicted from cache"
# Check download vs cache hit rate
docker-compose -f docker-compose.temporal.yaml logs worker-rust | grep -E "Cache (HIT|MISS)"
```
See the [Workspace Isolation](/concept/workspace-isolation) guide for complete details on isolation modes and when to use each.
---
## Summary
FuzzForge's resource management strategy:
1. **Docker Container Limits**: Primary enforcement (CPU/memory hard limits)
2. **Concurrency Limits**: Controls parallel workflows per worker
3. **Workflow Metadata**: Advisory resource hints + enforced timeout
4. **Workspace Isolation**: Controls cache sharing and cleanup behavior
**Key Takeaways:**
- Set conservative Docker limits and adjust based on monitoring
- Calculate `MAX_CONCURRENT_ACTIVITIES` from container memory ÷ workflow memory
- Use `docker stats` and Temporal UI to monitor resource usage
- Scale horizontally by adding more worker instances
- Set realistic timeouts based on actual workflow duration
- Choose appropriate isolation mode (isolated for fuzzing, shared for analysis)
- Monitor cache usage and adjust `CACHE_MAX_SIZE` as needed
---
**Next Steps:**
- Review `docker-compose.temporal.yaml` resource configuration
- Profile your workflows to determine actual resource usage
- Adjust limits based on monitoring data
- Set up alerts for resource exhaustion


@@ -25,30 +25,31 @@ Here's how a workflow moves through the FuzzForge system:
```mermaid
graph TB
User[User/CLI/API] --> API[FuzzForge API]
API --> MinIO[MinIO Storage]
API --> Temporal[Temporal Orchestrator]
Temporal --> Worker[Vertical Worker]
Worker --> MinIO
Worker --> Tools[Security Tools]
Tools --> Results[SARIF Results]
Results --> MinIO
```
**Key roles:**
- **User/CLI/API:** Submits workflows and uploads files.
- **FuzzForge API:** Validates, uploads targets, and tracks workflows.
- **Temporal Orchestrator:** Schedules and manages workflow execution.
- **Vertical Worker:** Long-lived worker with pre-installed security tools.
- **MinIO Storage:** Stores uploaded targets and results.
- **Security Tools:** Perform the actual analysis.
---
## Workflow Lifecycle: From Idea to Results
1. **Design:** Choose tools, define integration logic, set up parameters, and specify the vertical worker.
2. **Deployment:** Create workflow code, add metadata with `vertical` field, mount as volume in worker.
3. **Execution:** User submits a workflow with file upload; file is stored in MinIO; workflow is routed to vertical worker; worker downloads target and executes; tools run as designed.
4. **Completion:** Results are collected, normalized, and stored in MinIO; status is updated; MinIO lifecycle policies clean up old files; results are made available via API/CLI.
---
@@ -85,25 +86,25 @@ FuzzForge supports several workflow types, each optimized for a specific securit
## Data Flow and Storage
- **Input:** Target files uploaded via HTTP to MinIO; parameters validated and passed to Temporal.
- **Processing:** Worker downloads target from MinIO to local cache; tools are initialized and run (often in parallel); outputs are collected and normalized.
- **Output:** Results are stored in MinIO and indexed for fast retrieval; metadata is saved in PostgreSQL; targets cached locally for repeated workflows; lifecycle policies clean up after 7 days.
---
## Error Handling and Recovery
- **Tool-Level:** Timeouts, resource exhaustion, and crashes are handled gracefully; failed tools don't stop the workflow.
- **Workflow-Level:** Worker failures, storage issues, and network problems are detected and reported by Temporal.
- **Recovery:** Automatic retries for transient errors via Temporal; partial results are returned when possible; workflows degrade gracefully if some tools are unavailable; MinIO ensures targets remain accessible.
---
## Performance and Optimization
- **Worker Efficiency:** Long-lived workers eliminate container startup overhead; pre-installed toolchains reduce setup time.
- **Parallel Processing:** Independent tools run concurrently to maximize CPU usage and minimize wait times.
- **Caching:** MinIO targets are cached locally; repeated workflows reuse cached targets; worker cache uses LRU eviction.
---


@@ -0,0 +1,378 @@
# Workspace Isolation
FuzzForge's workspace isolation system ensures that concurrent workflow runs don't interfere with each other. This is critical for fuzzing and security analysis workloads where multiple workflows might process the same target simultaneously.
---
## Why Workspace Isolation?
### The Problem
Without isolation, concurrent workflows accessing the same target would share the same cache directory:
```
/cache/{target_id}/workspace/
```
This causes problems when:
- **Fuzzing workflows** modify corpus files and crash artifacts
- **Multiple runs** operate on the same target simultaneously
- **File conflicts** occur during read/write operations
### The Solution
FuzzForge implements configurable workspace isolation with three modes:
1. **isolated** (default): Each run gets its own workspace
2. **shared**: All runs share the same workspace
3. **copy-on-write**: Download once, copy per run
---
## Isolation Modes
### Isolated Mode (Default)
**Use for**: Fuzzing workflows, any workflow that modifies files
**Cache path**: `/cache/{target_id}/{run_id}/workspace/`
Each workflow run gets a completely isolated workspace directory. The target is downloaded to a run-specific path using the unique `run_id`.
**Advantages:**
- ✅ Safe for concurrent execution
- ✅ No file conflicts
- ✅ Clean per-run state
**Disadvantages:**
- ⚠️ Downloads target for each run (higher bandwidth/storage)
- ⚠️ No sharing of downloaded artifacts
**Example workflows:**
- `atheris_fuzzing` - Modifies corpus, creates crash files
- `cargo_fuzzing` - Modifies corpus, generates artifacts
**metadata.yaml:**
```yaml
name: atheris_fuzzing
workspace_isolation: "isolated"
```
**Cleanup behavior:**
Entire run directory `/cache/{target_id}/{run_id}/` is removed after workflow completes.
---
### Shared Mode
**Use for**: Read-only analysis workflows, security scanners
**Cache path**: `/cache/{target_id}/workspace/`
All workflow runs for the same target share a single workspace directory. The target is downloaded once and reused across runs.
**Advantages:**
- ✅ Efficient (download once, use many times)
- ✅ Lower bandwidth and storage usage
- ✅ Faster startup (cache hit after first download)
**Disadvantages:**
- ⚠️ Not safe for workflows that modify files
- ⚠️ Potential race conditions if workflows write
**Example workflows:**
- `security_assessment` - Read-only file scanning and analysis
- `secret_detection` - Read-only secret scanning
**metadata.yaml:**
```yaml
name: security_assessment
workspace_isolation: "shared"
```
**Cleanup behavior:**
No cleanup (workspace shared across runs). Cache persists until LRU eviction.
---
### Copy-on-Write Mode
**Use for**: Workflows that need isolation but benefit from shared initial download
**Cache paths**:
- Shared download: `/cache/{target_id}/shared/target`
- Per-run copy: `/cache/{target_id}/{run_id}/workspace/`
Target is downloaded once to a shared location, then copied for each run.
**Advantages:**
- ✅ Download once (shared bandwidth)
- ✅ Isolated per-run workspace (safe for modifications)
- ✅ Balances performance and safety
**Disadvantages:**
- ⚠️ Copy overhead (disk I/O per run)
- ⚠️ Higher storage usage than shared mode
**metadata.yaml:**
```yaml
name: my_workflow
workspace_isolation: "copy-on-write"
```
**Cleanup behavior:**
Run-specific copies removed, shared download persists until LRU eviction.
---
## How It Works
### Activity Signature
The `get_target` activity accepts isolation parameters:
```python
from temporalio import workflow
# In your workflow
target_path = await workflow.execute_activity(
"get_target",
args=[target_id, run_id, "isolated"], # target_id, run_id, workspace_isolation
start_to_close_timeout=timedelta(minutes=5)
)
```
### Path Resolution
Based on the isolation mode:
```python
# Isolated mode
if workspace_isolation == "isolated":
cache_path = f"/cache/{target_id}/{run_id}/"
# Shared mode
elif workspace_isolation == "shared":
cache_path = f"/cache/{target_id}/"
# Copy-on-write mode
else: # copy-on-write
shared_path = f"/cache/{target_id}/shared/"
cache_path = f"/cache/{target_id}/{run_id}/"
# Download to shared_path, copy to cache_path
```
### Cleanup
The `cleanup_cache` activity respects isolation mode:
```python
await workflow.execute_activity(
"cleanup_cache",
args=[target_path, "isolated"], # target_path, workspace_isolation
start_to_close_timeout=timedelta(minutes=1)
)
```
**Cleanup behavior by mode:**
- `isolated`: Removes `/cache/{target_id}/{run_id}/` entirely
- `shared`: Skips cleanup (shared across runs)
- `copy-on-write`: Removes run directory, keeps shared cache
---
## Cache Management
### Cache Directory Structure
```
/cache/
├── {target_id_1}/
│ ├── {run_id_1}/
│ │ ├── target # Downloaded tarball
│ │ └── workspace/ # Extracted files
│ ├── {run_id_2}/
│ │ ├── target
│ │ └── workspace/
│ └── workspace/ # Shared mode (no run_id subdirectory)
│ └── ...
├── {target_id_2}/
│ └── shared/
│ ├── target # Copy-on-write shared download
│ └── workspace/
```
### LRU Eviction
When cache exceeds the configured limit (default: 10GB), least-recently-used files are evicted automatically.
**Configuration:**
```yaml
# In worker environment
CACHE_DIR: /cache
CACHE_MAX_SIZE: 10GB
CACHE_TTL: 7d
```
**Eviction policy:**
- Tracks last access time for each cached target
- When cache is full, removes oldest accessed files first
- Cleanup runs periodically (every 30 minutes)
---
## Choosing the Right Mode
### Decision Matrix
| Workflow Type | Modifies Files? | Concurrent Runs? | Recommended Mode |
|---------------|----------------|------------------|------------------|
| Fuzzing (AFL, libFuzzer, Atheris) | ✅ Yes | ✅ Yes | **isolated** |
| Static Analysis | ❌ No | ✅ Yes | **shared** |
| Secret Scanning | ❌ No | ✅ Yes | **shared** |
| File Modification | ✅ Yes | ❌ No | **isolated** |
| Large Downloads | ❌ No | ✅ Yes | **copy-on-write** |
### Guidelines
**Use `isolated` when:**
- Workflow modifies files (corpus, crashes, logs)
- Fuzzing or dynamic analysis
- Concurrent runs must not interfere
**Use `shared` when:**
- Workflow only reads files
- Static analysis or scanning
- Want to minimize bandwidth/storage
**Use `copy-on-write` when:**
- Workflow modifies files but target is large (>100MB)
- Want isolation but minimize download overhead
- Balance between shared and isolated
---
## Configuration
### In Workflow Metadata
Document the isolation mode in `metadata.yaml`:
```yaml
name: atheris_fuzzing
version: "1.0.0"
vertical: python
# Workspace isolation mode
# - "isolated" (default): Each run gets own workspace
# - "shared": All runs share workspace (read-only workflows)
# - "copy-on-write": Download once, copy per run
workspace_isolation: "isolated"
```
### In Workflow Code
Pass isolation mode to storage activities:
```python
@workflow.defn
class MyWorkflow:
@workflow.run
async def run(self, target_id: str) -> Dict[str, Any]:
# Get run ID for isolation
run_id = workflow.info().run_id
# Download target with isolation
target_path = await workflow.execute_activity(
"get_target",
args=[target_id, run_id, "isolated"],
start_to_close_timeout=timedelta(minutes=5)
)
# ... workflow logic ...
# Cleanup with same isolation mode
await workflow.execute_activity(
"cleanup_cache",
args=[target_path, "isolated"],
start_to_close_timeout=timedelta(minutes=1)
)
```
---
## Troubleshooting
### Issue: Workflows interfere with each other
**Symptom:** Fuzzing crashes from one run appear in another
**Diagnosis:**
```bash
# Check workspace paths in logs
docker logs fuzzforge-worker-python | grep "User code downloaded"
# Should see run-specific paths:
# ✅ /cache/abc-123/run-xyz-456/workspace (isolated)
# ❌ /cache/abc-123/workspace (shared - problem for fuzzing)
```
**Solution:** Change `workspace_isolation` to `"isolated"` in metadata.yaml
### Issue: High bandwidth usage
**Symptom:** Target downloaded repeatedly for same target_id
**Diagnosis:**
```bash
# Check MinIO downloads in logs
docker logs fuzzforge-worker-python | grep "downloading from MinIO"
# If many downloads for same target_id with shared workflow:
# Problem is using "isolated" mode for read-only workflow
```
**Solution:** Change to `"shared"` mode for read-only workflows
### Issue: Cache fills up quickly
**Symptom:** Disk space consumed by /cache directory
**Diagnosis:**
```bash
# Check cache size
docker exec fuzzforge-worker-python du -sh /cache
# Check LRU settings
docker exec fuzzforge-worker-python env | grep CACHE
```
**Solution:**
- Increase `CACHE_MAX_SIZE` environment variable
- Use `shared` mode for read-only workflows
- Decrease `CACHE_TTL` for faster eviction
---
## Summary
FuzzForge's workspace isolation system provides:
1. **Safe concurrent execution** for fuzzing and analysis workflows
2. **Three isolation modes** to balance safety vs efficiency
3. **Automatic cache management** with LRU eviction
4. **Per-workflow configuration** via metadata.yaml
**Key Takeaways:**
- Use `isolated` (default) for workflows that modify files
- Use `shared` for read-only analysis workflows
- Use `copy-on-write` to balance isolation and bandwidth
- Configure via `workspace_isolation` field in metadata.yaml
- Workers automatically handle download, extraction, and cleanup
---
**Next Steps:**
- Review your workflows and set appropriate isolation modes
- Monitor cache usage with `docker exec fuzzforge-worker-python du -sh /cache`
- Adjust `CACHE_MAX_SIZE` if needed for your workload


@@ -0,0 +1,550 @@
# CI/CD Integration Guide
This guide shows you how to integrate FuzzForge into your CI/CD pipeline for automated security testing on every commit, pull request, or scheduled run.
---
## Overview
FuzzForge can run entirely inside CI containers (GitHub Actions, GitLab CI, etc.) with no external infrastructure required. The complete FuzzForge stack—Temporal, PostgreSQL, MinIO, Backend, and workers—starts automatically when needed and cleans up after execution.
### Key Benefits
- **Zero Infrastructure**: No servers to maintain
- **Ephemeral**: Fresh environment per run
- **Resource Efficient**: On-demand workers (v0.7.0) save ~6-7GB RAM
- **Fast Feedback**: Fail builds on critical/high findings
- **Standards Compliant**: SARIF export for GitHub Security / GitLab SAST
---
## Prerequisites
### Required
- **CI Runner**: Ubuntu with Docker support
- **RAM**: At least 4GB available (7GB on GitHub Actions)
- **Startup Time**: ~60-90 seconds
### Optional
- **jq**: For merging Docker daemon config (auto-installed in examples)
- **Python 3.11+**: For FuzzForge CLI
---
## Quick Start
### 1. Add Startup Scripts
FuzzForge provides helper scripts to configure Docker and start services:
```bash
# Start FuzzForge (configure Docker, start services, wait for health)
bash scripts/ci-start.sh
# Stop and cleanup after execution
bash scripts/ci-stop.sh
```
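If you want to see what the startup script does (or adapt it for another CI system), the essential steps look roughly like this — a sketch, not the shipped script:
```bash
#!/usr/bin/env bash
# Sketch of the essential ci-start.sh steps (the shipped script may differ)
set -euo pipefail

# Bring up the full stack in the background
docker-compose -f docker-compose.temporal.yaml up -d

# Wait until the backend reports healthy (tune for slower runners)
timeout 120 bash -c 'until curl -sf http://localhost:8000/health; do sleep 3; done'
echo "FuzzForge is ready"
```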
### 2. Install CLI
```bash
pip install ./cli
```
### 3. Initialize Project
```bash
ff init --api-url http://localhost:8000 --name "CI Security Scan"
```
### 4. Run Workflow
```bash
# Run and fail on error findings
ff workflow run security_assessment . \
--wait \
--fail-on error \
--export-sarif results.sarif
```
---
## Deployment Models
FuzzForge supports two CI/CD deployment models:
### Option A: Ephemeral (Recommended)
**Everything runs inside the CI container for each job.**
```
┌────────────────────────────────────┐
│ GitHub Actions Runner              │
│                                    │
│  ┌──────────────────────────────┐  │
│  │ FuzzForge Stack              │  │
│  │  • Temporal                  │  │
│  │  • PostgreSQL                │  │
│  │  • MinIO                     │  │
│  │  • Backend                   │  │
│  │  • Workers (on-demand)       │  │
│  └──────────────────────────────┘  │
│                                    │
│  ff workflow run ...               │
└────────────────────────────────────┘
```
**Pros:**
- No infrastructure to maintain
- Complete isolation per run
- Works on GitHub/GitLab free tier
**Cons:**
- 60-90s startup time per run
- Limited to runner resources
**Best For:** Open source projects, infrequent scans, PR checks
### Option B: Persistent Backend
**Backend runs on a separate server, CLI connects remotely.**
```
┌──────────────┐         ┌──────────────────┐
│  CI Runner   │────────▶│ FuzzForge Server │
│  (ff CLI)    │  HTTPS  │  (self-hosted)   │
└──────────────┘         └──────────────────┘
```
**Pros:**
- No startup time
- More resources
- Faster execution
**Cons:**
- Requires infrastructure
- Needs API tokens
**Best For:** Large teams, frequent scans, long fuzzing campaigns
---
## GitHub Actions Integration
### Complete Example
See `.github/workflows/examples/security-scan.yml` for a full working example.
**Basic workflow:**
```yaml
name: Security Scan

on: [pull_request, push]

jobs:
  security:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Start FuzzForge
        run: bash scripts/ci-start.sh

      - name: Install CLI
        run: pip install ./cli

      - name: Security Scan
        run: |
          ff init --api-url http://localhost:8000
          ff workflow run security_assessment . \
            --wait \
            --fail-on error \
            --export-sarif results.sarif

      - name: Upload SARIF
        if: always()
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: results.sarif

      - name: Cleanup
        if: always()
        run: bash scripts/ci-stop.sh
```
### GitHub Security Tab Integration
Upload SARIF results to see findings directly in GitHub:
```yaml
- name: Upload SARIF to GitHub Security
  if: always()
  uses: github/codeql-action/upload-sarif@v3
  with:
    sarif_file: results.sarif
```
Findings appear in:
- **Security** tab → **Code scanning alerts**
- Pull request annotations
- Commit status checks
---
## GitLab CI Integration
### Complete Example
See `.gitlab-ci.example.yml` for a full working example.
**Basic pipeline:**
```yaml
stages:
  - security

variables:
  FUZZFORGE_API_URL: "http://localhost:8000"

security:scan:
  image: docker:24
  services:
    - docker:24-dind
  before_script:
    - apk add bash python3 py3-pip
    - bash scripts/ci-start.sh
    - pip3 install ./cli --break-system-packages
    - ff init --api-url $FUZZFORGE_API_URL
  script:
    - ff workflow run security_assessment . --wait --fail-on error --export-sarif results.sarif
  artifacts:
    reports:
      sast: results.sarif
  after_script:
    - bash scripts/ci-stop.sh
```
### GitLab SAST Dashboard Integration
The `reports: sast:` section automatically integrates with GitLab's Security Dashboard.
---
## CLI Flags for CI/CD
### `--fail-on`
Fail the build if findings match specified SARIF severity levels.
**Syntax:**
```bash
--fail-on <levels>   # comma-separated list of: error, warning, note, info, all, none
```
**SARIF Levels:**
- `error` - Critical security issues (fail build)
- `warning` - Potential security issues (may fail build)
- `note` - Informational findings (typically don't fail)
- `info` - Additional context (rarely blocks)
- `all` - Any finding (strictest)
- `none` - Never fail (report only)
**Examples:**
```bash
# Fail on errors only (recommended for CI)
--fail-on error
# Fail on errors or warnings
--fail-on error,warning
# Fail on any finding (strictest)
--fail-on all
# Never fail, just report (useful for monitoring)
--fail-on none
```
**Common Patterns:**
- **PR checks**: `--fail-on error` (block critical issues)
- **Release gates**: `--fail-on error,warning` (stricter)
- **Nightly scans**: `--fail-on none` (monitoring only)
- **Security audit**: `--fail-on all` (maximum strictness)
**Exit Codes:**
- `0` - No blocking findings
- `1` - Blocking findings found, or the run itself failed
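Because the exit code is the contract, CI steps can branch on it directly; for example:
```bash
if ff workflow run security_assessment . --wait --fail-on error; then
  echo "No blocking findings"
else
  echo "Blocking findings detected - failing the build"
  exit 1
fi
```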
### `--export-sarif`
Export SARIF results to a file after workflow completion.
**Syntax:**
```bash
--export-sarif <path>
```
**Example:**
```bash
ff workflow run security_assessment . \
--wait \
--export-sarif results.sarif
```
### `--wait`
Wait for workflow execution to complete (required for CI/CD).
**Example:**
```bash
ff workflow run security_assessment . --wait
```
Without `--wait`, the command returns immediately and the workflow runs in the background.
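For long-running campaigns you can skip `--wait`, note the run ID the CLI prints, and check on it later. The `status` subcommand shown here mirrors the quickstart guide; treat the exact spelling as an assumption for your CLI version:
```bash
# Fire and forget - note the run ID printed by the CLI
ff workflow run atheris_fuzzing .

# Later (or from another job), poll by run ID
ff workflow status <run-id>
```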
---
## Common Workflows
### PR Security Gate
Block PRs with critical/high findings:
```yaml
on: pull_request

jobs:
  security:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: bash scripts/ci-start.sh
      - run: pip install ./cli
      - run: |
          ff init --api-url http://localhost:8000
          ff workflow run security_assessment . --wait --fail-on error
      - if: always()
        run: bash scripts/ci-stop.sh
```
### Secret Detection (Zero Tolerance)
Fail on ANY exposed secrets:
```bash
ff workflow run secret_detection . --wait --fail-on all
```
### Nightly Fuzzing (Report Only)
Run long fuzzing campaigns without failing the build:
```yaml
on:
  schedule:
    - cron: '0 2 * * *'  # 2 AM daily

jobs:
  fuzzing:
    runs-on: ubuntu-latest
    timeout-minutes: 120
    steps:
      - uses: actions/checkout@v4
      - run: bash scripts/ci-start.sh
      - run: pip install ./cli
      - run: |
          ff init --api-url http://localhost:8000
          ff workflow run atheris_fuzzing . \
            max_iterations=100000000 \
            timeout_seconds=7200 \
            --wait \
            --export-sarif fuzzing-results.sarif
        continue-on-error: true
      - if: always()
        run: bash scripts/ci-stop.sh
```
### Release Gate
Block releases with ANY security findings:
```yaml
on:
  push:
    tags:
      - 'v*'

jobs:
  release-security:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: bash scripts/ci-start.sh
      - run: pip install ./cli
      - run: |
          ff init --api-url http://localhost:8000
          ff workflow run security_assessment . --wait --fail-on all
```
---
## Performance Optimization
### Startup Time
**Current:** ~60-90 seconds
**Breakdown:**
- Docker daemon restart: 10-15s
- docker-compose up: 30-40s
- Health check wait: 20-30s
**Tips to reduce:**
1. Use `docker-compose.ci.yml` (optional, see below)
2. Cache Docker layers (GitHub Actions)
3. Use self-hosted runners (persistent Docker)
### Optional: CI-Optimized Compose File
Create `docker-compose.ci.yml`:
```yaml
version: '3.8'

services:
  postgresql:
    # Use in-memory storage (faster, ephemeral)
    tmpfs:
      - /var/lib/postgresql/data
    command: postgres -c fsync=off -c full_page_writes=off

  minio:
    # Use in-memory storage
    tmpfs:
      - /data

  temporal:
    healthcheck:
      # More frequent health checks
      interval: 5s
      retries: 10
```
**Usage:**
```bash
docker-compose -f docker-compose.yml -f docker-compose.ci.yml up -d
```
---
## Troubleshooting
### "Permission denied" connecting to Docker socket
**Solution:** Add user to docker group or use `sudo`.
```bash
# GitHub Actions hosted runners already have Docker permissions
# GitLab CI: use the docker:24-dind service (see the pipeline example above)
# Self-hosted runners: add the CI user to the docker group
sudo usermod -aG docker $USER
```
### "Connection refused to localhost:8000"
**Problem:** Services not healthy yet.
**Solution:** Increase health check timeout in `ci-start.sh`:
```bash
timeout 180 bash -c 'until curl -sf http://localhost:8000/health; do sleep 3; done'
```
### "Out of disk space"
**Problem:** Docker volumes filling up.
**Solution:** Cleanup in `after_script`:
```yaml
after_script:
  - bash scripts/ci-stop.sh
  - docker system prune -af --volumes
```
### Worker not starting
**Problem:** Worker container exists but not running.
**Solution:** Workers are pre-built but start on-demand (v0.7.0). If a workflow fails immediately, check:
```bash
docker logs fuzzforge-worker-<vertical>
```
---
## Best Practices
1. **Always use `--wait`** in CI/CD pipelines
2. **Set appropriate `--fail-on` levels** for your use case:
- PR checks: `error` (block critical issues)
- Release gates: `error,warning` (stricter)
- Nightly scans: Don't use (report only)
3. **Export SARIF** to integrate with security dashboards
4. **Set timeouts** on CI jobs to prevent hanging
5. **Use artifacts** to preserve findings for review
6. **Cleanup always** with `if: always()` or `after_script`
---
## Advanced: Persistent Backend Setup
For high-frequency usage, deploy FuzzForge on a dedicated server:
### 1. Deploy FuzzForge Server
```bash
# On your CI server
git clone https://github.com/FuzzingLabs/fuzzforge_ai.git
cd fuzzforge_ai
docker-compose up -d
```
### 2. Generate API Token (Future Feature)
```bash
# This will be available in a future release
docker exec fuzzforge-backend python -c "
from src.auth import generate_token
print(generate_token(name='github-actions'))
"
```
### 3. Configure CI to Use Remote Backend
```yaml
env:
  FUZZFORGE_API_URL: https://fuzzforge.company.com
  FUZZFORGE_API_TOKEN: ${{ secrets.FUZZFORGE_TOKEN }}

steps:
  - run: pip install fuzzforge-cli
  - run: ff workflow run security_assessment . --wait --fail-on error
```
**Note:** Authentication is not yet implemented (v0.7.0). Use network isolation or VPN for now.
---
## Examples
- **GitHub Actions**: `.github/workflows/examples/security-scan.yml`
- **GitLab CI**: `.gitlab-ci.example.yml`
- **Startup Script**: `scripts/ci-start.sh`
- **Cleanup Script**: `scripts/ci-stop.sh`
---
## Support
- **Documentation**: [https://docs.fuzzforge.io](https://docs.fuzzforge.io)
- **Issues**: [GitHub Issues](https://github.com/FuzzingLabs/fuzzforge_ai/issues)
- **Discussions**: [GitHub Discussions](https://github.com/FuzzingLabs/fuzzforge_ai/discussions)


@@ -9,18 +9,18 @@ This guide will walk you through the process of creating a custom security analy
Before you start, make sure you have:
- A working FuzzForge development environment (see [Contributing](/reference/contributing.md))
- Familiarity with Python (async/await), Docker, and Temporal
- At least one custom or built-in module to use in your workflow
---
## Step 1: Understand Workflow Architecture
A FuzzForge workflow is a Temporal workflow that:
- Runs inside a long-lived vertical worker container (pre-built with toolchains)
- Orchestrates one or more analysis modules (scanner, analyzer, reporter, etc.)
- Downloads targets from MinIO (S3-compatible storage) automatically
- Produces standardized SARIF output
- Supports configurable parameters and resource limits
@@ -28,9 +28,9 @@ A FuzzForge workflow is a Prefect 3 flow that:
```
backend/toolbox/workflows/{workflow_name}/
├── workflow.py        # Main workflow definition (Temporal workflow)
├── activities.py      # Workflow activities (optional)
├── metadata.yaml      # Workflow metadata and configuration (must include vertical field)
└── requirements.txt   # Additional Python dependencies (optional)
```
@@ -48,6 +48,7 @@ version: "1.0.0"
description: "Analyzes project dependencies for security vulnerabilities"
author: "FuzzingLabs Security Team"
category: "comprehensive"
vertical: "web" # REQUIRED: Which vertical worker to use (rust, android, web, etc.)
tags:
- "dependency-scanning"
- "vulnerability-analysis"
@@ -63,10 +64,6 @@ requirements:
parameters:
  type: object
  properties:
    scan_dev_dependencies:
      type: boolean
      description: "Include development dependencies"
@@ -85,36 +82,63 @@ output_schema:
description: "Scan execution summary"
```
**Important:** The `vertical` field determines which worker runs your workflow. Ensure the worker has the required tools installed.
### Workspace Isolation
Add the `workspace_isolation` field to control how workflow runs share or isolate workspaces:
```yaml
# Workspace isolation mode (system-level configuration)
# - "isolated" (default): Each workflow run gets its own isolated workspace
# - "shared": All runs share the same workspace (for read-only workflows)
# - "copy-on-write": Download once, copy for each run
workspace_isolation: "isolated"
```
**Choosing the right mode:**
- **`isolated`** (default) - For fuzzing workflows that modify files (corpus, crashes)
  - Example: `atheris_fuzzing`, `cargo_fuzzing`
  - Safe for concurrent execution
- **`shared`** - For read-only analysis workflows
  - Example: `security_assessment`, `secret_detection`
  - Efficient (downloads once, reuses cache)
- **`copy-on-write`** - For large targets that need isolation
  - Downloads once, copies per run
  - Balances performance and isolation
See the [Workspace Isolation](/concept/workspace-isolation) guide for details.
---
## Step 3: Add Live Statistics to Your Workflow 🚦
Want real-time progress and stats for your workflow? FuzzForge supports live statistics reporting using Temporal workflow logging. This lets users (and the platform) monitor workflow progress, see live updates, and stream stats via API or WebSocket.
### 1. Import Required Dependencies
```python
from temporalio import workflow, activity
import logging
logger = logging.getLogger(__name__)
```
### 2. Create a Statistics Callback in Activity
Add a callback that logs structured stats updates in your activity:
```python
@activity.defn
async def my_workflow_activity(target_path: str, config: Dict[str, Any]) -> Dict[str, Any]:
    # Get activity info for run tracking
    info = activity.info()
    run_id = info.workflow_id
    logger.info(f"Running activity for workflow: {run_id}")

    # Define callback function for live statistics
    async def stats_callback(stats_data: Dict[str, Any]):
@@ -124,7 +148,7 @@ async def my_workflow_task(workspace: Path, config: Dict[str, Any]) -> Dict[str,
logger.info("LIVE_STATS", extra={
"stats_type": "live_stats", # Type of statistics
"workflow_type": "my_workflow", # Your workflow name
"run_id": stats_data.get("run_id"),
"run_id": run_id,
# Add your custom statistics fields here:
"progress": stats_data.get("progress", 0),
@@ -138,7 +162,7 @@ async def my_workflow_task(workspace: Path, config: Dict[str, Any]) -> Dict[str,
    # Pass callback to your module/processor
    processor = MyWorkflowModule()
    result = await processor.execute(config, target_path, stats_callback=stats_callback)

    return result.dict()
```
@@ -224,15 +248,16 @@ Live statistics automatically appear in:
#### Example: Adding Stats to a Security Scanner
```python
@activity.defn
async def security_scan_activity(target_path: str, config: Dict[str, Any]):
    info = activity.info()
    run_id = info.workflow_id

    async def stats_callback(stats_data):
        logger.info("LIVE_STATS", extra={
            "stats_type": "scan_progress",
            "workflow_type": "security_scan",
            "run_id": run_id,
            "files_scanned": stats_data.get("files_scanned", 0),
            "vulnerabilities_found": stats_data.get("vulnerabilities_found", 0),
            "scan_percentage": stats_data.get("scan_percentage", 0.0),
@@ -241,7 +266,7 @@ async def security_scan_task(workspace: Path, config: Dict[str, Any]):
        })

    scanner = SecurityScannerModule()
    return await scanner.execute(config, target_path, stats_callback=stats_callback)
```
With these steps, your workflow will provide rich, real-time feedback to users and the FuzzForge platform—making automation more transparent and interactive!
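On the module side, the contract is simply to await the callback with a stats dict whenever there is progress to report. A hedged sketch of what that might look like inside a module's `execute()` (`MyWorkflowModule` and `scan_file` are illustrative names, not shipped APIs):
```python
from pathlib import Path
from typing import Any, Dict


class MyWorkflowModule:
    async def execute(self, config: Dict[str, Any], target_path: str, stats_callback=None):
        files = list(Path(target_path).rglob("*.py"))
        findings = []
        for i, f in enumerate(files, start=1):
            findings.extend(self.scan_file(f))  # placeholder scan step
            if stats_callback and i % 10 == 0:  # report every 10 files
                await stats_callback({
                    "files_scanned": i,
                    "vulnerabilities_found": len(findings),
                    "scan_percentage": 100.0 * i / len(files),
                })
        return findings
```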
@@ -250,95 +275,182 @@ With these steps, your workflow will provide rich, real-time feedback to users a
## Step 4: Implement the Workflow Logic
Create a `workflow.py` file. This is where you define your Temporal workflow and activities.
Example (simplified):
```python
from typing import Dict, Any
from datetime import timedelta

from temporalio import workflow, activity
from temporalio.common import RetryPolicy

from src.toolbox.modules.dependency_scanner import DependencyScanner
from src.toolbox.modules.vulnerability_analyzer import VulnerabilityAnalyzer
from src.toolbox.modules.reporter import SARIFReporter


@activity.defn
async def scan_dependencies(target_path: str, config: Dict[str, Any]) -> Dict[str, Any]:
    scanner = DependencyScanner()
    return (await scanner.execute(config, target_path)).dict()


@activity.defn
async def analyze_vulnerabilities(dependencies: Dict[str, Any], target_path: str, config: Dict[str, Any]) -> Dict[str, Any]:
    analyzer = VulnerabilityAnalyzer()
    analyzer_config = {**config, 'dependencies': dependencies.get('findings', [])}
    return (await analyzer.execute(analyzer_config, target_path)).dict()


@activity.defn
async def generate_report(dep_results: Dict[str, Any], vuln_results: Dict[str, Any], config: Dict[str, Any]) -> Dict[str, Any]:
    reporter = SARIFReporter()
    all_findings = dep_results.get("findings", []) + vuln_results.get("findings", [])
    reporter_config = {**config, "findings": all_findings}
    return (await reporter.execute(reporter_config, None)).dict().get("sarif", {})


@workflow.defn
class DependencyAnalysisWorkflow:
    @workflow.run
    async def run(
        self,
        target_id: str,  # Target file ID from MinIO (downloaded by worker automatically)
        scan_dev_dependencies: bool = True,
        vulnerability_threshold: str = "medium"
    ) -> Dict[str, Any]:
        workflow.logger.info(f"Starting dependency analysis for target: {target_id}")

        # Get run ID for workspace isolation
        run_id = workflow.info().run_id

        # Worker downloads target from MinIO with isolation
        target_path = await workflow.execute_activity(
            "get_target",
            args=[target_id, run_id, "shared"],  # target_id, run_id, workspace_isolation
            start_to_close_timeout=timedelta(minutes=5)
        )

        scanner_config = {"scan_dev_dependencies": scan_dev_dependencies}
        analyzer_config = {"vulnerability_threshold": vulnerability_threshold}

        # Execute activities with retries and timeouts
        dep_results = await workflow.execute_activity(
            scan_dependencies,
            args=[target_path, scanner_config],
            start_to_close_timeout=timedelta(minutes=10),
            retry_policy=RetryPolicy(maximum_attempts=3)
        )

        vuln_results = await workflow.execute_activity(
            analyze_vulnerabilities,
            args=[dep_results, target_path, analyzer_config],
            start_to_close_timeout=timedelta(minutes=10),
            retry_policy=RetryPolicy(maximum_attempts=3)
        )

        sarif_report = await workflow.execute_activity(
            generate_report,
            args=[dep_results, vuln_results, {}],
            start_to_close_timeout=timedelta(minutes=5),
            retry_policy=RetryPolicy(maximum_attempts=3)
        )

        # Cleanup cache (respects isolation mode)
        await workflow.execute_activity(
            "cleanup_cache",
            args=[target_path, "shared"],  # target_path, workspace_isolation
            start_to_close_timeout=timedelta(minutes=1)
        )

        workflow.logger.info("Dependency analysis completed")
        return sarif_report
```
**Key differences from Prefect:**
- Use `@workflow.defn` class instead of `@flow` function
- Use `@activity.defn` instead of `@task`
- Must call `get_target` activity to download from MinIO with isolation mode
- Use `workflow.execute_activity()` with explicit timeouts and retry policies
- Use `workflow.logger` for logging (appears in Temporal UI)
- Call `cleanup_cache` activity at end to clean up workspace
---
## Step 5: No Dockerfile Needed! 🎉
**Good news:** You don't need to create a Dockerfile for your workflow. Workflows run inside pre-built **vertical worker containers** that already have toolchains installed.
**How it works:**
1. Your workflow code lives in `backend/toolbox/workflows/{workflow_name}/`
2. This directory is **mounted as a volume** in the worker container at `/app/toolbox/workflows/`
3. Worker discovers and registers your workflow automatically on startup
4. When submitted, the workflow runs inside the long-lived worker container
**Benefits:**
- Zero container build time per workflow
- Instant code changes (just restart worker)
- All toolchains pre-installed (AFL++, cargo-fuzz, apktool, etc.)
- Consistent environment across all workflows of the same vertical
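The mount that enables this is declared on each worker service in the compose file; a sketch of the relevant fragment (exact paths may differ in the shipped `docker-compose.temporal.yaml`):
```yaml
services:
  worker-rust:
    volumes:
      # Workflow code from the host appears inside the worker at /app/toolbox;
      # no image rebuild is needed, just a worker restart
      - ./backend/toolbox:/app/toolbox
```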
---
## Step 6: Test Your Workflow
- Write a test script or use the CLI to submit a workflow run
- Check that SARIF results are produced and stored as expected
### Using the CLI
Example test:
```bash
# Start FuzzForge with Temporal
docker-compose -f docker-compose.temporal.yaml up -d
# Wait for services to initialize
sleep 10
# Submit workflow with file upload
cd test_projects/vulnerable_app/
fuzzforge workflow run dependency_analysis .
# CLI automatically:
# - Creates tarball of current directory
# - Uploads to MinIO via backend
# - Submits workflow with target_id
# - Worker downloads from MinIO and executes
```
### Using Python SDK
```python
from pathlib import Path

from fuzzforge_sdk import FuzzForgeClient

client = FuzzForgeClient(base_url="http://localhost:8000")

# Submit with automatic upload
response = client.submit_workflow_with_upload(
    workflow_name="dependency_analysis",
    target_path=Path("/path/to/project"),
    parameters={
        "scan_dev_dependencies": True,
        "vulnerability_threshold": "medium"
    }
)
print(f"Workflow started: {response.run_id}")

# Wait for completion
final_status = client.wait_for_completion(response.run_id)

# Get findings
findings = client.get_run_findings(response.run_id)
print(findings.sarif)

client.close()
```
### Check Temporal UI
Open http://localhost:8233 to see:
- Workflow execution timeline
- Activity results
- Logs and errors
- Retry history
---
## Best Practices


@@ -0,0 +1,453 @@
# Debugging Workflows and Modules
This guide shows you how to debug FuzzForge workflows and modules using Temporal's powerful debugging features.
---
## Quick Debugging Checklist
When something goes wrong:
1. **Check worker logs** - `docker-compose -f docker-compose.temporal.yaml logs worker-rust -f`
2. **Check Temporal UI** - http://localhost:8233 (visual execution history)
3. **Check MinIO console** - http://localhost:9001 (inspect uploaded files)
4. **Check backend logs** - `docker-compose -f docker-compose.temporal.yaml logs fuzzforge-backend -f`
---
## Debugging Workflow Discovery
### Problem: Workflow Not Found
**Symptom:** Worker logs show "No workflows found for vertical: rust"
**Debug Steps:**
1. **Check if worker can see the workflow:**
```bash
docker exec fuzzforge-worker-rust ls /app/toolbox/workflows/
```
2. **Check metadata.yaml exists:**
```bash
docker exec fuzzforge-worker-rust cat /app/toolbox/workflows/my_workflow/metadata.yaml
```
3. **Verify vertical field matches:**
```bash
docker exec fuzzforge-worker-rust grep "vertical:" /app/toolbox/workflows/my_workflow/metadata.yaml
```
Should output: `vertical: rust`
4. **Check worker logs for discovery errors:**
```bash
docker-compose -f docker-compose.temporal.yaml logs worker-rust | grep "my_workflow"
```
**Solution:**
- Ensure `metadata.yaml` has correct `vertical` field
- Restart worker to reload: `docker-compose -f docker-compose.temporal.yaml restart worker-rust`
- Check worker logs for discovery confirmation
---
## Debugging Workflow Execution
### Using Temporal Web UI
The Temporal UI at http://localhost:8233 is your primary debugging tool.
**Navigate to a workflow:**
1. Open http://localhost:8233
2. Click "Workflows" in left sidebar
3. Find your workflow by `run_id` or workflow name
4. Click to see detailed execution
**What you can see:**
- **Execution timeline** - When each activity started/completed
- **Input/output** - Exact parameters passed to workflow
- **Activity results** - Return values from each activity
- **Error stack traces** - Full Python tracebacks
- **Retry history** - All retry attempts with reasons
- **Worker information** - Which worker executed each activity
**Example: Finding why an activity failed:**
1. Open workflow in Temporal UI
2. Scroll to failed activity (marked in red)
3. Click on the activity
4. See full error message and stack trace
5. Check "Input" tab to see what parameters were passed
---
## Viewing Worker Logs
### Real-time Monitoring
```bash
# Follow logs from rust worker
docker-compose -f docker-compose.temporal.yaml logs worker-rust -f
# Follow logs from all workers
docker-compose -f docker-compose.temporal.yaml logs worker-rust worker-android -f
# Show last 100 lines
docker-compose -f docker-compose.temporal.yaml logs worker-rust --tail 100
```
### What Worker Logs Show
**On startup:**
```
INFO: Scanning for workflows in: /app/toolbox/workflows
INFO: Importing workflow module: toolbox.workflows.security_assessment.workflow
INFO: ✓ Discovered workflow: SecurityAssessmentWorkflow from security_assessment (vertical: rust)
INFO: 🚀 Worker started for vertical 'rust'
```
**During execution:**
```
INFO: Starting SecurityAssessmentWorkflow (workflow_id=security_assessment-abc123, target_id=548193a1...)
INFO: Downloading target from MinIO: 548193a1-f73f-4ec1-8068-19ec2660b8e4
INFO: Executing activity: scan_files
INFO: Completed activity: scan_files (duration: 3.2s)
```
**On errors:**
```
ERROR: Failed to import workflow module toolbox.workflows.broken.workflow:
File "/app/toolbox/workflows/broken/workflow.py", line 42
def run(
IndentationError: expected an indented block
```
### Filtering Logs
```bash
# Show only errors
docker-compose -f docker-compose.temporal.yaml logs worker-rust | grep ERROR
# Show workflow discovery
docker-compose -f docker-compose.temporal.yaml logs worker-rust | grep "Discovered workflow"
# Show specific workflow execution
docker-compose -f docker-compose.temporal.yaml logs worker-rust | grep "security_assessment-abc123"
# Show activity execution
docker-compose -f docker-compose.temporal.yaml logs worker-rust | grep "activity"
```
---
## Debugging File Upload
### Check if File Was Uploaded
**Using MinIO Console:**
1. Open http://localhost:9001
2. Login: `fuzzforge` / `fuzzforge123`
3. Click "Buckets" → "targets"
4. Look for your `target_id` (UUID format)
5. Click to download and inspect locally
**Using CLI:**
```bash
# Check MinIO status
curl http://localhost:9000
# List backend logs for upload
docker-compose -f docker-compose.temporal.yaml logs fuzzforge-backend | grep "upload"
```
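If you prefer the terminal over the console, any S3 client can inspect the bucket; a sketch using the AWS CLI with the credentials shown above (endpoint and bucket names follow this guide's defaults):
```bash
export AWS_ACCESS_KEY_ID=fuzzforge
export AWS_SECRET_ACCESS_KEY=fuzzforge123

# List uploaded targets
aws --endpoint-url http://localhost:9000 s3 ls s3://targets/

# Download a specific target for local inspection
aws --endpoint-url http://localhost:9000 s3 cp s3://targets/<target-id> ./target.tar.gz
```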
### Check Worker Cache
```bash
# List cached targets
docker exec fuzzforge-worker-rust ls -lh /cache/
# Check specific target
docker exec fuzzforge-worker-rust ls -lh /cache/548193a1-f73f-4ec1-8068-19ec2660b8e4
```
---
## Interactive Debugging
### Access Running Worker
```bash
# Open shell in worker container
docker exec -it fuzzforge-worker-rust bash
# Now you can:
# - Check filesystem
ls -la /app/toolbox/workflows/
# - Test imports
python3 -c "from toolbox.workflows.my_workflow.workflow import MyWorkflow; print(MyWorkflow)"
# - Check environment variables
env | grep TEMPORAL
# - Test activities
cd /app/toolbox/workflows/my_workflow
python3 -c "from activities import my_activity; print(my_activity)"
# - Check cache
ls -lh /cache/
```
### Test Module in Isolation
```bash
# Enter worker container
docker exec -it fuzzforge-worker-rust bash
# Navigate to module
cd /app/toolbox/modules/scanner
# Run module directly
python3 -c "
from file_scanner import FileScannerModule
scanner = FileScannerModule()
print(scanner.get_metadata())
"
```
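Activities can also be unit-tested from the host without a running Temporal server, since the SDK ships a test harness that mocks the activity context. A sketch (the import path is hypothetical):
```python
import asyncio

from temporalio.testing import ActivityEnvironment

# Hypothetical import path for illustration
from toolbox.workflows.my_workflow.activities import my_workflow_activity


async def main():
    env = ActivityEnvironment()
    # env.run() provides a mock activity context, so activity.info() works
    result = await env.run(my_workflow_activity, "/tmp/test-target", {"max_iterations": 10})
    print(result)


asyncio.run(main())
```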
---
## Debugging Module Code
### Edit and Reload
Since the toolbox directory is mounted as a volume, you can edit code on your host and reload it:
1. **Edit module on host:**
```bash
# On your host machine
vim backend/toolbox/modules/scanner/file_scanner.py
```
2. **Restart worker to reload:**
```bash
docker-compose -f docker-compose.temporal.yaml restart worker-rust
```
3. **Check discovery logs:**
```bash
docker-compose -f docker-compose.temporal.yaml logs worker-rust | tail -50
```
### Add Debug Logging
Add logging to your workflow or module:
```python
import logging

from temporalio import workflow

logger = logging.getLogger(__name__)


@workflow.defn
class MyWorkflow:
    @workflow.run
    async def run(self, target_id: str):
        workflow.logger.info(f"Starting with target_id: {target_id}")  # Shows in Temporal UI
        logger.info("Processing step 1")  # Shows in worker logs
        logger.debug(f"Debug info: {some_variable}")  # Shows if LOG_LEVEL=DEBUG

        try:
            result = await some_activity()
            logger.info(f"Activity result: {result}")
        except Exception as e:
            logger.error(f"Activity failed: {e}", exc_info=True)  # Full stack trace
            raise
```
Set debug logging:
```yaml
# docker-compose.temporal.yaml
services:
  worker-rust:
    environment:
      LOG_LEVEL: DEBUG  # Change from INFO to DEBUG
```
```bash
# Restart to apply
docker-compose -f docker-compose.temporal.yaml restart worker-rust
```
---
## Common Issues and Solutions
### Issue: Workflow stuck in "Running" state
**Debug:**
1. Check Temporal UI for last completed activity
2. Check worker logs for errors
3. Check if worker is still running: `docker-compose -f docker-compose.temporal.yaml ps worker-rust`
**Solution:**
- Worker may have crashed - restart it
- Activity may be hanging - check for infinite loops or stuck network calls
- Check worker resource limits: `docker stats fuzzforge-worker-rust`
### Issue: Import errors in workflow
**Debug:**
1. Check worker logs for full error trace
2. Check if module file exists:
```bash
docker exec fuzzforge-worker-rust ls /app/toolbox/modules/my_module/
```
**Solution:**
- Ensure module is in correct directory
- Check for syntax errors: `docker exec fuzzforge-worker-rust python3 -m py_compile /app/toolbox/modules/my_module/my_module.py`
- Verify imports are correct
### Issue: Target file not found in worker
**Debug:**
1. Check if target exists in MinIO console
2. Check worker logs for download errors
3. Verify target_id is correct
**Solution:**
- Re-upload file via CLI
- Check MinIO is running: `docker-compose -f docker-compose.temporal.yaml ps minio`
- Check MinIO credentials in worker environment
---
## Performance Debugging
### Check Activity Duration
**In Temporal UI:**
1. Open workflow execution
2. Scroll through activities
3. Each shows duration (e.g., "3.2s")
4. Identify slow activities
### Monitor Resource Usage
```bash
# Monitor worker resource usage
docker stats fuzzforge-worker-rust
# Check worker logs for memory warnings
docker-compose -f docker-compose.temporal.yaml logs worker-rust | grep -i "memory\|oom"
```
### Profile Workflow Execution
Add timing to your workflow:
```python
from temporalio import workflow


@workflow.defn
class MyWorkflow:
    @workflow.run
    async def run(self, target_id: str):
        # Use workflow.time() instead of time.time(): workflow code must stay deterministic
        start = workflow.time()
        result1 = await activity1()
        workflow.logger.info(f"Activity1 took: {workflow.time() - start:.2f}s")

        start = workflow.time()
        result2 = await activity2()
        workflow.logger.info(f"Activity2 took: {workflow.time() - start:.2f}s")
```
---
## Advanced Debugging
### Enable Temporal Worker Debug Logs
```yaml
# docker-compose.temporal.yaml
services:
  worker-rust:
    environment:
      TEMPORAL_LOG_LEVEL: DEBUG
      LOG_LEVEL: DEBUG
```
```bash
# Restart to apply
docker-compose -f docker-compose.temporal.yaml restart worker-rust
```
### Inspect Temporal Workflows via CLI
```bash
# tctl ships inside the Temporal container; no separate install needed
docker exec fuzzforge-temporal tctl
# List workflows
docker exec fuzzforge-temporal tctl workflow list
# Describe workflow
docker exec fuzzforge-temporal tctl workflow describe -w security_assessment-abc123
# Show workflow history
docker exec fuzzforge-temporal tctl workflow show -w security_assessment-abc123
```
### Check Network Connectivity
```bash
# From worker to Temporal
docker exec fuzzforge-worker-rust ping temporal
# From worker to MinIO
docker exec fuzzforge-worker-rust curl http://minio:9000
# From host to services
curl http://localhost:8233 # Temporal UI
curl http://localhost:9000 # MinIO
curl http://localhost:8000/health # Backend
```
---
## Debugging Best Practices
1. **Always check Temporal UI first** - It shows the most complete execution history
2. **Use structured logging** - Include workflow_id, target_id in log messages
3. **Log at decision points** - Before/after each major operation
4. **Keep worker logs** - They persist across workflow runs
5. **Test modules in isolation** - Use `docker exec` to test before integrating
6. **Use debug builds** - Enable DEBUG logging during development
7. **Monitor resources** - Use `docker stats` to catch resource issues
---
## Getting Help
If you're still stuck:
1. **Collect diagnostic info:**
```bash
# Save all logs
docker-compose -f docker-compose.temporal.yaml logs > fuzzforge-logs.txt
# Check service status
docker-compose -f docker-compose.temporal.yaml ps > service-status.txt
```
2. **Check Temporal UI** and take screenshots of:
- Workflow execution timeline
- Failed activity details
- Error messages
3. **Report issue** with:
- Workflow name and run_id
- Error messages from logs
- Screenshots from Temporal UI
- Steps to reproduce
---
**Happy debugging!** 🐛🔍


@@ -97,6 +97,43 @@ If you prefer, you can use a systemd override to add the registry flag. See the
---
## Worker Profiles (Resource Optimization - v0.7.0)
FuzzForge workers use Docker Compose profiles to prevent auto-startup:
```yaml
# docker-compose.yml
worker-ossfuzz:
  profiles:
    - workers      # For starting all workers
    - ossfuzz      # For starting just this worker
  restart: "no"    # Don't auto-restart
```
### Behavior
- **`docker-compose up -d`**: Workers DON'T start (saves ~6-7GB RAM)
- **CLI workflows**: Workers start automatically on-demand
- **Manual start**: `docker start fuzzforge-worker-ossfuzz`
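For example, to bring up only the OSS-Fuzz worker next to an already-running core stack:
```bash
# Start just this worker via its dedicated profile
docker-compose --profile ossfuzz up -d worker-ossfuzz

# Or start the pre-created container directly
docker start fuzzforge-worker-ossfuzz
```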
### Resource Savings
| Command | Workers Started | RAM Usage |
|---------|----------------|-----------|
| `docker-compose up -d` | None (core only) | ~1.2 GB |
| `ff workflow run ossfuzz_campaign .` | ossfuzz worker only | ~3-5 GB |
| `docker-compose --profile workers up -d` | All workers | ~8 GB |
### Starting All Workers (Legacy Behavior)
If you prefer the old behavior where all workers start:
```bash
docker-compose --profile workers up -d
```
---
## Common Issues & How to Fix Them
### "x509: certificate signed by unknown authority"


@@ -10,15 +10,16 @@ Before diving into specific errors, let's check the basics:
```bash
# Check all FuzzForge services
docker-compose -f docker-compose.temporal.yaml ps
# Verify Docker registry config (if using workflow registry)
docker info | grep -i "insecure registries"
# Test service health endpoints
curl http://localhost:8000/health
curl http://localhost:8233  # Temporal Web UI
curl http://localhost:9000  # MinIO API
curl http://localhost:9001  # MinIO Console
```
If any of these commands fail, note the error message and continue below.
@@ -51,15 +52,17 @@ Docker is trying to use HTTPS for the local registry, but it's set up for HTTP
The registry isn't running or the port is blocked.
**How to fix:**
- Make sure the registry container is up (if using the registry for workflow images):
```bash
docker-compose -f docker-compose.temporal.yaml ps registry
```
- Check logs for errors:
```bash
docker-compose -f docker-compose.temporal.yaml logs registry
```
- If port 5001 is in use, change it in `docker-compose.temporal.yaml` and your Docker config.
**Note:** With Temporal architecture, target files use MinIO (port 9000), not the registry.
### "no such host" error
@@ -74,31 +77,42 @@ Docker can't resolve `localhost`.
## Workflow Execution Issues
### "mounts denied" or volume errors
### Upload fails or file access errors
**Whats happening?**
Docker cant access the path you provided.
**What's happening?**
File upload to MinIO failed or worker can't download target.
**How to fix:**
- Check MinIO is running:
```bash
docker-compose -f docker-compose.temporal.yaml ps minio
```
- Check MinIO logs:
```bash
docker-compose -f docker-compose.temporal.yaml logs minio
```
- Verify MinIO is accessible:
```bash
curl http://localhost:9000
```
- Check file size (max 10GB by default).
### Workflow status is "Failed" or "Running" (stuck)
**What's happening?**
- "Failed": Usually a target download, storage, or tool error.
- "Running" (stuck): Worker is overloaded, target download failed, or worker crashed.
**How to fix:**
- Check worker logs for details:
```bash
docker-compose -f docker-compose.temporal.yaml logs worker-rust | tail -50
```
- Check Temporal Web UI at http://localhost:8233 for detailed execution history
- Restart services:
```bash
docker-compose -f docker-compose.temporal.yaml down
docker-compose -f docker-compose.temporal.yaml up -d
```
- Reduce the number of concurrent workflows if your system is resource-constrained.
@@ -106,22 +120,23 @@ Docker can't access the path you provided.
## Service Connectivity Issues
### Backend (port 8000) or Temporal UI (port 8233) not responding
**How to fix:**
- Check if the service is running:
```bash
docker-compose -f docker-compose.temporal.yaml ps fuzzforge-backend
docker-compose -f docker-compose.temporal.yaml ps temporal
```
- View logs for errors:
```bash
docker-compose -f docker-compose.temporal.yaml logs fuzzforge-backend --tail 50
docker-compose -f docker-compose.temporal.yaml logs temporal --tail 20
```
- Restart the affected service:
```bash
docker-compose -f docker-compose.temporal.yaml restart fuzzforge-backend
docker-compose -f docker-compose.temporal.yaml restart temporal
```
---
@@ -197,13 +212,13 @@ Docker can't access the path you provided.
- Check Docker network configuration:
```bash
docker network ls
docker network inspect fuzzforge-temporal_default
```
- Recreate the network:
```bash
docker-compose -f docker-compose.temporal.yaml down
docker network prune -f
docker-compose -f docker-compose.temporal.yaml up -d
```
---
@@ -229,10 +244,10 @@ Docker can't access the path you provided.
### Enable debug logging
```bash
export TEMPORAL_LOGGING_LEVEL=DEBUG
docker-compose -f docker-compose.temporal.yaml down
docker-compose -f docker-compose.temporal.yaml up -d
docker-compose -f docker-compose.temporal.yaml logs fuzzforge-backend -f
```
### Collect diagnostic info
@@ -243,12 +258,12 @@ Save and run this script to gather info for support:
#!/bin/bash
echo "=== FuzzForge Diagnostics ==="
date
docker-compose -f docker-compose.temporal.yaml ps
docker info | grep -A 5 -i "insecure registries"
curl -s http://localhost:8000/health || echo "Backend unhealthy"
curl -s http://localhost:8233 >/dev/null && echo "Temporal UI healthy" || echo "Temporal UI unhealthy"
curl -s http://localhost:9000 >/dev/null && echo "MinIO healthy" || echo "MinIO unhealthy"
docker-compose -f docker-compose.temporal.yaml logs --tail 10
```
### Still stuck?


@@ -85,24 +85,23 @@ docker pull localhost:5001/hello-world 2>/dev/null || echo "Registry not accessi
Start all FuzzForge services:
```bash
docker-compose -f docker-compose.temporal.yaml up -d
```
This will start 6+ services:
- **temporal**: Workflow orchestration server (includes embedded PostgreSQL for dev)
- **minio**: S3-compatible storage for uploaded targets and results
- **minio-setup**: One-time setup for MinIO buckets (exits after setup)
- **fuzzforge-backend**: FastAPI backend and workflow management
- **worker-rust**: Long-lived worker for Rust/native security analysis
- **worker-android**: Long-lived worker for Android security analysis (if configured)
- **worker-web**: Long-lived worker for web security analysis (if configured)
Wait for all services to be healthy (this may take 2-3 minutes on first startup):
```bash
# Check service health
docker-compose -f docker-compose.temporal.yaml ps
# Verify FuzzForge is ready
curl http://localhost:8000/health
@@ -154,33 +153,41 @@ You should see 6 production workflows:
## Step 6: Run Your First Workflow
Let's run a security assessment workflow on one of the included vulnerable test projects.
### Using the CLI (Recommended):
```bash
# Navigate to a test project
cd /path/to/fuzzforge/test_projects/vulnerable_app
# Submit the workflow - CLI automatically uploads the local directory
fuzzforge workflow run security_assessment .
# The CLI will:
# 1. Detect that '.' is a local directory
# 2. Create a compressed tarball of the directory
# 3. Upload it to the backend via HTTP
# 4. The backend stores it in MinIO
# 5. The worker downloads it when ready to analyze
# Monitor the workflow
fuzzforge workflow status <run-id>
# View results when complete
fuzzforge finding <run-id>
```
### Using the API:
For local files, you can use the upload endpoint:
```bash
# Create tarball and upload
tar -czf project.tar.gz /path/to/your/project
curl -X POST "http://localhost:8000/workflows/security_assessment/upload-and-submit" \
-F "file=@project.tar.gz" \
-F "volume_mode=ro"
# Check status
curl "http://localhost:8000/runs/{run-id}/status"
@@ -189,6 +196,8 @@ curl "http://localhost:8000/runs/{run-id}/status"
curl "http://localhost:8000/runs/{run-id}/findings"
```
**Note**: The CLI handles file upload automatically. For remote workflows where the target path exists on the backend server, you can still use path-based submission for backward compatibility.
## Step 7: Understanding the Results
The workflow will complete in 30-60 seconds and return results in SARIF format. For the test project, you should see:
@@ -216,13 +225,19 @@ Example output:
}
```
## Step 8: Access the Temporal Web UI
You can monitor workflow execution in real-time using the Temporal Web UI:
1. Open http://localhost:8233 in your browser
2. Navigate to "Workflows" to see workflow executions
3. Click on a workflow to see detailed execution history and activity results
You can also access the MinIO console to view uploaded targets:
1. Open http://localhost:9001 in your browser
2. Login with: `fuzzforge` / `fuzzforge123`
3. Browse the `targets` bucket to see uploaded files
## Next Steps
@@ -242,9 +257,10 @@ Congratulations! You've successfully:
If you encounter problems:
1. **Workflow crashes with registry errors**: Check Docker insecure registry configuration
2. **Services won't start**: Ensure ports 8000, 8233, 9000, 9001 are available
3. **No findings returned**: Verify the target path contains analyzable code files
4. **CLI not found**: Ensure Python/pip installation path is in your PATH
5. **Upload fails**: Check that MinIO is running and accessible at http://localhost:9000
See the [Troubleshooting Guide](../how-to/troubleshooting.md) for detailed solutions.