chore: Complete Temporal migration with updated CLI/SDK/docs

This commit includes all remaining Temporal migration changes:

## CLI Updates (cli/)
- Updated workflow execution commands for Temporal
- Enhanced error handling and exceptions
- Updated dependencies in uv.lock

## SDK Updates (sdk/)
- Client methods updated for Temporal workflows
- Updated models for new workflow execution
- Updated dependencies in uv.lock

## Documentation Updates (docs/)
- Architecture documentation for Temporal
- Workflow concept documentation
- Resource management documentation (new)
- Debugging guide (new)
- Updated tutorials and how-to guides
- Troubleshooting updates

## README Updates
- Main README with Temporal instructions
- Backend README
- CLI README
- SDK README

## Other
- Updated IMPLEMENTATION_STATUS.md
- Removed old vulnerable_app.tar.gz

These changes complete the Temporal migration and ensure the
CLI/SDK work correctly with the new backend.
Tanguy Duhamel
2025-10-02 11:26:32 +02:00
parent fe50d4ef72
commit 8e0e167ddd
21 changed files with 2159 additions and 459 deletions
+53 -44
@@ -93,51 +93,57 @@ graph TB
### Orchestration Layer
- **Prefect Server:** Schedules and tracks workflows, backed by PostgreSQL.
- **Prefect Workers:** Execute workflows in Docker containers. Can be scaled horizontally.
- **Workflow Scheduler:** Balances load, manages priorities, and enforces resource limits.
- **Temporal Server:** Schedules and tracks workflows, backed by PostgreSQL.
- **Vertical Workers:** Long-lived workers pre-built with domain-specific toolchains (Android, Rust, Web, etc.). Can be scaled horizontally.
- **Task Queues:** Route workflows to appropriate vertical workers based on workflow metadata.
### Execution Layer
- **Docker Engine:** Runs workflow containers, enforcing isolation and resource limits.
- **Workflow Containers:** Custom images with security tools, mounting code and results volumes.
- **Docker Registry:** Stores and distributes workflow images.
- **Vertical Workers:** Long-lived processes with pre-installed security tools for specific domains.
- **MinIO Storage:** S3-compatible storage for uploaded targets and results.
- **Worker Cache:** Local cache for downloaded targets, with LRU eviction.
### Storage Layer
- **PostgreSQL Database:** Stores workflow metadata, state, and results.
- **Docker Volumes:** Persist workflow results and artifacts.
- **Result Cache:** Speeds up access to recent results, with in-memory and disk persistence.
- **PostgreSQL Database:** Stores Temporal workflow state and metadata.
- **MinIO (S3):** Persistent storage for uploaded targets and workflow results.
- **Worker Cache:** Local filesystem cache for downloaded targets (LRU eviction).
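The LRU eviction mentioned for the worker cache can be sketched as follows. This is a minimal illustration, not the FuzzForge implementation: the `TargetCache` class, its byte budget, and its method names are all assumptions, and it only tracks sizes in memory rather than touching the filesystem.

```python
from collections import OrderedDict


class TargetCache:
    """Size-bounded LRU index of cached targets (hypothetical sketch)."""

    def __init__(self, max_bytes: int) -> None:
        self.max_bytes = max_bytes
        self.used_bytes = 0
        # Maps target_id -> size in bytes; insertion order tracks recency.
        self._entries: "OrderedDict[str, int]" = OrderedDict()

    def get(self, target_id: str) -> bool:
        """Return True on a cache hit and mark the entry as recently used."""
        if target_id not in self._entries:
            return False
        self._entries.move_to_end(target_id)
        return True

    def put(self, target_id: str, size: int) -> list[str]:
        """Record a downloaded target; return the ids evicted to make room."""
        if target_id in self._entries:
            self.used_bytes -= self._entries.pop(target_id)
        self._entries[target_id] = size
        self.used_bytes += size
        evicted = []
        # Evict least recently used entries first, but never the new entry.
        while self.used_bytes > self.max_bytes and len(self._entries) > 1:
            old_id, old_size = self._entries.popitem(last=False)
            self.used_bytes -= old_size
            evicted.append(old_id)
        return evicted
```

A real worker would also delete the evicted files from disk; the sketch only returns their ids so the caller can decide how to do that.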
## How Does Data Flow Through the System?
### Submitting a Workflow
1. **User submits a workflow** via CLI or API client.
2. **API validates** the request and creates a deployment in Prefect.
3. **Prefect schedules** the workflow and assigns it to a worker.
4. **Worker launches a container** to run the workflow.
5. **Results are stored** in Docker volumes and the database.
6. **Status updates** flow back through Prefect and the API to the user.
1. **User submits a workflow** via CLI or API client (with optional file upload).
2. **If file provided, API uploads** to MinIO and gets a `target_id`.
3. **API validates** the request and submits to Temporal.
4. **Temporal routes** the workflow to the appropriate vertical worker queue.
5. **Worker downloads target** from MinIO to local cache (if needed).
6. **Worker executes workflow** with pre-installed tools.
7. **Results are stored** in MinIO and metadata in PostgreSQL.
8. **Status updates** flow back through Temporal and the API to the user.
```mermaid
sequenceDiagram
participant User
participant API
participant Prefect
participant MinIO
participant Temporal
participant Worker
participant Container
participant Storage
participant Cache
User->>API: Submit workflow
User->>API: Submit workflow + file
API->>API: Validate parameters
API->>Prefect: Create deployment
Prefect->>Worker: Schedule execution
Worker->>Container: Create and start
Container->>Container: Execute security tools
Container->>Storage: Store SARIF results
Worker->>Prefect: Update status
Prefect->>API: Workflow complete
API->>MinIO: Upload target file
MinIO-->>API: Return target_id
API->>Temporal: Submit workflow(target_id)
Temporal->>Worker: Route to vertical queue
Worker->>MinIO: Download target
MinIO-->>Worker: Stream file
Worker->>Cache: Store in local cache
Worker->>Worker: Execute security tools
Worker->>MinIO: Upload SARIF results
Worker->>Temporal: Update status
Temporal->>API: Workflow complete
API->>User: Return results
```
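The first half of the submission flow (upload, then routing by vertical) can be approximated with in-memory stand-ins. Everything here is hypothetical: `InMemoryStore` replaces MinIO, `submit` replaces the API handler, and naming the task queue after the vertical is an assumption based on the metadata-driven routing described in this section.

```python
import uuid


class InMemoryStore:
    """Stand-in for MinIO; a real client would talk to an S3 API."""

    def __init__(self) -> None:
        self.objects: dict[str, bytes] = {}

    def upload(self, data: bytes) -> str:
        """Store the payload and return a fresh target_id."""
        target_id = str(uuid.uuid4())
        self.objects[target_id] = data
        return target_id


def task_queue_for(metadata: dict) -> str:
    """Pick the task queue from the workflow's `vertical` metadata field."""
    vertical = metadata.get("vertical")
    if not vertical:
        raise ValueError("workflow metadata must declare a vertical")
    return vertical  # assumption: the queue is named after the vertical


def submit(store: InMemoryStore, metadata: dict, data: bytes) -> dict:
    """Upload the target, then decide where the workflow would be routed."""
    target_id = store.upload(data)    # upload target, obtain target_id
    queue = task_queue_for(metadata)  # validate and route to the vertical queue
    return {"target_id": target_id, "task_queue": queue}
```

The returned dictionary is what a real API would hand to Temporal when starting the workflow; the worker side (download, cache, execute) is left out.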
@@ -149,25 +155,27 @@ sequenceDiagram
## How Do Services Communicate?
- **Internally:** FastAPI talks to Prefect via REST; Prefect coordinates with workers over HTTP; workers manage containers via the Docker Engine API. All core services use pooled connections to PostgreSQL.
- **Externally:** Users interact via CLI or API clients (HTTP REST). The MCP server can automate workflows via its own protocol.
- **Internally:** FastAPI talks to Temporal via gRPC; Temporal coordinates with workers over gRPC; workers access MinIO via S3 API. All core services use pooled connections to PostgreSQL.
- **Externally:** Users interact via CLI or API clients (HTTP REST).
## How Is Security Enforced?
- **Container Isolation:** Each workflow runs in its own Docker network, as a non-root user, with strict resource limits and only necessary volumes mounted.
- **Volume Security:** Source code is mounted read-only; results are written to dedicated, temporary volumes.
- **API Security:** All endpoints require API keys, validate inputs, enforce rate limits, and log requests for auditing.
- **Worker Isolation:** Each workflow runs in an isolated vertical worker with a pre-defined toolchain.
- **Storage Security:** Uploaded files are stored in MinIO with lifecycle policies; access is read-only by default.
- **API Security:** All endpoints validate inputs, enforce rate limits, and log requests for auditing.
- **No Host Access:** Workers access targets via MinIO, not host filesystem.
## How Does FuzzForge Scale?
- **Horizontally:** Add more Prefect workers to handle more workflows in parallel. Scale the database with read replicas and connection pooling.
- **Vertically:** Adjust CPU and memory limits for containers and services as needed.
- **Horizontally:** Add more vertical workers to handle more workflows in parallel. Scale specific worker types based on demand.
- **Vertically:** Adjust CPU and memory limits for workers and adjust concurrent activity limits.
Example Docker Compose scaling:
```yaml
services:
  prefect-worker:
  worker-rust:
    deploy:
      replicas: 3 # Scale rust workers
      resources:
        limits:
          memory: 4G
@@ -179,21 +187,22 @@ services:
## How Is It Deployed?
- **Development:** All services run via Docker Compose—backend, Prefect, workers, database, and registry.
- **Production:** Add load balancers, database clustering, and multiple worker instances for high availability. Health checks, metrics, and centralized logging support monitoring and troubleshooting.
- **Development:** All services run via Docker Compose—backend, Temporal, vertical workers, database, and MinIO.
- **Production:** Add load balancers, Temporal clustering, database replication, and multiple worker instances for high availability. Health checks, metrics, and centralized logging support monitoring and troubleshooting.
## How Is Configuration Managed?
- **Environment Variables:** Control core settings like database URLs, registry location, and Prefect API endpoints.
- **Service Discovery:** Docker Composes internal DNS lets services find each other by name, with consistent port mapping and health check endpoints.
- **Environment Variables:** Control core settings like database URLs, MinIO endpoints, and Temporal addresses.
- **Service Discovery:** Docker Compose's internal DNS lets services find each other by name, with consistent port mapping and health check endpoints.
Example configuration:
```bash
COMPOSE_PROJECT_NAME=fuzzforge
DATABASE_URL=postgresql://postgres:postgres@postgres:5432/fuzzforge
PREFECT_API_URL=http://prefect-server:4200/api
DOCKER_REGISTRY=localhost:5001
DOCKER_INSECURE_REGISTRY=true
TEMPORAL_ADDRESS=temporal:7233
S3_ENDPOINT=http://minio:9000
S3_ACCESS_KEY=fuzzforge
S3_SECRET_KEY=fuzzforge123
```
## How Are Failures Handled?
@@ -203,9 +212,9 @@ DOCKER_INSECURE_REGISTRY=true
## Implementation Details
- **Tech Stack:** FastAPI (Python async), Prefect 3.x, Docker, Docker Compose, PostgreSQL (asyncpg), and Docker networking.
- **Performance:** Workflows start in 2-5 seconds; results are retrieved quickly thanks to caching and database indexing.
- **Extensibility:** Add new workflows by deploying new Docker images; extend the API with new endpoints; configure storage backends as needed.
- **Tech Stack:** FastAPI (Python async), Temporal, MinIO, Docker, Docker Compose, PostgreSQL (asyncpg), and boto3 (S3 client).
- **Performance:** Workflows start immediately (workers are long-lived); results are retrieved quickly thanks to MinIO caching and database indexing.
- **Extensibility:** Add new workflows by mounting code; add new vertical workers with specialized toolchains; extend the API with new endpoints.
---
+65 -53
@@ -22,58 +22,62 @@ FuzzForge relies on Docker containers for several key reasons:
Every workflow in FuzzForge is executed inside a Docker container. Here's what that means in practice:
- **Workflow containers** are built from language-specific base images (like Python or Node.js), with security tools and workflow code pre-installed.
- **Infrastructure containers** (API server, Prefect, database) use official images and are configured for the platforms needs.
- **Vertical worker containers** are built from language-specific base images with domain-specific security toolchains pre-installed (Android, Rust, Web, etc.).
- **Infrastructure containers** (API server, Temporal, MinIO, database) use official images and are configured for the platform's needs.
### Container Lifecycle: From Build to Cleanup
### Worker Lifecycle: From Build to Long-Running
The lifecycle of a workflow container looks like this:
The lifecycle of a vertical worker looks like this:
1. **Image Build:** A Docker image is built with all required tools and code.
2. **Image Push/Pull:** The image is pushed to (and later pulled from) a local or remote registry.
3. **Container Creation:** The container is created with the right volumes and environment.
4. **Execution:** The workflow runs inside the container.
5. **Result Storage:** Results are written to mounted volumes.
6. **Cleanup:** The container and any temporary data are removed.
1. **Image Build:** A Docker image is built with all required toolchains for the vertical.
2. **Worker Start:** The worker container starts as a long-lived process.
3. **Workflow Discovery:** Worker scans mounted `/app/toolbox` for workflows matching its vertical.
4. **Registration:** Workflows are registered with Temporal on the worker's task queue.
5. **Execution:** When a workflow is submitted, the worker downloads the target from MinIO and executes.
6. **Continuous Running:** Worker remains running, ready for the next workflow.
```mermaid
graph TB
Build[Build Image] --> Push[Push to Registry]
Push --> Pull[Pull Image]
Pull --> Create[Create Container]
Create --> Mount[Mount Volumes]
Mount --> Start[Start Container]
Start --> Execute[Run Workflow]
Execute --> Results[Store Results]
Execute --> Stop[Stop Container]
Stop --> Cleanup[Cleanup Data]
Cleanup --> Remove[Remove Container]
Build[Build Worker Image] --> Start[Start Worker Container]
Start --> Mount[Mount Toolbox Volume]
Mount --> Discover[Discover Workflows]
Discover --> Register[Register with Temporal]
Register --> Ready[Worker Ready]
Ready --> Workflow[Workflow Submitted]
Workflow --> Download[Download Target from MinIO]
Download --> Execute[Execute Workflow]
Execute --> Upload[Upload Results to MinIO]
Upload --> Ready
```
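The discovery step can be pictured as a directory scan that keeps only workflows whose metadata names the worker's vertical. The sketch below is an assumption about how such a scan might look, not the actual worker script: `discover_workflows` is a hypothetical helper, the `workflows/*/metadata.yaml` layout is inferred from the toolbox structure, and a regular expression stands in for a proper YAML parser to keep the example dependency-free.

```python
import re
from pathlib import Path

# Matches a top-level `vertical:` key, with or without quotes.
_VERTICAL_RE = re.compile(r"""^vertical:\s*["']?([\w-]+)["']?""", re.MULTILINE)


def discover_workflows(toolbox: Path, vertical: str) -> list[str]:
    """Return names of workflow directories whose metadata matches `vertical`."""
    matches = []
    for meta in sorted(toolbox.glob("workflows/*/metadata.yaml")):
        found = _VERTICAL_RE.search(meta.read_text(encoding="utf-8"))
        if found and found.group(1) == vertical:
            matches.append(meta.parent.name)
    return matches
```

A worker built this way would call something like `discover_workflows(Path("/app/toolbox"), "rust")` at startup and register the results on its task queue.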
---
## Whats Inside a Workflow Container?
## What's Inside a Vertical Worker Container?
A typical workflow container is structured like this:
A typical vertical worker container is structured like this:
- **Base Image:** Usually a slim language image (e.g., `python:3.11-slim`).
- **Base Image:** Language-specific image (e.g., `python:3.11-slim`).
- **System Dependencies:** Installed as needed (e.g., `git`, `curl`).
- **Security Tools:** Pre-installed (e.g., `semgrep`, `bandit`, `safety`).
- **Workflow Code:** Copied into the container.
- **Domain-Specific Toolchains:** Pre-installed (e.g., Rust: `AFL++`, `cargo-fuzz`; Android: `apktool`, `Frida`).
- **Temporal Python SDK:** For workflow execution.
- **Boto3:** For MinIO/S3 access.
- **Worker Script:** Discovers and registers workflows.
- **Non-root User:** Created for execution.
- **Entrypoint:** Runs the workflow code.
- **Entrypoint:** Runs the worker discovery and registration loop.
Example Dockerfile snippet:
Example Dockerfile snippet for Rust worker:
```dockerfile
FROM python:3.11-slim
RUN apt-get update && apt-get install -y git curl && rm -rf /var/lib/apt/lists/*
RUN pip install semgrep bandit safety
COPY ./toolbox /app/toolbox
RUN apt-get update && apt-get install -y git curl build-essential && rm -rf /var/lib/apt/lists/*
# Install AFL++, cargo, etc.
RUN pip install temporalio boto3 pydantic
COPY worker.py /app/
WORKDIR /app
RUN useradd -m -u 1000 fuzzforge
USER fuzzforge
CMD ["python", "-m", "toolbox.main"]
# Toolbox will be mounted as volume at /app/toolbox
CMD ["python", "worker.py"]
```
---
@@ -102,37 +106,42 @@ networks:
### Volume Types
- **Target Code Volume:** Mounts the code to be analyzed, read-only, into the container.
- **Result Volume:** Stores workflow results and artifacts, persists after container exit.
- **Temporary Volumes:** Used for scratch space, destroyed with the container.
- **Toolbox Volume:** Mounts the workflow code directory, read-only, for dynamic discovery.
- **Worker Cache:** Local cache for downloaded MinIO targets, with LRU eviction.
- **MinIO Data:** Persistent storage for uploaded targets and results (S3-compatible).
Example volume mount:
```yaml
volumes:
- "/host/path/to/code:/app/target:ro"
- "fuzzforge_prefect_storage:/app/prefect"
- "./toolbox:/app/toolbox:ro" # Workflow code
- "worker_cache:/cache" # Local cache
- "minio_data:/data" # MinIO storage
```
### Volume Security
- **Read-only Mounts:** Prevent workflows from modifying source code.
- **Isolated Results:** Each workflow writes to its own result directory.
- **No Arbitrary Host Access:** Only explicitly mounted paths are accessible.
- **Read-only Toolbox:** Workflows cannot modify the mounted toolbox code.
- **Isolated Storage:** Each workflow's target is stored with a unique `target_id` in MinIO.
- **No Host Filesystem Access:** Workers access targets via MinIO, not host paths.
- **Automatic Cleanup:** MinIO lifecycle policies delete old targets after 7 days.
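As a rough sketch of the 7-day cleanup, an S3-style lifecycle configuration can be built as a plain dictionary. The `expiry_rule` helper, the rule ID, and the empty prefix are illustrative assumptions; how FuzzForge actually provisions its MinIO lifecycle policy is not shown in this document.

```python
def expiry_rule(days: int = 7, prefix: str = "") -> dict:
    """Build an S3 lifecycle configuration that expires objects after `days`."""
    if days < 1:
        raise ValueError("days must be at least 1")
    return {
        "Rules": [
            {
                "ID": f"expire-after-{days}d",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},   # empty prefix = whole bucket
                "Expiration": {"Days": days},
            }
        ]
    }
```

A dictionary of this shape is what boto3's `put_bucket_lifecycle_configuration` expects as its `LifecycleConfiguration` argument, so it could be applied to a MinIO bucket through the S3 API.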
---
## How Are Images Built and Managed?
## How Are Worker Images Built and Managed?
- **Automated Builds:** Images are built and pushed to a local registry for development, or a secure registry for production.
- **Automated Builds:** Vertical worker images are built with specialized toolchains.
- **Build Optimization:** Use layer caching, multi-stage builds, and minimal base images.
- **Versioning:** Use tags (`latest`, semantic versions, or SHA digests) to track images.
- **Versioning:** Use tags (`latest`, semantic versions) to track worker images.
- **Long-Lived:** Workers run continuously, not ephemeral per-workflow.
Example build and push:
Example build:
```bash
docker build -t localhost:5001/fuzzforge-static-analysis:latest .
docker push localhost:5001/fuzzforge-static-analysis:latest
cd workers/rust
docker build -t fuzzforge-worker-rust:latest .
# Or via docker-compose
docker-compose -f docker-compose.temporal.yaml build worker-rust
```
---
@@ -147,7 +156,7 @@ Example resource config:
```yaml
services:
  prefect-worker:
  worker-rust:
    deploy:
      resources:
        limits:
@@ -156,6 +165,8 @@ services:
        reservations:
          memory: 1G
          cpus: '0.5'
    environment:
      MAX_CONCURRENT_ACTIVITIES: 5
```
---
@@ -172,7 +183,7 @@ Example security options:
```yaml
services:
  prefect-worker:
  worker-rust:
    security_opt:
      - no-new-privileges:true
    cap_drop:
@@ -188,8 +199,9 @@ services:
## How Is Performance Optimized?
- **Image Layering:** Structure Dockerfiles for efficient caching.
- **Dependency Preinstallation:** Reduce startup time by pre-installing dependencies.
- **Warm Containers:** Optionally pre-create containers for faster workflow startup.
- **Pre-installed Toolchains:** All tools installed in worker image, zero setup time per workflow.
- **Long-Lived Workers:** Eliminate container startup overhead entirely.
- **Local Caching:** MinIO targets cached locally for repeated workflows.
- **Horizontal Scaling:** Scale worker containers to handle more workflows in parallel.
---
@@ -205,10 +217,10 @@ services:
## How Does This All Fit Into FuzzForge?
- **Prefect Workers:** Manage the full lifecycle of workflow containers.
- **API Integration:** Exposes container status, logs, and resource metrics.
- **Volume Management:** Ensures results and artifacts are collected and persisted.
- **Security and Resource Controls:** Enforced automatically for every workflow.
- **Temporal Workers:** Long-lived vertical workers execute workflows with pre-installed toolchains.
- **API Integration:** Exposes workflow status, logs, and resource metrics via Temporal.
- **MinIO Storage:** Ensures targets and results are stored, cached, and cleaned up automatically.
- **Security and Resource Controls:** Enforced automatically for every worker and workflow.
---
+402
@@ -0,0 +1,402 @@
# Resource Management in FuzzForge
FuzzForge uses a multi-layered approach to manage CPU, memory, and concurrency for workflow execution. This keeps operation stable, prevents resource exhaustion, and makes performance predictable.
---
## Overview
Resource limiting in FuzzForge operates at three levels:
1. **Docker Container Limits** (Primary Enforcement) - Hard limits enforced by Docker
2. **Worker Concurrency Limits** - Controls parallel workflow execution
3. **Workflow Metadata** (Advisory) - Documents resource requirements
---
## Level 1: Docker Container Limits (Primary)
Docker container limits are the **primary enforcement mechanism** for CPU and memory resources. These are configured in `docker-compose.temporal.yaml` and enforced by the Docker runtime.
### Configuration
```yaml
services:
  worker-rust:
    deploy:
      resources:
        limits:
          cpus: '2.0' # Maximum 2 CPU cores
          memory: 2G # Maximum 2GB RAM
        reservations:
          cpus: '0.5' # Minimum 0.5 CPU cores reserved
          memory: 512M # Minimum 512MB RAM reserved
```
### How It Works
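- **CPU Limit**: Docker throttles the container's CPU time once it reaches the limit; the container keeps running, only slower
- **Memory Limit**: If the container exceeds its memory limit, the kernel's OOM killer terminates a process inside it, which typically takes the worker down
- **Reservations**: Set aside a minimum amount of CPU and memory for the worker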
### Example Configuration by Vertical
Different verticals have different resource needs:
**Rust Worker** (CPU-intensive fuzzing):
```yaml
worker-rust:
  deploy:
    resources:
      limits:
        cpus: '4.0'
        memory: 4G
```
**Android Worker** (Memory-intensive emulation):
```yaml
worker-android:
  deploy:
    resources:
      limits:
        cpus: '2.0'
        memory: 8G
```
**Web Worker** (Lightweight analysis):
```yaml
worker-web:
  deploy:
    resources:
      limits:
        cpus: '1.0'
        memory: 1G
```
### Monitoring Container Resources
Check real-time resource usage:
```bash
# Monitor all workers
docker stats
# Monitor specific worker
docker stats fuzzforge-worker-rust
# Output:
# CONTAINER CPU % MEM USAGE / LIMIT MEM %
# fuzzforge-worker-rust 85% 1.5GiB / 2GiB 75%
```
---
## Level 2: Worker Concurrency Limits
The `MAX_CONCURRENT_ACTIVITIES` environment variable controls how many workflows can execute **simultaneously** on a single worker.
### Configuration
```yaml
services:
worker-rust:
environment:
MAX_CONCURRENT_ACTIVITIES: 5
deploy:
resources:
limits:
memory: 2G
```
### How It Works
- **Total Container Memory**: 2GB
- **Concurrent Workflows**: 5
- **Memory per Workflow**: ~400MB (2GB ÷ 5)
If a 6th workflow is submitted, it **waits in the Temporal queue** until one of the 5 running workflows completes.
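The waiting behaviour can be illustrated with a semaphore. This is only an analogy, assuming nothing about FuzzForge's worker code: in the real system the extra work sits in the Temporal task queue, while here `run_with_limit` (a hypothetical helper) simply blocks extra coroutines until a slot frees up.

```python
import asyncio


async def run_with_limit(num_workflows: int, limit: int) -> int:
    """Run fake workflows under a concurrency cap; return the peak observed."""
    slots = asyncio.Semaphore(limit)
    running = 0
    peak = 0

    async def fake_workflow() -> None:
        nonlocal running, peak
        async with slots:  # waits here while all slots are taken
            running += 1
            peak = max(peak, running)
            await asyncio.sleep(0.01)  # stand-in for real work
            running -= 1

    await asyncio.gather(*(fake_workflow() for _ in range(num_workflows)))
    return peak
```

With six submissions and a limit of five, the peak never exceeds five; the sixth starts only after one of the others finishes. The Temporal Python SDK's `Worker` accepts a `max_concurrent_activities` option; whether the environment variable is wired to it here is an assumption.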
### Calculating Concurrency
Use this formula to determine `MAX_CONCURRENT_ACTIVITIES`:
```
MAX_CONCURRENT_ACTIVITIES = Container Memory Limit / Estimated Workflow Memory
```
**Example:**
- Container limit: 4GB
- Workflow memory: ~800MB
- Concurrency: 4GB ÷ 800MB = **5 concurrent workflows**
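The formula translates directly into a small helper. The function name and the choice to work in megabytes are assumptions for illustration; integer division rounds down so the estimate stays within the container limit.

```python
def max_concurrent_activities(container_limit_mb: int, workflow_peak_mb: int) -> int:
    """Estimate how many workflows fit in a container's memory limit."""
    if workflow_peak_mb <= 0:
        raise ValueError("workflow_peak_mb must be positive")
    # Round down, but always allow at least one workflow to run.
    return max(1, container_limit_mb // workflow_peak_mb)
```

For the example above, `max_concurrent_activities(4096, 800)` gives 5. Leaving some headroom for the worker process itself (for instance by subtracting a fixed overhead from the limit) would be a reasonable refinement.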
### Configuration Examples
**High Concurrency (Lightweight Workflows)**:
```yaml
worker-web:
  environment:
    MAX_CONCURRENT_ACTIVITIES: 10 # Many small workflows
  deploy:
    resources:
      limits:
        memory: 2G # ~200MB per workflow
```
**Low Concurrency (Heavy Workflows)**:
```yaml
worker-rust:
  environment:
    MAX_CONCURRENT_ACTIVITIES: 2 # Few large workflows
  deploy:
    resources:
      limits:
        memory: 4G # ~2GB per workflow
```
### Monitoring Concurrency
Check how many workflows are running:
```bash
# View worker logs
docker-compose -f docker-compose.temporal.yaml logs worker-rust | grep "Starting"
# Check Temporal UI
# Open http://localhost:8233
# Navigate to "Task Queues" → "rust" → See pending/running counts
```
---
## Level 3: Workflow Metadata (Advisory)
Workflow metadata in `metadata.yaml` documents resource requirements, but these are **advisory only** (except for timeout).
### Configuration
```yaml
# backend/toolbox/workflows/security_assessment/metadata.yaml
requirements:
resources:
memory: "512Mi" # Estimated memory usage (advisory)
cpu: "500m" # Estimated CPU usage (advisory)
timeout: 1800 # Execution timeout in seconds (ENFORCED)
```
### What's Enforced vs Advisory
| Field | Enforcement | Description |
|-------|-------------|-------------|
| `timeout` | ✅ **Enforced by Temporal** | Workflow is terminated if it exceeds the timeout |
| `memory` | ⚠️ Advisory only | Documents expected memory usage |
| `cpu` | ⚠️ Advisory only | Documents expected CPU usage |
### Why Metadata Is Useful
Even though `memory` and `cpu` are advisory, they're valuable for:
1. **Capacity Planning**: Determine appropriate container limits
2. **Concurrency Tuning**: Calculate `MAX_CONCURRENT_ACTIVITIES`
3. **Documentation**: Communicate resource needs to users
4. **Scheduling Hints**: Future horizontal scaling logic
### Timeout Enforcement
The `timeout` field is **enforced by Temporal**:
```python
from temporalio import workflow


# Temporal ends the workflow automatically once the timeout elapses
@workflow.defn
class SecurityAssessmentWorkflow:
    @workflow.run
    async def run(self, target_id: str):
        # If this takes longer than metadata.timeout (1800s),
        # Temporal ends the run with status TIMED_OUT
        ...
```
**Check timeout in Temporal UI:**
1. Open http://localhost:8233
2. Navigate to workflow execution
3. See "Timeout" in workflow details
4. If exceeded, status shows "TIMED_OUT"
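One way the metadata timeout could reach Temporal is by converting it to a `timedelta` before the workflow is started. The helper below is a sketch under assumptions: the function name and the 1800-second fallback are invented for illustration, and the nesting under `requirements.resources` follows the `metadata.yaml` example in this guide.

```python
from datetime import timedelta


def execution_timeout(metadata: dict, default_seconds: int = 1800) -> timedelta:
    """Read `requirements.resources.timeout` and return it as a timedelta."""
    resources = metadata.get("requirements", {}).get("resources", {})
    seconds = int(resources.get("timeout", default_seconds))
    if seconds <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    return timedelta(seconds=seconds)
```

The Temporal Python SDK's `Client.start_workflow` accepts an `execution_timeout` argument of this type, so the result could be passed straight through when the workflow is submitted.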
---
## Resource Management Best Practices
### 1. Set Conservative Container Limits
Start with lower limits and increase based on actual usage:
```yaml
# Start conservative
worker-rust:
deploy:
resources:
limits:
cpus: '2.0'
memory: 2G
# Monitor with: docker stats
# Increase if consistently hitting limits
```
### 2. Calculate Concurrency from Profiling
Profile a single workflow first:
```bash
# Run single workflow and monitor
docker stats fuzzforge-worker-rust
# Note peak memory usage (e.g., 800MB)
# Calculate concurrency: 4GB ÷ 800MB = 5
```
### 3. Set Realistic Timeouts
Base timeouts on actual workflow duration:
```yaml
# Static analysis: 5-10 minutes
timeout: 600
# Fuzzing: 1-24 hours
timeout: 86400
# Quick scans: 1-2 minutes
timeout: 120
```
### 4. Monitor Resource Exhaustion
Watch for these warning signs:
```bash
# Check for OOM kills
docker-compose -f docker-compose.temporal.yaml logs worker-rust | grep -i "oom\|killed"
# Check for CPU throttling
docker stats fuzzforge-worker-rust
# If CPU% consistently at limit → increase cpus
# Check for memory pressure
docker stats fuzzforge-worker-rust
# If MEM% consistently >90% → increase memory
```
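The memory check can be automated by parsing `docker stats` output. This sketch assumes the stats were captured with `docker stats --no-stream --format "{{.Name}} {{.MemPerc}}"`, which prints one `name percentage` pair per line; the `over_threshold` helper and the 90% default are illustrative.

```python
def over_threshold(stats_output: str, threshold: float = 90.0) -> list[str]:
    """Return container names whose memory percentage exceeds `threshold`."""
    flagged = []
    for line in stats_output.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # The last whitespace-separated field is the percentage, e.g. "93.50%".
        name, mem_perc = line.rsplit(maxsplit=1)
        if float(mem_perc.rstrip("%")) > threshold:
            flagged.append(name)
    return flagged
```

Feeding the flagged names into an alerting channel is left to the reader; the parsing is the only part shown here.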
### 5. Use Vertical-Specific Configuration
Different verticals have different needs:
| Vertical | CPU Priority | Memory Priority | Typical Config |
|----------|--------------|-----------------|----------------|
| Rust Fuzzing | High | Medium | 4 CPUs, 4GB RAM |
| Android Analysis | Medium | High | 2 CPUs, 8GB RAM |
| Web Scanning | Low | Low | 1 CPU, 1GB RAM |
| Static Analysis | Medium | Medium | 2 CPUs, 2GB RAM |
---
## Horizontal Scaling
To handle more workflows, scale worker containers horizontally:
```bash
# Scale rust worker to 3 instances
docker-compose -f docker-compose.temporal.yaml up -d --scale worker-rust=3
# Now you can run:
# - 3 workers × 5 concurrent activities = 15 workflows simultaneously
```
**How it works:**
- Temporal load balances across all workers on the same task queue
- Each worker has independent resource limits
- No shared state between workers
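The same arithmetic can be run in reverse to size a deployment. The `workers_needed` helper below is a hypothetical planning aid, assuming every worker instance uses the same `MAX_CONCURRENT_ACTIVITIES` value.

```python
import math


def workers_needed(target_concurrency: int, per_worker: int) -> int:
    """Return how many worker instances cover the desired concurrency."""
    if per_worker <= 0:
        raise ValueError("per_worker must be positive")
    # Round up: a partially used worker is still a whole container.
    return max(1, math.ceil(target_concurrency / per_worker))
```

For example, 15 simultaneous workflows at 5 per worker needs 3 instances, while 16 needs 4.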
---
## Troubleshooting Resource Issues
### Issue: Workflows Stuck in "Running" State
**Symptom:** Workflow shows RUNNING but makes no progress
**Diagnosis:**
```bash
# Check worker is alive
docker-compose -f docker-compose.temporal.yaml ps worker-rust
# Check worker resource usage
docker stats fuzzforge-worker-rust
# Check for OOM kills
docker-compose -f docker-compose.temporal.yaml logs worker-rust | grep -i oom
```
**Solution:**
- Increase memory limit if worker was killed
- Reduce `MAX_CONCURRENT_ACTIVITIES` if overloaded
- Check worker logs for errors
### Issue: "Too Many Pending Tasks"
**Symptom:** Temporal shows many queued workflows
**Diagnosis:**
```bash
# Check concurrent activities setting
docker exec fuzzforge-worker-rust env | grep MAX_CONCURRENT_ACTIVITIES
# Check current workload
docker-compose -f docker-compose.temporal.yaml logs worker-rust | grep "Starting"
```
**Solution:**
- Increase `MAX_CONCURRENT_ACTIVITIES` if resources allow
- Add more worker instances (horizontal scaling)
- Increase container resource limits
### Issue: Workflow Timeout
**Symptom:** Workflow shows "TIMED_OUT" in Temporal UI
**Diagnosis:**
1. Check `metadata.yaml` timeout setting
2. Check Temporal UI for execution duration
3. Determine if timeout is appropriate
**Solution:**
```yaml
# Increase timeout in metadata.yaml
requirements:
  resources:
    timeout: 3600 # Increased from 1800
```
---
## Summary
FuzzForge's resource management strategy:
1. **Docker Container Limits**: Primary enforcement (CPU/memory hard limits)
2. **Concurrency Limits**: Controls parallel workflows per worker
3. **Workflow Metadata**: Advisory resource hints + enforced timeout
**Key Takeaways:**
- Set conservative Docker limits and adjust based on monitoring
- Calculate `MAX_CONCURRENT_ACTIVITIES` from container memory ÷ workflow memory
- Use `docker stats` and Temporal UI to monitor resource usage
- Scale horizontally by adding more worker instances
- Set realistic timeouts based on actual workflow duration
---
**Next Steps:**
- Review `docker-compose.temporal.yaml` resource configuration
- Profile your workflows to determine actual resource usage
- Adjust limits based on monitoring data
- Set up alerts for resource exhaustion
+23 -22
@@ -25,30 +25,31 @@ Heres how a workflow moves through the FuzzForge system:
```mermaid
graph TB
User[User/CLI/API] --> API[FuzzForge API]
API --> Prefect[Prefect Orchestrator]
Prefect --> Worker[Prefect Worker]
Worker --> Container[Docker Container]
Container --> Tools[Security Tools]
API --> MinIO[MinIO Storage]
API --> Temporal[Temporal Orchestrator]
Temporal --> Worker[Vertical Worker]
Worker --> MinIO
Worker --> Tools[Security Tools]
Tools --> Results[SARIF Results]
Results --> Storage[Persistent Storage]
Results --> MinIO
```
**Key roles:**
- **User/CLI/API:** Submits and manages workflows.
- **FuzzForge API:** Validates, orchestrates, and tracks workflows.
- **Prefect Orchestrator:** Schedules and manages workflow execution.
- **Prefect Worker:** Runs the workflow in a Docker container.
- **User/CLI/API:** Submits workflows and uploads files.
- **FuzzForge API:** Validates, uploads targets, and tracks workflows.
- **Temporal Orchestrator:** Schedules and manages workflow execution.
- **Vertical Worker:** Long-lived worker with pre-installed security tools.
- **MinIO Storage:** Stores uploaded targets and results.
- **Security Tools:** Perform the actual analysis.
- **Persistent Storage:** Stores results and artifacts.
---
## Workflow Lifecycle: From Idea to Results
1. **Design:** Choose tools, define integration logic, set up parameters, and build the Docker image.
2. **Deployment:** Build and push the image, register the workflow, and configure defaults.
3. **Execution:** User submits a workflow; parameters and target are validated; the workflow is scheduled and executed in a container; tools run as designed.
4. **Completion:** Results are collected, normalized, and stored; status is updated; temporary resources are cleaned up; results are made available via API/CLI.
1. **Design:** Choose tools, define integration logic, set up parameters, and specify the vertical worker.
2. **Deployment:** Create workflow code, add metadata with `vertical` field, mount as volume in worker.
3. **Execution:** User submits a workflow with file upload; file is stored in MinIO; workflow is routed to vertical worker; worker downloads target and executes; tools run as designed.
4. **Completion:** Results are collected, normalized, and stored in MinIO; status is updated; MinIO lifecycle policies clean up old files; results are made available via API/CLI.
---
@@ -85,25 +86,25 @@ FuzzForge supports several workflow types, each optimized for a specific securit
## Data Flow and Storage
- **Input:** Target code and parameters are validated and mounted as read-only volumes.
- **Processing:** Tools are initialized and run (often in parallel); outputs are collected and normalized.
- **Output:** Results are stored in persistent volumes and indexed for fast retrieval; metadata is saved in the database; intermediate results may be cached for performance.
- **Input:** Target files uploaded via HTTP to MinIO; parameters validated and passed to Temporal.
- **Processing:** Worker downloads target from MinIO to local cache; tools are initialized and run (often in parallel); outputs are collected and normalized.
- **Output:** Results are stored in MinIO and indexed for fast retrieval; metadata is saved in PostgreSQL; targets cached locally for repeated workflows; lifecycle policies clean up after 7 days.
---
## Error Handling and Recovery
- **Tool-Level:** Timeouts, resource exhaustion, and crashes are handled gracefully; failed tools dont stop the workflow.
- **Workflow-Level:** Container failures, volume issues, and network problems are detected and reported.
- **Recovery:** Automatic retries for transient errors; partial results are returned when possible; workflows degrade gracefully if some tools are unavailable.
- **Tool-Level:** Timeouts, resource exhaustion, and crashes are handled gracefully; failed tools don't stop the workflow.
- **Workflow-Level:** Worker failures, storage issues, and network problems are detected and reported by Temporal.
- **Recovery:** Automatic retries for transient errors via Temporal; partial results are returned when possible; workflows degrade gracefully if some tools are unavailable; MinIO ensures targets remain accessible.
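Retries with exponential backoff are the usual shape of this kind of recovery. The sketch below computes the wait before each retry from an initial interval, a backoff coefficient, and a cap, which are the same knobs a Temporal retry policy exposes; the specific values and the `retry_delays` helper are assumptions, not FuzzForge's configuration.

```python
def retry_delays(initial: float, coefficient: float, maximum: float, retries: int) -> list[float]:
    """Return the delay in seconds before each retry, capped at `maximum`."""
    if initial <= 0 or coefficient < 1 or retries < 0:
        raise ValueError("invalid retry parameters")
    # Delay n grows geometrically until it hits the cap.
    return [min(initial * coefficient**n, maximum) for n in range(retries)]
```

With a 1-second initial interval, a coefficient of 2, and a 10-second cap, five retries wait 1, 2, 4, 8, and 10 seconds.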
---
## Performance and Optimization
- **Container Efficiency:** Docker images are layered and cached for fast startup; containers may be reused when safe.
- **Worker Efficiency:** Long-lived workers eliminate container startup overhead; pre-installed toolchains reduce setup time.
- **Parallel Processing:** Independent tools run concurrently to maximize CPU usage and minimize wait times.
- **Caching:** Images, dependencies, and intermediate results are cached to avoid unnecessary recomputation.
- **Caching:** MinIO targets are cached locally; repeated workflows reuse cached targets; worker cache uses LRU eviction.
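To make the LRU eviction idea concrete, here is a small sketch of a size-bounded cache index. The class name `TargetCache` and its methods are hypothetical and are not taken from the FuzzForge worker code; it only tracks ids and sizes, and a real cache would also delete the evicted files.

```python
from collections import OrderedDict
from typing import List, Optional


class TargetCache:
    """Tracks cached targets by size and evicts the least recently used."""

    def __init__(self, max_bytes: int) -> None:
        self.max_bytes = max_bytes
        # target_id -> size in bytes, ordered from least to most recently used
        self._entries: "OrderedDict[str, int]" = OrderedDict()

    def get(self, target_id: str) -> Optional[int]:
        """Return the cached size and mark the target as recently used."""
        if target_id not in self._entries:
            return None
        self._entries.move_to_end(target_id)
        return self._entries[target_id]

    def put(self, target_id: str, size: int) -> List[str]:
        """Add a target; return the ids evicted to stay under max_bytes."""
        self._entries[target_id] = size
        self._entries.move_to_end(target_id)
        evicted: List[str] = []
        while sum(self._entries.values()) > self.max_bytes and len(self._entries) > 1:
            old_id, _ = self._entries.popitem(last=False)
            evicted.append(old_id)
        return evicted
```

A repeated workflow calls `get()` first and only downloads from MinIO on a miss, which is what keeps frequently used targets in the cache.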
---
+153 -84
@@ -9,18 +9,18 @@ This guide will walk you through the process of creating a custom security analy
Before you start, make sure you have:
- A working FuzzForge development environment (see [Contributing](/reference/contributing.md))
- Familiarity with Python (async/await), Docker, and Prefect 3
- Familiarity with Python (async/await), Docker, and Temporal
- At least one custom or built-in module to use in your workflow
---
## Step 1: Understand Workflow Architecture
A FuzzForge workflow is a Prefect 3 flow that:
A FuzzForge workflow is a Temporal workflow that:
- Runs in an isolated Docker container
- Runs inside a long-lived vertical worker container (pre-built with toolchains)
- Orchestrates one or more analysis modules (scanner, analyzer, reporter, etc.)
- Handles secure volume mounting for code and results
- Downloads targets from MinIO (S3-compatible storage) automatically
- Produces standardized SARIF output
- Supports configurable parameters and resource limits
@@ -28,9 +28,9 @@ A FuzzForge workflow is a Prefect 3 flow that:
```
backend/toolbox/workflows/{workflow_name}/
├── workflow.py # Main workflow definition (Prefect flow)
├── Dockerfile # Container image definition
├── metadata.yaml # Workflow metadata and configuration
├── workflow.py # Main workflow definition (Temporal workflow)
├── activities.py # Workflow activities (optional)
├── metadata.yaml # Workflow metadata and configuration (must include vertical field)
└── requirements.txt # Additional Python dependencies (optional)
```
@@ -48,6 +48,7 @@ version: "1.0.0"
description: "Analyzes project dependencies for security vulnerabilities"
author: "FuzzingLabs Security Team"
category: "comprehensive"
vertical: "web" # REQUIRED: Which vertical worker to use (rust, android, web, etc.)
tags:
- "dependency-scanning"
- "vulnerability-analysis"
@@ -63,10 +64,6 @@ requirements:
parameters:
type: object
properties:
target_path:
type: string
default: "/workspace"
description: "Path to analyze"
scan_dev_dependencies:
type: boolean
description: "Include development dependencies"
@@ -85,36 +82,35 @@ output_schema:
description: "Scan execution summary"
```
**Important:** The `vertical` field determines which worker runs your workflow. Ensure the worker has the required tools installed.
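Since a missing or misspelled `vertical` means no worker will pick up the workflow, it can help to validate the field early. The sketch below assumes the metadata has already been loaded into a dictionary; the set of verticals and the `"<vertical>-queue"` naming scheme are illustrative assumptions, not the names FuzzForge necessarily uses.

```python
from typing import Any, Dict

# Assumption: the verticals named in this guide; the real set may differ.
KNOWN_VERTICALS = {"rust", "android", "web"}


def task_queue_for(metadata: Dict[str, Any]) -> str:
    """Return a task queue name for a workflow's metadata, or raise ValueError."""
    vertical = metadata.get("vertical")
    if not vertical:
        raise ValueError("metadata.yaml must include a 'vertical' field")
    if vertical not in KNOWN_VERTICALS:
        raise ValueError(f"unknown vertical: {vertical!r}")
    # Hypothetical naming scheme, for illustration only.
    return f"{vertical}-queue"
```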
---
## Step 3: Add Live Statistics to Your Workflow 🚦
Want real-time progress and stats for your workflow? FuzzForge supports live statistics reporting using Prefect and structured logging. This lets users (and the platform) monitor workflow progress, see live updates, and stream stats via API or WebSocket.
Want real-time progress and stats for your workflow? FuzzForge supports live statistics reporting using Temporal workflow logging. This lets users (and the platform) monitor workflow progress, see live updates, and stream stats via API or WebSocket.
### 1. Import Required Dependencies
```python
from prefect import task, get_run_context
from temporalio import workflow, activity
import logging
logger = logging.getLogger(__name__)
```
### 2. Create a Statistics Callback Function
### 2. Create a Statistics Callback in Activity
Add a callback that logs structured stats updates:
Add a callback that logs structured stats updates in your activity:
```python
@task(name="my_workflow_task")
async def my_workflow_task(workspace: Path, config: Dict[str, Any]) -> Dict[str, Any]:
# Get run context for statistics reporting
try:
context = get_run_context()
run_id = str(context.flow_run.id)
logger.info(f"Running task for flow run: {run_id}")
except Exception:
run_id = None
logger.warning("Could not get run context for statistics")
@activity.defn
async def my_workflow_activity(target_path: str, config: Dict[str, Any]) -> Dict[str, Any]:
    # Get activity info for run tracking
    info = activity.info()
    run_id = info.workflow_id
    logger.info(f"Running activity for workflow: {run_id}")
    # Define callback function for live statistics
    async def stats_callback(stats_data: Dict[str, Any]):
@@ -124,7 +120,7 @@ async def my_workflow_task(workspace: Path, config: Dict[str, Any]) -> Dict[str,
logger.info("LIVE_STATS", extra={
"stats_type": "live_stats", # Type of statistics
"workflow_type": "my_workflow", # Your workflow name
"run_id": stats_data.get("run_id"),
"run_id": run_id,
# Add your custom statistics fields here:
"progress": stats_data.get("progress", 0),
@@ -138,7 +134,7 @@ async def my_workflow_task(workspace: Path, config: Dict[str, Any]) -> Dict[str,
# Pass callback to your module/processor
processor = MyWorkflowModule()
result = await processor.execute(config, workspace, stats_callback=stats_callback)
result = await processor.execute(config, target_path, stats_callback=stats_callback)
return result.dict()
```
@@ -224,15 +220,16 @@ Live statistics automatically appear in:
#### Example: Adding Stats to a Security Scanner
```python
async def security_scan_task(workspace: Path, config: Dict[str, Any]):
context = get_run_context()
run_id = str(context.flow_run.id)
@activity.defn
async def security_scan_activity(target_path: str, config: Dict[str, Any]):
info = activity.info()
run_id = info.workflow_id
async def stats_callback(stats_data):
logger.info("LIVE_STATS", extra={
"stats_type": "scan_progress",
"workflow_type": "security_scan",
"run_id": stats_data.get("run_id"),
"run_id": run_id,
"files_scanned": stats_data.get("files_scanned", 0),
"vulnerabilities_found": stats_data.get("vulnerabilities_found", 0),
"scan_percentage": stats_data.get("scan_percentage", 0.0),
@@ -241,7 +238,7 @@ async def security_scan_task(workspace: Path, config: Dict[str, Any]):
})
scanner = SecurityScannerModule()
return await scanner.execute(config, workspace, stats_callback=stats_callback)
return await scanner.execute(config, target_path, stats_callback=stats_callback)
```
With these steps, your workflow will provide rich, real-time feedback to users and the FuzzForge platform—making automation more transparent and interactive!
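To see how a `LIVE_STATS` record can be picked up on the consuming side, here is a minimal sketch using only the standard `logging` module. The `LiveStatsCollector` class is hypothetical and is not how the FuzzForge backend is known to consume these records; it only shows that fields passed via `extra=` become attributes on the log record.

```python
import logging
from typing import Any, Dict, List


class LiveStatsCollector(logging.Handler):
    """Collects the extra fields of every record logged as LIVE_STATS."""

    FIELDS = ("stats_type", "workflow_type", "run_id", "progress")

    def __init__(self) -> None:
        super().__init__()
        self.updates: List[Dict[str, Any]] = []

    def emit(self, record: logging.LogRecord) -> None:
        if record.getMessage() != "LIVE_STATS":
            return
        # Fields passed via `extra=` are set as attributes on the record.
        self.updates.append({f: getattr(record, f, None) for f in self.FIELDS})
```

Attaching the handler to the same logger the activity uses is enough to collect every update in `collector.updates`.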
@@ -250,95 +247,167 @@ With these steps, your workflow will provide rich, real-time feedback to users a
## Step 4: Implement the Workflow Logic
Create a `workflow.py` file. This is where you define your Prefect flow and tasks.
Create a `workflow.py` file. This is where you define your Temporal workflow and activities.
Example (simplified):
```python
from pathlib import Path
from typing import Dict, Any
from prefect import flow, task
from temporalio import workflow, activity
from temporalio.common import RetryPolicy  # RetryPolicy is defined in temporalio.common, not temporalio.workflow
from datetime import timedelta
from src.toolbox.modules.dependency_scanner import DependencyScanner
from src.toolbox.modules.vulnerability_analyzer import VulnerabilityAnalyzer
from src.toolbox.modules.reporter import SARIFReporter
@task
async def scan_dependencies(workspace: Path, config: Dict[str, Any]) -> Dict[str, Any]:
@activity.defn
async def scan_dependencies(target_path: str, config: Dict[str, Any]) -> Dict[str, Any]:
    scanner = DependencyScanner()
    return (await scanner.execute(config, workspace)).dict()
    return (await scanner.execute(config, target_path)).dict()
@task
async def analyze_vulnerabilities(dependencies: Dict[str, Any], workspace: Path, config: Dict[str, Any]) -> Dict[str, Any]:
@activity.defn
async def analyze_vulnerabilities(dependencies: Dict[str, Any], target_path: str, config: Dict[str, Any]) -> Dict[str, Any]:
    analyzer = VulnerabilityAnalyzer()
    analyzer_config = {**config, 'dependencies': dependencies.get('findings', [])}
    return (await analyzer.execute(analyzer_config, workspace)).dict()
    return (await analyzer.execute(analyzer_config, target_path)).dict()
@task
async def generate_report(dep_results: Dict[str, Any], vuln_results: Dict[str, Any], config: Dict[str, Any], workspace: Path) -> Dict[str, Any]:
@activity.defn
async def generate_report(dep_results: Dict[str, Any], vuln_results: Dict[str, Any], config: Dict[str, Any]) -> Dict[str, Any]:
    reporter = SARIFReporter()
    all_findings = dep_results.get("findings", []) + vuln_results.get("findings", [])
    reporter_config = {**config, "findings": all_findings}
    return (await reporter.execute(reporter_config, workspace)).dict().get("sarif", {})
    return (await reporter.execute(reporter_config, None)).dict().get("sarif", {})
@flow(name="dependency_analysis")
async def main_flow(
    target_path: str = "/workspace",
    scan_dev_dependencies: bool = True,
    vulnerability_threshold: str = "medium"
) -> Dict[str, Any]:
    workspace = Path(target_path)
    scanner_config = {"scan_dev_dependencies": scan_dev_dependencies}
    analyzer_config = {"vulnerability_threshold": vulnerability_threshold}
    reporter_config = {}
@workflow.defn
class DependencyAnalysisWorkflow:
    @workflow.run
    async def run(
        self,
        target_id: str,  # Target file ID from MinIO (downloaded by worker automatically)
        scan_dev_dependencies: bool = True,
        vulnerability_threshold: str = "medium"
    ) -> Dict[str, Any]:
        workflow.logger.info(f"Starting dependency analysis for target: {target_id}")
    dep_results = await scan_dependencies(workspace, scanner_config)
    vuln_results = await analyze_vulnerabilities(dep_results, workspace, analyzer_config)
    sarif_report = await generate_report(dep_results, vuln_results, reporter_config, workspace)
    return sarif_report
        # Worker downloads target from MinIO to /cache/{target_id}
        target_path = f"/cache/{target_id}"
        scanner_config = {"scan_dev_dependencies": scan_dev_dependencies}
        analyzer_config = {"vulnerability_threshold": vulnerability_threshold}
        # Execute activities with retries and timeouts
        dep_results = await workflow.execute_activity(
            scan_dependencies,
            args=[target_path, scanner_config],
            start_to_close_timeout=timedelta(minutes=10),
            retry_policy=RetryPolicy(maximum_attempts=3)
        )
        vuln_results = await workflow.execute_activity(
            analyze_vulnerabilities,
            args=[dep_results, target_path, analyzer_config],
            start_to_close_timeout=timedelta(minutes=10),
            retry_policy=RetryPolicy(maximum_attempts=3)
        )
        sarif_report = await workflow.execute_activity(
            generate_report,
            args=[dep_results, vuln_results, {}],
            start_to_close_timeout=timedelta(minutes=5),
            retry_policy=RetryPolicy(maximum_attempts=3)
        )
        workflow.logger.info("Dependency analysis completed")
        return sarif_report
```
**Key differences from Prefect:**
- Use `@workflow.defn` class instead of `@flow` function
- Use `@activity.defn` instead of `@task`
- The workflow receives `target_id` (the MinIO UUID); the worker downloads the target automatically to `/cache/{target_id}`, and activities receive that local path
- Use `workflow.execute_activity()` with explicit timeouts and retry policies
- Use `workflow.logger` for logging (appears in Temporal UI)
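Because the cache path is built directly from `target_id`, it is worth checking that the id really is a UUID before using it in a path. This is a minimal sketch; the helper name `cached_target_path` is hypothetical, and the `/cache` root is simply the location used in the example above.

```python
import uuid
from pathlib import PurePosixPath

CACHE_ROOT = PurePosixPath("/cache")  # cache location used in this guide


def cached_target_path(target_id: str) -> str:
    """Validate that target_id is a UUID and return its path in the worker cache."""
    try:
        parsed = uuid.UUID(target_id)
    except ValueError as exc:
        raise ValueError(f"target_id is not a valid UUID: {target_id!r}") from exc
    return str(CACHE_ROOT / str(parsed))
```

Rejecting anything that does not parse as a UUID also rules out values such as `../etc/passwd` ending up in the path.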
---
## Step 5: Create the Dockerfile
## Step 5: No Dockerfile Needed! 🎉
Your workflow runs in a container. Create a `Dockerfile`:
**Good news:** You don't need to create a Dockerfile for your workflow. Workflows run inside pre-built **vertical worker containers** that already have toolchains installed.
```dockerfile
FROM python:3.11-slim
WORKDIR /app
RUN apt-get update && apt-get install -y git curl && rm -rf /var/lib/apt/lists/*
COPY ../../../pyproject.toml ./
COPY ../../../uv.lock ./
RUN pip install uv && uv sync --no-dev
COPY requirements.txt ./
RUN uv pip install -r requirements.txt
COPY ../../../ .
RUN mkdir -p /workspace
CMD ["uv", "run", "python", "-m", "src.toolbox.workflows.dependency_analysis.workflow"]
```
**How it works:**
1. Your workflow code lives in `backend/toolbox/workflows/{workflow_name}/`
2. This directory is **mounted as a volume** in the worker container at `/app/toolbox/workflows/`
3. Worker discovers and registers your workflow automatically on startup
4. When submitted, the workflow runs inside the long-lived worker container
**Benefits:**
- Zero container build time per workflow
- Instant code changes (just restart worker)
- All toolchains pre-installed (AFL++, cargo-fuzz, apktool, etc.)
- Consistent environment across all workflows of the same vertical
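The discovery step can be pictured as a directory scan that keeps only workflows whose `metadata.yaml` names the worker's vertical. The sketch below is an approximation under stated assumptions: the function names are hypothetical, and it reads the `vertical:` line with plain string handling instead of a YAML parser, which the real worker may well use.

```python
from pathlib import Path
from typing import List, Optional


def read_vertical(metadata_file: Path) -> Optional[str]:
    """Extract the top-level `vertical:` value without needing a YAML parser."""
    for line in metadata_file.read_text().splitlines():
        if line.startswith("vertical:"):
            value = line.split(":", 1)[1].split("#", 1)[0]
            return value.strip().strip("\"'") or None
    return None


def discover_workflows(root: Path, vertical: str) -> List[str]:
    """Return names of workflow directories whose metadata matches `vertical`."""
    found = []
    for workflow_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        metadata = workflow_dir / "metadata.yaml"
        if not (workflow_dir / "workflow.py").exists() or not metadata.exists():
            continue
        if read_vertical(metadata) == vertical:
            found.append(workflow_dir.name)
    return found
```

Running `discover_workflows(Path("backend/toolbox/workflows"), "web")` on the host gives a quick preview of what a web worker would be expected to register.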
---
## Step 6: Register and Test Your Workflow
## Step 6: Test Your Workflow
- Add your workflow to the registry (e.g., `backend/toolbox/workflows/registry.py`)
- Write a test script or use the CLI to submit a workflow run
- Check that SARIF results are produced and stored as expected
### Using the CLI
Example test:
```bash
# Start FuzzForge with Temporal
docker-compose -f docker-compose.temporal.yaml up -d
# Wait for services to initialize
sleep 10
# Submit workflow with file upload
cd test_projects/vulnerable_app/
fuzzforge workflow run dependency_analysis .
# CLI automatically:
# - Creates tarball of current directory
# - Uploads to MinIO via backend
# - Submits workflow with target_id
# - Worker downloads from MinIO and executes
```
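The tarball step the CLI performs can be approximated with the standard `tarfile` module. This is only a sketch of the idea, with a hypothetical helper name; the CLI's actual packaging logic (exclusions, compression settings) is not shown in this guide.

```python
import tarfile
from pathlib import Path


def make_tarball(source_dir: Path, output: Path) -> Path:
    """Pack source_dir into a gzip tarball with paths relative to source_dir."""
    with tarfile.open(output, "w:gz") as tar:
        # arcname="." keeps absolute host paths out of the archive.
        tar.add(source_dir, arcname=".")
    return output
```

Storing members relative to the project root means the archive unpacks the same way no matter where the project lived on the host.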
### Using Python SDK
```python
import asyncio
from backend.src.toolbox.workflows.dependency_analysis.workflow import main_flow
from fuzzforge_sdk import FuzzForgeClient
from pathlib import Path
async def test_workflow():
    result = await main_flow(target_path="/tmp/test-project", scan_dev_dependencies=True)
    print(result)
client = FuzzForgeClient(base_url="http://localhost:8000")
if __name__ == "__main__":
    asyncio.run(test_workflow())
# Submit with automatic upload
response = client.submit_workflow_with_upload(
    workflow_name="dependency_analysis",
    target_path=Path("/path/to/project"),
    parameters={
        "scan_dev_dependencies": True,
        "vulnerability_threshold": "medium"
    }
)
print(f"Workflow started: {response.run_id}")
# Wait for completion
final_status = client.wait_for_completion(response.run_id)
# Get findings
findings = client.get_run_findings(response.run_id)
print(findings.sarif)
client.close()
```
### Check Temporal UI
Open http://localhost:8233 to see:
- Workflow execution timeline
- Activity results
- Logs and errors
- Retry history
---
## Best Practices
+453
@@ -0,0 +1,453 @@
# Debugging Workflows and Modules
This guide shows you how to debug FuzzForge workflows and modules using Temporal's powerful debugging features.
---
## Quick Debugging Checklist
When something goes wrong:
1. **Check worker logs** - `docker-compose -f docker-compose.temporal.yaml logs worker-rust -f`
2. **Check Temporal UI** - http://localhost:8233 (visual execution history)
3. **Check MinIO console** - http://localhost:9001 (inspect uploaded files)
4. **Check backend logs** - `docker-compose -f docker-compose.temporal.yaml logs fuzzforge-backend -f`
---
## Debugging Workflow Discovery
### Problem: Workflow Not Found
**Symptom:** Worker logs show "No workflows found for vertical: rust"
**Debug Steps:**
1. **Check if worker can see the workflow:**
```bash
docker exec fuzzforge-worker-rust ls /app/toolbox/workflows/
```
2. **Check metadata.yaml exists:**
```bash
docker exec fuzzforge-worker-rust cat /app/toolbox/workflows/my_workflow/metadata.yaml
```
3. **Verify vertical field matches:**
```bash
docker exec fuzzforge-worker-rust grep "vertical:" /app/toolbox/workflows/my_workflow/metadata.yaml
```
Should output: `vertical: rust`
4. **Check worker logs for discovery errors:**
```bash
docker-compose -f docker-compose.temporal.yaml logs worker-rust | grep "my_workflow"
```
**Solution:**
- Ensure `metadata.yaml` has correct `vertical` field
- Restart worker to reload: `docker-compose -f docker-compose.temporal.yaml restart worker-rust`
- Check worker logs for discovery confirmation
---
## Debugging Workflow Execution
### Using Temporal Web UI
The Temporal UI at http://localhost:8233 is your primary debugging tool.
**Navigate to a workflow:**
1. Open http://localhost:8233
2. Click "Workflows" in left sidebar
3. Find your workflow by `run_id` or workflow name
4. Click to see detailed execution
**What you can see:**
- **Execution timeline** - When each activity started/completed
- **Input/output** - Exact parameters passed to workflow
- **Activity results** - Return values from each activity
- **Error stack traces** - Full Python tracebacks
- **Retry history** - All retry attempts with reasons
- **Worker information** - Which worker executed each activity
**Example: Finding why an activity failed:**
1. Open workflow in Temporal UI
2. Scroll to failed activity (marked in red)
3. Click on the activity
4. See full error message and stack trace
5. Check "Input" tab to see what parameters were passed
---
## Viewing Worker Logs
### Real-time Monitoring
```bash
# Follow logs from rust worker
docker-compose -f docker-compose.temporal.yaml logs worker-rust -f
# Follow logs from several workers at once
docker-compose -f docker-compose.temporal.yaml logs worker-rust worker-android -f
# Show last 100 lines
docker-compose -f docker-compose.temporal.yaml logs worker-rust --tail 100
```
### What Worker Logs Show
**On startup:**
```
INFO: Scanning for workflows in: /app/toolbox/workflows
INFO: Importing workflow module: toolbox.workflows.security_assessment.workflow
INFO: ✓ Discovered workflow: SecurityAssessmentWorkflow from security_assessment (vertical: rust)
INFO: 🚀 Worker started for vertical 'rust'
```
**During execution:**
```
INFO: Starting SecurityAssessmentWorkflow (workflow_id=security_assessment-abc123, target_id=548193a1...)
INFO: Downloading target from MinIO: 548193a1-f73f-4ec1-8068-19ec2660b8e4
INFO: Executing activity: scan_files
INFO: Completed activity: scan_files (duration: 3.2s)
```
**On errors:**
```
ERROR: Failed to import workflow module toolbox.workflows.broken.workflow:
File "/app/toolbox/workflows/broken/workflow.py", line 42
def run(
IndentationError: expected an indented block
```
### Filtering Logs
```bash
# Show only errors
docker-compose -f docker-compose.temporal.yaml logs worker-rust | grep ERROR
# Show workflow discovery
docker-compose -f docker-compose.temporal.yaml logs worker-rust | grep "Discovered workflow"
# Show specific workflow execution
docker-compose -f docker-compose.temporal.yaml logs worker-rust | grep "security_assessment-abc123"
# Show activity execution
docker-compose -f docker-compose.temporal.yaml logs worker-rust | grep "activity"
```
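When `grep` is not enough, the same logs can be summarised with a few lines of Python. The sketch below assumes the log format shown in the "During execution" sample above (`Completed activity: <name> (duration: <seconds>s)`); if your worker logs differ, the regular expression will need adjusting. The function name is hypothetical.

```python
import re
from typing import Dict, Iterable

# Matches lines like: "INFO: Completed activity: scan_files (duration: 3.2s)"
COMPLETED = re.compile(r"Completed activity: (\S+) \(duration: ([0-9.]+)s\)")


def slowest_activities(lines: Iterable[str]) -> Dict[str, float]:
    """Return the longest observed duration per activity, slowest first."""
    durations: Dict[str, float] = {}
    for line in lines:
        match = COMPLETED.search(line)
        if match:
            name, seconds = match.group(1), float(match.group(2))
            durations[name] = max(seconds, durations.get(name, 0.0))
    return dict(sorted(durations.items(), key=lambda kv: kv[1], reverse=True))
```

Passing `sys.stdin` as `lines` lets you pipe the output of the `logs worker-rust` command straight into it.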
---
## Debugging File Upload
### Check if File Was Uploaded
**Using MinIO Console:**
1. Open http://localhost:9001
2. Login: `fuzzforge` / `fuzzforge123`
3. Click "Buckets" → "targets"
4. Look for your `target_id` (UUID format)
5. Click to download and inspect locally
**Using CLI:**
```bash
# Check MinIO status
curl http://localhost:9000
# List backend logs for upload
docker-compose -f docker-compose.temporal.yaml logs fuzzforge-backend | grep "upload"
```
### Check Worker Cache
```bash
# List cached targets
docker exec fuzzforge-worker-rust ls -lh /cache/
# Check specific target
docker exec fuzzforge-worker-rust ls -lh /cache/548193a1-f73f-4ec1-8068-19ec2660b8e4
```
---
## Interactive Debugging
### Access Running Worker
```bash
# Open shell in worker container
docker exec -it fuzzforge-worker-rust bash
# Now you can:
# - Check filesystem
ls -la /app/toolbox/workflows/
# - Test imports
python3 -c "from toolbox.workflows.my_workflow.workflow import MyWorkflow; print(MyWorkflow)"
# - Check environment variables
env | grep TEMPORAL
# - Test activities
cd /app/toolbox/workflows/my_workflow
python3 -c "from activities import my_activity; print(my_activity)"
# - Check cache
ls -lh /cache/
```
### Test Module in Isolation
```bash
# Enter worker container
docker exec -it fuzzforge-worker-rust bash
# Navigate to module
cd /app/toolbox/modules/scanner
# Run module directly
python3 -c "
from file_scanner import FileScannerModule
scanner = FileScannerModule()
print(scanner.get_metadata())
"
```
---
## Debugging Module Code
### Edit and Reload
Since toolbox is mounted as a volume, you can edit code on your host and reload:
1. **Edit module on host:**
```bash
# On your host machine
vim backend/toolbox/modules/scanner/file_scanner.py
```
2. **Restart worker to reload:**
```bash
docker-compose -f docker-compose.temporal.yaml restart worker-rust
```
3. **Check discovery logs:**
```bash
docker-compose -f docker-compose.temporal.yaml logs worker-rust | tail -50
```
### Add Debug Logging
Add logging to your workflow or module:
```python
import logging
from temporalio import workflow

logger = logging.getLogger(__name__)

# some_variable and some_activity are placeholders for your own code
@workflow.defn
class MyWorkflow:
    @workflow.run
    async def run(self, target_id: str):
        workflow.logger.info(f"Starting with target_id: {target_id}")  # Shows in Temporal UI
        logger.info("Processing step 1")  # Shows in worker logs
        logger.debug(f"Debug info: {some_variable}")  # Shows if LOG_LEVEL=DEBUG
        try:
            result = await some_activity()
            logger.info(f"Activity result: {result}")
        except Exception as e:
            logger.error(f"Activity failed: {e}", exc_info=True)  # Full stack trace
            raise
```
Set debug logging:
```bash
# Edit docker-compose.temporal.yaml
services:
  worker-rust:
    environment:
      LOG_LEVEL: DEBUG  # Change from INFO to DEBUG
# Restart
docker-compose -f docker-compose.temporal.yaml restart worker-rust
```
---
## Common Issues and Solutions
### Issue: Workflow stuck in "Running" state
**Debug:**
1. Check Temporal UI for last completed activity
2. Check worker logs for errors
3. Check if worker is still running: `docker-compose -f docker-compose.temporal.yaml ps worker-rust`
**Solution:**
- Worker may have crashed - restart it
- Activity may be hanging - check for infinite loops or stuck network calls
- Check worker resource limits: `docker stats fuzzforge-worker-rust`
### Issue: Import errors in workflow
**Debug:**
1. Check worker logs for full error trace
2. Check if module file exists:
```bash
docker exec fuzzforge-worker-rust ls /app/toolbox/modules/my_module/
```
**Solution:**
- Ensure module is in correct directory
- Check for syntax errors: `docker exec fuzzforge-worker-rust python3 -m py_compile /app/toolbox/modules/my_module/my_module.py`
- Verify imports are correct
### Issue: Target file not found in worker
**Debug:**
1. Check if target exists in MinIO console
2. Check worker logs for download errors
3. Verify target_id is correct
**Solution:**
- Re-upload file via CLI
- Check MinIO is running: `docker-compose -f docker-compose.temporal.yaml ps minio`
- Check MinIO credentials in worker environment
---
## Performance Debugging
### Check Activity Duration
**In Temporal UI:**
1. Open workflow execution
2. Scroll through activities
3. Each shows duration (e.g., "3.2s")
4. Identify slow activities
### Monitor Resource Usage
```bash
# Monitor worker resource usage
docker stats fuzzforge-worker-rust
# Check worker logs for memory warnings
docker-compose -f docker-compose.temporal.yaml logs worker-rust | grep -i "memory\|oom"
```
### Profile Workflow Execution
Add timing to your workflow:
```python
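from temporalio import workflow

# activity1 and activity2 are placeholders for your own activity calls
@workflow.defn
class MyWorkflow:
    @workflow.run
    async def run(self, target_id: str):
        # workflow.time() is deterministic and replay-safe, unlike time.time()
        start = workflow.time()
        result1 = await activity1()
        workflow.logger.info(f"Activity1 took: {workflow.time() - start:.2f}s")
        start = workflow.time()
        result2 = await activity2()
        workflow.logger.info(f"Activity2 took: {workflow.time() - start:.2f}s")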
```
---
## Advanced Debugging
### Enable Temporal Worker Debug Logs
```bash
# Edit docker-compose.temporal.yaml
services:
  worker-rust:
    environment:
      TEMPORAL_LOG_LEVEL: DEBUG
      LOG_LEVEL: DEBUG
# Restart
docker-compose -f docker-compose.temporal.yaml restart worker-rust
```
### Inspect Temporal Workflows via CLI
```bash
# Run the Temporal CLI (tctl) inside the Temporal container
docker exec fuzzforge-temporal tctl
# List workflows
docker exec fuzzforge-temporal tctl workflow list
# Describe workflow
docker exec fuzzforge-temporal tctl workflow describe -w security_assessment-abc123
# Show workflow history
docker exec fuzzforge-temporal tctl workflow show -w security_assessment-abc123
```
### Check Network Connectivity
```bash
# From worker to Temporal
docker exec fuzzforge-worker-rust ping temporal
# From worker to MinIO
docker exec fuzzforge-worker-rust curl http://minio:9000
# From host to services
curl http://localhost:8233 # Temporal UI
curl http://localhost:9000 # MinIO
curl http://localhost:8000/health # Backend
```
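The host-side checks above can also be scripted. The following sketch only tests whether each TCP port accepts a connection; the helper names are hypothetical, and the ports are the host ports used throughout this guide. A successful connection shows that something is listening, not that the service behind it is healthy.

```python
import socket
from typing import Dict, Tuple

# Host ports taken from this guide.
SERVICES: Dict[str, Tuple[str, int]] = {
    "backend": ("localhost", 8000),
    "temporal-ui": ("localhost", 8233),
    "minio-api": ("localhost", 9000),
    "minio-console": ("localhost", 9001),
}


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(services: Dict[str, Tuple[str, int]] = SERVICES) -> Dict[str, bool]:
    """Map each service name to whether its port currently accepts connections."""
    return {name: port_open(host, port) for name, (host, port) in services.items()}
```

Calling `check_services()` with no arguments returns one boolean per service, which makes it easy to spot the one that is down.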
---
## Debugging Best Practices
1. **Always check Temporal UI first** - It shows the most complete execution history
2. **Use structured logging** - Include workflow_id, target_id in log messages
3. **Log at decision points** - Before/after each major operation
4. **Keep worker logs** - They persist across workflow runs
5. **Test modules in isolation** - Use `docker exec` to test before integrating
6. **Use debug logging** - Enable DEBUG logging during development
7. **Monitor resources** - Use `docker stats` to catch resource issues
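For the structured logging practice, one lightweight option is the standard library's `LoggerAdapter`, which attaches the same context fields to every message. This is a sketch with hypothetical names (`run_logger`, `CONTEXT_FORMAT`), not a FuzzForge convention.

```python
import logging


def run_logger(base: logging.Logger, workflow_id: str, target_id: str) -> logging.LoggerAdapter:
    """Return a logger that attaches workflow_id and target_id to every record."""
    return logging.LoggerAdapter(base, {"workflow_id": workflow_id, "target_id": target_id})


# Format string that prints the two context fields on every line.
CONTEXT_FORMAT = "%(levelname)s workflow_id=%(workflow_id)s target_id=%(target_id)s %(message)s"
```

With a handler that uses `logging.Formatter(CONTEXT_FORMAT)`, every line can then be filtered by `workflow_id` with the same `grep` commands shown earlier.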
---
## Getting Help
If you're still stuck:
1. **Collect diagnostic info:**
```bash
# Save all logs
docker-compose -f docker-compose.temporal.yaml logs > fuzzforge-logs.txt
# Check service status
docker-compose -f docker-compose.temporal.yaml ps > service-status.txt
```
2. **Check Temporal UI** and take screenshots of:
- Workflow execution timeline
- Failed activity details
- Error messages
3. **Report issue** with:
- Workflow name and run_id
- Error messages from logs
- Screenshots from Temporal UI
- Steps to reproduce
---
**Happy debugging!** 🐛🔍
+58 -43
@@ -10,15 +10,16 @@ Before diving into specific errors, let’s check the basics:
```bash
# Check all FuzzForge services
docker compose ps
docker-compose -f docker-compose.temporal.yaml ps
# Verify Docker registry config
# Verify Docker registry config (if using workflow registry)
docker info | grep -i "insecure registries"
# Test service health endpoints
curl http://localhost:8000/health
curl http://localhost:4200
curl http://localhost:5001/v2/
curl http://localhost:8233 # Temporal Web UI
curl http://localhost:9000 # MinIO API
curl http://localhost:9001 # MinIO Console
```
If any of these commands fail, note the error message and continue below.
@@ -51,15 +52,17 @@ Docker is trying to use HTTPS for the local registry, but it’s set up for HTTP
The registry isn’t running or the port is blocked.
**How to fix:**
- Make sure the registry container is up:
- Make sure the registry container is up (if using registry for workflow images):
```bash
docker compose ps registry
docker-compose -f docker-compose.temporal.yaml ps registry
```
- Check logs for errors:
```bash
docker compose logs registry
docker-compose -f docker-compose.temporal.yaml logs registry
```
- If port 5001 is in use, change it in `docker-compose.yaml` and your Docker config.
- If port 5001 is in use, change it in `docker-compose.temporal.yaml` and your Docker config.
**Note:** With Temporal architecture, target files use MinIO (port 9000), not the registry.
### "no such host" error
@@ -74,31 +77,42 @@ Docker can’t resolve `localhost`.
## Workflow Execution Issues
### "mounts denied" or volume errors
### Upload fails or file access errors
**What’s happening?**
Docker can’t access the path you provided.
**What's happening?**
File upload to MinIO failed or worker can't download target.
**How to fix:**
- Always use absolute paths.
- On Docker Desktop, add your project directory to File Sharing.
- Confirm the path exists and is readable.
### Workflow status is "Crashed" or "Late"
**What’s happening?**
- "Crashed": Usually a registry, path, or tool error.
- "Late": Worker is overloaded or system is slow.
**How to fix:**
- Check logs for details:
- Check MinIO is running:
```bash
docker compose logs prefect-worker | tail -50
docker-compose -f docker-compose.temporal.yaml ps minio
```
- Check MinIO logs:
```bash
docker-compose -f docker-compose.temporal.yaml logs minio
```
- Verify MinIO is accessible:
```bash
curl http://localhost:9000
```
- Check file size (max 10GB by default).
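To check the size limit before uploading, you can total the directory on the host. This is a minimal sketch with hypothetical helper names; it measures the uncompressed size, so it is a conservative check, and reading "10GB" as 10 × 10⁹ bytes is an assumption.

```python
import os
from pathlib import Path

# The guide states a 10GB default limit; treating GB as 10**9 bytes is an assumption.
MAX_UPLOAD_BYTES = 10 * 10**9


def directory_size(root: Path) -> int:
    """Sum the sizes of all regular files under root (symlinks are skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            if not path.is_symlink():
                total += path.stat().st_size
    return total


def fits_upload_limit(root: Path, limit: int = MAX_UPLOAD_BYTES) -> bool:
    """Return True if the uncompressed directory is within the limit."""
    return directory_size(root) <= limit
```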
### Workflow status is "Failed" or "Running" (stuck)
**What's happening?**
- "Failed": Usually a target download, storage, or tool error.
- "Running" (stuck): Worker is overloaded, target download failed, or worker crashed.
**How to fix:**
- Check worker logs for details:
```bash
docker-compose -f docker-compose.temporal.yaml logs worker-rust | tail -50
```
- Check Temporal Web UI at http://localhost:8233 for detailed execution history
- Restart services:
```bash
docker compose down
docker compose up -d
docker-compose -f docker-compose.temporal.yaml down
docker-compose -f docker-compose.temporal.yaml up -d
```
- Reduce the number of concurrent workflows if your system is resource-constrained.
@@ -106,22 +120,23 @@ Docker can’t access the path you provided.
## Service Connectivity Issues
### Backend (port 8000) or Prefect UI (port 4200) not responding
### Backend (port 8000) or Temporal UI (port 8233) not responding
**How to fix:**
- Check if the service is running:
```bash
docker compose ps fuzzforge-backend
docker compose ps prefect-server
docker-compose -f docker-compose.temporal.yaml ps fuzzforge-backend
docker-compose -f docker-compose.temporal.yaml ps temporal
```
- View logs for errors:
```bash
docker compose logs fuzzforge-backend --tail 50
docker compose logs prefect-server --tail 20
docker-compose -f docker-compose.temporal.yaml logs fuzzforge-backend --tail 50
docker-compose -f docker-compose.temporal.yaml logs temporal --tail 20
```
- Restart the affected service:
```bash
docker compose restart fuzzforge-backend
docker-compose -f docker-compose.temporal.yaml restart fuzzforge-backend
docker-compose -f docker-compose.temporal.yaml restart temporal
```
---
@@ -197,13 +212,13 @@ Docker can’t access the path you provided.
- Check Docker network configuration:
```bash
docker network ls
docker network inspect fuzzforge_default
docker network inspect fuzzforge-temporal_default
```
- Recreate the network:
```bash
docker compose down
docker-compose -f docker-compose.temporal.yaml down
docker network prune -f
docker compose up -d
docker-compose -f docker-compose.temporal.yaml up -d
```
---
@@ -229,10 +244,10 @@ Docker can’t access the path you provided.
### Enable debug logging
```bash
export PREFECT_LOGGING_LEVEL=DEBUG
docker compose down
docker compose up -d
docker compose logs fuzzforge-backend -f
export TEMPORAL_LOGGING_LEVEL=DEBUG
docker-compose -f docker-compose.temporal.yaml down
docker-compose -f docker-compose.temporal.yaml up -d
docker-compose -f docker-compose.temporal.yaml logs fuzzforge-backend -f
```
### Collect diagnostic info
@@ -243,12 +258,12 @@ Save and run this script to gather info for support:
#!/bin/bash
echo "=== FuzzForge Diagnostics ==="
date
docker compose ps
docker-compose -f docker-compose.temporal.yaml ps
docker info | grep -A 5 -i "insecure registries"
curl -s http://localhost:8000/health || echo "Backend unhealthy"
curl -s http://localhost:4200 >/dev/null && echo "Prefect UI healthy" || echo "Prefect UI unhealthy"
curl -s http://localhost:5001/v2/ >/dev/null && echo "Registry healthy" || echo "Registry unhealthy"
docker compose logs --tail 10
curl -s http://localhost:8233 >/dev/null && echo "Temporal UI healthy" || echo "Temporal UI unhealthy"
curl -s http://localhost:9000 >/dev/null && echo "MinIO healthy" || echo "MinIO unhealthy"
docker-compose -f docker-compose.temporal.yaml logs --tail 10
```
### Still stuck?
+44 -28
@@ -85,24 +85,23 @@ docker pull localhost:5001/hello-world 2>/dev/null || echo "Registry not accessi
Start all FuzzForge services:
```bash
docker compose up -d
docker-compose -f docker-compose.temporal.yaml up -d
```
This will start 8 services:
- **prefect-server**: Workflow orchestration server
- **prefect-worker**: Executes workflows in Docker containers
This will start 6+ services:
- **temporal**: Workflow orchestration server (includes embedded PostgreSQL for dev)
- **minio**: S3-compatible storage for uploaded targets and results
- **minio-setup**: One-time setup for MinIO buckets (exits after setup)
- **fuzzforge-backend**: FastAPI backend and workflow management
- **postgres**: Metadata and workflow state storage
- **redis**: Message broker and caching
- **registry**: Local Docker registry for workflow images
- **docker-proxy**: Secure Docker socket proxy
- **prefect-services**: Additional Prefect services
- **worker-rust**: Long-lived worker for Rust/native security analysis
- **worker-android**: Long-lived worker for Android security analysis (if configured)
- **worker-web**: Long-lived worker for web security analysis (if configured)
Wait for all services to be healthy (this may take 2-3 minutes on first startup):
```bash
# Check service health
docker compose ps
docker-compose -f docker-compose.temporal.yaml ps
# Verify FuzzForge is ready
curl http://localhost:8000/health
@@ -154,33 +153,41 @@ You should see 6 production workflows:
## Step 6: Run Your First Workflow
Let's run a static analysis workflow on one of the included vulnerable test projects.
Let's run a security assessment workflow on one of the included vulnerable test projects.
### Using the CLI (Recommended):
```bash
# Navigate to a test project
cd /path/to/fuzzforge/test_projects/static_analysis_vulnerable
cd /path/to/fuzzforge/test_projects/vulnerable_app
# Submit the workflow
fuzzforge runs submit static_analysis_scan .
# Submit the workflow - CLI automatically uploads the local directory
fuzzforge workflow run security_assessment .
# The CLI will:
# 1. Detect that '.' is a local directory
# 2. Create a compressed tarball of the directory
# 3. Upload it to the backend via HTTP
# 4. The backend stores it in MinIO
# 5. The worker downloads it when ready to analyze
# Monitor the workflow
fuzzforge runs status <run-id>
fuzzforge workflow status <run-id>
# View results when complete
fuzzforge findings get <run-id>
fuzzforge finding <run-id>
```
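Steps 1 and 2 of the CLI comments above can be reproduced by hand to inspect what would be uploaded. This is only a sketch: `pack_target` is a hypothetical helper, and the archive layout the CLI really produces is not documented here, so the relative-path layout below is an assumption.

```bash
# pack_target SRC_DIR OUT_FILE: create a gzip-compressed tarball of SRC_DIR.
# Paths inside the archive are relative to SRC_DIR (assumed layout).
# (Hypothetical helper for illustration; the real CLI does this internally.)
pack_target() {
  if [ ! -d "$1" ]; then
    echo "not a directory: $1" >&2
    return 1
  fi
  # -C switches into the directory first, so no absolute paths are stored.
  tar -czf "$2" -C "$1" .
}

# Example:
#   pack_target /path/to/fuzzforge/test_projects/vulnerable_app /tmp/target.tar.gz
#   tar -tzf /tmp/target.tar.gz | head
```

Write the output file outside the source directory, otherwise `tar` would try to include the archive in itself.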
### Using the API:
For local files, you can use the upload endpoint:
```bash
# Submit workflow
curl -X POST "http://localhost:8000/workflows/static_analysis_scan/submit" \
-H "Content-Type: application/json" \
-d '{
"target_path": "/path/to/your/project"
}'
# Create tarball and upload
tar -czf project.tar.gz -C /path/to/your/project .
curl -X POST "http://localhost:8000/workflows/security_assessment/upload-and-submit" \
-F "file=@project.tar.gz" \
-F "volume_mode=ro"
# Check status
curl "http://localhost:8000/runs/{run-id}/status"
@@ -189,6 +196,8 @@ curl "http://localhost:8000/runs/{run-id}/status"
curl "http://localhost:8000/runs/{run-id}/findings"
```
**Note**: The CLI handles file upload automatically. If the target already exists on the backend server's filesystem, path-based submission is still supported for backward compatibility.
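When scripting against the API, the status endpoint usually has to be polled until the run finishes. The sketch below assumes the response text contains a recognisable terminal word. The default pattern `COMPLETED|FAILED` is a guess, since the response schema is not shown in this guide, and `poll_until` is a hypothetical helper.

```bash
# poll_until CMD...: rerun CMD until its output matches $DONE_PATTERN,
# then print that output. Gives up after $POLL_MAX attempts.
# DONE_PATTERN defaults to an ASSUMED set of status words; adjust it to
# whatever the backend actually returns.
poll_until() {
  pattern="${DONE_PATTERN:-COMPLETED|FAILED}"
  delay="${POLL_DELAY:-5}"
  max="${POLL_MAX:-60}"
  n=0
  while [ "$n" -lt "$max" ]; do
    out="$("$@" 2>/dev/null)"
    if printf '%s\n' "$out" | grep -Eq "$pattern"; then
      printf '%s\n' "$out"
      return 0
    fi
    n=$((n + 1))
    sleep "$delay"
  done
  return 1
}

# Example (RUN_ID is the id returned at submission):
#   poll_until curl -s "http://localhost:8000/runs/$RUN_ID/status"
```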
## Step 7: Understanding the Results
The workflow will complete in 30-60 seconds and return results in SARIF format. For the test project, you should see:
@@ -216,13 +225,19 @@ Example output:
}
```
## Step 8: Access the Prefect Dashboard
## Step 8: Access the Temporal Web UI
You can monitor workflow execution in real-time using the Prefect dashboard:
You can monitor workflow execution in real time using the Temporal Web UI:
1. Open http://localhost:4200 in your browser
2. Navigate to "Flow Runs" to see workflow executions
3. Click on a run to see detailed logs and execution graph
1. Open http://localhost:8233 in your browser
2. Navigate to "Workflows" to see workflow executions
3. Click on a workflow to see detailed execution history and activity results
You can also access the MinIO console to view uploaded targets:
1. Open http://localhost:9001 in your browser
2. Log in with: `fuzzforge` / `fuzzforge123`
3. Browse the `targets` bucket to see uploaded files
## Next Steps
@@ -242,9 +257,10 @@ Congratulations! You've successfully:
If you encounter problems:
1. **Workflow crashes with registry errors**: Check Docker insecure registry configuration
2. **Services won't start**: Ensure ports 4200, 5001, 8000 are available
2. **Services won't start**: Ensure ports 8000, 8233, 9000, 9001 are available
3. **No findings returned**: Verify the target path contains analyzable code files
4. **CLI not found**: Ensure Python/pip installation path is in your PATH
5. **Upload fails**: Check that MinIO is running and accessible at http://localhost:9000
See the [Troubleshooting Guide](../how-to/troubleshooting.md) for detailed solutions.
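For the "services won't start" case, a quick way to see whether one of the listed ports is already taken is to attempt a local TCP connection. A minimal sketch, assuming bash (it relies on bash's `/dev/tcp` redirection); `port_in_use` and `report_ports` are hypothetical helper names:

```bash
#!/usr/bin/env bash
# port_in_use HOST PORT: succeeds if something accepts TCP connections there.
# Uses bash's /dev/tcp pseudo-device, so it needs bash rather than plain sh.
port_in_use() {
  (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# report_ports PORT...: print one line per port on the local machine.
report_ports() {
  for port in "$@"; do
    if port_in_use 127.0.0.1 "$port"; then
      echo "port $port: in use"
    else
      echo "port $port: free"
    fi
  done
}

# Example, run before starting the stack:
#   report_ports 8000 8233 9000 9001
```

A port reported as "in use" before the stack is started points at a conflicting process. After startup the same ports are expected to be in use.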