CI/CD Integration with Ephemeral Deployment Model (#14)

* feat: Complete migration from Prefect to Temporal

BREAKING CHANGE: Replaces Prefect workflow orchestration with Temporal

## Major Changes
- Replace Prefect with Temporal for workflow orchestration
- Implement vertical worker architecture (rust, android)
- Replace Docker registry with MinIO for unified storage
- Refactor activities to be co-located with workflows
- Update all API endpoints for Temporal compatibility

## Infrastructure
- New: docker-compose.temporal.yaml (Temporal + MinIO + workers)
- New: workers/ directory with rust and android vertical workers
- New: backend/src/temporal/ (manager, discovery)
- New: backend/src/storage/ (S3-cached storage with MinIO)
- New: backend/toolbox/common/ (shared storage activities)
- Deleted: docker-compose.yaml (old Prefect setup)
- Deleted: backend/src/core/prefect_manager.py
- Deleted: backend/src/services/prefect_stats_monitor.py
- Deleted: Docker registry and insecure-registries requirement
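
The shared storage activities give every worker the same S3-backed surface against MinIO. As a rough sketch (activity name and env defaults mirror the worker configuration table in workers/README.md; the body is an assumption, not the repository's exact code), a cached target download looks like:

```python
# Hypothetical sketch of a get_target storage activity; bucket name,
# env defaults, and caching policy are assumptions for illustration.
import os
from pathlib import Path

import boto3
from temporalio import activity


@activity.defn(name="get_target")
async def get_target_activity(target_id: str) -> str:
    """Download an uploaded target from MinIO into the local cache."""
    s3 = boto3.client(
        "s3",
        endpoint_url=os.getenv("S3_ENDPOINT", "http://minio:9000"),
        aws_access_key_id=os.getenv("S3_ACCESS_KEY", "fuzzforge"),
        aws_secret_access_key=os.getenv("S3_SECRET_KEY", "fuzzforge123"),
    )
    dest = Path(os.getenv("CACHE_DIR", "/cache")) / target_id
    if not dest.exists():  # simple cache: reuse a previously downloaded target
        dest.parent.mkdir(parents=True, exist_ok=True)
        s3.download_file(os.getenv("S3_BUCKET", "targets"), target_id, str(dest))
    return str(dest)
```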

## Workflows
- Migrated: security_assessment workflow to Temporal
- New: rust_test workflow (example/test workflow)
- Deleted: secret_detection_scan (Prefect-based, to be reimplemented)
- Activities now co-located with workflows for independent testing

## API Changes
- Updated: backend/src/api/workflows.py (Temporal submission)
- Updated: backend/src/api/runs.py (Temporal status/results)
- Updated: backend/src/main.py (727 lines, TemporalManager integration)
- Updated: All 16 MCP tools to use TemporalManager

## Testing
- All services healthy (Temporal, PostgreSQL, MinIO, workers, backend)
- All API endpoints functional
- End-to-end workflow test passed (72 findings from vulnerable_app)
- MinIO storage integration working (target upload/download, results)
- Worker activity discovery working (6 activities registered)
- Tarball extraction working
- SARIF report generation working
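
The SARIF output checked above follows the standard 2.1.0 envelope. A minimal sketch of the mapping (the finding fields rule_id/severity/file/line are assumptions, not the reporter's actual schema):

```python
# Illustrative SARIF 2.1.0 serialization; the field mapping is assumed.
import json


def to_sarif(findings: list[dict]) -> str:
    report = {
        "version": "2.1.0",
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "runs": [{
            "tool": {"driver": {"name": "FuzzForge", "rules": []}},
            "results": [{
                "ruleId": f["rule_id"],
                "level": f.get("severity", "warning"),  # error / warning / note
                "message": {"text": f["message"]},
                "locations": [{
                    "physicalLocation": {
                        "artifactLocation": {"uri": f["file"]},
                        "region": {"startLine": f.get("line", 1)},
                    }
                }],
            } for f in findings],
        }],
    }
    return json.dumps(report, indent=2)
```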

## Documentation
- ARCHITECTURE.md: Complete Temporal architecture documentation
- QUICKSTART_TEMPORAL.md: Getting started guide
- MIGRATION_DECISION.md: Why we chose Temporal over Prefect
- IMPLEMENTATION_STATUS.md: Migration progress tracking
- workers/README.md: Worker development guide

## Dependencies
- Added: temporalio>=1.6.0
- Added: boto3>=1.34.0 (MinIO S3 client)
- Removed: prefect>=3.4.18

* feat: Add Python fuzzing vertical with Atheris integration

This commit implements a complete Python fuzzing workflow using Atheris:

## Python Worker (workers/python/)
- Dockerfile with Python 3.11, Atheris, and build tools
- Generic worker.py for dynamic workflow discovery
- requirements.txt with temporalio, boto3, atheris dependencies
- Added to docker-compose.temporal.yaml with dedicated cache volume

## AtherisFuzzer Module (backend/toolbox/modules/fuzzer/)
- Reusable module extending BaseModule
- Auto-discovers fuzz targets (fuzz_*.py, *_fuzz.py, fuzz_target.py)
- Recursive search to find targets in nested directories
- Dynamically loads TestOneInput() function
- Configurable max_iterations and timeout
- Real-time stats callback support for live monitoring
- Returns findings as ModuleFinding objects

## Atheris Fuzzing Workflow (backend/toolbox/workflows/atheris_fuzzing/)
- Temporal workflow for orchestrating fuzzing
- Downloads user code from MinIO
- Executes AtherisFuzzer module
- Uploads results to MinIO
- Cleans up cache after execution
- metadata.yaml with vertical: python for routing

## Test Project (test_projects/python_fuzz_waterfall/)
- Demonstrates stateful waterfall vulnerability
- main.py with check_secret() that leaks progress
- fuzz_target.py with Atheris TestOneInput() harness
- Complete README with usage instructions
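
For reference, a minimal harness in the shape of the test project's files might look like the sketch below; check_secret() here is a stand-in for main.py's actual logic, included only to show why byte-by-byte comparisons form a coverage "waterfall" that Atheris can climb:

```python
# fuzz_target.py (illustrative): Atheris harness over a stand-in
# waterfall check; the real test project's code may differ.
import sys

import atheris

SECRET = b"fuzz"


def check_secret(data: bytes) -> bool:
    # Each matched byte takes a new branch, leaking progress as coverage.
    for i, expected in enumerate(SECRET):
        if i >= len(data) or data[i] != expected:
            return False
    return True


def TestOneInput(data: bytes) -> None:
    if check_secret(data):
        raise RuntimeError("secret reached")  # surfaces as a crash/finding


atheris.instrument_all()
atheris.Setup(sys.argv, TestOneInput)
atheris.Fuzz()
```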

## Backend Fixes
- Fixed parameter merging in REST API endpoints (workflows.py)
- Changed workflow parameter passing from positional args to kwargs (manager.py)
- Default parameters now properly merged with user parameters
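
The merge-order fix reduces to dict-union precedence, with user-supplied values overriding workflow defaults. A one-line sketch (variable names hypothetical):

```python
# User parameters must win over workflow defaults (names hypothetical).
merged_params = {**default_parameters, **user_parameters}
```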

## Testing
- Worker discovered AtherisFuzzingWorkflow
- Workflow executed end-to-end successfully
- Fuzz target auto-discovered in nested directories
- Atheris ran 100,000 iterations
- Results uploaded and cache cleaned

* chore: Complete Temporal migration with updated CLI/SDK/docs

This commit includes all remaining Temporal migration changes:

## CLI Updates (cli/)
- Updated workflow execution commands for Temporal
- Enhanced error handling and exceptions
- Updated dependencies in uv.lock

## SDK Updates (sdk/)
- Client methods updated for Temporal workflows
- Updated models for new workflow execution
- Updated dependencies in uv.lock

## Documentation Updates (docs/)
- Architecture documentation for Temporal
- Workflow concept documentation
- Resource management documentation (new)
- Debugging guide (new)
- Updated tutorials and how-to guides
- Troubleshooting updates

## README Updates
- Main README with Temporal instructions
- Backend README
- CLI README
- SDK README

## Other
- Updated IMPLEMENTATION_STATUS.md
- Removed old vulnerable_app.tar.gz

These changes complete the Temporal migration and ensure the
CLI/SDK work correctly with the new backend.

* fix: Use positional args instead of kwargs for Temporal workflows

The Temporal Python SDK's start_workflow() method doesn't accept
a 'kwargs' parameter. Workflows must receive parameters as positional
arguments via the 'args' parameter.

Changed from passing parameters via a 'kwargs' keyword to:
  args=workflow_args  # Positional arguments

This fixes the error:
  TypeError: Client.start_workflow() got an unexpected keyword argument 'kwargs'

Workflows now correctly receive parameters in order:
- security_assessment: [target_id, scanner_config, analyzer_config, reporter_config]
- atheris_fuzzing: [target_id, target_file, max_iterations, timeout_seconds]
- rust_test: [target_id, test_message]
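
Against the Temporal Python SDK this call shape looks roughly like the sketch below (queue name and workflow ID are illustrative; the argument order mirrors the list above):

```python
# Illustrative submission with positional args via temporalio.
from temporalio.client import Client


async def submit_atheris_run(target_id: str) -> None:
    client = await Client.connect("temporal:7233")
    await client.start_workflow(
        "AtherisFuzzingWorkflow",
        args=[target_id, "fuzz_target.py", 100_000, 600],  # positional, in order
        id=f"atheris-{target_id}",
        task_queue="python-queue",
    )
```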

* fix: Filter metadata-only parameters from workflow arguments

SecurityAssessmentWorkflow was receiving 7 arguments instead of 2-5.
The issue was that target_path and volume_mode from default_parameters
were being passed to the workflow, when they should only be used by
the system for configuration.

Now filters out metadata-only parameters (target_path, volume_mode)
before passing arguments to workflow execution.
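
A sketch of the filtering step (set contents from the description above; variable names hypothetical):

```python
# Strip system-only keys before building the positional argument list.
METADATA_ONLY = {"target_path", "volume_mode"}

workflow_params = {
    k: v for k, v in merged_params.items() if k not in METADATA_ONLY
}
```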

* refactor: Remove Prefect leftovers and volume mounting legacy

Complete cleanup of Prefect migration artifacts:

Backend:
- Delete registry.py and workflow_discovery.py (Prefect-specific files)
- Remove Docker validation from setup.py (no longer needed)
- Remove ResourceLimits and VolumeMount models
- Remove target_path and volume_mode from WorkflowSubmission
- Remove supported_volume_modes from API and discovery
- Clean up metadata.yaml files (remove volume/path fields)
- Simplify parameter filtering in manager.py

SDK:
- Remove volume_mode parameter from client methods
- Remove ResourceLimits and VolumeMount models
- Remove Prefect error patterns from docker_logs.py
- Clean up WorkflowSubmission and WorkflowMetadata models

CLI:
- Remove Volume Modes display from workflow info

All removed features are Prefect-specific or Docker volume mounting
artifacts. Temporal workflows use MinIO storage exclusively.

* feat: Add comprehensive test suite and benchmark infrastructure

- Add 68 unit tests for fuzzer, scanner, and analyzer modules
- Implement pytest-based test infrastructure with fixtures
- Add 6 performance benchmarks with category-specific thresholds
- Configure GitHub Actions for automated testing and benchmarking
- Add test and benchmark documentation

Test coverage:
- AtherisFuzzer: 8 tests
- CargoFuzzer: 14 tests
- FileScanner: 22 tests
- SecurityAnalyzer: 24 tests

- All tests passing (68/68)
- All benchmarks passing (6/6)
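
A representative unit test might look like the following sketch; the AtherisFuzzer import path and discovery method are assumptions for illustration, not the suite's actual code:

```python
# Hypothetical pytest case in the style of the suite.
from pathlib import Path


def test_fuzz_target_discovered_in_nested_dir(tmp_path: Path):
    nested = tmp_path / "pkg" / "sub"
    nested.mkdir(parents=True)
    (nested / "fuzz_target.py").write_text("def TestOneInput(data): pass\n")

    from toolbox.modules.fuzzer.atheris_fuzzer import AtherisFuzzer  # assumed path
    targets = AtherisFuzzer().discover_targets(tmp_path)  # assumed method name

    assert any(t.name == "fuzz_target.py" for t in targets)
```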

* fix: Resolve all ruff linting violations across codebase

Fixed 27 ruff violations in 12 files:
- Removed unused imports (Depends, Dict, Any, Optional, etc.)
- Fixed undefined workflow_info variable in workflows.py
- Removed dead code with undefined variables in atheris_fuzzer.py
- Changed f-string to regular string where no placeholders used

All files now pass ruff checks for CI/CD compliance.

* fix: Configure CI for unit tests only

- Renamed docker-compose.temporal.yaml → docker-compose.yml for CI compatibility
- Commented out integration-tests job (no integration tests yet)
- Updated test-summary to only depend on lint and unit-tests

CI will now run successfully with 68 unit tests. Integration tests can be added later.

* feat: Add CI/CD integration with ephemeral deployment model

Implements comprehensive CI/CD support for FuzzForge with on-demand worker management:

**Worker Management (v0.7.0)**
- Add WorkerManager for automatic worker lifecycle control
- Auto-start workers from stopped state when workflows execute
- Auto-stop workers after workflow completion
- Health checks and startup timeout handling (90s default)
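
A minimal sketch of the auto-start idea using the Docker SDK for Python (WorkerManager's real interface may differ; the container naming and 90s timeout follow the description above):

```python
# Illustrative worker auto-start with health polling.
import time

import docker


def ensure_worker_running(vertical: str, timeout: int = 90) -> None:
    client = docker.from_env()
    container = client.containers.get(f"fuzzforge-worker-{vertical}")
    if container.status != "running":
        container.start()
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        container.reload()  # refresh cached status from the daemon
        if container.status == "running":
            return
        time.sleep(2)
    raise TimeoutError(f"worker for '{vertical}' did not start within {timeout}s")
```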

**CI/CD Features**
- `--fail-on` flag: Fail builds based on SARIF severity levels (error/warning/note/info)
- `--export-sarif` flag: Export findings in SARIF 2.1.0 format
- `--auto-start`/`--auto-stop` flags: Control worker lifecycle
- Exit code propagation: Returns 1 on blocking findings, 0 on success

**Exit Code Fix**
- Add `except typer.Exit: raise` handlers at 3 critical locations
- Move worker cleanup to finally block for guaranteed execution
- Exit codes now propagate correctly even when build fails
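
Put together, the gate and the exit-code handling look roughly like this Typer sketch (helper functions are hypothetical; the severity ordering follows the --fail-on levels above):

```python
# Illustrative --fail-on gate with guaranteed cleanup.
import typer

SEVERITY_ORDER = ["info", "note", "warning", "error"]  # ascending


def run_and_gate(fail_on: str) -> None:
    try:
        results = run_workflow_and_collect_sarif()  # hypothetical helper
        threshold = SEVERITY_ORDER.index(fail_on)
        blocking = [
            r for r in results
            if SEVERITY_ORDER.index(r.get("level", "warning")) >= threshold
        ]
        if blocking:
            typer.echo(f"{len(blocking)} blocking finding(s) at or above '{fail_on}'")
            raise typer.Exit(code=1)  # fail the build
    except typer.Exit:
        raise  # re-raise so the code reaches the shell instead of being swallowed
    finally:
        stop_workers_if_requested()  # hypothetical cleanup, always runs
```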

**CI Scripts & Examples**
- ci-start.sh: Start FuzzForge services with health checks
- ci-stop.sh: Clean shutdown with volume preservation option
- GitHub Actions workflow example (security-scan.yml)
- GitLab CI pipeline example (.gitlab-ci.example.yml)
- docker-compose.ci.yml: CI-optimized compose file with profiles

**OSS-Fuzz Integration**
- New ossfuzz_campaign workflow for running OSS-Fuzz projects
- OSS-Fuzz worker with Docker-in-Docker support
- Configurable campaign duration and project selection

**Documentation**
- Comprehensive CI/CD integration guide (docs/how-to/cicd-integration.md)
- Updated architecture docs with worker lifecycle details
- Updated workspace isolation documentation
- CLI README with worker management examples

**SDK Enhancements**
- Add get_workflow_worker_info() endpoint
- Worker vertical metadata in workflow responses

**Testing**
- All workflows tested: security_assessment, atheris_fuzzing, secret_detection, cargo_fuzzing
- All monitoring commands tested: stats, crashes, status, finding
- Full CI pipeline simulation verified
- Exit codes verified for success/failure scenarios

Ephemeral CI/CD model: ~3-4GB RAM, ~60-90s startup, runs entirely in CI containers.

* fix: Resolve ruff linting violations in CI/CD code

- Remove unused variables (run_id, defaults, result)
- Remove unused imports
- Fix f-string without placeholders

All CI/CD integration files now pass ruff checks.
Committed by tduhamel42 on 2025-10-14 10:13:45 +02:00 (via GitHub)
commit 60ca088ecf · parent 987c49569c
167 changed files with 26101 additions and 5703 deletions

workers/README.md (new file)

@@ -0,0 +1,353 @@
# FuzzForge Vertical Workers
This directory contains vertical-specific worker implementations for the Temporal architecture.
## Architecture
Each vertical worker is a long-lived container pre-built with domain-specific security toolchains:
```
workers/
├── rust/ # Rust/Native security (AFL++, cargo-fuzz, gdb, valgrind)
├── android/ # Android security (apktool, Frida, jadx, MobSF)
├── web/ # Web security (OWASP ZAP, semgrep, eslint)
├── ios/ # iOS security (class-dump, Clutch, Frida)
├── blockchain/ # Smart contract security (mythril, slither, echidna)
└── go/ # Go security (go-fuzz, staticcheck, gosec)
```
## How It Works
1. **Worker Startup**: Worker discovers workflows from `/app/toolbox/workflows`
2. **Filtering**: Only loads workflows where `metadata.yaml` has `vertical: <name>`
3. **Dynamic Import**: Dynamically imports workflow Python modules
4. **Registration**: Registers discovered workflows with Temporal
5. **Processing**: Polls Temporal task queue for work
## Adding a New Vertical
### Step 1: Create Worker Directory
```bash
mkdir -p workers/my_vertical
cd workers/my_vertical
```
### Step 2: Create Dockerfile
```dockerfile
# workers/my_vertical/Dockerfile
FROM python:3.11-slim
# Install your vertical-specific tools
RUN apt-get update && apt-get install -y \
    tool1 \
    tool2 \
    tool3 \
    && rm -rf /var/lib/apt/lists/*
# Install Python dependencies
COPY requirements.txt /tmp/
RUN pip install --no-cache-dir -r /tmp/requirements.txt
# Copy worker files
COPY worker.py /app/worker.py
COPY activities.py /app/activities.py
WORKDIR /app
ENV PYTHONPATH="/app:/app/toolbox:${PYTHONPATH}"
ENV PYTHONUNBUFFERED=1
CMD ["python", "worker.py"]
```
### Step 3: Copy Worker Files
```bash
# Copy from rust worker as template
cp workers/rust/worker.py workers/my_vertical/
cp workers/rust/activities.py workers/my_vertical/
cp workers/rust/requirements.txt workers/my_vertical/
```
**Note**: The worker.py and activities.py are generic and work for all verticals. You only need to customize the Dockerfile with your tools.
### Step 4: Add to docker-compose.yml
Add profiles to prevent auto-start:
```yaml
  worker-my-vertical:
    build:
      context: ./workers/my_vertical
      dockerfile: Dockerfile
    container_name: fuzzforge-worker-my-vertical
    profiles:             # ← Prevents auto-start (saves RAM)
      - workers
      - my_vertical
    depends_on:
      temporal:
        condition: service_healthy
      minio:
        condition: service_healthy
    environment:
      TEMPORAL_ADDRESS: temporal:7233
      WORKER_VERTICAL: my_vertical  # ← Important: matches metadata.yaml
      WORKER_TASK_QUEUE: my-vertical-queue
      MAX_CONCURRENT_ACTIVITIES: 5
      # MinIO configuration (same for all workers)
      STORAGE_BACKEND: s3
      S3_ENDPOINT: http://minio:9000
      S3_ACCESS_KEY: fuzzforge
      S3_SECRET_KEY: fuzzforge123
      S3_BUCKET: targets
      CACHE_DIR: /cache
    volumes:
      - ./backend/toolbox:/app/toolbox:ro
      - worker_my_vertical_cache:/cache
    networks:
      - fuzzforge-network
    restart: "no"         # ← Don't auto-restart
```
**Why profiles?** Workers are pre-built but don't auto-start, saving ~1-2GB RAM per worker when idle.
### Step 5: Add Volume
```yaml
volumes:
  worker_my_vertical_cache:
    name: fuzzforge_worker_my_vertical_cache
```
### Step 6: Create Workflows for Your Vertical
```bash
mkdir -p backend/toolbox/workflows/my_workflow
```
**metadata.yaml:**
```yaml
name: my_workflow
version: 1.0.0
vertical: my_vertical # ← Must match WORKER_VERTICAL
```
**workflow.py:**
```python
from temporalio import workflow
from datetime import timedelta


@workflow.defn
class MyWorkflow:
    @workflow.run
    async def run(self, target_id: str) -> dict:
        # Download target
        target_path = await workflow.execute_activity(
            "get_target",
            target_id,
            start_to_close_timeout=timedelta(minutes=5)
        )

        # Your analysis logic here
        results = {"status": "success"}

        # Cleanup
        await workflow.execute_activity(
            "cleanup_cache",
            target_path,
            start_to_close_timeout=timedelta(minutes=1)
        )

        return results
```
### Step 7: Test
```bash
# Start services
docker-compose -f docker-compose.temporal.yaml up -d
# Check worker logs
docker logs -f fuzzforge-worker-my-vertical
# You should see:
# "Discovered workflow: MyWorkflow from my_workflow (vertical: my_vertical)"
```
## Worker Components
### worker.py
Generic worker entrypoint. Handles:
- Workflow discovery from mounted `/app/toolbox`
- Dynamic import of workflow modules
- Connection to Temporal
- Task queue polling
**No customization needed** - works for all verticals.
### activities.py
Common activities available to all workflows:
- `get_target(target_id: str) -> str`: Download target from MinIO
- `cleanup_cache(target_path: str) -> None`: Remove cached target
- `upload_results(workflow_id, results, format) -> str`: Upload results to MinIO
**Can be extended** with vertical-specific activities:
```python
# workers/my_vertical/activities.py
from temporalio import activity


@activity.defn(name="my_custom_activity")
async def my_custom_activity(input_data: str) -> str:
    # Your vertical-specific logic
    return "result"

# Add to worker.py activities list:
# activities=[..., my_custom_activity]
```
### Dockerfile
**Only component that needs customization** for each vertical. Install your tools here.
## Configuration
### Environment Variables
All workers support these environment variables:
| Variable | Default | Description |
|----------|---------|-------------|
| `TEMPORAL_ADDRESS` | `localhost:7233` | Temporal server address |
| `TEMPORAL_NAMESPACE` | `default` | Temporal namespace |
| `WORKER_VERTICAL` | `rust` | Vertical name (must match metadata.yaml) |
| `WORKER_TASK_QUEUE` | `{vertical}-queue` | Task queue name |
| `MAX_CONCURRENT_ACTIVITIES` | `5` | Max concurrent activities per worker |
| `S3_ENDPOINT` | `http://minio:9000` | MinIO/S3 endpoint |
| `S3_ACCESS_KEY` | `fuzzforge` | S3 access key |
| `S3_SECRET_KEY` | `fuzzforge123` | S3 secret key |
| `S3_BUCKET` | `targets` | Bucket for uploaded targets |
| `CACHE_DIR` | `/cache` | Local cache directory |
| `CACHE_MAX_SIZE` | `10GB` | Max cache size (not enforced yet) |
| `LOG_LEVEL` | `INFO` | Logging level |
## Scaling
### Vertical Scaling (More Work Per Worker)
Increase concurrent activities:
```yaml
environment:
MAX_CONCURRENT_ACTIVITIES: 10 # Handle 10 tasks at once
```
### Horizontal Scaling (More Workers)
```bash
# Scale to 3 workers for rust vertical
docker-compose -f docker-compose.temporal.yaml up -d --scale worker-rust=3
# Each worker polls the same task queue
# Temporal automatically load balances
```
## Troubleshooting
### Worker Not Discovering Workflows
Check:
1. Volume mount is correct: `./backend/toolbox:/app/toolbox:ro`
2. Workflow has `metadata.yaml` with correct `vertical:` field
3. Workflow has `workflow.py` with `@workflow.defn` decorated class
4. Worker logs show discovery attempt
### Cannot Connect to Temporal
Check:
1. Temporal container is healthy: `docker ps`
2. Network connectivity: `docker exec worker-rust ping temporal`
3. `TEMPORAL_ADDRESS` environment variable is correct
### Cannot Download from MinIO
Check:
1. MinIO is healthy: `docker ps`
2. Buckets exist: `docker exec fuzzforge-minio mc ls fuzzforge/targets`
3. S3 credentials are correct
4. Target was uploaded: Check MinIO console at http://localhost:9001
### Activity Timeouts
Increase timeout in workflow:
```python
await workflow.execute_activity(
    "my_activity",
    args,
    start_to_close_timeout=timedelta(hours=2)  # Increase from default
)
```
## Best Practices
1. **Keep Dockerfiles lean**: Only install necessary tools
2. **Use multi-stage builds**: Reduce final image size
3. **Pin tool versions**: Ensure reproducibility
4. **Log liberally**: Helps debugging workflow issues
5. **Handle errors gracefully**: Don't fail workflow for non-critical issues
6. **Test locally first**: Use docker-compose before deploying
## On-Demand Worker Management
Workers use Docker Compose profiles and CLI-managed lifecycle for resource optimization.
### How It Works
1. **Build Time**: `docker-compose build` creates all worker images
2. **Startup**: Workers DON'T auto-start with `docker-compose up -d`
3. **On Demand**: CLI starts workers automatically when workflows need them
4. **Shutdown**: Optional auto-stop after workflow completion
### Manual Control
```bash
# Start specific worker
docker start fuzzforge-worker-ossfuzz
# Stop specific worker
docker stop fuzzforge-worker-ossfuzz
# Check worker status
docker ps --filter "name=fuzzforge-worker"
```
### CLI Auto-Management
```bash
# Auto-start enabled by default
ff workflow run ossfuzz_campaign . project_name=zlib
# Disable auto-start
ff workflow run ossfuzz_campaign . project_name=zlib --no-auto-start
# Auto-stop after completion
ff workflow run ossfuzz_campaign . project_name=zlib --wait --auto-stop
```
### Resource Savings
- **Before**: All workers running = ~8GB RAM
- **After**: Only core services running = ~1.2GB RAM
- **Savings**: ~6-7GB RAM when idle
## Examples
See existing verticals for examples:
- `workers/rust/` - Complete working example
- `backend/toolbox/workflows/rust_test/` - Simple test workflow

workers/android/Dockerfile (new file)

@@ -0,0 +1,94 @@
# FuzzForge Vertical Worker: Android Security
#
# Pre-installed tools for Android security analysis:
# - Android SDK (adb, aapt)
# - apktool (APK decompilation)
# - jadx (Dex to Java decompiler)
# - Frida (dynamic instrumentation)
# - androguard (Python APK analysis)
# - MobSF dependencies

FROM python:3.11-slim-bookworm

# Set working directory
WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y \
    # Build essentials
    build-essential \
    git \
    curl \
    wget \
    unzip \
    # Java (required for Android tools)
    openjdk-17-jdk \
    # Android tools dependencies
    lib32stdc++6 \
    lib32z1 \
    # Frida dependencies
    libc6-dev \
    # XML/Binary analysis
    libxml2-dev \
    libxslt-dev \
    # Network tools
    netcat-openbsd \
    tcpdump \
    # Cleanup
    && rm -rf /var/lib/apt/lists/*

# Install Android SDK Command Line Tools
ENV ANDROID_HOME=/opt/android-sdk
ENV PATH="${ANDROID_HOME}/cmdline-tools/latest/bin:${ANDROID_HOME}/platform-tools:${PATH}"

RUN mkdir -p ${ANDROID_HOME}/cmdline-tools && \
    cd ${ANDROID_HOME}/cmdline-tools && \
    wget -q https://dl.google.com/android/repository/commandlinetools-linux-9477386_latest.zip && \
    unzip -q commandlinetools-linux-9477386_latest.zip && \
    mv cmdline-tools latest && \
    rm commandlinetools-linux-9477386_latest.zip && \
    # Accept licenses
    yes | ${ANDROID_HOME}/cmdline-tools/latest/bin/sdkmanager --licenses && \
    # Install platform tools (adb, fastboot)
    ${ANDROID_HOME}/cmdline-tools/latest/bin/sdkmanager "platform-tools" "build-tools;33.0.0"

# Install apktool
RUN wget -q https://raw.githubusercontent.com/iBotPeaches/Apktool/master/scripts/linux/apktool -O /usr/local/bin/apktool && \
    wget -q https://bitbucket.org/iBotPeaches/apktool/downloads/apktool_2.9.3.jar -O /usr/local/bin/apktool.jar && \
    chmod +x /usr/local/bin/apktool

# Install jadx (Dex to Java decompiler)
RUN wget -q https://github.com/skylot/jadx/releases/download/v1.4.7/jadx-1.4.7.zip -O /tmp/jadx.zip && \
    unzip -q /tmp/jadx.zip -d /opt/jadx && \
    ln -s /opt/jadx/bin/jadx /usr/local/bin/jadx && \
    ln -s /opt/jadx/bin/jadx-gui /usr/local/bin/jadx-gui && \
    rm /tmp/jadx.zip

# Install Python dependencies for Android security tools
COPY requirements.txt /tmp/requirements.txt
RUN pip3 install --no-cache-dir -r /tmp/requirements.txt && \
    rm /tmp/requirements.txt

# Install androguard (Python APK analysis framework)
RUN pip3 install --no-cache-dir androguard pyaxmlparser

# Install Frida
RUN pip3 install --no-cache-dir frida-tools frida

# Create cache directory
RUN mkdir -p /cache && chmod 755 /cache

# Copy worker entrypoint (generic, works for all verticals)
COPY worker.py /app/worker.py

# Add toolbox to Python path (mounted at runtime)
ENV PYTHONPATH="/app:/app/toolbox:${PYTHONPATH}"
ENV PYTHONUNBUFFERED=1
ENV JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64

# Healthcheck
HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
    CMD python3 -c "import sys; sys.exit(0)"

# Run worker
CMD ["python3", "/app/worker.py"]

workers/android/requirements.txt (new file)

@@ -0,0 +1,19 @@
# Temporal Python SDK
temporalio>=1.5.0
# S3/MinIO client
boto3>=1.34.0
botocore>=1.34.0
# Data validation
pydantic>=2.5.0
# YAML parsing
PyYAML>=6.0.1
# Utilities
python-dotenv>=1.0.0
aiofiles>=23.2.1
# Logging
structlog>=24.1.0

workers/android/worker.py (new file)

@@ -0,0 +1,309 @@
"""
FuzzForge Vertical Worker: Rust/Native Security
This worker:
1. Discovers workflows for the 'rust' vertical from mounted toolbox
2. Dynamically imports and registers workflow classes
3. Connects to Temporal and processes tasks
4. Handles activities for target download/upload from MinIO
"""
import asyncio
import importlib
import inspect
import logging
import os
import sys
from pathlib import Path
from typing import List, Any
import yaml
from temporalio.client import Client
from temporalio.worker import Worker
# Add toolbox to path for workflow and activity imports
sys.path.insert(0, '/app/toolbox')
# Import common storage activities
from toolbox.common.storage_activities import (
get_target_activity,
cleanup_cache_activity,
upload_results_activity
)
# Configure logging
logging.basicConfig(
level=os.getenv('LOG_LEVEL', 'INFO'),
format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)
async def discover_workflows(vertical: str) -> List[Any]:
"""
Discover workflows for this vertical from mounted toolbox.
Args:
vertical: The vertical name (e.g., 'rust', 'android', 'web')
Returns:
List of workflow classes decorated with @workflow.defn
"""
workflows = []
toolbox_path = Path("/app/toolbox/workflows")
if not toolbox_path.exists():
logger.warning(f"Toolbox path does not exist: {toolbox_path}")
return workflows
logger.info(f"Scanning for workflows in: {toolbox_path}")
for workflow_dir in toolbox_path.iterdir():
if not workflow_dir.is_dir():
continue
# Skip special directories
if workflow_dir.name.startswith('.') or workflow_dir.name == '__pycache__':
continue
metadata_file = workflow_dir / "metadata.yaml"
if not metadata_file.exists():
logger.debug(f"No metadata.yaml in {workflow_dir.name}, skipping")
continue
try:
# Parse metadata
with open(metadata_file) as f:
metadata = yaml.safe_load(f)
# Check if workflow is for this vertical
workflow_vertical = metadata.get("vertical")
if workflow_vertical != vertical:
logger.debug(
f"Workflow {workflow_dir.name} is for vertical '{workflow_vertical}', "
f"not '{vertical}', skipping"
)
continue
# Check if workflow.py exists
workflow_file = workflow_dir / "workflow.py"
if not workflow_file.exists():
logger.warning(
f"Workflow {workflow_dir.name} has metadata but no workflow.py, skipping"
)
continue
# Dynamically import workflow module
module_name = f"toolbox.workflows.{workflow_dir.name}.workflow"
logger.info(f"Importing workflow module: {module_name}")
try:
module = importlib.import_module(module_name)
except Exception as e:
logger.error(
f"Failed to import workflow module {module_name}: {e}",
exc_info=True
)
continue
# Find @workflow.defn decorated classes
found_workflows = False
for name, obj in inspect.getmembers(module, inspect.isclass):
# Check if class has Temporal workflow definition
if hasattr(obj, '__temporal_workflow_definition'):
workflows.append(obj)
found_workflows = True
logger.info(
f"✓ Discovered workflow: {name} from {workflow_dir.name} "
f"(vertical: {vertical})"
)
if not found_workflows:
logger.warning(
f"Workflow {workflow_dir.name} has no @workflow.defn decorated classes"
)
except Exception as e:
logger.error(
f"Error processing workflow {workflow_dir.name}: {e}",
exc_info=True
)
continue
logger.info(f"Discovered {len(workflows)} workflows for vertical '{vertical}'")
return workflows
async def discover_activities(workflows_dir: Path) -> List[Any]:
"""
Discover activities from workflow directories.
Looks for activities.py files alongside workflow.py in each workflow directory.
Args:
workflows_dir: Path to workflows directory
Returns:
List of activity functions decorated with @activity.defn
"""
activities = []
if not workflows_dir.exists():
logger.warning(f"Workflows directory does not exist: {workflows_dir}")
return activities
logger.info(f"Scanning for workflow activities in: {workflows_dir}")
for workflow_dir in workflows_dir.iterdir():
if not workflow_dir.is_dir():
continue
# Skip special directories
if workflow_dir.name.startswith('.') or workflow_dir.name == '__pycache__':
continue
# Check if activities.py exists
activities_file = workflow_dir / "activities.py"
if not activities_file.exists():
logger.debug(f"No activities.py in {workflow_dir.name}, skipping")
continue
try:
# Dynamically import activities module
module_name = f"toolbox.workflows.{workflow_dir.name}.activities"
logger.info(f"Importing activities module: {module_name}")
try:
module = importlib.import_module(module_name)
except Exception as e:
logger.error(
f"Failed to import activities module {module_name}: {e}",
exc_info=True
)
continue
# Find @activity.defn decorated functions
found_activities = False
for name, obj in inspect.getmembers(module, inspect.isfunction):
# Check if function has Temporal activity definition
if hasattr(obj, '__temporal_activity_definition'):
activities.append(obj)
found_activities = True
logger.info(
f"✓ Discovered activity: {name} from {workflow_dir.name}"
)
if not found_activities:
logger.warning(
f"Workflow {workflow_dir.name} has activities.py but no @activity.defn decorated functions"
)
except Exception as e:
logger.error(
f"Error processing activities from {workflow_dir.name}: {e}",
exc_info=True
)
continue
logger.info(f"Discovered {len(activities)} workflow-specific activities")
return activities
async def main():
"""Main worker entry point"""
# Get configuration from environment
vertical = os.getenv("WORKER_VERTICAL", "rust")
temporal_address = os.getenv("TEMPORAL_ADDRESS", "localhost:7233")
temporal_namespace = os.getenv("TEMPORAL_NAMESPACE", "default")
task_queue = os.getenv("WORKER_TASK_QUEUE", f"{vertical}-queue")
max_concurrent_activities = int(os.getenv("MAX_CONCURRENT_ACTIVITIES", "5"))
logger.info("=" * 60)
logger.info(f"FuzzForge Vertical Worker: {vertical}")
logger.info("=" * 60)
logger.info(f"Temporal Address: {temporal_address}")
logger.info(f"Temporal Namespace: {temporal_namespace}")
logger.info(f"Task Queue: {task_queue}")
logger.info(f"Max Concurrent Activities: {max_concurrent_activities}")
logger.info("=" * 60)
# Discover workflows for this vertical
logger.info(f"Discovering workflows for vertical: {vertical}")
workflows = await discover_workflows(vertical)
if not workflows:
logger.error(f"No workflows found for vertical: {vertical}")
logger.error("Worker cannot start without workflows. Exiting...")
sys.exit(1)
# Discover activities from workflow directories
logger.info("Discovering workflow-specific activities...")
workflows_dir = Path("/app/toolbox/workflows")
workflow_activities = await discover_activities(workflows_dir)
# Combine common storage activities with workflow-specific activities
activities = [
get_target_activity,
cleanup_cache_activity,
upload_results_activity
] + workflow_activities
logger.info(
f"Total activities registered: {len(activities)} "
f"(3 common + {len(workflow_activities)} workflow-specific)"
)
# Connect to Temporal
logger.info(f"Connecting to Temporal at {temporal_address}...")
try:
client = await Client.connect(
temporal_address,
namespace=temporal_namespace
)
logger.info("✓ Connected to Temporal successfully")
except Exception as e:
logger.error(f"Failed to connect to Temporal: {e}", exc_info=True)
sys.exit(1)
# Create worker with discovered workflows and activities
logger.info(f"Creating worker on task queue: {task_queue}")
try:
worker = Worker(
client,
task_queue=task_queue,
workflows=workflows,
activities=activities,
max_concurrent_activities=max_concurrent_activities
)
logger.info("✓ Worker created successfully")
except Exception as e:
logger.error(f"Failed to create worker: {e}", exc_info=True)
sys.exit(1)
# Start worker
logger.info("=" * 60)
logger.info(f"🚀 Worker started for vertical '{vertical}'")
logger.info(f"📦 Registered {len(workflows)} workflows")
logger.info(f"⚙️ Registered {len(activities)} activities")
logger.info(f"📨 Listening on task queue: {task_queue}")
logger.info("=" * 60)
logger.info("Worker is ready to process tasks...")
try:
await worker.run()
except KeyboardInterrupt:
logger.info("Shutting down worker (keyboard interrupt)...")
except Exception as e:
logger.error(f"Worker error: {e}", exc_info=True)
raise
if __name__ == "__main__":
try:
asyncio.run(main())
except KeyboardInterrupt:
logger.info("Worker stopped")
except Exception as e:
logger.error(f"Fatal error: {e}", exc_info=True)
sys.exit(1)

workers/ossfuzz/Dockerfile (new file)

@@ -0,0 +1,45 @@
# OSS-Fuzz Worker - Generic fuzzing using OSS-Fuzz infrastructure
FROM gcr.io/oss-fuzz-base/base-builder:latest

# Install Python, Docker CLI, and dependencies (use Python 3.8 from base image)
RUN apt-get update && apt-get install -y \
    python3-pip \
    python3-dev \
    git \
    docker.io \
    && rm -rf /var/lib/apt/lists/*

# Upgrade pip
RUN python3 -m pip install --upgrade pip

# Install Temporal Python SDK and dependencies
RUN pip3 install --no-cache-dir \
    temporalio==1.5.0 \
    boto3==1.34.50 \
    pyyaml==6.0.1 \
    psutil==5.9.8

# Create necessary directories
RUN mkdir -p /app /cache /corpus /output

# Set environment variables
ENV PYTHONPATH=/app
ENV WORKER_VERTICAL=ossfuzz
ENV MAX_CONCURRENT_ACTIVITIES=2
ENV CACHE_DIR=/cache
ENV CACHE_MAX_SIZE=50GB
ENV CACHE_TTL=30d

# Clone OSS-Fuzz repo (will be cached in /cache by worker)
# This is just to have helper scripts available
RUN git clone --depth=1 https://github.com/google/oss-fuzz.git /opt/oss-fuzz

# Copy worker code
COPY worker.py /app/
COPY activities.py /app/
COPY requirements.txt /app/

WORKDIR /app

# Run worker
CMD ["python3", "worker.py"]

workers/ossfuzz/activities.py (new file)

@@ -0,0 +1,413 @@
"""
OSS-Fuzz Campaign Activities
Activities for running OSS-Fuzz campaigns using Google's infrastructure.
"""
import logging
import os
import subprocess
import shutil
from pathlib import Path
from typing import Dict, Any, List, Optional
from datetime import datetime
import yaml
from temporalio import activity
logger = logging.getLogger(__name__)
# Paths
OSS_FUZZ_REPO = Path("/opt/oss-fuzz")
CACHE_DIR = Path(os.getenv("CACHE_DIR", "/cache"))
@activity.defn(name="load_ossfuzz_project")
async def load_ossfuzz_project_activity(project_name: str) -> Dict[str, Any]:
"""
Load OSS-Fuzz project configuration from project.yaml.
Args:
project_name: Name of the OSS-Fuzz project (e.g., "curl", "sqlite3")
Returns:
Dictionary with project config, paths, and metadata
"""
logger.info(f"Loading OSS-Fuzz project: {project_name}")
# Update OSS-Fuzz repo if it exists, clone if not
if OSS_FUZZ_REPO.exists():
logger.info("Updating OSS-Fuzz repository...")
subprocess.run(
["git", "-C", str(OSS_FUZZ_REPO), "pull", "--depth=1"],
check=False # Don't fail if already up to date
)
else:
logger.info("Cloning OSS-Fuzz repository...")
subprocess.run(
[
"git", "clone", "--depth=1",
"https://github.com/google/oss-fuzz.git",
str(OSS_FUZZ_REPO)
],
check=True
)
# Find project directory
project_path = OSS_FUZZ_REPO / "projects" / project_name
if not project_path.exists():
raise ValueError(
f"Project '{project_name}' not found in OSS-Fuzz. "
f"Available projects: https://github.com/google/oss-fuzz/tree/master/projects"
)
# Read project.yaml
config_file = project_path / "project.yaml"
if not config_file.exists():
raise ValueError(f"No project.yaml found for project '{project_name}'")
with open(config_file) as f:
config = yaml.safe_load(f)
# Add paths
config["project_name"] = project_name
config["project_path"] = str(project_path)
config["dockerfile_path"] = str(project_path / "Dockerfile")
config["build_script_path"] = str(project_path / "build.sh")
# Validate required fields
if not config.get("language"):
logger.warning(f"No language specified in project.yaml for {project_name}")
logger.info(
f"✓ Loaded project {project_name}: "
f"language={config.get('language', 'unknown')}, "
f"engines={config.get('fuzzing_engines', [])}, "
f"sanitizers={config.get('sanitizers', [])}"
)
return config
@activity.defn(name="build_ossfuzz_project")
async def build_ossfuzz_project_activity(
project_name: str,
project_config: Dict[str, Any],
sanitizer: Optional[str] = None,
engine: Optional[str] = None
) -> Dict[str, Any]:
"""
Build OSS-Fuzz project directly using build.sh (no Docker-in-Docker).
Args:
project_name: Name of the project
project_config: Configuration from project.yaml
sanitizer: Override sanitizer (default: first from project.yaml)
engine: Override engine (default: first from project.yaml)
Returns:
Dictionary with build results and discovered fuzz targets
"""
logger.info(f"Building OSS-Fuzz project: {project_name}")
# Determine sanitizer and engine
sanitizers = project_config.get("sanitizers", ["address"])
engines = project_config.get("fuzzing_engines", ["libfuzzer"])
use_sanitizer = sanitizer if sanitizer else sanitizers[0]
use_engine = engine if engine else engines[0]
logger.info(f"Building with sanitizer={use_sanitizer}, engine={use_engine}")
# Setup directories
src_dir = Path("/src")
out_dir = Path("/out")
src_dir.mkdir(exist_ok=True)
out_dir.mkdir(exist_ok=True)
# Clean previous build artifacts
for item in out_dir.glob("*"):
if item.is_file():
item.unlink()
elif item.is_dir():
shutil.rmtree(item)
# Copy project files from OSS-Fuzz repo to /src
project_path = Path(project_config["project_path"])
build_script = project_path / "build.sh"
if not build_script.exists():
raise Exception(f"build.sh not found for project {project_name}")
logger.info(f"Copying project files from {project_path} to {src_dir}")
# Copy build.sh
shutil.copy2(build_script, src_dir / "build.sh")
os.chmod(src_dir / "build.sh", 0o755)
# Copy any fuzzer source files (*.cc, *.c, *.cpp files)
for pattern in ["*.cc", "*.c", "*.cpp", "*.h", "*.hh", "*.hpp"]:
for src_file in project_path.glob(pattern):
dest_file = src_dir / src_file.name
shutil.copy2(src_file, dest_file)
logger.info(f"Copied: {src_file.name}")
# Clone project source code to subdirectory
main_repo = project_config.get("main_repo")
work_dir = src_dir
if main_repo:
logger.info(f"Cloning project source from {main_repo}")
project_src_dir = src_dir / project_name
# Remove existing directory if present
if project_src_dir.exists():
shutil.rmtree(project_src_dir)
clone_cmd = ["git", "clone", "--depth=1", main_repo, str(project_src_dir)]
result = subprocess.run(clone_cmd, capture_output=True, text=True, timeout=600)
if result.returncode != 0:
logger.warning(f"Failed to clone {main_repo}: {result.stderr}")
logger.info("Continuing without cloning (build.sh may download source)")
else:
# Copy build.sh into the project source directory
shutil.copy2(src_dir / "build.sh", project_src_dir / "build.sh")
os.chmod(project_src_dir / "build.sh", 0o755)
# build.sh should run from within the project directory
work_dir = project_src_dir
logger.info(f"Build will run from: {work_dir}")
else:
logger.info("No main_repo in project.yaml, build.sh will download source")
# Set OSS-Fuzz environment variables
build_env = os.environ.copy()
build_env.update({
"SRC": str(src_dir),
"OUT": str(out_dir),
"FUZZING_ENGINE": use_engine,
"SANITIZER": use_sanitizer,
"ARCHITECTURE": "x86_64",
# Use clang's built-in libfuzzer instead of separate library
"LIB_FUZZING_ENGINE": "-fsanitize=fuzzer",
})
# Set sanitizer flags
if use_sanitizer == "address":
build_env["CFLAGS"] = build_env.get("CFLAGS", "") + " -fsanitize=address"
build_env["CXXFLAGS"] = build_env.get("CXXFLAGS", "") + " -fsanitize=address"
elif use_sanitizer == "memory":
build_env["CFLAGS"] = build_env.get("CFLAGS", "") + " -fsanitize=memory"
build_env["CXXFLAGS"] = build_env.get("CXXFLAGS", "") + " -fsanitize=memory"
elif use_sanitizer == "undefined":
build_env["CFLAGS"] = build_env.get("CFLAGS", "") + " -fsanitize=undefined"
build_env["CXXFLAGS"] = build_env.get("CXXFLAGS", "") + " -fsanitize=undefined"
# Execute build.sh from the work directory
logger.info(f"Executing build.sh in {work_dir}")
build_cmd = ["bash", "./build.sh"]
result = subprocess.run(
build_cmd,
cwd=str(work_dir),
env=build_env,
capture_output=True,
text=True,
timeout=1800 # 30 minutes max build time
)
if result.returncode != 0:
logger.error(f"Build failed:\nSTDOUT:\n{result.stdout}\nSTDERR:\n{result.stderr}")
raise Exception(f"Build failed for {project_name}: {result.stderr}")
logger.info("✓ Build completed successfully")
logger.info(f"Build output:\n{result.stdout[-2000:]}") # Last 2000 chars
# Discover fuzz targets in /out
fuzz_targets = []
for file in out_dir.glob("*"):
if file.is_file() and os.access(file, os.X_OK):
# Check if it's a fuzz target (executable, not .so/.a/.o)
if file.suffix not in ['.so', '.a', '.o', '.zip']:
fuzz_targets.append(str(file))
logger.info(f"Found fuzz target: {file.name}")
if not fuzz_targets:
logger.warning(f"No fuzz targets found in {out_dir}")
logger.info(f"Directory contents: {list(out_dir.glob('*'))}")
return {
"fuzz_targets": fuzz_targets,
"build_log": result.stdout[-5000:], # Last 5000 chars
"sanitizer_used": use_sanitizer,
"engine_used": use_engine,
"out_dir": str(out_dir)
}
@activity.defn(name="fuzz_target")
async def fuzz_target_activity(
target_path: str,
engine: str,
duration_seconds: int,
corpus_dir: Optional[str] = None,
dict_file: Optional[str] = None
) -> Dict[str, Any]:
"""
Run fuzzing on a target with specified engine.
Args:
target_path: Path to fuzz target executable
engine: Fuzzing engine (libfuzzer, afl, honggfuzz)
duration_seconds: How long to fuzz
corpus_dir: Optional corpus directory
dict_file: Optional dictionary file
Returns:
Dictionary with fuzzing stats and results
"""
logger.info(f"Fuzzing {Path(target_path).name} with {engine} for {duration_seconds}s")
# Prepare corpus directory
if not corpus_dir:
corpus_dir = str(CACHE_DIR / "corpus" / Path(target_path).stem)
Path(corpus_dir).mkdir(parents=True, exist_ok=True)
output_dir = CACHE_DIR / "output" / Path(target_path).stem
output_dir.mkdir(parents=True, exist_ok=True)
start_time = datetime.now()
try:
if engine == "libfuzzer":
cmd = [
target_path,
corpus_dir,
f"-max_total_time={duration_seconds}",
"-print_final_stats=1",
f"-artifact_prefix={output_dir}/"
]
if dict_file:
cmd.append(f"-dict={dict_file}")
elif engine == "afl":
cmd = [
"afl-fuzz",
"-i", corpus_dir if Path(corpus_dir).glob("*") else "-", # Empty corpus OK
"-o", str(output_dir),
"-t", "1000", # Timeout per execution
"-m", "none", # No memory limit
"--", target_path, "@@"
]
elif engine == "honggfuzz":
cmd = [
"honggfuzz",
f"--run_time={duration_seconds}",
"-i", corpus_dir,
"-o", str(output_dir),
"--", target_path
]
else:
raise ValueError(f"Unsupported fuzzing engine: {engine}")
logger.info(f"Starting fuzzer: {' '.join(cmd[:5])}...")
result = subprocess.run(
cmd,
capture_output=True,
text=True,
timeout=duration_seconds + 120 # Add 2 minute buffer
)
end_time = datetime.now()
elapsed = (end_time - start_time).total_seconds()
# Parse stats from output
stats = parse_fuzzing_stats(result.stdout, result.stderr, engine)
stats["elapsed_time"] = elapsed
stats["target_name"] = Path(target_path).name
stats["engine"] = engine
# Find crashes
crashes = find_crashes(output_dir)
stats["crashes"] = len(crashes)
stats["crash_files"] = crashes
# Collect new corpus files
new_corpus = collect_corpus(corpus_dir)
stats["corpus_size"] = len(new_corpus)
stats["corpus_files"] = new_corpus
logger.info(
f"✓ Fuzzing completed: {stats.get('total_executions', 0)} execs, "
f"{len(crashes)} crashes"
)
return stats
except subprocess.TimeoutExpired:
logger.warning(f"Fuzzing timed out after {duration_seconds}s")
return {
"target_name": Path(target_path).name,
"engine": engine,
"status": "timeout",
"elapsed_time": duration_seconds
}
def parse_fuzzing_stats(stdout: str, stderr: str, engine: str) -> Dict[str, Any]:
"""Parse fuzzing statistics from output"""
stats = {}
if engine == "libfuzzer":
# Parse libFuzzer stats
for line in (stdout + stderr).split('\n'):
if "#" in line and "NEW" in line:
# Example: #8192 NEW cov: 1234 ft: 5678 corp: 89/10KB
parts = line.split()
for i, part in enumerate(parts):
if part.startswith("cov:"):
stats["coverage"] = int(parts[i+1])
elif part.startswith("corp:"):
stats["corpus_entries"] = int(parts[i+1].split('/')[0])
elif part.startswith("exec/s:"):
stats["executions_per_sec"] = float(parts[i+1])
elif part.startswith("#"):
stats["total_executions"] = int(part[1:])
elif engine == "afl":
# Parse AFL stats (would need to read fuzzer_stats file)
pass
elif engine == "honggfuzz":
# Parse Honggfuzz stats
pass
return stats
def find_crashes(output_dir: Path) -> List[str]:
"""Find crash files in output directory"""
crashes = []
# libFuzzer crash files start with "crash-" or "leak-"
for pattern in ["crash-*", "leak-*", "timeout-*"]:
crashes.extend([str(f) for f in output_dir.glob(pattern)])
# AFL crashes in crashes/ subdirectory
crashes_dir = output_dir / "crashes"
if crashes_dir.exists():
crashes.extend([str(f) for f in crashes_dir.glob("*") if f.is_file()])
return crashes
def collect_corpus(corpus_dir: str) -> List[str]:
"""Collect corpus files"""
corpus_path = Path(corpus_dir)
if not corpus_path.exists():
return []
return [str(f) for f in corpus_path.glob("*") if f.is_file()]

workers/ossfuzz/requirements.txt (new file)

@@ -0,0 +1,4 @@
temporalio==1.5.0
boto3==1.34.50
pyyaml==6.0.1
psutil==5.9.8

workers/ossfuzz/worker.py (new file)

@@ -0,0 +1,319 @@
"""
FuzzForge Vertical Worker: OSS-Fuzz Campaigns
This worker:
1. Discovers workflows for the 'ossfuzz' vertical from mounted toolbox
2. Dynamically imports and registers workflow classes
3. Connects to Temporal and processes tasks
4. Handles activities for OSS-Fuzz project building and fuzzing
"""
import asyncio
import importlib
import inspect
import logging
import os
import sys
from pathlib import Path
from typing import List, Any
import yaml
from temporalio.client import Client
from temporalio.worker import Worker
# Add toolbox to path for workflow and activity imports
sys.path.insert(0, '/app/toolbox')
# Import common storage activities
from toolbox.common.storage_activities import (
get_target_activity,
cleanup_cache_activity,
upload_results_activity
)
# Import OSS-Fuzz specific activities
from activities import (
load_ossfuzz_project_activity,
build_ossfuzz_project_activity,
fuzz_target_activity
)
# Configure logging
logging.basicConfig(
level=os.getenv('LOG_LEVEL', 'INFO'),
format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)
async def discover_workflows(vertical: str) -> List[Any]:
"""
Discover workflows for this vertical from mounted toolbox.
Args:
vertical: The vertical name (e.g., 'ossfuzz')
Returns:
List of workflow classes decorated with @workflow.defn
"""
workflows = []
toolbox_path = Path("/app/toolbox/workflows")
if not toolbox_path.exists():
logger.warning(f"Toolbox path does not exist: {toolbox_path}")
return workflows
logger.info(f"Scanning for workflows in: {toolbox_path}")
for workflow_dir in toolbox_path.iterdir():
if not workflow_dir.is_dir():
continue
# Skip special directories
if workflow_dir.name.startswith('.') or workflow_dir.name == '__pycache__':
continue
metadata_file = workflow_dir / "metadata.yaml"
if not metadata_file.exists():
logger.debug(f"No metadata.yaml in {workflow_dir.name}, skipping")
continue
try:
# Parse metadata
with open(metadata_file) as f:
metadata = yaml.safe_load(f)
# Check if workflow is for this vertical
workflow_vertical = metadata.get("vertical")
if workflow_vertical != vertical:
logger.debug(
f"Workflow {workflow_dir.name} is for vertical '{workflow_vertical}', "
f"not '{vertical}', skipping"
)
continue
# Check if workflow.py exists
workflow_file = workflow_dir / "workflow.py"
if not workflow_file.exists():
logger.warning(
f"Workflow {workflow_dir.name} has metadata but no workflow.py, skipping"
)
continue
# Dynamically import workflow module
module_name = f"toolbox.workflows.{workflow_dir.name}.workflow"
logger.info(f"Importing workflow module: {module_name}")
try:
module = importlib.import_module(module_name)
except Exception as e:
logger.error(
f"Failed to import workflow module {module_name}: {e}",
exc_info=True
)
continue
# Find @workflow.defn decorated classes
found_workflows = False
for name, obj in inspect.getmembers(module, inspect.isclass):
# Check if class has Temporal workflow definition
if hasattr(obj, '__temporal_workflow_definition'):
workflows.append(obj)
found_workflows = True
logger.info(
f"✓ Discovered workflow: {name} from {workflow_dir.name} "
f"(vertical: {vertical})"
)
if not found_workflows:
logger.warning(
f"Workflow {workflow_dir.name} has no @workflow.defn decorated classes"
)
except Exception as e:
logger.error(
f"Error processing workflow {workflow_dir.name}: {e}",
exc_info=True
)
continue
logger.info(f"Discovered {len(workflows)} workflows for vertical '{vertical}'")
return workflows
async def discover_activities(workflows_dir: Path) -> List[Any]:
"""
Discover activities from workflow directories.
Looks for activities.py files alongside workflow.py in each workflow directory.
Args:
workflows_dir: Path to workflows directory
Returns:
List of activity functions decorated with @activity.defn
"""
activities = []
if not workflows_dir.exists():
logger.warning(f"Workflows directory does not exist: {workflows_dir}")
return activities
logger.info(f"Scanning for workflow activities in: {workflows_dir}")
for workflow_dir in workflows_dir.iterdir():
if not workflow_dir.is_dir():
continue
# Skip special directories
if workflow_dir.name.startswith('.') or workflow_dir.name == '__pycache__':
continue
# Check if activities.py exists
activities_file = workflow_dir / "activities.py"
if not activities_file.exists():
logger.debug(f"No activities.py in {workflow_dir.name}, skipping")
continue
try:
# Dynamically import activities module
module_name = f"toolbox.workflows.{workflow_dir.name}.activities"
logger.info(f"Importing activities module: {module_name}")
try:
module = importlib.import_module(module_name)
except Exception as e:
logger.error(
f"Failed to import activities module {module_name}: {e}",
exc_info=True
)
continue
# Find @activity.defn decorated functions
found_activities = False
for name, obj in inspect.getmembers(module, inspect.isfunction):
# Check if function has Temporal activity definition
if hasattr(obj, '__temporal_activity_definition'):
activities.append(obj)
found_activities = True
logger.info(
f"✓ Discovered activity: {name} from {workflow_dir.name}"
)
if not found_activities:
logger.warning(
f"Workflow {workflow_dir.name} has activities.py but no @activity.defn decorated functions"
)
except Exception as e:
logger.error(
f"Error processing activities from {workflow_dir.name}: {e}",
exc_info=True
)
continue
logger.info(f"Discovered {len(activities)} workflow-specific activities")
return activities
async def main():
"""Main worker entry point"""
# Get configuration from environment
vertical = os.getenv("WORKER_VERTICAL", "ossfuzz")
temporal_address = os.getenv("TEMPORAL_ADDRESS", "localhost:7233")
temporal_namespace = os.getenv("TEMPORAL_NAMESPACE", "default")
task_queue = os.getenv("WORKER_TASK_QUEUE", f"{vertical}-queue")
max_concurrent_activities = int(os.getenv("MAX_CONCURRENT_ACTIVITIES", "2"))
logger.info("=" * 60)
logger.info(f"FuzzForge Vertical Worker: {vertical}")
logger.info("=" * 60)
logger.info(f"Temporal Address: {temporal_address}")
logger.info(f"Temporal Namespace: {temporal_namespace}")
logger.info(f"Task Queue: {task_queue}")
logger.info(f"Max Concurrent Activities: {max_concurrent_activities}")
logger.info("=" * 60)
# Discover workflows for this vertical
logger.info(f"Discovering workflows for vertical: {vertical}")
workflows = await discover_workflows(vertical)
if not workflows:
logger.error(f"No workflows found for vertical: {vertical}")
logger.error("Worker cannot start without workflows. Exiting...")
sys.exit(1)
# Discover activities from workflow directories
logger.info("Discovering workflow-specific activities...")
workflows_dir = Path("/app/toolbox/workflows")
workflow_activities = await discover_activities(workflows_dir)
# Combine common storage activities, OSS-Fuzz activities, and workflow-specific activities
activities = [
get_target_activity,
cleanup_cache_activity,
upload_results_activity,
load_ossfuzz_project_activity,
build_ossfuzz_project_activity,
fuzz_target_activity
] + workflow_activities
logger.info(
f"Total activities registered: {len(activities)} "
f"(3 common + 3 ossfuzz + {len(workflow_activities)} workflow-specific)"
)
# Connect to Temporal
logger.info(f"Connecting to Temporal at {temporal_address}...")
try:
client = await Client.connect(
temporal_address,
namespace=temporal_namespace
)
logger.info("✓ Connected to Temporal successfully")
except Exception as e:
logger.error(f"Failed to connect to Temporal: {e}", exc_info=True)
sys.exit(1)
# Create worker with discovered workflows and activities
logger.info(f"Creating worker on task queue: {task_queue}")
try:
worker = Worker(
client,
task_queue=task_queue,
workflows=workflows,
activities=activities,
max_concurrent_activities=max_concurrent_activities
)
logger.info("✓ Worker created successfully")
except Exception as e:
logger.error(f"Failed to create worker: {e}", exc_info=True)
sys.exit(1)
# Start worker
logger.info("=" * 60)
logger.info(f"🚀 Worker started for vertical '{vertical}'")
logger.info(f"📦 Registered {len(workflows)} workflows")
logger.info(f"⚙️ Registered {len(activities)} activities")
logger.info(f"📨 Listening on task queue: {task_queue}")
logger.info("=" * 60)
logger.info("Worker is ready to process tasks...")
try:
await worker.run()
except KeyboardInterrupt:
logger.info("Shutting down worker (keyboard interrupt)...")
except Exception as e:
logger.error(f"Worker error: {e}", exc_info=True)
raise
if __name__ == "__main__":
try:
asyncio.run(main())
except KeyboardInterrupt:
logger.info("Worker stopped")
except Exception as e:
logger.error(f"Fatal error: {e}", exc_info=True)
sys.exit(1)

workers/python/Dockerfile (new file)

@@ -0,0 +1,47 @@
# FuzzForge Vertical Worker: Python Fuzzing
#
# Pre-installed tools for Python fuzzing and security analysis:
# - Python 3.11
# - Atheris (Python fuzzing)
# - Common Python security tools
# - Temporal worker

FROM python:3.11-slim

# Set working directory
WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y \
    # Build essentials for Atheris
    build-essential \
    clang \
    llvm \
    # Development tools
    git \
    curl \
    wget \
    # Cleanup
    && rm -rf /var/lib/apt/lists/*

# Install Python dependencies for Temporal worker
COPY requirements.txt /tmp/requirements.txt
RUN pip3 install --no-cache-dir -r /tmp/requirements.txt && \
    rm /tmp/requirements.txt

# Create cache directory for downloaded targets
RUN mkdir -p /cache && chmod 755 /cache

# Copy worker entrypoint
COPY worker.py /app/worker.py

# Add toolbox to Python path (mounted at runtime)
ENV PYTHONPATH="/app:/app/toolbox:${PYTHONPATH}"
ENV PYTHONUNBUFFERED=1

# Healthcheck
HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
    CMD python3 -c "import sys; sys.exit(0)"

# Run worker
CMD ["python3", "/app/worker.py"]

workers/python/requirements.txt (new file)

@@ -0,0 +1,15 @@
# Temporal worker dependencies
temporalio>=1.5.0
pydantic>=2.0.0
# Storage (MinIO/S3)
boto3>=1.34.0
# Configuration
pyyaml>=6.0.0
# HTTP Client (for real-time stats reporting)
httpx>=0.27.0
# Fuzzing
atheris>=2.3.0

workers/python/worker.py (new file)

@@ -0,0 +1,309 @@
"""
FuzzForge Vertical Worker: Rust/Native Security
This worker:
1. Discovers workflows for the 'rust' vertical from mounted toolbox
2. Dynamically imports and registers workflow classes
3. Connects to Temporal and processes tasks
4. Handles activities for target download/upload from MinIO
"""
import asyncio
import importlib
import inspect
import logging
import os
import sys
from pathlib import Path
from typing import List, Any
import yaml
from temporalio.client import Client
from temporalio.worker import Worker
# Add toolbox to path for workflow and activity imports
sys.path.insert(0, '/app/toolbox')
# Import common storage activities
from toolbox.common.storage_activities import (
get_target_activity,
cleanup_cache_activity,
upload_results_activity
)
# Configure logging
logging.basicConfig(
level=os.getenv('LOG_LEVEL', 'INFO'),
format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)
async def discover_workflows(vertical: str) -> List[Any]:
"""
Discover workflows for this vertical from mounted toolbox.
Args:
vertical: The vertical name (e.g., 'rust', 'android', 'web')
Returns:
List of workflow classes decorated with @workflow.defn
"""
workflows = []
toolbox_path = Path("/app/toolbox/workflows")
if not toolbox_path.exists():
logger.warning(f"Toolbox path does not exist: {toolbox_path}")
return workflows
logger.info(f"Scanning for workflows in: {toolbox_path}")
for workflow_dir in toolbox_path.iterdir():
if not workflow_dir.is_dir():
continue
# Skip special directories
if workflow_dir.name.startswith('.') or workflow_dir.name == '__pycache__':
continue
metadata_file = workflow_dir / "metadata.yaml"
if not metadata_file.exists():
logger.debug(f"No metadata.yaml in {workflow_dir.name}, skipping")
continue
try:
# Parse metadata
with open(metadata_file) as f:
metadata = yaml.safe_load(f)
# Check if workflow is for this vertical
workflow_vertical = metadata.get("vertical")
if workflow_vertical != vertical:
logger.debug(
f"Workflow {workflow_dir.name} is for vertical '{workflow_vertical}', "
f"not '{vertical}', skipping"
)
continue
# Check if workflow.py exists
workflow_file = workflow_dir / "workflow.py"
if not workflow_file.exists():
logger.warning(
f"Workflow {workflow_dir.name} has metadata but no workflow.py, skipping"
)
continue
# Dynamically import workflow module
module_name = f"toolbox.workflows.{workflow_dir.name}.workflow"
logger.info(f"Importing workflow module: {module_name}")
try:
module = importlib.import_module(module_name)
except Exception as e:
logger.error(
f"Failed to import workflow module {module_name}: {e}",
exc_info=True
)
continue
# Find @workflow.defn decorated classes
found_workflows = False
for name, obj in inspect.getmembers(module, inspect.isclass):
# Check if class has Temporal workflow definition
if hasattr(obj, '__temporal_workflow_definition'):
workflows.append(obj)
found_workflows = True
logger.info(
f"✓ Discovered workflow: {name} from {workflow_dir.name} "
f"(vertical: {vertical})"
)
if not found_workflows:
logger.warning(
f"Workflow {workflow_dir.name} has no @workflow.defn decorated classes"
)
except Exception as e:
logger.error(
f"Error processing workflow {workflow_dir.name}: {e}",
exc_info=True
)
continue
logger.info(f"Discovered {len(workflows)} workflows for vertical '{vertical}'")
return workflows
async def discover_activities(workflows_dir: Path) -> List[Any]:
"""
Discover activities from workflow directories.
Looks for activities.py files alongside workflow.py in each workflow directory.
Args:
workflows_dir: Path to workflows directory
Returns:
List of activity functions decorated with @activity.defn
"""
activities = []
if not workflows_dir.exists():
logger.warning(f"Workflows directory does not exist: {workflows_dir}")
return activities
logger.info(f"Scanning for workflow activities in: {workflows_dir}")
for workflow_dir in workflows_dir.iterdir():
if not workflow_dir.is_dir():
continue
# Skip special directories
if workflow_dir.name.startswith('.') or workflow_dir.name == '__pycache__':
continue
# Check if activities.py exists
activities_file = workflow_dir / "activities.py"
if not activities_file.exists():
logger.debug(f"No activities.py in {workflow_dir.name}, skipping")
continue
try:
# Dynamically import activities module
module_name = f"toolbox.workflows.{workflow_dir.name}.activities"
logger.info(f"Importing activities module: {module_name}")
try:
module = importlib.import_module(module_name)
except Exception as e:
logger.error(
f"Failed to import activities module {module_name}: {e}",
exc_info=True
)
continue
# Find @activity.defn decorated functions
found_activities = False
for name, obj in inspect.getmembers(module, inspect.isfunction):
# Check if function has Temporal activity definition
if hasattr(obj, '__temporal_activity_definition'):
activities.append(obj)
found_activities = True
logger.info(
f"✓ Discovered activity: {name} from {workflow_dir.name}"
)
if not found_activities:
logger.warning(
f"Workflow {workflow_dir.name} has activities.py but no @activity.defn decorated functions"
)
except Exception as e:
logger.error(
f"Error processing activities from {workflow_dir.name}: {e}",
exc_info=True
)
continue
logger.info(f"Discovered {len(activities)} workflow-specific activities")
return activities
async def main():
"""Main worker entry point"""
# Get configuration from environment
vertical = os.getenv("WORKER_VERTICAL", "python")
temporal_address = os.getenv("TEMPORAL_ADDRESS", "localhost:7233")
temporal_namespace = os.getenv("TEMPORAL_NAMESPACE", "default")
task_queue = os.getenv("WORKER_TASK_QUEUE", f"{vertical}-queue")
max_concurrent_activities = int(os.getenv("MAX_CONCURRENT_ACTIVITIES", "5"))
logger.info("=" * 60)
logger.info(f"FuzzForge Vertical Worker: {vertical}")
logger.info("=" * 60)
logger.info(f"Temporal Address: {temporal_address}")
logger.info(f"Temporal Namespace: {temporal_namespace}")
logger.info(f"Task Queue: {task_queue}")
logger.info(f"Max Concurrent Activities: {max_concurrent_activities}")
logger.info("=" * 60)
# Discover workflows for this vertical
logger.info(f"Discovering workflows for vertical: {vertical}")
workflows = await discover_workflows(vertical)
if not workflows:
logger.error(f"No workflows found for vertical: {vertical}")
logger.error("Worker cannot start without workflows. Exiting...")
sys.exit(1)
# Discover activities from workflow directories
logger.info("Discovering workflow-specific activities...")
workflows_dir = Path("/app/toolbox/workflows")
workflow_activities = await discover_activities(workflows_dir)
# Combine common storage activities with workflow-specific activities
activities = [
get_target_activity,
cleanup_cache_activity,
upload_results_activity
] + workflow_activities
logger.info(
f"Total activities registered: {len(activities)} "
f"(3 common + {len(workflow_activities)} workflow-specific)"
)
# Connect to Temporal
logger.info(f"Connecting to Temporal at {temporal_address}...")
try:
client = await Client.connect(
temporal_address,
namespace=temporal_namespace
)
logger.info("✓ Connected to Temporal successfully")
except Exception as e:
logger.error(f"Failed to connect to Temporal: {e}", exc_info=True)
sys.exit(1)
# Create worker with discovered workflows and activities
logger.info(f"Creating worker on task queue: {task_queue}")
try:
worker = Worker(
client,
task_queue=task_queue,
workflows=workflows,
activities=activities,
max_concurrent_activities=max_concurrent_activities
)
logger.info("✓ Worker created successfully")
except Exception as e:
logger.error(f"Failed to create worker: {e}", exc_info=True)
sys.exit(1)
# Start worker
logger.info("=" * 60)
logger.info(f"🚀 Worker started for vertical '{vertical}'")
logger.info(f"📦 Registered {len(workflows)} workflows")
logger.info(f"⚙️ Registered {len(activities)} activities")
logger.info(f"📨 Listening on task queue: {task_queue}")
logger.info("=" * 60)
logger.info("Worker is ready to process tasks...")
try:
await worker.run()
except KeyboardInterrupt:
logger.info("Shutting down worker (keyboard interrupt)...")
except Exception as e:
logger.error(f"Worker error: {e}", exc_info=True)
raise
if __name__ == "__main__":
try:
asyncio.run(main())
except KeyboardInterrupt:
logger.info("Worker stopped")
except Exception as e:
logger.error(f"Fatal error: {e}", exc_info=True)
sys.exit(1)
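
The discovery loop above keys on two things: a metadata.yaml whose vertical field matches the worker, and classes carrying the marker attribute that @workflow.defn sets. A minimal sketch of a toolbox layout it would register — the directory, class, and argument names are hypothetical:

```python
# toolbox/workflows/example_scan/workflow.py — hypothetical workflow that
# discover_workflows() would import and register for the 'python' vertical,
# provided a sibling metadata.yaml contains the line: vertical: python
from datetime import timedelta

from temporalio import workflow


@workflow.defn
class ExampleScanWorkflow:
    @workflow.run
    async def run(self, target_id: str) -> dict:
        # Delegate real work to an activity; the common storage activities
        # the worker registers are addressed by name the same way.
        path = await workflow.execute_activity(
            "get_target_activity",
            target_id,
            start_to_close_timeout=timedelta(minutes=5),
        )
        return {"target_path": path}
```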

workers/rust/Dockerfile Normal file

@@ -0,0 +1,87 @@
# FuzzForge Vertical Worker: Rust/Native Security
#
# Pre-installed tools for Rust and native binary security analysis:
# - Rust toolchain (rustc, cargo)
# - AFL++ (fuzzing)
# - cargo-fuzz (Rust fuzzing)
# - gdb (debugging)
# - valgrind (memory analysis)
# - AddressSanitizer/MemorySanitizer support
# - Common reverse engineering tools
FROM rust:1.83-slim-bookworm
# Set working directory
WORKDIR /app
# Install system dependencies
RUN apt-get update && apt-get install -y \
# Build essentials
build-essential \
cmake \
git \
curl \
wget \
pkg-config \
libssl-dev \
# AFL++ dependencies
clang \
llvm \
# Debugging and analysis tools
gdb \
valgrind \
strace \
# Binary analysis (binutils includes objdump, readelf, etc.)
binutils \
# Network tools
netcat-openbsd \
tcpdump \
# Python for Temporal worker
python3 \
python3-pip \
python3-venv \
# Cleanup
&& rm -rf /var/lib/apt/lists/*
# Install AFL++
RUN git clone https://github.com/AFLplusplus/AFLplusplus /tmp/aflplusplus && \
cd /tmp/aflplusplus && \
make all && \
make install && \
cd / && \
rm -rf /tmp/aflplusplus
# Install Rust toolchain components (nightly required for cargo-fuzz)
RUN rustup install nightly && \
rustup default nightly && \
rustup component add rustfmt clippy && \
rustup target add x86_64-unknown-linux-musl
# Install cargo-fuzz and other Rust security tools
RUN cargo install --locked \
cargo-fuzz \
cargo-audit \
cargo-outdated \
cargo-tree
# Install Python dependencies for Temporal worker
COPY requirements.txt /tmp/requirements.txt
RUN pip3 install --break-system-packages --no-cache-dir -r /tmp/requirements.txt && \
rm /tmp/requirements.txt
# Create cache directory for downloaded targets
RUN mkdir -p /cache && chmod 755 /cache
# Copy worker entrypoint
COPY worker.py /app/worker.py
# Add toolbox to Python path (mounted at runtime)
ENV PYTHONPATH="/app:/app/toolbox:${PYTHONPATH}"
ENV PYTHONUNBUFFERED=1
# Healthcheck
HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
CMD python3 -c "import sys; sys.exit(0)"
# Run worker
CMD ["python3", "/app/worker.py"]

workers/rust/requirements.txt Normal file

@@ -0,0 +1,22 @@
# Temporal Python SDK
temporalio>=1.5.0
# S3/MinIO client
boto3>=1.34.0
botocore>=1.34.0
# Data validation
pydantic>=2.5.0
# YAML parsing
PyYAML>=6.0.1
# Utilities
python-dotenv>=1.0.0
aiofiles>=23.2.1
# HTTP Client (for real-time stats reporting)
httpx>=0.27.0
# Logging
structlog>=24.1.0
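
httpx appears in both workers' requirements for real-time stats reporting. A minimal sketch of what such a callback could look like, assuming a hypothetical backend endpoint /runs/{run_id}/stats — the URL, env var, and payload shape are illustrative:

```python
# stats.py — hypothetical real-time stats callback posting fuzzer
# progress to the backend; endpoint and payload are assumptions.
import os

import httpx


async def report_stats(run_id: str, iterations: int, findings: int) -> None:
    base = os.getenv("BACKEND_URL", "http://backend:8000")
    async with httpx.AsyncClient(timeout=5.0) as client:
        # Stats are advisory, so failures are swallowed rather than
        # failing the fuzzing run.
        try:
            await client.post(
                f"{base}/runs/{run_id}/stats",
                json={"iterations": iterations, "findings": findings},
            )
        except httpx.HTTPError:
            pass
```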

workers/rust/worker.py Normal file

@@ -0,0 +1,309 @@
"""
FuzzForge Vertical Worker: Rust/Native Security
This worker:
1. Discovers workflows for the 'rust' vertical from mounted toolbox
2. Dynamically imports and registers workflow classes
3. Connects to Temporal and processes tasks
4. Handles activities for target download/upload from MinIO
"""
import asyncio
import importlib
import inspect
import logging
import os
import sys
from pathlib import Path
from typing import List, Any
import yaml
from temporalio.client import Client
from temporalio.worker import Worker
# Add toolbox to path for workflow and activity imports
sys.path.insert(0, '/app/toolbox')
# Import common storage activities
from toolbox.common.storage_activities import (
get_target_activity,
cleanup_cache_activity,
upload_results_activity
)
# Configure logging
logging.basicConfig(
level=os.getenv('LOG_LEVEL', 'INFO'),
format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)
async def discover_workflows(vertical: str) -> List[Any]:
"""
Discover workflows for this vertical from mounted toolbox.
Args:
vertical: The vertical name (e.g., 'rust', 'android', 'web')
Returns:
List of workflow classes decorated with @workflow.defn
"""
workflows = []
toolbox_path = Path("/app/toolbox/workflows")
if not toolbox_path.exists():
logger.warning(f"Toolbox path does not exist: {toolbox_path}")
return workflows
logger.info(f"Scanning for workflows in: {toolbox_path}")
for workflow_dir in toolbox_path.iterdir():
if not workflow_dir.is_dir():
continue
# Skip special directories
if workflow_dir.name.startswith('.') or workflow_dir.name == '__pycache__':
continue
metadata_file = workflow_dir / "metadata.yaml"
if not metadata_file.exists():
logger.debug(f"No metadata.yaml in {workflow_dir.name}, skipping")
continue
try:
# Parse metadata
with open(metadata_file) as f:
metadata = yaml.safe_load(f)
# Check if workflow is for this vertical
workflow_vertical = metadata.get("vertical")
if workflow_vertical != vertical:
logger.debug(
f"Workflow {workflow_dir.name} is for vertical '{workflow_vertical}', "
f"not '{vertical}', skipping"
)
continue
# Check if workflow.py exists
workflow_file = workflow_dir / "workflow.py"
if not workflow_file.exists():
logger.warning(
f"Workflow {workflow_dir.name} has metadata but no workflow.py, skipping"
)
continue
# Dynamically import workflow module
module_name = f"toolbox.workflows.{workflow_dir.name}.workflow"
logger.info(f"Importing workflow module: {module_name}")
try:
module = importlib.import_module(module_name)
except Exception as e:
logger.error(
f"Failed to import workflow module {module_name}: {e}",
exc_info=True
)
continue
# Find @workflow.defn decorated classes
found_workflows = False
for name, obj in inspect.getmembers(module, inspect.isclass):
# Check if class has Temporal workflow definition
if hasattr(obj, '__temporal_workflow_definition'):
workflows.append(obj)
found_workflows = True
logger.info(
f"✓ Discovered workflow: {name} from {workflow_dir.name} "
f"(vertical: {vertical})"
)
if not found_workflows:
logger.warning(
f"Workflow {workflow_dir.name} has no @workflow.defn decorated classes"
)
except Exception as e:
logger.error(
f"Error processing workflow {workflow_dir.name}: {e}",
exc_info=True
)
continue
logger.info(f"Discovered {len(workflows)} workflows for vertical '{vertical}'")
return workflows
async def discover_activities(workflows_dir: Path) -> List[Any]:
"""
Discover activities from workflow directories.
Looks for activities.py files alongside workflow.py in each workflow directory.
Args:
workflows_dir: Path to workflows directory
Returns:
List of activity functions decorated with @activity.defn
"""
activities = []
if not workflows_dir.exists():
logger.warning(f"Workflows directory does not exist: {workflows_dir}")
return activities
logger.info(f"Scanning for workflow activities in: {workflows_dir}")
for workflow_dir in workflows_dir.iterdir():
if not workflow_dir.is_dir():
continue
# Skip special directories
if workflow_dir.name.startswith('.') or workflow_dir.name == '__pycache__':
continue
# Check if activities.py exists
activities_file = workflow_dir / "activities.py"
if not activities_file.exists():
logger.debug(f"No activities.py in {workflow_dir.name}, skipping")
continue
try:
# Dynamically import activities module
module_name = f"toolbox.workflows.{workflow_dir.name}.activities"
logger.info(f"Importing activities module: {module_name}")
try:
module = importlib.import_module(module_name)
except Exception as e:
logger.error(
f"Failed to import activities module {module_name}: {e}",
exc_info=True
)
continue
# Find @activity.defn decorated functions
found_activities = False
for name, obj in inspect.getmembers(module, inspect.isfunction):
# Check if function has Temporal activity definition
if hasattr(obj, '__temporal_activity_definition'):
activities.append(obj)
found_activities = True
logger.info(
f"✓ Discovered activity: {name} from {workflow_dir.name}"
)
if not found_activities:
logger.warning(
f"Workflow {workflow_dir.name} has activities.py but no @activity.defn decorated functions"
)
except Exception as e:
logger.error(
f"Error processing activities from {workflow_dir.name}: {e}",
exc_info=True
)
continue
logger.info(f"Discovered {len(activities)} workflow-specific activities")
return activities
async def main():
"""Main worker entry point"""
# Get configuration from environment
vertical = os.getenv("WORKER_VERTICAL", "rust")
temporal_address = os.getenv("TEMPORAL_ADDRESS", "localhost:7233")
temporal_namespace = os.getenv("TEMPORAL_NAMESPACE", "default")
task_queue = os.getenv("WORKER_TASK_QUEUE", f"{vertical}-queue")
max_concurrent_activities = int(os.getenv("MAX_CONCURRENT_ACTIVITIES", "5"))
logger.info("=" * 60)
logger.info(f"FuzzForge Vertical Worker: {vertical}")
logger.info("=" * 60)
logger.info(f"Temporal Address: {temporal_address}")
logger.info(f"Temporal Namespace: {temporal_namespace}")
logger.info(f"Task Queue: {task_queue}")
logger.info(f"Max Concurrent Activities: {max_concurrent_activities}")
logger.info("=" * 60)
# Discover workflows for this vertical
logger.info(f"Discovering workflows for vertical: {vertical}")
workflows = await discover_workflows(vertical)
if not workflows:
logger.error(f"No workflows found for vertical: {vertical}")
logger.error("Worker cannot start without workflows. Exiting...")
sys.exit(1)
# Discover activities from workflow directories
logger.info("Discovering workflow-specific activities...")
workflows_dir = Path("/app/toolbox/workflows")
workflow_activities = await discover_activities(workflows_dir)
# Combine common storage activities with workflow-specific activities
activities = [
get_target_activity,
cleanup_cache_activity,
upload_results_activity
] + workflow_activities
logger.info(
f"Total activities registered: {len(activities)} "
f"(3 common + {len(workflow_activities)} workflow-specific)"
)
# Connect to Temporal
logger.info(f"Connecting to Temporal at {temporal_address}...")
try:
client = await Client.connect(
temporal_address,
namespace=temporal_namespace
)
logger.info("✓ Connected to Temporal successfully")
except Exception as e:
logger.error(f"Failed to connect to Temporal: {e}", exc_info=True)
sys.exit(1)
# Create worker with discovered workflows and activities
logger.info(f"Creating worker on task queue: {task_queue}")
try:
worker = Worker(
client,
task_queue=task_queue,
workflows=workflows,
activities=activities,
max_concurrent_activities=max_concurrent_activities
)
logger.info("✓ Worker created successfully")
except Exception as e:
logger.error(f"Failed to create worker: {e}", exc_info=True)
sys.exit(1)
# Start worker
logger.info("=" * 60)
logger.info(f"🚀 Worker started for vertical '{vertical}'")
logger.info(f"📦 Registered {len(workflows)} workflows")
logger.info(f"⚙️ Registered {len(activities)} activities")
logger.info(f"📨 Listening on task queue: {task_queue}")
logger.info("=" * 60)
logger.info("Worker is ready to process tasks...")
try:
await worker.run()
except KeyboardInterrupt:
logger.info("Shutting down worker (keyboard interrupt)...")
except Exception as e:
logger.error(f"Worker error: {e}", exc_info=True)
raise
if __name__ == "__main__":
try:
asyncio.run(main())
except KeyboardInterrupt:
logger.info("Worker stopped")
except Exception as e:
logger.error(f"Fatal error: {e}", exc_info=True)
sys.exit(1)
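
Once a worker is listening, clients reach it purely through the task queue name (WORKER_TASK_QUEUE, defaulting to '<vertical>-queue'). A minimal sketch of kicking off a run from the client side — the workflow name, argument, and ID scheme are hypothetical:

```python
# submit.py — hypothetical client-side submission; routing happens via
# the task queue, so "rust-queue" must match the worker's WORKER_TASK_QUEUE.
import asyncio

from temporalio.client import Client


async def main() -> None:
    client = await Client.connect("localhost:7233", namespace="default")
    handle = await client.start_workflow(
        "RustExampleWorkflow",        # workflow name (hypothetical)
        "target-123",                 # single positional workflow argument
        id="rust-example-target-123",
        task_queue="rust-queue",
    )
    print(f"Started run: {handle.id}")
    print(await handle.result())


if __name__ == "__main__":
    asyncio.run(main())
```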