mirror of
https://github.com/FuzzingLabs/fuzzforge_ai.git
synced 2026-06-06 15:34:01 +02:00
CI/CD Integration with Ephemeral Deployment Model (#14)
* feat: Complete migration from Prefect to Temporal BREAKING CHANGE: Replaces Prefect workflow orchestration with Temporal ## Major Changes - Replace Prefect with Temporal for workflow orchestration - Implement vertical worker architecture (rust, android) - Replace Docker registry with MinIO for unified storage - Refactor activities to be co-located with workflows - Update all API endpoints for Temporal compatibility ## Infrastructure - New: docker-compose.temporal.yaml (Temporal + MinIO + workers) - New: workers/ directory with rust and android vertical workers - New: backend/src/temporal/ (manager, discovery) - New: backend/src/storage/ (S3-cached storage with MinIO) - New: backend/toolbox/common/ (shared storage activities) - Deleted: docker-compose.yaml (old Prefect setup) - Deleted: backend/src/core/prefect_manager.py - Deleted: backend/src/services/prefect_stats_monitor.py - Deleted: Docker registry and insecure-registries requirement ## Workflows - Migrated: security_assessment workflow to Temporal - New: rust_test workflow (example/test workflow) - Deleted: secret_detection_scan (Prefect-based, to be reimplemented) - Activities now co-located with workflows for independent testing ## API Changes - Updated: backend/src/api/workflows.py (Temporal submission) - Updated: backend/src/api/runs.py (Temporal status/results) - Updated: backend/src/main.py (727 lines, TemporalManager integration) - Updated: All 16 MCP tools to use TemporalManager ## Testing - ✅ All services healthy (Temporal, PostgreSQL, MinIO, workers, backend) - ✅ All API endpoints functional - ✅ End-to-end workflow test passed (72 findings from vulnerable_app) - ✅ MinIO storage integration working (target upload/download, results) - ✅ Worker activity discovery working (6 activities registered) - ✅ Tarball extraction working - ✅ SARIF report generation working ## Documentation - ARCHITECTURE.md: Complete Temporal architecture documentation - QUICKSTART_TEMPORAL.md: Getting started guide - 
MIGRATION_DECISION.md: Why we chose Temporal over Prefect - IMPLEMENTATION_STATUS.md: Migration progress tracking - workers/README.md: Worker development guide ## Dependencies - Added: temporalio>=1.6.0 - Added: boto3>=1.34.0 (MinIO S3 client) - Removed: prefect>=3.4.18 * feat: Add Python fuzzing vertical with Atheris integration This commit implements a complete Python fuzzing workflow using Atheris: ## Python Worker (workers/python/) - Dockerfile with Python 3.11, Atheris, and build tools - Generic worker.py for dynamic workflow discovery - requirements.txt with temporalio, boto3, atheris dependencies - Added to docker-compose.temporal.yaml with dedicated cache volume ## AtherisFuzzer Module (backend/toolbox/modules/fuzzer/) - Reusable module extending BaseModule - Auto-discovers fuzz targets (fuzz_*.py, *_fuzz.py, fuzz_target.py) - Recursive search to find targets in nested directories - Dynamically loads TestOneInput() function - Configurable max_iterations and timeout - Real-time stats callback support for live monitoring - Returns findings as ModuleFinding objects ## Atheris Fuzzing Workflow (backend/toolbox/workflows/atheris_fuzzing/) - Temporal workflow for orchestrating fuzzing - Downloads user code from MinIO - Executes AtherisFuzzer module - Uploads results to MinIO - Cleans up cache after execution - metadata.yaml with vertical: python for routing ## Test Project (test_projects/python_fuzz_waterfall/) - Demonstrates stateful waterfall vulnerability - main.py with check_secret() that leaks progress - fuzz_target.py with Atheris TestOneInput() harness - Complete README with usage instructions ## Backend Fixes - Fixed parameter merging in REST API endpoints (workflows.py) - Changed workflow parameter passing from positional args to kwargs (manager.py) - Default parameters now properly merged with user parameters ## Testing ✅ Worker discovered AtherisFuzzingWorkflow ✅ Workflow executed end-to-end successfully ✅ Fuzz target auto-discovered in nested 
directories ✅ Atheris ran 100,000 iterations ✅ Results uploaded and cache cleaned * chore: Complete Temporal migration with updated CLI/SDK/docs This commit includes all remaining Temporal migration changes: ## CLI Updates (cli/) - Updated workflow execution commands for Temporal - Enhanced error handling and exceptions - Updated dependencies in uv.lock ## SDK Updates (sdk/) - Client methods updated for Temporal workflows - Updated models for new workflow execution - Updated dependencies in uv.lock ## Documentation Updates (docs/) - Architecture documentation for Temporal - Workflow concept documentation - Resource management documentation (new) - Debugging guide (new) - Updated tutorials and how-to guides - Troubleshooting updates ## README Updates - Main README with Temporal instructions - Backend README - CLI README - SDK README ## Other - Updated IMPLEMENTATION_STATUS.md - Removed old vulnerable_app.tar.gz These changes complete the Temporal migration and ensure the CLI/SDK work correctly with the new backend. * fix: Use positional args instead of kwargs for Temporal workflows The Temporal Python SDK's start_workflow() method doesn't accept a 'kwargs' parameter. Workflows must receive parameters as positional arguments via the 'args' parameter. Changed from: args=workflow_args # Positional arguments This fixes the error: TypeError: Client.start_workflow() got an unexpected keyword argument 'kwargs' Workflows now correctly receive parameters in order: - security_assessment: [target_id, scanner_config, analyzer_config, reporter_config] - atheris_fuzzing: [target_id, target_file, max_iterations, timeout_seconds] - rust_test: [target_id, test_message] * fix: Filter metadata-only parameters from workflow arguments SecurityAssessmentWorkflow was receiving 7 arguments instead of 2-5. The issue was that target_path and volume_mode from default_parameters were being passed to the workflow, when they should only be used by the system for configuration. 
Now filters out metadata-only parameters (target_path, volume_mode) before passing arguments to workflow execution. * refactor: Remove Prefect leftovers and volume mounting legacy Complete cleanup of Prefect migration artifacts: Backend: - Delete registry.py and workflow_discovery.py (Prefect-specific files) - Remove Docker validation from setup.py (no longer needed) - Remove ResourceLimits and VolumeMount models - Remove target_path and volume_mode from WorkflowSubmission - Remove supported_volume_modes from API and discovery - Clean up metadata.yaml files (remove volume/path fields) - Simplify parameter filtering in manager.py SDK: - Remove volume_mode parameter from client methods - Remove ResourceLimits and VolumeMount models - Remove Prefect error patterns from docker_logs.py - Clean up WorkflowSubmission and WorkflowMetadata models CLI: - Remove Volume Modes display from workflow info All removed features are Prefect-specific or Docker volume mounting artifacts. Temporal workflows use MinIO storage exclusively. * feat: Add comprehensive test suite and benchmark infrastructure - Add 68 unit tests for fuzzer, scanner, and analyzer modules - Implement pytest-based test infrastructure with fixtures - Add 6 performance benchmarks with category-specific thresholds - Configure GitHub Actions for automated testing and benchmarking - Add test and benchmark documentation Test coverage: - AtherisFuzzer: 8 tests - CargoFuzzer: 14 tests - FileScanner: 22 tests - SecurityAnalyzer: 24 tests All tests passing (68/68) All benchmarks passing (6/6) * fix: Resolve all ruff linting violations across codebase Fixed 27 ruff violations in 12 files: - Removed unused imports (Depends, Dict, Any, Optional, etc.) - Fixed undefined workflow_info variable in workflows.py - Removed dead code with undefined variables in atheris_fuzzer.py - Changed f-string to regular string where no placeholders used All files now pass ruff checks for CI/CD compliance. 
* fix: Configure CI for unit tests only - Renamed docker-compose.temporal.yaml → docker-compose.yml for CI compatibility - Commented out integration-tests job (no integration tests yet) - Updated test-summary to only depend on lint and unit-tests CI will now run successfully with 68 unit tests. Integration tests can be added later. * feat: Add CI/CD integration with ephemeral deployment model Implements comprehensive CI/CD support for FuzzForge with on-demand worker management: **Worker Management (v0.7.0)** - Add WorkerManager for automatic worker lifecycle control - Auto-start workers from stopped state when workflows execute - Auto-stop workers after workflow completion - Health checks and startup timeout handling (90s default) **CI/CD Features** - `--fail-on` flag: Fail builds based on SARIF severity levels (error/warning/note/info) - `--export-sarif` flag: Export findings in SARIF 2.1.0 format - `--auto-start`/`--auto-stop` flags: Control worker lifecycle - Exit code propagation: Returns 1 on blocking findings, 0 on success **Exit Code Fix** - Add `except typer.Exit: raise` handlers at 3 critical locations - Move worker cleanup to finally block for guaranteed execution - Exit codes now propagate correctly even when build fails **CI Scripts & Examples** - ci-start.sh: Start FuzzForge services with health checks - ci-stop.sh: Clean shutdown with volume preservation option - GitHub Actions workflow example (security-scan.yml) - GitLab CI pipeline example (.gitlab-ci.example.yml) - docker-compose.ci.yml: CI-optimized compose file with profiles **OSS-Fuzz Integration** - New ossfuzz_campaign workflow for running OSS-Fuzz projects - OSS-Fuzz worker with Docker-in-Docker support - Configurable campaign duration and project selection **Documentation** - Comprehensive CI/CD integration guide (docs/how-to/cicd-integration.md) - Updated architecture docs with worker lifecycle details - Updated workspace isolation documentation - CLI README with worker management examples 
**SDK Enhancements** - Add get_workflow_worker_info() endpoint - Worker vertical metadata in workflow responses **Testing** - All workflows tested: security_assessment, atheris_fuzzing, secret_detection, cargo_fuzzing - All monitoring commands tested: stats, crashes, status, finding - Full CI pipeline simulation verified - Exit codes verified for success/failure scenarios Ephemeral CI/CD model: ~3-4GB RAM, ~60-90s startup, runs entirely in CI containers. * fix: Resolve ruff linting violations in CI/CD code - Remove unused variables (run_id, defaults, result) - Remove unused imports - Fix f-string without placeholders All CI/CD integration files now pass ruff checks.
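The `--fail-on` behaviour described above can be pictured as a small gate over the exported SARIF file. The following is a minimal sketch, not the FuzzForge CLI's actual implementation: the function names `blocking_findings` and `exit_code` are hypothetical, and it assumes `--fail-on` matches the listed levels exactly rather than "this level and above".

```python
from typing import Dict, List, Set


def blocking_findings(sarif: Dict, fail_on: Set[str]) -> List[Dict]:
    """Return the SARIF results whose level is listed in fail_on."""
    found = []
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # Assumption: a result without an explicit level is treated as "warning".
            if result.get("level", "warning") in fail_on:
                found.append(result)
    return found


def exit_code(sarif: Dict, fail_on: Set[str]) -> int:
    """Return 1 when at least one blocking finding exists, otherwise 0."""
    return 1 if blocking_findings(sarif, fail_on) else 0
```

A CI wrapper would load the file written by `--export-sarif` with `json.load` and pass the parsed document to `exit_code`, then hand the result to `sys.exit`.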
@@ -0,0 +1,165 @@
|
||||
name: Benchmarks
|
||||
|
||||
on:
|
||||
# Run on schedule (nightly)
|
||||
schedule:
|
||||
- cron: '0 2 * * *' # 2 AM UTC every day
|
||||
|
||||
# Allow manual trigger
|
||||
workflow_dispatch:
|
||||
inputs:
|
||||
compare_with:
|
||||
description: 'Baseline commit to compare against (optional)'
|
||||
required: false
|
||||
default: ''
|
||||
|
||||
# Run on PR when benchmarks are modified
|
||||
pull_request:
|
||||
paths:
|
||||
- 'backend/benchmarks/**'
|
||||
- 'backend/toolbox/modules/**'
|
||||
- '.github/workflows/benchmark.yml'
|
||||
|
||||
jobs:
|
||||
benchmark:
|
||||
name: Run Benchmarks
|
||||
runs-on: ubuntu-latest
|
||||
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
with:
|
||||
fetch-depth: 0 # Fetch all history for comparison
|
||||
|
||||
- name: Set up Python
|
||||
uses: actions/setup-python@v5
|
||||
with:
|
||||
python-version: '3.11'
|
||||
|
||||
- name: Install system dependencies
|
||||
run: |
|
||||
sudo apt-get update
|
||||
sudo apt-get install -y build-essential
|
||||
|
||||
- name: Install Python dependencies
|
||||
working-directory: ./backend
|
||||
run: |
|
||||
python -m pip install --upgrade pip
|
||||
pip install -e ".[dev]"
|
||||
pip install pytest pytest-asyncio "pytest-benchmark[histogram]"
|
||||
|
||||
- name: Run benchmarks
|
||||
working-directory: ./backend
|
||||
run: |
|
||||
pytest benchmarks/ \
|
||||
-v \
|
||||
--benchmark-only \
|
||||
--benchmark-json=benchmark-results.json \
|
||||
--benchmark-histogram=benchmark-histogram
|
||||
|
||||
- name: Store benchmark results
|
||||
uses: actions/upload-artifact@v4
|
||||
with:
|
||||
name: benchmark-results-${{ github.run_number }}
|
||||
path: |
|
||||
backend/benchmark-results.json
|
||||
backend/benchmark-histogram.svg
|
||||
|
||||
- name: Download baseline benchmarks
|
||||
if: github.event_name == 'pull_request'
|
||||
uses: dawidd6/action-download-artifact@v3
|
||||
continue-on-error: true
|
||||
with:
|
||||
workflow: benchmark.yml
|
||||
branch: ${{ github.base_ref }}
|
||||
name: benchmark-results-*
|
||||
path: ./baseline
|
||||
search_artifacts: true
|
||||
|
||||
- name: Compare with baseline
|
||||
if: github.event_name == 'pull_request' && hashFiles('baseline/benchmark-results.json') != ''
|
||||
run: |
|
||||
python -c "
|
||||
import json
|
||||
import sys
|
||||
|
||||
with open('backend/benchmark-results.json') as f:
|
||||
current = json.load(f)
|
||||
|
||||
with open('baseline/benchmark-results.json') as f:
|
||||
baseline = json.load(f)
|
||||
|
||||
print('\\n## Benchmark Comparison\\n')
|
||||
print('| Benchmark | Current | Baseline | Change |')
|
||||
print('|-----------|---------|----------|--------|')
|
||||
|
||||
regressions = []
|
||||
|
||||
for bench in current['benchmarks']:
|
||||
name = bench['name']
|
||||
current_time = bench['stats']['mean']
|
||||
|
||||
# Find matching baseline
|
||||
baseline_bench = next((b for b in baseline['benchmarks'] if b['name'] == name), None)
|
||||
if baseline_bench:
|
||||
baseline_time = baseline_bench['stats']['mean']
|
||||
change = ((current_time - baseline_time) / baseline_time) * 100
|
||||
|
||||
print(f'| {name} | {current_time:.4f}s | {baseline_time:.4f}s | {change:+.2f}% |')
|
||||
|
||||
# Flag regressions > 10%
|
||||
if change > 10:
|
||||
regressions.append((name, change))
|
||||
else:
|
||||
print(f'| {name} | {current_time:.4f}s | N/A | NEW |')
|
||||
|
||||
if regressions:
|
||||
print('\\n⚠️ **Performance Regressions Detected:**')
|
||||
for name, change in regressions:
|
||||
print(f'- {name}: +{change:.2f}%')
|
||||
sys.exit(1)
|
||||
else:
|
||||
print('\\n✅ No significant performance regressions detected')
|
||||
"
|
||||
|
||||
- name: Comment PR with results
|
||||
if: github.event_name == 'pull_request'
|
||||
uses: actions/github-script@v7
|
||||
with:
|
||||
script: |
|
||||
const fs = require('fs');
|
||||
const results = JSON.parse(fs.readFileSync('backend/benchmark-results.json', 'utf8'));
|
||||
|
||||
let body = '## Benchmark Results\n\n';
body += '| Category | Benchmark | Mean Time | Std Dev |\n';
body += '|----------|-----------|-----------|---------|\n';
|
||||
|
||||
for (const bench of results.benchmarks) {
|
||||
const group = bench.group || 'ungrouped';
|
||||
const name = bench.name.split('::').pop();
|
||||
const mean = bench.stats.mean.toFixed(4);
|
||||
const stddev = bench.stats.stddev.toFixed(4);
|
||||
body += `| ${group} | ${name} | ${mean}s | ${stddev}s |\n`;
|
||||
}
|
||||
|
||||
body += '\n📊 Full benchmark results available in artifacts.';
|
||||
|
||||
await github.rest.issues.createComment({
|
||||
issue_number: context.issue.number,
|
||||
owner: context.repo.owner,
|
||||
repo: context.repo.repo,
|
||||
body: body
|
||||
});
|
||||
|
||||
benchmark-summary:
|
||||
name: Benchmark Summary
|
||||
runs-on: ubuntu-latest
|
||||
needs: benchmark
|
||||
if: always()
|
||||
steps:
|
||||
- name: Check results
|
||||
run: |
|
||||
if [ "${{ needs.benchmark.result }}" != "success" ]; then
|
||||
echo "Benchmarks failed or detected regressions"
|
||||
exit 1
|
||||
fi
|
||||
echo "Benchmarks completed successfully!"
|
||||
@@ -0,0 +1,152 @@
|
||||
# FuzzForge CI/CD Example - Security Scanning
|
||||
#
|
||||
# This workflow demonstrates how to integrate FuzzForge into your CI/CD pipeline
|
||||
# for automated security testing on pull requests and pushes.
|
||||
#
|
||||
# Features:
|
||||
# - Runs entirely in GitHub Actions (no external infrastructure needed)
|
||||
# - Auto-starts FuzzForge services on-demand
|
||||
# - Fails builds on error-level SARIF findings
|
||||
# - Uploads SARIF results to GitHub Security tab
|
||||
# - Exports findings as artifacts
|
||||
#
|
||||
# Prerequisites:
|
||||
# - Ubuntu runner with Docker support
|
||||
# - At least 4GB RAM available
|
||||
# - ~90 seconds startup time
|
||||
|
||||
name: Security Scan Example
|
||||
|
||||
on:
|
||||
pull_request:
|
||||
branches: [main, develop]
|
||||
push:
|
||||
branches: [main]
|
||||
|
||||
jobs:
|
||||
security-scan:
|
||||
name: Security Assessment
|
||||
runs-on: ubuntu-latest
|
||||
timeout-minutes: 30
|
||||
|
||||
steps:
|
||||
- name: Checkout code
|
||||
uses: actions/checkout@v4
|
||||
|
||||
- name: Start FuzzForge
|
||||
run: |
|
||||
bash scripts/ci-start.sh
|
||||
|
||||
- name: Set up Python
|
||||
uses: actions/setup-python@v5
|
||||
with:
|
||||
python-version: '3.11'
|
||||
|
||||
- name: Install FuzzForge CLI
|
||||
run: |
|
||||
pip install ./cli
|
||||
|
||||
- name: Initialize FuzzForge
|
||||
run: |
|
||||
ff init --api-url http://localhost:8000 --name "GitHub Actions Security Scan"
|
||||
|
||||
- name: Run Security Assessment
|
||||
run: |
|
||||
ff workflow run security_assessment . \
|
||||
--wait \
|
||||
--fail-on error \
|
||||
--export-sarif results.sarif
|
||||
|
||||
- name: Upload SARIF to GitHub Security
|
||||
if: always()
|
||||
uses: github/codeql-action/upload-sarif@v3
|
||||
with:
|
||||
sarif_file: results.sarif
|
||||
|
||||
- name: Upload findings as artifact
|
||||
if: always()
|
||||
uses: actions/upload-artifact@v4
|
||||
with:
|
||||
name: security-findings
|
||||
path: results.sarif
|
||||
retention-days: 30
|
||||
|
||||
- name: Stop FuzzForge
|
||||
if: always()
|
||||
run: |
|
||||
bash scripts/ci-stop.sh
|
||||
|
||||
secret-scan:
|
||||
name: Secret Detection
|
||||
runs-on: ubuntu-latest
|
||||
timeout-minutes: 15
|
||||
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
|
||||
- name: Start FuzzForge
|
||||
run: bash scripts/ci-start.sh
|
||||
|
||||
- name: Install CLI
|
||||
run: |
|
||||
pip install ./cli
|
||||
|
||||
- name: Initialize & Scan
|
||||
run: |
|
||||
ff init --api-url http://localhost:8000 --name "Secret Detection"
|
||||
ff workflow run secret_detection . \
|
||||
--wait \
|
||||
--fail-on all \
|
||||
--export-sarif secrets.sarif
|
||||
|
||||
- name: Upload results
|
||||
if: always()
|
||||
uses: actions/upload-artifact@v4
|
||||
with:
|
||||
name: secret-scan-results
|
||||
path: secrets.sarif
|
||||
retention-days: 30
|
||||
|
||||
- name: Cleanup
|
||||
if: always()
|
||||
run: bash scripts/ci-stop.sh
|
||||
|
||||
# Example: Nightly fuzzing campaign (long-running)
|
||||
nightly-fuzzing:
|
||||
name: Nightly Fuzzing
|
||||
runs-on: ubuntu-latest
|
||||
timeout-minutes: 120
|
||||
# Only runs on scheduled events; this example's `on:` block defines no `schedule:` trigger, so add one to enable this job
|
||||
if: github.event_name == 'schedule'
|
||||
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
|
||||
- name: Start FuzzForge
|
||||
run: bash scripts/ci-start.sh
|
||||
|
||||
- name: Install CLI
|
||||
run: pip install ./cli
|
||||
|
||||
- name: Run Fuzzing Campaign
|
||||
run: |
|
||||
ff init --api-url http://localhost:8000
|
||||
ff workflow run atheris_fuzzing . \
|
||||
max_iterations=100000000 \
|
||||
timeout_seconds=7200 \
|
||||
--wait \
|
||||
--export-sarif fuzzing-results.sarif
|
||||
# Don't fail on fuzzing findings, just report
|
||||
continue-on-error: true
|
||||
|
||||
- name: Upload fuzzing results
|
||||
if: always()
|
||||
uses: actions/upload-artifact@v4
|
||||
with:
|
||||
name: fuzzing-results
|
||||
path: fuzzing-results.sarif
|
||||
retention-days: 90
|
||||
|
||||
- name: Cleanup
|
||||
if: always()
|
||||
run: bash scripts/ci-stop.sh
|
||||
@@ -0,0 +1,155 @@
|
||||
name: Tests
|
||||
|
||||
on:
|
||||
push:
|
||||
branches: [ main, master, develop, feature/** ]
|
||||
pull_request:
|
||||
branches: [ main, master, develop ]
|
||||
|
||||
jobs:
|
||||
lint:
|
||||
name: Lint
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
|
||||
- name: Set up Python
|
||||
uses: actions/setup-python@v5
|
||||
with:
|
||||
python-version: '3.11'
|
||||
|
||||
- name: Install dependencies
|
||||
run: |
|
||||
python -m pip install --upgrade pip
|
||||
pip install ruff mypy
|
||||
|
||||
- name: Run ruff
|
||||
run: ruff check backend/src backend/toolbox backend/tests backend/benchmarks --output-format=github
|
||||
|
||||
- name: Run mypy (continue on error)
|
||||
run: mypy backend/src backend/toolbox || true
|
||||
continue-on-error: true
|
||||
|
||||
unit-tests:
|
||||
name: Unit Tests
|
||||
runs-on: ubuntu-latest
|
||||
strategy:
|
||||
matrix:
|
||||
python-version: ['3.11', '3.12']
|
||||
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
|
||||
- name: Set up Python ${{ matrix.python-version }}
|
||||
uses: actions/setup-python@v5
|
||||
with:
|
||||
python-version: ${{ matrix.python-version }}
|
||||
|
||||
- name: Install system dependencies
|
||||
run: |
|
||||
sudo apt-get update
|
||||
sudo apt-get install -y build-essential
|
||||
|
||||
- name: Install Python dependencies
|
||||
working-directory: ./backend
|
||||
run: |
|
||||
python -m pip install --upgrade pip
|
||||
pip install -e ".[dev]"
|
||||
pip install pytest pytest-asyncio pytest-cov pytest-xdist
|
||||
|
||||
- name: Run unit tests
|
||||
working-directory: ./backend
|
||||
run: |
|
||||
pytest tests/unit/ \
|
||||
-v \
|
||||
--cov=toolbox/modules \
|
||||
--cov=src \
|
||||
--cov-report=xml \
|
||||
--cov-report=term \
|
||||
--cov-report=html \
|
||||
-n auto
|
||||
|
||||
- name: Upload coverage to Codecov
|
||||
if: matrix.python-version == '3.11'
|
||||
uses: codecov/codecov-action@v4
|
||||
with:
|
||||
file: ./backend/coverage.xml
|
||||
flags: unittests
|
||||
name: codecov-backend
|
||||
|
||||
- name: Upload coverage HTML
|
||||
if: matrix.python-version == '3.11'
|
||||
uses: actions/upload-artifact@v4
|
||||
with:
|
||||
name: coverage-report
|
||||
path: ./backend/htmlcov/
|
||||
|
||||
# integration-tests:
|
||||
# name: Integration Tests
|
||||
# runs-on: ubuntu-latest
|
||||
# needs: unit-tests
|
||||
#
|
||||
# services:
|
||||
# postgres:
|
||||
# image: postgres:15
|
||||
# env:
|
||||
# POSTGRES_USER: postgres
|
||||
# POSTGRES_PASSWORD: postgres
|
||||
# POSTGRES_DB: fuzzforge_test
|
||||
# options: >-
|
||||
# --health-cmd pg_isready
|
||||
# --health-interval 10s
|
||||
# --health-timeout 5s
|
||||
# --health-retries 5
|
||||
# ports:
|
||||
# - 5432:5432
|
||||
#
|
||||
# steps:
|
||||
# - uses: actions/checkout@v4
|
||||
#
|
||||
# - name: Set up Python
|
||||
# uses: actions/setup-python@v5
|
||||
# with:
|
||||
# python-version: '3.11'
|
||||
#
|
||||
# - name: Set up Docker Buildx
|
||||
# uses: docker/setup-buildx-action@v3
|
||||
#
|
||||
# - name: Install Python dependencies
|
||||
# working-directory: ./backend
|
||||
# run: |
|
||||
# python -m pip install --upgrade pip
|
||||
# pip install -e ".[dev]"
|
||||
# pip install pytest pytest-asyncio
|
||||
#
|
||||
# - name: Start services (Temporal, MinIO)
|
||||
# run: |
|
||||
# docker-compose -f docker-compose.yml up -d temporal minio
|
||||
# sleep 30
|
||||
#
|
||||
# - name: Run integration tests
|
||||
# working-directory: ./backend
|
||||
# run: |
|
||||
# pytest tests/integration/ -v --tb=short
|
||||
# env:
|
||||
# DATABASE_URL: postgresql://postgres:postgres@localhost:5432/fuzzforge_test
|
||||
# TEMPORAL_ADDRESS: localhost:7233
|
||||
# MINIO_ENDPOINT: localhost:9000
|
||||
#
|
||||
# - name: Shutdown services
|
||||
# if: always()
|
||||
# run: docker-compose down
|
||||
|
||||
test-summary:
|
||||
name: Test Summary
|
||||
runs-on: ubuntu-latest
|
||||
needs: [lint, unit-tests]
|
||||
if: always()
|
||||
steps:
|
||||
- name: Check test results
|
||||
run: |
|
||||
if [ "${{ needs.unit-tests.result }}" != "success" ]; then
|
||||
echo "Unit tests failed"
|
||||
exit 1
|
||||
fi
|
||||
echo "All tests passed!"
|
||||
@@ -0,0 +1,121 @@
|
||||
# FuzzForge CI/CD Example - GitLab CI
|
||||
#
|
||||
# This file demonstrates how to integrate FuzzForge into your GitLab CI/CD pipeline.
|
||||
# Copy this to `.gitlab-ci.yml` in your project root to enable security scanning.
|
||||
#
|
||||
# Features:
|
||||
# - Runs entirely in GitLab runners (no external infrastructure)
|
||||
# - Auto-starts FuzzForge services on-demand
|
||||
# - Fails pipelines on error-level SARIF findings (--fail-on error)
|
||||
# - Uploads SARIF reports to GitLab Security Dashboard
|
||||
# - Exports findings as artifacts
|
||||
#
|
||||
# Prerequisites:
|
||||
# - GitLab Runner with Docker support (docker:dind)
|
||||
# - At least 4GB RAM available
|
||||
# - ~90 seconds startup time
|
||||
|
||||
stages:
|
||||
- security
|
||||
|
||||
variables:
|
||||
FUZZFORGE_API_URL: "http://localhost:8000"
|
||||
DOCKER_DRIVER: overlay2
|
||||
DOCKER_TLS_CERTDIR: ""
|
||||
|
||||
# Base template for all FuzzForge jobs
|
||||
.fuzzforge_template:
|
||||
image: docker:24
|
||||
services:
|
||||
- docker:24-dind
|
||||
before_script:
|
||||
# Install dependencies
|
||||
- apk add --no-cache bash curl python3 py3-pip git
|
||||
# Start FuzzForge
|
||||
- bash scripts/ci-start.sh
|
||||
# Install CLI
|
||||
- pip3 install ./cli --break-system-packages
|
||||
# Initialize project
|
||||
- ff init --api-url $FUZZFORGE_API_URL --name "GitLab CI Security Scan"
|
||||
after_script:
|
||||
# Cleanup
|
||||
- bash scripts/ci-stop.sh || true
|
||||
|
||||
# Security Assessment - Comprehensive code analysis
|
||||
security:scan:
|
||||
extends: .fuzzforge_template
|
||||
stage: security
|
||||
timeout: 30 minutes
|
||||
script:
|
||||
- ff workflow run security_assessment . --wait --fail-on error --export-sarif results.sarif
|
||||
artifacts:
|
||||
when: always
|
||||
reports:
|
||||
sast: results.sarif
|
||||
paths:
|
||||
- results.sarif
|
||||
expire_in: 30 days
|
||||
only:
|
||||
- merge_requests
|
||||
- main
|
||||
- develop
|
||||
|
||||
# Secret Detection - Scan for exposed credentials
|
||||
security:secrets:
|
||||
extends: .fuzzforge_template
|
||||
stage: security
|
||||
timeout: 15 minutes
|
||||
script:
|
||||
- ff workflow run secret_detection . --wait --fail-on all --export-sarif secrets.sarif
|
||||
artifacts:
|
||||
when: always
|
||||
paths:
|
||||
- secrets.sarif
|
||||
expire_in: 30 days
|
||||
only:
|
||||
- merge_requests
|
||||
- main
|
||||
|
||||
# Nightly Fuzzing - Long-running fuzzing campaign (scheduled only)
|
||||
security:fuzzing:
|
||||
extends: .fuzzforge_template
|
||||
stage: security
|
||||
timeout: 2 hours
|
||||
script:
|
||||
- |
|
||||
ff workflow run atheris_fuzzing . \
|
||||
max_iterations=100000000 \
|
||||
timeout_seconds=7200 \
|
||||
--wait \
|
||||
--export-sarif fuzzing-results.sarif
|
||||
artifacts:
|
||||
when: always
|
||||
paths:
|
||||
- fuzzing-results.sarif
|
||||
expire_in: 90 days
|
||||
allow_failure: true # Don't fail pipeline on fuzzing findings
|
||||
only:
|
||||
- schedules
|
||||
|
||||
# OSS-Fuzz Campaign (for supported projects)
|
||||
security:ossfuzz:
|
||||
extends: .fuzzforge_template
|
||||
stage: security
|
||||
timeout: 1 hour
|
||||
script:
|
||||
- |
|
||||
ff workflow run ossfuzz_campaign . \
|
||||
project_name=your-project-name \
|
||||
campaign_duration_hours=0.5 \
|
||||
--wait \
|
||||
--export-sarif ossfuzz-results.sarif
|
||||
artifacts:
|
||||
when: always
|
||||
paths:
|
||||
- ossfuzz-results.sarif
|
||||
expire_in: 90 days
|
||||
allow_failure: true
|
||||
only:
|
||||
- schedules
|
||||
# Uncomment and set your project name
|
||||
# when: manual
|
||||
@@ -0,0 +1,421 @@
|
||||
# FuzzForge Temporal Architecture - Quick Start Guide
|
||||
|
||||
This guide walks you through starting and testing the new Temporal-based architecture.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
- Docker and Docker Compose installed
|
||||
- At least 2GB free RAM (core services only, workers start on-demand)
|
||||
- Ports available: 7233, 8233, 9000, 9001, 8000
|
||||
|
||||
## Step 1: Start Core Services
|
||||
|
||||
```bash
|
||||
# From project root
|
||||
cd /path/to/fuzzforge_ai
|
||||
|
||||
# Start core services (Temporal, MinIO, Backend)
|
||||
docker-compose up -d
|
||||
|
||||
# Workers are pre-built but don't auto-start (saves ~6-7GB RAM)
|
||||
# They'll start automatically when workflows need them
|
||||
|
||||
# Check status
|
||||
docker-compose ps
|
||||
```
|
||||
|
||||
**Expected output:**
|
||||
```
|
||||
NAME STATUS PORTS
|
||||
fuzzforge-minio healthy 0.0.0.0:9000-9001->9000-9001/tcp
|
||||
fuzzforge-temporal healthy 0.0.0.0:7233->7233/tcp
|
||||
fuzzforge-temporal-postgresql healthy 5432/tcp
|
||||
fuzzforge-backend healthy 0.0.0.0:8000->8000/tcp
|
||||
fuzzforge-minio-setup exited (0)
|
||||
# Workers NOT running (will start on-demand)
|
||||
```
|
||||
|
||||
**First startup takes ~30-60 seconds** for health checks to pass.
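Rather than sleeping for a fixed time, a script can poll until the services answer. The sketch below is one way to do that under stated assumptions: `wait_until` and `http_ok` are hypothetical helper names, and the `/health` path in the usage comment is an assumed endpoint, not one confirmed by this guide.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 2.0) -> bool:
    """Return True when the URL answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status: not ready yet.
        return False


def wait_until(
    probe: Callable[[], bool],
    timeout: float = 90.0,
    interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Call probe repeatedly until it returns True or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Usage (the /health path is an assumption):
# ready = wait_until(lambda: http_ok("http://localhost:8000/health"))
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real waiting.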
|
||||
|
||||
## Step 2: Verify Worker Discovery
|
||||
|
||||
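Workers start on demand, so first make sure the worker container is running (see Step 2.5), then check its logs to confirm that workflows were discovered: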
|
||||
|
||||
```bash
|
||||
docker logs fuzzforge-worker-rust
|
||||
```
|
||||
|
||||
**Expected output:**
|
||||
```
|
||||
============================================================
|
||||
FuzzForge Vertical Worker: rust
|
||||
============================================================
|
||||
Temporal Address: temporal:7233
|
||||
Task Queue: rust-queue
|
||||
Max Concurrent Activities: 5
|
||||
============================================================
|
||||
Discovering workflows for vertical: rust
|
||||
Importing workflow module: toolbox.workflows.rust_test.workflow
|
||||
✓ Discovered workflow: RustTestWorkflow from rust_test (vertical: rust)
|
||||
Discovered 1 workflows for vertical 'rust'
|
||||
Connecting to Temporal at temporal:7233...
|
||||
✓ Connected to Temporal successfully
|
||||
Creating worker on task queue: rust-queue
|
||||
✓ Worker created successfully
|
||||
============================================================
|
||||
🚀 Worker started for vertical 'rust'
|
||||
📦 Registered 1 workflows
|
||||
⚙️ Registered 3 activities
|
||||
📨 Listening on task queue: rust-queue
|
||||
============================================================
|
||||
Worker is ready to process tasks...
|
||||
```
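The log above suggests that each worker serves one vertical and listens on a queue named after it (`rust` maps to `rust-queue`). A minimal sketch of that routing idea follows. It assumes the `<vertical>-queue` naming holds for every vertical and that workflow metadata is already loaded into dictionaries; `task_queue_for` and `workflows_for_vertical` are hypothetical names, not FuzzForge functions.

```python
from typing import Dict, List


def task_queue_for(vertical: str) -> str:
    """Derive the task queue name from a vertical (assumed naming scheme)."""
    return f"{vertical}-queue"


def workflows_for_vertical(metadata: Dict[str, dict], vertical: str) -> List[str]:
    """Return the names of workflows whose metadata declares the given vertical."""
    return sorted(
        name for name, meta in metadata.items() if meta.get("vertical") == vertical
    )


# Example metadata, shaped like the `vertical:` field of metadata.yaml.
example = {
    "rust_test": {"vertical": "rust"},
    "atheris_fuzzing": {"vertical": "python"},
}
print(task_queue_for("rust"), workflows_for_vertical(example, "rust"))
```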
|
||||
|
||||
## Step 2.5: Worker Lifecycle Management (New in v0.7.0)
|
||||
|
||||
Workers start on-demand when workflows need them:
|
||||
|
||||
```bash
|
||||
# Check worker status (should show Exited or not running)
|
||||
docker ps -a --filter "name=fuzzforge-worker"
|
||||
|
||||
# Run a workflow - worker starts automatically
|
||||
ff workflow run ossfuzz_campaign . project_name=zlib
|
||||
|
||||
# Worker is now running
|
||||
docker ps --filter "name=fuzzforge-worker-ossfuzz"
|
||||
```
|
||||
|
||||
**Configuration** (`.fuzzforge/config.yaml`):
|
||||
```yaml
workers:
  auto_start_workers: true    # Default: auto-start
  auto_stop_workers: false    # Default: keep running
  worker_startup_timeout: 60  # Startup timeout in seconds
```
|
||||
|
||||
**CLI Control**:
|
||||
```bash
|
||||
# Disable auto-start
|
||||
ff workflow run ossfuzz_campaign . --no-auto-start
|
||||
|
||||
# Enable auto-stop after completion
|
||||
ff workflow run ossfuzz_campaign . --wait --auto-stop
|
||||
```
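One plausible way to combine the config file with the CLI flags is to let an explicit flag override the configured value and fall back to the documented defaults otherwise. This is a sketch of that precedence only: `WorkerOptions` and `resolve_worker_options` are hypothetical names, and the real CLI may resolve these settings differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class WorkerOptions:
    auto_start: bool = True    # mirrors auto_start_workers
    auto_stop: bool = False    # mirrors auto_stop_workers
    startup_timeout: int = 60  # mirrors worker_startup_timeout (seconds)


def resolve_worker_options(
    config: dict,
    cli_auto_start: Optional[bool] = None,
    cli_auto_stop: Optional[bool] = None,
) -> WorkerOptions:
    """Merge config values with CLI flags; None means the flag was not given."""
    workers = config.get("workers") or {}
    defaults = WorkerOptions()
    auto_start = workers.get("auto_start_workers", defaults.auto_start)
    auto_stop = workers.get("auto_stop_workers", defaults.auto_stop)
    timeout = workers.get("worker_startup_timeout", defaults.startup_timeout)
    if cli_auto_start is not None:
        auto_start = cli_auto_start
    if cli_auto_stop is not None:
        auto_stop = cli_auto_stop
    return WorkerOptions(bool(auto_start), bool(auto_stop), int(timeout))
```

Under this scheme, `--no-auto-start` would correspond to `cli_auto_start=False` and `--auto-stop` to `cli_auto_stop=True`.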
|
||||
|
||||
## Step 3: Access Web UIs
|
||||
|
||||
### Temporal Web UI
|
||||
- URL: http://localhost:8233
|
||||
- View workflows, executions, and task queues
|
||||
|
||||
### MinIO Console
|
||||
- URL: http://localhost:9001
|
||||
- Login: `fuzzforge` / `fuzzforge123`
|
||||
- View uploaded targets and results
|
||||
|
||||
## Step 4: Test Workflow Execution
|
||||
|
||||
### Option A: Using Temporal CLI (tctl)
|
||||
|
||||
```bash
# Install tctl (if not already installed)
brew install tctl # macOS
# or download from https://github.com/temporalio/tctl/releases

# Execute test workflow (--address is a global flag, so it goes before the subcommand)
tctl --address localhost:7233 workflow run \
  --taskqueue rust-queue \
  --workflow_type RustTestWorkflow \
  --input '{"target_id": "test-123", "test_message": "Hello Temporal!"}'
```
|
||||
|
||||
### Option B: Using Python Client
|
||||
|
||||
Create `test_workflow.py`:
|
||||
|
||||
```python
import asyncio
from temporalio.client import Client


async def main():
    # Connect to Temporal
    client = await Client.connect("localhost:7233")

    # Start workflow
    result = await client.execute_workflow(
        "RustTestWorkflow",
        {"target_id": "test-123", "test_message": "Hello Temporal!"},
        id="test-workflow-1",
        task_queue="rust-queue"
    )

    print("Workflow result:", result)


if __name__ == "__main__":
    asyncio.run(main())
```
|
||||
|
||||
```bash
python test_workflow.py
```
|
||||
|
||||
### Option C: Upload Target and Run (Full Flow)
|
||||
|
||||
```python
# upload_and_run.py
import asyncio
import boto3
from pathlib import Path
from temporalio.client import Client


async def main():
    # 1. Upload target to MinIO
    s3 = boto3.client(
        's3',
        endpoint_url='http://localhost:9000',
        aws_access_key_id='fuzzforge',
        aws_secret_access_key='fuzzforge123',
        region_name='us-east-1'
    )

    # Create a test file
    test_file = Path('/tmp/test_target.txt')
    test_file.write_text('This is a test target file')

    # Upload to MinIO
    target_id = 'my-test-target-001'
    s3.upload_file(
        str(test_file),
        'targets',
        f'{target_id}/target'
    )
    print(f"✓ Uploaded target: {target_id}")

    # 2. Run workflow
    client = await Client.connect("localhost:7233")

    result = await client.execute_workflow(
        "RustTestWorkflow",
        {"target_id": target_id, "test_message": "Full flow test!"},
        id=f"workflow-{target_id}",
        task_queue="rust-queue"
    )

    print("✓ Workflow completed!")
    print("Results:", result)


if __name__ == "__main__":
    asyncio.run(main())
```
|
||||
|
||||
```bash
# Install dependencies
pip install temporalio boto3

# Run test
python upload_and_run.py
```
|
||||
|
||||
## Step 5: Monitor Execution
|
||||
|
||||
### View in Temporal UI
|
||||
|
||||
1. Open http://localhost:8233
2. Click on "Workflows"
3. Find your workflow by ID
4. Click to see:
   - Execution history
   - Activity results
   - Error stack traces (if any)
|
||||
|
||||
### View Logs
|
||||
|
||||
```bash
# Worker logs (shows activity execution)
docker logs -f fuzzforge-worker-rust

# Temporal server logs
docker logs -f fuzzforge-temporal
```
|
||||
|
||||
### Check MinIO Storage
|
||||
|
||||
1. Open http://localhost:9001
2. Login: `fuzzforge` / `fuzzforge123`
3. Browse buckets:
   - `targets/` - Uploaded target files
   - `results/` - Workflow results (if uploaded)
   - `cache/` - Worker cache (temporary)
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Services Not Starting
|
||||
|
||||
```bash
# Check logs for all services
docker-compose -f docker-compose.temporal.yaml logs

# Check specific service
docker-compose -f docker-compose.temporal.yaml logs temporal
docker-compose -f docker-compose.temporal.yaml logs minio
docker-compose -f docker-compose.temporal.yaml logs worker-rust
```
|
||||
|
||||
### Worker Not Discovering Workflows
|
||||
|
||||
**Issue**: Worker logs show "No workflows found for vertical: rust"
|
||||
|
||||
**Solution**:
|
||||
1. Check toolbox mount: `docker exec fuzzforge-worker-rust ls /app/toolbox/workflows`
|
||||
2. Verify metadata.yaml exists and has `vertical: rust`
|
||||
3. Check workflow.py has `@workflow.defn` decorator
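The three checks above can be scripted. The following is a rough sketch under stated assumptions, not the worker's real discovery code: `diagnose_workflows` is a hypothetical helper, it looks for a top-level `vertical:` line with a regular expression rather than a YAML parser (to stay within the standard library), and it treats the literal text `@workflow.defn` in `workflow.py` as evidence of the decorator.

```python
import re
from pathlib import Path
from typing import Dict, List

# Matches a top-level line such as: vertical: "rust"
VERTICAL_RE = re.compile(r"^vertical:\s*[\"']?([\w-]+)", re.MULTILINE)


def diagnose_workflows(root: Path, vertical: str) -> Dict[str, List[str]]:
    """Map each workflow directory under `root` to the problems found in it."""
    report: Dict[str, List[str]] = {}
    for wf_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        problems: List[str] = []
        meta = wf_dir / "metadata.yaml"
        code = wf_dir / "workflow.py"
        if not meta.exists():
            problems.append("metadata.yaml missing")
        else:
            match = VERTICAL_RE.search(meta.read_text())
            if match is None:
                problems.append("no top-level 'vertical' field")
            elif match.group(1) != vertical:
                problems.append(f"vertical is '{match.group(1)}', not '{vertical}'")
        if not code.exists():
            problems.append("workflow.py missing")
        elif "@workflow.defn" not in code.read_text():
            problems.append("no @workflow.defn decorator")
        report[wf_dir.name] = problems
    return report
```

Inside the worker container it could be called as `diagnose_workflows(Path("/app/toolbox/workflows"), "rust")`; an empty list for a workflow means none of the three problems were detected.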
|
||||
|
||||
### Cannot Connect to Temporal
|
||||
|
||||
**Issue**: `Failed to connect to Temporal`
|
||||
|
||||
**Solution**:
|
||||
```bash
# Wait for Temporal to be healthy
docker-compose -f docker-compose.temporal.yaml ps

# Check Temporal health manually
curl http://localhost:8233

# Restart Temporal if needed
docker-compose -f docker-compose.temporal.yaml restart temporal
```
|
||||
|
||||
### MinIO Connection Failed
|
||||
|
||||
**Issue**: `Failed to download target`
|
||||
|
||||
**Solution**:
|
||||
```bash
# Check MinIO is running
docker ps | grep minio

# Check buckets exist
docker exec fuzzforge-minio mc ls fuzzforge/

# Verify target was uploaded
docker exec fuzzforge-minio mc ls fuzzforge/targets/
```
|
||||
|
||||
### Workflow Hangs
|
||||
|
||||
**Issue**: Workflow starts but never completes
|
||||
|
||||
**Check**:
|
||||
1. Worker logs for errors: `docker logs fuzzforge-worker-rust`
|
||||
2. Activity timeouts in workflow code
|
||||
3. Target file actually exists in MinIO
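While investigating a hang, it can help to keep the client script itself from blocking indefinitely. The sketch below is a generic asyncio pattern rather than anything FuzzForge-specific: `run_with_deadline` is a hypothetical helper, and `slow_workflow` merely stands in for a call such as `client.execute_workflow(...)`. Note that giving up on the client side only stops waiting; it does not cancel the workflow on the Temporal server.

```python
import asyncio
from typing import Any, Awaitable, Optional


async def run_with_deadline(awaitable: Awaitable[Any], seconds: float) -> Optional[Any]:
    """Await `awaitable`; return None if it does not finish within `seconds`."""
    try:
        return await asyncio.wait_for(awaitable, timeout=seconds)
    except asyncio.TimeoutError:
        return None


async def _demo() -> None:
    async def slow_workflow() -> str:
        # Stand-in for a workflow call that never seems to complete.
        await asyncio.sleep(10)
        return "done"

    outcome = await run_with_deadline(slow_workflow(), seconds=0.1)
    print("timed out" if outcome is None else outcome)


asyncio.run(_demo())
```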
|
||||
|
||||
## Scaling
|
||||
|
||||
### Add More Workers
|
||||
|
||||
```bash
# Scale rust workers horizontally
docker-compose -f docker-compose.temporal.yaml up -d --scale worker-rust=3

# Verify all workers are running
docker ps | grep worker-rust
```
|
||||
|
||||
### Increase Concurrent Activities
|
||||
|
||||
Edit `docker-compose.temporal.yaml`:
|
||||
|
||||
```yaml
worker-rust:
  environment:
    MAX_CONCURRENT_ACTIVITIES: 10  # Increase from 5
```
|
||||
|
||||
```bash
# Apply changes
docker-compose -f docker-compose.temporal.yaml up -d worker-rust
```
|
||||
|
||||
## Cleanup
|
||||
|
||||
```bash
# Stop all services
docker-compose -f docker-compose.temporal.yaml down

# Remove volumes (WARNING: deletes all data)
docker-compose -f docker-compose.temporal.yaml down -v

# Remove everything including images
docker-compose -f docker-compose.temporal.yaml down -v --rmi all
```
|
||||
|
||||
## Next Steps
|
||||
|
||||
1. **Add More Workflows**: Create workflows in `backend/toolbox/workflows/`
|
||||
2. **Add More Verticals**: Create new worker types (android, web, etc.) - see `workers/README.md`
|
||||
3. **Integrate with Backend**: Update FastAPI backend to use Temporal client
|
||||
4. **Update CLI**: Modify `ff` CLI to work with Temporal workflows
|
||||
|
||||
## Useful Commands
|
||||
|
||||
```bash
# View all logs
docker-compose -f docker-compose.temporal.yaml logs -f

# View specific service logs
docker-compose -f docker-compose.temporal.yaml logs -f worker-rust

# Restart a service
docker-compose -f docker-compose.temporal.yaml restart worker-rust

# Check service status
docker-compose -f docker-compose.temporal.yaml ps

# Execute command in worker
docker exec -it fuzzforge-worker-rust bash

# View worker Python environment
docker exec fuzzforge-worker-rust pip list

# Check workflow discovery manually
docker exec fuzzforge-worker-rust python -c "
from pathlib import Path
import yaml
for w in Path('/app/toolbox/workflows').iterdir():
    if w.is_dir():
        meta = w / 'metadata.yaml'
        if meta.exists():
            print(f'{w.name}: {yaml.safe_load(meta.read_text()).get(\"vertical\")}')"
```
|
||||
|
||||
## Architecture Overview
|
||||
|
||||
```
┌─────────────┐     ┌──────────────┐     ┌──────────────┐
│  Temporal   │────▶│  Task Queue  │────▶│ Worker-Rust  │
│   Server    │     │  rust-queue  │     │ (Long-lived) │
└─────────────┘     └──────────────┘     └──────┬───────┘
       │                                        │
       │                                        │
       ▼                                        ▼
┌─────────────┐                          ┌──────────────┐
│  Postgres   │                          │    MinIO     │
│   (State)   │                          │  (Storage)   │
└─────────────┘                          └──────────────┘
                                                │
                                         ┌──────┴──────┐
                                         │             │
                                    ┌────▼────┐  ┌─────▼────┐
                                    │ Targets │  │ Results  │
                                    └─────────┘  └──────────┘
```
|
||||
|
||||
## Support
|
||||
|
||||
- **Documentation**: See `ARCHITECTURE.md` for detailed design
|
||||
- **Worker Guide**: See `workers/README.md` for adding verticals
|
||||
- **Issues**: Open GitHub issue with logs and steps to reproduce
|
||||
@@ -131,31 +131,38 @@ uv tool install --python python3.12 .
|
||||
|
||||
## ⚡ Quickstart
|
||||
|
||||
Run your first workflow :
|
||||
Run your first workflow with **Temporal orchestration** and **automatic file upload**:
|
||||
|
||||
```bash
|
||||
# 1. Clone the repo
|
||||
git clone https://github.com/fuzzinglabs/fuzzforge_ai.git
|
||||
cd fuzzforge_ai
|
||||
|
||||
# 2. Build & run with Docker
|
||||
# Set registry host for your OS (local registry is mandatory)
|
||||
# macOS/Windows (Docker Desktop):
|
||||
export REGISTRY_HOST=host.docker.internal
|
||||
# Linux (default):
|
||||
# export REGISTRY_HOST=localhost
|
||||
docker compose up -d
|
||||
# 2. Start FuzzForge with Temporal
|
||||
docker-compose -f docker-compose.temporal.yaml up -d
|
||||
```
|
||||
|
||||
> The first launch can take 5-10 minutes due to Docker image building - a good time for a coffee break ☕
|
||||
> The first launch can take 2-3 minutes for services to initialize ☕
|
||||
|
||||
```bash
|
||||
# 3. Run your first workflow
|
||||
cd test_projects/vulnerable_app/ # Go into the test directory
|
||||
fuzzforge init # Init a fuzzforge project
|
||||
ff workflow run security_assessment . # Start a workflow (you can also use ff command)
|
||||
# 3. Run your first workflow (files are automatically uploaded)
|
||||
cd test_projects/vulnerable_app/
|
||||
fuzzforge init # Initialize FuzzForge project
|
||||
ff workflow run security_assessment . # Start workflow - CLI uploads files automatically!
|
||||
|
||||
# The CLI will:
|
||||
# - Detect the local directory
|
||||
# - Create a compressed tarball
|
||||
# - Upload to backend (via MinIO)
|
||||
# - Start the workflow on vertical worker
|
||||
```
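The "create a compressed tarball" step listed above can be illustrated with the standard library. This is only a sketch of the general technique: `make_tarball` is a hypothetical helper, and the CLI's real packaging logic (exclusions, naming, size checks) is not shown in this diff.

```python
import tarfile
import tempfile
from pathlib import Path


def make_tarball(source_dir: Path, dest: Path) -> Path:
    """Pack `source_dir` into a gzip-compressed tarball with relative member paths."""
    with tarfile.open(dest, "w:gz") as tar:
        # arcname="." keeps absolute host paths out of the archive.
        tar.add(source_dir, arcname=".")
    return dest


# Demonstration on a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "project"
    project.mkdir()
    (project / "main.py").write_text("print('hello')\n")
    archive = make_tarball(project, Path(tmp) / "project.tar.gz")
    with tarfile.open(archive) as tar:
        members = sorted(tar.getnames())
    print(members)
```

Storing members relative to the project root means the worker can extract the archive into any cache directory without recreating the uploader's directory layout.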
|
||||
|
||||
**What's running:**
|
||||
- **Temporal**: Workflow orchestration (UI at http://localhost:8233)
|
||||
- **MinIO**: File storage for targets (Console at http://localhost:9001)
|
||||
- **Vertical Workers**: Pre-built workers with security toolchains
|
||||
- **Backend API**: FuzzForge REST API (http://localhost:8000)
|
||||
|
||||
### Manual Workflow Setup
|
||||
|
||||

|
||||
|
||||
@@ -78,7 +78,7 @@ def create_a2a_app():
|
||||
print("\033[0m") # Reset color
|
||||
|
||||
# Create A2A app
|
||||
print(f"🚀 Starting FuzzForge A2A Server")
|
||||
print("🚀 Starting FuzzForge A2A Server")
|
||||
print(f" Model: {fuzzforge.model}")
|
||||
if fuzzforge.cognee_url:
|
||||
print(f" Memory: Cognee at {fuzzforge.cognee_url}")
|
||||
@@ -86,7 +86,7 @@ def create_a2a_app():
|
||||
|
||||
app = create_custom_a2a_app(fuzzforge.adk_agent, port=port, executor=fuzzforge.executor)
|
||||
|
||||
print(f"\n✅ FuzzForge A2A Server ready!")
|
||||
print("\n✅ FuzzForge A2A Server ready!")
|
||||
print(f" Agent card: http://localhost:{port}/.well-known/agent-card.json")
|
||||
print(f" A2A endpoint: http://localhost:{port}/")
|
||||
print(f"\n📡 Other agents can register FuzzForge at: http://localhost:{port}")
|
||||
@@ -101,7 +101,7 @@ def main():
|
||||
app = create_a2a_app()
|
||||
port = int(os.getenv('FUZZFORGE_PORT', 10100))
|
||||
|
||||
print(f"\n🎯 Starting server with uvicorn...")
|
||||
print("\n🎯 Starting server with uvicorn...")
|
||||
uvicorn.run(app, host="127.0.0.1", port=port)
|
||||
|
||||
|
||||
|
||||
@@ -18,7 +18,6 @@ from typing import Optional, Union
|
||||
|
||||
from starlette.applications import Starlette
|
||||
from starlette.responses import Response, FileResponse
|
||||
from starlette.routing import Route
|
||||
|
||||
from google.adk.a2a.executor.a2a_agent_executor import A2aAgentExecutor
|
||||
from google.adk.a2a.utils.agent_card_builder import AgentCardBuilder
|
||||
|
||||
@@ -15,7 +15,7 @@ Defines what FuzzForge can do and how others can discover it
|
||||
|
||||
|
||||
from dataclasses import dataclass
|
||||
from typing import List, Optional, Dict, Any
|
||||
from typing import List, Dict, Any
|
||||
|
||||
@dataclass
|
||||
class AgentSkill:
|
||||
|
||||
@@ -12,7 +12,6 @@
|
||||
|
||||
|
||||
import asyncio
|
||||
import base64
|
||||
import time
|
||||
import uuid
|
||||
import json
|
||||
@@ -392,7 +391,7 @@ class FuzzForgeExecutor:
|
||||
user_email = f"project_{config.get_project_context()['project_id']}@fuzzforge.example"
|
||||
user = await get_user(user_email)
|
||||
cognee.set_user(user)
|
||||
except Exception as e:
|
||||
except Exception:
|
||||
pass # User context not critical
|
||||
|
||||
# Use cognee search directly for maximum flexibility
|
||||
@@ -583,7 +582,6 @@ class FuzzForgeExecutor:
|
||||
pattern: Glob pattern (e.g. '*.py', '**/*.js', '')
|
||||
"""
|
||||
try:
|
||||
from pathlib import Path
|
||||
|
||||
# Get project root from config
|
||||
config = ProjectConfigManager()
|
||||
@@ -648,7 +646,6 @@ class FuzzForgeExecutor:
|
||||
max_lines: Maximum lines to read (0 for all, default 200 for large files)
|
||||
"""
|
||||
try:
|
||||
from pathlib import Path
|
||||
|
||||
# Get project root from config
|
||||
config = ProjectConfigManager()
|
||||
@@ -711,7 +708,6 @@ class FuzzForgeExecutor:
|
||||
"""
|
||||
try:
|
||||
import re
|
||||
from pathlib import Path
|
||||
|
||||
# Get project root from config
|
||||
config = ProjectConfigManager()
|
||||
@@ -757,7 +753,7 @@ class FuzzForgeExecutor:
|
||||
result = f"Found '{search_pattern}' in {len(matches)} locations (searched {files_searched} files):\n"
|
||||
result += "\n".join(matches[:50])
|
||||
if len(matches) >= 50:
|
||||
result += f"\n... (showing first 50 matches)"
|
||||
result += "\n... (showing first 50 matches)"
|
||||
return result
|
||||
else:
|
||||
return f"No matches found for '{search_pattern}' in {files_searched} files matching '{file_pattern}'"
|
||||
@@ -1088,7 +1084,7 @@ class FuzzForgeExecutor:
|
||||
|
||||
def _build_instruction(self) -> str:
|
||||
"""Build the agent's instruction prompt"""
|
||||
instruction = f"""You are FuzzForge, an intelligent A2A orchestrator with dual memory systems.
|
||||
instruction = """You are FuzzForge, an intelligent A2A orchestrator with dual memory systems.
|
||||
|
||||
## Your Core Responsibilities:
|
||||
|
||||
|
||||
@@ -26,7 +26,6 @@ import random
|
||||
from datetime import datetime
|
||||
from contextlib import contextmanager
|
||||
from pathlib import Path
|
||||
from typing import Any
|
||||
|
||||
from dotenv import load_dotenv
|
||||
|
||||
@@ -90,18 +89,12 @@ except ImportError:
|
||||
from rich.console import Console
|
||||
from rich.table import Table
|
||||
from rich.panel import Panel
|
||||
from rich.prompt import Prompt
|
||||
from rich import box
|
||||
|
||||
from google.adk.events.event import Event
|
||||
from google.adk.events.event_actions import EventActions
|
||||
from google.genai import types as gen_types
|
||||
|
||||
from .agent import FuzzForgeAgent
|
||||
from .agent_card import get_fuzzforge_agent_card
|
||||
from .config_manager import ConfigManager
|
||||
from .config_bridge import ProjectConfigManager
|
||||
from .remote_agent import RemoteAgentConnection
|
||||
|
||||
console = Console()
|
||||
|
||||
@@ -243,7 +236,7 @@ class FuzzForgeCLI:
|
||||
)
|
||||
)
|
||||
if self.agent.executor.agentops_trace:
|
||||
console.print(f"Tracking: [medium_purple1]AgentOps active[/medium_purple1]")
|
||||
console.print("Tracking: [medium_purple1]AgentOps active[/medium_purple1]")
|
||||
|
||||
# Show skills
|
||||
console.print("\nSkills:")
|
||||
@@ -320,7 +313,7 @@ class FuzzForgeCLI:
|
||||
url=args.strip(),
|
||||
description=description
|
||||
)
|
||||
console.print(f" [dim]Saved to config for auto-registration[/dim]")
|
||||
console.print(" [dim]Saved to config for auto-registration[/dim]")
|
||||
else:
|
||||
console.print(f"[red]Failed: {result['error']}[/red]")
|
||||
|
||||
@@ -346,9 +339,9 @@ class FuzzForgeCLI:
|
||||
# Remove from config
|
||||
if self.config_manager.remove_registered_agent(name=agent_to_remove['name'], url=agent_to_remove['url']):
|
||||
console.print(f"✅ Unregistered: [bold]{agent_to_remove['name']}[/bold]")
|
||||
console.print(f" [dim]Removed from config (won't auto-register next time)[/dim]")
|
||||
console.print(" [dim]Removed from config (won't auto-register next time)[/dim]")
|
||||
else:
|
||||
console.print(f"[yellow]Agent unregistered from session but not found in config[/yellow]")
|
||||
console.print("[yellow]Agent unregistered from session but not found in config[/yellow]")
|
||||
|
||||
async def cmd_list(self, args: str = "") -> None:
|
||||
"""List registered agents"""
|
||||
@@ -699,7 +692,7 @@ class FuzzForgeCLI:
|
||||
)
|
||||
|
||||
console.print(table)
|
||||
console.print(f"\n[dim]Use /artifacts <id> to view artifact content[/dim]")
|
||||
console.print("\n[dim]Use /artifacts <id> to view artifact content[/dim]")
|
||||
|
||||
async def cmd_tasks(self, args: str = "") -> None:
|
||||
"""List tasks or show details for a specific task."""
|
||||
|
||||
@@ -16,9 +16,7 @@ Can be reused by external agents and other components
|
||||
|
||||
|
||||
import os
|
||||
import asyncio
|
||||
import json
|
||||
from typing import Dict, List, Any, Optional, Union
|
||||
from typing import Dict, Any, Optional
|
||||
from pathlib import Path
|
||||
|
||||
|
||||
|
||||
@@ -15,11 +15,9 @@ Provides integrated Cognee functionality for codebase analysis and knowledge gra
|
||||
|
||||
|
||||
import os
|
||||
import asyncio
|
||||
import logging
|
||||
from pathlib import Path
|
||||
from typing import Dict, List, Any, Optional
|
||||
from datetime import datetime
|
||||
from typing import Dict, List, Any
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
@@ -13,7 +13,7 @@
|
||||
|
||||
try:
|
||||
from fuzzforge_cli.config import ProjectConfigManager as _ProjectConfigManager
|
||||
except ImportError as exc: # pragma: no cover - used when CLI not available
|
||||
except ImportError: # pragma: no cover - used when CLI not available
|
||||
class _ProjectConfigManager: # type: ignore[no-redef]
|
||||
"""Fallback implementation that raises a helpful error."""
|
||||
|
||||
|
||||
@@ -16,15 +16,12 @@ Separate from Cognee which will be used for RAG/codebase analysis
|
||||
|
||||
|
||||
import os
|
||||
import json
|
||||
from typing import Dict, List, Any, Optional
|
||||
from datetime import datetime
|
||||
from typing import Dict, Any
|
||||
import logging
|
||||
|
||||
# ADK Memory imports
|
||||
from google.adk.memory import InMemoryMemoryService, BaseMemoryService
|
||||
from google.adk.memory.base_memory_service import SearchMemoryResponse
|
||||
from google.adk.memory.memory_entry import MemoryEntry
|
||||
|
||||
# Optional VertexAI Memory Bank
|
||||
try:
|
||||
|
||||
+5
-9
@@ -17,25 +17,21 @@ RUN apt-get update && apt-get install -y \
|
||||
|
||||
# Docker client configuration removed - localhost:5001 doesn't require insecure registry config
|
||||
|
||||
# Install uv for faster package management
|
||||
RUN pip install uv
|
||||
|
||||
# Copy project files
|
||||
COPY pyproject.toml ./
|
||||
COPY uv.lock ./
|
||||
|
||||
# Install dependencies
|
||||
RUN uv sync --no-dev
|
||||
# Install dependencies with pip
|
||||
RUN pip install --no-cache-dir -e .
|
||||
|
||||
# Copy source code
|
||||
COPY . .
|
||||
|
||||
# Expose port
|
||||
EXPOSE 8000
|
||||
# Expose ports (API on 8000, MCP on 8010)
|
||||
EXPOSE 8000 8010
|
||||
|
||||
# Health check
|
||||
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
|
||||
CMD curl -f http://localhost:8000/health || exit 1
|
||||
|
||||
# Start the application
|
||||
CMD ["uv", "run", "uvicorn", "src.main:app", "--host", "0.0.0.0", "--port", "8000"]
|
||||
CMD ["uvicorn", "src.main:app", "--host", "0.0.0.0", "--port", "8000"]
|
||||
+101
-27
@@ -1,6 +1,6 @@
|
||||
# FuzzForge Backend
|
||||
|
||||
A stateless API server for security testing workflow orchestration using Prefect. This system dynamically discovers workflows, executes them in isolated Docker containers with volume mounting, and returns findings in SARIF format.
|
||||
A stateless API server for security testing workflow orchestration using Temporal. This system dynamically discovers workflows, executes them in isolated worker environments, and returns findings in SARIF format.
|
||||
|
||||
## Architecture Overview
|
||||
|
||||
@@ -8,17 +8,17 @@ A stateless API server for security testing workflow orchestration using Prefect
|
||||
|
||||
1. **Workflow Discovery System**: Automatically discovers workflows at startup
|
||||
2. **Module System**: Reusable components (scanner, analyzer, reporter) with a common interface
|
||||
3. **Prefect Integration**: Handles container orchestration, workflow execution, and monitoring
|
||||
4. **Volume Mounting**: Secure file access with configurable permissions (ro/rw)
|
||||
3. **Temporal Integration**: Handles workflow orchestration, execution, and monitoring with vertical workers
|
||||
4. **File Upload & Storage**: HTTP multipart upload to MinIO for target files
|
||||
5. **SARIF Output**: Standardized security findings format
|
||||
|
||||
### Key Features
|
||||
|
||||
- **Stateless**: No persistent data, fully scalable
|
||||
- **Generic**: No hardcoded workflows, automatic discovery
|
||||
- **Isolated**: Each workflow runs in its own Docker container
|
||||
- **Isolated**: Each workflow runs in specialized vertical workers
|
||||
- **Extensible**: Easy to add new workflows and modules
|
||||
- **Secure**: Read-only volume mounts by default, path validation
|
||||
- **Secure**: File upload with MinIO storage, automatic cleanup via lifecycle policies
|
||||
- **Observable**: Comprehensive logging and status tracking
|
||||
|
||||
## Quick Start
|
||||
@@ -32,19 +32,17 @@ A stateless API server for security testing workflow orchestration using Prefect
|
||||
From the project root, start all services:
|
||||
|
||||
```bash
|
||||
docker-compose up -d
|
||||
docker-compose -f docker-compose.temporal.yaml up -d
|
||||
```
|
||||
|
||||
This will start:
|
||||
- Prefect server (API at http://localhost:4200/api)
|
||||
- PostgreSQL database
|
||||
- Redis cache
|
||||
- Docker registry (port 5001)
|
||||
- Prefect worker (for running workflows)
|
||||
- Temporal server (Web UI at http://localhost:8233, gRPC at :7233)
|
||||
- MinIO (S3 storage at http://localhost:9000, Console at http://localhost:9001)
|
||||
- PostgreSQL database (for Temporal state)
|
||||
- Vertical workers (worker-rust, worker-android, worker-web, etc.)
|
||||
- FuzzForge backend API (port 8000)
|
||||
- FuzzForge MCP server (port 8010)
|
||||
|
||||
**Note**: The Prefect UI at http://localhost:4200 is not currently accessible from the host due to the API being configured for inter-container communication. Use the REST API or MCP interface instead.
|
||||
**Note**: MinIO console login: `fuzzforge` / `fuzzforge123`
|
||||
|
||||
## API Endpoints
|
||||
|
||||
@@ -54,7 +52,8 @@ This will start:
|
||||
- `GET /workflows/{name}/metadata` - Get workflow metadata and parameters
|
||||
- `GET /workflows/{name}/parameters` - Get workflow parameter schema
|
||||
- `GET /workflows/metadata/schema` - Get metadata.yaml schema
|
||||
- `POST /workflows/{name}/submit` - Submit a workflow for execution
|
||||
- `POST /workflows/{name}/submit` - Submit a workflow for execution (path-based, legacy)
|
||||
- `POST /workflows/{name}/upload-and-submit` - **Upload local files and submit workflow** (recommended)
|
||||
|
||||
### Runs
|
||||
|
||||
@@ -68,12 +67,13 @@ Each workflow must have:
|
||||
|
||||
```
|
||||
toolbox/workflows/{workflow_name}/
|
||||
workflow.py # Prefect flow definition
|
||||
metadata.yaml # Mandatory metadata (parameters, version, etc.)
|
||||
Dockerfile # Optional custom container definition
|
||||
requirements.txt # Optional Python dependencies
|
||||
workflow.py # Temporal workflow definition
|
||||
metadata.yaml # Mandatory metadata (parameters, version, vertical, etc.)
|
||||
requirements.txt # Optional Python dependencies (installed in vertical worker)
|
||||
```
|
||||
|
||||
**Note**: With Temporal architecture, workflows run in pre-built vertical workers (e.g., `worker-rust`, `worker-android`), not individual Docker containers. The workflow code is mounted as a volume and discovered at runtime.
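Routing by `vertical` can be pictured as a simple name mapping. The sketch below assumes a `<vertical>-queue` naming convention, which is inferred from the `rust` vertical listening on `rust-queue` in the worker logs; whether every vertical follows it is an assumption, and `task_queue_for` and `route` are hypothetical names.

```python
from typing import Any, Dict


def task_queue_for(vertical: str) -> str:
    """Map a workflow's `vertical` to a task queue name (assumed `<vertical>-queue`)."""
    if not vertical:
        raise ValueError("metadata.yaml must declare a non-empty 'vertical'")
    return f"{vertical}-queue"


def route(metadata: Dict[str, Any]) -> str:
    """Pick the task queue for a parsed metadata.yaml dict."""
    return task_queue_for(metadata.get("vertical", ""))


print(route({"name": "security_assessment", "vertical": "rust"}))
```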
|
||||
|
||||
### Example metadata.yaml
|
||||
|
||||
```yaml
|
||||
@@ -82,6 +82,7 @@ version: "1.0.0"
|
||||
description: "Comprehensive security analysis workflow"
|
||||
author: "FuzzForge Team"
|
||||
category: "comprehensive"
|
||||
vertical: "rust" # Routes to worker-rust
|
||||
tags:
|
||||
- "security"
|
||||
- "analysis"
|
||||
@@ -169,6 +170,57 @@ curl -X POST "http://localhost:8000/workflows/security_assessment/submit" \
|
||||
|
||||
Resource precedence: User limits > Workflow requirements > System defaults
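The precedence rule can be expressed as a layered merge. This is an illustrative sketch only: `resolve_resources` and the keys `cpu_limit` and `memory_limit` are hypothetical, and the backend's real field names may differ.

```python
from typing import Dict, Optional

Limits = Dict[str, Optional[str]]


def resolve_resources(system: Limits, workflow: Limits, user: Limits) -> Limits:
    """Merge limits so user values beat workflow values, which beat system defaults."""
    resolved: Limits = dict(system)
    for layer in (workflow, user):  # later layers override earlier ones
        resolved.update({k: v for k, v in layer.items() if v is not None})
    return resolved


resolved = resolve_resources(
    system={"cpu_limit": "1", "memory_limit": "512Mi"},
    workflow={"memory_limit": "1Gi"},
    user={"cpu_limit": "2", "memory_limit": None},  # None means "not set by the user"
)
print(resolved)
```

Here the user's CPU limit wins, while the memory limit falls back to the workflow's requirement because the user left it unset.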
|
||||
|
||||
## File Upload and Target Access
|
||||
|
||||
### Upload Endpoint
|
||||
|
||||
The backend provides an upload endpoint for submitting workflows with local files:
|
||||
|
||||
```
|
||||
POST /workflows/{workflow_name}/upload-and-submit
|
||||
Content-Type: multipart/form-data
|
||||
|
||||
Parameters:
|
||||
file: File upload (supports .tar.gz for directories)
|
||||
parameters: JSON string of workflow parameters (optional)
|
||||
volume_mode: "ro" or "rw" (default: "ro")
|
||||
timeout: Execution timeout in seconds (optional)
|
||||
```
|
||||
|
||||
Example using curl:
|
||||
|
||||
```bash
|
||||
# Upload a directory (create tarball first)
|
||||
tar -czf project.tar.gz /path/to/project
|
||||
curl -X POST "http://localhost:8000/workflows/security_assessment/upload-and-submit" \
|
||||
-F "file=@project.tar.gz" \
|
||||
-F "parameters={\"check_secrets\":true}" \
|
||||
-F "volume_mode=ro"
|
||||
|
||||
# Upload a single file
|
||||
curl -X POST "http://localhost:8000/workflows/security_assessment/upload-and-submit" \
|
||||
-F "file=@binary.elf" \
|
||||
-F "volume_mode=ro"
|
||||
```
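The same request can be made from Python. The sketch below assumes the third-party `requests` package is installed; `build_upload_form` and `upload_and_submit` are hypothetical helpers, and only the field names (`file`, `parameters`, `volume_mode`, `timeout`) come from the endpoint description above. The key detail is that `parameters` travels as a JSON string, not as separate form fields.

```python
import json
from typing import Any, Dict, Optional


def build_upload_form(parameters: Optional[Dict[str, Any]] = None,
                      volume_mode: str = "ro",
                      timeout: Optional[int] = None) -> Dict[str, str]:
    """Build the non-file multipart fields for upload-and-submit."""
    if volume_mode not in ("ro", "rw"):
        raise ValueError("volume_mode must be 'ro' or 'rw'")
    form = {"volume_mode": volume_mode}
    if parameters is not None:
        form["parameters"] = json.dumps(parameters)  # sent as a JSON string
    if timeout is not None:
        form["timeout"] = str(timeout)
    return form


def upload_and_submit(base_url: str, workflow: str, archive_path: str, **kwargs: Any):
    """Send the multipart request; `requests` is imported lazily because it is third-party."""
    import requests

    url = f"{base_url}/workflows/{workflow}/upload-and-submit"
    with open(archive_path, "rb") as fh:
        return requests.post(url, files={"file": fh}, data=build_upload_form(**kwargs))


print(build_upload_form({"check_secrets": True}))
```

A call such as `upload_and_submit("http://localhost:8000", "security_assessment", "project.tar.gz", parameters={"check_secrets": True})` would mirror the first curl example.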
|
||||
|
||||
### Storage Flow
|
||||
|
||||
1. **CLI/API uploads file** via HTTP multipart
|
||||
2. **Backend receives file** and streams to temporary location (max 10GB)
|
||||
3. **Backend uploads to MinIO** with generated `target_id`
|
||||
4. **Workflow is submitted** to Temporal with `target_id`
|
||||
5. **Worker downloads target** from MinIO to local cache
|
||||
6. **Workflow processes target** from cache
|
||||
7. **MinIO lifecycle policy** deletes files after 7 days
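Step 7 relies on an S3-style lifecycle rule. How FuzzForge actually installs that rule is not shown in this diff; the sketch below only illustrates what a 7-day expiry looks like when applied through a boto3 S3 client. `expiry_lifecycle` and `apply_lifecycle` are hypothetical helpers, and the rule ID is made up.

```python
from typing import Any, Dict


def expiry_lifecycle(days: int = 7, prefix: str = "") -> Dict[str, Any]:
    """Build an S3 lifecycle configuration that expires objects after `days` days."""
    if days < 1:
        raise ValueError("days must be at least 1")
    return {
        "Rules": [
            {
                "ID": f"expire-after-{days}d",  # illustrative rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},   # empty prefix matches every object
                "Expiration": {"Days": days},
            }
        ]
    }


def apply_lifecycle(s3_client: Any, bucket: str, days: int = 7) -> None:
    """Apply the rule through a boto3 S3 client (client creation not shown)."""
    s3_client.put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=expiry_lifecycle(days)
    )


print(expiry_lifecycle())
```

With a client configured as in the MinIO upload example (endpoint `http://localhost:9000`), `apply_lifecycle(s3, "targets")` would request the expiry on the `targets` bucket.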
|
||||
|
||||
### Advantages
|
||||
|
||||
- **No host filesystem access required** - workers can run anywhere
|
||||
- **Automatic cleanup** - lifecycle policies prevent disk exhaustion
|
||||
- **Caching** - repeated workflows reuse cached targets
|
||||
- **Multi-host ready** - targets accessible from any worker
|
||||
- **Secure** - isolated storage, no arbitrary host path access
|
||||
|
||||
## Module Development
|
||||
|
||||
Modules implement the `BaseModule` interface:
|
||||
@@ -198,7 +250,21 @@ class MyModule(BaseModule):
|
||||
|
||||
## Submitting a Workflow
|
||||
|
||||
### With File Upload (Recommended)
|
||||
|
||||
```bash
|
||||
# Automatic tarball and upload
|
||||
tar -czf project.tar.gz /home/user/project
|
||||
curl -X POST "http://localhost:8000/workflows/security_assessment/upload-and-submit" \
|
||||
-F "file=@project.tar.gz" \
|
||||
-F "parameters={\"scanner_config\":{\"patterns\":[\"*.py\"]},\"analyzer_config\":{\"check_secrets\":true}}" \
|
||||
-F "volume_mode=ro"
|
||||
```
|
||||
|
||||
### Legacy Path-Based Submission
|
||||
|
||||
```bash
|
||||
# Only works if backend and target are on same machine
|
||||
curl -X POST "http://localhost:8000/workflows/security_assessment/submit" \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{
|
||||
@@ -235,23 +301,31 @@ Returns SARIF-formatted findings:
|
||||
|
||||
## Security Considerations
|
||||
|
||||
1. **Volume Mounting**: Only allowed directories can be mounted
|
||||
2. **Read-Only Default**: Volumes mounted as read-only unless explicitly set
|
||||
3. **Container Isolation**: Each workflow runs in an isolated container
|
||||
4. **Resource Limits**: Can set CPU/memory limits via Prefect
|
||||
5. **Network Isolation**: Containers use bridge networking
|
||||
1. **File Upload Security**: Files uploaded to MinIO with isolated storage
|
||||
2. **Read-Only Default**: Target files accessed as read-only unless explicitly set
|
||||
3. **Worker Isolation**: Each workflow runs in isolated vertical workers
|
||||
4. **Resource Limits**: Can set CPU/memory limits per worker
|
||||
5. **Automatic Cleanup**: MinIO lifecycle policies delete old files after 7 days
|
||||
|
||||
## Development
|
||||
|
||||
### Adding a New Workflow
|
||||
|
||||
1. Create directory: `toolbox/workflows/my_workflow/`
|
||||
2. Add `workflow.py` with a Prefect flow
|
||||
3. Add mandatory `metadata.yaml`
|
||||
4. Restart backend: `docker-compose restart fuzzforge-backend`
|
||||
2. Add `workflow.py` with a Temporal workflow (using `@workflow.defn`)
|
||||
3. Add mandatory `metadata.yaml` with `vertical` field
|
||||
4. Restart the appropriate worker: `docker-compose -f docker-compose.temporal.yaml restart worker-rust`
|
||||
5. Worker will automatically discover and register the new workflow
|
||||
|
||||
### Adding a New Module
|
||||
|
||||
1. Create module in `toolbox/modules/{category}/`
|
||||
2. Implement `BaseModule` interface
|
||||
3. Use in workflows via import
|
||||
3. Use in workflows via import
|
||||
|
||||
### Adding a New Vertical Worker
|
||||
|
||||
1. Create worker directory: `workers/{vertical}/`
|
||||
2. Create `Dockerfile` with required tools
|
||||
3. Add worker to `docker-compose.temporal.yaml`
|
||||
4. Worker will automatically discover workflows with matching `vertical` in metadata
|
||||
@@ -0,0 +1,184 @@
|
||||
# FuzzForge Benchmark Suite
|
||||
|
||||
Performance benchmarking infrastructure organized by module category.
|
||||
|
||||
## Directory Structure
|
||||
|
||||
```
|
||||
benchmarks/
|
||||
├── conftest.py # Benchmark fixtures
|
||||
├── category_configs.py # Category-specific thresholds
|
||||
├── by_category/ # Benchmarks organized by category
|
||||
│ ├── fuzzer/
|
||||
│ │ ├── bench_cargo_fuzz.py
|
||||
│ │ └── bench_atheris.py
|
||||
│ ├── scanner/
|
||||
│ │ └── bench_file_scanner.py
|
||||
│ ├── secret_detection/
|
||||
│ │ ├── bench_gitleaks.py
|
||||
│ │ └── bench_trufflehog.py
|
||||
│ └── analyzer/
|
||||
│ └── bench_security_analyzer.py
|
||||
├── fixtures/ # Benchmark test data
|
||||
│ ├── small/ # ~1K LOC
|
||||
│ ├── medium/ # ~10K LOC
|
||||
│ └── large/ # ~100K LOC
|
||||
└── results/ # Benchmark results (JSON)
|
||||
```
|
||||
|
||||
## Module Categories
|
||||
|
||||
### Fuzzer
|
||||
**Expected Metrics**: execs/sec, coverage_rate, time_to_crash, memory_usage
|
||||
|
||||
**Performance Thresholds**:
|
||||
- Min 1000 execs/sec
|
||||
- Max 10s for small projects
|
||||
- Max 2GB memory
|
||||
|
||||
### Scanner
|
||||
**Expected Metrics**: files/sec, LOC/sec, findings_count
|
||||
|
||||
**Performance Thresholds**:
|
||||
- Min 100 files/sec
|
||||
- Min 10K LOC/sec
|
||||
- Max 512MB memory
|
||||
|
||||
### Secret Detection
|
||||
**Expected Metrics**: patterns/sec, precision, recall, F1
|
||||
|
||||
**Performance Thresholds**:
|
||||
- Min 90% precision
|
||||
- Min 95% recall
|
||||
- Max 5 false positives per 100 secrets
|
||||
|
||||
### Analyzer
|
||||
**Expected Metrics**: analysis_depth, files/sec, accuracy
|
||||
|
||||
**Performance Thresholds**:
|
||||
- Min 10 files/sec (deep analysis)
|
||||
- Min 85% accuracy
|
||||
- Max 2GB memory
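
As a rough illustration of how thresholds like these might be enforced, the sketch below checks measured metrics against a threshold table. The `min_`/`max_` prefix convention mirrors the key names used in `category_configs.py`; the `check_thresholds` helper and the sample measurements are hypothetical and not part of the suite.

```python
from typing import Dict, List


def check_thresholds(measured: Dict[str, float], thresholds: Dict[str, float]) -> List[str]:
    """Return a readable violation for every threshold that is not met.

    Hypothetical helper: a key such as "min_execs_per_sec" is compared
    against measured["execs_per_sec"]. Metrics that were not measured are skipped.
    """
    violations = []
    for key, limit in thresholds.items():
        bound, _, metric = key.partition("_")
        if bound not in ("min", "max") or metric not in measured:
            continue
        value = measured[metric]
        if bound == "min" and value < limit:
            violations.append(f"{metric}={value} is below minimum {limit}")
        elif bound == "max" and value > limit:
            violations.append(f"{metric}={value} exceeds maximum {limit}")
    return violations


# Fuzzer limits taken from the list above; the measurements are made-up sample data.
fuzzer_thresholds = {"min_execs_per_sec": 1000, "max_memory_mb": 2048}
sample = {"execs_per_sec": 850.0, "memory_mb": 512.0}
violations = check_thresholds(sample, fuzzer_thresholds)
```

Returning a list of violations rather than raising on the first one lets a benchmark report every missed limit in a single assertion message.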
|
||||
|
||||
## Running Benchmarks
|
||||
|
||||
### All Benchmarks
|
||||
```bash
cd backend
pytest benchmarks/ --benchmark-only -v
```
|
||||
|
||||
### Specific Category
|
||||
```bash
pytest benchmarks/by_category/fuzzer/ --benchmark-only -v
```
|
||||
|
||||
### With Comparison
|
||||
```bash
# Run and save baseline
pytest benchmarks/ --benchmark-only --benchmark-save=baseline

# Compare against baseline
pytest benchmarks/ --benchmark-only --benchmark-compare=baseline
```
|
||||
|
||||
### Generate Histogram
|
||||
```bash
pytest benchmarks/ --benchmark-only --benchmark-histogram=histogram
```
|
||||
|
||||
## Benchmark Results
|
||||
|
||||
Results are saved as JSON and include:
|
||||
- Mean execution time
|
||||
- Standard deviation
|
||||
- Min/Max values
|
||||
- Iterations per second
|
||||
- Memory usage
|
||||
|
||||
Example output:
|
||||
```
------------------------ benchmark: fuzzer --------------------------
Name                           Mean      StdDev    Ops/Sec
bench_cargo_fuzz[discovery]    0.0012s   0.0001s   833.33
bench_cargo_fuzz[execution]    0.1250s   0.0050s   8.00
bench_cargo_fuzz[memory]       0.0100s   0.0005s   100.00
---------------------------------------------------------------------
```
|
||||
|
||||
## CI/CD Integration
|
||||
|
||||
Benchmarks run:
|
||||
- **Nightly**: Full benchmark suite, track trends
|
||||
- **On PR**: When benchmarks/ or modules/ changed
|
||||
- **Manual**: Via workflow_dispatch
|
||||
|
||||
### Regression Detection
|
||||
|
||||
Benchmarks automatically fail if:
|
||||
- Performance degrades >10%
|
||||
- Memory usage exceeds thresholds
|
||||
- Throughput drops below minimum
|
||||
|
||||
See `.github/workflows/benchmark.yml` for configuration.
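
The ">10%" rule above can be sketched as a comparison of mean execution times between a baseline and the current run. This is a minimal illustration, not the logic in `benchmark.yml`; the `find_regressions` helper, its input shape (benchmark name to mean seconds), and the sample numbers are assumptions.

```python
from typing import Dict


def find_regressions(
    baseline: Dict[str, float],
    current: Dict[str, float],
    max_slowdown: float = 0.10,
) -> Dict[str, float]:
    """Return {benchmark name: relative slowdown} for runs slower than allowed.

    Hypothetical helper: both inputs map a benchmark name to its mean time in seconds.
    """
    regressions = {}
    for name, base_mean in baseline.items():
        if name not in current or base_mean <= 0:
            continue  # nothing to compare against
        slowdown = (current[name] - base_mean) / base_mean
        if slowdown > max_slowdown:
            regressions[name] = round(slowdown, 4)
    return regressions


# Made-up sample data: "execution" went from 0.125s to 0.150s (20% slower).
baseline = {"discovery": 0.0012, "execution": 0.1250}
current = {"discovery": 0.0012, "execution": 0.1500}
regressions = find_regressions(baseline, current)
```

pytest-benchmark also provides a `--benchmark-compare-fail` option (for example `mean:10%`) that may cover the same check; whether the CI workflow relies on it is not stated here.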
|
||||
|
||||
## Adding New Benchmarks
|
||||
|
||||
### 1. Create benchmark file in category directory
|
||||
```python
# benchmarks/by_category/fuzzer/bench_new_fuzzer.py

import asyncio

import pytest
from benchmarks.category_configs import ModuleCategory, get_threshold


@pytest.mark.benchmark(group="fuzzer")
def test_execution_performance(benchmark, new_fuzzer, benchmark_config, test_workspace):
    """Benchmark execution speed"""
    # new_fuzzer, benchmark_config and test_workspace are fixtures you define for your module

    def run():
        # execute() is a coroutine in the existing fuzzer benchmarks, so run it to completion
        return asyncio.run(new_fuzzer.execute(benchmark_config, test_workspace))

    result = benchmark(run)

    # Validate against threshold
    threshold = get_threshold(ModuleCategory.FUZZER, "max_execution_time_small")
    assert result.execution_time < threshold
```
|
||||
|
||||
### 2. Update category_configs.py if needed
|
||||
Add new thresholds or metrics for your module.
|
||||
|
||||
### 3. Run locally
|
||||
```bash
pytest benchmarks/by_category/fuzzer/bench_new_fuzzer.py --benchmark-only -v
```
|
||||
|
||||
## Best Practices
|
||||
|
||||
1. **Use mocking** for external dependencies (network, disk I/O)
|
||||
2. **Fixed iterations** for consistent benchmarking
|
||||
3. **Warm-up runs** for JIT-compiled code
|
||||
4. **Category-specific metrics** aligned with module purpose
|
||||
5. **Realistic fixtures** that represent actual use cases
|
||||
6. **Memory profiling** using tracemalloc
|
||||
7. **Compare apples to apples** within the same category
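
Several of these practices (warm-up runs, fixed iterations, memory profiling with `tracemalloc`) can be combined in one small harness. The sketch below uses only the standard library and is independent of pytest-benchmark; the `measure` helper, its defaults, and the sorting workload are illustrative assumptions.

```python
import time
import tracemalloc
from typing import Callable, Dict


def measure(func: Callable[[], object], *, warmup: int = 3, iterations: int = 20) -> Dict[str, float]:
    """Time `func` over a fixed number of iterations and record its peak memory."""
    for _ in range(warmup):  # warm-up runs are executed but not measured
        func()

    timings = []
    for _ in range(iterations):  # fixed iteration count keeps runs comparable
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)

    # Memory is traced in a separate pass so tracing overhead does not skew the timings.
    tracemalloc.start()
    try:
        func()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()

    return {
        "mean_s": sum(timings) / len(timings),
        "min_s": min(timings),
        "max_s": max(timings),
        "peak_mb": peak / 1024 / 1024,
    }


# Hypothetical workload standing in for a module call.
stats = measure(lambda: sorted(range(10_000), reverse=True))
```

Note that `tracemalloc` only sees allocations made through Python's allocator, so memory used by external tools or subprocesses would need a different measurement.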
|
||||
|
||||
## Interpreting Results
|
||||
|
||||
### Good Performance
|
||||
- ✅ Execution time below threshold
|
||||
- ✅ Memory usage within limits
|
||||
- ✅ Throughput meets minimum
|
||||
- ✅ <5% variance across runs
|
||||
|
||||
### Performance Issues
|
||||
- ⚠️ Execution time 10-20% over threshold
|
||||
- ❌ Execution time >20% over threshold
|
||||
- ❌ Memory leaks (increasing over iterations)
|
||||
- ❌ High variance (>10%) indicates instability
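
One way to read the criteria above is as a three-way classification. The sketch below is an interpretation, not code from the suite: the lists leave two ranges unspecified (0-10% over threshold, and 5-10% variance), and this hypothetical `classify_run` helper treats both as warnings.

```python
def classify_run(execution_time: float, threshold: float, variance: float) -> str:
    """Classify a benchmark run as "ok", "warning", or "fail".

    `variance` is the relative spread across runs (0.05 means 5%).
    Assumption: any overrun up to 20%, or variance between 5% and 10%, is a warning.
    """
    overrun = (execution_time - threshold) / threshold
    if overrun > 0.20 or variance > 0.10:
        return "fail"
    if overrun > 0 or variance >= 0.05:
        return "warning"
    return "ok"


# Made-up sample: 11.5s against a 10s threshold is 15% over, with stable timings.
status = classify_run(execution_time=11.5, threshold=10.0, variance=0.02)
```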
|
||||
|
||||
## Tracking Performance Over Time
|
||||
|
||||
Benchmark results are stored as artifacts with:
|
||||
- Commit SHA
|
||||
- Timestamp
|
||||
- Environment details (Python version, OS)
|
||||
- Full metrics
|
||||
|
||||
Use these to track long-term performance trends and detect gradual degradation.
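
A stored result might look something like the record built below. This is a sketch of the four items listed above, not the actual artifact format; the field names, the `build_record` helper, and the sample values are assumptions.

```python
import json
import platform
from datetime import datetime, timezone
from typing import Dict


def build_record(commit_sha: str, metrics: Dict[str, float]) -> dict:
    """Bundle benchmark metrics with the context needed to compare runs over time."""
    return {
        "commit_sha": commit_sha,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "environment": {
            "python": platform.python_version(),
            "os": platform.system(),
        },
        "metrics": metrics,
    }


# Placeholder commit SHA and metrics for illustration only.
record = build_record("abc1234", {"mean_s": 0.125, "stddev_s": 0.005})
serialized = json.dumps(record, indent=2)
```

Keeping the environment next to the metrics makes it easier to tell a real regression from a change of runner or Python version.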
|
||||
@@ -0,0 +1,221 @@
|
||||
"""
|
||||
Benchmarks for CargoFuzzer module
|
||||
|
||||
Tests performance characteristics of Rust fuzzing:
|
||||
- Execution throughput (execs/sec)
|
||||
- Coverage rate
|
||||
- Memory efficiency
|
||||
- Time to first crash
|
||||
"""
|
||||
|
||||
import pytest
|
||||
import asyncio
|
||||
from pathlib import Path
|
||||
from unittest.mock import AsyncMock, patch
|
||||
import sys
|
||||
|
||||
sys.path.insert(0, str(Path(__file__).resolve().parents[3] / "toolbox"))
|
||||
|
||||
from modules.fuzzer.cargo_fuzzer import CargoFuzzer
|
||||
from benchmarks.category_configs import ModuleCategory, get_threshold
|
||||
|
||||
|
||||
@pytest.fixture
|
||||
def cargo_fuzzer():
|
||||
"""Create CargoFuzzer instance for benchmarking"""
|
||||
return CargoFuzzer()
|
||||
|
||||
|
||||
@pytest.fixture
|
||||
def benchmark_config():
|
||||
"""Benchmark-optimized configuration"""
|
||||
return {
|
||||
"target_name": None,
|
||||
"max_iterations": 10000, # Fixed iterations for consistent benchmarking
|
||||
"timeout_seconds": 30,
|
||||
"sanitizer": "address"
|
||||
}
|
||||
|
||||
|
||||
@pytest.fixture
|
||||
def mock_rust_workspace(tmp_path):
|
||||
"""Create a minimal Rust workspace for benchmarking"""
|
||||
workspace = tmp_path / "rust_project"
|
||||
workspace.mkdir()
|
||||
|
||||
# Cargo.toml
|
||||
(workspace / "Cargo.toml").write_text("""[package]
|
||||
name = "bench_project"
|
||||
version = "0.1.0"
|
||||
edition = "2021"
|
||||
""")
|
||||
|
||||
# src/lib.rs
|
||||
src = workspace / "src"
|
||||
src.mkdir()
|
||||
(src / "lib.rs").write_text("""
|
||||
pub fn benchmark_function(data: &[u8]) -> Vec<u8> {
|
||||
data.to_vec()
|
||||
}
|
||||
""")
|
||||
|
||||
# fuzz structure
|
||||
fuzz = workspace / "fuzz"
|
||||
fuzz.mkdir()
|
||||
(fuzz / "Cargo.toml").write_text("""[package]
|
||||
name = "bench_project-fuzz"
|
||||
version = "0.0.0"
|
||||
edition = "2021"
|
||||
|
||||
[dependencies]
|
||||
libfuzzer-sys = "0.4"
|
||||
|
||||
[dependencies.bench_project]
|
||||
path = ".."
|
||||
|
||||
[[bin]]
|
||||
name = "fuzz_target_1"
|
||||
path = "fuzz_targets/fuzz_target_1.rs"
|
||||
""")
|
||||
|
||||
targets = fuzz / "fuzz_targets"
|
||||
targets.mkdir()
|
||||
(targets / "fuzz_target_1.rs").write_text("""#![no_main]
|
||||
use libfuzzer_sys::fuzz_target;
|
||||
use bench_project::benchmark_function;
|
||||
|
||||
fuzz_target!(|data: &[u8]| {
|
||||
let _ = benchmark_function(data);
|
||||
});
|
||||
""")
|
||||
|
||||
return workspace
|
||||
|
||||
|
||||
class TestCargoFuzzerPerformance:
|
||||
"""Benchmark CargoFuzzer performance metrics"""
|
||||
|
||||
@pytest.mark.benchmark(group="fuzzer")
|
||||
def test_target_discovery_performance(self, benchmark, cargo_fuzzer, mock_rust_workspace):
|
||||
"""Benchmark fuzz target discovery speed"""
|
||||
def discover():
|
||||
return asyncio.run(cargo_fuzzer._discover_fuzz_targets(mock_rust_workspace))
|
||||
|
||||
result = benchmark(discover)
|
||||
assert len(result) > 0
|
||||
|
||||
@pytest.mark.benchmark(group="fuzzer")
|
||||
def test_config_validation_performance(self, benchmark, cargo_fuzzer, benchmark_config):
|
||||
"""Benchmark configuration validation speed"""
|
||||
result = benchmark(cargo_fuzzer.validate_config, benchmark_config)
|
||||
assert result is True
|
||||
|
||||
@pytest.mark.benchmark(group="fuzzer")
|
||||
def test_module_initialization_performance(self, benchmark):
|
||||
"""Benchmark module instantiation time"""
|
||||
def init_module():
|
||||
return CargoFuzzer()
|
||||
|
||||
module = benchmark(init_module)
|
||||
assert module is not None
|
||||
|
||||
|
||||
class TestCargoFuzzerThroughput:
|
||||
"""Benchmark execution throughput"""
|
||||
|
||||
@pytest.mark.benchmark(group="fuzzer")
|
||||
def test_execution_throughput(self, benchmark, cargo_fuzzer, mock_rust_workspace, benchmark_config):
|
||||
"""Benchmark fuzzing execution throughput"""
|
||||
|
||||
# Mock actual fuzzing to focus on orchestration overhead
|
||||
async def mock_run(workspace, target, config, callback):
|
||||
# Simulate 10K execs at 1000 execs/sec
|
||||
if callback:
|
||||
await callback({
|
||||
"total_execs": 10000,
|
||||
"execs_per_sec": 1000.0,
|
||||
"crashes": 0,
|
||||
"coverage": 50,
|
||||
"corpus_size": 10,
|
||||
"elapsed_time": 10
|
||||
})
|
||||
return [], {"total_executions": 10000, "execution_time": 10.0}
|
||||
|
||||
with patch.object(cargo_fuzzer, '_build_fuzz_target', new_callable=AsyncMock, return_value=True):
|
||||
with patch.object(cargo_fuzzer, '_run_fuzzing', side_effect=mock_run):
|
||||
with patch.object(cargo_fuzzer, '_parse_crash_artifacts', new_callable=AsyncMock, return_value=[]):
|
||||
def run_fuzzer():
|
||||
# Run in new event loop
|
||||
loop = asyncio.new_event_loop()
|
||||
try:
|
||||
return loop.run_until_complete(
|
||||
cargo_fuzzer.execute(benchmark_config, mock_rust_workspace)
|
||||
)
|
||||
finally:
|
||||
loop.close()
|
||||
|
||||
result = benchmark(run_fuzzer)
|
||||
assert result.status == "success"
|
||||
|
||||
# Verify performance threshold
|
||||
threshold = get_threshold(ModuleCategory.FUZZER, "max_execution_time_small")
|
||||
assert result.execution_time < threshold, \
|
||||
f"Execution time {result.execution_time}s exceeds threshold {threshold}s"
|
||||
|
||||
|
||||
class TestCargoFuzzerMemory:
|
||||
"""Benchmark memory efficiency"""
|
||||
|
||||
@pytest.mark.benchmark(group="fuzzer")
|
||||
def test_memory_overhead(self, benchmark, cargo_fuzzer, mock_rust_workspace, benchmark_config):
|
||||
"""Benchmark memory usage during execution"""
|
||||
import tracemalloc
|
||||
|
||||
def measure_memory():
|
||||
tracemalloc.start()
|
||||
|
||||
# Simulate operations
|
||||
cargo_fuzzer.validate_config(benchmark_config)
|
||||
asyncio.run(cargo_fuzzer._discover_fuzz_targets(mock_rust_workspace))
|
||||
|
||||
current, peak = tracemalloc.get_traced_memory()
|
||||
tracemalloc.stop()
|
||||
|
||||
return peak / 1024 / 1024 # Convert to MB
|
||||
|
||||
peak_mb = benchmark(measure_memory)
|
||||
|
||||
# Check against threshold
|
||||
max_memory = get_threshold(ModuleCategory.FUZZER, "max_memory_mb")
|
||||
assert peak_mb < max_memory, \
|
||||
f"Peak memory {peak_mb:.2f}MB exceeds threshold {max_memory}MB"
|
||||
|
||||
|
||||
class TestCargoFuzzerScalability:
|
||||
"""Benchmark scalability characteristics"""
|
||||
|
||||
@pytest.mark.benchmark(group="fuzzer")
|
||||
def test_multiple_target_discovery(self, benchmark, cargo_fuzzer, tmp_path):
|
||||
"""Benchmark discovery with multiple targets"""
|
||||
workspace = tmp_path / "multi_target"
|
||||
workspace.mkdir()
|
||||
|
||||
# Create workspace with 10 fuzz targets
|
||||
(workspace / "Cargo.toml").write_text("[package]\nname = \"test\"\nversion = \"0.1.0\"\nedition = \"2021\"")
|
||||
src = workspace / "src"
|
||||
src.mkdir()
|
||||
(src / "lib.rs").write_text("pub fn test() {}")
|
||||
|
||||
fuzz = workspace / "fuzz"
|
||||
fuzz.mkdir()
|
||||
targets = fuzz / "fuzz_targets"
|
||||
targets.mkdir()
|
||||
|
||||
for i in range(10):
|
||||
(targets / f"fuzz_target_{i}.rs").write_text("// Target")
|
||||
|
||||
def discover():
|
||||
return asyncio.run(cargo_fuzzer._discover_fuzz_targets(workspace))
|
||||
|
||||
result = benchmark(discover)
|
||||
assert len(result) == 10
|
||||
@@ -0,0 +1,151 @@
|
||||
"""
|
||||
Category-specific benchmark configurations
|
||||
|
||||
Defines expected metrics and performance thresholds for each module category.
|
||||
"""
|
||||
|
||||
from dataclasses import dataclass
|
||||
from typing import List, Dict
|
||||
from enum import Enum
|
||||
|
||||
|
||||
class ModuleCategory(str, Enum):
|
||||
"""Module categories for benchmarking"""
|
||||
FUZZER = "fuzzer"
|
||||
SCANNER = "scanner"
|
||||
ANALYZER = "analyzer"
|
||||
SECRET_DETECTION = "secret_detection"
|
||||
REPORTER = "reporter"
|
||||
|
||||
|
||||
@dataclass
|
||||
class CategoryBenchmarkConfig:
|
||||
"""Benchmark configuration for a module category"""
|
||||
category: ModuleCategory
|
||||
expected_metrics: List[str]
|
||||
performance_thresholds: Dict[str, float]
|
||||
description: str
|
||||
|
||||
|
||||
# Fuzzer category configuration
|
||||
FUZZER_CONFIG = CategoryBenchmarkConfig(
|
||||
category=ModuleCategory.FUZZER,
|
||||
expected_metrics=[
|
||||
"execs_per_sec",
|
||||
"coverage_rate",
|
||||
"time_to_first_crash",
|
||||
"corpus_efficiency",
|
||||
"execution_time",
|
||||
"peak_memory_mb"
|
||||
],
|
||||
performance_thresholds={
|
||||
"min_execs_per_sec": 1000, # Minimum executions per second
|
||||
"max_execution_time_small": 10.0, # Max time for small project (seconds)
|
||||
"max_execution_time_medium": 60.0, # Max time for medium project
|
||||
"max_memory_mb": 2048, # Maximum memory usage
|
||||
"min_coverage_rate": 1.0, # Minimum new coverage per second
|
||||
},
|
||||
description="Fuzzing modules: coverage-guided fuzz testing"
|
||||
)
|
||||
|
||||
# Scanner category configuration
|
||||
SCANNER_CONFIG = CategoryBenchmarkConfig(
|
||||
category=ModuleCategory.SCANNER,
|
||||
expected_metrics=[
|
||||
"files_per_sec",
|
||||
"loc_per_sec",
|
||||
"execution_time",
|
||||
"peak_memory_mb",
|
||||
"findings_count"
|
||||
],
|
||||
performance_thresholds={
|
||||
"min_files_per_sec": 100, # Minimum files scanned per second
|
||||
"min_loc_per_sec": 10000, # Minimum lines of code per second
|
||||
"max_execution_time_small": 1.0,
|
||||
"max_execution_time_medium": 10.0,
|
||||
"max_memory_mb": 512,
|
||||
},
|
||||
description="File scanning modules: fast pattern-based scanning"
|
||||
)
|
||||
|
||||
# Secret detection category configuration
|
||||
SECRET_DETECTION_CONFIG = CategoryBenchmarkConfig(
|
||||
category=ModuleCategory.SECRET_DETECTION,
|
||||
expected_metrics=[
|
||||
"patterns_per_sec",
|
||||
"precision",
|
||||
"recall",
|
||||
"f1_score",
|
||||
"false_positive_rate",
|
||||
"execution_time",
|
||||
"peak_memory_mb"
|
||||
],
|
||||
performance_thresholds={
|
||||
"min_patterns_per_sec": 1000,
|
||||
"min_precision": 0.90, # 90% precision target
|
||||
"min_recall": 0.95, # 95% recall target
|
||||
"max_false_positives": 5, # Max false positives per 100 secrets
|
||||
"max_execution_time_small": 2.0,
|
||||
"max_execution_time_medium": 20.0,
|
||||
"max_memory_mb": 1024,
|
||||
},
|
||||
description="Secret detection modules: high precision pattern matching"
|
||||
)
|
||||
|
||||
# Analyzer category configuration
|
||||
ANALYZER_CONFIG = CategoryBenchmarkConfig(
|
||||
category=ModuleCategory.ANALYZER,
|
||||
expected_metrics=[
|
||||
"analysis_depth",
|
||||
"files_analyzed_per_sec",
|
||||
"execution_time",
|
||||
"peak_memory_mb",
|
||||
"findings_count",
|
||||
"accuracy"
|
||||
],
|
||||
performance_thresholds={
|
||||
"min_files_per_sec": 10, # Slower than scanners due to deep analysis
|
||||
"max_execution_time_small": 5.0,
|
||||
"max_execution_time_medium": 60.0,
|
||||
"max_memory_mb": 2048,
|
||||
"min_accuracy": 0.85, # 85% accuracy target
|
||||
},
|
||||
description="Code analysis modules: deep semantic analysis"
|
||||
)
|
||||
|
||||
# Reporter category configuration
|
||||
REPORTER_CONFIG = CategoryBenchmarkConfig(
|
||||
category=ModuleCategory.REPORTER,
|
||||
expected_metrics=[
|
||||
"report_generation_time",
|
||||
"findings_per_sec",
|
||||
"peak_memory_mb"
|
||||
],
|
||||
performance_thresholds={
|
||||
"max_report_time_100_findings": 1.0, # Max 1 second for 100 findings
|
||||
"max_report_time_1000_findings": 10.0, # Max 10 seconds for 1000 findings
|
||||
"max_memory_mb": 256,
|
||||
},
|
||||
description="Reporting modules: fast report generation"
|
||||
)
|
||||
|
||||
|
||||
# Category configurations map
|
||||
CATEGORY_CONFIGS = {
|
||||
ModuleCategory.FUZZER: FUZZER_CONFIG,
|
||||
ModuleCategory.SCANNER: SCANNER_CONFIG,
|
||||
ModuleCategory.SECRET_DETECTION: SECRET_DETECTION_CONFIG,
|
||||
ModuleCategory.ANALYZER: ANALYZER_CONFIG,
|
||||
ModuleCategory.REPORTER: REPORTER_CONFIG,
|
||||
}
|
||||
|
||||
|
||||
def get_category_config(category: ModuleCategory) -> CategoryBenchmarkConfig:
|
||||
"""Get benchmark configuration for a category"""
|
||||
return CATEGORY_CONFIGS[category]
|
||||
|
||||
|
||||
def get_threshold(category: ModuleCategory, metric: str) -> float:
|
||||
"""Get performance threshold for a specific metric"""
|
||||
config = get_category_config(category)
|
||||
return config.performance_thresholds.get(metric, 0.0)
|
||||
@@ -0,0 +1,60 @@
|
||||
"""
|
||||
Benchmark fixtures and configuration
|
||||
"""
|
||||
|
||||
import sys
|
||||
from pathlib import Path
|
||||
import pytest
|
||||
|
||||
# Add parent directories to path
|
||||
BACKEND_ROOT = Path(__file__).resolve().parents[1]
|
||||
TOOLBOX = BACKEND_ROOT / "toolbox"
|
||||
|
||||
if str(BACKEND_ROOT) not in sys.path:
|
||||
sys.path.insert(0, str(BACKEND_ROOT))
|
||||
if str(TOOLBOX) not in sys.path:
|
||||
sys.path.insert(0, str(TOOLBOX))
|
||||
|
||||
|
||||
# ============================================================================
|
||||
# Benchmark Fixtures
|
||||
# ============================================================================
|
||||
|
||||
@pytest.fixture(scope="session")
|
||||
def benchmark_fixtures_dir():
|
||||
"""Path to benchmark fixtures directory"""
|
||||
return Path(__file__).parent / "fixtures"
|
||||
|
||||
|
||||
@pytest.fixture(scope="session")
|
||||
def small_project_fixture(benchmark_fixtures_dir):
|
||||
"""Small project fixture (~1K LOC)"""
|
||||
return benchmark_fixtures_dir / "small"
|
||||
|
||||
|
||||
@pytest.fixture(scope="session")
|
||||
def medium_project_fixture(benchmark_fixtures_dir):
|
||||
"""Medium project fixture (~10K LOC)"""
|
||||
return benchmark_fixtures_dir / "medium"
|
||||
|
||||
|
||||
@pytest.fixture(scope="session")
|
||||
def large_project_fixture(benchmark_fixtures_dir):
|
||||
"""Large project fixture (~100K LOC)"""
|
||||
return benchmark_fixtures_dir / "large"
|
||||
|
||||
|
||||
# ============================================================================
|
||||
# pytest-benchmark Configuration
|
||||
# ============================================================================
|
||||
|
||||
def pytest_configure(config):
|
||||
"""Configure pytest-benchmark"""
|
||||
config.addinivalue_line(
|
||||
"markers", "benchmark: mark test as a benchmark"
|
||||
)
|
||||
|
||||
|
||||
def pytest_benchmark_group_stats(config, benchmarks, group_by):
|
||||
"""Group benchmark results by category"""
|
||||
return group_by
|
||||
+17
-1
@@ -7,7 +7,8 @@ readme = "README.md"
|
||||
requires-python = ">=3.11"
|
||||
dependencies = [
|
||||
"fastapi>=0.116.1",
|
||||
"prefect>=3.4.18",
|
||||
"temporalio>=1.6.0",
|
||||
"boto3>=1.34.0",
|
||||
"pydantic>=2.0.0",
|
||||
"pyyaml>=6.0",
|
||||
"docker>=7.0.0",
|
||||
@@ -21,5 +22,20 @@ dependencies = [
|
||||
dev = [
|
||||
"pytest>=8.0.0",
|
||||
"pytest-asyncio>=0.23.0",
|
||||
"pytest-benchmark>=4.0.0",
|
||||
"pytest-cov>=5.0.0",
|
||||
"pytest-xdist>=3.5.0",
|
||||
"pytest-mock>=3.12.0",
|
||||
"httpx>=0.27.0",
|
||||
"ruff>=0.1.0",
|
||||
]
|
||||
|
||||
[tool.pytest.ini_options]
|
||||
asyncio_mode = "auto"
|
||||
testpaths = ["tests", "benchmarks"]
|
||||
python_files = ["test_*.py", "bench_*.py"]
|
||||
python_classes = ["Test*"]
|
||||
python_functions = ["test_*"]
|
||||
markers = [
|
||||
"benchmark: mark test as a benchmark",
|
||||
]
|
||||
|
||||
@@ -14,8 +14,8 @@ API endpoints for fuzzing workflow management and real-time monitoring
|
||||
# Additional attribution and requirements are provided in the NOTICE file.
|
||||
|
||||
import logging
|
||||
from typing import List, Dict, Any
|
||||
from fastapi import APIRouter, HTTPException, Depends, WebSocket, WebSocketDisconnect
|
||||
from typing import List, Dict
|
||||
from fastapi import APIRouter, HTTPException, WebSocket, WebSocketDisconnect
|
||||
from fastapi.responses import StreamingResponse
|
||||
import asyncio
|
||||
import json
|
||||
@@ -25,7 +25,6 @@ from src.models.findings import (
|
||||
FuzzingStats,
|
||||
CrashReport
|
||||
)
|
||||
from src.core.workflow_discovery import WorkflowDiscovery
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
@@ -126,12 +125,13 @@ async def update_fuzzing_stats(run_id: str, stats: FuzzingStats):
|
||||
# Debug: log reception for live instrumentation
|
||||
try:
|
||||
logger.info(
|
||||
"Received fuzzing stats update: run_id=%s exec=%s eps=%.2f crashes=%s corpus=%s elapsed=%ss",
|
||||
"Received fuzzing stats update: run_id=%s exec=%s eps=%.2f crashes=%s corpus=%s coverage=%s elapsed=%ss",
|
||||
run_id,
|
||||
stats.executions,
|
||||
stats.executions_per_sec,
|
||||
stats.crashes,
|
||||
stats.corpus_size,
|
||||
stats.coverage,
|
||||
stats.elapsed_time,
|
||||
)
|
||||
except Exception:
|
||||
|
||||
+49
-56
@@ -14,7 +14,6 @@ API endpoints for workflow run management and findings retrieval
|
||||
# Additional attribution and requirements are provided in the NOTICE file.
|
||||
|
||||
import logging
|
||||
from typing import Dict, Any
|
||||
from fastapi import APIRouter, HTTPException, Depends
|
||||
|
||||
from src.models.findings import WorkflowFindings, WorkflowStatus
|
||||
@@ -24,22 +23,22 @@ logger = logging.getLogger(__name__)
|
||||
router = APIRouter(prefix="/runs", tags=["runs"])
|
||||
|
||||
|
||||
def get_prefect_manager():
|
||||
"""Dependency to get the Prefect manager instance"""
|
||||
from src.main import prefect_mgr
|
||||
return prefect_mgr
|
||||
def get_temporal_manager():
|
||||
"""Dependency to get the Temporal manager instance"""
|
||||
from src.main import temporal_mgr
|
||||
return temporal_mgr
|
||||
|
||||
|
||||
@router.get("/{run_id}/status", response_model=WorkflowStatus)
|
||||
async def get_run_status(
|
||||
run_id: str,
|
||||
prefect_mgr=Depends(get_prefect_manager)
|
||||
temporal_mgr=Depends(get_temporal_manager)
|
||||
) -> WorkflowStatus:
|
||||
"""
|
||||
Get the current status of a workflow run.
|
||||
|
||||
Args:
|
||||
run_id: The flow run ID
|
||||
run_id: The workflow run ID
|
||||
|
||||
Returns:
|
||||
Status information including state, timestamps, and completion flags
|
||||
@@ -48,25 +47,23 @@ async def get_run_status(
|
||||
HTTPException: 404 if run not found
|
||||
"""
|
||||
try:
|
||||
status = await prefect_mgr.get_flow_run_status(run_id)
|
||||
status = await temporal_mgr.get_workflow_status(run_id)
|
||||
|
||||
# Find workflow name from deployment
|
||||
workflow_name = "unknown"
|
||||
workflow_deployment_id = status.get("workflow", "")
|
||||
for name, deployment_id in prefect_mgr.deployments.items():
|
||||
if str(deployment_id) == str(workflow_deployment_id):
|
||||
workflow_name = name
|
||||
break
|
||||
# Map Temporal status to response format
|
||||
workflow_status = status.get("status", "UNKNOWN")
|
||||
is_completed = workflow_status in ["COMPLETED", "FAILED", "CANCELLED"]
|
||||
is_failed = workflow_status == "FAILED"
|
||||
is_running = workflow_status == "RUNNING"
|
||||
|
||||
return WorkflowStatus(
|
||||
run_id=status["run_id"],
|
||||
workflow=workflow_name,
|
||||
status=status["status"],
|
||||
is_completed=status["is_completed"],
|
||||
is_failed=status["is_failed"],
|
||||
is_running=status["is_running"],
|
||||
created_at=status["created_at"],
|
||||
updated_at=status["updated_at"]
|
||||
run_id=run_id,
|
||||
workflow="unknown", # Temporal doesn't track workflow name in status
|
||||
status=workflow_status,
|
||||
is_completed=is_completed,
|
||||
is_failed=is_failed,
|
||||
is_running=is_running,
|
||||
created_at=status.get("start_time"),
|
||||
updated_at=status.get("close_time") or status.get("execution_time")
|
||||
)
|
||||
|
||||
except Exception as e:
|
||||
@@ -80,13 +77,13 @@ async def get_run_status(
|
||||
@router.get("/{run_id}/findings", response_model=WorkflowFindings)
|
||||
async def get_run_findings(
|
||||
run_id: str,
|
||||
prefect_mgr=Depends(get_prefect_manager)
|
||||
temporal_mgr=Depends(get_temporal_manager)
|
||||
) -> WorkflowFindings:
|
||||
"""
|
||||
Get the findings from a completed workflow run.
|
||||
|
||||
Args:
|
||||
run_id: The flow run ID
|
||||
run_id: The workflow run ID
|
||||
|
||||
Returns:
|
||||
SARIF-formatted findings from the workflow execution
|
||||
@@ -96,50 +93,46 @@ async def get_run_findings(
|
||||
"""
|
||||
try:
|
||||
# Get run status first
|
||||
status = await prefect_mgr.get_flow_run_status(run_id)
|
||||
status = await temporal_mgr.get_workflow_status(run_id)
|
||||
workflow_status = status.get("status", "UNKNOWN")
|
||||
|
||||
if not status["is_completed"]:
|
||||
if status["is_running"]:
|
||||
if workflow_status not in ["COMPLETED", "FAILED", "CANCELLED"]:
|
||||
if workflow_status == "RUNNING":
|
||||
raise HTTPException(
|
||||
status_code=400,
|
||||
detail=f"Run {run_id} is still running. Current status: {status['status']}"
|
||||
)
|
||||
elif status["is_failed"]:
|
||||
raise HTTPException(
|
||||
status_code=400,
|
||||
detail=f"Run {run_id} failed. Status: {status['status']}"
|
||||
detail=f"Run {run_id} is still running. Current status: {workflow_status}"
|
||||
)
|
||||
else:
|
||||
raise HTTPException(
|
||||
status_code=400,
|
||||
detail=f"Run {run_id} not completed. Status: {status['status']}"
|
||||
detail=f"Run {run_id} not completed. Status: {workflow_status}"
|
||||
)
|
||||
|
||||
# Get the findings
|
||||
findings = await prefect_mgr.get_flow_run_findings(run_id)
|
||||
if workflow_status == "FAILED":
|
||||
raise HTTPException(
|
||||
status_code=400,
|
||||
detail=f"Run {run_id} failed. Status: {workflow_status}"
|
||||
)
|
||||
|
||||
# Find workflow name
|
||||
workflow_name = "unknown"
|
||||
workflow_deployment_id = status.get("workflow", "")
|
||||
for name, deployment_id in prefect_mgr.deployments.items():
|
||||
if str(deployment_id) == str(workflow_deployment_id):
|
||||
workflow_name = name
|
||||
break
|
||||
# Get the workflow result
|
||||
result = await temporal_mgr.get_workflow_result(run_id)
|
||||
|
||||
# Get workflow version if available
|
||||
# Extract SARIF from result (handle None for backwards compatibility)
|
||||
if isinstance(result, dict):
|
||||
sarif = result.get("sarif") or {}
|
||||
else:
|
||||
sarif = {}
|
||||
|
||||
# Metadata
|
||||
metadata = {
|
||||
"completion_time": status["updated_at"],
|
||||
"completion_time": status.get("close_time"),
|
||||
"workflow_version": "unknown"
|
||||
}
|
||||
|
||||
if workflow_name in prefect_mgr.workflows:
|
||||
workflow_info = prefect_mgr.workflows[workflow_name]
|
||||
metadata["workflow_version"] = workflow_info.metadata.get("version", "unknown")
|
||||
|
||||
return WorkflowFindings(
|
||||
workflow=workflow_name,
|
||||
workflow="unknown",
|
||||
run_id=run_id,
|
||||
sarif=findings,
|
||||
sarif=sarif,
|
||||
metadata=metadata
|
||||
)
|
||||
|
||||
@@ -157,7 +150,7 @@ async def get_run_findings(
|
||||
async def get_workflow_findings(
|
||||
workflow_name: str,
|
||||
run_id: str,
|
||||
prefect_mgr=Depends(get_prefect_manager)
|
||||
temporal_mgr=Depends(get_temporal_manager)
|
||||
) -> WorkflowFindings:
|
||||
"""
|
||||
Get findings for a specific workflow run.
|
||||
@@ -166,7 +159,7 @@ async def get_workflow_findings(
|
||||
|
||||
Args:
|
||||
workflow_name: Name of the workflow
|
||||
run_id: The flow run ID
|
||||
run_id: The workflow run ID
|
||||
|
||||
Returns:
|
||||
SARIF-formatted findings from the workflow execution
|
||||
@@ -174,11 +167,11 @@ async def get_workflow_findings(
|
||||
Raises:
|
||||
HTTPException: 404 if workflow or run not found, 400 if run not completed
|
||||
"""
|
||||
if workflow_name not in prefect_mgr.workflows:
|
||||
if workflow_name not in temporal_mgr.workflows:
|
||||
raise HTTPException(
|
||||
status_code=404,
|
||||
detail=f"Workflow not found: {workflow_name}"
|
||||
)
|
||||
|
||||
# Delegate to the main findings endpoint
|
||||
return await get_run_findings(run_id, prefect_mgr)
|
||||
return await get_run_findings(run_id, temporal_mgr)
|
||||
|
||||
+307
-59
@@ -15,8 +15,9 @@ API endpoints for workflow management with enhanced error handling
|
||||
|
||||
import logging
|
||||
import traceback
|
||||
import tempfile
|
||||
from typing import List, Dict, Any, Optional
|
||||
from fastapi import APIRouter, HTTPException, Depends
|
||||
from fastapi import APIRouter, HTTPException, Depends, UploadFile, File, Form
|
||||
from pathlib import Path
|
||||
|
||||
from src.models.findings import (
|
||||
@@ -25,10 +26,20 @@ from src.models.findings import (
|
||||
WorkflowListItem,
|
||||
RunSubmissionResponse
|
||||
)
|
||||
from src.core.workflow_discovery import WorkflowDiscovery
|
||||
from src.temporal.discovery import WorkflowDiscovery
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
# Configuration for file uploads
|
||||
MAX_UPLOAD_SIZE = 10 * 1024 * 1024 * 1024 # 10 GB
|
||||
ALLOWED_CONTENT_TYPES = [
|
||||
"application/gzip",
|
||||
"application/x-gzip",
|
||||
"application/x-tar",
|
||||
"application/x-compressed-tar",
|
||||
"application/octet-stream", # Generic binary
|
||||
]
|
||||
|
||||
router = APIRouter(prefix="/workflows", tags=["workflows"])
|
||||
|
||||
|
||||
@@ -68,15 +79,15 @@ def create_structured_error_response(
|
||||
return error_response
|
||||
|
||||
|
||||
def get_prefect_manager():
|
||||
"""Dependency to get the Prefect manager instance"""
|
||||
from src.main import prefect_mgr
|
||||
return prefect_mgr
|
||||
def get_temporal_manager():
|
||||
"""Dependency to get the Temporal manager instance"""
|
||||
from src.main import temporal_mgr
|
||||
return temporal_mgr
|
||||
|
||||
|
||||
@router.get("/", response_model=List[WorkflowListItem])
|
||||
async def list_workflows(
|
||||
prefect_mgr=Depends(get_prefect_manager)
|
||||
temporal_mgr=Depends(get_temporal_manager)
|
||||
) -> List[WorkflowListItem]:
|
||||
"""
|
||||
List all discovered workflows with their metadata.
|
||||
@@ -85,7 +96,7 @@ async def list_workflows(
|
||||
author, and tags.
|
||||
"""
|
||||
workflows = []
|
||||
for name, info in prefect_mgr.workflows.items():
|
||||
for name, info in temporal_mgr.workflows.items():
|
||||
workflows.append(WorkflowListItem(
|
||||
name=name,
|
||||
version=info.metadata.get("version", "0.6.0"),
|
||||
@@ -111,7 +122,7 @@ async def get_metadata_schema() -> Dict[str, Any]:
|
||||
@router.get("/{workflow_name}/metadata", response_model=WorkflowMetadata)
|
||||
async def get_workflow_metadata(
|
||||
workflow_name: str,
|
||||
prefect_mgr=Depends(get_prefect_manager)
|
||||
temporal_mgr=Depends(get_temporal_manager)
|
||||
) -> WorkflowMetadata:
|
||||
"""
|
||||
Get complete metadata for a specific workflow.
|
||||
@@ -126,8 +137,8 @@ async def get_workflow_metadata(
|
||||
Raises:
|
||||
HTTPException: 404 if workflow not found
|
||||
"""
|
||||
if workflow_name not in prefect_mgr.workflows:
|
||||
available_workflows = list(prefect_mgr.workflows.keys())
|
||||
if workflow_name not in temporal_mgr.workflows:
|
||||
available_workflows = list(temporal_mgr.workflows.keys())
|
||||
error_response = create_structured_error_response(
|
||||
error_type="WorkflowNotFound",
|
||||
message=f"Workflow '{workflow_name}' not found",
|
||||
@@ -143,7 +154,7 @@ async def get_workflow_metadata(
|
||||
detail=error_response
|
||||
)
|
||||
|
||||
info = prefect_mgr.workflows[workflow_name]
|
||||
info = temporal_mgr.workflows[workflow_name]
|
||||
metadata = info.metadata
|
||||
|
||||
return WorkflowMetadata(
|
||||
@@ -154,9 +165,7 @@ async def get_workflow_metadata(
|
||||
tags=metadata.get("tags", []),
|
||||
parameters=metadata.get("parameters", {}),
|
||||
default_parameters=metadata.get("default_parameters", {}),
|
||||
required_modules=metadata.get("required_modules", []),
|
||||
supported_volume_modes=metadata.get("supported_volume_modes", ["ro", "rw"]),
|
||||
has_custom_docker=info.has_docker
|
||||
required_modules=metadata.get("required_modules", [])
|
||||
)
|
||||
|
||||
|
||||
@@ -164,14 +173,14 @@ async def get_workflow_metadata(
|
||||
async def submit_workflow(
|
||||
workflow_name: str,
|
||||
submission: WorkflowSubmission,
|
||||
prefect_mgr=Depends(get_prefect_manager)
|
||||
temporal_mgr=Depends(get_temporal_manager)
|
||||
) -> RunSubmissionResponse:
|
||||
"""
|
||||
Submit a workflow for execution with volume mounting.
|
||||
Submit a workflow for execution.
|
||||
|
||||
Args:
|
||||
workflow_name: Name of the workflow to execute
|
||||
submission: Submission parameters including target path and volume mode
|
||||
submission: Submission payload including the target path and workflow parameters
|
||||
|
||||
Returns:
|
||||
Run submission response with run_id and initial status
|
||||
@@ -179,8 +188,8 @@ async def submit_workflow(
|
||||
Raises:
|
||||
HTTPException: 404 if workflow not found, 400 for invalid parameters
|
||||
"""
|
||||
if workflow_name not in prefect_mgr.workflows:
|
||||
available_workflows = list(prefect_mgr.workflows.keys())
|
||||
if workflow_name not in temporal_mgr.workflows:
|
||||
available_workflows = list(temporal_mgr.workflows.keys())
|
||||
error_response = create_structured_error_response(
|
||||
error_type="WorkflowNotFound",
|
||||
message=f"Workflow '{workflow_name}' not found",
|
||||
@@ -197,31 +206,36 @@ async def submit_workflow(
|
||||
)
|
||||
|
||||
try:
|
||||
# Convert ResourceLimits to dict if provided
|
||||
resource_limits_dict = None
|
||||
if submission.resource_limits:
|
||||
resource_limits_dict = {
|
||||
"cpu_limit": submission.resource_limits.cpu_limit,
|
||||
"memory_limit": submission.resource_limits.memory_limit,
|
||||
"cpu_request": submission.resource_limits.cpu_request,
|
||||
"memory_request": submission.resource_limits.memory_request
|
||||
}
|
||||
# Upload target file to MinIO and get target_id
|
||||
target_path = Path(submission.target_path)
|
||||
if not target_path.exists():
|
||||
raise ValueError(f"Target path does not exist: {submission.target_path}")
|
||||
|
||||
# Submit the workflow with enhanced parameters
|
||||
flow_run = await prefect_mgr.submit_workflow(
|
||||
workflow_name=workflow_name,
|
||||
target_path=submission.target_path,
|
||||
volume_mode=submission.volume_mode,
|
||||
parameters=submission.parameters,
|
||||
resource_limits=resource_limits_dict,
|
||||
additional_volumes=submission.additional_volumes,
|
||||
timeout=submission.timeout
|
||||
# Upload target (using a fixed placeholder user ID for now)
|
||||
target_id = await temporal_mgr.upload_target(
|
||||
file_path=target_path,
|
||||
user_id="api-user",
|
||||
metadata={"workflow": workflow_name}
|
||||
)
|
||||
|
||||
run_id = str(flow_run.id)
|
||||
# Merge default parameters with user parameters
|
||||
workflow_info = temporal_mgr.workflows[workflow_name]
|
||||
metadata = workflow_info.metadata or {}
|
||||
defaults = metadata.get("default_parameters", {})
|
||||
user_params = submission.parameters or {}
|
||||
workflow_params = {**defaults, **user_params}
|
||||
|
||||
# Start workflow execution
|
||||
handle = await temporal_mgr.run_workflow(
|
||||
workflow_name=workflow_name,
|
||||
target_id=target_id,
|
||||
workflow_params=workflow_params
|
||||
)
|
||||
|
||||
run_id = handle.id
|
||||
|
||||
# Initialize fuzzing tracking if this looks like a fuzzing workflow
|
||||
workflow_info = prefect_mgr.workflows.get(workflow_name, {})
|
||||
workflow_info = temporal_mgr.workflows.get(workflow_name, {})
|
||||
workflow_tags = workflow_info.metadata.get("tags", []) if hasattr(workflow_info, 'metadata') else []
|
||||
if "fuzzing" in workflow_tags or "fuzz" in workflow_name.lower():
|
||||
from src.api.fuzzing import initialize_fuzzing_tracking
|
||||
@@ -229,7 +243,7 @@ async def submit_workflow(
|
||||
|
||||
return RunSubmissionResponse(
|
||||
run_id=run_id,
|
||||
status=flow_run.state.name if flow_run.state else "PENDING",
|
||||
status="RUNNING",
|
||||
workflow=workflow_name,
|
||||
message=f"Workflow '{workflow_name}' submitted successfully"
|
||||
)
|
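The default/user parameter merge in this hunk (`{**defaults, **user_params}`) is a shallow dictionary merge: user-supplied keys win, and a nested value replaces the default wholesale instead of being merged key by key. A minimal sketch of that behaviour, where `merge_params` is a hypothetical helper name that does not appear in the commit:

```python
from typing import Any, Dict, Optional


def merge_params(
    defaults: Optional[Dict[str, Any]],
    user_params: Optional[Dict[str, Any]],
) -> Dict[str, Any]:
    """Shallow merge: keys from user_params override keys from defaults."""
    # `or {}` mirrors the endpoint's handling of missing metadata/parameters.
    return {**(defaults or {}), **(user_params or {})}


defaults = {"timeout": 300, "scanner": {"depth": 2, "follow_links": False}}
user = {"scanner": {"depth": 5}}

merged = merge_params(defaults, user)
# "timeout" is inherited; "scanner" is replaced as a whole, so the
# default "follow_links" key is not carried over.
print(merged)
```

If per-key merging of nested sections is wanted, a recursive merge would be needed; the diff does not indicate that the backend does this.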
||||
@@ -261,17 +275,13 @@ async def submit_workflow(
|
||||
error_type = "WorkflowSubmissionError"
|
||||
|
||||
# Detect specific error patterns
|
||||
if "deployment" in error_message.lower():
|
||||
error_type = "DeploymentError"
|
||||
deployment_info = {
|
||||
"status": "failed",
|
||||
"error": error_message
|
||||
}
|
||||
if "workflow" in error_message.lower() and "not found" in error_message.lower():
|
||||
error_type = "WorkflowError"
|
||||
suggestions.extend([
|
||||
"Check if Prefect server is running and accessible",
|
||||
"Verify Docker is running and has sufficient resources",
|
||||
"Check container image availability",
|
||||
"Ensure volume paths exist and are accessible"
|
||||
"Check if Temporal server is running and accessible",
|
||||
"Verify workflow workers are running",
|
||||
"Check if workflow is registered with correct vertical",
|
||||
"Ensure Docker is running and has sufficient resources"
|
||||
])
|
||||
|
||||
elif "volume" in error_message.lower() or "mount" in error_message.lower():
|
||||
@@ -324,25 +334,200 @@ async def submit_workflow(
|
||||
)
|
||||
|
||||
|
||||
@router.get("/{workflow_name}/parameters")
|
||||
async def get_workflow_parameters(
|
||||
@router.post("/{workflow_name}/upload-and-submit", response_model=RunSubmissionResponse)
|
||||
async def upload_and_submit_workflow(
|
||||
workflow_name: str,
|
||||
prefect_mgr=Depends(get_prefect_manager)
|
||||
file: UploadFile = File(..., description="Target file or tarball to analyze"),
|
||||
parameters: Optional[str] = Form(None, description="JSON-encoded workflow parameters"),
|
||||
timeout: Optional[int] = Form(None, description="Timeout in seconds"),
|
||||
temporal_mgr=Depends(get_temporal_manager)
|
||||
) -> RunSubmissionResponse:
|
||||
"""
|
||||
Upload a target file/tarball and submit workflow for execution.
|
||||
|
||||
This endpoint accepts multipart/form-data uploads and is the recommended
|
||||
way to submit workflows from remote CLI clients.
|
||||
|
||||
Args:
|
||||
workflow_name: Name of the workflow to execute
|
||||
file: Target file or tarball (compressed directory)
|
||||
parameters: JSON string of workflow parameters (optional)
|
||||
timeout: Execution timeout in seconds (optional)
|
||||
|
||||
Returns:
|
||||
Run submission response with run_id and initial status
|
||||
|
||||
Raises:
|
||||
HTTPException: 404 if workflow not found, 400 for invalid parameters,
|
||||
413 if file too large
|
||||
"""
|
||||
if workflow_name not in temporal_mgr.workflows:
|
||||
available_workflows = list(temporal_mgr.workflows.keys())
|
||||
error_response = create_structured_error_response(
|
||||
error_type="WorkflowNotFound",
|
||||
message=f"Workflow '{workflow_name}' not found",
|
||||
workflow_name=workflow_name,
|
||||
suggestions=[
|
||||
f"Available workflows: {', '.join(available_workflows)}",
|
||||
"Use GET /workflows/ to see all available workflows"
|
||||
]
|
||||
)
|
||||
raise HTTPException(status_code=404, detail=error_response)
|
||||
|
||||
temp_file_path = None
|
||||
|
||||
try:
|
||||
# Track the upload size; the limit is enforced per chunk while streaming
|
||||
file_size = 0
|
||||
chunk_size = 1024 * 1024 # 1MB chunks
|
||||
|
||||
# Create temporary file
|
||||
temp_fd, temp_file_path = tempfile.mkstemp(suffix=".tar.gz")
|
||||
|
||||
logger.info(f"Receiving file upload for workflow '{workflow_name}': {file.filename}")
|
||||
|
||||
# Stream file to disk
|
||||
with open(temp_fd, 'wb') as temp_file:
|
||||
while True:
|
||||
chunk = await file.read(chunk_size)
|
||||
if not chunk:
|
||||
break
|
||||
|
||||
file_size += len(chunk)
|
||||
|
||||
# Check size limit
|
||||
if file_size > MAX_UPLOAD_SIZE:
|
||||
raise HTTPException(
|
||||
status_code=413,
|
||||
detail=create_structured_error_response(
|
||||
error_type="FileTooLarge",
|
||||
message=f"File size exceeds maximum allowed size of {MAX_UPLOAD_SIZE / (1024**3):.1f} GB",
|
||||
workflow_name=workflow_name,
|
||||
suggestions=[
|
||||
"Reduce the size of your target directory",
|
||||
"Exclude unnecessary files (build artifacts, dependencies, etc.)",
|
||||
"Consider splitting into smaller analysis targets"
|
||||
]
|
||||
)
|
||||
)
|
||||
|
||||
temp_file.write(chunk)
|
||||
|
||||
logger.info(f"Received file: {file_size / (1024**2):.2f} MB")
|
||||
|
||||
# Parse parameters
|
||||
workflow_params = {}
|
||||
if parameters:
|
||||
try:
|
||||
import json
|
||||
workflow_params = json.loads(parameters)
|
||||
if not isinstance(workflow_params, dict):
|
||||
raise ValueError("Parameters must be a JSON object")
|
||||
except (json.JSONDecodeError, ValueError) as e:
|
||||
raise HTTPException(
|
||||
status_code=400,
|
||||
detail=create_structured_error_response(
|
||||
error_type="InvalidParameters",
|
||||
message=f"Invalid parameters JSON: {e}",
|
||||
workflow_name=workflow_name,
|
||||
suggestions=["Ensure parameters is a valid JSON object"]
|
||||
)
|
||||
)
|
||||
|
||||
# Upload to MinIO
|
||||
target_id = await temporal_mgr.upload_target(
|
||||
file_path=Path(temp_file_path),
|
||||
user_id="api-user",
|
||||
metadata={
|
||||
"workflow": workflow_name,
|
||||
"original_filename": file.filename,
|
||||
"upload_method": "multipart"
|
||||
}
|
||||
)
|
||||
|
||||
logger.info(f"Uploaded to MinIO with target_id: {target_id}")
|
||||
|
||||
# Merge default parameters with user parameters
|
||||
workflow_info = temporal_mgr.workflows.get(workflow_name)
|
||||
metadata = workflow_info.metadata or {}
|
||||
defaults = metadata.get("default_parameters", {})
|
||||
workflow_params = {**defaults, **workflow_params}
|
||||
|
||||
# Start workflow execution
|
||||
handle = await temporal_mgr.run_workflow(
|
||||
workflow_name=workflow_name,
|
||||
target_id=target_id,
|
||||
workflow_params=workflow_params
|
||||
)
|
||||
|
||||
run_id = handle.id
|
||||
|
||||
# Initialize fuzzing tracking if needed
|
||||
workflow_info = temporal_mgr.workflows.get(workflow_name, {})
|
||||
workflow_tags = workflow_info.metadata.get("tags", []) if hasattr(workflow_info, 'metadata') else []
|
||||
if "fuzzing" in workflow_tags or "fuzz" in workflow_name.lower():
|
||||
from src.api.fuzzing import initialize_fuzzing_tracking
|
||||
initialize_fuzzing_tracking(run_id, workflow_name)
|
||||
|
||||
return RunSubmissionResponse(
|
||||
run_id=run_id,
|
||||
status="RUNNING",
|
||||
workflow=workflow_name,
|
||||
message=f"Workflow '{workflow_name}' submitted successfully with uploaded target"
|
||||
)
|
||||
|
||||
except HTTPException:
|
||||
raise
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to upload and submit workflow '{workflow_name}': {e}")
|
||||
logger.error(f"Traceback: {traceback.format_exc()}")
|
||||
|
||||
error_response = create_structured_error_response(
|
||||
error_type="WorkflowSubmissionError",
|
||||
message=f"Failed to process upload and submit workflow: {str(e)}",
|
||||
workflow_name=workflow_name,
|
||||
suggestions=[
|
||||
"Check if the uploaded file is a valid tarball",
|
||||
"Verify MinIO storage is accessible",
|
||||
"Check backend logs for detailed error information",
|
||||
"Ensure Temporal workers are running"
|
||||
]
|
||||
)
|
||||
|
||||
raise HTTPException(status_code=500, detail=error_response)
|
||||
|
||||
finally:
|
||||
# Cleanup temporary file
|
||||
if temp_file_path and Path(temp_file_path).exists():
|
||||
try:
|
||||
Path(temp_file_path).unlink()
|
||||
logger.debug(f"Cleaned up temp file: {temp_file_path}")
|
||||
except Exception as e:
|
||||
logger.warning(f"Failed to cleanup temp file {temp_file_path}: {e}")
|
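The upload path above streams the request body to a temporary file in fixed-size chunks and rejects it as soon as the running total passes the limit, so an oversized upload is never fully written to disk. A synchronous sketch of that pattern, assuming a plain file-like source in place of FastAPI's `UploadFile`; `stream_to_tempfile` and `UploadTooLarge` are hypothetical names (the endpoint raises `HTTPException(413)` instead):

```python
import io
import tempfile
from pathlib import Path
from typing import BinaryIO


class UploadTooLarge(Exception):
    """Hypothetical error type standing in for the endpoint's HTTP 413."""


def stream_to_tempfile(
    source: BinaryIO, max_size: int, chunk_size: int = 1024 * 1024
) -> Path:
    """Copy `source` to a temp file chunk by chunk, aborting past max_size."""
    fd, temp_path = tempfile.mkstemp(suffix=".tar.gz")
    total = 0
    try:
        # open() on an integer descriptor takes ownership and closes it on exit.
        with open(fd, "wb") as out:
            while True:
                chunk = source.read(chunk_size)
                if not chunk:
                    break
                total += len(chunk)
                # Check before writing so the file never exceeds the limit.
                if total > max_size:
                    raise UploadTooLarge(f"upload exceeds {max_size} bytes")
                out.write(chunk)
    except Exception:
        # Remove the partial file on any failure, then re-raise.
        Path(temp_path).unlink(missing_ok=True)
        raise
    return Path(temp_path)


path = stream_to_tempfile(io.BytesIO(b"x" * 10), max_size=16, chunk_size=4)
print(path.stat().st_size)
path.unlink()
```

The endpoint itself defers cleanup to a `finally` block because the file is still needed for the MinIO upload; the sketch cleans up only on failure and leaves the returned file to the caller.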
||||
|
||||
|
||||
@router.get("/{workflow_name}/worker-info")
|
||||
async def get_workflow_worker_info(
|
||||
workflow_name: str,
|
||||
temporal_mgr=Depends(get_temporal_manager)
|
||||
) -> Dict[str, Any]:
|
||||
"""
|
||||
Get the parameters schema for a workflow.
|
||||
Get worker information for a workflow.
|
||||
|
||||
Returns details about which worker is required to execute this workflow,
|
||||
including container name, task queue, and vertical.
|
||||
|
||||
Args:
|
||||
workflow_name: Name of the workflow
|
||||
|
||||
Returns:
|
||||
Parameters schema with types, descriptions, and defaults
|
||||
Worker information including vertical, container name, and task queue
|
||||
|
||||
Raises:
|
||||
HTTPException: 404 if workflow not found, 500 if its metadata has no 'vertical'
|
||||
"""
|
||||
if workflow_name not in prefect_mgr.workflows:
|
||||
available_workflows = list(prefect_mgr.workflows.keys())
|
||||
if workflow_name not in temporal_mgr.workflows:
|
||||
available_workflows = list(temporal_mgr.workflows.keys())
|
||||
error_response = create_structured_error_response(
|
||||
error_type="WorkflowNotFound",
|
||||
message=f"Workflow '{workflow_name}' not found",
|
||||
@@ -357,7 +542,70 @@ async def get_workflow_parameters(
|
||||
detail=error_response
|
||||
)
|
||||
|
||||
info = prefect_mgr.workflows[workflow_name]
|
||||
info = temporal_mgr.workflows[workflow_name]
|
||||
metadata = info.metadata
|
||||
|
||||
# Extract vertical from metadata
|
||||
vertical = metadata.get("vertical")
|
||||
|
||||
if not vertical:
|
||||
error_response = create_structured_error_response(
|
||||
error_type="MissingVertical",
|
||||
message=f"Workflow '{workflow_name}' does not specify a vertical in metadata",
|
||||
workflow_name=workflow_name,
|
||||
suggestions=[
|
||||
"Check workflow metadata.yaml for 'vertical' field",
|
||||
"Contact workflow author for support"
|
||||
]
|
||||
)
|
||||
raise HTTPException(
|
||||
status_code=500,
|
||||
detail=error_response
|
||||
)
|
||||
|
||||
return {
|
||||
"workflow": workflow_name,
|
||||
"vertical": vertical,
|
||||
"worker_container": f"fuzzforge-worker-{vertical}",
|
||||
"task_queue": f"{vertical}-queue",
|
||||
"required": True
|
||||
}
|
||||
|
||||
|
||||
@router.get("/{workflow_name}/parameters")
|
||||
async def get_workflow_parameters(
|
||||
workflow_name: str,
|
||||
temporal_mgr=Depends(get_temporal_manager)
|
||||
) -> Dict[str, Any]:
|
||||
"""
|
||||
Get the parameters schema for a workflow.
|
||||
|
||||
Args:
|
||||
workflow_name: Name of the workflow
|
||||
|
||||
Returns:
|
||||
Parameters schema with types, descriptions, and defaults
|
||||
|
||||
Raises:
|
||||
HTTPException: 404 if workflow not found
|
||||
"""
|
||||
if workflow_name not in temporal_mgr.workflows:
|
||||
available_workflows = list(temporal_mgr.workflows.keys())
|
||||
error_response = create_structured_error_response(
|
||||
error_type="WorkflowNotFound",
|
||||
message=f"Workflow '{workflow_name}' not found",
|
||||
workflow_name=workflow_name,
|
||||
suggestions=[
|
||||
f"Available workflows: {', '.join(available_workflows)}",
|
||||
"Use GET /workflows/ to see all available workflows"
|
||||
]
|
||||
)
|
||||
raise HTTPException(
|
||||
status_code=404,
|
||||
detail=error_response
|
||||
)
|
||||
|
||||
info = temporal_mgr.workflows[workflow_name]
|
||||
metadata = info.metadata
|
||||
|
||||
# Return parameters with enhanced schema information
|
||||
|
||||
@@ -1,770 +0,0 @@
|
||||
"""
|
||||
Prefect Manager - Core orchestration for workflow deployment and execution
|
||||
"""
|
||||
|
||||
# Copyright (c) 2025 FuzzingLabs
|
||||
#
|
||||
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
|
||||
# at the root of this repository for details.
|
||||
#
|
||||
# After the Change Date (four years from publication), this version of the
|
||||
# Licensed Work will be made available under the Apache License, Version 2.0.
|
||||
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Additional attribution and requirements are provided in the NOTICE file.
|
||||
|
||||
import logging
|
||||
import os
|
||||
import platform
|
||||
import re
|
||||
from pathlib import Path
|
||||
from typing import Dict, Optional, Any
|
||||
from prefect import get_client
|
||||
from prefect.docker import DockerImage
|
||||
from prefect.client.schemas import FlowRun
|
||||
|
||||
from src.core.workflow_discovery import WorkflowDiscovery, WorkflowInfo
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
def get_registry_url(context: str = "default") -> str:
|
||||
"""
|
||||
Get the container registry URL to use for a given operation context.
|
||||
|
||||
Goals:
|
||||
- Work reliably across Linux and macOS Docker Desktop
|
||||
- Prefer in-network service discovery when running inside containers
|
||||
- Allow full override via env vars from docker-compose
|
||||
|
||||
Env overrides:
|
||||
- FUZZFORGE_REGISTRY_PUSH_URL: used for image builds/pushes
|
||||
- FUZZFORGE_REGISTRY_PULL_URL: used for workers to pull images
|
||||
"""
|
||||
# Normalize context
|
||||
ctx = (context or "default").lower()
|
||||
|
||||
# Always honor explicit overrides first
|
||||
if ctx in ("push", "build"):
|
||||
push_url = os.getenv("FUZZFORGE_REGISTRY_PUSH_URL")
|
||||
if push_url:
|
||||
logger.debug("Using FUZZFORGE_REGISTRY_PUSH_URL: %s", push_url)
|
||||
return push_url
|
||||
# Default to host-published registry for Docker daemon operations
|
||||
return "localhost:5001"
|
||||
|
||||
if ctx == "pull":
|
||||
pull_url = os.getenv("FUZZFORGE_REGISTRY_PULL_URL")
|
||||
if pull_url:
|
||||
logger.debug("Using FUZZFORGE_REGISTRY_PULL_URL: %s", pull_url)
|
||||
return pull_url
|
||||
# Prefect worker pulls via host Docker daemon as well
|
||||
return "localhost:5001"
|
||||
|
||||
# Default/fallback
|
||||
return os.getenv("FUZZFORGE_REGISTRY_PULL_URL", os.getenv("FUZZFORGE_REGISTRY_PUSH_URL", "localhost:5001"))
|
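The lookup order in the removed `get_registry_url` can be summarised as: an explicit override for the requested context, then the host-published default, with the pull URL taking priority over the push URL for any other context. A sketch of that resolution, assuming the environment is passed in as a mapping so it can be exercised without touching `os.environ`; `resolve_registry_url` and `DEFAULT_REGISTRY` are hypothetical names:

```python
from typing import Mapping

DEFAULT_REGISTRY = "localhost:5001"  # same fallback value as the removed helper


def resolve_registry_url(context: str, env: Mapping[str, str]) -> str:
    """Resolve the registry URL for 'push'/'build', 'pull', or any other context."""
    ctx = (context or "default").lower()

    if ctx in ("push", "build"):
        # An empty or missing override falls back to the default.
        return env.get("FUZZFORGE_REGISTRY_PUSH_URL") or DEFAULT_REGISTRY

    if ctx == "pull":
        return env.get("FUZZFORGE_REGISTRY_PULL_URL") or DEFAULT_REGISTRY

    # Any other context: pull URL first, then push URL, then the default.
    return env.get(
        "FUZZFORGE_REGISTRY_PULL_URL",
        env.get("FUZZFORGE_REGISTRY_PUSH_URL", DEFAULT_REGISTRY),
    )


env = {"FUZZFORGE_REGISTRY_PUSH_URL": "registry:5000"}
print(resolve_registry_url("build", env))
print(resolve_registry_url("pull", env))
```

Note that a push override does not affect the `pull` context; only the fallback branch consults both variables.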
||||
|
||||
|
||||
def _compose_project_name(default: str = "fuzzforge") -> str:
|
||||
"""Return the docker-compose project name used for network/volume naming.
|
||||
|
||||
Always returns 'fuzzforge' regardless of environment variables.
|
||||
"""
|
||||
return "fuzzforge"
|
||||
|
||||
|
||||
class PrefectManager:
|
||||
"""
|
||||
Manages Prefect deployments and flow runs for discovered workflows.
|
||||
|
||||
This class handles:
|
||||
- Workflow discovery and registration
|
||||
- Docker image building through Prefect
|
||||
- Deployment creation and management
|
||||
- Flow run submission with volume mounting
|
||||
- Findings retrieval from completed runs
|
||||
"""
|
||||
|
||||
def __init__(self, workflows_dir: Optional[Path] = None):
|
||||
"""
|
||||
Initialize the Prefect manager.
|
||||
|
||||
Args:
|
||||
workflows_dir: Path to the workflows directory (default: toolbox/workflows)
|
||||
"""
|
||||
if workflows_dir is None:
|
||||
workflows_dir = Path("toolbox/workflows")
|
||||
|
||||
self.discovery = WorkflowDiscovery(workflows_dir)
|
||||
self.workflows: Dict[str, WorkflowInfo] = {}
|
||||
self.deployments: Dict[str, str] = {} # workflow_name -> deployment_id
|
||||
|
||||
# Security: Define allowed and forbidden paths for host mounting
|
||||
self.allowed_base_paths = [
|
||||
"/tmp",
|
||||
"/home",
|
||||
"/Users", # macOS users
|
||||
"/opt",
|
||||
"/var/tmp",
|
||||
"/workspace", # Common container workspace
|
||||
"/app" # Container application directory (for test projects)
|
||||
]
|
||||
|
||||
self.forbidden_paths = [
|
||||
"/etc",
|
||||
"/root",
|
||||
"/var/run",
|
||||
"/sys",
|
||||
"/proc",
|
||||
"/dev",
|
||||
"/boot",
|
||||
"/var/lib/docker", # Critical Docker data
|
||||
"/var/log", # System logs
|
||||
"/usr/bin", # System binaries
|
||||
"/usr/sbin",
|
||||
"/sbin",
|
||||
"/bin"
|
||||
]
|
||||
|
||||
@staticmethod
|
||||
def _parse_memory_to_bytes(memory_str: str) -> int:
|
||||
"""
|
||||
Parse memory string (like '512Mi', '1Gi') to bytes.
|
||||
|
||||
Args:
|
||||
memory_str: Memory string with unit suffix
|
||||
|
||||
Returns:
|
||||
Memory in bytes
|
||||
|
||||
Raises:
|
||||
ValueError: If format is invalid
|
||||
"""
|
||||
if not memory_str:
|
||||
return 0
|
||||
|
||||
match = re.match(r'^(\d+(?:\.\d+)?)\s*([GMK]i?)$', memory_str.strip())
|
||||
if not match:
|
||||
raise ValueError(f"Invalid memory format: {memory_str}. Expected format like '512Mi', '1Gi'")
|
||||
|
||||
value, unit = match.groups()
|
||||
value = float(value)
|
||||
|
||||
# Convert to bytes; bare K/M/G are treated as binary units, same as Ki/Mi/Gi
|
||||
if unit in ['K', 'Ki']:
|
||||
multiplier = 1024
|
||||
elif unit in ['M', 'Mi']:
|
||||
multiplier = 1024 * 1024
|
||||
elif unit in ['G', 'Gi']:
|
||||
multiplier = 1024 * 1024 * 1024
|
||||
else:
|
||||
raise ValueError(f"Unsupported memory unit: {unit}")
|
||||
|
||||
return int(value * multiplier)
|
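For reference, the memory parser above accepts a number followed by `K`, `M`, or `G` with an optional `i`, and always applies binary (power-of-1024) multipliers. A standalone sketch of the same rule as a module-level function; `parse_memory_to_bytes` and `_MULTIPLIERS` are illustrative names, not part of the removed class:

```python
import re

# Binary multipliers; "M" and "Mi" are treated identically, as in the original.
_MULTIPLIERS = {"K": 1024, "M": 1024**2, "G": 1024**3}


def parse_memory_to_bytes(memory_str: str) -> int:
    """Parse strings such as '512Mi' or '1G' into a byte count."""
    if not memory_str:
        return 0

    match = re.match(r"^(\d+(?:\.\d+)?)\s*([GMK])i?$", memory_str.strip())
    if not match:
        raise ValueError(
            f"Invalid memory format: {memory_str}. Expected format like '512Mi', '1Gi'"
        )

    value, unit = match.groups()
    return int(float(value) * _MULTIPLIERS[unit])


print(parse_memory_to_bytes("512Mi"))  # 536870912
print(parse_memory_to_bytes("1G"))     # 1073741824
```

Suffixes such as `MB` or a bare number with no unit do not match the pattern and raise `ValueError`.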
||||
|
||||
@staticmethod
|
||||
def _parse_cpu_to_millicores(cpu_str: str) -> int:
|
||||
"""
|
||||
Parse CPU string (like '500m', '1', '2.5') to millicores.
|
||||
|
||||
Args:
|
||||
cpu_str: CPU string
|
||||
|
||||
Returns:
|
||||
CPU in millicores (1 core = 1000 millicores)
|
||||
|
||||
Raises:
|
||||
ValueError: If format is invalid
|
||||
"""
|
||||
if not cpu_str:
|
||||
return 0
|
||||
|
||||
cpu_str = cpu_str.strip()
|
||||
|
||||
# Handle millicores format (e.g., '500m')
|
||||
if cpu_str.endswith('m'):
|
||||
try:
|
||||
return int(cpu_str[:-1])
|
||||
except ValueError:
|
||||
raise ValueError(f"Invalid CPU format: {cpu_str}")
|
||||
|
||||
# Handle core format (e.g., '1', '2.5')
|
||||
try:
|
||||
cores = float(cpu_str)
|
||||
return int(cores * 1000) # Convert to millicores
|
||||
except ValueError:
|
||||
raise ValueError(f"Invalid CPU format: {cpu_str}")
|
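The CPU parser follows the Kubernetes-style convention in which a trailing `m` denotes millicores and a bare number denotes whole or fractional cores. A compact standalone sketch of that conversion, assuming the same inputs as the method above; `parse_cpu_to_millicores` here is a free function written for illustration:

```python
def parse_cpu_to_millicores(cpu_str: str) -> int:
    """Convert '500m', '1', or '2.5' to millicores (1 core = 1000 millicores)."""
    if not cpu_str:
        return 0

    cpu_str = cpu_str.strip()
    try:
        if cpu_str.endswith("m"):
            # Millicore form: the part before 'm' must be an integer.
            return int(cpu_str[:-1])
        # Core form: fractional cores are allowed and truncated to an int.
        return int(float(cpu_str) * 1000)
    except ValueError as exc:
        # Chain the cause so the original parsing error stays visible.
        raise ValueError(f"Invalid CPU format: {cpu_str}") from exc


print(parse_cpu_to_millicores("500m"))  # 500
print(parse_cpu_to_millicores("2.5"))   # 2500
```

A fractional millicore value such as `1.5m` is rejected, since the millicore branch parses with `int`.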
||||
|
||||
def _extract_resource_requirements(self, workflow_info: WorkflowInfo) -> Dict[str, str]:
|
||||
"""
|
||||
Extract resource requirements from workflow metadata.
|
||||
|
||||
Args:
|
||||
workflow_info: Workflow information with metadata
|
||||
|
||||
Returns:
|
||||
Dictionary with resource requirements in Docker format
|
||||
"""
|
||||
metadata = workflow_info.metadata
|
||||
requirements = metadata.get("requirements", {})
|
||||
resources = requirements.get("resources", {})
|
||||
|
||||
resource_config = {}
|
||||
|
||||
# Extract memory requirement
|
||||
memory = resources.get("memory")
|
||||
if memory:
|
||||
try:
|
||||
# Validate memory format and store original string for Docker
|
||||
self._parse_memory_to_bytes(memory)
|
||||
resource_config["memory"] = memory
|
||||
except ValueError as e:
|
||||
logger.warning(f"Invalid memory requirement in {workflow_info.name}: {e}")
|
||||
|
||||
# Extract CPU requirement
|
||||
cpu = resources.get("cpu")
|
||||
if cpu:
|
||||
try:
|
||||
# Validate CPU format and store original string for Docker
|
||||
self._parse_cpu_to_millicores(cpu)
|
||||
resource_config["cpus"] = cpu
|
||||
except ValueError as e:
|
||||
logger.warning(f"Invalid CPU requirement in {workflow_info.name}: {e}")
|
||||
|
||||
# Extract timeout
|
||||
timeout = resources.get("timeout")
|
||||
if timeout and isinstance(timeout, int):
|
||||
resource_config["timeout"] = str(timeout)
|
||||
|
||||
return resource_config
|
||||
|
||||
async def initialize(self):
|
||||
"""
|
||||
Initialize the manager by discovering and deploying all workflows.
|
||||
|
||||
This method:
|
||||
1. Discovers all valid workflows in the workflows directory
|
||||
2. Validates their metadata
|
||||
3. Deploys each workflow to Prefect with Docker images
|
||||
"""
|
||||
try:
|
||||
# Discover workflows
|
||||
self.workflows = await self.discovery.discover_workflows()
|
||||
|
||||
if not self.workflows:
|
||||
logger.warning("No workflows discovered")
|
||||
return
|
||||
|
||||
logger.info(f"Discovered {len(self.workflows)} workflows: {list(self.workflows.keys())}")
|
||||
|
||||
# Deploy each workflow
|
||||
for name, info in self.workflows.items():
|
||||
try:
|
||||
await self._deploy_workflow(name, info)
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to deploy workflow '{name}': {e}")
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to initialize Prefect manager: {e}")
|
||||
raise
|
||||
|
||||
async def _deploy_workflow(self, name: str, info: WorkflowInfo):
|
||||
"""
|
||||
Deploy a single workflow to Prefect with Docker image.
|
||||
|
||||
Args:
|
||||
name: Workflow name
|
||||
info: Workflow information including metadata and paths
|
||||
"""
|
||||
logger.info(f"Deploying workflow '{name}'...")
|
||||
|
||||
# Get the flow function from registry
|
||||
flow_func = self.discovery.get_flow_function(name)
|
||||
if not flow_func:
|
||||
logger.error(
|
||||
f"Failed to get flow function for '{name}' from registry. "
|
||||
f"Ensure the workflow is properly registered in toolbox/workflows/registry.py"
|
||||
)
|
||||
return
|
||||
|
||||
# Use the mandatory Dockerfile with absolute paths for Docker Compose
|
||||
# Get absolute paths for build context and dockerfile
|
||||
toolbox_path = info.path.parent.parent.resolve()
|
||||
dockerfile_abs_path = info.dockerfile.resolve()
|
||||
|
||||
# Calculate relative dockerfile path from toolbox context
|
||||
try:
|
||||
dockerfile_rel_path = dockerfile_abs_path.relative_to(toolbox_path)
|
||||
except ValueError:
|
||||
# If relative path fails, use the workflow-specific path
|
||||
dockerfile_rel_path = Path("workflows") / name / "Dockerfile"
|
||||
|
||||
# Determine deployment strategy based on Dockerfile presence
|
||||
base_image = "prefecthq/prefect:3-python3.11"
|
||||
has_custom_dockerfile = info.has_docker and info.dockerfile.exists()
|
||||
|
||||
logger.info(f"=== DEPLOYMENT DEBUG for '{name}' ===")
|
||||
logger.info(f"info.has_docker: {info.has_docker}")
|
||||
logger.info(f"info.dockerfile: {info.dockerfile}")
|
||||
logger.info(f"info.dockerfile.exists(): {info.dockerfile.exists()}")
|
||||
logger.info(f"has_custom_dockerfile: {has_custom_dockerfile}")
|
||||
logger.info(f"toolbox_path: {toolbox_path}")
|
||||
logger.info(f"dockerfile_rel_path: {dockerfile_rel_path}")
|
||||
|
||||
if has_custom_dockerfile:
|
||||
logger.info(f"Workflow '{name}' has custom Dockerfile - building custom image")
|
||||
# Decide whether to use registry or keep images local to host engine
|
||||
import os
|
||||
# Default to using the local registry; set FUZZFORGE_USE_REGISTRY=false to bypass (not recommended)
|
||||
use_registry = os.getenv("FUZZFORGE_USE_REGISTRY", "true").lower() == "true"
|
||||
|
||||
if use_registry:
|
||||
registry_url = get_registry_url(context="push")
|
||||
image_spec = DockerImage(
|
||||
name=f"{registry_url}/fuzzforge/{name}",
|
||||
tag="latest",
|
||||
dockerfile=str(dockerfile_rel_path),
|
||||
context=str(toolbox_path)
|
||||
)
|
||||
deploy_image = f"{registry_url}/fuzzforge/{name}:latest"
|
||||
build_custom = True
|
||||
push_custom = True
|
||||
logger.info(f"Using registry: {registry_url} for '{name}'")
|
||||
else:
|
||||
# Single-host mode: build into host engine cache; no push required
|
||||
image_spec = DockerImage(
|
||||
name=f"fuzzforge/{name}",
|
||||
tag="latest",
|
||||
dockerfile=str(dockerfile_rel_path),
|
||||
context=str(toolbox_path)
|
||||
)
|
||||
deploy_image = f"fuzzforge/{name}:latest"
|
||||
build_custom = True
|
||||
push_custom = False
|
||||
logger.info("Using single-host image (no registry push): %s", deploy_image)
|
||||
else:
|
||||
logger.info(f"Workflow '{name}' using base image - no custom dependencies needed")
|
||||
deploy_image = base_image
|
||||
build_custom = False
|
||||
push_custom = False
|
||||
|
||||
# Pre-validate registry connectivity when pushing
|
||||
if push_custom:
|
||||
try:
|
||||
from .setup import validate_registry_connectivity
|
||||
await validate_registry_connectivity(registry_url)
|
||||
logger.info(f"Registry connectivity validated for {registry_url}")
|
||||
except Exception as e:
|
||||
logger.error(f"Registry connectivity validation failed for {registry_url}: {e}")
|
||||
raise RuntimeError(f"Cannot deploy workflow '{name}': Registry {registry_url} is not accessible. {e}")
|
||||
|
||||
# Deploy the workflow
|
||||
try:
|
||||
# Ensure any previous deployment is removed so job variables are updated
|
||||
try:
|
||||
async with get_client() as client:
|
||||
existing = await client.read_deployment_by_name(
|
||||
f"{name}/{name}-deployment"
|
||||
)
|
||||
if existing:
|
||||
logger.info(f"Removing existing deployment for '{name}' to refresh settings...")
|
||||
await client.delete_deployment(existing.id)
|
||||
except Exception:
|
||||
# If not found or deletion fails, continue with deployment
|
||||
pass
|
||||
|
||||
# Extract resource requirements from metadata
|
||||
workflow_resource_requirements = self._extract_resource_requirements(info)
|
||||
logger.info(f"Workflow '{name}' resource requirements: {workflow_resource_requirements}")
|
||||
|
||||
# Build job variables with resource requirements
|
||||
job_variables = {
|
||||
"image": deploy_image, # Use the worker-accessible registry name
|
||||
"volumes": [], # Populated at run submission with toolbox mount
|
||||
"env": {
|
||||
"PYTHONPATH": "/opt/prefect/toolbox:/opt/prefect",
|
||||
"WORKFLOW_NAME": name
|
||||
}
|
||||
}
|
||||
|
||||
# Add resource requirements to job variables if present
|
||||
if workflow_resource_requirements:
|
||||
job_variables["resources"] = workflow_resource_requirements
|
||||
|
||||
# Prepare deployment parameters
|
||||
deploy_params = {
|
||||
"name": f"{name}-deployment",
|
||||
"work_pool_name": "docker-pool",
|
||||
"image": image_spec if has_custom_dockerfile else deploy_image,
|
||||
"push": push_custom,
|
||||
"build": build_custom,
|
||||
"job_variables": job_variables
|
||||
}
|
||||
|
||||
deployment = await flow_func.deploy(**deploy_params)
|
||||
|
||||
self.deployments[name] = str(deployment.id) if hasattr(deployment, 'id') else name
|
||||
logger.info(f"Successfully deployed workflow '{name}'")
|
||||
|
||||
except Exception as e:
|
||||
# Enhanced error reporting with more context
|
||||
import traceback
|
||||
            logger.error(f"Failed to deploy workflow '{name}': {e}")
            logger.error(f"Deployment traceback: {traceback.format_exc()}")

            # Try to capture Docker-specific context
            error_context = {
                "workflow_name": name,
                "has_dockerfile": has_custom_dockerfile,
                "image_name": deploy_image if 'deploy_image' in locals() else "unknown",
                "registry_url": registry_url if 'registry_url' in locals() else "unknown",
                "error_type": type(e).__name__,
                "error_message": str(e)
            }

            # Check for specific error patterns with detailed categorization
            error_msg_lower = str(e).lower()
            if "registry" in error_msg_lower and ("no such host" in error_msg_lower or "connection" in error_msg_lower):
                error_context["category"] = "registry_connectivity_error"
                error_context["solution"] = f"Cannot reach registry at {error_context['registry_url']}. Check Docker network and registry service."
            elif "docker" in error_msg_lower:
                error_context["category"] = "docker_error"
                if "build" in error_msg_lower:
                    error_context["subcategory"] = "image_build_failed"
                    error_context["solution"] = "Check Dockerfile syntax and dependencies."
                elif "pull" in error_msg_lower:
                    error_context["subcategory"] = "image_pull_failed"
                    error_context["solution"] = "Check if image exists in registry and network connectivity."
                elif "push" in error_msg_lower:
                    error_context["subcategory"] = "image_push_failed"
                    error_context["solution"] = f"Check registry connectivity and push permissions to {error_context['registry_url']}."
            elif "registry" in error_msg_lower:
                error_context["category"] = "registry_error"
                error_context["solution"] = "Check registry configuration and accessibility."
            elif "prefect" in error_msg_lower:
                error_context["category"] = "prefect_error"
                error_context["solution"] = "Check Prefect server connectivity and deployment configuration."
            else:
                error_context["category"] = "unknown_deployment_error"
                error_context["solution"] = "Check logs for more specific error details."

            logger.error(f"Deployment error context: {error_context}")

            # Raise enhanced exception with context
            enhanced_error = Exception(f"Deployment failed for workflow '{name}': {str(e)} | Context: {error_context}")
            enhanced_error.original_error = e
            enhanced_error.context = error_context
            raise enhanced_error
|
||||
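The error handler at the end of the deployment routine works as an ordered rule list in which the first matching substring decides the category. A minimal sketch of that ordering as a pure function follows. `classify_deployment_error` is a hypothetical helper name that does not appear in the codebase, and it returns only the category and subcategory, not the solution text.

```python
from typing import Optional, Tuple


def classify_deployment_error(message: str) -> Tuple[str, Optional[str]]:
    """Return (category, subcategory) using first-match-wins substring rules."""
    msg = message.lower()
    # Connectivity problems are checked first, so they win over the generic rules.
    if "registry" in msg and ("no such host" in msg or "connection" in msg):
        return "registry_connectivity_error", None
    if "docker" in msg:
        # Docker errors get an optional subcategory based on the failing step.
        for keyword, sub in (
            ("build", "image_build_failed"),
            ("pull", "image_pull_failed"),
            ("push", "image_push_failed"),
        ):
            if keyword in msg:
                return "docker_error", sub
        return "docker_error", None
    if "registry" in msg:
        return "registry_error", None
    if "prefect" in msg:
        return "prefect_error", None
    return "unknown_deployment_error", None
```

Because the rules are ordered, a message such as "docker pull: connection refused from registry" lands in `registry_connectivity_error` rather than `docker_error`.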
    async def submit_workflow(
        self,
        workflow_name: str,
        target_path: str,
        volume_mode: str = "ro",
        parameters: Dict[str, Any] = None,
        resource_limits: Dict[str, str] = None,
        additional_volumes: list = None,
        timeout: int = None
    ) -> FlowRun:
        """
        Submit a workflow for execution with volume mounting.

        Args:
            workflow_name: Name of the workflow to execute
            target_path: Host path to mount as volume
            volume_mode: Volume mount mode ("ro" for read-only, "rw" for read-write)
            parameters: Workflow-specific parameters
            resource_limits: CPU/memory limits for container
            additional_volumes: List of additional volume mounts
            timeout: Timeout in seconds

        Returns:
            FlowRun object with run information

        Raises:
            ValueError: If workflow not found or volume mode not supported
        """
        if workflow_name not in self.workflows:
            raise ValueError(f"Unknown workflow: {workflow_name}")

        # Validate volume mode
        workflow_info = self.workflows[workflow_name]
        supported_modes = workflow_info.metadata.get("supported_volume_modes", ["ro", "rw"])

        if volume_mode not in supported_modes:
            raise ValueError(
                f"Workflow '{workflow_name}' doesn't support volume mode '{volume_mode}'. "
                f"Supported modes: {supported_modes}"
            )

        # Validate target path with security checks
        self._validate_target_path(target_path)

        # Validate additional volumes if provided
        if additional_volumes:
            for volume in additional_volumes:
                self._validate_target_path(volume.host_path)

        async with get_client() as client:
            # Get the deployment, auto-redeploy once if missing
            try:
                deployment = await client.read_deployment_by_name(
                    f"{workflow_name}/{workflow_name}-deployment"
                )
            except Exception as e:
                import traceback
                logger.error(f"Failed to find deployment for workflow '{workflow_name}': {e}")
                logger.error(f"Deployment lookup traceback: {traceback.format_exc()}")

                # Attempt a one-time auto-deploy to recover from startup races
                try:
                    logger.info(f"Auto-deploying missing workflow '{workflow_name}' and retrying...")
                    await self._deploy_workflow(workflow_name, workflow_info)
                    deployment = await client.read_deployment_by_name(
                        f"{workflow_name}/{workflow_name}-deployment"
                    )
                except Exception as redeploy_exc:
                    # Enhanced error with context
                    error_context = {
                        "workflow_name": workflow_name,
                        "error_type": type(e).__name__,
                        "error_message": str(e),
                        "redeploy_error": str(redeploy_exc),
                        "available_deployments": list(self.deployments.keys()),
                    }
                    enhanced_error = ValueError(
                        f"Deployment not found and redeploy failed for workflow '{workflow_name}': {e} | Context: {error_context}"
                    )
                    enhanced_error.context = error_context
                    raise enhanced_error

            # Determine the Docker Compose network name and volume names
            # Hardcoded to 'fuzzforge' to avoid directory name dependencies
            import os
            compose_project = "fuzzforge"
            docker_network = "fuzzforge_default"

            # Build volume mounts
            # Add toolbox volume mount for workflow code access
            backend_toolbox_path = "/app/toolbox"  # Path in backend container

            # Hardcoded volume names
            prefect_storage_volume = "fuzzforge_prefect_storage"
            toolbox_code_volume = "fuzzforge_toolbox_code"

            volumes = [
                f"{target_path}:/workspace:{volume_mode}",
                f"{prefect_storage_volume}:/prefect-storage",  # Shared storage for results
                f"{toolbox_code_volume}:/opt/prefect/toolbox:ro"  # Mount workflow code
            ]

            # Add additional volumes if provided
            if additional_volumes:
                for volume in additional_volumes:
                    volume_spec = f"{volume.host_path}:{volume.container_path}:{volume.mode}"
                    volumes.append(volume_spec)

            # Build environment variables
            env_vars = {
                "PREFECT_API_URL": "http://prefect-server:4200/api",  # Use internal network hostname
                "PREFECT_LOGGING_LEVEL": "INFO",
                "PREFECT_LOCAL_STORAGE_PATH": "/prefect-storage",  # Use shared storage
                "PREFECT_RESULTS_PERSIST_BY_DEFAULT": "true",  # Enable result persistence
                "PREFECT_DEFAULT_RESULT_STORAGE_BLOCK": "local-file-system/fuzzforge-results",  # Use our storage block
                "WORKSPACE_PATH": "/workspace",
                "VOLUME_MODE": volume_mode,
                "WORKFLOW_NAME": workflow_name
            }

            # Add additional volume paths to environment for easy access
            if additional_volumes:
                for i, volume in enumerate(additional_volumes):
                    env_vars[f"ADDITIONAL_VOLUME_{i}_PATH"] = volume.container_path

            # Determine which image to use based on workflow configuration
            workflow_info = self.workflows[workflow_name]
            has_custom_dockerfile = workflow_info.has_docker and workflow_info.dockerfile.exists()
            # Use pull context for worker to pull from registry
            registry_url = get_registry_url(context="pull")
            workflow_image = f"{registry_url}/fuzzforge/{workflow_name}:latest" if has_custom_dockerfile else "prefecthq/prefect:3-python3.11"
            logger.debug(f"Worker will pull image: {workflow_image} (Registry: {registry_url})")

            # Configure job variables with volume mounting and network access
            job_variables = {
                # Use custom image if available, otherwise base Prefect image
                "image": workflow_image,
                "volumes": volumes,
                "networks": [docker_network],  # Connect to Docker Compose network
                "env": {
                    **env_vars,
                    "PYTHONPATH": "/opt/prefect/toolbox:/opt/prefect/toolbox/workflows",
                    "WORKFLOW_NAME": workflow_name
                }
            }

            # Apply resource requirements from workflow metadata and user overrides
            workflow_resource_requirements = self._extract_resource_requirements(workflow_info)
            final_resource_config = {}

            # Start with workflow requirements as base
            if workflow_resource_requirements:
                final_resource_config.update(workflow_resource_requirements)

            # Apply user-provided resource limits (overrides workflow defaults)
            if resource_limits:
                user_resource_config = {}
                if resource_limits.get("cpu_limit"):
                    user_resource_config["cpus"] = resource_limits["cpu_limit"]
                if resource_limits.get("memory_limit"):
                    user_resource_config["memory"] = resource_limits["memory_limit"]
                # Note: cpu_request and memory_request are not directly supported by Docker
                # but could be used for Kubernetes in the future

                # User overrides take precedence
                final_resource_config.update(user_resource_config)

            # Apply final resource configuration
            if final_resource_config:
                job_variables["resources"] = final_resource_config
                logger.info(f"Applied resource limits: {final_resource_config}")

            # Merge parameters with defaults from metadata
            default_params = workflow_info.metadata.get("default_parameters", {})
            final_params = {**default_params, **(parameters or {})}

            # Set flow parameters that match the flow signature
            final_params["target_path"] = "/workspace"  # Container path where volume is mounted
            final_params["volume_mode"] = volume_mode

            # Create and submit the flow run
            # Pass job_variables to ensure network, volumes, and environment are configured
            logger.info(f"Submitting flow with job_variables: {job_variables}")
            logger.info(f"Submitting flow with parameters: {final_params}")

            # Prepare flow run creation parameters
            flow_run_params = {
                "deployment_id": deployment.id,
                "parameters": final_params,
                "job_variables": job_variables
            }

            # Note: Timeout is handled through workflow-level configuration
            # Additional timeout configuration can be added to deployment metadata if needed

            flow_run = await client.create_flow_run_from_deployment(**flow_run_params)

            logger.info(
                f"Submitted workflow '{workflow_name}' with run_id: {flow_run.id}, "
                f"target: {target_path}, mode: {volume_mode}"
            )

            return flow_run
|
||||
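The resource handling in `submit_workflow` layers two sources: workflow-level requirements form the base, and user-supplied limits override them key by key. A minimal sketch of that precedence in isolation follows. `merge_resource_config` is a hypothetical helper name, and the sketch assumes the workflow requirements already use the `cpus` and `memory` keys, which the diff does not show.

```python
from typing import Dict, Optional


def merge_resource_config(
    workflow_defaults: Optional[Dict[str, str]],
    user_limits: Optional[Dict[str, str]],
) -> Dict[str, str]:
    """Merge workflow-level requirements with user overrides; user values win."""
    merged: Dict[str, str] = dict(workflow_defaults or {})
    # Map the API-facing keys onto the keys placed in job_variables["resources"].
    key_map = {"cpu_limit": "cpus", "memory_limit": "memory"}
    for source_key, target_key in key_map.items():
        value = (user_limits or {}).get(source_key)
        if value:  # empty or missing values leave the workflow default in place
            merged[target_key] = value
    return merged
```

For example, `merge_resource_config({"cpus": "1", "memory": "512m"}, {"cpu_limit": "2"})` keeps the workflow's memory value and replaces only the CPU value.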
    async def get_flow_run_findings(self, run_id: str) -> Dict[str, Any]:
        """
        Retrieve findings from a completed flow run.

        Args:
            run_id: The flow run ID

        Returns:
            Dictionary containing SARIF-formatted findings

        Raises:
            ValueError: If run not completed or not found
        """
        async with get_client() as client:
            flow_run = await client.read_flow_run(run_id)

            if not flow_run.state.is_completed():
                raise ValueError(
                    f"Flow run {run_id} not completed. Current status: {flow_run.state.name}"
                )

            # Get the findings from the flow run result
            try:
                findings = await flow_run.state.result()
                return findings
            except Exception as e:
                logger.error(f"Failed to retrieve findings for run {run_id}: {e}")
                raise ValueError(f"Failed to retrieve findings: {e}")
|
||||
    async def get_flow_run_status(self, run_id: str) -> Dict[str, Any]:
        """
        Get the current status of a flow run.

        Args:
            run_id: The flow run ID

        Returns:
            Dictionary with status information
        """
        async with get_client() as client:
            flow_run = await client.read_flow_run(run_id)

            return {
                "run_id": str(flow_run.id),
                "workflow": flow_run.deployment_id,
                "status": flow_run.state.name,
                "is_completed": flow_run.state.is_completed(),
                "is_failed": flow_run.state.is_failed(),
                "is_running": flow_run.state.is_running(),
                "created_at": flow_run.created,
                "updated_at": flow_run.updated
            }
|
||||
    def _validate_target_path(self, target_path: str) -> None:
        """
        Validate target path for security before mounting as volume.

        Args:
            target_path: Host path to validate

        Raises:
            ValueError: If path is not allowed for security reasons
        """
        target = Path(target_path)

        # Path must be absolute
        if not target.is_absolute():
            raise ValueError(f"Target path must be absolute: {target_path}")

        # Resolve path to handle symlinks and relative components
        try:
            resolved_path = target.resolve()
        except (OSError, RuntimeError) as e:
            raise ValueError(f"Cannot resolve target path: {target_path} - {e}")

        resolved_str = str(resolved_path)

        # Check against forbidden paths first (more restrictive)
        for forbidden in self.forbidden_paths:
            if resolved_str.startswith(forbidden):
                raise ValueError(
                    f"Access denied: Path '{target_path}' resolves to forbidden directory '{forbidden}'. "
                    f"This path contains sensitive system files and cannot be mounted."
                )

        # Check if path starts with any allowed base path
        path_allowed = False
        for allowed in self.allowed_base_paths:
            if resolved_str.startswith(allowed):
                path_allowed = True
                break

        if not path_allowed:
            allowed_list = ", ".join(self.allowed_base_paths)
            raise ValueError(
                f"Access denied: Path '{target_path}' is not in allowed directories. "
                f"Allowed base paths: {allowed_list}"
            )

        # Additional security checks
        if resolved_str == "/":
            raise ValueError("Cannot mount root filesystem")

        # Warn if path doesn't exist (but don't block - it might be created later)
        if not resolved_path.exists():
            logger.warning(f"Target path does not exist: {target_path}")

        logger.info(f"Path validation passed for: {target_path} -> {resolved_str}")
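`_validate_target_path` compares resolved paths with `str.startswith`, which is a character-level test: `/etc2/app` starts with `/etc` even though it is not inside `/etc`. A component-wise containment check avoids that class of false match. The sketch below is one possible alternative, not the project's implementation. `is_within` is a hypothetical helper name, and it assumes both arguments are already resolved, absolute POSIX paths.

```python
from pathlib import PurePosixPath


def is_within(path: str, base: str) -> bool:
    """True if `path` equals `base` or lies beneath it, compared component-wise."""
    candidate = PurePosixPath(path)
    root = PurePosixPath(base)
    return candidate == root or root in candidate.parents


# String prefix matching accepts a sibling directory; component matching does not.
prefix_match = "/etc2/app".startswith("/etc")
component_match = is_within("/etc2/app", "/etc")
```

The same helper could serve both the forbidden-path and the allowed-base-path checks, since each asks whether one directory contains another.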
+10
-367
@@ -1,5 +1,5 @@
|
||||
"""
|
||||
Setup utilities for Prefect infrastructure
|
||||
Setup utilities for FuzzForge infrastructure
|
||||
"""
|
||||
|
||||
# Copyright (c) 2025 FuzzingLabs
|
||||
@@ -14,364 +14,21 @@ Setup utilities for Prefect infrastructure
|
||||
# Additional attribution and requirements are provided in the NOTICE file.
|
||||
|
||||
import logging
|
||||
from prefect import get_client
|
||||
from prefect.client.schemas.actions import WorkPoolCreate
|
||||
from prefect.client.schemas.objects import WorkPool
|
||||
from .prefect_manager import get_registry_url
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
async def setup_docker_pool():
|
||||
"""
|
||||
Create or update the Docker work pool for container execution.
|
||||
|
||||
This work pool is configured to:
|
||||
- Connect to the local Docker daemon
|
||||
- Support volume mounting at runtime
|
||||
- Clean up containers after execution
|
||||
- Use bridge networking by default
|
||||
"""
|
||||
import os
|
||||
|
||||
async with get_client() as client:
|
||||
pool_name = "docker-pool"
|
||||
|
||||
# Add force recreation flag for debugging fresh install issues
|
||||
force_recreate = os.getenv('FORCE_RECREATE_WORK_POOL', 'false').lower() == 'true'
|
||||
debug_setup = os.getenv('DEBUG_WORK_POOL_SETUP', 'false').lower() == 'true'
|
||||
|
||||
if force_recreate:
|
||||
logger.warning(f"FORCE_RECREATE_WORK_POOL=true - Will recreate work pool regardless of existing configuration")
|
||||
if debug_setup:
|
||||
logger.warning(f"DEBUG_WORK_POOL_SETUP=true - Enhanced logging enabled")
|
||||
# Temporarily set logging level to DEBUG for this function
|
||||
original_level = logger.level
|
||||
logger.setLevel(logging.DEBUG)
|
||||
|
||||
try:
|
||||
# Check if pool already exists and supports custom images
|
||||
existing_pools = await client.read_work_pools()
|
||||
existing_pool = None
|
||||
for pool in existing_pools:
|
||||
if pool.name == pool_name:
|
||||
existing_pool = pool
|
||||
break
|
||||
|
||||
if existing_pool and not force_recreate:
|
||||
logger.info(f"Found existing work pool '{pool_name}' - validating configuration...")
|
||||
|
||||
# Check if the existing pool has the correct configuration
|
||||
base_template = existing_pool.base_job_template or {}
|
||||
logger.debug(f"Base template keys: {list(base_template.keys())}")
|
||||
|
||||
job_config = base_template.get("job_configuration", {})
|
||||
logger.debug(f"Job config keys: {list(job_config.keys())}")
|
||||
|
||||
image_config = job_config.get("image", "")
|
||||
has_image_variable = "{{ image }}" in str(image_config)
|
||||
logger.debug(f"Image config: '{image_config}' -> has_image_variable: {has_image_variable}")
|
||||
|
||||
# Check if volume defaults include toolbox mount
|
||||
variables = base_template.get("variables", {})
|
||||
properties = variables.get("properties", {})
|
||||
volume_config = properties.get("volumes", {})
|
||||
volume_defaults = volume_config.get("default", [])
|
||||
has_toolbox_volume = any("toolbox_code" in str(vol) for vol in volume_defaults) if volume_defaults else False
|
||||
logger.debug(f"Volume defaults: {volume_defaults}")
|
||||
logger.debug(f"Has toolbox volume: {has_toolbox_volume}")
|
||||
|
||||
# Check if environment defaults include required settings
|
||||
env_config = properties.get("env", {})
|
||||
env_defaults = env_config.get("default", {})
|
||||
has_api_url = "PREFECT_API_URL" in env_defaults
|
||||
has_storage_path = "PREFECT_LOCAL_STORAGE_PATH" in env_defaults
|
||||
has_results_persist = "PREFECT_RESULTS_PERSIST_BY_DEFAULT" in env_defaults
|
||||
has_required_env = has_api_url and has_storage_path and has_results_persist
|
||||
logger.debug(f"Environment defaults: {env_defaults}")
|
||||
logger.debug(f"Has API URL: {has_api_url}, Has storage path: {has_storage_path}, Has results persist: {has_results_persist}")
|
||||
logger.debug(f"Has required env: {has_required_env}")
|
||||
|
||||
# Log the full validation result
|
||||
logger.info(f"Work pool validation - Image: {has_image_variable}, Toolbox: {has_toolbox_volume}, Environment: {has_required_env}")
|
||||
|
||||
if has_image_variable and has_toolbox_volume and has_required_env:
|
||||
logger.info(f"Docker work pool '{pool_name}' already exists with correct configuration")
|
||||
return
|
||||
else:
|
||||
reasons = []
|
||||
if not has_image_variable:
|
||||
reasons.append("missing image template")
|
||||
if not has_toolbox_volume:
|
||||
reasons.append("missing toolbox volume mount")
|
||||
if not has_required_env:
|
||||
if not has_api_url:
|
||||
reasons.append("missing PREFECT_API_URL")
|
||||
if not has_storage_path:
|
||||
reasons.append("missing PREFECT_LOCAL_STORAGE_PATH")
|
||||
if not has_results_persist:
|
||||
reasons.append("missing PREFECT_RESULTS_PERSIST_BY_DEFAULT")
|
||||
|
||||
logger.warning(f"Docker work pool '{pool_name}' exists but lacks: {', '.join(reasons)}. Recreating...")
|
||||
# Delete the old pool and recreate it
|
||||
try:
|
||||
await client.delete_work_pool(pool_name)
|
||||
logger.info(f"Deleted old work pool '{pool_name}'")
|
||||
except Exception as e:
|
||||
logger.warning(f"Failed to delete old work pool: {e}")
|
||||
elif force_recreate and existing_pool:
|
||||
logger.warning(f"Force recreation enabled - deleting existing work pool '{pool_name}'")
|
||||
try:
|
||||
await client.delete_work_pool(pool_name)
|
||||
logger.info(f"Deleted existing work pool for force recreation")
|
||||
except Exception as e:
|
||||
logger.warning(f"Failed to delete work pool for force recreation: {e}")
|
||||
|
||||
logger.info(f"Creating Docker work pool '{pool_name}' with custom image support...")
|
||||
|
||||
# Create the work pool with proper Docker configuration
|
||||
work_pool = WorkPoolCreate(
|
||||
name=pool_name,
|
||||
type="docker",
|
||||
description="Docker work pool for FuzzForge workflows with custom image support",
|
||||
base_job_template={
|
||||
"job_configuration": {
|
||||
"image": "{{ image }}", # Template variable for custom images
|
||||
"volumes": "{{ volumes }}", # List of volume mounts
|
||||
"env": "{{ env }}", # Environment variables
|
||||
"networks": "{{ networks }}", # Docker networks
|
||||
"stream_output": True,
|
||||
"auto_remove": True,
|
||||
"privileged": False,
|
||||
"network_mode": None, # Use networks instead
|
||||
"labels": {},
|
||||
"command": None # Let the image's CMD/ENTRYPOINT run
|
||||
},
|
||||
"variables": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"image": {
|
||||
"type": "string",
|
||||
"title": "Docker Image",
|
||||
"default": "prefecthq/prefect:3-python3.11",
|
||||
"description": "Docker image for the flow run"
|
||||
},
|
||||
"volumes": {
|
||||
"type": "array",
|
||||
"title": "Volume Mounts",
|
||||
"default": [
|
||||
"fuzzforge_prefect_storage:/prefect-storage",
|
||||
"fuzzforge_toolbox_code:/opt/prefect/toolbox:ro"
|
||||
],
|
||||
"description": "Volume mounts in format 'host:container:mode'",
|
||||
"items": {
|
||||
"type": "string"
|
||||
}
|
||||
},
|
||||
"networks": {
|
||||
"type": "array",
|
||||
"title": "Docker Networks",
|
||||
"default": ["fuzzforge_default"],
|
||||
"description": "Docker networks to connect container to",
|
||||
"items": {
|
||||
"type": "string"
|
||||
}
|
||||
},
|
||||
"env": {
|
||||
"type": "object",
|
||||
"title": "Environment Variables",
|
||||
"default": {
|
||||
"PREFECT_API_URL": "http://prefect-server:4200/api",
|
||||
"PREFECT_LOCAL_STORAGE_PATH": "/prefect-storage",
|
||||
"PREFECT_RESULTS_PERSIST_BY_DEFAULT": "true"
|
||||
},
|
||||
"description": "Environment variables for the container",
|
||||
"additionalProperties": {
|
||||
"type": "string"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
)
|
||||
|
||||
await client.create_work_pool(work_pool)
|
||||
logger.info(f"Created Docker work pool '{pool_name}'")
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to setup Docker work pool: {e}")
|
||||
raise
|
||||
finally:
|
||||
# Restore original logging level if debug mode was enabled
|
||||
if debug_setup and 'original_level' in locals():
|
||||
logger.setLevel(original_level)
|
||||
|
||||
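`setup_docker_pool` decides whether to recreate an existing work pool by inspecting three parts of its base job template: the image placeholder, the default volume list, and the default environment. A minimal sketch of that check as a pure function over a plain dictionary follows. `missing_pool_features` and `REQUIRED_ENV` are hypothetical names introduced for illustration, and no Prefect client is involved.

```python
from typing import Any, Dict, List

# Environment defaults the validation expects to find in the template.
REQUIRED_ENV = (
    "PREFECT_API_URL",
    "PREFECT_LOCAL_STORAGE_PATH",
    "PREFECT_RESULTS_PERSIST_BY_DEFAULT",
)


def missing_pool_features(base_template: Dict[str, Any]) -> List[str]:
    """List the reasons an existing work pool template would be recreated."""
    reasons: List[str] = []

    # The image must be a template variable so custom images can be injected.
    job_config = base_template.get("job_configuration", {})
    if "{{ image }}" not in str(job_config.get("image", "")):
        reasons.append("missing image template")

    # The default volumes must include the toolbox code mount.
    properties = base_template.get("variables", {}).get("properties", {})
    volume_defaults = properties.get("volumes", {}).get("default", [])
    if not any("toolbox_code" in str(vol) for vol in volume_defaults):
        reasons.append("missing toolbox volume mount")

    # Every required environment default must be present.
    env_defaults = properties.get("env", {}).get("default", {})
    reasons.extend(f"missing {name}" for name in REQUIRED_ENV if name not in env_defaults)
    return reasons
```

An empty list means the pool can be kept; any entry corresponds to one of the reasons logged before the pool is deleted and recreated.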
|
||||
def get_actual_compose_project_name():
    """
    Return the hardcoded compose project name for FuzzForge.

    Always returns 'fuzzforge' as per system requirements.
    """
    logger.info("Using hardcoded compose project name: fuzzforge")
    return "fuzzforge"
|
||||
|
||||
async def setup_result_storage():
|
||||
"""
|
||||
Create or update Prefect result storage block for findings persistence.
|
||||
Setup result storage (MinIO).
|
||||
|
||||
This sets up a LocalFileSystem storage block pointing to the shared
|
||||
/prefect-storage volume for result persistence.
|
||||
MinIO is used for both target upload and result storage.
|
||||
This is a placeholder for any MinIO-specific setup if needed.
|
||||
"""
|
||||
from prefect.filesystems import LocalFileSystem
|
||||
|
||||
storage_name = "fuzzforge-results"
|
||||
|
||||
try:
|
||||
# Create the storage block, overwrite if it exists
|
||||
logger.info(f"Setting up storage block '{storage_name}'...")
|
||||
storage = LocalFileSystem(basepath="/prefect-storage")
|
||||
|
||||
block_doc_id = await storage.save(name=storage_name, overwrite=True)
|
||||
logger.info(f"Storage block '{storage_name}' configured successfully")
|
||||
return str(block_doc_id)
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to setup result storage: {e}")
|
||||
# Don't raise the exception - continue without storage block
|
||||
logger.warning("Continuing without result storage block - findings may not persist")
|
||||
return None
|
||||
|
||||
|
||||
async def validate_docker_connection():
    """
    Validate that Docker is accessible and running.

    Note: In containerized deployments with Docker socket proxy,
    the backend doesn't need direct Docker access.

    Raises:
        RuntimeError: If Docker is not accessible
    """
    import os

    # Skip Docker validation if running in container without socket access
    if os.path.exists("/.dockerenv") and not os.path.exists("/var/run/docker.sock"):
        logger.info("Running in container without Docker socket - skipping Docker validation")
        return

    try:
        import docker
        client = docker.from_env()
        client.ping()
        logger.info("Docker connection validated")
    except Exception as e:
        logger.error(f"Docker is not accessible: {e}")
        raise RuntimeError(
            "Docker is not running or not accessible. "
            "Please ensure Docker is installed and running."
        )
|
||||
|
||||
async def validate_registry_connectivity(registry_url: str = None):
    """
    Validate that the Docker registry is accessible.

    Args:
        registry_url: URL of the Docker registry to validate (auto-detected if None)

    Raises:
        RuntimeError: If registry is not accessible
    """
    # Resolve a reachable test URL from within this process
    if registry_url is None:
        # If not specified, prefer internal service name in containers, host port on host
        import os
        if os.path.exists('/.dockerenv'):
            registry_url = "registry:5000"
        else:
            registry_url = "localhost:5001"

    # If we're running inside a container and asked to probe localhost:PORT,
    # the probe would hit the container, not the host. Use host.docker.internal instead.
    import os
    try:
        host_part, port_part = registry_url.split(":", 1)
    except ValueError:
        host_part, port_part = registry_url, "80"

    if os.path.exists('/.dockerenv') and host_part in ("localhost", "127.0.0.1"):
        test_host = "host.docker.internal"
    else:
        test_host = host_part
    test_url = f"http://{test_host}:{port_part}/v2/"

    import aiohttp
    import asyncio

    logger.info(f"Validating registry connectivity to {registry_url}...")

    try:
        async with aiohttp.ClientSession(timeout=aiohttp.ClientTimeout(total=10)) as session:
            async with session.get(test_url) as response:
                if response.status == 200:
                    logger.info(f"Registry at {registry_url} is accessible (tested via {test_host})")
                    return
                else:
                    raise RuntimeError(f"Registry returned status {response.status}")
    except asyncio.TimeoutError:
        raise RuntimeError(f"Registry at {registry_url} is not responding (timeout)")
    except aiohttp.ClientError as e:
        raise RuntimeError(f"Registry at {registry_url} is not accessible: {e}")
    except Exception as e:
        raise RuntimeError(f"Failed to validate registry connectivity: {e}")
|
||||
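Before probing, `validate_registry_connectivity` splits the registry address into host and port, defaults the port to 80, and swaps a loopback host for `host.docker.internal` when running inside a container. A minimal sketch of that URL derivation as a pure function follows. `resolve_probe_url` is a hypothetical helper name, and the `in_container` flag stands in for the `/.dockerenv` existence check so the logic can be exercised without touching the filesystem.

```python
def resolve_probe_url(registry_url: str, in_container: bool) -> str:
    """Build the /v2/ URL to probe, rewriting loopback hosts inside a container."""
    host, sep, port = registry_url.partition(":")
    if not sep:
        port = "80"  # no explicit port in the address
    # Inside a container, localhost refers to the container itself, not the host.
    if in_container and host in ("localhost", "127.0.0.1"):
        host = "host.docker.internal"
    return f"http://{host}:{port}/v2/"
```

Whether `host.docker.internal` resolves depends on the Docker setup, so the rewritten URL is only as reachable as that hostname is in the running environment.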
|
||||
async def validate_docker_network(network_name: str):
    """
    Validate that the specified Docker network exists.

    Args:
        network_name: Name of the Docker network to validate

    Raises:
        RuntimeError: If network doesn't exist
    """
    import os

    # Skip network validation if running in container without Docker socket
    if os.path.exists("/.dockerenv") and not os.path.exists("/var/run/docker.sock"):
        logger.info("Running in container without Docker socket - skipping network validation")
        return

    try:
        import docker
        client = docker.from_env()

        # List all networks
        networks = client.networks.list(names=[network_name])

        if not networks:
            # Try to find networks with similar names
            all_networks = client.networks.list()
            similar_networks = [n.name for n in all_networks if "fuzzforge" in n.name.lower()]

            error_msg = f"Docker network '{network_name}' not found."
            if similar_networks:
                error_msg += f" Available networks: {similar_networks}"
            else:
                error_msg += " Please ensure Docker Compose is running."

            raise RuntimeError(error_msg)

        logger.info(f"Docker network '{network_name}' validated")

    except Exception as e:
        if isinstance(e, RuntimeError):
            raise
        logger.error(f"Network validation failed: {e}")
        raise RuntimeError(f"Failed to validate Docker network: {e}")
logger.info("Result storage (MinIO) configured")
|
||||
# MinIO is configured via environment variables in docker-compose
|
||||
# No additional setup needed here
|
||||
return True
|
||||
|
||||
|
||||
async def validate_infrastructure():
|
||||
@@ -382,21 +39,7 @@ async def validate_infrastructure():
|
||||
"""
|
||||
logger.info("Validating infrastructure...")
|
||||
|
||||
# Validate Docker connection
|
||||
await validate_docker_connection()
|
||||
|
||||
# Validate registry connectivity for custom image building
|
||||
await validate_registry_connectivity()
|
||||
|
||||
# Validate network (hardcoded to avoid directory name dependencies)
|
||||
import os
|
||||
compose_project = "fuzzforge"
|
||||
docker_network = "fuzzforge_default"
|
||||
|
||||
try:
|
||||
await validate_docker_network(docker_network)
|
||||
except RuntimeError as e:
|
||||
logger.warning(f"Network validation failed: {e}")
|
||||
logger.warning("Workflows may not be able to connect to Prefect services")
|
||||
# Setup storage (MinIO)
|
||||
await setup_result_storage()
|
||||
|
||||
logger.info("Infrastructure validation completed")
|
||||
|
||||
@@ -1,459 +0,0 @@
|
||||
"""
|
||||
Workflow Discovery - Registry-based discovery and loading of workflows
|
||||
"""
|
||||
|
||||
# Copyright (c) 2025 FuzzingLabs
|
||||
#
|
||||
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
|
||||
# at the root of this repository for details.
|
||||
#
|
||||
# After the Change Date (four years from publication), this version of the
|
||||
# Licensed Work will be made available under the Apache License, Version 2.0.
|
||||
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Additional attribution and requirements are provided in the NOTICE file.
|
||||
|
||||
import logging
|
||||
import yaml
|
||||
from pathlib import Path
|
||||
from typing import Dict, Optional, Any, Callable
|
||||
from pydantic import BaseModel, Field, ConfigDict
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class WorkflowInfo(BaseModel):
|
||||
"""Information about a discovered workflow"""
|
||||
name: str = Field(..., description="Workflow name")
|
||||
path: Path = Field(..., description="Path to workflow directory")
|
||||
workflow_file: Path = Field(..., description="Path to workflow.py file")
|
||||
dockerfile: Path = Field(..., description="Path to Dockerfile")
|
||||
has_docker: bool = Field(..., description="Whether workflow has custom Dockerfile")
|
||||
metadata: Dict[str, Any] = Field(..., description="Workflow metadata from YAML")
|
||||
flow_function_name: str = Field(default="main_flow", description="Name of the flow function")
|
||||
|
||||
model_config = ConfigDict(arbitrary_types_allowed=True)
|
||||
|
||||
|
||||
class WorkflowDiscovery:
|
||||
"""
|
||||
Discovers workflows from the filesystem and validates them against the registry.
|
||||
|
||||
This system:
|
||||
1. Scans for workflows with metadata.yaml files
|
||||
2. Cross-references them with the manual registry
|
||||
3. Provides registry-based flow functions for deployment
|
||||
|
||||
Workflows must have:
|
||||
- workflow.py: Contains the Prefect flow
|
||||
- metadata.yaml: Mandatory metadata file
|
||||
- Entry in toolbox/workflows/registry.py: Manual registration
|
||||
- Dockerfile (optional): Custom container definition
|
||||
- requirements.txt (optional): Python dependencies
|
||||
"""
|
||||
|
||||
def __init__(self, workflows_dir: Path):
|
||||
"""
|
||||
Initialize workflow discovery.
|
||||
|
||||
Args:
|
||||
workflows_dir: Path to the workflows directory
|
||||
"""
|
||||
self.workflows_dir = workflows_dir
|
||||
if not self.workflows_dir.exists():
|
||||
self.workflows_dir.mkdir(parents=True, exist_ok=True)
|
||||
logger.info(f"Created workflows directory: {self.workflows_dir}")
|
||||
|
||||
# Import registry - this validates it on import
|
||||
try:
|
||||
from toolbox.workflows.registry import WORKFLOW_REGISTRY, list_registered_workflows
|
||||
self.registry = WORKFLOW_REGISTRY
|
||||
logger.info(f"Loaded workflow registry with {len(self.registry)} registered workflows")
|
||||
except ImportError as e:
|
||||
logger.error(f"Failed to import workflow registry: {e}")
|
||||
self.registry = {}
|
||||
except Exception as e:
|
||||
logger.error(f"Registry validation failed: {e}")
|
||||
self.registry = {}
|
||||
|
||||
# Cache for discovered workflows
|
||||
self._workflow_cache: Optional[Dict[str, WorkflowInfo]] = None
|
||||
self._cache_timestamp: Optional[float] = None
|
||||
self._cache_ttl = 60.0 # Cache TTL in seconds
|
||||
|
||||
    async def discover_workflows(self) -> Dict[str, WorkflowInfo]:
        """
        Discover workflows by cross-referencing filesystem with registry.
        Uses caching to avoid frequent filesystem scans.

        Returns:
            Dictionary mapping workflow names to their information
        """
        # Check cache validity
        import time
        current_time = time.time()

        if (self._workflow_cache is not None and
                self._cache_timestamp is not None and
                (current_time - self._cache_timestamp) < self._cache_ttl):
            # Return cached results
            logger.debug(f"Returning cached workflow discovery ({len(self._workflow_cache)} workflows)")
            return self._workflow_cache

        workflows = {}
        discovered_dirs = set()
        registry_names = set(self.registry.keys())

        if not self.workflows_dir.exists():
            logger.warning(f"Workflows directory does not exist: {self.workflows_dir}")
            return workflows

        # Recursively scan all directories and subdirectories
        await self._scan_directory_recursive(self.workflows_dir, workflows, discovered_dirs)

        # Check for registry entries without corresponding directories
        missing_dirs = registry_names - discovered_dirs
        if missing_dirs:
            logger.warning(
                f"Registry contains workflows without filesystem directories: {missing_dirs}. "
                f"These workflows cannot be deployed."
            )

        logger.info(
            f"Discovery complete: {len(workflows)} workflows ready for deployment, "
            f"{len(missing_dirs)} registry entries missing directories, "
            f"{len(discovered_dirs - registry_names)} filesystem workflows not registered"
        )

        # Update cache
        self._workflow_cache = workflows
        self._cache_timestamp = current_time

        return workflows
|
||||
async def _scan_directory_recursive(self, directory: Path, workflows: Dict[str, WorkflowInfo], discovered_dirs: set):
|
||||
"""
|
||||
Recursively scan directory for workflows.
|
||||
|
||||
Args:
|
||||
directory: Directory to scan
|
||||
workflows: Dictionary to populate with discovered workflows
|
||||
discovered_dirs: Set to track discovered workflow names
|
||||
"""
|
||||
for item in directory.iterdir():
|
||||
if not item.is_dir():
|
||||
continue
|
||||
|
||||
if item.name.startswith('_') or item.name.startswith('.'):
|
||||
continue # Skip hidden or private directories
|
||||
|
||||
# Check if this directory contains workflow files (workflow.py and metadata.yaml)
|
||||
workflow_file = item / "workflow.py"
|
||||
metadata_file = item / "metadata.yaml"
|
||||
|
||||
if workflow_file.exists() and metadata_file.exists():
|
||||
# This is a workflow directory
|
||||
workflow_name = item.name
|
||||
discovered_dirs.add(workflow_name)
|
||||
|
||||
# Only process workflows that are in the registry
|
||||
if workflow_name not in self.registry:
|
||||
logger.warning(
|
||||
f"Workflow '{workflow_name}' found in filesystem but not in registry. "
|
||||
f"Add it to toolbox/workflows/registry.py to enable deployment."
|
||||
)
|
||||
continue
|
||||
|
||||
try:
|
||||
workflow_info = await self._load_workflow(item)
|
||||
if workflow_info:
|
||||
workflows[workflow_info.name] = workflow_info
|
||||
logger.info(f"Discovered and registered workflow: {workflow_info.name}")
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to load workflow from {item}: {e}")
|
||||
else:
|
||||
# This is a category directory, recurse into it
|
||||
await self._scan_directory_recursive(item, workflows, discovered_dirs)
|
||||
|
||||
    async def _load_workflow(self, workflow_dir: Path) -> Optional[WorkflowInfo]:
        """
        Load and validate a single workflow.

        Args:
            workflow_dir: Path to the workflow directory

        Returns:
            WorkflowInfo if valid, None otherwise
        """
        workflow_name = workflow_dir.name

        # Check for mandatory files
        workflow_file = workflow_dir / "workflow.py"
        metadata_file = workflow_dir / "metadata.yaml"

        if not workflow_file.exists():
            logger.warning(f"Workflow {workflow_name} missing workflow.py")
            return None

        if not metadata_file.exists():
            logger.error(f"Workflow {workflow_name} missing mandatory metadata.yaml")
            return None

        # Load and validate metadata
        try:
            metadata = self._load_metadata(metadata_file)
            if not self._validate_metadata(metadata, workflow_name):
                return None
        except Exception as e:
            logger.error(f"Failed to load metadata for {workflow_name}: {e}")
            return None

        # Check for mandatory Dockerfile
        dockerfile = workflow_dir / "Dockerfile"
        if not dockerfile.exists():
            logger.error(f"Workflow {workflow_name} missing mandatory Dockerfile")
            return None

        has_docker = True  # Always True since Dockerfile is mandatory

        # Get flow function name from metadata or use default
        flow_function_name = metadata.get("flow_function", "main_flow")

        return WorkflowInfo(
            name=workflow_name,
            path=workflow_dir,
            workflow_file=workflow_file,
            dockerfile=dockerfile,
            has_docker=has_docker,
            metadata=metadata,
            flow_function_name=flow_function_name
        )

    def _load_metadata(self, metadata_file: Path) -> Dict[str, Any]:
        """
        Load metadata from YAML file.

        Args:
            metadata_file: Path to metadata.yaml

        Returns:
            Dictionary containing metadata
        """
        with open(metadata_file, 'r') as f:
            metadata = yaml.safe_load(f)

        if metadata is None:
            raise ValueError("Empty metadata file")

        return metadata

    def _validate_metadata(self, metadata: Dict[str, Any], workflow_name: str) -> bool:
        """
        Validate that metadata contains all required fields.

        Args:
            metadata: Metadata dictionary
            workflow_name: Name of the workflow for logging

        Returns:
            True if valid, False otherwise
        """
        required_fields = ["name", "version", "description", "author", "category", "parameters", "requirements"]

        missing_fields = []
        for field in required_fields:
            if field not in metadata:
                missing_fields.append(field)

        if missing_fields:
            logger.error(
                f"Workflow {workflow_name} metadata missing required fields: {missing_fields}"
            )
            return False

        # Validate version format (semantic versioning)
        version = metadata.get("version", "")
        if not self._is_valid_version(version):
            logger.error(f"Workflow {workflow_name} has invalid version format: {version}")
            return False

        # Validate parameters structure
        parameters = metadata.get("parameters", {})
        if not isinstance(parameters, dict):
            logger.error(f"Workflow {workflow_name} parameters must be a dictionary")
            return False

        return True

    def _is_valid_version(self, version: str) -> bool:
        """
        Check if version follows semantic versioning (x.y.z).

        Args:
            version: Version string

        Returns:
            True if valid semantic version
        """
        try:
            parts = version.split('.')
            if len(parts) != 3:
                return False
            for part in parts:
                int(part)  # Check if each part is a number
            return True
        except (ValueError, AttributeError):
            return False

    def invalidate_cache(self) -> None:
        """
        Invalidate the workflow discovery cache.
        Useful when workflows are added or modified.
        """
        self._workflow_cache = None
        self._cache_timestamp = None
        logger.debug("Workflow discovery cache invalidated")

    def get_flow_function(self, workflow_name: str) -> Optional[Callable]:
        """
        Get the flow function from the registry.

        Args:
            workflow_name: Name of the workflow

        Returns:
            The flow function if found in registry, None otherwise
        """
        if workflow_name not in self.registry:
            logger.error(
                f"Workflow '{workflow_name}' not found in registry. "
                f"Available workflows: {list(self.registry.keys())}"
            )
            return None

        try:
            from toolbox.workflows.registry import get_workflow_flow
            flow_func = get_workflow_flow(workflow_name)
            logger.debug(f"Retrieved flow function for '{workflow_name}' from registry")
            return flow_func
        except Exception as e:
            logger.error(f"Failed to get flow function for '{workflow_name}': {e}")
            return None

    def get_registry_info(self, workflow_name: str) -> Optional[Dict[str, Any]]:
        """
        Get registry information for a workflow.

        Args:
            workflow_name: Name of the workflow

        Returns:
            Registry information if found, None otherwise
        """
        if workflow_name not in self.registry:
            return None

        try:
            from toolbox.workflows.registry import get_workflow_info
            return get_workflow_info(workflow_name)
        except Exception as e:
            logger.error(f"Failed to get registry info for '{workflow_name}': {e}")
            return None

    @staticmethod
    def get_metadata_schema() -> Dict[str, Any]:
        """
        Get the JSON schema for workflow metadata.

        Returns:
            JSON schema dictionary
        """
        return {
            "type": "object",
            "required": ["name", "version", "description", "author", "category", "parameters", "requirements"],
            "properties": {
                "name": {
                    "type": "string",
                    "description": "Workflow name"
                },
                "version": {
                    "type": "string",
                    "pattern": "^\\d+\\.\\d+\\.\\d+$",
                    "description": "Semantic version (x.y.z)"
                },
                "description": {
                    "type": "string",
                    "description": "Workflow description"
                },
                "author": {
                    "type": "string",
                    "description": "Workflow author"
                },
                "category": {
                    "type": "string",
                    "enum": ["comprehensive", "specialized", "fuzzing", "focused"],
                    "description": "Workflow category"
                },
                "tags": {
                    "type": "array",
                    "items": {"type": "string"},
                    "description": "Workflow tags for categorization"
                },
                "requirements": {
                    "type": "object",
                    "required": ["tools", "resources"],
                    "properties": {
                        "tools": {
                            "type": "array",
                            "items": {"type": "string"},
                            "description": "Required security tools"
                        },
                        "resources": {
                            "type": "object",
                            "required": ["memory", "cpu", "timeout"],
                            "properties": {
                                "memory": {
                                    "type": "string",
                                    "pattern": "^\\d+[GMK]i$",
                                    "description": "Memory limit (e.g., 1Gi, 512Mi)"
                                },
                                "cpu": {
                                    "type": "string",
                                    "pattern": "^\\d+m?$",
                                    "description": "CPU limit (e.g., 1000m, 2)"
                                },
                                "timeout": {
                                    "type": "integer",
                                    "minimum": 60,
                                    "maximum": 7200,
                                    "description": "Workflow timeout in seconds"
                                }
                            }
                        }
                    }
                },
                "parameters": {
                    "type": "object",
                    "description": "Workflow parameters schema"
                },
                "default_parameters": {
                    "type": "object",
                    "description": "Default parameter values"
                },
                "required_modules": {
                    "type": "array",
                    "items": {"type": "string"},
                    "description": "Required module names"
                },
                "supported_volume_modes": {
                    "type": "array",
                    "items": {"enum": ["ro", "rw"]},
                    "default": ["ro", "rw"],
                    "description": "Supported volume mount modes"
                },
                "flow_function": {
                    "type": "string",
                    "default": "main_flow",
                    "description": "Name of the flow function in workflow.py"
                }
            }
        }
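Review note: `_is_valid_version` above accepts any three dot-separated parts that `int()` can parse, while the schema's `version` property uses the pattern `^\d+\.\d+\.\d+$`. The two checks are not equivalent, because `int()` also accepts signs, surrounding whitespace and underscores. The sketch below illustrates the gap. `is_int_version` mirrors the method shown above; `is_schema_version` is a hypothetical stricter variant and not part of the repository.

```python
import re

# Equivalent to the schema pattern ^\d+\.\d+\.\d+$ for ASCII digits;
# fullmatch() is used so a trailing newline is rejected as well.
_VERSION_RE = re.compile(r"\d+\.\d+\.\d+", re.ASCII)


def is_int_version(version: str) -> bool:
    """Mirror of the int()-based check: three parts that int() can parse."""
    try:
        parts = version.split(".")
        if len(parts) != 3:
            return False
        for part in parts:
            int(part)
        return True
    except (ValueError, AttributeError):
        return False


def is_schema_version(version: object) -> bool:
    """Hypothetical stricter check: a string that fully matches x.y.z."""
    return isinstance(version, str) and _VERSION_RE.fullmatch(version) is not None


if __name__ == "__main__":
    for candidate in ["1.2.3", "1.-2.3", " 1.2.3", "1_0.2.3", "1.2"]:
        print(repr(candidate), is_int_version(candidate), is_schema_version(candidate))
```

Under this sketch, inputs such as `"1.-2.3"` or `"1_0.2.3"` pass the `int()`-based check but fail the pattern-based one.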
+171
-310
@@ -12,7 +12,6 @@
 import asyncio
 import logging
 import os
-from uuid import UUID
 from contextlib import AsyncExitStack, asynccontextmanager, suppress
 from typing import Any, Dict, Optional, List

@@ -23,31 +22,20 @@ from starlette.routing import Mount

 from fastmcp.server.http import create_sse_app

-from src.core.prefect_manager import PrefectManager
-from src.core.setup import setup_docker_pool, setup_result_storage, validate_infrastructure
-from src.core.workflow_discovery import WorkflowDiscovery
+from src.temporal.manager import TemporalManager
+from src.core.setup import setup_result_storage, validate_infrastructure
 from src.api import workflows, runs, fuzzing
-from src.services.prefect_stats_monitor import prefect_stats_monitor

 from fastmcp import FastMCP
-from prefect.client.orchestration import get_client
-from prefect.client.schemas.filters import (
-    FlowRunFilter,
-    FlowRunFilterDeploymentId,
-    FlowRunFilterState,
-    FlowRunFilterStateType,
-)
-from prefect.client.schemas.sorting import FlowRunSort
-from prefect.states import StateType

 logging.basicConfig(level=logging.INFO)
 logger = logging.getLogger(__name__)

-prefect_mgr = PrefectManager()
+temporal_mgr = TemporalManager()


-class PrefectBootstrapState:
-    """Tracks Prefect initialization progress for API and MCP consumers."""
+class TemporalBootstrapState:
+    """Tracks Temporal initialization progress for API and MCP consumers."""

     def __init__(self) -> None:
         self.ready: bool = False
@@ -64,19 +52,19 @@ class PrefectBootstrapState:
         }


-prefect_bootstrap_state = PrefectBootstrapState()
+temporal_bootstrap_state = TemporalBootstrapState()

-# Configure retry strategy for bootstrapping Prefect + infrastructure
+# Configure retry strategy for bootstrapping Temporal + infrastructure
 STARTUP_RETRY_SECONDS = max(1, int(os.getenv("FUZZFORGE_STARTUP_RETRY_SECONDS", "5")))
 STARTUP_RETRY_MAX_SECONDS = max(
     STARTUP_RETRY_SECONDS,
     int(os.getenv("FUZZFORGE_STARTUP_RETRY_MAX_SECONDS", "60")),
 )

-prefect_bootstrap_task: Optional[asyncio.Task] = None
+temporal_bootstrap_task: Optional[asyncio.Task] = None

 # ---------------------------------------------------------------------------
-# FastAPI application (REST API remains unchanged)
+# FastAPI application (REST API)
 # ---------------------------------------------------------------------------

 app = FastAPI(
@@ -90,20 +78,19 @@ app.include_router(runs.router)
 app.include_router(fuzzing.router)


-def get_prefect_status() -> Dict[str, Any]:
-    """Return a snapshot of Prefect bootstrap state for diagnostics."""
-    status = prefect_bootstrap_state.as_dict()
-    status["workflows_loaded"] = len(prefect_mgr.workflows)
-    status["deployments_tracked"] = len(prefect_mgr.deployments)
+def get_temporal_status() -> Dict[str, Any]:
+    """Return a snapshot of Temporal bootstrap state for diagnostics."""
+    status = temporal_bootstrap_state.as_dict()
+    status["workflows_loaded"] = len(temporal_mgr.workflows)
     status["bootstrap_task_running"] = (
-        prefect_bootstrap_task is not None and not prefect_bootstrap_task.done()
+        temporal_bootstrap_task is not None and not temporal_bootstrap_task.done()
     )
     return status


-def _prefect_not_ready_status() -> Optional[Dict[str, Any]]:
-    """Return status details if Prefect is not ready yet."""
-    status = get_prefect_status()
+def _temporal_not_ready_status() -> Optional[Dict[str, Any]]:
+    """Return status details if Temporal is not ready yet."""
+    status = get_temporal_status()
     if status.get("ready"):
         return None
     return status
@@ -111,19 +98,19 @@ def _prefect_not_ready_status() -> Optional[Dict[str, Any]]:

 @app.get("/")
 async def root() -> Dict[str, Any]:
-    status = get_prefect_status()
+    status = get_temporal_status()
     return {
         "name": "FuzzForge API",
         "version": "0.6.0",
         "status": "ready" if status.get("ready") else "initializing",
         "workflows_loaded": status.get("workflows_loaded", 0),
-        "prefect": status,
+        "temporal": status,
     }


 @app.get("/health")
 async def health() -> Dict[str, str]:
-    status = get_prefect_status()
+    status = get_temporal_status()
     health_status = "healthy" if status.get("ready") else "initializing"
     return {"status": health_status}

@@ -165,65 +152,61 @@ _fastapi_mcp_imported = False
 mcp = FastMCP(name="FuzzForge MCP")


-async def _bootstrap_prefect_with_retries() -> None:
-    """Initialize Prefect infrastructure with exponential backoff retries."""
+async def _bootstrap_temporal_with_retries() -> None:
+    """Initialize Temporal infrastructure with exponential backoff retries."""

     attempt = 0

     while True:
         attempt += 1
-        prefect_bootstrap_state.task_running = True
-        prefect_bootstrap_state.status = "starting"
-        prefect_bootstrap_state.ready = False
-        prefect_bootstrap_state.last_error = None
+        temporal_bootstrap_state.task_running = True
+        temporal_bootstrap_state.status = "starting"
+        temporal_bootstrap_state.ready = False
+        temporal_bootstrap_state.last_error = None

         try:
-            logger.info("Bootstrapping Prefect infrastructure...")
+            logger.info("Bootstrapping Temporal infrastructure...")
             await validate_infrastructure()
-            await setup_docker_pool()
             await setup_result_storage()
-            await prefect_mgr.initialize()
-            await prefect_stats_monitor.start_monitoring()
+            await temporal_mgr.initialize()

-            prefect_bootstrap_state.ready = True
-            prefect_bootstrap_state.status = "ready"
-            prefect_bootstrap_state.task_running = False
-            logger.info("Prefect infrastructure ready")
+            temporal_bootstrap_state.ready = True
+            temporal_bootstrap_state.status = "ready"
+            temporal_bootstrap_state.task_running = False
+            logger.info("Temporal infrastructure ready")
             return

         except asyncio.CancelledError:
-            prefect_bootstrap_state.status = "cancelled"
-            prefect_bootstrap_state.task_running = False
-            logger.info("Prefect bootstrap task cancelled")
+            temporal_bootstrap_state.status = "cancelled"
+            temporal_bootstrap_state.task_running = False
+            logger.info("Temporal bootstrap task cancelled")
             raise

         except Exception as exc:  # pragma: no cover - defensive logging on infra startup
-            logger.exception("Prefect bootstrap failed")
-            prefect_bootstrap_state.ready = False
-            prefect_bootstrap_state.status = "error"
-            prefect_bootstrap_state.last_error = str(exc)
+            logger.exception("Temporal bootstrap failed")
+            temporal_bootstrap_state.ready = False
+            temporal_bootstrap_state.status = "error"
+            temporal_bootstrap_state.last_error = str(exc)

             # Ensure partial initialization does not leave stale state behind
-            prefect_mgr.workflows.clear()
-            prefect_mgr.deployments.clear()
-            await prefect_stats_monitor.stop_monitoring()
+            temporal_mgr.workflows.clear()

             wait_time = min(
                 STARTUP_RETRY_SECONDS * (2 ** (attempt - 1)),
                 STARTUP_RETRY_MAX_SECONDS,
             )
-            logger.info("Retrying Prefect bootstrap in %s second(s)", wait_time)
+            logger.info("Retrying Temporal bootstrap in %s second(s)", wait_time)

             try:
                 await asyncio.sleep(wait_time)
             except asyncio.CancelledError:
-                prefect_bootstrap_state.status = "cancelled"
-                prefect_bootstrap_state.task_running = False
+                temporal_bootstrap_state.status = "cancelled"
+                temporal_bootstrap_state.task_running = False
                 raise


 def _lookup_workflow(workflow_name: str):
-    info = prefect_mgr.workflows.get(workflow_name)
+    info = temporal_mgr.workflows.get(workflow_name)
     if not info:
         return None
     metadata = info.metadata
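Review note: the retry loop in this hunk waits `min(STARTUP_RETRY_SECONDS * 2 ** (attempt - 1), STARTUP_RETRY_MAX_SECONDS)` seconds between attempts. The sketch below reproduces only that arithmetic so the resulting schedule is easy to inspect. `backoff_schedule` is a hypothetical helper, not a function from the repository, and its defaults of 5 and 60 are the fallback values of the two environment variables.

```python
from itertools import islice
from typing import Iterator


def backoff_schedule(base_seconds: int = 5, max_seconds: int = 60) -> Iterator[int]:
    """Yield successive wait times using the same formula as the bootstrap loop."""
    # Same clamping as the module-level constants: base >= 1, cap >= base.
    base_seconds = max(1, base_seconds)
    max_seconds = max(base_seconds, max_seconds)
    attempt = 0
    while True:
        attempt += 1
        yield min(base_seconds * (2 ** (attempt - 1)), max_seconds)


if __name__ == "__main__":
    # First six waits with the default settings.
    print(list(islice(backoff_schedule(), 6)))  # [5, 10, 20, 40, 60, 60]
```

With the defaults, the wait doubles until the fifth attempt and then stays at the 60-second cap.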
@@ -248,24 +231,23 @@ def _lookup_workflow(workflow_name: str):
         "required_modules": metadata.get("required_modules", []),
         "supported_volume_modes": supported_modes,
         "default_target_path": default_target_path,
-        "default_volume_mode": default_volume_mode,
-        "has_custom_docker": bool(info.has_docker),
+        "default_volume_mode": default_volume_mode
     }


 @mcp.tool
 async def list_workflows_mcp() -> Dict[str, Any]:
     """List all discovered workflows and their metadata summary."""
-    not_ready = _prefect_not_ready_status()
+    not_ready = _temporal_not_ready_status()
     if not_ready:
         return {
             "workflows": [],
-            "prefect": not_ready,
-            "message": "Prefect infrastructure is still initializing",
+            "temporal": not_ready,
+            "message": "Temporal infrastructure is still initializing",
         }

     workflows_summary = []
-    for name, info in prefect_mgr.workflows.items():
+    for name, info in temporal_mgr.workflows.items():
         metadata = info.metadata
         defaults = metadata.get("default_parameters", {})
         workflows_summary.append({
@@ -279,20 +261,19 @@ async def list_workflows_mcp() -> Dict[str, Any]:
             or defaults.get("volume_mode")
             or "ro",
             "default_target_path": metadata.get("default_target_path")
-            or defaults.get("target_path"),
-            "has_custom_docker": bool(info.has_docker),
+            or defaults.get("target_path")
         })
-    return {"workflows": workflows_summary, "prefect": get_prefect_status()}
+    return {"workflows": workflows_summary, "temporal": get_temporal_status()}


 @mcp.tool
 async def get_workflow_metadata_mcp(workflow_name: str) -> Dict[str, Any]:
     """Fetch detailed metadata for a workflow."""
-    not_ready = _prefect_not_ready_status()
+    not_ready = _temporal_not_ready_status()
     if not_ready:
         return {
-            "error": "Prefect infrastructure not ready",
-            "prefect": not_ready,
+            "error": "Temporal infrastructure not ready",
+            "temporal": not_ready,
         }

     data = _lookup_workflow(workflow_name)
@@ -304,11 +285,11 @@ async def get_workflow_metadata_mcp(workflow_name: str) -> Dict[str, Any]:
 @mcp.tool
 async def get_workflow_parameters_mcp(workflow_name: str) -> Dict[str, Any]:
     """Return the parameter schema and defaults for a workflow."""
-    not_ready = _prefect_not_ready_status()
+    not_ready = _temporal_not_ready_status()
     if not_ready:
         return {
-            "error": "Prefect infrastructure not ready",
-            "prefect": not_ready,
+            "error": "Temporal infrastructure not ready",
+            "temporal": not_ready,
         }

     data = _lookup_workflow(workflow_name)
@@ -323,72 +304,41 @@ async def get_workflow_parameters_mcp(workflow_name: str) -> Dict[str, Any]:
 @mcp.tool
 async def get_workflow_metadata_schema_mcp() -> Dict[str, Any]:
     """Return the JSON schema describing workflow metadata files."""
+    from src.temporal.discovery import WorkflowDiscovery
     return WorkflowDiscovery.get_metadata_schema()


 @mcp.tool
 async def submit_security_scan_mcp(
     workflow_name: str,
-    target_path: str | None = None,
-    volume_mode: str | None = None,
+    target_id: str,
     parameters: Dict[str, Any] | None = None,
 ) -> Dict[str, Any] | Dict[str, str]:
-    """Submit a Prefect workflow via MCP."""
+    """Submit a Temporal workflow via MCP."""
     try:
-        not_ready = _prefect_not_ready_status()
+        not_ready = _temporal_not_ready_status()
         if not_ready:
             return {
-                "error": "Prefect infrastructure not ready",
-                "prefect": not_ready,
+                "error": "Temporal infrastructure not ready",
+                "temporal": not_ready,
             }

-        workflow_info = prefect_mgr.workflows.get(workflow_name)
+        workflow_info = temporal_mgr.workflows.get(workflow_name)
         if not workflow_info:
             return {"error": f"Workflow '{workflow_name}' not found"}

         metadata = workflow_info.metadata or {}
         defaults = metadata.get("default_parameters", {})
-
-        resolved_target_path = target_path or metadata.get("default_target_path") or defaults.get("target_path")
-        if not resolved_target_path:
-            return {
-                "error": (
-                    "target_path is required and no default_target_path is defined in metadata"
-                ),
-                "metadata": {
-                    "workflow": workflow_name,
-                    "default_target_path": metadata.get("default_target_path"),
-                },
-            }
-
-        requested_volume_mode = volume_mode or metadata.get("default_volume_mode") or defaults.get("volume_mode")
-        if not requested_volume_mode:
-            requested_volume_mode = "ro"
-
-        normalised_volume_mode = (
-            str(requested_volume_mode).strip().lower().replace("-", "_")
-        )
-        if normalised_volume_mode in {"read_only", "readonly", "ro"}:
-            normalised_volume_mode = "ro"
-        elif normalised_volume_mode in {"read_write", "readwrite", "rw"}:
-            normalised_volume_mode = "rw"
-        else:
-            supported_modes = metadata.get("supported_volume_modes", ["ro", "rw"])
-            if isinstance(supported_modes, list) and normalised_volume_mode in supported_modes:
-                pass
-            else:
-                normalised_volume_mode = "ro"
-
         parameters = parameters or {}

         cleaned_parameters: Dict[str, Any] = {**defaults, **parameters}

-        # Ensure *_config structures default to dicts so Prefect validation passes.
+        # Ensure *_config structures default to dicts
         for key, value in list(cleaned_parameters.items()):
             if isinstance(key, str) and key.endswith("_config") and value is None:
                 cleaned_parameters[key] = {}

-        # Some workflows expect configuration dictionaries even when omitted.
+        # Some workflows expect configuration dictionaries even when omitted
         parameter_definitions = (
             metadata.get("parameters", {}).get("properties", {})
             if isinstance(metadata.get("parameters"), dict)
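Review note: the parameter handling kept by this hunk merges metadata defaults with caller-supplied parameters (caller values win) and then replaces any `*_config` entry whose value is `None` with an empty dict. The sketch below isolates that step. `merge_parameters` and the sample keys `scanner_config` and `depth` are hypothetical names used for illustration only.

```python
from typing import Any, Dict, Optional


def merge_parameters(
    defaults: Dict[str, Any],
    parameters: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Merge defaults with overrides, then normalise None-valued *_config keys."""
    # Later keys win, so caller-supplied parameters override the defaults.
    cleaned: Dict[str, Any] = {**defaults, **(parameters or {})}
    # Iterate over a snapshot because values are replaced inside the loop.
    for key, value in list(cleaned.items()):
        if isinstance(key, str) and key.endswith("_config") and value is None:
            cleaned[key] = {}
    return cleaned


if __name__ == "__main__":
    print(merge_parameters({"scanner_config": None, "depth": 1}, {"depth": 3}))
```

A caller who explicitly passes `None` for a `*_config` key also ends up with `{}`, since the normalisation runs after the merge.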
@@ -403,20 +353,19 @@ async def submit_security_scan_mcp(
             elif cleaned_parameters[key] is None:
                 cleaned_parameters[key] = {}

-        flow_run = await prefect_mgr.submit_workflow(
+        # Start workflow
+        handle = await temporal_mgr.run_workflow(
             workflow_name=workflow_name,
-            target_path=resolved_target_path,
-            volume_mode=normalised_volume_mode,
-            parameters=cleaned_parameters,
+            target_id=target_id,
+            workflow_params=cleaned_parameters,
         )

         return {
-            "run_id": str(flow_run.id),
-            "status": flow_run.state.name if flow_run.state else "PENDING",
+            "run_id": handle.id,
+            "status": "RUNNING",
             "workflow": workflow_name,
             "message": f"Workflow '{workflow_name}' submitted successfully",
-            "target_path": resolved_target_path,
-            "volume_mode": normalised_volume_mode,
+            "target_id": target_id,
             "parameters": cleaned_parameters,
             "mcp_enabled": True,
         }
@@ -427,43 +376,38 @@ async def submit_security_scan_mcp(

 @mcp.tool
 async def get_comprehensive_scan_summary(run_id: str) -> Dict[str, Any] | Dict[str, str]:
-    """Return a summary for the given flow run via MCP."""
+    """Return a summary for the given workflow run via MCP."""
     try:
-        not_ready = _prefect_not_ready_status()
+        not_ready = _temporal_not_ready_status()
         if not_ready:
             return {
-                "error": "Prefect infrastructure not ready",
-                "prefect": not_ready,
+                "error": "Temporal infrastructure not ready",
+                "temporal": not_ready,
             }

-        status = await prefect_mgr.get_flow_run_status(run_id)
-        findings = await prefect_mgr.get_flow_run_findings(run_id)

-        workflow_name = "unknown"
-        deployment_id = status.get("workflow", "")
-        for name, deployment in prefect_mgr.deployments.items():
-            if str(deployment) == str(deployment_id):
-                workflow_name = name
-                break
+        status = await temporal_mgr.get_workflow_status(run_id)

+        # Try to get result if completed
         total_findings = 0
         severity_summary = {"critical": 0, "high": 0, "medium": 0, "low": 0, "info": 0}

-        if findings and "sarif" in findings:
-            sarif = findings["sarif"]
-            if isinstance(sarif, dict):
-                total_findings = sarif.get("total_findings", 0)
+        if status.get("status") == "COMPLETED":
+            try:
+                result = await temporal_mgr.get_workflow_result(run_id)
+                if isinstance(result, dict):
+                    summary = result.get("summary", {})
+                    total_findings = summary.get("total_findings", 0)
+            except Exception as e:
+                logger.debug(f"Could not retrieve result for {run_id}: {e}")

         return {
             "run_id": run_id,
-            "workflow": workflow_name,
+            "workflow": "unknown",  # Temporal doesn't track workflow name in status
             "status": status.get("status", "unknown"),
-            "is_completed": status.get("is_completed", False),
+            "is_completed": status.get("status") == "COMPLETED",
             "total_findings": total_findings,
             "severity_summary": severity_summary,
-            "scan_duration": status.get("updated_at", "")
-            if status.get("is_completed")
-            else "In progress",
+            "scan_duration": status.get("close_time", "In progress"),
             "recommendations": (
                 [
                     "Review high and critical severity findings first",
@@ -482,32 +426,26 @@ async def get_comprehensive_scan_summary(run_id: str) -> Dict[str, Any] | Dict[s

 @mcp.tool
 async def get_run_status_mcp(run_id: str) -> Dict[str, Any]:
-    """Return current status information for a Prefect run."""
+    """Return current status information for a Temporal run."""
     try:
-        not_ready = _prefect_not_ready_status()
+        not_ready = _temporal_not_ready_status()
         if not_ready:
             return {
-                "error": "Prefect infrastructure not ready",
-                "prefect": not_ready,
+                "error": "Temporal infrastructure not ready",
+                "temporal": not_ready,
             }

-        status = await prefect_mgr.get_flow_run_status(run_id)
-        workflow_name = "unknown"
-        deployment_id = status.get("workflow", "")
-        for name, deployment in prefect_mgr.deployments.items():
-            if str(deployment) == str(deployment_id):
-                workflow_name = name
-                break
+        status = await temporal_mgr.get_workflow_status(run_id)

         return {
-            "run_id": status["run_id"],
-            "workflow": workflow_name,
+            "run_id": run_id,
+            "workflow": "unknown",
             "status": status["status"],
-            "is_completed": status["is_completed"],
-            "is_failed": status["is_failed"],
-            "is_running": status["is_running"],
-            "created_at": status["created_at"],
-            "updated_at": status["updated_at"],
+            "is_completed": status["status"] in ["COMPLETED", "FAILED", "CANCELLED"],
+            "is_failed": status["status"] == "FAILED",
+            "is_running": status["status"] == "RUNNING",
+            "created_at": status.get("start_time"),
+            "updated_at": status.get("close_time") or status.get("execution_time"),
         }
     except Exception as exc:
         logger.exception("MCP run status failed")
@@ -518,38 +456,30 @@ async def get_run_status_mcp(run_id: str) -> Dict[str, Any]:
|
||||
async def get_run_findings_mcp(run_id: str) -> Dict[str, Any]:
|
||||
"""Return SARIF findings for a completed run."""
|
||||
try:
|
||||
not_ready = _prefect_not_ready_status()
|
||||
not_ready = _temporal_not_ready_status()
|
||||
if not_ready:
|
||||
return {
|
||||
"error": "Prefect infrastructure not ready",
|
||||
"prefect": not_ready,
|
||||
"error": "Temporal infrastructure not ready",
|
||||
"temporal": not_ready,
|
||||
}
|
||||
|
||||
status = await prefect_mgr.get_flow_run_status(run_id)
|
||||
if not status.get("is_completed"):
|
||||
status = await temporal_mgr.get_workflow_status(run_id)
|
||||
if status.get("status") != "COMPLETED":
|
||||
return {"error": f"Run {run_id} not completed. Status: {status.get('status')}"}
|
||||
|
||||
findings = await prefect_mgr.get_flow_run_findings(run_id)
|
||||
|
||||
workflow_name = "unknown"
|
||||
deployment_id = status.get("workflow", "")
|
||||
for name, deployment in prefect_mgr.deployments.items():
|
||||
if str(deployment) == str(deployment_id):
|
||||
workflow_name = name
|
||||
break
|
||||
result = await temporal_mgr.get_workflow_result(run_id)
|
||||
|
||||
metadata = {
|
||||
"completion_time": status.get("updated_at"),
|
||||
"completion_time": status.get("close_time"),
|
||||
"workflow_version": "unknown",
|
||||
}
|
||||
info = prefect_mgr.workflows.get(workflow_name)
|
||||
if info:
|
||||
metadata["workflow_version"] = info.metadata.get("version", "unknown")
|
||||
|
||||
sarif = result.get("sarif", {}) if isinstance(result, dict) else {}
|
||||
|
||||
return {
|
||||
"workflow": workflow_name,
|
||||
"workflow": "unknown",
|
||||
"run_id": run_id,
|
||||
"sarif": findings,
|
||||
"sarif": sarif,
|
||||
"metadata": metadata,
|
||||
}
|
||||
except Exception as exc:
|
||||
@@ -561,16 +491,15 @@ async def get_run_findings_mcp(run_id: str) -> Dict[str, Any]:
|
||||
async def list_recent_runs_mcp(
|
||||
limit: int = 10,
|
||||
workflow_name: str | None = None,
|
||||
states: List[str] | None = None,
|
||||
) -> Dict[str, Any]:
|
||||
"""List recent Prefect runs with optional workflow/state filters."""
|
||||
"""List recent Temporal runs with optional workflow filter."""
|
||||
|
||||
not_ready = _prefect_not_ready_status()
|
||||
not_ready = _temporal_not_ready_status()
|
||||
if not_ready:
|
||||
return {
|
||||
"runs": [],
|
||||
"prefect": not_ready,
|
||||
"message": "Prefect infrastructure is still initializing",
|
||||
"temporal": not_ready,
|
||||
"message": "Temporal infrastructure is still initializing",
|
||||
}
|
||||
|
||||
try:
|
||||
@@ -579,116 +508,49 @@ async def list_recent_runs_mcp(
|
||||
limit_value = 10
|
||||
limit_value = max(1, min(limit_value, 100))
|
||||
|
||||
deployment_map = {
|
||||
str(deployment_id): workflow
|
||||
for workflow, deployment_id in prefect_mgr.deployments.items()
|
||||
}
|
||||
try:
|
||||
# Build filter query
|
||||
filter_query = None
|
||||
if workflow_name:
|
||||
workflow_info = temporal_mgr.workflows.get(workflow_name)
|
||||
if workflow_info:
|
||||
filter_query = f'WorkflowType="{workflow_info.workflow_type}"'
|
||||
|
||||
deployment_filter_value = None
|
||||
if workflow_name:
|
||||
deployment_id = prefect_mgr.deployments.get(workflow_name)
|
||||
if not deployment_id:
|
||||
return {
|
||||
"runs": [],
|
||||
"prefect": get_prefect_status(),
|
||||
"error": f"Workflow '{workflow_name}' has no registered deployment",
|
||||
}
|
||||
try:
|
||||
deployment_filter_value = UUID(str(deployment_id))
|
||||
except ValueError:
|
||||
return {
|
||||
"runs": [],
|
||||
"prefect": get_prefect_status(),
|
||||
"error": (
|
||||
f"Deployment id '{deployment_id}' for workflow '{workflow_name}' is invalid"
|
||||
),
|
||||
}
|
||||
workflows = await temporal_mgr.list_workflows(filter_query, limit_value)
|
||||
|
||||
desired_state_types: List[StateType] = []
|
||||
if states:
|
||||
for raw_state in states:
|
||||
if not raw_state:
|
||||
continue
|
||||
normalised = raw_state.strip().upper()
|
||||
if normalised == "ALL":
|
||||
desired_state_types = []
|
||||
break
|
||||
try:
|
||||
desired_state_types.append(StateType[normalised])
|
||||
except KeyError:
|
||||
continue
|
||||
if not desired_state_types:
|
||||
desired_state_types = [
|
||||
StateType.RUNNING,
|
||||
StateType.COMPLETED,
|
||||
StateType.FAILED,
|
||||
StateType.CANCELLED,
|
||||
]
|
||||
results: List[Dict[str, Any]] = []
|
||||
for wf in workflows:
|
||||
results.append({
|
||||
"run_id": wf["workflow_id"],
|
||||
"workflow": workflow_name or "unknown",
|
||||
"state": wf["status"],
|
||||
"state_type": wf["status"],
|
||||
"is_completed": wf["status"] in ["COMPLETED", "FAILED", "CANCELLED"],
|
||||
"is_running": wf["status"] == "RUNNING",
|
||||
"is_failed": wf["status"] == "FAILED",
|
||||
"created_at": wf.get("start_time"),
|
||||
"updated_at": wf.get("close_time"),
|
||||
})
|
||||
|
||||
flow_filter = FlowRunFilter()
|
||||
if desired_state_types:
|
||||
flow_filter.state = FlowRunFilterState(
|
||||
type=FlowRunFilterStateType(any_=desired_state_types)
|
||||
)
|
||||
if deployment_filter_value:
|
||||
flow_filter.deployment_id = FlowRunFilterDeploymentId(
|
||||
any_=[deployment_filter_value]
|
||||
)
|
||||
return {"runs": results, "temporal": get_temporal_status()}
|
||||
|
||||
async with get_client() as client:
|
||||
flow_runs = await client.read_flow_runs(
|
||||
limit=limit_value,
|
||||
flow_run_filter=flow_filter,
|
||||
sort=FlowRunSort.START_TIME_DESC,
|
||||
)
|
||||
|
||||
results: List[Dict[str, Any]] = []
|
||||
for flow_run in flow_runs:
|
||||
deployment_id = getattr(flow_run, "deployment_id", None)
|
||||
workflow = deployment_map.get(str(deployment_id), "unknown")
|
||||
state = getattr(flow_run, "state", None)
|
||||
state_name = getattr(state, "name", None) if state else None
|
||||
state_type = getattr(state, "type", None) if state else None
|
||||
|
||||
results.append(
|
||||
{
|
||||
"run_id": str(flow_run.id),
|
||||
"workflow": workflow,
|
||||
"deployment_id": str(deployment_id) if deployment_id else None,
|
||||
"state": state_name or (state_type.name if state_type else None),
|
||||
"state_type": state_type.name if state_type else None,
|
||||
"is_completed": bool(getattr(state, "is_completed", lambda: False)()),
|
||||
"is_running": bool(getattr(state, "is_running", lambda: False)()),
|
||||
"is_failed": bool(getattr(state, "is_failed", lambda: False)()),
|
||||
"created_at": getattr(flow_run, "created", None),
|
||||
"updated_at": getattr(flow_run, "updated", None),
|
||||
"expected_start_time": getattr(flow_run, "expected_start_time", None),
|
||||
"start_time": getattr(flow_run, "start_time", None),
|
||||
}
|
||||
)
|
||||
|
||||
# Normalise datetimes to ISO 8601 strings for serialization
|
||||
for entry in results:
|
||||
for key in ("created_at", "updated_at", "expected_start_time", "start_time"):
|
||||
value = entry.get(key)
|
||||
if value is None:
|
||||
continue
|
||||
try:
|
||||
entry[key] = value.isoformat()
|
||||
except AttributeError:
|
||||
entry[key] = str(value)
|
||||
|
||||
return {"runs": results, "prefect": get_prefect_status()}
|
||||
except Exception as exc:
|
||||
logger.exception("Failed to list runs")
|
||||
return {
|
||||
"runs": [],
|
||||
"temporal": get_temporal_status(),
|
||||
"error": str(exc)
|
||||
}
|
||||
|
||||
|
||||
@mcp.tool
|
||||
async def get_fuzzing_stats_mcp(run_id: str) -> Dict[str, Any]:
|
||||
"""Return fuzzing statistics for a run if available."""
|
||||
not_ready = _prefect_not_ready_status()
|
||||
not_ready = _temporal_not_ready_status()
|
||||
if not_ready:
|
||||
return {
|
||||
"error": "Prefect infrastructure not ready",
|
||||
"prefect": not_ready,
|
||||
"error": "Temporal infrastructure not ready",
|
||||
"temporal": not_ready,
|
||||
}
|
||||
|
||||
stats = fuzzing.fuzzing_stats.get(run_id)
|
||||
@@ -708,11 +570,11 @@ async def get_fuzzing_stats_mcp(run_id: str) -> Dict[str, Any]:
|
||||
@mcp.tool
|
||||
async def get_fuzzing_crash_reports_mcp(run_id: str) -> Dict[str, Any]:
|
||||
"""Return crash reports collected for a fuzzing run."""
|
||||
not_ready = _prefect_not_ready_status()
|
||||
not_ready = _temporal_not_ready_status()
|
||||
if not_ready:
|
||||
return {
|
||||
"error": "Prefect infrastructure not ready",
|
||||
"prefect": not_ready,
|
||||
"error": "Temporal infrastructure not ready",
|
||||
"temporal": not_ready,
|
||||
}
|
||||
|
||||
reports = fuzzing.crash_reports.get(run_id)
|
||||
@@ -725,11 +587,11 @@ async def get_fuzzing_crash_reports_mcp(run_id: str) -> Dict[str, Any]:
|
||||
async def get_backend_status_mcp() -> Dict[str, Any]:
|
||||
"""Expose backend readiness, workflows, and registered MCP tools."""
|
||||
|
||||
status = get_prefect_status()
|
||||
response: Dict[str, Any] = {"prefect": status}
|
||||
status = get_temporal_status()
|
||||
response: Dict[str, Any] = {"temporal": status}
|
||||
|
||||
if status.get("ready"):
|
||||
response["workflows"] = list(prefect_mgr.workflows.keys())
|
||||
response["workflows"] = list(temporal_mgr.workflows.keys())
|
||||
|
||||
try:
|
||||
tools = await mcp._tool_manager.list_tools()
|
||||
@@ -775,12 +637,12 @@ def create_mcp_transport_app() -> Starlette:
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Combined lifespan: Prefect init + dedicated MCP transports
|
||||
# Combined lifespan: Temporal init + dedicated MCP transports
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
@asynccontextmanager
|
||||
async def combined_lifespan(app: FastAPI):
|
||||
global prefect_bootstrap_task, _fastapi_mcp_imported
|
||||
global temporal_bootstrap_task, _fastapi_mcp_imported
|
||||
|
||||
logger.info("Starting FuzzForge backend...")
|
||||
|
||||
@@ -793,12 +655,12 @@ async def combined_lifespan(app: FastAPI):
|
||||
except Exception as exc:
|
||||
logger.exception("Failed to import FastAPI endpoints into MCP", exc_info=exc)
|
||||
|
||||
# Kick off Prefect bootstrap in the background if needed
|
||||
if prefect_bootstrap_task is None or prefect_bootstrap_task.done():
|
||||
prefect_bootstrap_task = asyncio.create_task(_bootstrap_prefect_with_retries())
|
||||
logger.info("Prefect bootstrap task started")
|
||||
# Kick off Temporal bootstrap in the background if needed
|
||||
if temporal_bootstrap_task is None or temporal_bootstrap_task.done():
|
||||
temporal_bootstrap_task = asyncio.create_task(_bootstrap_temporal_with_retries())
|
||||
logger.info("Temporal bootstrap task started")
|
||||
else:
|
||||
logger.info("Prefect bootstrap task already running")
|
||||
logger.info("Temporal bootstrap task already running")
|
||||
|
||||
# Start MCP transports on shared port (HTTP + SSE)
|
||||
mcp_app = create_mcp_transport_app()
|
||||
@@ -846,18 +708,17 @@ async def combined_lifespan(app: FastAPI):
|
||||
mcp_server.force_exit = True
|
||||
await asyncio.gather(mcp_task, return_exceptions=True)
|
||||
|
||||
if prefect_bootstrap_task and not prefect_bootstrap_task.done():
|
||||
prefect_bootstrap_task.cancel()
|
||||
if temporal_bootstrap_task and not temporal_bootstrap_task.done():
|
||||
temporal_bootstrap_task.cancel()
|
||||
with suppress(asyncio.CancelledError):
|
||||
await prefect_bootstrap_task
|
||||
prefect_bootstrap_state.task_running = False
|
||||
if not prefect_bootstrap_state.ready:
|
||||
prefect_bootstrap_state.status = "stopped"
|
||||
prefect_bootstrap_state.next_retry_seconds = None
|
||||
prefect_bootstrap_task = None
|
||||
await temporal_bootstrap_task
|
||||
temporal_bootstrap_state.task_running = False
|
||||
if not temporal_bootstrap_state.ready:
|
||||
temporal_bootstrap_state.status = "stopped"
|
||||
temporal_bootstrap_task = None
|
||||
|
||||
logger.info("Shutting down Prefect statistics monitor...")
|
||||
await prefect_stats_monitor.stop_monitoring()
|
||||
# Close Temporal client
|
||||
await temporal_mgr.close()
|
||||
logger.info("Shutting down FuzzForge backend...")
|
||||
|
||||
|
||||
|
||||
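The Temporal-based MCP tools in the hunks above no longer read `is_completed`, `is_failed`, and `is_running` from the manager; they derive them from the status string. The sketch below isolates that mapping. The helper name `derive_run_flags` and the constant `TERMINAL_STATUSES` are hypothetical and do not appear in the commit; only the status strings and the comparisons are taken from the diff.

```python
from typing import Dict

# Status strings treated as terminal by the MCP tools in this diff.
TERMINAL_STATUSES = ("COMPLETED", "FAILED", "CANCELLED")


def derive_run_flags(status: str) -> Dict[str, bool]:
    """Hypothetical helper: map a workflow status string to boolean flags."""
    return {
        # "Completed" here means "no longer running", so it also covers
        # FAILED and CANCELLED, mirroring the expressions in the diff.
        "is_completed": status in TERMINAL_STATUSES,
        "is_failed": status == "FAILED",
        "is_running": status == "RUNNING",
    }
```

Note that `is_completed` is true for failed and cancelled runs under this mapping, whereas `get_run_findings_mcp` checks `status == "COMPLETED"` directly before returning SARIF output.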
@@ -13,10 +13,9 @@ Models for workflow findings and submissions
 #
 # Additional attribution and requirements are provided in the NOTICE file.

-from pydantic import BaseModel, Field, field_validator
+from pydantic import BaseModel, Field
 from typing import Dict, Any, Optional, Literal, List
 from datetime import datetime
-from pathlib import Path


 class WorkflowFindings(BaseModel):
@@ -27,47 +26,13 @@ class WorkflowFindings(BaseModel):
|
||||
metadata: Dict[str, Any] = Field(default_factory=dict, description="Additional metadata")
|
||||
|
||||
|
||||
class ResourceLimits(BaseModel):
|
||||
"""Resource limits for workflow execution"""
|
||||
cpu_limit: Optional[str] = Field(None, description="CPU limit (e.g., '2' for 2 cores, '500m' for 0.5 cores)")
|
||||
memory_limit: Optional[str] = Field(None, description="Memory limit (e.g., '1Gi', '512Mi')")
|
||||
cpu_request: Optional[str] = Field(None, description="CPU request (guaranteed)")
|
||||
memory_request: Optional[str] = Field(None, description="Memory request (guaranteed)")
|
||||
|
||||
|
||||
class VolumeMount(BaseModel):
|
||||
"""Volume mount specification"""
|
||||
host_path: str = Field(..., description="Host path to mount")
|
||||
container_path: str = Field(..., description="Container path for mount")
|
||||
mode: Literal["ro", "rw"] = Field(default="ro", description="Mount mode")
|
||||
|
||||
@field_validator("host_path")
|
||||
@classmethod
|
||||
def validate_host_path(cls, v):
|
||||
"""Validate that the host path is absolute (existence checked at runtime)"""
|
||||
path = Path(v)
|
||||
if not path.is_absolute():
|
||||
raise ValueError(f"Host path must be absolute: {v}")
|
||||
# Note: Path existence is validated at workflow runtime
|
||||
# We can't validate existence here as this runs inside Docker container
|
||||
return str(path)
|
||||
|
||||
@field_validator("container_path")
|
||||
@classmethod
|
||||
def validate_container_path(cls, v):
|
||||
"""Validate that the container path is absolute"""
|
||||
if not v.startswith('/'):
|
||||
raise ValueError(f"Container path must be absolute: {v}")
|
||||
return v
|
||||
|
||||
|
||||
class WorkflowSubmission(BaseModel):
|
||||
"""Submit a workflow with configurable settings"""
|
||||
target_path: str = Field(..., description="Absolute path to analyze")
|
||||
volume_mode: Literal["ro", "rw"] = Field(
|
||||
default="ro",
|
||||
description="Volume mount mode: read-only (ro) or read-write (rw)"
|
||||
)
|
||||
"""
|
||||
Submit a workflow with configurable settings.
|
||||
|
||||
Note: This model is deprecated in favor of the /upload-and-submit endpoint
|
||||
which handles file uploads directly.
|
||||
"""
|
||||
parameters: Dict[str, Any] = Field(
|
||||
default_factory=dict,
|
||||
description="Workflow-specific parameters"
|
||||
@@ -78,25 +43,6 @@ class WorkflowSubmission(BaseModel):
|
||||
ge=1,
|
||||
le=604800 # Max 7 days to support fuzzing campaigns
|
||||
)
|
||||
resource_limits: Optional[ResourceLimits] = Field(
|
||||
None,
|
||||
description="Resource limits for workflow container"
|
||||
)
|
||||
additional_volumes: List[VolumeMount] = Field(
|
||||
default_factory=list,
|
||||
description="Additional volume mounts (e.g., for corpus, output directories)"
|
||||
)
|
||||
|
||||
@field_validator("target_path")
|
||||
@classmethod
|
||||
def validate_path(cls, v):
|
||||
"""Validate that the target path is absolute (existence checked at runtime)"""
|
||||
path = Path(v)
|
||||
if not path.is_absolute():
|
||||
raise ValueError(f"Path must be absolute: {v}")
|
||||
# Note: Path existence is validated at workflow runtime when volumes are mounted
|
||||
# We can't validate existence here as this runs inside Docker container
|
||||
return str(path)
|
||||
|
||||
|
||||
class WorkflowStatus(BaseModel):
|
||||
@@ -131,10 +77,6 @@ class WorkflowMetadata(BaseModel):
         default=["ro", "rw"],
         description="Supported volume mount modes"
     )
-    has_custom_docker: bool = Field(
-        default=False,
-        description="Whether workflow has custom Dockerfile"
-    )


 class WorkflowListItem(BaseModel):
@@ -1,394 +0,0 @@
|
||||
"""
|
||||
Generic Prefect Statistics Monitor Service
|
||||
|
||||
This service monitors ALL workflows for structured live data logging and
|
||||
updates the appropriate statistics APIs. Works with any workflow that follows
|
||||
the standard LIVE_STATS logging pattern.
|
||||
"""
|
||||
# Copyright (c) 2025 FuzzingLabs
|
||||
#
|
||||
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
|
||||
# at the root of this repository for details.
|
||||
#
|
||||
# After the Change Date (four years from publication), this version of the
|
||||
# Licensed Work will be made available under the Apache License, Version 2.0.
|
||||
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Additional attribution and requirements are provided in the NOTICE file.
|
||||
|
||||
|
||||
import asyncio
|
||||
import json
|
||||
import logging
|
||||
from datetime import datetime, timedelta, timezone
|
||||
from typing import Dict, Any, Optional
|
||||
from prefect.client.orchestration import get_client
|
||||
from prefect.client.schemas.objects import FlowRun, TaskRun
|
||||
from src.models.findings import FuzzingStats
|
||||
from src.api.fuzzing import fuzzing_stats, initialize_fuzzing_tracking, active_connections
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class PrefectStatsMonitor:
|
||||
"""Monitors Prefect flows and tasks for live statistics from any workflow"""
|
||||
|
||||
def __init__(self):
|
||||
self.monitoring = False
|
||||
self.monitor_task = None
|
||||
self.monitored_runs = set()
|
||||
self.last_log_ts: Dict[str, datetime] = {}
|
||||
self._client = None
|
||||
self._client_refresh_time = None
|
||||
self._client_refresh_interval = 300 # Refresh connection every 5 minutes
|
||||
|
||||
async def start_monitoring(self):
|
||||
"""Start the Prefect statistics monitoring service"""
|
||||
if self.monitoring:
|
||||
logger.warning("Prefect stats monitor already running")
|
||||
return
|
||||
|
||||
self.monitoring = True
|
||||
self.monitor_task = asyncio.create_task(self._monitor_flows())
|
||||
logger.info("Started Prefect statistics monitor")
|
||||
|
||||
async def stop_monitoring(self):
|
||||
"""Stop the monitoring service"""
|
||||
self.monitoring = False
|
||||
if self.monitor_task:
|
||||
self.monitor_task.cancel()
|
||||
try:
|
||||
await self.monitor_task
|
||||
except asyncio.CancelledError:
|
||||
pass
|
||||
logger.info("Stopped Prefect statistics monitor")
|
||||
|
||||
async def _get_or_refresh_client(self):
|
||||
"""Get or refresh Prefect client with connection pooling."""
|
||||
now = datetime.now(timezone.utc)
|
||||
|
||||
if (self._client is None or
|
||||
self._client_refresh_time is None or
|
||||
(now - self._client_refresh_time).total_seconds() > self._client_refresh_interval):
|
||||
|
||||
if self._client:
|
||||
try:
|
||||
await self._client.aclose()
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
self._client = get_client()
|
||||
self._client_refresh_time = now
|
||||
await self._client.__aenter__()
|
||||
|
||||
return self._client
|
||||
|
||||
async def _monitor_flows(self):
|
||||
"""Main monitoring loop that watches Prefect flows"""
|
||||
try:
|
||||
while self.monitoring:
|
||||
try:
|
||||
# Use connection pooling for better performance
|
||||
client = await self._get_or_refresh_client()
|
||||
|
||||
# Get recent flow runs (limit to reduce load)
|
||||
flow_runs = await client.read_flow_runs(
|
||||
limit=50,
|
||||
sort="START_TIME_DESC",
|
||||
)
|
||||
|
||||
# Only consider runs from the last 15 minutes
|
||||
recent_cutoff = datetime.now(timezone.utc) - timedelta(minutes=15)
|
||||
for flow_run in flow_runs:
|
||||
created = getattr(flow_run, "created", None)
|
||||
if created is None:
|
||||
continue
|
||||
try:
|
||||
# Ensure timezone-aware comparison
|
||||
if created.tzinfo is None:
|
||||
created = created.replace(tzinfo=timezone.utc)
|
||||
if created >= recent_cutoff:
|
||||
await self._monitor_flow_run(client, flow_run)
|
||||
except Exception:
|
||||
# If comparison fails, attempt monitoring anyway
|
||||
await self._monitor_flow_run(client, flow_run)
|
||||
|
||||
await asyncio.sleep(5) # Check every 5 seconds
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Error in Prefect monitoring: {e}")
|
||||
await asyncio.sleep(10)
|
||||
|
||||
except asyncio.CancelledError:
|
||||
logger.info("Prefect monitoring cancelled")
|
||||
except Exception as e:
|
||||
logger.error(f"Fatal error in Prefect monitoring: {e}")
|
||||
finally:
|
||||
# Clean up client on exit
|
||||
if self._client:
|
||||
try:
|
||||
await self._client.__aexit__(None, None, None)
|
||||
except Exception:
|
||||
pass
|
||||
self._client = None
|
||||
|
||||
async def _monitor_flow_run(self, client, flow_run: FlowRun):
|
||||
"""Monitor a specific flow run for statistics"""
|
||||
run_id = str(flow_run.id)
|
||||
workflow_name = flow_run.name or "unknown"
|
||||
|
||||
try:
|
||||
# Initialize tracking if not exists - only for workflows that might have live stats
|
||||
if run_id not in fuzzing_stats:
|
||||
initialize_fuzzing_tracking(run_id, workflow_name)
|
||||
self.monitored_runs.add(run_id)
|
||||
|
||||
# Skip corrupted entries (should not happen after startup cleanup, but defensive)
|
||||
elif not isinstance(fuzzing_stats[run_id], FuzzingStats):
|
||||
logger.warning(f"Skipping corrupted stats entry for {run_id}, reinitializing")
|
||||
initialize_fuzzing_tracking(run_id, workflow_name)
|
||||
self.monitored_runs.add(run_id)
|
||||
|
||||
# Get task runs for this flow
|
||||
task_runs = await client.read_task_runs(
|
||||
flow_run_filter={"id": {"any_": [flow_run.id]}},
|
||||
limit=25,
|
||||
)
|
||||
|
||||
# Check all tasks for live statistics logging
|
||||
for task_run in task_runs:
|
||||
await self._extract_stats_from_task(client, run_id, task_run, workflow_name)
|
||||
|
||||
# Also scan flow-level logs as a fallback
|
||||
await self._extract_stats_from_flow_logs(client, run_id, flow_run, workflow_name)
|
||||
|
||||
except Exception as e:
|
||||
logger.warning(f"Error monitoring flow run {run_id}: {e}")
|
||||
|
||||
async def _extract_stats_from_task(self, client, run_id: str, task_run: TaskRun, workflow_name: str):
|
||||
"""Extract statistics from any task that logs live stats"""
|
||||
try:
|
||||
# Get task run logs
|
||||
logs = await client.read_logs(
|
||||
log_filter={
|
||||
"task_run_id": {"any_": [task_run.id]}
|
||||
},
|
||||
limit=100,
|
||||
sort="TIMESTAMP_ASC"
|
||||
)
|
||||
|
||||
# Parse logs for LIVE_STATS entries (generic pattern for any workflow)
|
||||
latest_stats = None
|
||||
for log in logs:
|
||||
# Prefer structured extra field if present
|
||||
extra_data = getattr(log, "extra", None) or getattr(log, "extra_fields", None) or None
|
||||
if isinstance(extra_data, dict):
|
||||
stat_type = extra_data.get("stats_type")
|
||||
if stat_type in ["fuzzing_live_update", "scan_progress", "analysis_update", "live_stats"]:
|
||||
latest_stats = extra_data
|
||||
continue
|
||||
|
||||
# Fallback to parsing from message text
|
||||
if ("FUZZ_STATS" in log.message or "LIVE_STATS" in log.message):
|
||||
stats = self._parse_stats_from_log(log.message)
|
||||
if stats:
|
||||
latest_stats = stats
|
||||
|
||||
# Update statistics if we found any
|
||||
if latest_stats:
|
||||
# Calculate elapsed time from task start
|
||||
elapsed_time = 0
|
||||
if task_run.start_time:
|
||||
# Ensure timezone-aware arithmetic
|
||||
now = datetime.now(timezone.utc)
|
||||
try:
|
||||
elapsed_time = int((now - task_run.start_time).total_seconds())
|
||||
except Exception:
|
||||
# Fallback to naive UTC if types mismatch
|
||||
elapsed_time = int((datetime.utcnow() - task_run.start_time.replace(tzinfo=None)).total_seconds())
|
||||
|
||||
updated_stats = FuzzingStats(
|
||||
run_id=run_id,
|
||||
workflow=workflow_name,
|
||||
executions=latest_stats.get("executions", 0),
|
||||
executions_per_sec=latest_stats.get("executions_per_sec", 0.0),
|
||||
crashes=latest_stats.get("crashes", 0),
|
||||
unique_crashes=latest_stats.get("unique_crashes", 0),
|
||||
corpus_size=latest_stats.get("corpus_size", 0),
|
||||
elapsed_time=elapsed_time
|
||||
)
|
||||
|
||||
# Update the global stats
|
||||
previous = fuzzing_stats.get(run_id)
|
||||
fuzzing_stats[run_id] = updated_stats
|
||||
|
||||
# Broadcast to any active WebSocket clients for this run
|
||||
if active_connections.get(run_id):
|
||||
# Handle both Pydantic objects and plain dicts
|
||||
if isinstance(updated_stats, dict):
|
||||
stats_data = updated_stats
|
||||
elif hasattr(updated_stats, 'model_dump'):
|
||||
stats_data = updated_stats.model_dump()
|
||||
elif hasattr(updated_stats, 'dict'):
|
||||
stats_data = updated_stats.dict()
|
||||
else:
|
||||
stats_data = updated_stats.__dict__
|
||||
|
||||
message = {
|
||||
"type": "stats_update",
|
||||
"data": stats_data,
|
||||
}
|
||||
disconnected = []
|
||||
for ws in active_connections[run_id]:
|
||||
try:
|
||||
await ws.send_text(json.dumps(message))
|
||||
except Exception:
|
||||
disconnected.append(ws)
|
||||
# Clean up disconnected sockets
|
||||
for ws in disconnected:
|
||||
try:
|
||||
active_connections[run_id].remove(ws)
|
||||
except ValueError:
|
||||
pass
|
||||
|
||||
logger.debug(f"Updated Prefect stats for {run_id}: {updated_stats.executions} execs")
|
||||
|
||||
except Exception as e:
|
||||
logger.warning(f"Error extracting stats from task {task_run.id}: {e}")
|
||||
|
||||
async def _extract_stats_from_flow_logs(self, client, run_id: str, flow_run: FlowRun, workflow_name: str):
|
||||
"""Extract statistics by scanning flow-level logs for LIVE/FUZZ stats"""
|
||||
try:
|
||||
logs = await client.read_logs(
|
||||
log_filter={
|
||||
"flow_run_id": {"any_": [flow_run.id]}
|
||||
},
|
||||
limit=200,
|
||||
sort="TIMESTAMP_ASC"
|
||||
)
|
||||
|
||||
latest_stats = None
|
||||
last_seen = self.last_log_ts.get(run_id)
|
||||
max_ts = last_seen
|
||||
|
||||
for log in logs:
|
||||
# Skip logs we've already processed
|
||||
ts = getattr(log, "timestamp", None)
|
||||
if last_seen and ts and ts <= last_seen:
|
||||
continue
|
||||
if ts and (max_ts is None or ts > max_ts):
|
||||
max_ts = ts
|
||||
|
||||
# Prefer structured extra field if available
|
||||
extra_data = getattr(log, "extra", None) or getattr(log, "extra_fields", None) or None
|
||||
if isinstance(extra_data, dict):
|
||||
stat_type = extra_data.get("stats_type")
|
||||
if stat_type in ["fuzzing_live_update", "scan_progress", "analysis_update", "live_stats"]:
|
||||
latest_stats = extra_data
|
||||
continue
|
||||
|
||||
# Fallback to message parse
|
||||
if ("FUZZ_STATS" in log.message or "LIVE_STATS" in log.message):
|
||||
stats = self._parse_stats_from_log(log.message)
|
||||
if stats:
|
||||
latest_stats = stats
|
||||
|
||||
if max_ts:
|
||||
self.last_log_ts[run_id] = max_ts
|
||||
|
||||
if latest_stats:
|
||||
# Use flow_run timestamps for elapsed time if available
|
||||
elapsed_time = 0
|
||||
start_time = getattr(flow_run, "start_time", None) or getattr(flow_run, "start_time", None)
|
||||
if start_time:
|
||||
now = datetime.now(timezone.utc)
|
||||
try:
|
||||
if start_time.tzinfo is None:
|
||||
start_time = start_time.replace(tzinfo=timezone.utc)
|
||||
elapsed_time = int((now - start_time).total_seconds())
|
||||
except Exception:
|
||||
elapsed_time = int((datetime.utcnow() - start_time.replace(tzinfo=None)).total_seconds())
|
||||
|
||||
updated_stats = FuzzingStats(
|
||||
run_id=run_id,
|
||||
workflow=workflow_name,
|
||||
executions=latest_stats.get("executions", 0),
|
||||
executions_per_sec=latest_stats.get("executions_per_sec", 0.0),
|
||||
crashes=latest_stats.get("crashes", 0),
|
||||
unique_crashes=latest_stats.get("unique_crashes", 0),
|
||||
corpus_size=latest_stats.get("corpus_size", 0),
|
||||
elapsed_time=elapsed_time
|
||||
)
|
||||
|
||||
fuzzing_stats[run_id] = updated_stats
|
||||
|
||||
# Broadcast if listeners exist
|
||||
if active_connections.get(run_id):
|
||||
# Handle both Pydantic objects and plain dicts
|
||||
if isinstance(updated_stats, dict):
|
||||
stats_data = updated_stats
|
||||
elif hasattr(updated_stats, 'model_dump'):
|
||||
stats_data = updated_stats.model_dump()
|
||||
elif hasattr(updated_stats, 'dict'):
|
||||
stats_data = updated_stats.dict()
|
||||
else:
|
||||
stats_data = updated_stats.__dict__
|
||||
|
||||
message = {
|
||||
"type": "stats_update",
|
||||
"data": stats_data,
|
||||
}
|
||||
disconnected = []
|
||||
for ws in active_connections[run_id]:
|
||||
try:
|
||||
await ws.send_text(json.dumps(message))
|
||||
except Exception:
|
||||
disconnected.append(ws)
|
||||
for ws in disconnected:
|
||||
try:
|
||||
active_connections[run_id].remove(ws)
|
||||
except ValueError:
|
||||
pass
|
||||
|
||||
except Exception as e:
|
||||
logger.warning(f"Error extracting stats from flow logs {run_id}: {e}")
|
||||
|
||||
def _parse_stats_from_log(self, log_message: str) -> Optional[Dict[str, Any]]:
|
||||
"""Parse statistics from a log message"""
|
||||
try:
|
||||
import re
|
||||
|
||||
# Prefer explicit JSON after marker tokens
|
||||
m = re.search(r'(?:FUZZ_STATS|LIVE_STATS)\s+(\{.*\})', log_message)
|
||||
if m:
|
||||
try:
|
||||
return json.loads(m.group(1))
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# Fallback: Extract the extra= dict and coerce to JSON
|
||||
stats_match = re.search(r'extra=({.*?})', log_message)
|
||||
if not stats_match:
|
||||
return None
|
||||
|
||||
extra_str = stats_match.group(1)
|
||||
extra_str = extra_str.replace("'", '"')
|
||||
extra_str = extra_str.replace('None', 'null')
|
||||
extra_str = extra_str.replace('True', 'true')
|
||||
extra_str = extra_str.replace('False', 'false')
|
||||
|
||||
stats_data = json.loads(extra_str)
|
||||
|
||||
# Support multiple stat types for different workflows
|
||||
stat_type = stats_data.get("stats_type")
|
||||
if stat_type in ["fuzzing_live_update", "scan_progress", "analysis_update", "live_stats"]:
|
||||
return stats_data
|
||||
|
||||
except Exception as e:
|
||||
logger.debug(f"Error parsing log stats: {e}")
|
||||
|
||||
return None
|
||||
|
||||
|
||||
# Global instance
|
||||
prefect_stats_monitor = PrefectStatsMonitor()
|
||||
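The removed monitor recognised live statistics by looking for a `FUZZ_STATS` or `LIVE_STATS` marker followed by a JSON object in each log message. A minimal, self-contained sketch of that primary parsing path is shown below, assuming the same marker tokens and regular expression as `_parse_stats_from_log`. The function name `parse_live_stats` is hypothetical, and the sketch deliberately omits the deleted code's fallback that coerced an `extra={...}` dict into JSON.

```python
import json
import re
from typing import Any, Dict, Optional

# Same pattern as the removed _parse_stats_from_log: marker, whitespace, JSON object.
_MARKER_RE = re.compile(r"(?:FUZZ_STATS|LIVE_STATS)\s+(\{.*\})")


def parse_live_stats(message: str) -> Optional[Dict[str, Any]]:
    """Hypothetical helper: return the JSON payload after a stats marker, or None."""
    match = _MARKER_RE.search(message)
    if match is None:
        return None
    try:
        payload = json.loads(match.group(1))
    except json.JSONDecodeError:
        # Marker present but payload is not valid JSON.
        return None
    return payload if isinstance(payload, dict) else None
```

A workflow that logs a line such as `LIVE_STATS {"executions": 10, "crashes": 1}` would yield a dictionary with those two keys; messages without a marker yield `None`.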
@@ -0,0 +1,10 @@
+"""
+Storage abstraction layer for FuzzForge.
+
+Provides unified interface for storing and retrieving targets and results.
+"""
+
+from .base import StorageBackend
+from .s3_cached import S3CachedStorage
+
+__all__ = ["StorageBackend", "S3CachedStorage"]
@@ -0,0 +1,153 @@
+"""
+Base storage backend interface.
+
+All storage implementations must implement this interface.
+"""
+
+from abc import ABC, abstractmethod
+from pathlib import Path
+from typing import Optional, Dict, Any
+
+
+class StorageBackend(ABC):
+    """
+    Abstract base class for storage backends.
+
+    Implementations handle storage and retrieval of:
+    - Uploaded targets (code, binaries, etc.)
+    - Workflow results
+    - Temporary files
+    """
+
+    @abstractmethod
+    async def upload_target(
+        self,
+        file_path: Path,
+        user_id: str,
+        metadata: Optional[Dict[str, Any]] = None
+    ) -> str:
+        """
+        Upload a target file to storage.
+
+        Args:
+            file_path: Local path to file to upload
+            user_id: ID of user uploading the file
+            metadata: Optional metadata to store with file
+
+        Returns:
+            Target ID (unique identifier for retrieval)
+
+        Raises:
+            FileNotFoundError: If file_path doesn't exist
+            StorageError: If upload fails
+        """
+        pass
+
+    @abstractmethod
+    async def get_target(self, target_id: str) -> Path:
+        """
+        Get target file from storage.
+
+        Args:
+            target_id: Unique identifier from upload_target()
+
+        Returns:
+            Local path to cached file
+
+        Raises:
+            FileNotFoundError: If target doesn't exist
+            StorageError: If download fails
+        """
+        pass
+
+    @abstractmethod
+    async def delete_target(self, target_id: str) -> None:
+        """
+        Delete target from storage.
+
+        Args:
+            target_id: Unique identifier to delete
+
+        Raises:
+            StorageError: If deletion fails (doesn't raise if not found)
+        """
+        pass
+
+    @abstractmethod
+    async def upload_results(
+        self,
+        workflow_id: str,
+        results: Dict[str, Any],
+        results_format: str = "json"
+    ) -> str:
+        """
+        Upload workflow results to storage.
+
+        Args:
+            workflow_id: Workflow execution ID
+            results: Results dictionary
+            results_format: Format (json, sarif, etc.)
+
+        Returns:
+            URL to uploaded results
+
+        Raises:
+            StorageError: If upload fails
+        """
+        pass
+
+    @abstractmethod
+    async def get_results(self, workflow_id: str) -> Dict[str, Any]:
+        """
+        Get workflow results from storage.
+
+        Args:
+            workflow_id: Workflow execution ID
+
+        Returns:
+            Results dictionary
+
+        Raises:
+            FileNotFoundError: If results don't exist
+            StorageError: If download fails
+        """
+        pass
+
@abstractmethod
|
||||
async def list_targets(
|
||||
self,
|
||||
user_id: Optional[str] = None,
|
||||
limit: int = 100
|
||||
) -> list[Dict[str, Any]]:
|
||||
"""
|
||||
List uploaded targets.
|
||||
|
||||
Args:
|
||||
user_id: Filter by user ID (None = all users)
|
||||
limit: Maximum number of results
|
||||
|
||||
Returns:
|
||||
List of target metadata dictionaries
|
||||
|
||||
Raises:
|
||||
StorageError: If listing fails
|
||||
"""
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
async def cleanup_cache(self) -> int:
|
||||
"""
|
||||
Clean up local cache (LRU eviction).
|
||||
|
||||
Returns:
|
||||
Number of files removed
|
||||
|
||||
Raises:
|
||||
StorageError: If cleanup fails
|
||||
"""
|
||||
pass
|
||||
|
||||
|
||||
class StorageError(Exception):
|
||||
"""Base exception for storage operations."""
|
||||
pass
|
||||
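To illustrate the contract described by the interface above, here is a minimal sketch of a backend that keeps targets in a local directory. It is an illustration only: `TargetStore` is a trimmed stand-in for `StorageBackend` (target methods only, no metadata), and `LocalDirStorage` and `demo` are hypothetical names that do not exist in the repository.

```python
import asyncio
import shutil
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path
from uuid import uuid4


class TargetStore(ABC):
    """Trimmed stand-in for StorageBackend (only the target methods)."""

    @abstractmethod
    async def upload_target(self, file_path: Path, user_id: str) -> str: ...

    @abstractmethod
    async def get_target(self, target_id: str) -> Path: ...

    @abstractmethod
    async def delete_target(self, target_id: str) -> None: ...


class LocalDirStorage(TargetStore):
    """Hypothetical backend that stores each target as <root>/<target_id>/target."""

    def __init__(self, root: Path):
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    async def upload_target(self, file_path: Path, user_id: str) -> str:
        if not file_path.exists():
            raise FileNotFoundError(f"File not found: {file_path}")
        target_id = str(uuid4())
        dest = self.root / target_id / "target"
        dest.parent.mkdir(parents=True)
        shutil.copyfile(file_path, dest)
        return target_id

    async def get_target(self, target_id: str) -> Path:
        path = self.root / target_id / "target"
        if not path.exists():
            raise FileNotFoundError(f"Target {target_id} not found")
        return path

    async def delete_target(self, target_id: str) -> None:
        # Follows the documented contract: a missing target is not an error.
        shutil.rmtree(self.root / target_id, ignore_errors=True)


async def demo() -> bool:
    """Upload, fetch, delete twice, then confirm the target is gone."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "app.tar.gz"
        src.write_bytes(b"example")
        store = LocalDirStorage(Path(tmp) / "store")

        target_id = await store.upload_target(src, user_id="alice")
        same = (await store.get_target(target_id)).read_bytes() == b"example"

        await store.delete_target(target_id)
        await store.delete_target(target_id)  # second delete is a no-op
        try:
            await store.get_target(target_id)
        except FileNotFoundError:
            return same
        return False
```

Running `asyncio.run(demo())` exercises the upload, retrieval and idempotent-delete behaviour that the docstrings above describe.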
@@ -0,0 +1,423 @@
|
||||
"""
|
||||
S3-compatible storage backend with local caching.
|
||||
|
||||
Works with MinIO (dev/prod) or AWS S3 (cloud).
|
||||
"""
|
||||
|
||||
import json
|
||||
import logging
|
||||
import os
|
||||
import shutil
|
||||
from datetime import datetime
|
||||
from pathlib import Path
|
||||
from typing import Optional, Dict, Any
|
||||
from uuid import uuid4
|
||||
|
||||
import boto3
|
||||
from botocore.exceptions import ClientError
|
||||
|
||||
from .base import StorageBackend, StorageError
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class S3CachedStorage(StorageBackend):
|
||||
"""
|
||||
S3-compatible storage with local caching.
|
||||
|
||||
Features:
|
||||
- Upload targets to S3/MinIO
|
||||
- Download with local caching (LRU eviction)
|
||||
- Lifecycle management (auto-cleanup old files)
|
||||
- Metadata tracking
|
||||
"""
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
endpoint_url: Optional[str] = None,
|
||||
access_key: Optional[str] = None,
|
||||
secret_key: Optional[str] = None,
|
||||
bucket: str = "targets",
|
||||
region: str = "us-east-1",
|
||||
use_ssl: bool = False,
|
||||
cache_dir: Optional[Path] = None,
|
||||
cache_max_size_gb: int = 10
|
||||
):
|
||||
"""
|
||||
Initialize S3 storage backend.
|
||||
|
||||
Args:
|
||||
endpoint_url: S3 endpoint URL (None = S3_ENDPOINT env var, default http://minio:9000)
|
||||
access_key: S3 access key (None = from env)
|
||||
secret_key: S3 secret key (None = from env)
|
||||
bucket: S3 bucket name
|
||||
region: AWS region
|
||||
use_ssl: Use HTTPS
|
||||
cache_dir: Local cache directory
|
||||
cache_max_size_gb: Maximum cache size in GB
|
||||
"""
|
||||
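# Fall back to environment variables when an argument is None or falsy.
# bucket and region have non-empty defaults, so S3_BUCKET and S3_REGION
# only take effect when an empty value is passed explicitly.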
|
||||
self.endpoint_url = endpoint_url or os.getenv('S3_ENDPOINT', 'http://minio:9000')
|
||||
self.access_key = access_key or os.getenv('S3_ACCESS_KEY', 'fuzzforge')
|
||||
self.secret_key = secret_key or os.getenv('S3_SECRET_KEY', 'fuzzforge123')
|
||||
self.bucket = bucket or os.getenv('S3_BUCKET', 'targets')
|
||||
self.region = region or os.getenv('S3_REGION', 'us-east-1')
|
||||
self.use_ssl = use_ssl or os.getenv('S3_USE_SSL', 'false').lower() == 'true'
|
||||
|
||||
# Cache configuration
|
||||
self.cache_dir = cache_dir or Path(os.getenv('CACHE_DIR', '/tmp/fuzzforge-cache'))
|
||||
self.cache_max_size = cache_max_size_gb * (1024 ** 3) # Convert to bytes
|
||||
|
||||
# Ensure cache directory exists
|
||||
self.cache_dir.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
# Initialize S3 client
|
||||
try:
|
||||
self.s3_client = boto3.client(
|
||||
's3',
|
||||
endpoint_url=self.endpoint_url,
|
||||
aws_access_key_id=self.access_key,
|
||||
aws_secret_access_key=self.secret_key,
|
||||
region_name=self.region,
|
||||
use_ssl=self.use_ssl
|
||||
)
|
||||
logger.info(f"Initialized S3 storage: {self.endpoint_url}/{self.bucket}")
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to initialize S3 client: {e}")
|
||||
raise StorageError(f"S3 initialization failed: {e}")
|
||||
|
||||
async def upload_target(
|
||||
self,
|
||||
file_path: Path,
|
||||
user_id: str,
|
||||
metadata: Optional[Dict[str, Any]] = None
|
||||
) -> str:
|
||||
"""Upload target file to S3/MinIO."""
|
||||
if not file_path.exists():
|
||||
raise FileNotFoundError(f"File not found: {file_path}")
|
||||
|
||||
# Generate unique target ID
|
||||
target_id = str(uuid4())
|
||||
|
||||
# Prepare metadata
|
||||
upload_metadata = {
|
||||
'user_id': user_id,
|
||||
'uploaded_at': datetime.now().isoformat(),
|
||||
'filename': file_path.name,
|
||||
'size': str(file_path.stat().st_size)
|
||||
}
|
||||
if metadata:
|
||||
upload_metadata.update(metadata)
|
||||
|
||||
# Upload to S3
|
||||
s3_key = f'{target_id}/target'
|
||||
try:
|
||||
logger.info(f"Uploading target to s3://{self.bucket}/{s3_key}")
|
||||
|
||||
self.s3_client.upload_file(
|
||||
str(file_path),
|
||||
self.bucket,
|
||||
s3_key,
|
||||
ExtraArgs={
|
||||
'Metadata': upload_metadata
|
||||
}
|
||||
)
|
||||
|
||||
file_size_mb = file_path.stat().st_size / (1024 * 1024)
|
||||
logger.info(
|
||||
f"✓ Uploaded target {target_id} "
|
||||
f"({file_path.name}, {file_size_mb:.2f} MB)"
|
||||
)
|
||||
|
||||
return target_id
|
||||
|
||||
except ClientError as e:
|
||||
logger.error(f"S3 upload failed: {e}", exc_info=True)
|
||||
raise StorageError(f"Failed to upload target: {e}")
|
||||
except Exception as e:
|
||||
logger.error(f"Upload failed: {e}", exc_info=True)
|
||||
raise StorageError(f"Upload error: {e}")
|
||||
|
||||
async def get_target(self, target_id: str) -> Path:
|
||||
"""Get target from cache or download from S3/MinIO."""
|
||||
# Check cache first
|
||||
cache_path = self.cache_dir / target_id
|
||||
cached_file = cache_path / "target"
|
||||
|
||||
if cached_file.exists():
|
||||
# Update access time for LRU
|
||||
cached_file.touch()
|
||||
logger.info(f"Cache HIT: {target_id}")
|
||||
return cached_file
|
||||
|
||||
# Cache miss - download from S3
|
||||
logger.info(f"Cache MISS: {target_id}, downloading from S3...")
|
||||
|
||||
try:
|
||||
# Create cache directory
|
||||
cache_path.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
# Download from S3
|
||||
s3_key = f'{target_id}/target'
|
||||
logger.info(f"Downloading s3://{self.bucket}/{s3_key}")
|
||||
|
||||
self.s3_client.download_file(
|
||||
self.bucket,
|
||||
s3_key,
|
||||
str(cached_file)
|
||||
)
|
||||
|
||||
# Verify download
|
||||
if not cached_file.exists():
|
||||
raise StorageError(f"Downloaded file not found: {cached_file}")
|
||||
|
||||
file_size_mb = cached_file.stat().st_size / (1024 * 1024)
|
||||
logger.info(f"✓ Downloaded target {target_id} ({file_size_mb:.2f} MB)")
|
||||
|
||||
return cached_file
|
||||
|
||||
except ClientError as e:
|
||||
error_code = e.response.get('Error', {}).get('Code')
|
||||
if error_code in ['404', 'NoSuchKey']:
|
||||
logger.error(f"Target not found: {target_id}")
|
||||
raise FileNotFoundError(f"Target {target_id} not found in storage")
|
||||
else:
|
||||
logger.error(f"S3 download failed: {e}", exc_info=True)
|
||||
raise StorageError(f"Download failed: {e}")
|
||||
except Exception as e:
|
||||
logger.error(f"Download error: {e}", exc_info=True)
|
||||
# Cleanup partial download
|
||||
if cache_path.exists():
|
||||
shutil.rmtree(cache_path, ignore_errors=True)
|
||||
raise StorageError(f"Download error: {e}")
|
||||
|
||||
async def delete_target(self, target_id: str) -> None:
|
||||
"""Delete target from S3/MinIO."""
|
||||
try:
|
||||
s3_key = f'{target_id}/target'
|
||||
logger.info(f"Deleting s3://{self.bucket}/{s3_key}")
|
||||
|
||||
self.s3_client.delete_object(
|
||||
Bucket=self.bucket,
|
||||
Key=s3_key
|
||||
)
|
||||
|
||||
# Also delete from cache if present
|
||||
cache_path = self.cache_dir / target_id
|
||||
if cache_path.exists():
|
||||
shutil.rmtree(cache_path, ignore_errors=True)
|
||||
logger.info(f"✓ Deleted target {target_id} from S3 and cache")
|
||||
else:
|
||||
logger.info(f"✓ Deleted target {target_id} from S3")
|
||||
|
||||
except ClientError as e:
|
||||
logger.error(f"S3 delete failed: {e}", exc_info=True)
|
||||
# Don't raise error if object doesn't exist
|
||||
if e.response.get('Error', {}).get('Code') not in ['404', 'NoSuchKey']:
|
||||
raise StorageError(f"Delete failed: {e}")
|
||||
except Exception as e:
|
||||
logger.error(f"Delete error: {e}", exc_info=True)
|
||||
raise StorageError(f"Delete error: {e}")
|
||||
|
||||
async def upload_results(
|
||||
self,
|
||||
workflow_id: str,
|
||||
results: Dict[str, Any],
|
||||
results_format: str = "json"
|
||||
) -> str:
|
||||
"""Upload workflow results to S3/MinIO."""
|
||||
try:
|
||||
# Prepare results content
|
||||
if results_format == "json":
|
||||
content = json.dumps(results, indent=2).encode('utf-8')
|
||||
content_type = 'application/json'
|
||||
file_ext = 'json'
|
||||
elif results_format == "sarif":
|
||||
content = json.dumps(results, indent=2).encode('utf-8')
|
||||
content_type = 'application/sarif+json'
|
||||
file_ext = 'sarif'
|
||||
else:
|
||||
content = json.dumps(results, indent=2).encode('utf-8')
|
||||
content_type = 'application/json'
|
||||
file_ext = 'json'
|
||||
|
||||
# Upload to results bucket
|
||||
results_bucket = 'results'
|
||||
s3_key = f'{workflow_id}/results.{file_ext}'
|
||||
|
||||
logger.info(f"Uploading results to s3://{results_bucket}/{s3_key}")
|
||||
|
||||
self.s3_client.put_object(
|
||||
Bucket=results_bucket,
|
||||
Key=s3_key,
|
||||
Body=content,
|
||||
ContentType=content_type,
|
||||
Metadata={
|
||||
'workflow_id': workflow_id,
|
||||
'format': results_format,
|
||||
'uploaded_at': datetime.now().isoformat()
|
||||
}
|
||||
)
|
||||
|
||||
# Construct URL
|
||||
results_url = f"{self.endpoint_url}/{results_bucket}/{s3_key}"
|
||||
logger.info(f"✓ Uploaded results: {results_url}")
|
||||
|
||||
return results_url
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Results upload failed: {e}", exc_info=True)
|
||||
raise StorageError(f"Results upload failed: {e}")
|
||||
|
||||
async def get_results(self, workflow_id: str) -> Dict[str, Any]:
|
||||
"""Get workflow results from S3/MinIO."""
|
||||
try:
|
||||
results_bucket = 'results'
|
||||
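# NOTE: only results.json is read. Results uploaded with
# results_format="sarif" are stored as results.sarif and are not found here.
s3_key = f'{workflow_id}/results.json'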
|
||||
|
||||
logger.info(f"Downloading results from s3://{results_bucket}/{s3_key}")
|
||||
|
||||
response = self.s3_client.get_object(
|
||||
Bucket=results_bucket,
|
||||
Key=s3_key
|
||||
)
|
||||
|
||||
content = response['Body'].read().decode('utf-8')
|
||||
results = json.loads(content)
|
||||
|
||||
logger.info(f"✓ Downloaded results for workflow {workflow_id}")
|
||||
return results
|
||||
|
||||
except ClientError as e:
|
||||
error_code = e.response.get('Error', {}).get('Code')
|
||||
if error_code in ['404', 'NoSuchKey']:
|
||||
logger.error(f"Results not found: {workflow_id}")
|
||||
raise FileNotFoundError(f"Results for workflow {workflow_id} not found")
|
||||
else:
|
||||
logger.error(f"Results download failed: {e}", exc_info=True)
|
||||
raise StorageError(f"Results download failed: {e}")
|
||||
except Exception as e:
|
||||
logger.error(f"Results download error: {e}", exc_info=True)
|
||||
raise StorageError(f"Results download error: {e}")
|
||||
|
||||
async def list_targets(
|
||||
self,
|
||||
user_id: Optional[str] = None,
|
||||
limit: int = 100
|
||||
) -> list[Dict[str, Any]]:
|
||||
"""List uploaded targets."""
|
||||
try:
|
||||
targets = []
|
||||
paginator = self.s3_client.get_paginator('list_objects_v2')
|
||||
|
||||
for page in paginator.paginate(Bucket=self.bucket, PaginationConfig={'MaxItems': limit}):
|
||||
for obj in page.get('Contents', []):
|
||||
# Get object metadata
|
||||
try:
|
||||
metadata_response = self.s3_client.head_object(
|
||||
Bucket=self.bucket,
|
||||
Key=obj['Key']
|
||||
)
|
||||
metadata = metadata_response.get('Metadata', {})
|
||||
|
||||
# Filter by user_id if specified (applied after MaxItems, so fewer than `limit` matches may be returned)
|
||||
if user_id and metadata.get('user_id') != user_id:
|
||||
continue
|
||||
|
||||
targets.append({
|
||||
'target_id': obj['Key'].split('/')[0],
|
||||
'key': obj['Key'],
|
||||
'size': obj['Size'],
|
||||
'last_modified': obj['LastModified'].isoformat(),
|
||||
'metadata': metadata
|
||||
})
|
||||
|
||||
except Exception as e:
|
||||
logger.warning(f"Failed to get metadata for {obj['Key']}: {e}")
|
||||
continue
|
||||
|
||||
logger.info(f"Listed {len(targets)} targets (user_id={user_id})")
|
||||
return targets
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"List targets failed: {e}", exc_info=True)
|
||||
raise StorageError(f"List targets failed: {e}")
|
||||
|
||||
async def cleanup_cache(self) -> int:
|
||||
"""Clean up local cache using LRU eviction."""
|
||||
try:
|
||||
cache_files = []
|
||||
total_size = 0
|
||||
|
||||
# Gather all cached files with metadata
|
||||
for cache_file in self.cache_dir.rglob('*'):
|
||||
if cache_file.is_file():
|
||||
try:
|
||||
stat = cache_file.stat()
|
||||
cache_files.append({
|
||||
'path': cache_file,
|
||||
'size': stat.st_size,
|
||||
'atime': stat.st_atime # Last access time
|
||||
})
|
||||
total_size += stat.st_size
|
||||
except Exception as e:
|
||||
logger.warning(f"Failed to stat {cache_file}: {e}")
|
||||
continue
|
||||
|
||||
# Check if cleanup is needed
|
||||
if total_size <= self.cache_max_size:
|
||||
logger.info(
|
||||
f"Cache size OK: {total_size / (1024**3):.2f} GB / "
|
||||
f"{self.cache_max_size / (1024**3):.2f} GB"
|
||||
)
|
||||
return 0
|
||||
|
||||
# Sort by access time (oldest first)
|
||||
cache_files.sort(key=lambda x: x['atime'])
|
||||
|
||||
# Remove files until under limit
|
||||
removed_count = 0
|
||||
for file_info in cache_files:
|
||||
if total_size <= self.cache_max_size:
|
||||
break
|
||||
|
||||
try:
|
||||
file_info['path'].unlink()
|
||||
total_size -= file_info['size']
|
||||
removed_count += 1
|
||||
logger.debug(f"Evicted from cache: {file_info['path']}")
|
||||
except Exception as e:
|
||||
logger.warning(f"Failed to delete {file_info['path']}: {e}")
|
||||
continue
|
||||
|
||||
logger.info(
|
||||
f"✓ Cache cleanup: removed {removed_count} files, "
|
||||
f"new size: {total_size / (1024**3):.2f} GB"
|
||||
)
|
||||
return removed_count
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Cache cleanup failed: {e}", exc_info=True)
|
||||
raise StorageError(f"Cache cleanup failed: {e}")
|
||||
|
||||
def get_cache_stats(self) -> Dict[str, Any]:
|
||||
"""Get cache statistics."""
|
||||
try:
|
||||
total_size = 0
|
||||
file_count = 0
|
||||
|
||||
for cache_file in self.cache_dir.rglob('*'):
|
||||
if cache_file.is_file():
|
||||
total_size += cache_file.stat().st_size
|
||||
file_count += 1
|
||||
|
||||
return {
|
||||
'total_size_bytes': total_size,
|
||||
'total_size_gb': total_size / (1024 ** 3),
|
||||
'file_count': file_count,
|
||||
'max_size_gb': self.cache_max_size / (1024 ** 3),
|
||||
'usage_percent': (total_size / self.cache_max_size) * 100
|
||||
}
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to get cache stats: {e}")
|
||||
return {'error': str(e)}
|
||||
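The eviction logic in `cleanup_cache` above can be sketched as a standalone function using only the standard library. This is a simplified illustration rather than the class method itself: `evict_lru` and `demo` are hypothetical names, and the demo sets access times explicitly with `os.utime` so that the ordering is deterministic.

```python
import os
import tempfile
from pathlib import Path
from typing import List, Tuple


def evict_lru(cache_dir: Path, max_bytes: int) -> int:
    """Delete least-recently-accessed files until the cache fits in max_bytes."""
    entries = []
    total = 0
    for path in cache_dir.rglob("*"):
        if path.is_file():
            st = path.stat()
            entries.append((st.st_atime, st.st_size, path))
            total += st.st_size

    removed = 0
    # Oldest access time first, mirroring the sort in cleanup_cache.
    for _atime, size, path in sorted(entries, key=lambda e: e[0]):
        if total <= max_bytes:
            break
        path.unlink()
        total -= size
        removed += 1
    return removed


def demo() -> Tuple[int, List[str]]:
    """Create three 100-byte cache entries and evict down to 150 bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for i, name in enumerate(["old", "mid", "new"]):
            f = root / name / "target"
            f.parent.mkdir()
            f.write_bytes(b"x" * 100)
            # Explicit access/modification times: "old" < "mid" < "new".
            os.utime(f, (1_000 + i, 1_000 + i))
        removed = evict_lru(root, max_bytes=150)
        survivors = sorted(p.parent.name for p in root.rglob("target"))
        return removed, survivors
```

Like the method above, this sketch removes files but leaves the now-empty per-target directories in place. It also relies on the access time being meaningful, which is why `get_target` touches the cached file on every hit.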
@@ -0,0 +1,10 @@
|
||||
"""
|
||||
Temporal integration for FuzzForge.
|
||||
|
||||
Handles workflow execution, monitoring, and management.
|
||||
"""
|
||||
|
||||
from .manager import TemporalManager
|
||||
from .discovery import WorkflowDiscovery
|
||||
|
||||
__all__ = ["TemporalManager", "WorkflowDiscovery"]
|
||||
@@ -0,0 +1,257 @@
|
||||
"""
|
||||
Workflow Discovery for Temporal
|
||||
|
||||
Discovers workflows from the toolbox/workflows directory
|
||||
and provides metadata about available workflows.
|
||||
"""
|
||||
|
||||
import logging
|
||||
import yaml
|
||||
from pathlib import Path
|
||||
from typing import Dict, Any
|
||||
from pydantic import BaseModel, Field, ConfigDict
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class WorkflowInfo(BaseModel):
|
||||
"""Information about a discovered workflow"""
|
||||
name: str = Field(..., description="Workflow name")
|
||||
path: Path = Field(..., description="Path to workflow directory")
|
||||
workflow_file: Path = Field(..., description="Path to workflow.py file")
|
||||
metadata: Dict[str, Any] = Field(..., description="Workflow metadata from YAML")
|
||||
workflow_type: str = Field(..., description="Workflow class name")
|
||||
vertical: str = Field(..., description="Vertical (worker type) for this workflow")
|
||||
|
||||
model_config = ConfigDict(arbitrary_types_allowed=True)
|
||||
|
||||
|
||||
class WorkflowDiscovery:
|
||||
"""
|
||||
Discovers workflows from the filesystem.
|
||||
|
||||
Scans toolbox/workflows/ for directories containing:
|
||||
- metadata.yaml (required)
|
||||
- workflow.py (required)
|
||||
|
||||
Each workflow declares its vertical (rust, android, web, etc.)
|
||||
which determines which worker pool will execute it.
|
||||
"""
|
||||
|
||||
def __init__(self, workflows_dir: Path):
|
||||
"""
|
||||
Initialize workflow discovery.
|
||||
|
||||
Args:
|
||||
workflows_dir: Path to the workflows directory
|
||||
"""
|
||||
self.workflows_dir = workflows_dir
|
||||
if not self.workflows_dir.exists():
|
||||
self.workflows_dir.mkdir(parents=True, exist_ok=True)
|
||||
logger.info(f"Created workflows directory: {self.workflows_dir}")
|
||||
|
||||
async def discover_workflows(self) -> Dict[str, WorkflowInfo]:
|
||||
"""
|
||||
Discover workflows by scanning the workflows directory.
|
||||
|
||||
Returns:
|
||||
Dictionary mapping workflow names to their information
|
||||
"""
|
||||
workflows = {}
|
||||
|
||||
logger.info(f"Scanning for workflows in: {self.workflows_dir}")
|
||||
|
||||
for workflow_dir in self.workflows_dir.iterdir():
|
||||
if not workflow_dir.is_dir():
|
||||
continue
|
||||
|
||||
# Skip special directories
|
||||
if workflow_dir.name.startswith('.') or workflow_dir.name == '__pycache__':
|
||||
continue
|
||||
|
||||
metadata_file = workflow_dir / "metadata.yaml"
|
||||
if not metadata_file.exists():
|
||||
logger.debug(f"No metadata.yaml in {workflow_dir.name}, skipping")
|
||||
continue
|
||||
|
||||
workflow_file = workflow_dir / "workflow.py"
|
||||
if not workflow_file.exists():
|
||||
logger.warning(
|
||||
f"Workflow {workflow_dir.name} has metadata but no workflow.py, skipping"
|
||||
)
|
||||
continue
|
||||
|
||||
try:
|
||||
# Parse metadata
|
||||
with open(metadata_file) as f:
|
||||
metadata = yaml.safe_load(f)
|
||||
|
||||
# Validate required fields
|
||||
if 'name' not in metadata:
|
||||
logger.warning(f"Workflow {workflow_dir.name} metadata missing 'name' field")
|
||||
metadata['name'] = workflow_dir.name
|
||||
|
||||
if 'vertical' not in metadata:
|
||||
logger.warning(
|
||||
f"Workflow {workflow_dir.name} metadata missing 'vertical' field"
|
||||
)
|
||||
continue
|
||||
|
||||
# Infer workflow class name from metadata or use convention
|
||||
workflow_type = metadata.get('workflow_class')
|
||||
if not workflow_type:
|
||||
# Convention: convert snake_case to PascalCase + Workflow
|
||||
# e.g., rust_test -> RustTestWorkflow
|
||||
parts = workflow_dir.name.split('_')
|
||||
workflow_type = ''.join(part.capitalize() for part in parts) + 'Workflow'
|
||||
|
||||
# Create workflow info
|
||||
info = WorkflowInfo(
|
||||
name=metadata['name'],
|
||||
path=workflow_dir,
|
||||
workflow_file=workflow_file,
|
||||
metadata=metadata,
|
||||
workflow_type=workflow_type,
|
||||
vertical=metadata['vertical']
|
||||
)
|
||||
|
||||
workflows[info.name] = info
|
||||
logger.info(
|
||||
f"✓ Discovered workflow: {info.name} "
|
||||
f"(vertical: {info.vertical}, class: {info.workflow_type})"
|
||||
)
|
||||
|
||||
except Exception as e:
|
||||
logger.error(
|
||||
f"Error discovering workflow {workflow_dir.name}: {e}",
|
||||
exc_info=True
|
||||
)
|
||||
continue
|
||||
|
||||
logger.info(f"Discovered {len(workflows)} workflows")
|
||||
return workflows
|
||||
|
||||
def get_workflows_by_vertical(
|
||||
self,
|
||||
workflows: Dict[str, WorkflowInfo],
|
||||
vertical: str
|
||||
) -> Dict[str, WorkflowInfo]:
|
||||
"""
|
||||
Filter workflows by vertical.
|
||||
|
||||
Args:
|
||||
workflows: All discovered workflows
|
||||
vertical: Vertical name to filter by
|
||||
|
||||
Returns:
|
||||
Filtered workflows dictionary
|
||||
"""
|
||||
return {
|
||||
name: info
|
||||
for name, info in workflows.items()
|
||||
if info.vertical == vertical
|
||||
}
|
||||
|
||||
def get_available_verticals(self, workflows: Dict[str, WorkflowInfo]) -> list[str]:
|
||||
"""
|
||||
Get list of all verticals from discovered workflows.
|
||||
|
||||
Args:
|
||||
workflows: All discovered workflows
|
||||
|
||||
Returns:
|
||||
List of unique vertical names
|
||||
"""
|
||||
return sorted({info.vertical for info in workflows.values()})
|
||||
|
||||
@staticmethod
|
||||
def get_metadata_schema() -> Dict[str, Any]:
|
||||
"""
|
||||
Get the JSON schema for workflow metadata.
|
||||
|
||||
Returns:
|
||||
JSON schema dictionary
|
||||
"""
|
||||
return {
|
||||
"type": "object",
|
||||
"required": ["name", "version", "description", "author", "vertical", "parameters"],
|
||||
"properties": {
|
||||
"name": {
|
||||
"type": "string",
|
||||
"description": "Workflow name"
|
||||
},
|
||||
"version": {
|
||||
"type": "string",
|
||||
"pattern": "^\\d+\\.\\d+\\.\\d+$",
|
||||
"description": "Semantic version (x.y.z)"
|
||||
},
|
||||
"vertical": {
|
||||
"type": "string",
|
||||
"description": "Vertical worker type (rust, android, web, etc.)"
|
||||
},
|
||||
"description": {
|
||||
"type": "string",
|
||||
"description": "Workflow description"
|
||||
},
|
||||
"author": {
|
||||
"type": "string",
|
||||
"description": "Workflow author"
|
||||
},
|
||||
"category": {
|
||||
"type": "string",
|
||||
"enum": ["comprehensive", "specialized", "fuzzing", "focused"],
|
||||
"description": "Workflow category"
|
||||
},
|
||||
"tags": {
|
||||
"type": "array",
|
||||
"items": {"type": "string"},
|
||||
"description": "Workflow tags for categorization"
|
||||
},
|
||||
"requirements": {
|
||||
"type": "object",
|
||||
"required": ["tools", "resources"],
|
||||
"properties": {
|
||||
"tools": {
|
||||
"type": "array",
|
||||
"items": {"type": "string"},
|
||||
"description": "Required security tools"
|
||||
},
|
||||
"resources": {
|
||||
"type": "object",
|
||||
"required": ["memory", "cpu", "timeout"],
|
||||
"properties": {
|
||||
"memory": {
|
||||
"type": "string",
|
||||
"pattern": "^\\d+[GMK]i$",
|
||||
"description": "Memory limit (e.g., 1Gi, 512Mi)"
|
||||
},
|
||||
"cpu": {
|
||||
"type": "string",
|
||||
"pattern": "^\\d+m?$",
|
||||
"description": "CPU limit (e.g., 1000m, 2)"
|
||||
},
|
||||
"timeout": {
|
||||
"type": "integer",
|
||||
"minimum": 60,
|
||||
"maximum": 7200,
|
||||
"description": "Workflow timeout in seconds"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
},
|
||||
"parameters": {
|
||||
"type": "object",
|
||||
"description": "Workflow parameters schema"
|
||||
},
|
||||
"default_parameters": {
|
||||
"type": "object",
|
||||
"description": "Default parameter values"
|
||||
},
|
||||
"required_modules": {
|
||||
"type": "array",
|
||||
"items": {"type": "string"},
|
||||
"description": "Required module names"
|
||||
}
|
||||
}
|
||||
}
|
||||
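The class-name convention and the resource patterns from the schema above can be checked with a few lines of standard-library code. The sketch below is an assumption-laden illustration: `default_workflow_class` and `check_metadata` are hypothetical helpers, and only a subset of the schema (required keys, version, memory, cpu, timeout) is covered.

```python
import re
from typing import Any, Dict, List

# Patterns copied from the metadata schema above.
VERSION_RE = re.compile(r"^\d+\.\d+\.\d+$")
MEMORY_RE = re.compile(r"^\d+[GMK]i$")
CPU_RE = re.compile(r"^\d+m?$")
REQUIRED = ["name", "version", "description", "author", "vertical", "parameters"]


def default_workflow_class(dir_name: str) -> str:
    """snake_case directory name -> PascalCase + 'Workflow'."""
    return "".join(part.capitalize() for part in dir_name.split("_")) + "Workflow"


def check_metadata(metadata: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems (empty if none were found)."""
    problems = [f"missing field: {key}" for key in REQUIRED if key not in metadata]

    version = metadata.get("version")
    if version is not None and not VERSION_RE.fullmatch(str(version)):
        problems.append(f"bad version: {version}")

    resources = metadata.get("requirements", {}).get("resources", {})
    memory = resources.get("memory")
    if memory is not None and not MEMORY_RE.fullmatch(str(memory)):
        problems.append(f"bad memory: {memory}")
    cpu = resources.get("cpu")
    if cpu is not None and not CPU_RE.fullmatch(str(cpu)):
        problems.append(f"bad cpu: {cpu}")
    timeout = resources.get("timeout")
    if timeout is not None and not (isinstance(timeout, int) and 60 <= timeout <= 7200):
        problems.append(f"bad timeout: {timeout}")

    return problems
```

Note that `str.capitalize()` lowercases the remainder of each part, so a directory named `APK_scan` would map to `ApkScanWorkflow`; setting `workflow_class` in `metadata.yaml` avoids relying on the convention.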
@@ -0,0 +1,371 @@
|
||||
"""
|
||||
Temporal Manager - Workflow execution and management
|
||||
|
||||
Handles:
|
||||
- Workflow discovery from toolbox
|
||||
- Workflow execution (submit to Temporal)
|
||||
- Status monitoring
|
||||
- Results retrieval
|
||||
"""
|
||||
|
||||
import logging
|
||||
import os
|
||||
from pathlib import Path
|
||||
from typing import Dict, Optional, Any
|
||||
from uuid import uuid4
|
||||
|
||||
from temporalio.client import Client, WorkflowHandle
|
||||
from temporalio.common import RetryPolicy
|
||||
from datetime import timedelta
|
||||
|
||||
from .discovery import WorkflowDiscovery, WorkflowInfo
|
||||
from src.storage import S3CachedStorage
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class TemporalManager:
|
||||
"""
|
||||
Manages Temporal workflow execution for FuzzForge.
|
||||
|
||||
This class:
|
||||
- Discovers available workflows from toolbox
|
||||
- Submits workflow executions to Temporal
|
||||
- Monitors workflow status
|
||||
- Retrieves workflow results
|
||||
"""
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
workflows_dir: Optional[Path] = None,
|
||||
temporal_address: Optional[str] = None,
|
||||
temporal_namespace: str = "default",
|
||||
storage: Optional[S3CachedStorage] = None
|
||||
):
|
||||
"""
|
||||
Initialize Temporal manager.
|
||||
|
||||
Args:
|
||||
workflows_dir: Path to workflows directory (default: toolbox/workflows)
|
||||
temporal_address: Temporal server address (default: from env or localhost:7233)
|
||||
temporal_namespace: Temporal namespace
|
||||
storage: Storage backend for file uploads (default: S3CachedStorage)
|
||||
"""
|
||||
if workflows_dir is None:
|
||||
workflows_dir = Path("toolbox/workflows")
|
||||
|
||||
self.temporal_address = temporal_address or os.getenv(
|
||||
'TEMPORAL_ADDRESS',
|
||||
'localhost:7233'
|
||||
)
|
||||
self.temporal_namespace = temporal_namespace
|
||||
self.discovery = WorkflowDiscovery(workflows_dir)
|
||||
self.workflows: Dict[str, WorkflowInfo] = {}
|
||||
self.client: Optional[Client] = None
|
||||
|
||||
# Initialize storage backend
|
||||
self.storage = storage or S3CachedStorage()
|
||||
|
||||
logger.info(
|
||||
f"TemporalManager initialized: {self.temporal_address} "
|
||||
f"(namespace: {self.temporal_namespace})"
|
||||
)
|
||||
|
||||
async def initialize(self):
|
||||
"""Initialize the manager by discovering workflows and connecting to Temporal."""
|
||||
try:
|
||||
# Discover workflows
|
||||
self.workflows = await self.discovery.discover_workflows()
|
||||
|
||||
if not self.workflows:
|
||||
logger.warning("No workflows discovered")
|
||||
else:
|
||||
logger.info(
|
||||
f"Discovered {len(self.workflows)} workflows: "
|
||||
f"{list(self.workflows.keys())}"
|
||||
)
|
||||
|
||||
# Connect to Temporal
|
||||
self.client = await Client.connect(
|
||||
self.temporal_address,
|
||||
namespace=self.temporal_namespace
|
||||
)
|
||||
logger.info(f"✓ Connected to Temporal: {self.temporal_address}")
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to initialize Temporal manager: {e}", exc_info=True)
|
||||
raise
|
||||
|
||||
async def close(self):
|
||||
"""Close Temporal client connection."""
|
||||
if self.client:
|
||||
# Temporal client doesn't need explicit close in Python SDK
|
||||
pass
|
||||
|
||||
async def get_workflows(self) -> Dict[str, WorkflowInfo]:
|
||||
"""
|
||||
Get all discovered workflows.
|
||||
|
||||
Returns:
|
||||
Dictionary mapping workflow names to their info
|
||||
"""
|
||||
return self.workflows
|
||||
|
||||
async def get_workflow(self, name: str) -> Optional[WorkflowInfo]:
|
||||
"""
|
||||
Get workflow info by name.
|
||||
|
||||
Args:
|
||||
name: Workflow name
|
||||
|
||||
Returns:
|
||||
WorkflowInfo or None if not found
|
||||
"""
|
||||
return self.workflows.get(name)
|
||||
|
||||
async def upload_target(
|
||||
self,
|
||||
file_path: Path,
|
||||
user_id: str,
|
||||
metadata: Optional[Dict[str, Any]] = None
|
||||
) -> str:
|
||||
"""
|
||||
Upload target file to storage.
|
||||
|
||||
Args:
|
||||
file_path: Local path to file
|
||||
user_id: User ID
|
||||
metadata: Optional metadata
|
||||
|
||||
Returns:
|
||||
Target ID for use in workflow execution
|
||||
"""
|
||||
target_id = await self.storage.upload_target(file_path, user_id, metadata)
|
||||
logger.info(f"Uploaded target: {target_id}")
|
||||
return target_id
|
||||
|
||||
async def run_workflow(
|
||||
self,
|
||||
workflow_name: str,
|
||||
target_id: str,
|
||||
workflow_params: Optional[Dict[str, Any]] = None,
|
||||
workflow_id: Optional[str] = None
|
||||
) -> WorkflowHandle:
|
||||
"""
|
||||
Execute a workflow.
|
||||
|
||||
Args:
|
||||
workflow_name: Name of workflow to execute
|
||||
target_id: Target ID (from upload_target)
|
||||
workflow_params: Additional workflow parameters
|
||||
workflow_id: Optional workflow ID (generated if not provided)
|
||||
|
||||
Returns:
|
||||
WorkflowHandle for monitoring/results
|
||||
|
||||
Raises:
|
||||
ValueError: If workflow not found or client not initialized
|
||||
"""
|
||||
if not self.client:
|
||||
raise ValueError("Temporal client not initialized. Call initialize() first.")
|
||||
|
||||
# Get workflow info
|
||||
workflow_info = self.workflows.get(workflow_name)
|
||||
if not workflow_info:
|
||||
raise ValueError(f"Workflow not found: {workflow_name}")
|
||||
|
||||
# Generate workflow ID if not provided
|
||||
if not workflow_id:
|
||||
workflow_id = f"{workflow_name}-{str(uuid4())[:8]}"
|
||||
|
||||
# Prepare workflow input arguments
|
||||
workflow_params = workflow_params or {}
|
||||
|
||||
# Build args list: [target_id, ...workflow_params values]
|
||||
# The workflow parameters are passed as individual positional args
|
||||
workflow_args = [target_id]
|
||||
|
||||
# Add parameters in dict insertion order; callers must build workflow_params in the order of the workflow signature
|
||||
# For security_assessment: scanner_config, analyzer_config, reporter_config
|
||||
# For atheris_fuzzing: target_file, max_iterations, timeout_seconds
|
||||
if workflow_params:
|
||||
workflow_args.extend(workflow_params.values())
|
||||
|
||||
# Determine task queue from workflow vertical
|
||||
vertical = workflow_info.metadata.get("vertical", "default")
|
||||
task_queue = f"{vertical}-queue"
|
||||
|
||||
logger.info(
|
||||
f"Starting workflow: {workflow_name} "
|
||||
f"(id={workflow_id}, queue={task_queue}, target={target_id})"
|
||||
)
|
||||
|
||||
try:
|
||||
# Start workflow execution with positional arguments
|
||||
handle = await self.client.start_workflow(
|
||||
workflow=workflow_info.workflow_type, # Workflow class name
|
||||
args=workflow_args, # Positional arguments
|
||||
id=workflow_id,
|
||||
task_queue=task_queue,
|
||||
retry_policy=RetryPolicy(
|
||||
initial_interval=timedelta(seconds=1),
|
||||
maximum_interval=timedelta(minutes=1),
|
||||
maximum_attempts=3
|
||||
)
|
||||
)
|
||||
|
||||
logger.info(f"✓ Workflow started: {workflow_id}")
|
||||
return handle
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to start workflow {workflow_name}: {e}", exc_info=True)
|
||||
raise
|
||||
|
||||
    async def get_workflow_status(self, workflow_id: str) -> Dict[str, Any]:
        """
        Get workflow execution status.

        Args:
            workflow_id: Workflow execution ID

        Returns:
            Status dictionary with workflow state

        Raises:
            ValueError: If client not initialized or workflow not found
        """
        if not self.client:
            raise ValueError("Temporal client not initialized")

        try:
            # Get workflow handle
            handle = self.client.get_workflow_handle(workflow_id)

            # Describe the execution: returns its current state without waiting for completion
            description = await handle.describe()

            status = {
                "workflow_id": workflow_id,
                "status": description.status.name,
                "start_time": description.start_time.isoformat() if description.start_time else None,
                "execution_time": description.execution_time.isoformat() if description.execution_time else None,
                "close_time": description.close_time.isoformat() if description.close_time else None,
                "task_queue": description.task_queue,
            }

            logger.info(f"Workflow {workflow_id} status: {status['status']}")
            return status

        except Exception as e:
            logger.error(f"Failed to get workflow status: {e}", exc_info=True)
            raise

    async def get_workflow_result(
        self,
        workflow_id: str,
        timeout: Optional[timedelta] = None
    ) -> Any:
        """
        Get workflow execution result (blocking).

        Args:
            workflow_id: Workflow execution ID
            timeout: Maximum time to wait for result

        Returns:
            Workflow result

        Raises:
            ValueError: If client not initialized
            TimeoutError: If timeout exceeded
        """
        if not self.client:
            raise ValueError("Temporal client not initialized")

        try:
            handle = self.client.get_workflow_handle(workflow_id)

            logger.info(f"Waiting for workflow result: {workflow_id}")

            # Wait for workflow to complete and get result
            if timeout:
                # Use asyncio timeout if provided
                import asyncio
                result = await asyncio.wait_for(handle.result(), timeout=timeout.total_seconds())
            else:
                result = await handle.result()

            logger.info(f"✓ Workflow {workflow_id} completed")
            return result

        except Exception as e:
            logger.error(f"Failed to get workflow result: {e}", exc_info=True)
            raise

    async def cancel_workflow(self, workflow_id: str) -> None:
        """
        Cancel a running workflow.

        Args:
            workflow_id: Workflow execution ID

        Raises:
            ValueError: If client not initialized
        """
        if not self.client:
            raise ValueError("Temporal client not initialized")

        try:
            handle = self.client.get_workflow_handle(workflow_id)
            await handle.cancel()

            logger.info(f"✓ Workflow cancelled: {workflow_id}")

        except Exception as e:
            logger.error(f"Failed to cancel workflow: {e}", exc_info=True)
            raise

    async def list_workflows(
        self,
        filter_query: Optional[str] = None,
        limit: int = 100
    ) -> list[Dict[str, Any]]:
        """
        List workflow executions.

        Args:
            filter_query: Optional Temporal list filter query
            limit: Maximum number of results

        Returns:
            List of workflow execution info

        Raises:
            ValueError: If client not initialized
        """
        if not self.client:
            raise ValueError("Temporal client not initialized")

        try:
            workflows = []

            # Use Temporal's list API
            async for workflow in self.client.list_workflows(filter_query):
                workflows.append({
                    "workflow_id": workflow.id,
                    "workflow_type": workflow.workflow_type,
                    "status": workflow.status.name,
                    "start_time": workflow.start_time.isoformat() if workflow.start_time else None,
                    "close_time": workflow.close_time.isoformat() if workflow.close_time else None,
                    "task_queue": workflow.task_queue,
                })

                if len(workflows) >= limit:
                    break

            logger.info(f"Listed {len(workflows)} workflows")
            return workflows

        except Exception as e:
            logger.error(f"Failed to list workflows: {e}", exc_info=True)
            raise
@@ -0,0 +1,119 @@
# FuzzForge Test Suite

Comprehensive test infrastructure for FuzzForge modules and workflows.

## Directory Structure

```
tests/
├── conftest.py              # Shared pytest fixtures
├── unit/                    # Fast, isolated unit tests
│   ├── test_modules/        # Module-specific tests
│   │   ├── test_cargo_fuzzer.py
│   │   └── test_atheris_fuzzer.py
│   ├── test_workflows/      # Workflow tests
│   └── test_api/            # API endpoint tests
├── integration/             # Integration tests (requires Docker)
└── fixtures/                # Test data and projects
    ├── test_projects/       # Vulnerable projects for testing
    └── expected_results/    # Expected output for validation
```

## Running Tests

### All Tests
```bash
cd backend
pytest tests/ -v
```

### Unit Tests Only (Fast)
```bash
pytest tests/unit/ -v
```

### Integration Tests (Requires Docker)
```bash
# Start services
docker-compose up -d

# Run integration tests
pytest tests/integration/ -v

# Cleanup
docker-compose down
```

### With Coverage
```bash
pytest tests/ --cov=toolbox/modules --cov=src --cov-report=html
```

### Parallel Execution
```bash
pytest tests/unit/ -n auto
```

## Available Fixtures

### Workspace Fixtures
- `temp_workspace`: Empty temporary workspace
- `python_test_workspace`: Python project with vulnerabilities
- `rust_test_workspace`: Rust project with fuzz targets

### Module Fixtures
- `atheris_fuzzer`: AtherisFuzzer instance
- `cargo_fuzzer`: CargoFuzzer instance
- `file_scanner`: FileScanner instance

### Configuration Fixtures
- `atheris_config`: Default Atheris configuration
- `cargo_fuzz_config`: Default cargo-fuzz configuration
- `gitleaks_config`: Default Gitleaks configuration

### Mock Fixtures
- `mock_stats_callback`: Mock stats callback for fuzzing
- `mock_temporal_context`: Mock Temporal activity context

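As a rough illustration of how the mock fixtures behave, the sketch below rebuilds a recording callback with the same shape as `mock_stats_callback` (an async function that stores every payload on a `stats_received` attribute) and drives it from a stand-in loop. The names `make_stats_callback` and `fake_fuzz_loop` are hypothetical and exist only for this example; the real fixture lives in `conftest.py`.

```python
import asyncio
from typing import Any, Dict, List


def make_stats_callback():
    """Build a recording callback shaped like the `mock_stats_callback` fixture."""
    stats_received: List[Dict[str, Any]] = []

    async def callback(stats: Dict[str, Any]) -> None:
        stats_received.append(stats)

    # Expose the recorded payloads on the function object itself.
    callback.stats_received = stats_received
    return callback


async def fake_fuzz_loop(stats_callback, rounds: int = 3) -> None:
    """Hypothetical stand-in for a fuzzer that reports progress once per round."""
    for i in range(1, rounds + 1):
        await stats_callback({"total_execs": i * 100, "crashes": 0})


recorder = make_stats_callback()
asyncio.run(fake_fuzz_loop(recorder))
print(len(recorder.stats_received))
```

In a test, the assertion is then made against `stats_received` rather than against any return value of the module under test.
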
## Writing Tests

### Unit Test Example
```python
import pytest

@pytest.mark.asyncio
async def test_module_execution(cargo_fuzzer, rust_test_workspace, cargo_fuzz_config):
    """Test module execution"""
    result = await cargo_fuzzer.execute(cargo_fuzz_config, rust_test_workspace)

    assert result.status == "success"
    assert result.execution_time > 0
```

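The module tests in this suite avoid launching real fuzzers by patching internal coroutines with `unittest.mock.AsyncMock`. A minimal sketch of that pattern follows, assuming a hypothetical `DemoFuzzer` class that is not part of FuzzForge; only the patching technique carries over to the real modules.

```python
import asyncio
from unittest.mock import AsyncMock, patch


class DemoFuzzer:
    """Hypothetical module; only its shape matters for this sketch."""

    async def _run_fuzzing(self, workspace):
        raise RuntimeError("would launch a real fuzzer")

    async def execute(self, config, workspace):
        findings, stats = await self._run_fuzzing(workspace)
        return {"status": "success", "findings": findings, "stats": stats}


async def run_with_mock():
    fuzzer = DemoFuzzer()
    # The mock returns the (findings, stats) tuple that execute() unpacks.
    fake = AsyncMock(return_value=([], {"total_executions": 10}))
    with patch.object(fuzzer, "_run_fuzzing", new=fake):
        result = await fuzzer.execute({}, "/tmp/workspace")
    fake.assert_awaited_once()
    return result


result = asyncio.run(run_with_mock())
```

Patching on the instance rather than the class keeps the replacement local to one test, so other tests that share the class see the original method.
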
### Integration Test Example
```python
import pytest

@pytest.mark.asyncio
@pytest.mark.integration
async def test_end_to_end_workflow():
    """Test complete workflow execution"""
    # Test full workflow with real services
    pass
```

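The `integration` marker is a custom one. If it is not already registered in the project's pytest configuration (an assumption; check the existing configuration first), a hook along these lines in `conftest.py` would register it so that pytest does not warn about an unknown marker. The marker description text is illustrative.

```python
def pytest_configure(config):
    """Register the custom `integration` marker with pytest."""
    config.addinivalue_line(
        "markers",
        "integration: tests that need running services (Docker)",
    )
```

With the marker registered, `pytest -m "not integration"` deselects these tests and `pytest -m integration` runs only them.
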
## CI/CD Integration

Tests run automatically on:
- **Push to main/develop**: Full test suite
- **Pull requests**: Full test suite + coverage
- **Nightly**: Extended integration tests

See `.github/workflows/test.yml` for configuration.

## Code Coverage

Target coverage: **80%+** for core modules

View coverage report:
```bash
pytest tests/ --cov --cov-report=html
open htmlcov/index.html
```
@@ -11,9 +11,220 @@

import sys
from pathlib import Path
from typing import Dict, Any
import pytest

# Ensure project root is on sys.path so `src` is importable
ROOT = Path(__file__).resolve().parents[1]
if str(ROOT) not in sys.path:
    sys.path.insert(0, str(ROOT))

# Add toolbox to path for module imports
TOOLBOX = ROOT / "toolbox"
if str(TOOLBOX) not in sys.path:
    sys.path.insert(0, str(TOOLBOX))


# ============================================================================
# Workspace Fixtures
# ============================================================================

@pytest.fixture
def temp_workspace(tmp_path):
    """Create a temporary workspace directory for testing"""
    workspace = tmp_path / "workspace"
    workspace.mkdir()
    return workspace


@pytest.fixture
def python_test_workspace(temp_workspace):
    """Create a Python test workspace with sample files"""
    # Create a simple Python project structure
    (temp_workspace / "main.py").write_text("""
def process_data(data):
    # Intentional bug: no bounds checking
    return data[0:100]

def divide(a, b):
    # Division by zero vulnerability
    return a / b
""")

    (temp_workspace / "config.py").write_text("""
# Hardcoded secrets for testing
API_KEY = "sk_test_1234567890abcdef"
DATABASE_URL = "postgresql://admin:password123@localhost/db"
AWS_SECRET = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
""")

    return temp_workspace


@pytest.fixture
def rust_test_workspace(temp_workspace):
    """Create a Rust test workspace with fuzz targets"""
    # Create Cargo.toml
    (temp_workspace / "Cargo.toml").write_text("""[package]
name = "test_project"
version = "0.1.0"
edition = "2021"

[dependencies]
""")

    # Create src/lib.rs
    src_dir = temp_workspace / "src"
    src_dir.mkdir()
    (src_dir / "lib.rs").write_text("""
pub fn process_buffer(data: &[u8]) -> Vec<u8> {
    if data.len() < 4 {
        return Vec::new();
    }

    // Vulnerability: bounds checking issue
    let size = data[0] as usize;
    let mut result = Vec::new();
    for i in 0..size {
        result.push(data[i]);
    }
    result
}
""")

    # Create fuzz directory structure
    fuzz_dir = temp_workspace / "fuzz"
    fuzz_dir.mkdir()

    (fuzz_dir / "Cargo.toml").write_text("""[package]
name = "test_project-fuzz"
version = "0.0.0"
edition = "2021"

[dependencies]
libfuzzer-sys = "0.4"

[dependencies.test_project]
path = ".."

[[bin]]
name = "fuzz_target_1"
path = "fuzz_targets/fuzz_target_1.rs"
""")

    fuzz_targets_dir = fuzz_dir / "fuzz_targets"
    fuzz_targets_dir.mkdir()

    (fuzz_targets_dir / "fuzz_target_1.rs").write_text("""#![no_main]
use libfuzzer_sys::fuzz_target;
use test_project::process_buffer;

fuzz_target!(|data: &[u8]| {
    let _ = process_buffer(data);
});
""")

    return temp_workspace


# ============================================================================
# Module Configuration Fixtures
# ============================================================================

@pytest.fixture
def atheris_config():
    """Default Atheris fuzzer configuration"""
    return {
        "target_file": "auto-discover",
        "max_iterations": 1000,
        "timeout_seconds": 10,
        "corpus_dir": None
    }


@pytest.fixture
def cargo_fuzz_config():
    """Default cargo-fuzz configuration"""
    return {
        "target_name": None,
        "max_iterations": 1000,
        "timeout_seconds": 10,
        "sanitizer": "address"
    }


@pytest.fixture
def gitleaks_config():
    """Default Gitleaks configuration"""
    return {
        "config_path": None,
        "scan_uncommitted": True
    }


@pytest.fixture
def file_scanner_config():
    """Default file scanner configuration"""
    return {
        "scan_patterns": ["*.py", "*.rs", "*.js"],
        "exclude_patterns": ["*.test.*", "*.spec.*"],
        "max_file_size": 1048576  # 1MB
    }


# ============================================================================
# Module Instance Fixtures
# ============================================================================

@pytest.fixture
def atheris_fuzzer():
    """Create an AtherisFuzzer instance"""
    from modules.fuzzer.atheris_fuzzer import AtherisFuzzer
    return AtherisFuzzer()


@pytest.fixture
def cargo_fuzzer():
    """Create a CargoFuzzer instance"""
    from modules.fuzzer.cargo_fuzzer import CargoFuzzer
    return CargoFuzzer()


@pytest.fixture
def file_scanner():
    """Create a FileScanner instance"""
    from modules.scanner.file_scanner import FileScanner
    return FileScanner()


# ============================================================================
# Mock Fixtures
# ============================================================================

@pytest.fixture
def mock_stats_callback():
    """Mock stats callback for fuzzing"""
    stats_received = []

    async def callback(stats: Dict[str, Any]):
        stats_received.append(stats)

    callback.stats_received = stats_received
    return callback


@pytest.fixture
def mock_temporal_context():
    """Mock Temporal activity context"""
    class MockActivityInfo:
        def __init__(self):
            self.workflow_id = "test-workflow-123"
            self.activity_id = "test-activity-1"
            self.attempt = 1

    class MockContext:
        def __init__(self):
            self.info = MockActivityInfo()

    return MockContext()

@@ -1,82 +0,0 @@
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.

import asyncio
from datetime import datetime, timezone, timedelta


from src.services.prefect_stats_monitor import PrefectStatsMonitor
from src.api import fuzzing


class FakeLog:
    def __init__(self, message: str):
        self.message = message


class FakeClient:
    def __init__(self, logs):
        self._logs = logs

    async def read_logs(self, log_filter=None, limit=100, sort="TIMESTAMP_ASC"):
        return self._logs


class FakeTaskRun:
    def __init__(self):
        self.id = "task-1"
        self.start_time = datetime.now(timezone.utc) - timedelta(seconds=5)


def test_parse_stats_from_log_fuzzing():
    mon = PrefectStatsMonitor()
    msg = (
        "INFO LIVE_STATS extra={'stats_type': 'fuzzing_live_update', "
        "'executions': 42, 'executions_per_sec': 3.14, 'crashes': 1, 'unique_crashes': 1, 'corpus_size': 9}"
    )
    stats = mon._parse_stats_from_log(msg)
    assert stats is not None
    assert stats["stats_type"] == "fuzzing_live_update"
    assert stats["executions"] == 42


def test_extract_stats_updates_and_broadcasts():
    mon = PrefectStatsMonitor()
    run_id = "run-123"
    workflow = "wf"
    fuzzing.initialize_fuzzing_tracking(run_id, workflow)

    # Prepare a fake websocket to capture messages
    sent = []

    class FakeWS:
        async def send_text(self, text: str):
            sent.append(text)

    fuzzing.active_connections[run_id] = [FakeWS()]

    # Craft a log line the parser understands
    msg = (
        "INFO LIVE_STATS extra={'stats_type': 'fuzzing_live_update', "
        "'executions': 10, 'executions_per_sec': 1.5, 'crashes': 0, 'unique_crashes': 0, 'corpus_size': 2}"
    )
    fake_client = FakeClient([FakeLog(msg)])
    task_run = FakeTaskRun()

    asyncio.run(mon._extract_stats_from_task(fake_client, run_id, task_run, workflow))

    # Verify stats updated
    stats = fuzzing.fuzzing_stats[run_id]
    assert stats.executions == 10
    assert stats.executions_per_sec == 1.5

    # Verify a message was sent to WebSocket
    assert sent, "Expected a stats_update message to be sent"
@@ -0,0 +1,177 @@
"""
Unit tests for AtherisFuzzer module
"""

import pytest
from unittest.mock import AsyncMock, patch


@pytest.mark.asyncio
class TestAtherisFuzzerMetadata:
    """Test AtherisFuzzer metadata"""

    async def test_metadata_structure(self, atheris_fuzzer):
        """Test that module metadata is properly defined"""
        metadata = atheris_fuzzer.get_metadata()

        assert metadata.name == "atheris_fuzzer"
        assert metadata.category == "fuzzer"
        assert "fuzzing" in metadata.tags
        assert "python" in metadata.tags


@pytest.mark.asyncio
class TestAtherisFuzzerConfigValidation:
    """Test configuration validation"""

    async def test_valid_config(self, atheris_fuzzer, atheris_config):
        """Test validation of valid configuration"""
        assert atheris_fuzzer.validate_config(atheris_config) is True

    async def test_invalid_max_iterations(self, atheris_fuzzer):
        """Test validation fails with invalid max_iterations"""
        config = {
            "target_file": "fuzz_target.py",
            "max_iterations": -1,
            "timeout_seconds": 10
        }
        with pytest.raises(ValueError, match="max_iterations"):
            atheris_fuzzer.validate_config(config)

    async def test_invalid_timeout(self, atheris_fuzzer):
        """Test validation fails with invalid timeout"""
        config = {
            "target_file": "fuzz_target.py",
            "max_iterations": 1000,
            "timeout_seconds": 0
        }
        with pytest.raises(ValueError, match="timeout_seconds"):
            atheris_fuzzer.validate_config(config)


@pytest.mark.asyncio
class TestAtherisFuzzerDiscovery:
    """Test fuzz target discovery"""

    async def test_auto_discover(self, atheris_fuzzer, python_test_workspace):
        """Test auto-discovery of Python fuzz targets"""
        # Create a fuzz target file
        (python_test_workspace / "fuzz_target.py").write_text("""
import atheris
import sys

def TestOneInput(data):
    pass

if __name__ == "__main__":
    atheris.Setup(sys.argv, TestOneInput)
    atheris.Fuzz()
""")

        # Pass None for auto-discovery
        target = atheris_fuzzer._discover_target(python_test_workspace, None)

        assert target is not None
        assert "fuzz_target.py" in str(target)


@pytest.mark.asyncio
class TestAtherisFuzzerExecution:
    """Test fuzzer execution logic"""

    async def test_execution_creates_result(self, atheris_fuzzer, python_test_workspace, atheris_config):
        """Test that execution returns a ModuleResult"""
        # Create a simple fuzz target
        (python_test_workspace / "fuzz_target.py").write_text("""
import atheris
import sys

def TestOneInput(data):
    if len(data) > 0:
        pass

if __name__ == "__main__":
    atheris.Setup(sys.argv, TestOneInput)
    atheris.Fuzz()
""")

        # Use a very short timeout for testing
        test_config = {
            "target_file": "fuzz_target.py",
            "max_iterations": 10,
            "timeout_seconds": 1
        }

        # Mock the fuzzing subprocess to avoid actual execution
        with patch.object(atheris_fuzzer, '_run_fuzzing', new_callable=AsyncMock, return_value=([], {"total_executions": 10})):
            result = await atheris_fuzzer.execute(test_config, python_test_workspace)

        assert result.module == "atheris_fuzzer"
        assert result.status in ["success", "partial", "failed"]
        assert isinstance(result.execution_time, float)


@pytest.mark.asyncio
class TestAtherisFuzzerStatsCallback:
    """Test stats callback functionality"""

    async def test_stats_callback_invoked(self, atheris_fuzzer, python_test_workspace, atheris_config, mock_stats_callback):
        """Test that stats callback is invoked during fuzzing"""
        (python_test_workspace / "fuzz_target.py").write_text("""
import atheris
import sys

def TestOneInput(data):
    pass

if __name__ == "__main__":
    atheris.Setup(sys.argv, TestOneInput)
    atheris.Fuzz()
""")

        # Mock fuzzing to simulate stats
        async def mock_run_fuzzing(test_one_input, target_path, workspace, max_iterations, timeout_seconds, stats_callback):
            if stats_callback:
                await stats_callback({
                    "total_execs": 100,
                    "execs_per_sec": 10.0,
                    "crashes": 0,
                    "coverage": 5,
                    "corpus_size": 2,
                    "elapsed_time": 10
                })
            return

        with patch.object(atheris_fuzzer, '_run_fuzzing', side_effect=mock_run_fuzzing):
            with patch.object(atheris_fuzzer, '_load_target_module', return_value=lambda x: None):
                # Put stats_callback in config dict, not as kwarg
                atheris_config["target_file"] = "fuzz_target.py"
                atheris_config["stats_callback"] = mock_stats_callback
                await atheris_fuzzer.execute(atheris_config, python_test_workspace)

        # Verify callback was invoked
        assert len(mock_stats_callback.stats_received) > 0


@pytest.mark.asyncio
class TestAtherisFuzzerFindingGeneration:
    """Test finding generation from crashes"""

    async def test_create_crash_finding(self, atheris_fuzzer):
        """Test crash finding creation"""
        finding = atheris_fuzzer.create_finding(
            title="Crash: Exception in TestOneInput",
            description="IndexError: list index out of range",
            severity="high",
            category="crash",
            file_path="fuzz_target.py",
            metadata={
                "crash_type": "IndexError",
                "stack_trace": "Traceback..."
            }
        )

        assert finding.title == "Crash: Exception in TestOneInput"
        assert finding.severity == "high"
        assert finding.category == "crash"
        assert "IndexError" in finding.metadata["crash_type"]
@@ -0,0 +1,177 @@
"""
Unit tests for CargoFuzzer module
"""

import pytest
from unittest.mock import AsyncMock, patch


@pytest.mark.asyncio
class TestCargoFuzzerMetadata:
    """Test CargoFuzzer metadata"""

    async def test_metadata_structure(self, cargo_fuzzer):
        """Test that module metadata is properly defined"""
        metadata = cargo_fuzzer.get_metadata()

        assert metadata.name == "cargo_fuzz"
        assert metadata.version == "0.11.2"
        assert metadata.category == "fuzzer"
        assert "fuzzing" in metadata.tags
        assert "rust" in metadata.tags


@pytest.mark.asyncio
class TestCargoFuzzerConfigValidation:
    """Test configuration validation"""

    async def test_valid_config(self, cargo_fuzzer, cargo_fuzz_config):
        """Test validation of valid configuration"""
        assert cargo_fuzzer.validate_config(cargo_fuzz_config) is True

    async def test_invalid_max_iterations(self, cargo_fuzzer):
        """Test validation fails with invalid max_iterations"""
        config = {
            "max_iterations": -1,
            "timeout_seconds": 10,
            "sanitizer": "address"
        }
        with pytest.raises(ValueError, match="max_iterations"):
            cargo_fuzzer.validate_config(config)

    async def test_invalid_timeout(self, cargo_fuzzer):
        """Test validation fails with invalid timeout"""
        config = {
            "max_iterations": 1000,
            "timeout_seconds": 0,
            "sanitizer": "address"
        }
        with pytest.raises(ValueError, match="timeout_seconds"):
            cargo_fuzzer.validate_config(config)

    async def test_invalid_sanitizer(self, cargo_fuzzer):
        """Test validation fails with invalid sanitizer"""
        config = {
            "max_iterations": 1000,
            "timeout_seconds": 10,
            "sanitizer": "invalid_sanitizer"
        }
        with pytest.raises(ValueError, match="sanitizer"):
            cargo_fuzzer.validate_config(config)


@pytest.mark.asyncio
class TestCargoFuzzerWorkspaceValidation:
    """Test workspace validation"""

    async def test_valid_workspace(self, cargo_fuzzer, rust_test_workspace):
        """Test validation of valid workspace"""
        assert cargo_fuzzer.validate_workspace(rust_test_workspace) is True

    async def test_nonexistent_workspace(self, cargo_fuzzer, tmp_path):
        """Test validation fails with nonexistent workspace"""
        nonexistent = tmp_path / "does_not_exist"
        with pytest.raises(ValueError, match="does not exist"):
            cargo_fuzzer.validate_workspace(nonexistent)

    async def test_workspace_is_file(self, cargo_fuzzer, tmp_path):
        """Test validation fails when workspace is a file"""
        file_path = tmp_path / "file.txt"
        file_path.write_text("test")
        with pytest.raises(ValueError, match="not a directory"):
            cargo_fuzzer.validate_workspace(file_path)


@pytest.mark.asyncio
class TestCargoFuzzerDiscovery:
    """Test fuzz target discovery"""

    async def test_discover_targets(self, cargo_fuzzer, rust_test_workspace):
        """Test discovery of fuzz targets"""
        targets = await cargo_fuzzer._discover_fuzz_targets(rust_test_workspace)

        assert len(targets) == 1
        assert "fuzz_target_1" in targets

    async def test_no_fuzz_directory(self, cargo_fuzzer, temp_workspace):
        """Test discovery with no fuzz directory"""
        targets = await cargo_fuzzer._discover_fuzz_targets(temp_workspace)

        assert targets == []


@pytest.mark.asyncio
class TestCargoFuzzerExecution:
    """Test fuzzer execution logic"""

    async def test_execution_creates_result(self, cargo_fuzzer, rust_test_workspace, cargo_fuzz_config):
        """Test that execution returns a ModuleResult"""
        # Mock the build and run methods to avoid actual fuzzing
        with patch.object(cargo_fuzzer, '_build_fuzz_target', new_callable=AsyncMock, return_value=True):
            with patch.object(cargo_fuzzer, '_run_fuzzing', new_callable=AsyncMock, return_value=([], {"total_executions": 0, "crashes_found": 0})):
                with patch.object(cargo_fuzzer, '_parse_crash_artifacts', new_callable=AsyncMock, return_value=[]):
                    result = await cargo_fuzzer.execute(cargo_fuzz_config, rust_test_workspace)

        assert result.module == "cargo_fuzz"
        assert result.status == "success"
        assert isinstance(result.execution_time, float)
        assert result.execution_time >= 0

    async def test_execution_with_no_targets(self, cargo_fuzzer, temp_workspace, cargo_fuzz_config):
        """Test execution fails gracefully with no fuzz targets"""
        result = await cargo_fuzzer.execute(cargo_fuzz_config, temp_workspace)

        assert result.status == "failed"
        assert "No fuzz targets found" in result.error


@pytest.mark.asyncio
class TestCargoFuzzerStatsCallback:
    """Test stats callback functionality"""

    async def test_stats_callback_invoked(self, cargo_fuzzer, rust_test_workspace, cargo_fuzz_config, mock_stats_callback):
        """Test that stats callback is invoked during fuzzing"""
        # Mock build/run to simulate stats generation
        async def mock_run_fuzzing(workspace, target, config, callback):
            # Simulate stats callback
            if callback:
                await callback({
                    "total_execs": 1000,
                    "execs_per_sec": 100.0,
                    "crashes": 0,
                    "coverage": 10,
                    "corpus_size": 5,
                    "elapsed_time": 10
                })
            return [], {"total_executions": 1000}

        with patch.object(cargo_fuzzer, '_build_fuzz_target', new_callable=AsyncMock, return_value=True):
            with patch.object(cargo_fuzzer, '_run_fuzzing', side_effect=mock_run_fuzzing):
                with patch.object(cargo_fuzzer, '_parse_crash_artifacts', new_callable=AsyncMock, return_value=[]):
                    await cargo_fuzzer.execute(cargo_fuzz_config, rust_test_workspace, stats_callback=mock_stats_callback)

        # Verify callback was invoked
        assert len(mock_stats_callback.stats_received) > 0
        assert mock_stats_callback.stats_received[0]["total_execs"] == 1000


@pytest.mark.asyncio
class TestCargoFuzzerFindingGeneration:
    """Test finding generation from crashes"""

    async def test_create_finding_from_crash(self, cargo_fuzzer):
        """Test finding creation"""
        finding = cargo_fuzzer.create_finding(
            title="Crash: Segmentation Fault",
            description="Test crash",
            severity="critical",
            category="crash",
            file_path="fuzz/fuzz_targets/fuzz_target_1.rs",
            metadata={"crash_type": "SIGSEGV"}
        )

        assert finding.title == "Crash: Segmentation Fault"
        assert finding.severity == "critical"
        assert finding.category == "crash"
        assert finding.file_path == "fuzz/fuzz_targets/fuzz_target_1.rs"
        assert finding.metadata["crash_type"] == "SIGSEGV"
@@ -0,0 +1,349 @@
"""
Unit tests for FileScanner module
"""

import sys
from pathlib import Path

import pytest

sys.path.insert(0, str(Path(__file__).resolve().parents[3] / "toolbox"))


@pytest.mark.asyncio
class TestFileScannerMetadata:
    """Test FileScanner metadata"""

    async def test_metadata_structure(self, file_scanner):
        """Test that metadata has correct structure"""
        metadata = file_scanner.get_metadata()

        assert metadata.name == "file_scanner"
        assert metadata.version == "1.0.0"
        assert metadata.category == "scanner"
        assert "files" in metadata.tags
        assert "enumeration" in metadata.tags
        assert metadata.requires_workspace is True


@pytest.mark.asyncio
class TestFileScannerConfigValidation:
    """Test configuration validation"""

    async def test_valid_config(self, file_scanner):
        """Test that valid config passes validation"""
        config = {
            "patterns": ["*.py", "*.js"],
            "max_file_size": 1048576,
            "check_sensitive": True,
            "calculate_hashes": False
        }
        assert file_scanner.validate_config(config) is True

    async def test_default_config(self, file_scanner):
        """Test that empty config uses defaults"""
        config = {}
        assert file_scanner.validate_config(config) is True

    async def test_invalid_patterns_type(self, file_scanner):
        """Test that non-list patterns raises error"""
        config = {"patterns": "*.py"}
        with pytest.raises(ValueError, match="patterns must be a list"):
            file_scanner.validate_config(config)

    async def test_invalid_max_file_size(self, file_scanner):
        """Test that invalid max_file_size raises error"""
        config = {"max_file_size": -1}
        with pytest.raises(ValueError, match="max_file_size must be a positive integer"):
            file_scanner.validate_config(config)

    async def test_invalid_max_file_size_type(self, file_scanner):
        """Test that non-integer max_file_size raises error"""
        config = {"max_file_size": "large"}
        with pytest.raises(ValueError, match="max_file_size must be a positive integer"):
            file_scanner.validate_config(config)


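Review note: taken together, the validation tests above imply a small contract: an empty config is accepted, `patterns` must be a list, and `max_file_size` must be a positive integer. The sketch below is one way a validator could satisfy that contract. It is a hypothetical standalone function, not the FileScanner implementation, and the default values are assumptions.

```python
from typing import Any, Dict


def validate_config(config: Dict[str, Any]) -> bool:
    """Hypothetical validator mirroring the error messages the tests match."""
    patterns = config.get("patterns", ["*"])  # assumed default
    if not isinstance(patterns, list):
        raise ValueError("patterns must be a list")

    max_size = config.get("max_file_size", 10 * 1024 * 1024)  # assumed default
    # bool is a subclass of int, so reject it explicitly
    if isinstance(max_size, bool) or not isinstance(max_size, int) or max_size <= 0:
        raise ValueError("max_file_size must be a positive integer")

    return True
```

Raising `ValueError` with a fixed message is what lets `pytest.raises(..., match=...)` pin the failure to a specific field.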
@pytest.mark.asyncio
class TestFileScannerExecution:
    """Test scanner execution"""

    async def test_scan_python_files(self, file_scanner, python_test_workspace):
        """Test scanning Python files"""
        config = {
            "patterns": ["*.py"],
            "check_sensitive": False,
            "calculate_hashes": False
        }

        result = await file_scanner.execute(config, python_test_workspace)

        assert result.module == "file_scanner"
        assert result.status == "success"
        assert len(result.findings) > 0

        # Check that Python files were found
        python_files = [f for f in result.findings if f.file_path.endswith('.py')]
        assert len(python_files) > 0

    async def test_scan_all_files(self, file_scanner, python_test_workspace):
        """Test scanning all files with wildcard"""
        config = {
            "patterns": ["*"],
            "check_sensitive": False,
            "calculate_hashes": False
        }

        result = await file_scanner.execute(config, python_test_workspace)

        assert result.status == "success"
        assert len(result.findings) > 0
        assert result.summary["total_files"] > 0

    async def test_scan_with_multiple_patterns(self, file_scanner, python_test_workspace):
        """Test scanning with multiple patterns"""
        config = {
            "patterns": ["*.py", "*.txt"],
            "check_sensitive": False,
            "calculate_hashes": False
        }

        result = await file_scanner.execute(config, python_test_workspace)

        assert result.status == "success"
        assert len(result.findings) > 0

    async def test_empty_workspace(self, file_scanner, temp_workspace):
        """Test scanning empty workspace"""
        config = {
            "patterns": ["*.py"],
            "check_sensitive": False
        }

        result = await file_scanner.execute(config, temp_workspace)

        assert result.status == "success"
        assert len(result.findings) == 0
        assert result.summary["total_files"] == 0


@pytest.mark.asyncio
class TestFileScannerSensitiveDetection:
    """Test sensitive file detection"""

    async def test_detect_env_file(self, file_scanner, temp_workspace):
        """Test detection of .env file"""
        # Create .env file
        (temp_workspace / ".env").write_text("API_KEY=secret123")

        config = {
            "patterns": ["*"],
            "check_sensitive": True,
            "calculate_hashes": False
        }

        result = await file_scanner.execute(config, temp_workspace)

        assert result.status == "success"

        # Check for sensitive file finding
        sensitive_findings = [f for f in result.findings if f.category == "sensitive_file"]
        assert len(sensitive_findings) > 0
        assert any(".env" in f.title for f in sensitive_findings)

    async def test_detect_private_key(self, file_scanner, temp_workspace):
        """Test detection of private key file"""
        # Create private key file
        (temp_workspace / "id_rsa").write_text("-----BEGIN RSA PRIVATE KEY-----")

        config = {
            "patterns": ["*"],
            "check_sensitive": True
        }

        result = await file_scanner.execute(config, temp_workspace)

        assert result.status == "success"
        sensitive_findings = [f for f in result.findings if f.category == "sensitive_file"]
        assert len(sensitive_findings) > 0

    async def test_no_sensitive_detection_when_disabled(self, file_scanner, temp_workspace):
        """Test that sensitive detection can be disabled"""
        (temp_workspace / ".env").write_text("API_KEY=secret123")

        config = {
            "patterns": ["*"],
            "check_sensitive": False
        }

        result = await file_scanner.execute(config, temp_workspace)

        assert result.status == "success"
        sensitive_findings = [f for f in result.findings if f.category == "sensitive_file"]
        assert len(sensitive_findings) == 0


@pytest.mark.asyncio
class TestFileScannerHashing:
    """Test file hashing functionality"""

    async def test_hash_calculation(self, file_scanner, temp_workspace):
        """Test SHA256 hash calculation"""
        # Create test file
        test_file = temp_workspace / "test.txt"
        test_file.write_text("Hello World")

        config = {
            "patterns": ["*.txt"],
            "calculate_hashes": True
        }

        result = await file_scanner.execute(config, temp_workspace)

        assert result.status == "success"

        # Find the test.txt finding
        txt_findings = [f for f in result.findings if "test.txt" in f.file_path]
        assert len(txt_findings) > 0

        # Check that hash was calculated
        finding = txt_findings[0]
        assert finding.metadata.get("file_hash") is not None
        assert len(finding.metadata["file_hash"]) == 64  # SHA256 hex length

    async def test_no_hash_when_disabled(self, file_scanner, temp_workspace):
        """Test that hashing can be disabled"""
        test_file = temp_workspace / "test.txt"
        test_file.write_text("Hello World")

        config = {
            "patterns": ["*.txt"],
            "calculate_hashes": False
        }

        result = await file_scanner.execute(config, temp_workspace)

        assert result.status == "success"
        txt_findings = [f for f in result.findings if "test.txt" in f.file_path]

        if len(txt_findings) > 0:
            finding = txt_findings[0]
            assert finding.metadata.get("file_hash") is None


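Review note: the hashing test checks for a 64-character value because a SHA-256 digest is 32 bytes, which is 64 hexadecimal characters. A minimal sketch of a chunked file hash follows, assuming the scanner uses `hashlib` in a similar way. The helper name `sha256_of_file` and the chunk size are illustrative and not taken from the module.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "test.txt"
    sample.write_text("Hello World")
    file_hash = sha256_of_file(sample)
```

Reading in chunks keeps memory use flat for large files, which pairs naturally with a `max_file_size` limit.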
@pytest.mark.asyncio
class TestFileScannerFileTypes:
    """Test file type detection"""

    async def test_detect_python_type(self, file_scanner, temp_workspace):
        """Test detection of Python file type"""
        (temp_workspace / "script.py").write_text("print('hello')")

        config = {"patterns": ["*.py"]}
        result = await file_scanner.execute(config, temp_workspace)

        assert result.status == "success"
        py_findings = [f for f in result.findings if "script.py" in f.file_path]
        assert len(py_findings) > 0
        assert "python" in py_findings[0].metadata["file_type"]

    async def test_detect_javascript_type(self, file_scanner, temp_workspace):
        """Test detection of JavaScript file type"""
        (temp_workspace / "app.js").write_text("console.log('hello')")

        config = {"patterns": ["*.js"]}
        result = await file_scanner.execute(config, temp_workspace)

        assert result.status == "success"
        js_findings = [f for f in result.findings if "app.js" in f.file_path]
        assert len(js_findings) > 0
        assert "javascript" in js_findings[0].metadata["file_type"]

    async def test_file_type_summary(self, file_scanner, temp_workspace):
        """Test that file type summary is generated"""
        (temp_workspace / "script.py").write_text("print('hello')")
        (temp_workspace / "app.js").write_text("console.log('hello')")
        (temp_workspace / "readme.txt").write_text("Documentation")

        config = {"patterns": ["*"]}
        result = await file_scanner.execute(config, temp_workspace)

        assert result.status == "success"
        assert "file_types" in result.summary
        assert len(result.summary["file_types"]) > 0


@pytest.mark.asyncio
class TestFileScannerSizeLimits:
    """Test file size handling"""

    async def test_skip_large_files(self, file_scanner, temp_workspace):
        """Test that large files are skipped"""
        # Create a "large" file
        large_file = temp_workspace / "large.txt"
        large_file.write_text("x" * 1000)

        config = {
            "patterns": ["*.txt"],
            "max_file_size": 500  # Set limit smaller than file
        }

        result = await file_scanner.execute(config, temp_workspace)

        # Should succeed but skip the large file
        assert result.status == "success"

        # The file should still be counted but not have a detailed finding
        assert result.summary["total_files"] > 0

    async def test_process_small_files(self, file_scanner, temp_workspace):
        """Test that small files are processed"""
        small_file = temp_workspace / "small.txt"
        small_file.write_text("small content")

        config = {
            "patterns": ["*.txt"],
            "max_file_size": 1048576  # 1MB
        }

        result = await file_scanner.execute(config, temp_workspace)

        assert result.status == "success"
        txt_findings = [f for f in result.findings if "small.txt" in f.file_path]
        assert len(txt_findings) > 0


@pytest.mark.asyncio
class TestFileScannerSummary:
    """Test result summary generation"""

    async def test_summary_structure(self, file_scanner, python_test_workspace):
        """Test that summary has correct structure"""
        config = {"patterns": ["*"]}
        result = await file_scanner.execute(config, python_test_workspace)

        assert result.status == "success"
        assert "total_files" in result.summary
        assert "total_size_bytes" in result.summary
        assert "file_types" in result.summary
        assert "patterns_scanned" in result.summary

        assert isinstance(result.summary["total_files"], int)
        assert isinstance(result.summary["total_size_bytes"], int)
        assert isinstance(result.summary["file_types"], dict)
        assert isinstance(result.summary["patterns_scanned"], list)

    async def test_summary_counts(self, file_scanner, temp_workspace):
        """Test that summary counts are accurate"""
        # Create known files
        (temp_workspace / "file1.py").write_text("content1")
        (temp_workspace / "file2.py").write_text("content2")
        (temp_workspace / "file3.txt").write_text("content3")

        config = {"patterns": ["*"]}
        result = await file_scanner.execute(config, temp_workspace)

        assert result.status == "success"
        assert result.summary["total_files"] == 3
        assert result.summary["total_size_bytes"] > 0
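Review note: the summary tests check four keys (`total_files`, `total_size_bytes`, `file_types`, `patterns_scanned`) and their types. The sketch below shows one way such a summary could be assembled from a workspace directory. The function `summarize_workspace` is hypothetical, and keying `file_types` by file suffix is an assumption; the real module may classify files differently.

```python
import tempfile
from collections import Counter
from pathlib import Path
from typing import Any, Dict, List


def summarize_workspace(workspace: Path, patterns: List[str]) -> Dict[str, Any]:
    """Hypothetical summary builder with the four keys the tests check."""
    matched = set()
    for pattern in patterns:
        # rglob also yields directories, so keep regular files only;
        # the set avoids double-counting files matched by several patterns
        matched.update(p for p in workspace.rglob(pattern) if p.is_file())

    file_types = Counter(p.suffix or "<none>" for p in matched)
    return {
        "total_files": len(matched),
        "total_size_bytes": sum(p.stat().st_size for p in matched),
        "file_types": dict(file_types),
        "patterns_scanned": list(patterns),
    }


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "file1.py").write_text("content1")
    (root / "file2.py").write_text("content2")
    (root / "file3.txt").write_text("content3")
    summary = summarize_workspace(root, ["*"])
```

The three files mirror the fixture data in `test_summary_counts`, so the sketch can be compared against the same expected count.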
@@ -0,0 +1,493 @@
"""
Unit tests for SecurityAnalyzer module
"""

import pytest
import sys
from pathlib import Path

sys.path.insert(0, str(Path(__file__).resolve().parents[3] / "toolbox"))

from modules.analyzer.security_analyzer import SecurityAnalyzer


@pytest.fixture
def security_analyzer():
    """Create SecurityAnalyzer instance"""
    return SecurityAnalyzer()


@pytest.mark.asyncio
class TestSecurityAnalyzerMetadata:
    """Test SecurityAnalyzer metadata"""

    async def test_metadata_structure(self, security_analyzer):
        """Test that metadata has correct structure"""
        metadata = security_analyzer.get_metadata()

        assert metadata.name == "security_analyzer"
        assert metadata.version == "1.0.0"
        assert metadata.category == "analyzer"
        assert "security" in metadata.tags
        assert "vulnerabilities" in metadata.tags
        assert metadata.requires_workspace is True


@pytest.mark.asyncio
class TestSecurityAnalyzerConfigValidation:
    """Test configuration validation"""

    async def test_valid_config(self, security_analyzer):
        """Test that valid config passes validation"""
        config = {
            "file_extensions": [".py", ".js"],
            "check_secrets": True,
            "check_sql": True,
            "check_dangerous_functions": True
        }
        assert security_analyzer.validate_config(config) is True

    async def test_default_config(self, security_analyzer):
        """Test that empty config uses defaults"""
        config = {}
        assert security_analyzer.validate_config(config) is True

    async def test_invalid_extensions_type(self, security_analyzer):
        """Test that non-list extensions raises error"""
        config = {"file_extensions": ".py"}
        with pytest.raises(ValueError, match="file_extensions must be a list"):
            security_analyzer.validate_config(config)


@pytest.mark.asyncio
class TestSecurityAnalyzerSecretDetection:
    """Test hardcoded secret detection"""

    async def test_detect_api_key(self, security_analyzer, temp_workspace):
        """Test detection of hardcoded API key"""
        code_file = temp_workspace / "config.py"
        code_file.write_text("""
# Configuration file
api_key = "apikey_live_abcdefghijklmnopqrstuvwxyzabcdefghijk"
database_url = "postgresql://localhost/db"
""")

        config = {
            "file_extensions": [".py"],
            "check_secrets": True,
            "check_sql": False,
            "check_dangerous_functions": False
        }

        result = await security_analyzer.execute(config, temp_workspace)

        assert result.status == "success"
        secret_findings = [f for f in result.findings if f.category == "hardcoded_secret"]
        assert len(secret_findings) > 0
        assert any("API Key" in f.title for f in secret_findings)

    async def test_detect_password(self, security_analyzer, temp_workspace):
        """Test detection of hardcoded password"""
        code_file = temp_workspace / "auth.py"
        code_file.write_text("""
def connect():
    password = "mySecretP@ssw0rd"
    return connect_db(password)
""")

        config = {
            "file_extensions": [".py"],
            "check_secrets": True,
            "check_sql": False,
            "check_dangerous_functions": False
        }

        result = await security_analyzer.execute(config, temp_workspace)

        assert result.status == "success"
        secret_findings = [f for f in result.findings if f.category == "hardcoded_secret"]
        assert len(secret_findings) > 0

    async def test_detect_aws_credentials(self, security_analyzer, temp_workspace):
        """Test detection of AWS credentials"""
        code_file = temp_workspace / "aws_config.py"
        code_file.write_text("""
aws_access_key = "AKIAIOSFODNN7REALKEY"
aws_secret_key = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYREALKEY"
""")

        config = {
            "file_extensions": [".py"],
            "check_secrets": True
        }

        result = await security_analyzer.execute(config, temp_workspace)

        assert result.status == "success"
        aws_findings = [f for f in result.findings if "AWS" in f.title]
        assert len(aws_findings) >= 2  # Both access key and secret key

    async def test_no_secret_detection_when_disabled(self, security_analyzer, temp_workspace):
        """Test that secret detection can be disabled"""
        code_file = temp_workspace / "config.py"
        code_file.write_text('api_key = "sk_live_1234567890abcdef"')

        config = {
            "file_extensions": [".py"],
            "check_secrets": False
        }

        result = await security_analyzer.execute(config, temp_workspace)

        assert result.status == "success"
        secret_findings = [f for f in result.findings if f.category == "hardcoded_secret"]
        assert len(secret_findings) == 0


@pytest.mark.asyncio
class TestSecurityAnalyzerSQLInjection:
    """Test SQL injection detection"""

    async def test_detect_string_concatenation(self, security_analyzer, temp_workspace):
        """Test detection of SQL string concatenation"""
        code_file = temp_workspace / "db.py"
        code_file.write_text("""
def get_user(user_id):
    query = "SELECT * FROM users WHERE id = " + user_id
    return execute(query)
""")

        config = {
            "file_extensions": [".py"],
            "check_secrets": False,
            "check_sql": True,
            "check_dangerous_functions": False
        }

        result = await security_analyzer.execute(config, temp_workspace)

        assert result.status == "success"
        sql_findings = [f for f in result.findings if f.category == "sql_injection"]
        assert len(sql_findings) > 0

    async def test_detect_f_string_sql(self, security_analyzer, temp_workspace):
        """Test detection of f-string in SQL"""
        code_file = temp_workspace / "db.py"
        code_file.write_text("""
def get_user(name):
    query = f"SELECT * FROM users WHERE name = '{name}'"
    return execute(query)
""")

        config = {
            "file_extensions": [".py"],
            "check_sql": True
        }

        result = await security_analyzer.execute(config, temp_workspace)

        assert result.status == "success"
        sql_findings = [f for f in result.findings if f.category == "sql_injection"]
        assert len(sql_findings) > 0

    async def test_detect_dynamic_query_building(self, security_analyzer, temp_workspace):
        """Test detection of dynamic query building"""
        code_file = temp_workspace / "queries.py"
        code_file.write_text("""
def search(keyword):
    query = "SELECT * FROM products WHERE name LIKE " + keyword
    execute(query + " ORDER BY price")
""")

        config = {
            "file_extensions": [".py"],
            "check_sql": True
        }

        result = await security_analyzer.execute(config, temp_workspace)

        assert result.status == "success"
        sql_findings = [f for f in result.findings if f.category == "sql_injection"]
        assert len(sql_findings) > 0

    async def test_no_sql_detection_when_disabled(self, security_analyzer, temp_workspace):
        """Test that SQL detection can be disabled"""
        code_file = temp_workspace / "db.py"
        code_file.write_text('query = "SELECT * FROM users WHERE id = " + user_id')

        config = {
            "file_extensions": [".py"],
            "check_sql": False
        }

        result = await security_analyzer.execute(config, temp_workspace)

        assert result.status == "success"
        sql_findings = [f for f in result.findings if f.category == "sql_injection"]
        assert len(sql_findings) == 0


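Review note: the SQL injection tests exercise two shapes of risky code, a quoted SQL literal followed by `+`, and an f-string with a `{}` placeholder inside a SQL statement. The following is a rough line-based sketch of how such patterns could be flagged with regular expressions. The regexes and the function `find_sql_injection_lines` are illustrative assumptions and are not the patterns used by SecurityAnalyzer.

```python
import re
from typing import List, Tuple

# A quoted literal containing a SQL keyword, immediately followed by "+"
_CONCAT = re.compile(
    r"""["'][^"'\n]*\b(?:SELECT|INSERT|UPDATE|DELETE)\b[^"'\n]*["']\s*\+""",
    re.IGNORECASE,
)
# An f-string containing a SQL keyword and at least one "{" placeholder
_FSTRING = re.compile(
    r"""\bf["'][^"\n]*\b(?:SELECT|INSERT|UPDATE|DELETE)\b[^"\n]*\{""",
    re.IGNORECASE,
)


def find_sql_injection_lines(source: str) -> List[Tuple[int, str]]:
    """Return (line_number, stripped_line) for lines matching either pattern."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        if _CONCAT.search(line) or _FSTRING.search(line):
            hits.append((number, line.strip()))
    return hits


sample = (
    'query = "SELECT * FROM users WHERE id = " + user_id\n'
    "query = f\"SELECT * FROM users WHERE name = '{name}'\"\n"
    'cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))\n'
)
hits = find_sql_injection_lines(sample)
```

A line-based regex is cheap but misses queries built across several statements; the parameterized call on the third sample line is deliberately left unflagged.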
@pytest.mark.asyncio
class TestSecurityAnalyzerDangerousFunctions:
    """Test dangerous function detection"""

    async def test_detect_eval(self, security_analyzer, temp_workspace):
        """Test detection of eval() usage"""
        code_file = temp_workspace / "dangerous.py"
        code_file.write_text("""
def process_input(user_input):
    result = eval(user_input)
    return result
""")

        config = {
            "file_extensions": [".py"],
            "check_secrets": False,
            "check_sql": False,
            "check_dangerous_functions": True
        }

        result = await security_analyzer.execute(config, temp_workspace)

        assert result.status == "success"
        dangerous_findings = [f for f in result.findings if f.category == "dangerous_function"]
        assert len(dangerous_findings) > 0
        assert any("eval" in f.title.lower() for f in dangerous_findings)

    async def test_detect_exec(self, security_analyzer, temp_workspace):
        """Test detection of exec() usage"""
        code_file = temp_workspace / "runner.py"
        code_file.write_text("""
def run_code(code):
    exec(code)
""")

        config = {
            "file_extensions": [".py"],
            "check_dangerous_functions": True
        }

        result = await security_analyzer.execute(config, temp_workspace)

        assert result.status == "success"
        dangerous_findings = [f for f in result.findings if f.category == "dangerous_function"]
        assert len(dangerous_findings) > 0

    async def test_detect_os_system(self, security_analyzer, temp_workspace):
        """Test detection of os.system() usage"""
        code_file = temp_workspace / "commands.py"
        code_file.write_text("""
import os

def run_command(cmd):
    os.system(cmd)
""")

        config = {
            "file_extensions": [".py"],
            "check_dangerous_functions": True
        }

        result = await security_analyzer.execute(config, temp_workspace)

        assert result.status == "success"
        dangerous_findings = [f for f in result.findings if f.category == "dangerous_function"]
        assert len(dangerous_findings) > 0
        assert any("os.system" in f.title for f in dangerous_findings)

    async def test_detect_pickle_loads(self, security_analyzer, temp_workspace):
        """Test detection of pickle.loads() usage"""
        code_file = temp_workspace / "serializer.py"
        code_file.write_text("""
import pickle

def deserialize(data):
    return pickle.loads(data)
""")

        config = {
            "file_extensions": [".py"],
            "check_dangerous_functions": True
        }

        result = await security_analyzer.execute(config, temp_workspace)

        assert result.status == "success"
        dangerous_findings = [f for f in result.findings if f.category == "dangerous_function"]
        assert len(dangerous_findings) > 0

    async def test_detect_javascript_eval(self, security_analyzer, temp_workspace):
        """Test detection of eval() in JavaScript"""
        code_file = temp_workspace / "app.js"
        code_file.write_text("""
function processInput(userInput) {
    return eval(userInput);
}
""")

        config = {
            "file_extensions": [".js"],
            "check_dangerous_functions": True
        }

        result = await security_analyzer.execute(config, temp_workspace)

        assert result.status == "success"
        dangerous_findings = [f for f in result.findings if f.category == "dangerous_function"]
        assert len(dangerous_findings) > 0

    async def test_detect_innerHTML(self, security_analyzer, temp_workspace):
        """Test detection of innerHTML (XSS risk)"""
        code_file = temp_workspace / "dom.js"
        code_file.write_text("""
function updateContent(html) {
    document.getElementById("content").innerHTML = html;
}
""")

        config = {
            "file_extensions": [".js"],
            "check_dangerous_functions": True
        }

        result = await security_analyzer.execute(config, temp_workspace)

        assert result.status == "success"
        dangerous_findings = [f for f in result.findings if f.category == "dangerous_function"]
        assert len(dangerous_findings) > 0

    async def test_no_dangerous_detection_when_disabled(self, security_analyzer, temp_workspace):
        """Test that dangerous function detection can be disabled"""
        code_file = temp_workspace / "code.py"
        code_file.write_text('result = eval(user_input)')

        config = {
            "file_extensions": [".py"],
            "check_dangerous_functions": False
        }

        result = await security_analyzer.execute(config, temp_workspace)

        assert result.status == "success"
        dangerous_findings = [f for f in result.findings if f.category == "dangerous_function"]
        assert len(dangerous_findings) == 0


@pytest.mark.asyncio
class TestSecurityAnalyzerMultipleIssues:
    """Test detection of multiple issues in same file"""

    async def test_detect_multiple_vulnerabilities(self, security_analyzer, temp_workspace):
        """Test detection of multiple vulnerability types"""
        code_file = temp_workspace / "vulnerable.py"
        code_file.write_text("""
import os

# Hardcoded credentials
api_key = "apikey_live_abcdefghijklmnopqrstuvwxyzabcdef"
password = "MySecureP@ssw0rd"

def process_query(user_input):
    # SQL injection
    query = "SELECT * FROM users WHERE name = " + user_input

    # Dangerous function
    result = eval(user_input)

    # Command injection
    os.system(user_input)

    return result
""")

        config = {
            "file_extensions": [".py"],
            "check_secrets": True,
            "check_sql": True,
            "check_dangerous_functions": True
        }

        result = await security_analyzer.execute(config, temp_workspace)

        assert result.status == "success"

        # Should find multiple types of issues
        secret_findings = [f for f in result.findings if f.category == "hardcoded_secret"]
        sql_findings = [f for f in result.findings if f.category == "sql_injection"]
        dangerous_findings = [f for f in result.findings if f.category == "dangerous_function"]

        assert len(secret_findings) > 0
        assert len(sql_findings) > 0
        assert len(dangerous_findings) > 0


@pytest.mark.asyncio
class TestSecurityAnalyzerSummary:
    """Test result summary generation"""

    async def test_summary_structure(self, security_analyzer, temp_workspace):
        """Test that summary has correct structure"""
        (temp_workspace / "test.py").write_text("print('hello')")

        config = {"file_extensions": [".py"]}
        result = await security_analyzer.execute(config, temp_workspace)

        assert result.status == "success"
        assert "files_analyzed" in result.summary
        assert "total_findings" in result.summary
        assert "extensions_scanned" in result.summary

        assert isinstance(result.summary["files_analyzed"], int)
        assert isinstance(result.summary["total_findings"], int)
        assert isinstance(result.summary["extensions_scanned"], list)

    async def test_empty_workspace(self, security_analyzer, temp_workspace):
        """Test analyzing empty workspace"""
        config = {"file_extensions": [".py"]}
        result = await security_analyzer.execute(config, temp_workspace)

        assert result.status == "partial"  # No files found
        assert result.summary["files_analyzed"] == 0

    async def test_analyze_multiple_file_types(self, security_analyzer, temp_workspace):
        """Test analyzing multiple file types"""
        (temp_workspace / "app.py").write_text("print('hello')")
        (temp_workspace / "script.js").write_text("console.log('hello')")
        (temp_workspace / "index.php").write_text("<?php echo 'hello'; ?>")

        config = {"file_extensions": [".py", ".js", ".php"]}
        result = await security_analyzer.execute(config, temp_workspace)

        assert result.status == "success"
        assert result.summary["files_analyzed"] == 3


@pytest.mark.asyncio
class TestSecurityAnalyzerFalsePositives:
    """Test false positive filtering"""

    async def test_skip_test_secrets(self, security_analyzer, temp_workspace):
        """Test that test/example secrets are filtered"""
        code_file = temp_workspace / "test_config.py"
        code_file.write_text("""
# Test configuration - should be filtered
api_key = "test_key_example"
password = "dummy_password_123"
token = "sample_token_placeholder"
""")

        config = {
            "file_extensions": [".py"],
            "check_secrets": True
        }

        result = await security_analyzer.execute(config, temp_workspace)

        assert result.status == "success"
        # These should be filtered as false positives
        secret_findings = [f for f in result.findings if f.category == "hardcoded_secret"]
        # Should have fewer or no findings due to false positive filtering
        assert len(secret_findings) == 0 or all(
            not any(fp in f.description.lower() for fp in ['test', 'example', 'dummy', 'sample'])
            for f in secret_findings
        )
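Review note: the false-positive test expects values containing markers such as `test`, `example`, `dummy` or `sample` to be dropped from the secret findings. A minimal sketch of such a filter is shown below. The marker list and the helper `looks_like_placeholder` are assumptions for illustration and are not the analyzer's actual filter.

```python
from typing import Iterable

# Assumed marker list, based on the words used in the test above
PLACEHOLDER_MARKERS = ("test", "example", "dummy", "sample", "placeholder")


def looks_like_placeholder(value: str, markers: Iterable[str] = PLACEHOLDER_MARKERS) -> bool:
    """Return True when the candidate secret contains an obvious placeholder word."""
    lowered = value.lower()
    return any(marker in lowered for marker in markers)


candidates = [
    "test_key_example",
    "dummy_password_123",
    "sample_token_placeholder",
    "apikey_live_abcdefghijklmnopqrstuvwxyzabcdefghijk",
]
# Keep only the values that do not look like placeholders
reportable = [value for value in candidates if not looks_like_placeholder(value)]
```

Substring matching is deliberately crude: it would also suppress a real secret that happens to contain a word like "latest", so a production filter would likely need word boundaries or an allow-list.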
@@ -0,0 +1,369 @@
"""
FuzzForge Common Storage Activities

Activities for interacting with MinIO storage:
- get_target_activity: Download target from MinIO to local cache
- cleanup_cache_activity: Remove target from local cache
- upload_results_activity: Upload workflow results to MinIO
"""

import logging
import os
import shutil
from pathlib import Path

import boto3
from botocore.exceptions import ClientError
from temporalio import activity

# Configure logging
logger = logging.getLogger(__name__)

# Initialize S3 client (MinIO)
s3_client = boto3.client(
    's3',
    endpoint_url=os.getenv('S3_ENDPOINT', 'http://minio:9000'),
    aws_access_key_id=os.getenv('S3_ACCESS_KEY', 'fuzzforge'),
    aws_secret_access_key=os.getenv('S3_SECRET_KEY', 'fuzzforge123'),
    region_name=os.getenv('S3_REGION', 'us-east-1'),
    use_ssl=os.getenv('S3_USE_SSL', 'false').lower() == 'true'
)

# Configuration
S3_BUCKET = os.getenv('S3_BUCKET', 'targets')
CACHE_DIR = Path(os.getenv('CACHE_DIR', '/cache'))
CACHE_MAX_SIZE_GB = int(os.getenv('CACHE_MAX_SIZE', '10').rstrip('GB'))


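Review note: `CACHE_MAX_SIZE` is parsed with `rstrip('GB')`, which strips any trailing `G` or `B` characters and then relies on `int()`. That accepts `10` and `10GB` but raises on lowercase input such as `10gb`. A stricter parser could look like the sketch below. The helper `parse_cache_size_gb` and its fall-back-to-default behaviour are assumptions, not part of this change.

```python
import re


def parse_cache_size_gb(raw: str, default: int = 10) -> int:
    """Parse values such as '10', '10G', '10GB' or '10 gb' into whole gigabytes.

    Hypothetical helper: returns `default` when the value cannot be parsed.
    """
    match = re.fullmatch(r"\s*(\d+)\s*(?:GB?)?\s*", raw, flags=re.IGNORECASE)
    if match is None:
        return default
    return int(match.group(1))
```

Whether to fall back to a default or fail fast on a malformed value is a design choice; failing fast surfaces misconfiguration earlier, while a default keeps workers starting.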
@activity.defn(name="get_target")
async def get_target_activity(
    target_id: str,
    run_id: str = None,
    workspace_isolation: str = "isolated"
) -> str:
    """
    Download target from MinIO to local cache.

    Args:
        target_id: UUID of the uploaded target
        run_id: Workflow run ID for isolation (required for isolated and copy-on-write modes)
        workspace_isolation: Isolation mode - "isolated" (default), "shared", or "copy-on-write"

    Returns:
        Local path to the cached target workspace

    Raises:
        FileNotFoundError: If target doesn't exist in MinIO
        ValueError: If workspace_isolation is invalid, or run_id is missing
            for isolated or copy-on-write mode
        Exception: For other download errors
    """
    logger.info(
        f"Activity: get_target (target_id={target_id}, run_id={run_id}, "
        f"isolation={workspace_isolation})"
    )

    # Validate isolation mode
    valid_modes = ["isolated", "shared", "copy-on-write"]
    if workspace_isolation not in valid_modes:
        raise ValueError(
            f"Invalid workspace_isolation mode: {workspace_isolation}. "
            f"Must be one of: {valid_modes}"
        )

    # Require run_id for isolated and copy-on-write modes
    if workspace_isolation in ["isolated", "copy-on-write"] and not run_id:
        raise ValueError(
            f"run_id is required for workspace_isolation='{workspace_isolation}'"
        )

# Define cache paths based on isolation mode
|
||||
if workspace_isolation == "isolated":
|
||||
# Each run gets its own isolated workspace
|
||||
cache_path = CACHE_DIR / target_id / run_id
|
||||
cached_file = cache_path / "target"
|
||||
elif workspace_isolation == "shared":
|
||||
# All runs share the same workspace (legacy behavior)
|
||||
cache_path = CACHE_DIR / target_id
|
||||
cached_file = cache_path / "target"
|
||||
else: # copy-on-write
|
||||
# Shared download, run-specific copy
|
||||
shared_cache_path = CACHE_DIR / target_id / "shared"
|
||||
cache_path = CACHE_DIR / target_id / run_id
|
||||
cached_file = shared_cache_path / "target"
|
||||
|
||||
# Handle copy-on-write mode
|
||||
if workspace_isolation == "copy-on-write":
|
||||
# Check if shared cache exists
|
||||
if cached_file.exists():
|
||||
logger.info(f"Copy-on-write: Shared cache HIT for {target_id}")
|
||||
|
||||
# Copy shared workspace to run-specific path
|
||||
shared_workspace = shared_cache_path / "workspace"
|
||||
run_workspace = cache_path / "workspace"
|
||||
|
||||
if shared_workspace.exists():
|
||||
logger.info(f"Copying workspace to isolated run path: {run_workspace}")
|
||||
cache_path.mkdir(parents=True, exist_ok=True)
|
||||
shutil.copytree(shared_workspace, run_workspace)
|
||||
return str(run_workspace)
|
||||
else:
|
||||
# Shared file exists but not extracted (non-tarball)
|
||||
run_file = cache_path / "target"
|
||||
cache_path.mkdir(parents=True, exist_ok=True)
|
||||
shutil.copy2(cached_file, run_file)
|
||||
return str(run_file)
|
||||
# If shared cache doesn't exist, fall through to download
|
||||
|
||||
# Check if target is already cached (isolated or shared mode)
|
||||
elif cached_file.exists():
|
||||
# Update access time for LRU
|
||||
cached_file.touch()
|
||||
logger.info(f"Cache HIT: {target_id} (mode: {workspace_isolation})")
|
||||
|
||||
# Check if workspace directory exists (extracted tarball)
|
||||
workspace_dir = cache_path / "workspace"
|
||||
if workspace_dir.exists() and workspace_dir.is_dir():
|
||||
logger.info(f"Returning cached workspace: {workspace_dir}")
|
||||
return str(workspace_dir)
|
||||
else:
|
||||
# Return cached file (not a tarball)
|
||||
return str(cached_file)
|
||||
|
||||
# Cache miss - download from MinIO
|
||||
logger.info(
|
||||
f"Cache MISS: {target_id} (mode: {workspace_isolation}), "
|
||||
f"downloading from MinIO..."
|
||||
)
|
||||
|
||||
try:
|
||||
# Create cache directory
        cache_path.mkdir(parents=True, exist_ok=True)
        # In copy-on-write mode the download goes to the shared directory,
        # which is not cache_path, so make sure it exists as well
        cached_file.parent.mkdir(parents=True, exist_ok=True)

        # Download from S3/MinIO
        s3_key = f'{target_id}/target'
        logger.info(f"Downloading s3://{S3_BUCKET}/{s3_key} -> {cached_file}")

        s3_client.download_file(
            Bucket=S3_BUCKET,
            Key=s3_key,
            Filename=str(cached_file)
        )

        # Verify file was downloaded
        if not cached_file.exists():
            raise FileNotFoundError(f"Downloaded file not found: {cached_file}")

        file_size = cached_file.stat().st_size
        logger.info(
            f"✓ Downloaded target {target_id} "
            f"({file_size / 1024 / 1024:.2f} MB)"
        )

        # Extract tarball if it's an archive
        import tarfile
        # Extract next to the downloaded file; in copy-on-write mode this is the
        # shared directory, so the run-specific copy below has a distinct source
        workspace_dir = cached_file.parent / "workspace"

        if tarfile.is_tarfile(str(cached_file)):
            logger.info(f"Extracting tarball to {workspace_dir}...")
            workspace_dir.mkdir(parents=True, exist_ok=True)

            with tarfile.open(str(cached_file), 'r:*') as tar:
                tar.extractall(path=workspace_dir)

            logger.info(f"✓ Extracted tarball to {workspace_dir}")

            # For copy-on-write mode, copy to run-specific path
            if workspace_isolation == "copy-on-write":
                run_cache_path = CACHE_DIR / target_id / run_id
                run_workspace = run_cache_path / "workspace"
                logger.info(f"Copy-on-write: Copying to {run_workspace}")
                run_cache_path.mkdir(parents=True, exist_ok=True)
                shutil.copytree(workspace_dir, run_workspace)
                return str(run_workspace)

            return str(workspace_dir)
        else:
            # Not a tarball
            if workspace_isolation == "copy-on-write":
                # Copy file to run-specific path
                run_cache_path = CACHE_DIR / target_id / run_id
                run_file = run_cache_path / "target"
                logger.info(f"Copy-on-write: Copying file to {run_file}")
                run_cache_path.mkdir(parents=True, exist_ok=True)
                shutil.copy2(cached_file, run_file)
                return str(run_file)

            return str(cached_file)

    except ClientError as e:
        error_code = e.response['Error']['Code']
        if error_code == '404' or error_code == 'NoSuchKey':
            logger.error(f"Target not found in MinIO: {target_id}")
            raise FileNotFoundError(f"Target {target_id} not found in storage")
        else:
            logger.error(f"S3/MinIO error downloading target: {e}", exc_info=True)
            raise

    except Exception as e:
        logger.error(f"Failed to download target {target_id}: {e}", exc_info=True)
        # Cleanup partial download
        if cache_path.exists():
            shutil.rmtree(cache_path, ignore_errors=True)
        raise


@activity.defn(name="cleanup_cache")
async def cleanup_cache_activity(
    target_path: str,
    workspace_isolation: str = "isolated"
) -> None:
    """
    Remove target from local cache after workflow completes.

    Args:
        target_path: Path to the cached target workspace (from get_target_activity)
        workspace_isolation: Isolation mode used - determines cleanup scope

    Notes:
        - "isolated" mode: Removes the entire run-specific directory
        - "copy-on-write" mode: Removes run-specific directory, keeps shared cache
        - "shared" mode: Does NOT remove cache (shared across runs)
    """
    logger.info(
        f"Activity: cleanup_cache (path={target_path}, "
        f"isolation={workspace_isolation})"
    )

    try:
        target = Path(target_path)

        # For shared mode, don't clean up (cache is shared across runs)
        if workspace_isolation == "shared":
            logger.info(
                f"Skipping cleanup for shared workspace (mode={workspace_isolation})"
            )
            return

        # For isolated and copy-on-write modes, clean up run-specific directory
        # Navigate up to the run-specific directory: /cache/{target_id}/{run_id}/
        # The path is either .../workspace or a cached file; in both cases
        # the run directory is one level up
        run_dir = target.parent

        # Validate it's in cache and looks like a run-specific path
        if run_dir.exists() and run_dir.is_relative_to(CACHE_DIR):
            # Check if parent is target_id directory (validate structure)
            target_id_dir = run_dir.parent
            if target_id_dir.is_relative_to(CACHE_DIR):
                shutil.rmtree(run_dir)
                logger.info(
                    f"✓ Cleaned up run-specific directory: {run_dir} "
                    f"(mode={workspace_isolation})"
                )
            else:
                logger.warning(
                    f"Unexpected cache structure, skipping cleanup: {run_dir}"
                )
        else:
            logger.warning(
                f"Cache path not in CACHE_DIR or doesn't exist: {run_dir}"
            )

    except Exception as e:
        # Don't fail workflow if cleanup fails
        logger.error(
            f"Failed to cleanup cache {target_path}: {e}",
            exc_info=True
        )


@activity.defn(name="upload_results")
async def upload_results_activity(
    workflow_id: str,
    results: dict,
    results_format: str = "json"
) -> str:
    """
    Upload workflow results to MinIO.

    Args:
        workflow_id: Workflow execution ID
        results: Results dictionary to upload
        results_format: Format for results (json, sarif, etc.)

    Returns:
        S3 URL to the uploaded results
    """
    logger.info(
        f"Activity: upload_results "
        f"(workflow_id={workflow_id}, format={results_format})"
    )

    try:
        import json

        # Prepare results content
        if results_format == "json":
            content = json.dumps(results, indent=2).encode('utf-8')
            content_type = 'application/json'
            file_ext = 'json'
        elif results_format == "sarif":
            content = json.dumps(results, indent=2).encode('utf-8')
            content_type = 'application/sarif+json'
            file_ext = 'sarif'
        else:
            # Default to JSON
            content = json.dumps(results, indent=2).encode('utf-8')
            content_type = 'application/json'
            file_ext = 'json'

        # Upload to MinIO
        s3_key = f'{workflow_id}/results.{file_ext}'
        logger.info(f"Uploading results to s3://results/{s3_key}")

        s3_client.put_object(
            Bucket='results',
            Key=s3_key,
            Body=content,
            ContentType=content_type,
            Metadata={
                'workflow_id': workflow_id,
                'format': results_format
            }
        )

        # Construct S3 URL
        s3_endpoint = os.getenv('S3_ENDPOINT', 'http://minio:9000')
        s3_url = f"{s3_endpoint}/results/{s3_key}"

        logger.info(f"✓ Uploaded results: {s3_url}")
        return s3_url

    except Exception as e:
        logger.error(
            f"Failed to upload results for workflow {workflow_id}: {e}",
            exc_info=True
        )
        raise


def _check_cache_size():
    """Check total cache size and log warning if exceeding limit"""
    try:
        total_size = 0
        for item in CACHE_DIR.rglob('*'):
            if item.is_file():
                total_size += item.stat().st_size

        total_size_gb = total_size / (1024 ** 3)
        if total_size_gb > CACHE_MAX_SIZE_GB:
            logger.warning(
                f"Cache size ({total_size_gb:.2f} GB) exceeds "
                f"limit ({CACHE_MAX_SIZE_GB} GB). Consider cleanup."
            )

    except Exception as e:
        logger.error(f"Failed to check cache size: {e}")
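The three workspace isolation modes used by `get_target_activity` differ mainly in where the download and the run workspace live under the cache directory. The following is a minimal sketch of that layout only, not code from this commit: the helper name `resolve_cache_paths` and the use of `PurePosixPath` (to keep the example free of filesystem access) are assumptions made for illustration.

```python
from pathlib import PurePosixPath
from typing import Optional, Tuple

VALID_MODES = ("isolated", "shared", "copy-on-write")


def resolve_cache_paths(
    cache_dir: PurePosixPath,
    target_id: str,
    run_id: Optional[str] = None,
    mode: str = "isolated",
) -> Tuple[PurePosixPath, PurePosixPath]:
    """Return (download_file, run_dir) for one workspace isolation mode.

    Hypothetical helper: it only mirrors the directory layout, no I/O.
    """
    if mode not in VALID_MODES:
        raise ValueError(f"Invalid workspace_isolation mode: {mode}")

    if mode == "shared":
        # One directory per target, reused by every run
        run_dir = cache_dir / target_id
        return run_dir / "target", run_dir

    if not run_id:
        raise ValueError(f"run_id is required for workspace_isolation='{mode}'")

    run_dir = cache_dir / target_id / run_id
    if mode == "isolated":
        # Download and workspace both live in the run directory
        return run_dir / "target", run_dir

    # copy-on-write: download once into "shared", then copy into the run directory
    return cache_dir / target_id / "shared" / "target", run_dir


if __name__ == "__main__":
    for mode in VALID_MODES:
        download_file, run_dir = resolve_cache_paths(
            PurePosixPath("/cache"), "target-1", "run-1", mode
        )
        print(f"{mode}: download={download_file} run_dir={run_dir}")
```

Keeping the path arithmetic in one pure function like this would make the layout easy to unit-test separately from the S3 download and the copy steps.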
@@ -16,7 +16,7 @@ Security Analyzer Module - Analyzes code for security vulnerabilities
import logging
import re
from pathlib import Path
-from typing import Dict, Any, List, Optional
+from typing import Dict, Any, List

try:
    from toolbox.modules.base import BaseModule, ModuleMetadata, ModuleResult, ModuleFinding

@@ -17,7 +17,6 @@ from abc import ABC, abstractmethod
from pathlib import Path
from typing import Dict, Any, List, Optional
from pydantic import BaseModel, Field
from datetime import datetime
import logging

logger = logging.getLogger(__name__)

@@ -0,0 +1,10 @@
"""
Fuzzing modules for FuzzForge

This package contains fuzzing modules for different fuzzing engines.
"""

from .atheris_fuzzer import AtherisFuzzer
from .cargo_fuzzer import CargoFuzzer

__all__ = ["AtherisFuzzer", "CargoFuzzer"]
@@ -0,0 +1,608 @@
"""
Atheris Fuzzer Module

Reusable module for fuzzing Python code using Atheris.
Discovers and fuzzes user-provided Python targets with TestOneInput() function.
"""

import asyncio
import base64
import importlib.util
import logging
import multiprocessing
import os
import sys
import time
from datetime import datetime
from pathlib import Path
from typing import Dict, Any, List, Optional, Callable
import uuid

import httpx
from modules.base import BaseModule, ModuleMetadata, ModuleResult, ModuleFinding

logger = logging.getLogger(__name__)


def _run_atheris_in_subprocess(
    target_path_str: str,
    corpus_dir_str: str,
    max_iterations: int,
    timeout_seconds: int,
    shared_crashes: Any,
    exec_counter: multiprocessing.Value,
    crash_counter: multiprocessing.Value,
    coverage_counter: multiprocessing.Value
):
    """
    Run atheris.Fuzz() in a separate process to isolate os._exit() calls.

    This function runs in a subprocess and loads the target module,
    sets up atheris, and runs fuzzing. Stats are communicated via shared memory.

    Args:
        target_path_str: String path to target file
        corpus_dir_str: String path to corpus directory
        max_iterations: Maximum fuzzing iterations
        timeout_seconds: Timeout in seconds
        shared_crashes: Manager().list() for storing crash details
        exec_counter: Shared counter for executions
        crash_counter: Shared counter for crashes
        coverage_counter: Shared counter for coverage edges
    """
    import atheris
    import importlib.util
    import traceback
    from pathlib import Path

    target_path = Path(target_path_str)
    total_executions = 0

    # NOTE: Crash details are written directly to shared_crashes (Manager().list())
    # so they can be accessed by parent process after subprocess exits.
    # We don't use a local crashes list because os._exit() prevents cleanup code.

    try:
        # Load target module in subprocess
        module_name = f"fuzz_target_{uuid.uuid4().hex[:8]}"
        spec = importlib.util.spec_from_file_location(module_name, target_path)
        if spec is None or spec.loader is None:
            raise ImportError(f"Could not load module from {target_path}")

        module = importlib.util.module_from_spec(spec)
        sys.modules[module_name] = module
        spec.loader.exec_module(module)

        if not hasattr(module, "TestOneInput"):
            raise AttributeError("Module does not have TestOneInput() function")

        test_one_input = module.TestOneInput

        # Wrapper to track executions and crashes
        def fuzz_wrapper(data):
            nonlocal total_executions
            total_executions += 1

            # Update shared counter for live stats
            with exec_counter.get_lock():
                exec_counter.value += 1

            try:
                test_one_input(data)
            except Exception as e:
                # Capture crash details to shared memory
                crash_info = {
                    "input": bytes(data),  # Convert to bytes for serialization
                    "exception_type": type(e).__name__,
                    "exception_message": str(e),
                    "stack_trace": traceback.format_exc(),
                    "execution": total_executions
                }
                # Write to shared memory so parent process can access crash details
                shared_crashes.append(crash_info)

                # Update shared crash counter
                with crash_counter.get_lock():
                    crash_counter.value += 1

                # Re-raise so Atheris detects it
                raise

        # Check for dictionary file in target directory
        dict_args = []
        target_dir = target_path.parent
        for dict_name in ["fuzz.dict", "fuzzing.dict", "dict.txt"]:
            dict_path = target_dir / dict_name
            if dict_path.exists():
                dict_args.append(f"-dict={dict_path}")
                break

        # Configure Atheris
        atheris_args = [
            "atheris_fuzzer",
            f"-runs={max_iterations}",
            f"-max_total_time={timeout_seconds}",
            "-print_final_stats=1"
        ] + dict_args + [corpus_dir_str]  # Corpus directory as positional arg

        atheris.Setup(atheris_args, fuzz_wrapper)

        # Run fuzzing (this will call os._exit() when done)
        atheris.Fuzz()

    except SystemExit:
        # Atheris exits when done - this is normal
        # Crash details already written to shared_crashes
        pass
    except Exception:
        # Fatal error outside the fuzz loop (e.g. the target could not be
        # imported or has no TestOneInput). It is swallowed here; crashes
        # raised by the target itself were already recorded by fuzz_wrapper
        pass


class AtherisFuzzer(BaseModule):
    """
    Atheris fuzzing module - discovers and fuzzes Python code.

    This module can be used by any workflow to fuzz Python targets.
    """

    def __init__(self):
        super().__init__()
        self.crashes = []
        self.total_executions = 0
        self.start_time = None
        self.last_stats_time = 0
        self.run_id = None

    def get_metadata(self) -> ModuleMetadata:
        """Return module metadata"""
        return ModuleMetadata(
            name="atheris_fuzzer",
            version="1.0.0",
            description="Python fuzzing using Atheris - discovers and fuzzes TestOneInput() functions",
            author="FuzzForge Team",
            category="fuzzer",
            tags=["fuzzing", "atheris", "python", "coverage"],
            input_schema={
                "type": "object",
                "properties": {
                    "target_file": {
                        "type": "string",
                        "description": "Python file with TestOneInput() function (auto-discovered if not specified)"
                    },
                    "max_iterations": {
                        "type": "integer",
                        "description": "Maximum fuzzing iterations",
                        "default": 100000
                    },
                    "timeout_seconds": {
                        "type": "integer",
                        "description": "Fuzzing timeout in seconds",
                        "default": 300
                    },
                    "stats_callback": {
                        "description": "Optional callback for real-time statistics"
                    }
                }
            },
            requires_workspace=True
        )

    def validate_config(self, config: Dict[str, Any]) -> bool:
        """Validate fuzzing configuration"""
        max_iterations = config.get("max_iterations", 100000)
        if not isinstance(max_iterations, int) or max_iterations <= 0:
            raise ValueError(f"max_iterations must be positive integer, got: {max_iterations}")

        timeout = config.get("timeout_seconds", 300)
        if not isinstance(timeout, int) or timeout <= 0:
            raise ValueError(f"timeout_seconds must be positive integer, got: {timeout}")

        return True

    async def execute(self, config: Dict[str, Any], workspace: Path) -> ModuleResult:
        """
        Execute Atheris fuzzing on user code.

        Args:
            config: Fuzzing configuration
            workspace: Path to user's uploaded code

        Returns:
            ModuleResult with crash findings
        """
        self.start_timer()
        self.start_time = time.time()

        # Validate configuration
        self.validate_config(config)
        self.validate_workspace(workspace)

        # Extract config
        target_file = config.get("target_file")
        max_iterations = config.get("max_iterations", 100000)
        timeout_seconds = config.get("timeout_seconds", 300)
        stats_callback = config.get("stats_callback")
        self.run_id = config.get("run_id")

        logger.info(
            f"Starting Atheris fuzzing (max_iterations={max_iterations}, "
            f"timeout={timeout_seconds}s, target={target_file or 'auto-discover'})"
        )

        try:
            # Step 1: Discover or load target
            target_path = self._discover_target(workspace, target_file)
            logger.info(f"Using fuzz target: {target_path}")

            # Step 2: Load target module
            test_one_input = self._load_target_module(target_path)
            logger.info(f"Loaded TestOneInput function from {target_path}")

            # Step 3: Run fuzzing
            await self._run_fuzzing(
                test_one_input=test_one_input,
                target_path=target_path,
                workspace=workspace,
                max_iterations=max_iterations,
                timeout_seconds=timeout_seconds,
                stats_callback=stats_callback
            )

            # Step 4: Generate findings from crashes
            findings = await self._generate_findings(target_path)

            logger.info(
                f"Fuzzing completed: {self.total_executions} executions, "
                f"{len(self.crashes)} crashes found"
            )

            # Generate SARIF report (always, even with no findings)
            from modules.reporter import SARIFReporter
            reporter = SARIFReporter()
            reporter_config = {
                "findings": findings,
                "tool_name": "Atheris Fuzzer",
                "tool_version": self._metadata.version
            }
            reporter_result = await reporter.execute(reporter_config, workspace)
            sarif_report = reporter_result.sarif

            return ModuleResult(
                module=self._metadata.name,
                version=self._metadata.version,
                status="success",
                execution_time=self.get_execution_time(),
                findings=findings,
                summary={
                    "total_executions": self.total_executions,
                    "crashes_found": len(self.crashes),
                    "execution_time": self.get_execution_time(),
                    "target_file": str(target_path.relative_to(workspace))
                },
                metadata={
                    "max_iterations": max_iterations,
                    "timeout_seconds": timeout_seconds
                },
                sarif=sarif_report
            )

        except Exception as e:
            logger.error(f"Fuzzing failed: {e}", exc_info=True)
            return self.create_result(
                findings=[],
                status="failed",
                error=str(e)
            )

    def _discover_target(self, workspace: Path, target_file: Optional[str]) -> Path:
        """
        Discover fuzz target in workspace.

        Args:
            workspace: Path to workspace
            target_file: Explicit target file or None for auto-discovery

        Returns:
            Path to target file
        """
        if target_file:
            # Use specified target
            target_path = workspace / target_file
            if not target_path.exists():
                raise FileNotFoundError(f"Target file not found: {target_file}")
            return target_path

        # Auto-discover: look for fuzz_*.py or *_fuzz.py
        logger.info("Auto-discovering fuzz targets...")

        candidates = []
        # Use rglob for recursive search (searches all subdirectories)
        for pattern in ["fuzz_*.py", "*_fuzz.py", "fuzz_target.py"]:
            for match in workspace.rglob(pattern):
                # fuzz_target.py also matches fuzz_*.py, so skip duplicates
                if match not in candidates:
                    candidates.append(match)

        if not candidates:
            raise FileNotFoundError(
                "No fuzz targets found. Expected files matching: fuzz_*.py, *_fuzz.py, or fuzz_target.py"
            )

        # Use first candidate
        target = candidates[0]
        if len(candidates) > 1:
            logger.warning(
                f"Multiple fuzz targets found: {[str(c) for c in candidates]}. "
                f"Using: {target.name}"
            )

        return target

    def _load_target_module(self, target_path: Path) -> Callable:
        """
        Load target module and get TestOneInput function.

        Args:
            target_path: Path to Python file with TestOneInput

        Returns:
            TestOneInput function
        """
        # Add target directory to sys.path
        target_dir = target_path.parent
        if str(target_dir) not in sys.path:
            sys.path.insert(0, str(target_dir))

        # Load module dynamically
        module_name = target_path.stem
        spec = importlib.util.spec_from_file_location(module_name, target_path)
        if spec is None or spec.loader is None:
            raise ImportError(f"Cannot load module from {target_path}")

        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)

        # Get TestOneInput function
        if not hasattr(module, "TestOneInput"):
            raise AttributeError(
                f"Module {module_name} does not have TestOneInput() function. "
                "Atheris requires a TestOneInput(data: bytes) function."
            )

        return module.TestOneInput

    async def _run_fuzzing(
        self,
        test_one_input: Callable,
        target_path: Path,
        workspace: Path,
        max_iterations: int,
        timeout_seconds: int,
        stats_callback: Optional[Callable] = None
    ):
        """
        Run Atheris fuzzing with real-time monitoring.

        Args:
            test_one_input: TestOneInput function to fuzz (not used, loaded in subprocess)
            target_path: Path to target file
            workspace: Path to workspace directory
            max_iterations: Max iterations
            timeout_seconds: Timeout in seconds
            stats_callback: Optional callback for stats
        """
        self.crashes = []
        self.total_executions = 0

        # Create corpus directory in workspace
        corpus_dir = workspace / ".fuzzforge_corpus"
        corpus_dir.mkdir(exist_ok=True)
        logger.info(f"Using corpus directory: {corpus_dir}")

        logger.info(f"Starting Atheris fuzzer in subprocess (max_runs={max_iterations}, timeout={timeout_seconds}s)...")

        # Create shared memory for subprocess communication
        ctx = multiprocessing.get_context('spawn')
        manager = ctx.Manager()
        shared_crashes = manager.list()  # Shared list for crash details
        exec_counter = ctx.Value('i', 0)  # Shared execution counter
        crash_counter = ctx.Value('i', 0)  # Shared crash counter
        coverage_counter = ctx.Value('i', 0)  # Shared coverage counter

        # Start fuzzing in subprocess
        process = ctx.Process(
            target=_run_atheris_in_subprocess,
            args=(str(target_path), str(corpus_dir), max_iterations, timeout_seconds, shared_crashes, exec_counter, crash_counter, coverage_counter)
        )

        # Run fuzzing in a separate task with monitoring
        async def monitor_stats():
            """Monitor and report stats every 0.5 seconds"""
            while True:
                await asyncio.sleep(0.5)

                if stats_callback:
                    elapsed = time.time() - self.start_time
                    # Read from shared counters
                    current_execs = exec_counter.value
                    current_crashes = crash_counter.value
                    current_coverage = coverage_counter.value
                    execs_per_sec = current_execs / elapsed if elapsed > 0 else 0

                    # Count corpus files
                    try:
                        corpus_size = len(list(corpus_dir.iterdir())) if corpus_dir.exists() else 0
                    except Exception:
                        corpus_size = 0

                    # TODO: Get real coverage from Atheris
                    # For now use corpus_size as proxy
                    coverage_value = current_coverage if current_coverage > 0 else corpus_size

                    await stats_callback({
                        "total_execs": current_execs,
                        "execs_per_sec": execs_per_sec,
                        "crashes": current_crashes,
                        "corpus_size": corpus_size,
                        "coverage": coverage_value,  # Using corpus as coverage proxy
                        "elapsed_time": int(elapsed)
                    })

        # Start monitoring task
        monitor_task = None
        if stats_callback:
            monitor_task = asyncio.create_task(monitor_stats())

        try:
            # Start subprocess
            process.start()
            logger.info(f"Fuzzing subprocess started (PID: {process.pid})")

            # Wait for subprocess to complete
            while process.is_alive():
                await asyncio.sleep(0.1)

            # NOTE: We cannot use result_queue because Atheris calls os._exit()
            # which terminates immediately without putting results in the queue.
            # Instead, we rely on shared memory (Manager().list() and Value counters).

            # Read final values from shared memory
            self.total_executions = exec_counter.value
            total_crashes = crash_counter.value

            # Read crash details from shared memory and convert to our format
            self.crashes = []
            for crash_data in shared_crashes:
                # Reconstruct crash info with exception object
                crash_info = {
                    "input": crash_data["input"],
                    "exception": Exception(crash_data["exception_message"]),
                    "exception_type": crash_data["exception_type"],
                    "stack_trace": crash_data["stack_trace"],
                    "execution": crash_data["execution"]
                }
                self.crashes.append(crash_info)

                logger.warning(
                    f"Crash found (execution {crash_data['execution']}): "
                    f"{crash_data['exception_type']}: {crash_data['exception_message']}"
                )

            logger.info(f"Fuzzing completed: {self.total_executions} executions, {total_crashes} crashes found")

            # Send final stats update
            if stats_callback:
                elapsed = time.time() - self.start_time
                execs_per_sec = self.total_executions / elapsed if elapsed > 0 else 0

                # Count final corpus size
                try:
                    final_corpus_size = len(list(corpus_dir.iterdir())) if corpus_dir.exists() else 0
                except Exception:
                    final_corpus_size = 0

                # TODO: Parse coverage from Atheris output
                # For now, use corpus size as proxy (corpus grows with coverage)
                # libFuzzer writes coverage to stdout but sys.stdout redirection
                # doesn't work because it writes to FD 1 directly from C++
                final_coverage = coverage_counter.value if coverage_counter.value > 0 else final_corpus_size

                await stats_callback({
                    "total_execs": self.total_executions,
                    "execs_per_sec": execs_per_sec,
                    "crashes": total_crashes,
                    "corpus_size": final_corpus_size,
                    "coverage": final_coverage,
                    "elapsed_time": int(elapsed)
                })

            # Wait for process to fully terminate
            process.join(timeout=5)

            if process.exitcode is not None and process.exitcode != 0:
                logger.warning(f"Subprocess exited with code: {process.exitcode}")

        except Exception as e:
            logger.error(f"Fuzzing execution error: {e}")
            if process.is_alive():
                logger.warning("Terminating fuzzing subprocess...")
                process.terminate()
                process.join(timeout=5)
                if process.is_alive():
                    process.kill()
            raise
        finally:
            # Stop monitoring
            if monitor_task:
                monitor_task.cancel()
                try:
                    await monitor_task
                except asyncio.CancelledError:
                    pass

    async def _generate_findings(self, target_path: Path) -> List[ModuleFinding]:
        """
        Generate ModuleFinding objects from crashes.

        Args:
            target_path: Path to target file

        Returns:
            List of findings
        """
        findings = []

        for idx, crash in enumerate(self.crashes):
            # Encode crash input for storage
            crash_input_b64 = base64.b64encode(crash["input"]).decode()

            finding = self.create_finding(
                title=f"Crash: {crash['exception_type']}",
                description=(
                    f"Atheris found crash during fuzzing:\n"
                    f"Exception: {crash['exception_type']}\n"
                    f"Message: {str(crash['exception'])}\n"
                    f"Execution: {crash['execution']}"
                ),
                severity="critical",
                category="crash",
                file_path=str(target_path),
                metadata={
                    "crash_input_base64": crash_input_b64,
                    "crash_input_hex": crash["input"].hex(),
                    "exception_type": crash["exception_type"],
                    "stack_trace": crash["stack_trace"],
                    "execution_number": crash["execution"]
                },
                recommendation=(
                    "Review the crash stack trace and input to identify the vulnerability. "
                    "The crash input is provided in base64 and hex formats for reproduction."
                )
            )
            findings.append(finding)

            # Report crash to backend for real-time monitoring
            if self.run_id:
                try:
                    crash_report = {
                        "run_id": self.run_id,
                        "crash_id": f"crash_{idx + 1}",
                        "timestamp": datetime.utcnow().isoformat(),
                        "crash_type": crash["exception_type"],
                        "stack_trace": crash["stack_trace"],
                        "input_file": crash_input_b64,
                        "severity": "critical",
                        "exploitability": "unknown"
                    }

                    backend_url = os.getenv("BACKEND_URL", "http://backend:8000")
                    async with httpx.AsyncClient(timeout=5.0) as client:
                        await client.post(
                            f"{backend_url}/fuzzing/{self.run_id}/crash",
                            json=crash_report
                        )
                    logger.debug(f"Crash report sent to backend: {crash_report['crash_id']}")
                except Exception as e:
                    logger.debug(f"Failed to post crash report to backend: {e}")

        return findings
|
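The finding metadata above stores each crash input twice, as base64 and as hex, so that it can be replayed later. A minimal sketch of that round trip follows; the helper names `encode_crash_input` and `decode_crash_input` are hypothetical and are not part of the module.

```python
import base64


def encode_crash_input(raw: bytes) -> dict:
    """Encode a crash input into the two metadata fields used by the findings."""
    return {
        "crash_input_base64": base64.b64encode(raw).decode(),
        "crash_input_hex": raw.hex(),
    }


def decode_crash_input(metadata: dict) -> bytes:
    """Recover the raw bytes, preferring base64 and falling back to hex."""
    if "crash_input_base64" in metadata:
        return base64.b64decode(metadata["crash_input_base64"])
    return bytes.fromhex(metadata["crash_input_hex"])
```

The decoded bytes can be written to a file and passed back to the fuzz target to reproduce the crash.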
||||
@@ -0,0 +1,455 @@
|
||||
"""
|
||||
Cargo Fuzzer Module
|
||||
|
||||
Reusable module for fuzzing Rust code using cargo-fuzz (libFuzzer).
|
||||
Discovers and fuzzes user-provided Rust targets with fuzz_target!() macros.
|
||||
"""
|
||||
|
||||
import asyncio
|
||||
import logging
|
||||
import os
|
||||
import re
|
||||
import time
|
||||
from pathlib import Path
|
||||
from typing import Dict, Any, List, Optional, Callable
|
||||
|
||||
from modules.base import BaseModule, ModuleMetadata, ModuleResult, ModuleFinding
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class CargoFuzzer(BaseModule):
|
||||
"""
|
||||
Cargo-fuzz (libFuzzer) fuzzer module for Rust code.
|
||||
|
||||
Discovers fuzz targets in user's Rust project and runs cargo-fuzz
|
||||
to find crashes, undefined behavior, and memory safety issues.
|
||||
"""
|
||||
|
||||
def get_metadata(self) -> ModuleMetadata:
|
||||
"""Get module metadata"""
|
||||
return ModuleMetadata(
|
||||
name="cargo_fuzz",
|
||||
version="0.11.2",
|
||||
description="Fuzz Rust code using cargo-fuzz with libFuzzer backend",
|
||||
author="FuzzForge Team",
|
||||
category="fuzzer",
|
||||
tags=["fuzzing", "rust", "cargo-fuzz", "libfuzzer", "memory-safety"],
|
||||
input_schema={
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"target_name": {
|
||||
"type": "string",
|
||||
"description": "Fuzz target name (auto-discovered if not specified)"
|
||||
},
|
||||
"max_iterations": {
|
||||
"type": "integer",
|
||||
"default": 1000000,
|
||||
"description": "Maximum fuzzing iterations"
|
||||
},
|
||||
"timeout_seconds": {
|
||||
"type": "integer",
|
||||
"default": 1800,
|
||||
"description": "Fuzzing timeout in seconds"
|
||||
},
|
||||
"sanitizer": {
|
||||
"type": "string",
|
||||
"enum": ["address", "memory", "undefined"],
|
||||
"default": "address",
|
||||
"description": "Sanitizer to use (address, memory, undefined)"
|
||||
}
|
||||
}
|
||||
},
|
||||
output_schema={
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"findings": {
|
||||
"type": "array",
|
||||
"description": "Crashes and memory safety issues found"
|
||||
},
|
||||
"summary": {
|
||||
"type": "object",
|
||||
"description": "Fuzzing execution summary"
|
||||
}
|
||||
}
|
||||
}
|
||||
)
|
||||
|
    def validate_config(self, config: Dict[str, Any]) -> bool:
        """Validate configuration"""
        max_iterations = config.get("max_iterations", 1000000)
        if not isinstance(max_iterations, int) or max_iterations < 1:
            raise ValueError("max_iterations must be a positive integer")

        timeout = config.get("timeout_seconds", 1800)
        if not isinstance(timeout, int) or timeout < 1:
            raise ValueError("timeout_seconds must be a positive integer")

        sanitizer = config.get("sanitizer", "address")
        if sanitizer not in ["address", "memory", "undefined"]:
            raise ValueError("sanitizer must be one of: address, memory, undefined")

        return True
||||
|
||||
async def execute(
|
||||
self,
|
||||
config: Dict[str, Any],
|
||||
workspace: Path,
|
||||
stats_callback: Optional[Callable] = None
|
||||
) -> ModuleResult:
|
||||
"""
|
||||
Execute cargo-fuzz on user's Rust code.
|
||||
|
||||
Args:
|
||||
config: Fuzzer configuration
|
||||
workspace: Path to workspace directory containing Rust project
|
||||
stats_callback: Optional callback for real-time stats updates
|
||||
|
||||
Returns:
|
||||
ModuleResult containing findings and summary
|
||||
"""
|
||||
self.start_timer()
|
||||
|
||||
try:
|
||||
# Validate inputs
|
||||
self.validate_config(config)
|
||||
self.validate_workspace(workspace)
|
||||
|
||||
logger.info(f"Running cargo-fuzz on {workspace}")
|
||||
|
||||
# Step 1: Discover fuzz targets
|
||||
targets = await self._discover_fuzz_targets(workspace)
|
||||
if not targets:
|
||||
return self.create_result(
|
||||
findings=[],
|
||||
status="failed",
|
||||
error="No fuzz targets found. Expected fuzz targets in fuzz/fuzz_targets/"
|
||||
)
|
||||
|
||||
# Get target name from config or use first discovered target
|
||||
target_name = config.get("target_name")
|
||||
if not target_name:
|
||||
target_name = targets[0]
|
||||
logger.info(f"No target specified, using first discovered target: {target_name}")
|
||||
elif target_name not in targets:
|
||||
return self.create_result(
|
||||
findings=[],
|
||||
status="failed",
|
||||
error=f"Target '{target_name}' not found. Available targets: {', '.join(targets)}"
|
||||
)
|
||||
|
||||
# Step 2: Build fuzz target
|
||||
logger.info(f"Building fuzz target: {target_name}")
|
||||
build_success = await self._build_fuzz_target(workspace, target_name, config)
|
||||
if not build_success:
|
||||
return self.create_result(
|
||||
findings=[],
|
||||
status="failed",
|
||||
error=f"Failed to build fuzz target: {target_name}"
|
||||
)
|
||||
|
||||
# Step 3: Run fuzzing
|
||||
logger.info(f"Starting fuzzing: {target_name}")
|
||||
findings, stats = await self._run_fuzzing(
|
||||
workspace,
|
||||
target_name,
|
||||
config,
|
||||
stats_callback
|
||||
)
|
||||
|
||||
# Step 4: Parse crash artifacts
|
||||
crash_findings = await self._parse_crash_artifacts(workspace, target_name)
|
||||
findings.extend(crash_findings)
|
||||
|
||||
logger.info(f"Fuzzing completed: {len(findings)} crashes found")
|
||||
|
||||
return self.create_result(
|
||||
findings=findings,
|
||||
status="success",
|
||||
summary=stats
|
||||
)
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Cargo fuzzer failed: {e}")
|
||||
return self.create_result(
|
||||
findings=[],
|
||||
status="failed",
|
||||
error=str(e)
|
||||
)
|
||||
|
    async def _discover_fuzz_targets(self, workspace: Path) -> List[str]:
        """
        Discover fuzz targets in the project.

        Looks for fuzz targets (one .rs file per target) in the
        fuzz/fuzz_targets/ directory.
        """
        fuzz_targets_dir = workspace / "fuzz" / "fuzz_targets"
        if not fuzz_targets_dir.exists():
            logger.warning(f"No fuzz targets directory found: {fuzz_targets_dir}")
            return []

        targets = []
        # glob() order is filesystem-dependent; sort so that the
        # "first discovered target" fallback in execute() is deterministic
        for file in sorted(fuzz_targets_dir.glob("*.rs")):
            target_name = file.stem
            targets.append(target_name)
            logger.info(f"Discovered fuzz target: {target_name}")

        return targets
||||
|
||||
async def _build_fuzz_target(
|
||||
self,
|
||||
workspace: Path,
|
||||
target_name: str,
|
||||
config: Dict[str, Any]
|
||||
) -> bool:
|
||||
"""Build the fuzz target with instrumentation"""
|
||||
try:
|
||||
sanitizer = config.get("sanitizer", "address")
|
||||
|
||||
# Build command
|
||||
cmd = [
|
||||
"cargo", "fuzz", "build",
|
||||
target_name,
|
||||
f"--sanitizer={sanitizer}"
|
||||
]
|
||||
|
||||
logger.debug(f"Build command: {' '.join(cmd)}")
|
||||
|
||||
proc = await asyncio.create_subprocess_exec(
|
||||
*cmd,
|
||||
cwd=workspace,
|
||||
stdout=asyncio.subprocess.PIPE,
|
||||
stderr=asyncio.subprocess.PIPE
|
||||
)
|
||||
|
||||
stdout, stderr = await proc.communicate()
|
||||
|
||||
if proc.returncode != 0:
|
||||
logger.error(f"Build failed: {stderr.decode()}")
|
||||
return False
|
||||
|
||||
logger.info("Build successful")
|
||||
return True
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Build error: {e}")
|
||||
return False
|
||||
|
||||
async def _run_fuzzing(
|
||||
self,
|
||||
workspace: Path,
|
||||
target_name: str,
|
||||
config: Dict[str, Any],
|
||||
stats_callback: Optional[Callable]
|
||||
) -> tuple[List[ModuleFinding], Dict[str, Any]]:
|
||||
"""
|
||||
Run cargo-fuzz and collect statistics.
|
||||
|
||||
Returns:
|
||||
Tuple of (findings, stats_dict)
|
||||
"""
|
||||
max_iterations = config.get("max_iterations", 1000000)
|
||||
timeout_seconds = config.get("timeout_seconds", 1800)
|
||||
sanitizer = config.get("sanitizer", "address")
|
||||
|
||||
findings = []
|
||||
stats = {
|
||||
"total_executions": 0,
|
||||
"crashes_found": 0,
|
||||
"corpus_size": 0,
|
||||
"coverage": 0.0,
|
||||
"execution_time": 0.0
|
||||
}
|
||||
|
||||
try:
|
||||
# Cargo fuzz run command
|
||||
cmd = [
|
||||
"cargo", "fuzz", "run",
|
||||
target_name,
|
||||
f"--sanitizer={sanitizer}",
|
||||
"--",
|
||||
f"-runs={max_iterations}",
|
||||
f"-max_total_time={timeout_seconds}"
|
||||
]
|
||||
|
||||
logger.debug(f"Fuzz command: {' '.join(cmd)}")
|
||||
|
||||
start_time = time.time()
|
||||
proc = await asyncio.create_subprocess_exec(
|
||||
*cmd,
|
||||
cwd=workspace,
|
||||
stdout=asyncio.subprocess.PIPE,
|
||||
stderr=asyncio.subprocess.STDOUT
|
||||
)
|
||||
|
||||
# Monitor output and extract stats
|
||||
last_stats_time = time.time()
|
||||
async for line in proc.stdout:
|
||||
line_str = line.decode('utf-8', errors='ignore').strip()
|
||||
|
||||
# Parse libFuzzer stats
|
||||
# Example: "#12345 NEW cov: 123 ft: 456 corp: 10/234b"
|
||||
stats_match = re.match(r'#(\d+)\s+.*cov:\s*(\d+).*corp:\s*(\d+)', line_str)
|
||||
if stats_match:
|
||||
execs = int(stats_match.group(1))
|
||||
cov = int(stats_match.group(2))
|
||||
corp = int(stats_match.group(3))
|
||||
|
||||
stats["total_executions"] = execs
|
||||
stats["coverage"] = float(cov)
|
||||
stats["corpus_size"] = corp
|
||||
stats["execution_time"] = time.time() - start_time
|
||||
|
||||
# Invoke stats callback for real-time monitoring
|
||||
if stats_callback and time.time() - last_stats_time >= 0.5:
|
||||
await stats_callback({
|
||||
"total_execs": execs,
|
||||
"execs_per_sec": execs / stats["execution_time"] if stats["execution_time"] > 0 else 0,
|
||||
"crashes": stats["crashes_found"],
|
||||
"coverage": cov,
|
||||
"corpus_size": corp,
|
||||
"elapsed_time": int(stats["execution_time"])
|
||||
})
|
||||
last_stats_time = time.time()
|
||||
|
||||
# Count one crash per SUMMARY line; the ERROR line printed earlier belongs to the same crash
if "SUMMARY:" in line_str:
|
||||
logger.info(f"Detected crash: {line_str}")
|
||||
stats["crashes_found"] += 1
|
||||
|
||||
await proc.wait()
|
||||
stats["execution_time"] = time.time() - start_time
|
||||
|
||||
# Send final stats update
|
||||
if stats_callback:
|
||||
await stats_callback({
|
||||
"total_execs": stats["total_executions"],
|
||||
"execs_per_sec": stats["total_executions"] / stats["execution_time"] if stats["execution_time"] > 0 else 0,
|
||||
"crashes": stats["crashes_found"],
|
||||
"coverage": stats["coverage"],
|
||||
"corpus_size": stats["corpus_size"],
|
||||
"elapsed_time": int(stats["execution_time"])
|
||||
})
|
||||
|
||||
logger.info(
|
||||
f"Fuzzing completed: {stats['total_executions']} execs, "
|
||||
f"{stats['crashes_found']} crashes"
|
||||
)
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Fuzzing error: {e}")
|
||||
|
||||
return findings, stats
|
||||
|
||||
async def _parse_crash_artifacts(
|
||||
self,
|
||||
workspace: Path,
|
||||
target_name: str
|
||||
) -> List[ModuleFinding]:
|
||||
"""
|
||||
Parse crash artifacts from fuzz/artifacts directory.
|
||||
|
||||
Cargo-fuzz stores crashes in: fuzz/artifacts/<target_name>/
|
||||
"""
|
||||
findings = []
|
||||
artifacts_dir = workspace / "fuzz" / "artifacts" / target_name
|
||||
|
||||
if not artifacts_dir.exists():
|
||||
logger.info("No crash artifacts found")
|
||||
return findings
|
||||
|
||||
# Find all crash files
|
||||
for crash_file in artifacts_dir.glob("crash-*"):
|
||||
try:
|
||||
finding = await self._analyze_crash(workspace, target_name, crash_file)
|
||||
if finding:
|
||||
findings.append(finding)
|
||||
except Exception as e:
|
||||
logger.warning(f"Failed to analyze crash {crash_file}: {e}")
|
||||
|
||||
logger.info(f"Parsed {len(findings)} crash artifacts")
|
||||
return findings
|
||||
|
||||
async def _analyze_crash(
|
||||
self,
|
||||
workspace: Path,
|
||||
target_name: str,
|
||||
crash_file: Path
|
||||
) -> Optional[ModuleFinding]:
|
||||
"""
|
||||
Analyze a single crash file.
|
||||
|
||||
Runs cargo-fuzz with the crash input to reproduce and get stack trace.
|
||||
"""
|
||||
try:
|
||||
# Read crash input
|
||||
crash_input = crash_file.read_bytes()
|
||||
|
||||
# Reproduce crash to get stack trace
|
||||
cmd = [
|
||||
"cargo", "fuzz", "run",
|
||||
target_name,
|
||||
str(crash_file)
|
||||
]
|
||||
|
||||
proc = await asyncio.create_subprocess_exec(
|
||||
*cmd,
|
||||
cwd=workspace,
|
||||
stdout=asyncio.subprocess.PIPE,
|
||||
stderr=asyncio.subprocess.STDOUT,
|
||||
env={**os.environ, "RUST_BACKTRACE": "1"}
|
||||
)
|
||||
|
||||
stdout, _ = await proc.communicate()
|
||||
output = stdout.decode('utf-8', errors='ignore')
|
||||
|
||||
# Parse stack trace and error type
|
||||
error_type = "Unknown Crash"
|
||||
stack_trace = output
|
||||
|
||||
# Extract error type
|
||||
if "SEGV" in output:
|
||||
error_type = "Segmentation Fault"
|
||||
severity = "critical"
|
||||
elif "heap-use-after-free" in output:
|
||||
error_type = "Use After Free"
|
||||
severity = "critical"
|
||||
elif "heap-buffer-overflow" in output:
|
||||
error_type = "Heap Buffer Overflow"
|
||||
severity = "critical"
|
||||
elif "stack-buffer-overflow" in output:
|
||||
error_type = "Stack Buffer Overflow"
|
||||
severity = "high"
|
||||
elif "panic" in output.lower():
|
||||
error_type = "Panic"
|
||||
severity = "medium"
|
||||
else:
|
||||
severity = "high"
|
||||
|
||||
# Create finding
|
||||
finding = self.create_finding(
|
||||
title=f"Crash: {error_type} in {target_name}",
|
||||
description=f"Cargo-fuzz discovered a crash in target '{target_name}'. "
|
||||
f"Error type: {error_type}. "
|
||||
f"Input size: {len(crash_input)} bytes.",
|
||||
severity=severity,
|
||||
category="crash",
|
||||
file_path=f"fuzz/fuzz_targets/{target_name}.rs",
|
||||
code_snippet=stack_trace[:500],
|
||||
recommendation="Review the crash details and fix the underlying bug. "
|
||||
"Use AddressSanitizer to identify memory safety issues. "
|
||||
"Consider adding bounds checks or using safer APIs.",
|
||||
metadata={
|
||||
"error_type": error_type,
|
||||
"crash_file": crash_file.name,
|
||||
"input_size": len(crash_input),
|
||||
"reproducer": crash_file.name,
|
||||
"stack_trace": stack_trace
|
||||
}
|
||||
)
|
||||
|
||||
return finding
|
||||
|
||||
except Exception as e:
|
||||
logger.warning(f"Failed to analyze crash {crash_file}: {e}")
|
||||
return None
|
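Two pieces of text parsing carry most of the logic in the module above: extracting counters from libFuzzer status lines, and mapping reproducer output to an error type and severity. The sketch below isolates both as pure functions so they can be exercised without running cargo-fuzz. The function names `parse_status_line` and `classify_crash` are hypothetical; the regular expression, marker strings, and severities are copied from the module.

```python
import re
from typing import Optional, Tuple

# Same pattern as in _run_fuzzing: execution count, coverage, corpus entries.
STATUS_RE = re.compile(r"#(\d+)\s+.*cov:\s*(\d+).*corp:\s*(\d+)")

# (marker in output, error type, severity), checked in this order.
CRASH_MARKERS = [
    ("SEGV", "Segmentation Fault", "critical"),
    ("heap-use-after-free", "Use After Free", "critical"),
    ("heap-buffer-overflow", "Heap Buffer Overflow", "critical"),
    ("stack-buffer-overflow", "Stack Buffer Overflow", "high"),
]


def parse_status_line(line: str) -> Optional[Tuple[int, int, int]]:
    """Return (executions, coverage, corpus_size), or None for other lines."""
    match = STATUS_RE.match(line.strip())
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2)), int(match.group(3))


def classify_crash(output: str) -> Tuple[str, str]:
    """Map reproducer output to (error_type, severity)."""
    for marker, error_type, severity in CRASH_MARKERS:
        if marker in output:
            return error_type, severity
    if "panic" in output.lower():
        return "Panic", "medium"
    return "Unknown Crash", "high"
```

Keeping the marker table as data makes it straightforward to add further sanitizer report types without touching the control flow.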
||||
@@ -17,7 +17,6 @@ import logging
|
||||
from pathlib import Path
|
||||
from typing import Dict, Any, List
|
||||
from datetime import datetime
|
||||
import json
|
||||
|
||||
try:
|
||||
from toolbox.modules.base import BaseModule, ModuleMetadata, ModuleResult, ModuleFinding
|
||||
|
||||
@@ -16,16 +16,16 @@ File Scanner Module - Scans and enumerates files in the workspace
|
 import logging
 import mimetypes
 from pathlib import Path
-from typing import Dict, Any, List
+from typing import Dict, Any
 import hashlib

 try:
-    from toolbox.modules.base import BaseModule, ModuleMetadata, ModuleResult, ModuleFinding
+    from toolbox.modules.base import BaseModule, ModuleMetadata, ModuleResult
 except ImportError:
     try:
-        from modules.base import BaseModule, ModuleMetadata, ModuleResult, ModuleFinding
+        from modules.base import BaseModule, ModuleMetadata, ModuleResult
     except ImportError:
-        from src.toolbox.modules.base import BaseModule, ModuleMetadata, ModuleResult, ModuleFinding
+        from src.toolbox.modules.base import BaseModule, ModuleMetadata, ModuleResult

 logger = logging.getLogger(__name__)
||||
|
||||
|
||||
@@ -0,0 +1,9 @@
|
"""
Atheris Fuzzing Workflow

Fuzzes user-provided Python code using Atheris.
"""

from .workflow import AtherisFuzzingWorkflow

__all__ = ["AtherisFuzzingWorkflow"]
||||
@@ -0,0 +1,122 @@
|
||||
"""
|
||||
Atheris Fuzzing Workflow Activities
|
||||
|
||||
Activities specific to the Atheris fuzzing workflow.
|
||||
"""
|
||||
|
||||
import logging
|
||||
import sys
|
||||
from datetime import datetime
|
||||
from pathlib import Path
|
||||
from typing import Dict, Any
|
||||
import os
|
||||
|
||||
import httpx
|
||||
from temporalio import activity
|
||||
|
||||
# Configure logging
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
# Add toolbox to path for module imports
|
||||
sys.path.insert(0, '/app/toolbox')
|
||||
|
||||
|
||||
@activity.defn(name="fuzz_with_atheris")
|
||||
async def fuzz_activity(workspace_path: str, config: dict) -> dict:
|
||||
"""
|
||||
Fuzzing activity using the AtherisFuzzer module on user code.
|
||||
|
||||
This activity:
|
||||
1. Imports the reusable AtherisFuzzer module
|
||||
2. Sets up real-time stats callback
|
||||
3. Executes fuzzing on user's TestOneInput() function
|
||||
4. Returns findings as ModuleResult
|
||||
|
||||
Args:
|
||||
workspace_path: Path to the workspace directory (user's uploaded code)
|
||||
config: Fuzzer configuration (target_file, max_iterations, timeout_seconds)
|
||||
|
||||
Returns:
|
||||
Fuzzer results dictionary (findings, summary, metadata)
|
||||
"""
|
||||
logger.info(f"Activity: fuzz_with_atheris (workspace={workspace_path})")
|
||||
|
||||
try:
|
||||
# Import reusable AtherisFuzzer module
|
||||
from modules.fuzzer import AtherisFuzzer
|
||||
|
||||
workspace = Path(workspace_path)
|
||||
if not workspace.exists():
|
||||
raise FileNotFoundError(f"Workspace not found: {workspace_path}")
|
||||
|
||||
# Get activity info for real-time stats
|
||||
info = activity.info()
|
||||
run_id = info.workflow_id
|
||||
|
||||
# Define stats callback for real-time monitoring
|
||||
async def stats_callback(stats_data: Dict[str, Any]):
|
||||
"""Callback for live fuzzing statistics"""
|
||||
try:
|
||||
# Prepare stats payload for backend
|
||||
coverage_value = stats_data.get("coverage", 0)
|
||||
logger.debug(f"COVERAGE_DEBUG: coverage from stats_data = {coverage_value}")
|
||||
|
||||
stats_payload = {
|
||||
"run_id": run_id,
|
||||
"workflow": "atheris_fuzzing",
|
||||
"executions": stats_data.get("total_execs", 0),
|
||||
"executions_per_sec": stats_data.get("execs_per_sec", 0.0),
|
||||
"crashes": stats_data.get("crashes", 0),
|
||||
"unique_crashes": stats_data.get("crashes", 0),
|
||||
"coverage": coverage_value,
|
||||
"corpus_size": stats_data.get("corpus_size", 0),
|
||||
"elapsed_time": stats_data.get("elapsed_time", 0),
|
||||
"last_crash_time": None
|
||||
}
|
||||
|
||||
# POST stats to backend API for real-time monitoring
|
||||
backend_url = os.getenv("BACKEND_URL", "http://backend:8000")
|
||||
async with httpx.AsyncClient(timeout=5.0) as client:
|
||||
try:
|
||||
await client.post(
|
||||
f"{backend_url}/fuzzing/{run_id}/stats",
|
||||
json=stats_payload
|
||||
)
|
||||
except Exception as http_err:
|
||||
logger.debug(f"Failed to post stats to backend: {http_err}")
|
||||
|
||||
# Also log for debugging
|
||||
logger.info("LIVE_STATS", extra={
|
||||
"stats_type": "fuzzing_live_update",
|
||||
"workflow_type": "atheris_fuzzing",
|
||||
"run_id": run_id,
|
||||
"executions": stats_data.get("total_execs", 0),
|
||||
"executions_per_sec": stats_data.get("execs_per_sec", 0.0),
|
||||
"crashes": stats_data.get("crashes", 0),
|
||||
"corpus_size": stats_data.get("corpus_size", 0),
|
||||
"coverage": stats_data.get("coverage", 0.0),
|
||||
"elapsed_time": stats_data.get("elapsed_time", 0),
|
||||
"timestamp": datetime.utcnow().isoformat()
|
||||
})
|
||||
except Exception as e:
|
||||
logger.warning(f"Error in stats callback: {e}")
|
||||
|
||||
# Add stats callback and run_id to config
|
||||
config["stats_callback"] = stats_callback
|
||||
config["run_id"] = run_id
|
||||
|
||||
# Execute the fuzzer module
|
||||
fuzzer = AtherisFuzzer()
|
||||
result = await fuzzer.execute(config, workspace)
|
||||
|
||||
logger.info(
|
||||
f"✓ Fuzzing completed: "
|
||||
f"{result.summary.get('total_executions', 0)} executions, "
|
||||
f"{result.summary.get('crashes_found', 0)} crashes"
|
||||
)
|
||||
|
||||
return result.dict()
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Fuzzing failed: {e}", exc_info=True)
|
||||
raise
|
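The stats callback above translates the fuzzer's internal keys (`total_execs`, `execs_per_sec`, and so on) into the payload posted to the backend. A minimal sketch of that mapping as a standalone function follows, which makes the defaults easy to check in isolation. The function name `build_stats_payload` is hypothetical; the field names are taken from the activity.

```python
from typing import Any, Dict


def build_stats_payload(
    run_id: str, workflow: str, stats_data: Dict[str, Any]
) -> Dict[str, Any]:
    """Map fuzzer stats keys to the backend payload, with zero defaults."""
    crashes = stats_data.get("crashes", 0)
    return {
        "run_id": run_id,
        "workflow": workflow,
        "executions": stats_data.get("total_execs", 0),
        "executions_per_sec": stats_data.get("execs_per_sec", 0.0),
        "crashes": crashes,
        # The activity reports the same count for both fields.
        "unique_crashes": crashes,
        "coverage": stats_data.get("coverage", 0),
        "corpus_size": stats_data.get("corpus_size", 0),
        "elapsed_time": stats_data.get("elapsed_time", 0),
        "last_crash_time": None,
    }
```

Because every key has a default, a partial stats dictionary from the fuzzer still yields a complete payload.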
||||
@@ -0,0 +1,65 @@
|
name: atheris_fuzzing
version: "1.0.0"
vertical: python
description: "Fuzz Python code using Atheris with real-time monitoring. Automatically discovers and fuzzes TestOneInput() functions in user code."
author: "FuzzForge Team"
tags:
  - "fuzzing"
  - "atheris"
  - "python"
  - "coverage"
  - "security"

# Workspace isolation mode (system-level configuration)
# - "isolated" (default): Each workflow run gets its own isolated workspace (safe for concurrent fuzzing)
# - "shared": All runs share the same workspace (for read-only analysis workflows)
# - "copy-on-write": Download once, copy for each run (balances performance and isolation)
workspace_isolation: "isolated"

default_parameters:
  target_file: null
  max_iterations: 1000000
  timeout_seconds: 1800

parameters:
  type: object
  properties:
    target_file:
      type: string
      description: "Python file with TestOneInput() function (auto-discovered if not specified)"
    max_iterations:
      type: integer
      default: 1000000
      description: "Maximum fuzzing iterations"
    timeout_seconds:
      type: integer
      default: 1800
      description: "Fuzzing timeout in seconds (30 minutes)"

output_schema:
  type: object
  properties:
    findings:
      type: array
      description: "Crashes and vulnerabilities found during fuzzing"
      items:
        type: object
        properties:
          title:
            type: string
          severity:
            type: string
          category:
            type: string
          metadata:
            type: object
    summary:
      type: object
      description: "Fuzzing execution summary"
      properties:
        total_executions:
          type: integer
        crashes_found:
          type: integer
        execution_time:
          type: number
|
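The metadata above declares `default_parameters` alongside a JSON-Schema-style `parameters` block. The sketch below shows one way a caller could merge user-supplied values over those defaults and apply the same type checks. The function name `resolve_parameters` and the rule that unknown keys are rejected are assumptions made for illustration; they do not describe how FuzzForge itself loads this file.

```python
from typing import Any, Dict, Optional

# Values copied from default_parameters in the metadata file.
DEFAULT_PARAMETERS: Dict[str, Any] = {
    "target_file": None,
    "max_iterations": 1000000,
    "timeout_seconds": 1800,
}


def resolve_parameters(user_params: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Overlay user parameters on the defaults and validate their types."""
    params = dict(DEFAULT_PARAMETERS)
    for key, value in (user_params or {}).items():
        if key not in DEFAULT_PARAMETERS:
            raise ValueError(f"Unknown parameter: {key}")
        if value is not None:  # None means "use the default"
            params[key] = value

    for key in ("max_iterations", "timeout_seconds"):
        value = params[key]
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value < 1:
            raise ValueError(f"{key} must be a positive integer")

    target_file = params["target_file"]
    if target_file is not None and not isinstance(target_file, str):
        raise ValueError("target_file must be a string")
    return params
```

Treating `None` as "use the default" matches the workflow's own handling, where `max_iterations` and `timeout_seconds` fall back to 1000000 and 1800 when they are `None`.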
||||
@@ -0,0 +1,175 @@
|
||||
"""
|
||||
Atheris Fuzzing Workflow - Temporal Version
|
||||
|
||||
Fuzzes user-provided Python code using Atheris with real-time monitoring.
|
||||
"""
|
||||
|
||||
from datetime import timedelta
|
||||
from typing import Dict, Any, Optional
|
||||
|
||||
from temporalio import workflow
|
||||
from temporalio.common import RetryPolicy
|
||||
|
||||
# Pass the standard-library logging import through the workflow sandbox unchanged
|
||||
with workflow.unsafe.imports_passed_through():
|
||||
import logging
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
@workflow.defn
|
||||
class AtherisFuzzingWorkflow:
|
||||
"""
|
||||
Fuzz Python code using Atheris.
|
||||
|
||||
User workflow:
|
||||
1. User runs: ff workflow run atheris_fuzzing .
|
||||
2. CLI uploads project to MinIO
|
||||
3. Worker downloads project
|
||||
4. Worker fuzzes TestOneInput() function
|
||||
5. Crashes reported as findings
|
||||
"""
|
||||
|
||||
@workflow.run
|
||||
async def run(
|
||||
self,
|
||||
target_id: str, # MinIO UUID of uploaded user code
|
||||
target_file: Optional[str] = None, # Optional: specific file to fuzz
|
||||
max_iterations: int = 1000000,
|
||||
timeout_seconds: int = 1800 # 30 minutes default for fuzzing
|
||||
) -> Dict[str, Any]:
|
||||
"""
|
||||
Main workflow execution.
|
||||
|
||||
Args:
|
||||
target_id: UUID of the uploaded target in MinIO
|
||||
target_file: Optional specific Python file with TestOneInput() (auto-discovered if None)
|
||||
max_iterations: Maximum fuzzing iterations
|
||||
timeout_seconds: Fuzzing timeout in seconds
|
||||
|
||||
Returns:
|
||||
Dictionary containing findings and summary
|
||||
"""
|
||||
workflow_id = workflow.info().workflow_id
|
||||
|
||||
workflow.logger.info(
|
||||
f"Starting AtherisFuzzingWorkflow "
|
||||
f"(workflow_id={workflow_id}, target_id={target_id}, "
|
||||
f"target_file={target_file or 'auto-discover'}, max_iterations={max_iterations}, "
|
||||
f"timeout_seconds={timeout_seconds})"
|
||||
)
|
||||
|
||||
results = {
|
||||
"workflow_id": workflow_id,
|
||||
"target_id": target_id,
|
||||
"status": "running",
|
||||
"steps": []
|
||||
}
|
||||
|
||||
try:
|
||||
# Get run ID for workspace isolation
|
||||
run_id = workflow.info().run_id
|
||||
|
||||
# Step 1: Download user's project from MinIO
|
||||
workflow.logger.info("Step 1: Downloading user code from MinIO")
|
||||
target_path = await workflow.execute_activity(
|
||||
"get_target",
|
||||
args=[target_id, run_id, "isolated"], # target_id, run_id, workspace_isolation
|
||||
start_to_close_timeout=timedelta(minutes=5),
|
||||
retry_policy=RetryPolicy(
|
||||
initial_interval=timedelta(seconds=1),
|
||||
maximum_interval=timedelta(seconds=30),
|
||||
maximum_attempts=3
|
||||
)
|
||||
)
|
||||
results["steps"].append({
|
||||
"step": "download_target",
|
||||
"status": "success",
|
||||
"target_path": target_path
|
||||
})
|
||||
workflow.logger.info(f"✓ User code downloaded to: {target_path}")
|
||||
|
||||
# Step 2: Run Atheris fuzzing
|
||||
workflow.logger.info("Step 2: Running Atheris fuzzing")
|
||||
|
||||
# Use defaults if parameters are None
|
||||
actual_max_iterations = max_iterations if max_iterations is not None else 1000000
|
||||
actual_timeout_seconds = timeout_seconds if timeout_seconds is not None else 1800
|
||||
|
||||
fuzz_config = {
|
||||
"target_file": target_file,
|
||||
"max_iterations": actual_max_iterations,
|
||||
"timeout_seconds": actual_timeout_seconds
|
||||
}
|
||||
|
||||
fuzz_results = await workflow.execute_activity(
|
||||
"fuzz_with_atheris",
|
||||
args=[target_path, fuzz_config],
|
||||
start_to_close_timeout=timedelta(seconds=actual_timeout_seconds + 60),
|
||||
retry_policy=RetryPolicy(
|
||||
initial_interval=timedelta(seconds=2),
|
||||
maximum_interval=timedelta(seconds=60),
|
||||
maximum_attempts=1 # Fuzzing shouldn't retry
|
||||
)
|
||||
)
|
||||
|
||||
results["steps"].append({
|
||||
"step": "fuzzing",
|
||||
"status": "success",
|
||||
"executions": fuzz_results.get("summary", {}).get("total_executions", 0),
|
||||
"crashes": fuzz_results.get("summary", {}).get("crashes_found", 0)
|
||||
})
|
||||
workflow.logger.info(
|
||||
f"✓ Fuzzing completed: "
|
||||
f"{fuzz_results.get('summary', {}).get('total_executions', 0)} executions, "
|
||||
f"{fuzz_results.get('summary', {}).get('crashes_found', 0)} crashes"
|
||||
)
|
||||
|
||||
# Step 3: Upload results to MinIO
|
||||
workflow.logger.info("Step 3: Uploading results")
|
||||
try:
|
||||
results_url = await workflow.execute_activity(
|
||||
"upload_results",
|
||||
args=[workflow_id, fuzz_results, "json"],
|
||||
start_to_close_timeout=timedelta(minutes=2)
|
||||
)
|
||||
results["results_url"] = results_url
|
||||
workflow.logger.info(f"✓ Results uploaded to: {results_url}")
|
||||
except Exception as e:
|
||||
workflow.logger.warning(f"Failed to upload results: {e}")
|
||||
results["results_url"] = None
|
||||
|
||||
# Step 4: Cleanup cache
|
||||
workflow.logger.info("Step 4: Cleaning up cache")
|
||||
try:
|
||||
await workflow.execute_activity(
|
||||
"cleanup_cache",
|
||||
args=[target_path, "isolated"], # target_path, workspace_isolation
|
||||
start_to_close_timeout=timedelta(minutes=1)
|
||||
)
|
||||
workflow.logger.info("✓ Cache cleaned up")
|
||||
except Exception as e:
|
||||
workflow.logger.warning(f"Cache cleanup failed: {e}")
|
||||
|
||||
# Mark workflow as successful
|
||||
results["status"] = "success"
|
||||
results["findings"] = fuzz_results.get("findings", [])
|
||||
results["summary"] = fuzz_results.get("summary", {})
|
||||
results["sarif"] = fuzz_results.get("sarif") or {}
|
||||
workflow.logger.info(
|
||||
f"✓ Workflow completed successfully: {workflow_id} "
|
||||
f"({results['summary'].get('crashes_found', 0)} crashes found)"
|
||||
)
|
||||
|
||||
return results
|
||||
|
||||
except Exception as e:
|
||||
workflow.logger.error(f"Workflow failed: {e}")
|
||||
results["status"] = "error"
|
||||
results["error"] = str(e)
|
||||
results["steps"].append({
|
||||
"step": "error",
|
||||
"status": "failed",
|
||||
"error": str(e)
|
||||
})
|
||||
raise
|
||||
@@ -0,0 +1,5 @@
|
"""Cargo Fuzzing Workflow"""

from .workflow import CargoFuzzingWorkflow

__all__ = ["CargoFuzzingWorkflow"]
||||
@@ -0,0 +1,203 @@
|
||||
"""
|
||||
Cargo Fuzzing Workflow Activities
|
||||
|
||||
Activities specific to the cargo-fuzz fuzzing workflow.
|
||||
"""
|
||||
|
||||
import logging
|
||||
import sys
|
||||
from datetime import datetime
|
||||
from pathlib import Path
|
||||
from typing import Dict, Any
|
||||
import os
|
||||
|
||||
import httpx
|
||||
from temporalio import activity
|
||||
|
||||
# Configure logging
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
# Add toolbox to path for module imports
|
||||
sys.path.insert(0, '/app/toolbox')
|
||||
|
||||
|
||||
@activity.defn(name="fuzz_with_cargo")
|
||||
async def fuzz_activity(workspace_path: str, config: dict) -> dict:
|
||||
"""
|
||||
Fuzzing activity using the CargoFuzzer module on user code.
|
||||
|
||||
This activity:
|
||||
1. Imports the reusable CargoFuzzer module
|
||||
2. Sets up real-time stats callback
|
||||
3. Executes fuzzing on user's fuzz_target!() functions
|
||||
4. Returns findings as ModuleResult
|
||||
|
||||
Args:
|
||||
workspace_path: Path to the workspace directory (user's uploaded Rust project)
|
||||
config: Fuzzer configuration (target_name, max_iterations, timeout_seconds, sanitizer)
|
||||
|
||||
Returns:
|
||||
Fuzzer results dictionary (findings, summary, metadata)
|
||||
"""
|
||||
logger.info(f"Activity: fuzz_with_cargo (workspace={workspace_path})")
|
||||
|
||||
try:
|
||||
# Import reusable CargoFuzzer module
|
||||
from modules.fuzzer import CargoFuzzer
|
||||
|
||||
workspace = Path(workspace_path)
|
||||
        if not workspace.exists():
            raise FileNotFoundError(f"Workspace not found: {workspace_path}")

        # Get activity info for real-time stats
        info = activity.info()
        run_id = info.workflow_id

        # Define stats callback for real-time monitoring
        async def stats_callback(stats_data: Dict[str, Any]):
            """Callback for live fuzzing statistics"""
            try:
                # Prepare stats payload for backend
                coverage_value = stats_data.get("coverage", 0)

                stats_payload = {
                    "run_id": run_id,
                    "workflow": "cargo_fuzzing",
                    "executions": stats_data.get("total_execs", 0),
                    "executions_per_sec": stats_data.get("execs_per_sec", 0.0),
                    "crashes": stats_data.get("crashes", 0),
                    "unique_crashes": stats_data.get("crashes", 0),
                    "coverage": coverage_value,
                    "corpus_size": stats_data.get("corpus_size", 0),
                    "elapsed_time": stats_data.get("elapsed_time", 0),
                    "last_crash_time": None
                }

                # POST stats to backend API for real-time monitoring
                backend_url = os.getenv("BACKEND_URL", "http://backend:8000")
                async with httpx.AsyncClient(timeout=5.0) as client:
                    try:
                        await client.post(
                            f"{backend_url}/fuzzing/{run_id}/stats",
                            json=stats_payload
                        )
                    except Exception as http_err:
                        logger.debug(f"Failed to post stats to backend: {http_err}")

                # Also log for debugging
                logger.info("LIVE_STATS", extra={
                    "stats_type": "fuzzing_live_update",
                    "workflow_type": "cargo_fuzzing",
                    "run_id": run_id,
                    "executions": stats_data.get("total_execs", 0),
                    "executions_per_sec": stats_data.get("execs_per_sec", 0.0),
                    "crashes": stats_data.get("crashes", 0),
                    "corpus_size": stats_data.get("corpus_size", 0),
                    "coverage": stats_data.get("coverage", 0.0),
                    "elapsed_time": stats_data.get("elapsed_time", 0),
                    "timestamp": datetime.utcnow().isoformat()
                })

            except Exception as e:
                logger.error(f"Stats callback error: {e}")

        # Initialize CargoFuzzer module
        fuzzer = CargoFuzzer()

        # Execute fuzzing with stats callback
        module_result = await fuzzer.execute(
            config=config,
            workspace=workspace,
            stats_callback=stats_callback
        )

        # Convert ModuleResult to dictionary
        result_dict = {
            "findings": [],
            "summary": module_result.summary,
            "metadata": module_result.metadata,
            "status": module_result.status,
            "error": module_result.error
        }

        # Convert findings to dict format
        for finding in module_result.findings:
            finding_dict = {
                "id": finding.id,
                "title": finding.title,
                "description": finding.description,
                "severity": finding.severity,
                "category": finding.category,
                "file_path": finding.file_path,
                "line_start": finding.line_start,
                "line_end": finding.line_end,
                "code_snippet": finding.code_snippet,
                "recommendation": finding.recommendation,
                "metadata": finding.metadata
            }
            result_dict["findings"].append(finding_dict)

        # Generate SARIF report from findings
        if module_result.findings:
            # Convert findings to SARIF format
            severity_map = {
                "critical": "error",
                "high": "error",
                "medium": "warning",
                "low": "note",
                "info": "note"
            }

            results = []
            for finding in module_result.findings:
                result = {
                    "ruleId": finding.metadata.get("rule_id", finding.category),
                    "level": severity_map.get(finding.severity, "warning"),
                    "message": {"text": finding.description},
                    "locations": []
                }

                if finding.file_path:
                    location = {
                        "physicalLocation": {
                            "artifactLocation": {"uri": finding.file_path},
                            "region": {
                                "startLine": finding.line_start or 1,
                                "endLine": finding.line_end or finding.line_start or 1
                            }
                        }
                    }
                    result["locations"].append(location)

                results.append(result)

            result_dict["sarif"] = {
                "version": "2.1.0",
                "$schema": "https://raw.githubusercontent.com/oasis-tcs/sarif-spec/master/Schemata/sarif-schema-2.1.0.json",
                "runs": [{
                    "tool": {
                        "driver": {
                            "name": "cargo-fuzz",
                            "version": "0.11.2"
                        }
                    },
                    "results": results
                }]
            }
        else:
            result_dict["sarif"] = {
                "version": "2.1.0",
                "$schema": "https://raw.githubusercontent.com/oasis-tcs/sarif-spec/master/Schemata/sarif-schema-2.1.0.json",
                "runs": []
            }

        logger.info(
            f"Fuzzing activity completed: {len(module_result.findings)} crashes found, "
            f"{module_result.summary.get('total_executions', 0)} executions"
        )

        return result_dict

    except Exception as e:
        logger.error(f"Fuzzing activity failed: {e}", exc_info=True)
        raise
@@ -0,0 +1,71 @@
name: cargo_fuzzing
version: "1.0.0"
vertical: rust
description: "Fuzz Rust code using cargo-fuzz with real-time monitoring. Automatically discovers and fuzzes fuzz_target!() functions in user code."
author: "FuzzForge Team"
tags:
  - "fuzzing"
  - "cargo-fuzz"
  - "rust"
  - "libfuzzer"
  - "memory-safety"

# Workspace isolation mode (system-level configuration)
# - "isolated" (default): Each workflow run gets its own isolated workspace (safe for concurrent fuzzing)
# - "shared": All runs share the same workspace (for read-only analysis workflows)
# - "copy-on-write": Download once, copy for each run (balances performance and isolation)
workspace_isolation: "isolated"

default_parameters:
  target_name: null
  max_iterations: 1000000
  timeout_seconds: 1800
  sanitizer: "address"

parameters:
  type: object
  properties:
    target_name:
      type: string
      description: "Fuzz target name from fuzz/fuzz_targets/ (auto-discovered if not specified)"
    max_iterations:
      type: integer
      default: 1000000
      description: "Maximum fuzzing iterations"
    timeout_seconds:
      type: integer
      default: 1800
      description: "Fuzzing timeout in seconds (30 minutes)"
    sanitizer:
      type: string
      enum: ["address", "memory", "undefined"]
      default: "address"
      description: "Sanitizer to use (address, memory, undefined)"

output_schema:
  type: object
  properties:
    findings:
      type: array
      description: "Crashes and memory safety issues found during fuzzing"
      items:
        type: object
        properties:
          title:
            type: string
          severity:
            type: string
          category:
            type: string
          metadata:
            type: object
    summary:
      type: object
      description: "Fuzzing execution summary"
      properties:
        total_executions:
          type: integer
        crashes_found:
          type: integer
        execution_time:
          type: number
@@ -0,0 +1,180 @@
"""
Cargo Fuzzing Workflow - Temporal Version

Fuzzes user-provided Rust code using cargo-fuzz with real-time monitoring.
"""

from datetime import timedelta
from typing import Dict, Any, Optional

from temporalio import workflow
from temporalio.common import RetryPolicy

# Import for type hints (will be executed by worker)
with workflow.unsafe.imports_passed_through():
    import logging

logger = logging.getLogger(__name__)


@workflow.defn
class CargoFuzzingWorkflow:
    """
    Fuzz Rust code using cargo-fuzz (libFuzzer).

    User workflow:
    1. User runs: ff workflow run cargo_fuzzing .
    2. CLI uploads Rust project to MinIO
    3. Worker downloads project
    4. Worker discovers fuzz targets in fuzz/fuzz_targets/
    5. Worker fuzzes the target with cargo-fuzz
    6. Crashes reported as findings
    """

    @workflow.run
    async def run(
        self,
        target_id: str,  # MinIO UUID of uploaded user code
        target_name: Optional[str] = None,  # Optional: specific fuzz target name
        max_iterations: int = 1000000,
        timeout_seconds: int = 1800,  # 30 minutes default for fuzzing
        sanitizer: str = "address"
    ) -> Dict[str, Any]:
        """
        Main workflow execution.

        Args:
            target_id: UUID of the uploaded target in MinIO
            target_name: Optional specific fuzz target name (auto-discovered if None)
            max_iterations: Maximum fuzzing iterations
            timeout_seconds: Fuzzing timeout in seconds
            sanitizer: Sanitizer to use (address, memory, undefined)

        Returns:
            Dictionary containing findings and summary
        """
        workflow_id = workflow.info().workflow_id

        workflow.logger.info(
            f"Starting CargoFuzzingWorkflow "
            f"(workflow_id={workflow_id}, target_id={target_id}, "
            f"target_name={target_name or 'auto-discover'}, max_iterations={max_iterations}, "
            f"timeout_seconds={timeout_seconds}, sanitizer={sanitizer})"
        )

        results = {
            "workflow_id": workflow_id,
            "target_id": target_id,
            "status": "running",
            "steps": []
        }

        try:
            # Get run ID for workspace isolation
            run_id = workflow.info().run_id

            # Step 1: Download user's Rust project from MinIO
            workflow.logger.info("Step 1: Downloading user code from MinIO")
            target_path = await workflow.execute_activity(
                "get_target",
                args=[target_id, run_id, "isolated"],  # target_id, run_id, workspace_isolation
                start_to_close_timeout=timedelta(minutes=5),
                retry_policy=RetryPolicy(
                    initial_interval=timedelta(seconds=1),
                    maximum_interval=timedelta(seconds=30),
                    maximum_attempts=3
                )
            )
            results["steps"].append({
                "step": "download_target",
                "status": "success",
                "target_path": target_path
            })
            workflow.logger.info(f"✓ User code downloaded to: {target_path}")

            # Step 2: Run cargo-fuzz
            workflow.logger.info("Step 2: Running cargo-fuzz")

            # Use defaults if parameters are None
            actual_max_iterations = max_iterations if max_iterations is not None else 1000000
            actual_timeout_seconds = timeout_seconds if timeout_seconds is not None else 1800
            actual_sanitizer = sanitizer if sanitizer is not None else "address"

            fuzz_config = {
                "target_name": target_name,
                "max_iterations": actual_max_iterations,
                "timeout_seconds": actual_timeout_seconds,
                "sanitizer": actual_sanitizer
            }

            fuzz_results = await workflow.execute_activity(
                "fuzz_with_cargo",
                args=[target_path, fuzz_config],
                start_to_close_timeout=timedelta(seconds=actual_timeout_seconds + 120),
                retry_policy=RetryPolicy(
                    initial_interval=timedelta(seconds=2),
                    maximum_interval=timedelta(seconds=60),
                    maximum_attempts=1  # Fuzzing shouldn't retry
                )
            )

            results["steps"].append({
                "step": "fuzzing",
                "status": "success",
                "executions": fuzz_results.get("summary", {}).get("total_executions", 0),
                "crashes": fuzz_results.get("summary", {}).get("crashes_found", 0)
            })
            workflow.logger.info(
                f"✓ Fuzzing completed: "
                f"{fuzz_results.get('summary', {}).get('total_executions', 0)} executions, "
                f"{fuzz_results.get('summary', {}).get('crashes_found', 0)} crashes"
            )

            # Step 3: Upload results to MinIO
            workflow.logger.info("Step 3: Uploading results")
            try:
                results_url = await workflow.execute_activity(
                    "upload_results",
                    args=[workflow_id, fuzz_results, "json"],
                    start_to_close_timeout=timedelta(minutes=2)
                )
                results["results_url"] = results_url
                workflow.logger.info(f"✓ Results uploaded to: {results_url}")
            except Exception as e:
                workflow.logger.warning(f"Failed to upload results: {e}")
                results["results_url"] = None

            # Step 4: Cleanup cache
            workflow.logger.info("Step 4: Cleaning up cache")
            try:
                await workflow.execute_activity(
                    "cleanup_cache",
                    args=[target_path, "isolated"],  # target_path, workspace_isolation
                    start_to_close_timeout=timedelta(minutes=1)
                )
                workflow.logger.info("✓ Cache cleaned up")
            except Exception as e:
                workflow.logger.warning(f"Cache cleanup failed: {e}")

            # Mark workflow as successful
            results["status"] = "success"
            results["findings"] = fuzz_results.get("findings", [])
            results["summary"] = fuzz_results.get("summary", {})
            results["sarif"] = fuzz_results.get("sarif") or {}
            workflow.logger.info(
                f"✓ Workflow completed successfully: {workflow_id} "
                f"({results['summary'].get('crashes_found', 0)} crashes found)"
            )

            return results

        except Exception as e:
            workflow.logger.error(f"Workflow failed: {e}")
            results["status"] = "error"
            results["error"] = str(e)
            results["steps"].append({
                "step": "error",
                "status": "failed",
                "error": str(e)
            })
            raise
@@ -1,12 +0,0 @@
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
@@ -1,47 +0,0 @@
# Secret Detection Workflow Dockerfile
FROM prefecthq/prefect:3-python3.11

# Install system dependencies
RUN apt-get update && apt-get install -y \
    curl \
    wget \
    git \
    ca-certificates \
    gnupg \
    && rm -rf /var/lib/apt/lists/*

# Install TruffleHog (use direct binary download to avoid install script issues)
RUN curl -sSfL "https://github.com/trufflesecurity/trufflehog/releases/download/v3.63.2/trufflehog_3.63.2_linux_amd64.tar.gz" -o trufflehog.tar.gz \
    && tar -xzf trufflehog.tar.gz \
    && mv trufflehog /usr/local/bin/ \
    && rm trufflehog.tar.gz

# Install Gitleaks (use specific version to avoid API rate limiting)
RUN wget https://github.com/gitleaks/gitleaks/releases/download/v8.18.2/gitleaks_8.18.2_linux_x64.tar.gz \
    && tar -xzf gitleaks_8.18.2_linux_x64.tar.gz \
    && mv gitleaks /usr/local/bin/ \
    && rm gitleaks_8.18.2_linux_x64.tar.gz

# Verify installations
RUN trufflehog --version && gitleaks version

# Set working directory
WORKDIR /opt/prefect

# Create toolbox directory structure
RUN mkdir -p /opt/prefect/toolbox

# Set environment variables
ENV PYTHONPATH=/opt/prefect/toolbox:/opt/prefect/toolbox/workflows
ENV WORKFLOW_NAME=secret_detection_scan

# The toolbox code will be mounted at runtime from the backend container
# This includes:
# - /opt/prefect/toolbox/modules/base.py
# - /opt/prefect/toolbox/modules/secret_detection/ (TruffleHog, Gitleaks modules)
# - /opt/prefect/toolbox/modules/reporter/ (SARIF reporter)
# - /opt/prefect/toolbox/workflows/comprehensive/secret_detection_scan/
VOLUME /opt/prefect/toolbox

# Set working directory for execution
WORKDIR /opt/prefect
@@ -1,58 +0,0 @@
# Secret Detection Workflow Dockerfile - Self-Contained Version
# This version copies all required modules into the image for complete isolation
FROM prefecthq/prefect:3-python3.11

# Install system dependencies
RUN apt-get update && apt-get install -y \
    curl \
    wget \
    git \
    ca-certificates \
    gnupg \
    && rm -rf /var/lib/apt/lists/*

# Install TruffleHog
RUN curl -sSfL https://raw.githubusercontent.com/trufflesecurity/trufflehog/main/scripts/install.sh | sh -s -- -b /usr/local/bin

# Install Gitleaks
RUN wget https://github.com/gitleaks/gitleaks/releases/latest/download/gitleaks_linux_x64.tar.gz \
    && tar -xzf gitleaks_linux_x64.tar.gz \
    && mv gitleaks /usr/local/bin/ \
    && rm gitleaks_linux_x64.tar.gz

# Verify installations
RUN trufflehog --version && gitleaks version

# Set working directory
WORKDIR /opt/prefect

# Create directory structure
RUN mkdir -p /opt/prefect/toolbox/modules/secret_detection \
    /opt/prefect/toolbox/modules/reporter \
    /opt/prefect/toolbox/workflows/comprehensive/secret_detection_scan

# Copy the base module and required modules
COPY toolbox/modules/base.py /opt/prefect/toolbox/modules/base.py
COPY toolbox/modules/__init__.py /opt/prefect/toolbox/modules/__init__.py
COPY toolbox/modules/secret_detection/ /opt/prefect/toolbox/modules/secret_detection/
COPY toolbox/modules/reporter/ /opt/prefect/toolbox/modules/reporter/

# Copy the workflow code
COPY toolbox/workflows/comprehensive/secret_detection_scan/ /opt/prefect/toolbox/workflows/comprehensive/secret_detection_scan/

# Copy toolbox init files
COPY toolbox/__init__.py /opt/prefect/toolbox/__init__.py
COPY toolbox/workflows/__init__.py /opt/prefect/toolbox/workflows/__init__.py
COPY toolbox/workflows/comprehensive/__init__.py /opt/prefect/toolbox/workflows/comprehensive/__init__.py

# Install Python dependencies for the modules
RUN pip install --no-cache-dir \
    pydantic \
    asyncio

# Set environment variables
ENV PYTHONPATH=/opt/prefect/toolbox:/opt/prefect/toolbox/workflows
ENV WORKFLOW_NAME=secret_detection_scan

# Set default command (can be overridden)
CMD ["python", "-m", "toolbox.workflows.comprehensive.secret_detection_scan.workflow"]
@@ -1,130 +0,0 @@
# Secret Detection Scan Workflow

This workflow performs comprehensive secret detection using multiple industry-standard tools:

- **TruffleHog**: Comprehensive secret detection with verification capabilities
- **Gitleaks**: Git-specific secret scanning and leak detection

## Features

- **Parallel Execution**: Runs TruffleHog and Gitleaks concurrently for faster results
- **Deduplication**: Automatically removes duplicate findings across tools
- **SARIF Output**: Generates standardized SARIF reports for integration with security tools
- **Configurable**: Supports extensive configuration for both tools
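
The deduplication feature can be sketched as follows. This is a minimal illustration, assuming findings are plain dictionaries and that two findings count as duplicates when they share a file path, a start line, and the first 50 characters of a lowercased title. `dedupe_findings` is a hypothetical helper name, not part of the toolbox API.

```python
from typing import Any, Dict, List


def dedupe_findings(findings: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep the first finding seen for each (file, line, title-prefix) signature."""
    seen = set()
    unique = []
    for finding in findings:
        # Assumed signature: same file, same start line, same title prefix.
        signature = (
            finding.get("file_path", ""),
            finding.get("line_start", 0),
            finding.get("title", "").lower()[:50],
        )
        if signature not in seen:
            seen.add(signature)
            unique.append(finding)
    return unique


# Illustrative input: the first two entries differ only in title casing.
sample_findings = [
    {"file_path": "app/.env", "line_start": 3, "title": "AWS Access Key"},
    {"file_path": "app/.env", "line_start": 3, "title": "aws access key"},
    {"file_path": "app/config.py", "line_start": 10, "title": "Generic API Key"},
]
unique_findings = dedupe_findings(sample_findings)
```

Because the first occurrence wins, the order in which tool results are concatenated decides which tool's version of a duplicate is kept.
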
## Dependencies

### Required Modules
- `toolbox.modules.secret_detection.trufflehog`
- `toolbox.modules.secret_detection.gitleaks`
- `toolbox.modules.reporter` (SARIF reporter)
- `toolbox.modules.base` (Base module interface)

### External Tools
- TruffleHog v3.63.2+
- Gitleaks v8.18.0+

## Docker Deployment

This workflow provides two Docker deployment approaches:

### 1. Volume-Based Approach (Default: `Dockerfile`)

**Advantages:**
- Live code updates without rebuilding images
- Smaller image sizes
- Consistent module versions across workflows
- Faster development iteration

**How it works:**
- Docker image contains only external tools (TruffleHog, Gitleaks)
- Python modules are mounted at runtime from the backend container
- Backend manages code synchronization via shared volumes

### 2. Self-Contained Approach (`Dockerfile.self-contained`)

**Advantages:**
- Complete isolation and reproducibility
- No runtime dependencies on backend code
- Can run independently of FuzzForge platform
- Better for CI/CD integration

**How it works:**
- All required Python modules are copied into the Docker image
- Image is completely self-contained
- Larger image size but fully portable

## Configuration

### TruffleHog Configuration

```json
{
  "trufflehog_config": {
    "verify": true,              // Verify discovered secrets
    "concurrency": 10,           // Number of concurrent workers
    "max_depth": 10,             // Maximum directory depth
    "include_detectors": [],     // Specific detectors to include
    "exclude_detectors": []      // Specific detectors to exclude
  }
}
```

### Gitleaks Configuration

```json
{
  "gitleaks_config": {
    "scan_mode": "detect",           // "detect" or "protect"
    "redact": true,                  // Redact secrets in output
    "max_target_megabytes": 100,     // Maximum file size (MB)
    "no_git": false,                 // Scan without Git context
    "config_file": "",               // Custom Gitleaks config
    "baseline_file": ""              // Baseline file for known findings
  }
}
```
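
Options left out of a request can be filled in from defaults before the tools run. The sketch below assumes a shallow dictionary merge in which caller-supplied keys win. The default values shown are illustrative and `merge_config` is a hypothetical helper name.

```python
from typing import Any, Dict, Optional

# Illustrative defaults for the Gitleaks options listed above (assumed values).
DEFAULT_GITLEAKS_CONFIG: Dict[str, Any] = {
    "scan_mode": "detect",
    "redact": True,
    "max_target_megabytes": 100,
    "no_git": True,
}


def merge_config(
    defaults: Dict[str, Any], overrides: Optional[Dict[str, Any]]
) -> Dict[str, Any]:
    """Return a new dict with defaults first and caller overrides on top."""
    return {**defaults, **(overrides or {})}


# Only the file-size limit is overridden; every other key keeps its default.
effective_config = merge_config(DEFAULT_GITLEAKS_CONFIG, {"max_target_megabytes": 200})
```

A shallow merge is enough here because the options are flat. Nested option groups would need a recursive merge instead.
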
## Usage Example

```bash
curl -X POST "http://localhost:8000/workflows/secret_detection_scan/submit" \
  -H "Content-Type: application/json" \
  -d '{
    "target_path": "/path/to/scan",
    "volume_mode": "ro",
    "parameters": {
      "trufflehog_config": {
        "verify": true,
        "concurrency": 15
      },
      "gitleaks_config": {
        "scan_mode": "detect",
        "max_target_megabytes": 200
      }
    }
  }'
```
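
The same submission can be prepared from Python. This is a sketch using only the standard library; `build_submit_request` is a hypothetical helper, and the endpoint and payload simply mirror the curl call above. The request is built but not sent, so the snippet runs without a backend.

```python
import json
import urllib.request


def build_submit_request(base_url: str, target_path: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for a secret detection scan."""
    payload = {
        "target_path": target_path,
        "volume_mode": "ro",
        "parameters": {
            "trufflehog_config": {"verify": True, "concurrency": 15},
            "gitleaks_config": {"scan_mode": "detect", "max_target_megabytes": 200},
        },
    }
    return urllib.request.Request(
        url=f"{base_url}/workflows/secret_detection_scan/submit",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


submit_request = build_submit_request("http://localhost:8000", "/path/to/scan")
# To actually submit, pass the request to urllib.request.urlopen(submit_request).
```
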
## Output Format

The workflow generates a SARIF report containing:
- All unique findings from both tools
- Severity levels mapped to standard scale
- File locations and line numbers
- Detailed descriptions and recommendations
- Tool-specific metadata
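
To illustrate the report shape, the sketch below converts one finding into a SARIF 2.1.0 result object. The severity-to-level mapping and the `to_sarif_result` helper are assumptions made for illustration; the SARIF reporter module may map fields differently.

```python
from typing import Any, Dict

# Assumed mapping from finding severity to SARIF result levels.
SEVERITY_TO_LEVEL = {
    "critical": "error",
    "high": "error",
    "medium": "warning",
    "low": "note",
    "info": "note",
}


def to_sarif_result(finding: Dict[str, Any]) -> Dict[str, Any]:
    """Convert a finding dictionary into a SARIF result object."""
    result: Dict[str, Any] = {
        "ruleId": finding.get("category", "unknown"),
        "level": SEVERITY_TO_LEVEL.get(finding.get("severity", ""), "warning"),
        "message": {"text": finding.get("description", "")},
        "locations": [],
    }
    if finding.get("file_path"):
        # Fall back to line 1 when the tool reported no line number.
        start_line = finding.get("line_start") or 1
        result["locations"].append({
            "physicalLocation": {
                "artifactLocation": {"uri": finding["file_path"]},
                "region": {
                    "startLine": start_line,
                    "endLine": finding.get("line_end") or start_line,
                },
            }
        })
    return result


example_result = to_sarif_result({
    "category": "hardcoded_secret",
    "severity": "high",
    "description": "Possible API key committed to the repository",
    "file_path": "app/config.py",
    "line_start": 10,
})
```
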
## Performance Considerations

- **TruffleHog**: CPU-intensive with verification enabled
- **Gitleaks**: Memory-intensive for large repositories
- **Recommended Resources**: 512Mi memory, 500m CPU
- **Typical Runtime**: 1-5 minutes for small repos, 10-30 minutes for large ones

## Security Notes

- Secrets are redacted in output by default
- Verified secrets are marked with higher severity
- Both tools support custom rules and exclusions
- Consider using baseline files for known false positives
@@ -1,17 +0,0 @@
"""
Secret Detection Scan Workflow

This package contains the comprehensive secret detection workflow that combines
multiple secret detection tools for thorough analysis.
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
@@ -1,113 +0,0 @@
name: secret_detection_scan
version: "2.0.0"
description: "Comprehensive secret detection using TruffleHog and Gitleaks"
author: "FuzzForge Team"
category: "comprehensive"
tags:
  - "secrets"
  - "credentials"
  - "detection"
  - "trufflehog"
  - "gitleaks"
  - "comprehensive"

supported_volume_modes:
  - "ro"
  - "rw"

default_volume_mode: "ro"
default_target_path: "/workspace"

requirements:
  tools:
    - "trufflehog"
    - "gitleaks"
  resources:
    memory: "512Mi"
    cpu: "500m"
    timeout: 1800

has_docker: true

default_parameters:
  target_path: "/workspace"
  volume_mode: "ro"
  trufflehog_config: {}
  gitleaks_config: {}
  reporter_config: {}

parameters:
  type: object
  properties:
    target_path:
      type: string
      default: "/workspace"
      description: "Path to analyze"
    volume_mode:
      type: string
      enum: ["ro", "rw"]
      default: "ro"
      description: "Volume mount mode"
    trufflehog_config:
      type: object
      description: "TruffleHog configuration"
      properties:
        verify:
          type: boolean
          description: "Verify discovered secrets"
        concurrency:
          type: integer
          description: "Number of concurrent workers"
        max_depth:
          type: integer
          description: "Maximum directory depth to scan"
        include_detectors:
          type: array
          items:
            type: string
          description: "Specific detectors to include"
        exclude_detectors:
          type: array
          items:
            type: string
          description: "Specific detectors to exclude"
    gitleaks_config:
      type: object
      description: "Gitleaks configuration"
      properties:
        scan_mode:
          type: string
          enum: ["detect", "protect"]
          description: "Scan mode"
        redact:
          type: boolean
          description: "Redact secrets in output"
        max_target_megabytes:
          type: integer
          description: "Maximum file size to scan (MB)"
        no_git:
          type: boolean
          description: "Scan files without Git context"
        config_file:
          type: string
          description: "Path to custom configuration file"
        baseline_file:
          type: string
          description: "Path to baseline file"
    reporter_config:
      type: object
      description: "SARIF reporter configuration"
      properties:
        output_file:
          type: string
          description: "Output SARIF file name"
        include_code_flows:
          type: boolean
          description: "Include code flow information"

output_schema:
  type: object
  properties:
    sarif:
      type: object
      description: "SARIF-formatted security findings"
@@ -1,290 +0,0 @@
"""
Secret Detection Scan Workflow

This workflow performs comprehensive secret detection using multiple tools:
- TruffleHog: Comprehensive secret detection with verification
- Gitleaks: Git-specific secret scanning
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.


import sys
import logging
from pathlib import Path
from typing import Dict, Any, List, Optional
from prefect import flow, task
from prefect.artifacts import create_markdown_artifact, create_table_artifact
import asyncio
import json

# Add modules to path
sys.path.insert(0, '/app')

# Import modules
from toolbox.modules.secret_detection.trufflehog import TruffleHogModule
from toolbox.modules.secret_detection.gitleaks import GitleaksModule
from toolbox.modules.reporter import SARIFReporter

# Configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


@task(name="trufflehog_scan")
async def run_trufflehog_task(workspace: Path, config: Dict[str, Any]) -> Dict[str, Any]:
    """
    Task to run TruffleHog secret detection.

    Args:
        workspace: Path to the workspace
        config: TruffleHog configuration

    Returns:
        TruffleHog results
    """
    logger.info("Running TruffleHog secret detection")
    module = TruffleHogModule()
    result = await module.execute(config, workspace)
    logger.info(f"TruffleHog completed: {result.summary.get('total_secrets', 0)} secrets found")
    return result.dict()


@task(name="gitleaks_scan")
async def run_gitleaks_task(workspace: Path, config: Dict[str, Any]) -> Dict[str, Any]:
    """
    Task to run Gitleaks secret detection.

    Args:
        workspace: Path to the workspace
        config: Gitleaks configuration

    Returns:
        Gitleaks results
    """
    logger.info("Running Gitleaks secret detection")
    module = GitleaksModule()
    result = await module.execute(config, workspace)
    logger.info(f"Gitleaks completed: {result.summary.get('total_leaks', 0)} leaks found")
    return result.dict()


@task(name="aggregate_findings")
async def aggregate_findings_task(
    trufflehog_results: Dict[str, Any],
    gitleaks_results: Dict[str, Any],
    config: Dict[str, Any],
    workspace: Path
) -> Dict[str, Any]:
    """
    Task to aggregate findings from all secret detection tools.

    Args:
        trufflehog_results: Results from TruffleHog
        gitleaks_results: Results from Gitleaks
        config: Reporter configuration
        workspace: Path to workspace

    Returns:
        Aggregated SARIF report
    """
    logger.info("Aggregating secret detection findings")

    # Combine all findings
    all_findings = []

    # Add TruffleHog findings
    trufflehog_findings = trufflehog_results.get("findings", [])
    all_findings.extend(trufflehog_findings)

    # Add Gitleaks findings
    gitleaks_findings = gitleaks_results.get("findings", [])
    all_findings.extend(gitleaks_findings)

    # Deduplicate findings based on file path and line number
    unique_findings = []
    seen_signatures = set()

    for finding in all_findings:
        # Create signature for deduplication
        signature = (
            finding.get("file_path", ""),
            finding.get("line_start", 0),
            finding.get("title", "").lower()[:50]  # First 50 chars of title
        )

        if signature not in seen_signatures:
            seen_signatures.add(signature)
            unique_findings.append(finding)
        else:
            logger.debug(f"Deduplicated finding: {signature}")

    logger.info(f"Aggregated {len(unique_findings)} unique findings from {len(all_findings)} total")

    # Generate SARIF report
    reporter = SARIFReporter()
    reporter_config = {
        **config,
        "findings": unique_findings,
        "tool_name": "FuzzForge Secret Detection",
        "tool_version": "1.0.0",
        "tool_description": "Comprehensive secret detection using TruffleHog and Gitleaks"
    }

    result = await reporter.execute(reporter_config, workspace)
    return result.dict().get("sarif", {})


@flow(name="secret_detection_scan", log_prints=True)
async def main_flow(
    target_path: str = "/workspace",
    volume_mode: str = "ro",
    trufflehog_config: Optional[Dict[str, Any]] = None,
    gitleaks_config: Optional[Dict[str, Any]] = None,
    reporter_config: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """
    Main secret detection workflow.

    This workflow:
    1. Runs TruffleHog for comprehensive secret detection
    2. Runs Gitleaks for Git-specific secret detection
    3. Aggregates and deduplicates findings
    4. Generates a unified SARIF report

    Args:
        target_path: Path to the mounted workspace (default: /workspace)
        volume_mode: Volume mount mode (ro/rw)
        trufflehog_config: Configuration for TruffleHog
        gitleaks_config: Configuration for Gitleaks
        reporter_config: Configuration for SARIF reporter

    Returns:
        SARIF-formatted findings report
    """
    logger.info("Starting comprehensive secret detection workflow")
    logger.info(f"Workspace: {target_path}, Mode: {volume_mode}")

    # Set workspace path
    workspace = Path(target_path)

    if not workspace.exists():
        logger.error(f"Workspace does not exist: {workspace}")
        return {
            "error": f"Workspace not found: {workspace}",
            "sarif": None
        }

    # Default configurations - merge with provided configs to ensure defaults are always applied
    default_trufflehog_config = {
        "verify": False,
        "concurrency": 10,
        "max_depth": 10,
        "no_git": True  # Add no_git for filesystem scanning
    }
    trufflehog_config = {**default_trufflehog_config, **(trufflehog_config or {})}

    default_gitleaks_config = {
        "scan_mode": "detect",
        "redact": True,
        "max_target_megabytes": 100,
        "no_git": True  # Critical for non-git directories
    }
    gitleaks_config = {**default_gitleaks_config, **(gitleaks_config or {})}

    default_reporter_config = {
        "include_code_flows": False
    }
    reporter_config = {**default_reporter_config, **(reporter_config or {})}

    try:
        # Run secret detection tools in parallel
        logger.info("Phase 1: Running secret detection tools")

        # Create tasks for parallel execution
        trufflehog_task_result = run_trufflehog_task(workspace, trufflehog_config)
        gitleaks_task_result = run_gitleaks_task(workspace, gitleaks_config)

        # Wait for both to complete
        trufflehog_results, gitleaks_results = await asyncio.gather(
            trufflehog_task_result,
            gitleaks_task_result,
            return_exceptions=True
        )

        # Handle any exceptions
        if isinstance(trufflehog_results, Exception):
            logger.error(f"TruffleHog failed: {trufflehog_results}")
            trufflehog_results = {"findings": [], "status": "failed"}

        if isinstance(gitleaks_results, Exception):
            logger.error(f"Gitleaks failed: {gitleaks_results}")
            gitleaks_results = {"findings": [], "status": "failed"}

        # Aggregate findings
        logger.info("Phase 2: Aggregating findings")
        sarif_report = await aggregate_findings_task(
            trufflehog_results,
            gitleaks_results,
            reporter_config,
            workspace
        )

        # Log summary
        if sarif_report and "runs" in sarif_report:
            results_count = len(sarif_report["runs"][0].get("results", []))
            logger.info(f"Workflow completed successfully with {results_count} unique secret findings")

            # Log tool-specific stats
            trufflehog_count = len(trufflehog_results.get("findings", []))
            gitleaks_count = len(gitleaks_results.get("findings", []))
            logger.info(f"Tool results - TruffleHog: {trufflehog_count}, Gitleaks: {gitleaks_count}")
|
||||
else:
|
||||
logger.info("Workflow completed successfully with no findings")
|
||||
|
||||
return sarif_report
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Secret detection workflow failed: {e}")
|
||||
# Return error in SARIF format
|
||||
return {
|
||||
"$schema": "https://raw.githubusercontent.com/oasis-tcs/sarif-spec/master/Schemata/sarif-schema-2.1.0.json",
|
||||
"version": "2.1.0",
|
||||
"runs": [
|
||||
{
|
||||
"tool": {
|
||||
"driver": {
|
||||
"name": "FuzzForge Secret Detection",
|
||||
"version": "1.0.0"
|
||||
}
|
||||
},
|
||||
"results": [],
|
||||
"invocations": [
|
||||
{
|
||||
"executionSuccessful": False,
|
||||
"exitCode": 1,
|
||||
"exitCodeDescription": str(e)
|
||||
}
|
||||
]
|
||||
}
|
||||
]
|
||||
}
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
# For local testing
|
||||
import asyncio
|
||||
|
||||
asyncio.run(main_flow(
|
||||
target_path="/tmp/test",
|
||||
trufflehog_config={"verify": True, "max_depth": 5},
|
||||
gitleaks_config={"scan_mode": "detect"}
|
||||
))
|
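The default-merging step in the flow above can be illustrated in isolation. This is only a minimal sketch: the helper name `merge_config` is hypothetical and does not appear in the commit, and the default values are copied from `default_trufflehog_config` above.

```python
from typing import Any, Dict, Optional

# Defaults copied from the workflow above.
DEFAULT_TRUFFLEHOG_CONFIG: Dict[str, Any] = {
    "verify": False,
    "concurrency": 10,
    "max_depth": 10,
    "no_git": True,
}


def merge_config(
    defaults: Dict[str, Any], overrides: Optional[Dict[str, Any]]
) -> Dict[str, Any]:
    """Return a new dict in which caller-supplied keys win over defaults.

    `merge_config` is a hypothetical helper; the workflow inlines the same
    expression. `overrides or {}` makes a missing (None) config safe.
    """
    return {**defaults, **(overrides or {})}


# Same overrides as the local-testing call at the bottom of the file.
merged = merge_config(DEFAULT_TRUFFLEHOG_CONFIG, {"verify": True, "max_depth": 5})
print(merged)
```

Because the merge builds a new dictionary, the defaults are never mutated, so repeated runs start from the same baseline.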
||||
@@ -0,0 +1,113 @@
|
||||
name: ossfuzz_campaign
|
||||
version: "1.0.0"
|
||||
vertical: ossfuzz
|
||||
description: "Generic OSS-Fuzz fuzzing campaign. Automatically reads the project configuration from the OSS-Fuzz repo and runs fuzzing using Google's infrastructure."
|
||||
author: "FuzzForge Team"
|
||||
tags:
|
||||
- "fuzzing"
|
||||
- "oss-fuzz"
|
||||
- "libfuzzer"
|
||||
- "afl"
|
||||
- "honggfuzz"
|
||||
- "memory-safety"
|
||||
- "security"
|
||||
|
||||
# Workspace isolation mode
|
||||
# OSS-Fuzz campaigns use isolated mode for safe concurrent campaigns
|
||||
workspace_isolation: "isolated"
|
||||
|
||||
default_parameters:
|
||||
project_name: null
|
||||
campaign_duration_hours: 1
|
||||
override_engine: null
|
||||
override_sanitizer: null
|
||||
max_iterations: null
|
||||
|
||||
parameters:
|
||||
type: object
|
||||
required:
|
||||
- project_name
|
||||
properties:
|
||||
project_name:
|
||||
type: string
|
||||
description: "OSS-Fuzz project name (e.g., 'curl', 'sqlite3', 'libxml2')"
|
||||
examples:
|
||||
- "curl"
|
||||
- "sqlite3"
|
||||
- "libxml2"
|
||||
- "openssl"
|
||||
- "zlib"
|
||||
|
||||
campaign_duration_hours:
|
||||
type: integer
|
||||
default: 1
|
||||
minimum: 1
|
||||
maximum: 168 # 1 week max
|
||||
description: "How many hours to run the fuzzing campaign"
|
||||
|
||||
override_engine:
|
||||
type: string
|
||||
enum: ["libfuzzer", "afl", "honggfuzz"]
|
||||
description: "Override fuzzing engine from project.yaml (optional)"
|
||||
|
||||
override_sanitizer:
|
||||
type: string
|
||||
enum: ["address", "memory", "undefined", "dataflow"]
|
||||
description: "Override sanitizer from project.yaml (optional)"
|
||||
|
||||
max_iterations:
|
||||
type: integer
|
||||
minimum: 1000
|
||||
description: "Limit on fuzzing iterations (optional)"
|
||||
|
||||
output_schema:
|
||||
type: object
|
||||
properties:
|
||||
project_name:
|
||||
type: string
|
||||
description: "OSS-Fuzz project that was fuzzed"
|
||||
|
||||
summary:
|
||||
type: object
|
||||
description: "Campaign execution summary"
|
||||
properties:
|
||||
total_executions:
|
||||
type: integer
|
||||
crashes_found:
|
||||
type: integer
|
||||
unique_crashes:
|
||||
type: integer
|
||||
duration_hours:
|
||||
type: number
|
||||
engine_used:
|
||||
type: string
|
||||
sanitizer_used:
|
||||
type: string
|
||||
|
||||
crashes:
|
||||
type: array
|
||||
description: "List of crash file paths"
|
||||
items:
|
||||
type: string
|
||||
|
||||
sarif:
|
||||
type: object
|
||||
description: "SARIF-formatted crash reports (future)"
|
||||
|
||||
examples:
|
||||
- name: "Fuzz curl for 1 hour"
|
||||
parameters:
|
||||
project_name: "curl"
|
||||
campaign_duration_hours: 1
|
||||
|
||||
- name: "Fuzz sqlite3 with AFL"
|
||||
parameters:
|
||||
project_name: "sqlite3"
|
||||
campaign_duration_hours: 2
|
||||
override_engine: "afl"
|
||||
|
||||
- name: "Fuzz libxml2 with memory sanitizer"
|
||||
parameters:
|
||||
project_name: "libxml2"
|
||||
campaign_duration_hours: 6
|
||||
override_sanitizer: "memory"
|
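The constraints declared in the `parameters` schema above can be checked with a few lines of plain Python. This is a sketch under assumptions: `validate_parameters` is a hypothetical helper, not part of FuzzForge, and the commit does not show how the backend actually validates this schema. The non-empty check on `project_name` is also an added assumption; the schema only requires a string.

```python
from typing import Any, Dict, List

# Allowed values copied from the metadata above.
ENGINES = {"libfuzzer", "afl", "honggfuzz"}
SANITIZERS = {"address", "memory", "undefined", "dataflow"}


def validate_parameters(params: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable errors (empty when params are valid).

    Hypothetical helper: mirrors the schema constraints by hand instead of
    using a JSON Schema library.
    """
    errors: List[str] = []

    name = params.get("project_name")
    if not isinstance(name, str) or not name:
        errors.append("project_name is required and must be a non-empty string")

    hours = params.get("campaign_duration_hours", 1)
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(hours, int) or isinstance(hours, bool) or not 1 <= hours <= 168:
        errors.append("campaign_duration_hours must be an integer in [1, 168]")

    engine = params.get("override_engine")
    if engine is not None and engine not in ENGINES:
        errors.append(f"override_engine must be one of {sorted(ENGINES)}")

    sanitizer = params.get("override_sanitizer")
    if sanitizer is not None and sanitizer not in SANITIZERS:
        errors.append(f"override_sanitizer must be one of {sorted(SANITIZERS)}")

    max_iterations = params.get("max_iterations")
    if max_iterations is not None and (
        not isinstance(max_iterations, int)
        or isinstance(max_iterations, bool)
        or max_iterations < 1000
    ):
        errors.append("max_iterations must be an integer >= 1000")

    return errors


# The second example from the metadata file passes validation.
print(validate_parameters(
    {"project_name": "sqlite3", "campaign_duration_hours": 2, "override_engine": "afl"}
))
```

Returning a list of errors rather than raising on the first one lets a caller report every problem in a single response.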
||||
@@ -0,0 +1,219 @@
|
||||
"""
|
||||
OSS-Fuzz Campaign Workflow - Temporal Version
|
||||
|
||||
Generic workflow for running OSS-Fuzz campaigns using Google's infrastructure.
|
||||
Automatically reads project configuration from OSS-Fuzz project.yaml files.
|
||||
"""
|
||||
|
||||
import asyncio
|
||||
from datetime import timedelta
|
||||
from typing import Dict, Any, Optional
|
||||
|
||||
from temporalio import workflow
|
||||
from temporalio.common import RetryPolicy
|
||||
|
||||
# Import for type hints (will be executed by worker)
|
||||
with workflow.unsafe.imports_passed_through():
|
||||
import logging
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
@workflow.defn
|
||||
class OssfuzzCampaignWorkflow:
|
||||
"""
|
||||
Generic OSS-Fuzz fuzzing campaign workflow.
|
||||
|
||||
User workflow:
|
||||
1. User runs: ff workflow run ossfuzz_campaign . project_name=curl
|
||||
2. Worker loads project config from OSS-Fuzz repo
|
||||
3. Worker builds project using OSS-Fuzz's build system
|
||||
4. Worker runs fuzzing with engines from project.yaml
|
||||
5. Crashes and corpus are reported as findings
|
||||
"""
|
||||
|
||||
@workflow.run
|
||||
async def run(
|
||||
self,
|
||||
target_id: str, # Required by FuzzForge (not used, OSS-Fuzz downloads from Google)
|
||||
project_name: str, # Required: OSS-Fuzz project name (e.g., "curl", "sqlite3")
|
||||
campaign_duration_hours: int = 1,
|
||||
override_engine: Optional[str] = None, # Override engine from project.yaml
|
||||
override_sanitizer: Optional[str] = None, # Override sanitizer from project.yaml
|
||||
max_iterations: Optional[int] = None # Optional: limit fuzzing iterations
|
||||
) -> Dict[str, Any]:
|
||||
"""
|
||||
Main workflow execution.
|
||||
|
||||
Args:
|
||||
target_id: UUID of uploaded target (not used, required by FuzzForge)
|
||||
project_name: Name of OSS-Fuzz project (e.g., "curl", "sqlite3", "libxml2")
|
||||
campaign_duration_hours: How many hours to fuzz (default: 1)
|
||||
override_engine: Override fuzzing engine from project.yaml
|
||||
override_sanitizer: Override sanitizer from project.yaml
|
||||
max_iterations: Optional limit on fuzzing iterations
|
||||
|
||||
Returns:
|
||||
Dictionary containing crashes, stats, and SARIF report
|
||||
"""
|
||||
workflow_id = workflow.info().workflow_id
|
||||
|
||||
workflow.logger.info(
|
||||
f"Starting OSS-Fuzz Campaign for project '{project_name}' "
|
||||
f"(workflow_id={workflow_id}, duration={campaign_duration_hours}h)"
|
||||
)
|
||||
|
||||
results = {
|
||||
"workflow_id": workflow_id,
|
||||
"project_name": project_name,
|
||||
"status": "running",
|
||||
"steps": []
|
||||
}
|
||||
|
||||
try:
|
||||
# Step 1: Load OSS-Fuzz project configuration
|
||||
workflow.logger.info(f"Step 1: Loading project config for '{project_name}'")
|
||||
project_config = await workflow.execute_activity(
|
||||
"load_ossfuzz_project",
|
||||
args=[project_name],
|
||||
start_to_close_timeout=timedelta(minutes=5),
|
||||
retry_policy=RetryPolicy(
|
||||
initial_interval=timedelta(seconds=1),
|
||||
maximum_interval=timedelta(seconds=30),
|
||||
maximum_attempts=3
|
||||
)
|
||||
)
|
||||
|
||||
results["steps"].append({
|
||||
"step": "load_config",
|
||||
"status": "success",
|
||||
"language": project_config.get("language"),
|
||||
"engines": project_config.get("fuzzing_engines", []),
|
||||
"sanitizers": project_config.get("sanitizers", [])
|
||||
})
|
||||
|
||||
workflow.logger.info(
|
||||
f"✓ Loaded config: language={project_config.get('language')}, "
|
||||
f"engines={project_config.get('fuzzing_engines')}"
|
||||
)
|
||||
|
||||
# Step 2: Build project using OSS-Fuzz infrastructure
|
||||
workflow.logger.info(f"Step 2: Building project '{project_name}'")
|
||||
|
||||
build_result = await workflow.execute_activity(
|
||||
"build_ossfuzz_project",
|
||||
args=[
|
||||
project_name,
|
||||
project_config,
|
||||
override_sanitizer,
|
||||
override_engine
|
||||
],
|
||||
start_to_close_timeout=timedelta(minutes=30),
|
||||
retry_policy=RetryPolicy(
|
||||
initial_interval=timedelta(seconds=2),
|
||||
maximum_interval=timedelta(seconds=60),
|
||||
maximum_attempts=2
|
||||
)
|
||||
)
|
||||
|
||||
results["steps"].append({
|
||||
"step": "build_project",
|
||||
"status": "success",
|
||||
"fuzz_targets": len(build_result.get("fuzz_targets", [])),
|
||||
"sanitizer": build_result.get("sanitizer_used"),
|
||||
"engine": build_result.get("engine_used")
|
||||
})
|
||||
|
||||
workflow.logger.info(
|
||||
f"✓ Build completed: {len(build_result.get('fuzz_targets', []))} fuzz targets found"
|
||||
)
|
||||
|
||||
if not build_result.get("fuzz_targets"):
|
||||
raise Exception(f"No fuzz targets found for project {project_name}")
|
||||
|
||||
# Step 3: Run fuzzing on discovered targets
|
||||
workflow.logger.info(f"Step 3: Fuzzing {len(build_result['fuzz_targets'])} targets")
|
||||
|
||||
# Determine which engine to use
|
||||
engine_to_use = override_engine if override_engine else build_result["engine_used"]
|
||||
duration_seconds = campaign_duration_hours * 3600
|
||||
|
||||
# Fuzz each target (in parallel if multiple targets)
|
||||
fuzz_futures = []
|
||||
for target_path in build_result["fuzz_targets"]:
|
||||
future = workflow.execute_activity(
|
||||
"fuzz_target",
|
||||
args=[target_path, engine_to_use, duration_seconds, None, None],
|
||||
start_to_close_timeout=timedelta(seconds=duration_seconds + 300),
|
||||
retry_policy=RetryPolicy(
|
||||
initial_interval=timedelta(seconds=2),
|
||||
maximum_interval=timedelta(seconds=60),
|
||||
maximum_attempts=1 # Fuzzing shouldn't retry
|
||||
)
|
||||
)
|
||||
fuzz_futures.append(future)
|
||||
|
||||
# Wait for all fuzzing to complete
|
||||
fuzz_results = await asyncio.gather(*fuzz_futures, return_exceptions=True)
|
||||
|
||||
# Aggregate results
|
||||
total_execs = 0
|
||||
total_crashes = 0
|
||||
all_crashes = []
|
||||
|
||||
for i, result in enumerate(fuzz_results):
|
||||
if isinstance(result, Exception):
|
||||
workflow.logger.error(f"Fuzzing failed for target {i}: {result}")
|
||||
continue
|
||||
|
||||
total_execs += result.get("total_executions", 0)
|
||||
total_crashes += result.get("crashes", 0)
|
||||
all_crashes.extend(result.get("crash_files", []))
|
||||
|
||||
results["steps"].append({
|
||||
"step": "fuzzing",
|
||||
"status": "success",
|
||||
"total_executions": total_execs,
|
||||
"crashes_found": total_crashes,
|
||||
"targets_fuzzed": len(build_result["fuzz_targets"])
|
||||
})
|
||||
|
||||
workflow.logger.info(
|
||||
f"✓ Fuzzing completed: {total_execs} executions, {total_crashes} crashes"
|
||||
)
|
||||
|
||||
# Step 4: Generate SARIF report
|
||||
workflow.logger.info("Step 4: Generating SARIF report")
|
||||
|
||||
# TODO: Implement crash minimization and SARIF generation
|
||||
# For now, return raw results
|
||||
|
||||
results["status"] = "success"
|
||||
results["summary"] = {
|
||||
"project": project_name,
|
||||
"total_executions": total_execs,
|
||||
"crashes_found": total_crashes,
|
||||
"unique_crashes": len(set(all_crashes)),
|
||||
"duration_hours": campaign_duration_hours,
|
||||
"engine_used": engine_to_use,
|
||||
"sanitizer_used": build_result.get("sanitizer_used")
|
||||
}
|
||||
results["crashes"] = all_crashes[:100] # Limit to first 100 crashes
|
||||
|
||||
workflow.logger.info(
|
||||
f"✓ Campaign completed: {project_name} - "
|
||||
f"{total_execs} execs, {total_crashes} crashes"
|
||||
)
|
||||
|
||||
return results
|
||||
|
||||
except Exception as e:
|
||||
workflow.logger.error(f"Workflow failed: {e}")
|
||||
results["status"] = "error"
|
||||
results["error"] = str(e)
|
||||
results["steps"].append({
|
||||
"step": "error",
|
||||
"status": "failed",
|
||||
"error": str(e)
|
||||
})
|
||||
raise
|
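The fan-out and aggregation in Step 3 above relies on `asyncio.gather(..., return_exceptions=True)`, so one failing target does not abort the others. The pattern can be tried outside Temporal with plain coroutines. In this sketch, `fake_fuzz_target` and `aggregate` are hypothetical stand-ins for the `fuzz_target` activity and the workflow's aggregation loop, and the `failed_targets` counter is an addition that the workflow above does not track.

```python
import asyncio
from typing import Any, Dict, List


async def fake_fuzz_target(name: str, fail: bool = False) -> Dict[str, Any]:
    """Hypothetical stand-in for the `fuzz_target` activity."""
    if fail:
        raise RuntimeError(f"fuzzer failed to start for {name}")
    return {
        "total_executions": 1000,
        "crashes": 1,
        "crash_files": [f"/out/{name}/crash-1"],
    }


def aggregate(fuzz_results: List[Any]) -> Dict[str, Any]:
    """Sum per-target stats, skipping entries that are exceptions."""
    total_execs = 0
    total_crashes = 0
    all_crashes: List[str] = []
    failed_targets = 0  # assumption: not tracked in the original workflow

    for result in fuzz_results:
        if isinstance(result, BaseException):
            failed_targets += 1
            continue
        total_execs += result.get("total_executions", 0)
        total_crashes += result.get("crashes", 0)
        all_crashes.extend(result.get("crash_files", []))

    return {
        "total_executions": total_execs,
        "crashes_found": total_crashes,
        "unique_crashes": len(set(all_crashes)),
        "failed_targets": failed_targets,
    }


async def main() -> Dict[str, Any]:
    # return_exceptions=True places raised exceptions in the result list
    # instead of propagating the first one to the caller.
    results = await asyncio.gather(
        fake_fuzz_target("target_a"),
        fake_fuzz_target("target_b", fail=True),
        fake_fuzz_target("target_c"),
        return_exceptions=True,
    )
    return aggregate(results)


summary = asyncio.run(main())
print(summary)
```

Counting failed targets separately may be worth considering, since the workflow above records the fuzzing step as "success" even when some targets raised.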
||||
@@ -1,187 +0,0 @@
|
||||
"""
|
||||
Manual Workflow Registry for Prefect Deployment
|
||||
|
||||
This file contains the manual registry of all workflows that can be deployed.
|
||||
Developers MUST add their workflows here after creating them.
|
||||
|
||||
This approach is required because:
|
||||
1. Prefect cannot deploy dynamically imported flows
|
||||
2. Docker deployment needs static flow references
|
||||
3. Explicit registration provides better control and visibility
|
||||
"""
|
||||
|
||||
# Copyright (c) 2025 FuzzingLabs
|
||||
#
|
||||
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
|
||||
# at the root of this repository for details.
|
||||
#
|
||||
# After the Change Date (four years from publication), this version of the
|
||||
# Licensed Work will be made available under the Apache License, Version 2.0.
|
||||
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Additional attribution and requirements are provided in the NOTICE file.
|
||||
|
||||
from typing import Dict, Any, Callable
|
||||
import logging
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
# Import only essential workflows
|
||||
# Import each workflow individually to handle failures gracefully
|
||||
security_assessment_flow = None
|
||||
secret_detection_flow = None
|
||||
|
||||
# Try to import each workflow individually
|
||||
try:
|
||||
from .security_assessment.workflow import main_flow as security_assessment_flow
|
||||
except ImportError as e:
|
||||
logger.warning(f"Failed to import security_assessment workflow: {e}")
|
||||
|
||||
try:
|
||||
from .comprehensive.secret_detection_scan.workflow import main_flow as secret_detection_flow
|
||||
except ImportError as e:
|
||||
logger.warning(f"Failed to import secret_detection_scan workflow: {e}")
|
||||
|
||||
|
||||
# Manual registry - developers add workflows here after creation
|
||||
# Only include workflows that were successfully imported
|
||||
WORKFLOW_REGISTRY: Dict[str, Dict[str, Any]] = {}
|
||||
|
||||
# Add workflows that were successfully imported
|
||||
if security_assessment_flow is not None:
|
||||
WORKFLOW_REGISTRY["security_assessment"] = {
|
||||
"flow": security_assessment_flow,
|
||||
"module_path": "toolbox.workflows.security_assessment.workflow",
|
||||
"function_name": "main_flow",
|
||||
"description": "Comprehensive security assessment workflow that scans files, analyzes code for vulnerabilities, and generates SARIF reports",
|
||||
"version": "1.0.0",
|
||||
"author": "FuzzForge Team",
|
||||
"tags": ["security", "scanner", "analyzer", "static-analysis", "sarif"]
|
||||
}
|
||||
|
||||
if secret_detection_flow is not None:
|
||||
WORKFLOW_REGISTRY["secret_detection_scan"] = {
|
||||
"flow": secret_detection_flow,
|
||||
"module_path": "toolbox.workflows.comprehensive.secret_detection_scan.workflow",
|
||||
"function_name": "main_flow",
|
||||
"description": "Comprehensive secret detection using TruffleHog and Gitleaks for thorough credential scanning",
|
||||
"version": "1.0.0",
|
||||
"author": "FuzzForge Team",
|
||||
"tags": ["secrets", "credentials", "detection", "trufflehog", "gitleaks", "comprehensive"]
|
||||
}
|
||||
|
||||
#
|
||||
# To add a new workflow, follow this pattern:
|
||||
#
|
||||
# "my_new_workflow": {
|
||||
# "flow": my_new_flow_function, # Import the flow function above
|
||||
# "module_path": "toolbox.workflows.my_new_workflow.workflow",
|
||||
# "function_name": "my_new_flow_function",
|
||||
# "description": "Description of what this workflow does",
|
||||
# "version": "1.0.0",
|
||||
# "author": "Developer Name",
|
||||
# "tags": ["tag1", "tag2"]
|
||||
# }
|
||||
|
||||
|
||||
def get_workflow_flow(workflow_name: str) -> Callable:
|
||||
"""
|
||||
Get the flow function for a workflow.
|
||||
|
||||
Args:
|
||||
workflow_name: Name of the workflow
|
||||
|
||||
Returns:
|
||||
Flow function
|
||||
|
||||
Raises:
|
||||
KeyError: If workflow not found in registry
|
||||
"""
|
||||
if workflow_name not in WORKFLOW_REGISTRY:
|
||||
available = list(WORKFLOW_REGISTRY.keys())
|
||||
raise KeyError(
|
||||
f"Workflow '{workflow_name}' not found in registry. "
|
||||
f"Available workflows: {available}. "
|
||||
f"Please add the workflow to toolbox/workflows/registry.py"
|
||||
)
|
||||
|
||||
return WORKFLOW_REGISTRY[workflow_name]["flow"]
|
||||
|
||||
|
||||
def get_workflow_info(workflow_name: str) -> Dict[str, Any]:
|
||||
"""
|
||||
Get registry information for a workflow.
|
||||
|
||||
Args:
|
||||
workflow_name: Name of the workflow
|
||||
|
||||
Returns:
|
||||
Registry information dictionary
|
||||
|
||||
Raises:
|
||||
KeyError: If workflow not found in registry
|
||||
"""
|
||||
if workflow_name not in WORKFLOW_REGISTRY:
|
||||
available = list(WORKFLOW_REGISTRY.keys())
|
||||
raise KeyError(
|
||||
f"Workflow '{workflow_name}' not found in registry. "
|
||||
f"Available workflows: {available}"
|
||||
)
|
||||
|
||||
return WORKFLOW_REGISTRY[workflow_name]
|
||||
|
||||
|
||||
def list_registered_workflows() -> Dict[str, Dict[str, Any]]:
|
||||
"""
|
||||
Get all registered workflows.
|
||||
|
||||
Returns:
|
||||
Dictionary of all workflow registry entries
|
||||
"""
|
||||
return WORKFLOW_REGISTRY.copy()
|
||||
|
||||
|
||||
def validate_registry() -> bool:
|
||||
"""
|
||||
Validate the workflow registry for consistency.
|
||||
|
||||
Returns:
|
||||
True if valid, raises exceptions if not
|
||||
|
||||
Raises:
|
||||
ValueError: If registry is invalid
|
||||
"""
|
||||
if not WORKFLOW_REGISTRY:
|
||||
raise ValueError("Workflow registry is empty")
|
||||
|
||||
required_fields = ["flow", "module_path", "function_name", "description"]
|
||||
|
||||
for name, entry in WORKFLOW_REGISTRY.items():
|
||||
# Check required fields
|
||||
missing_fields = [field for field in required_fields if field not in entry]
|
||||
if missing_fields:
|
||||
raise ValueError(
|
||||
f"Workflow '{name}' missing required fields: {missing_fields}"
|
||||
)
|
||||
|
||||
# Check if flow is callable
|
||||
if not callable(entry["flow"]):
|
||||
raise ValueError(f"Workflow '{name}' flow is not callable")
|
||||
|
||||
# Check if flow has the required Prefect attributes
|
||||
if not hasattr(entry["flow"], "deploy"):
|
||||
raise ValueError(
|
||||
f"Workflow '{name}' flow is not a Prefect flow (missing deploy method)"
|
||||
)
|
||||
|
||||
logger.info(f"Registry validation passed. {len(WORKFLOW_REGISTRY)} workflows registered.")
|
||||
return True
|
||||
|
||||
|
||||
# Validate registry on import
|
||||
try:
|
||||
validate_registry()
|
||||
logger.info(f"Workflow registry loaded successfully with {len(WORKFLOW_REGISTRY)} workflows")
|
||||
except Exception as e:
|
||||
logger.error(f"Workflow registry validation failed: {e}")
|
||||
raise
|
||||
@@ -1,30 +0,0 @@
|
||||
FROM prefecthq/prefect:3-python3.11
|
||||
|
||||
WORKDIR /app
|
||||
|
||||
# Create toolbox directory structure to match expected import paths
|
||||
RUN mkdir -p /app/toolbox/workflows /app/toolbox/modules
|
||||
|
||||
# Copy base module infrastructure
|
||||
COPY modules/__init__.py /app/toolbox/modules/
|
||||
COPY modules/base.py /app/toolbox/modules/
|
||||
|
||||
# Copy only required modules (manual selection)
|
||||
COPY modules/scanner /app/toolbox/modules/scanner
|
||||
COPY modules/analyzer /app/toolbox/modules/analyzer
|
||||
COPY modules/reporter /app/toolbox/modules/reporter
|
||||
|
||||
# Copy this workflow
|
||||
COPY workflows/security_assessment /app/toolbox/workflows/security_assessment
|
||||
|
||||
# Install workflow-specific requirements if they exist
|
||||
RUN if [ -f /app/toolbox/workflows/security_assessment/requirements.txt ]; then pip install --no-cache-dir -r /app/toolbox/workflows/security_assessment/requirements.txt; fi
|
||||
|
||||
# Install common requirements
|
||||
RUN pip install --no-cache-dir pyyaml
|
||||
|
||||
# Set Python path
|
||||
ENV PYTHONPATH=/app:$PYTHONPATH
|
||||
|
||||
# Create workspace directory
|
||||
RUN mkdir -p /workspace
|
||||
@@ -0,0 +1,150 @@
|
||||
"""
|
||||
Security Assessment Workflow Activities
|
||||
|
||||
Activities specific to the security assessment workflow:
|
||||
- scan_files_activity: Scan files in the workspace
|
||||
- analyze_security_activity: Analyze security vulnerabilities
|
||||
- generate_sarif_report_activity: Generate SARIF report from findings
|
||||
"""
|
||||
|
||||
import logging
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
from temporalio import activity
|
||||
|
||||
# Configure logging
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
# Add toolbox to path for module imports
|
||||
sys.path.insert(0, '/app/toolbox')
|
||||
|
||||
|
||||
@activity.defn(name="scan_files")
|
||||
async def scan_files_activity(workspace_path: str, config: dict) -> dict:
|
||||
"""
|
||||
Scan files in the workspace.
|
||||
|
||||
Args:
|
||||
workspace_path: Path to the workspace directory
|
||||
config: Scanner configuration
|
||||
|
||||
Returns:
|
||||
Scanner results dictionary
|
||||
"""
|
||||
logger.info(f"Activity: scan_files (workspace={workspace_path})")
|
||||
|
||||
try:
|
||||
from modules.scanner import FileScanner
|
||||
|
||||
workspace = Path(workspace_path)
|
||||
if not workspace.exists():
|
||||
raise FileNotFoundError(f"Workspace not found: {workspace_path}")
|
||||
|
||||
scanner = FileScanner()
|
||||
result = await scanner.execute(config, workspace)
|
||||
|
||||
logger.info(
|
||||
f"✓ File scanning completed: "
|
||||
f"{result.summary.get('total_files', 0)} files scanned"
|
||||
)
|
||||
return result.dict()
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"File scanning failed: {e}", exc_info=True)
|
||||
raise
|
||||
|
||||
|
||||
@activity.defn(name="analyze_security")
|
||||
async def analyze_security_activity(workspace_path: str, config: dict) -> dict:
|
||||
"""
|
||||
Analyze security vulnerabilities in the workspace.
|
||||
|
||||
Args:
|
||||
workspace_path: Path to the workspace directory
|
||||
config: Analyzer configuration
|
||||
|
||||
Returns:
|
||||
Analysis results dictionary
|
||||
"""
|
||||
logger.info(f"Activity: analyze_security (workspace={workspace_path})")
|
||||
|
||||
try:
|
||||
from modules.analyzer import SecurityAnalyzer
|
||||
|
||||
workspace = Path(workspace_path)
|
||||
if not workspace.exists():
|
||||
raise FileNotFoundError(f"Workspace not found: {workspace_path}")
|
||||
|
||||
analyzer = SecurityAnalyzer()
|
||||
result = await analyzer.execute(config, workspace)
|
||||
|
||||
logger.info(
|
||||
f"✓ Security analysis completed: "
|
||||
f"{result.summary.get('total_findings', 0)} findings"
|
||||
)
|
||||
return result.dict()
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Security analysis failed: {e}", exc_info=True)
|
||||
raise
|
||||
|
||||
|
||||
@activity.defn(name="generate_sarif_report")
|
||||
async def generate_sarif_report_activity(
|
||||
scan_results: dict,
|
||||
analysis_results: dict,
|
||||
config: dict,
|
||||
workspace_path: str
|
||||
) -> dict:
|
||||
"""
|
||||
Generate SARIF report from scan and analysis results.
|
||||
|
||||
Args:
|
||||
scan_results: Results from file scanner
|
||||
analysis_results: Results from security analyzer
|
||||
config: Reporter configuration
|
||||
workspace_path: Path to the workspace
|
||||
|
||||
Returns:
|
||||
SARIF report dictionary
|
||||
"""
|
||||
logger.info("Activity: generate_sarif_report")
|
||||
|
||||
try:
|
||||
from modules.reporter import SARIFReporter
|
||||
|
||||
workspace = Path(workspace_path)
|
||||
|
||||
# Combine findings from all modules
|
||||
all_findings = []
|
||||
|
||||
# Add scanner findings (only sensitive files, not all files)
|
||||
scanner_findings = scan_results.get("findings", [])
|
||||
sensitive_findings = [f for f in scanner_findings if f.get("severity") != "info"]
|
||||
all_findings.extend(sensitive_findings)
|
||||
|
||||
# Add analyzer findings
|
||||
analyzer_findings = analysis_results.get("findings", [])
|
||||
all_findings.extend(analyzer_findings)
|
||||
|
||||
# Prepare reporter config
|
||||
reporter_config = {
|
||||
**config,
|
||||
"findings": all_findings,
|
||||
"tool_name": "FuzzForge Security Assessment",
|
||||
"tool_version": "1.0.0"
|
||||
}
|
||||
|
||||
reporter = SARIFReporter()
|
||||
result = await reporter.execute(reporter_config, workspace)
|
||||
|
||||
# Extract SARIF from result
|
||||
sarif = result.dict().get("sarif", {})
|
||||
|
||||
logger.info(f"✓ SARIF report generated with {len(all_findings)} findings")
|
||||
return sarif
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"SARIF report generation failed: {e}", exc_info=True)
|
||||
raise
|
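The finding-combination logic inside `generate_sarif_report_activity` above can be exercised without the reporter module. This is a minimal sketch: `combine_findings` is a hypothetical helper name, and the sample findings are invented purely for illustration.

```python
from typing import Any, Dict, List


def combine_findings(
    scan_results: Dict[str, Any], analysis_results: Dict[str, Any]
) -> List[Dict[str, Any]]:
    """Merge scanner and analyzer findings the way the activity does.

    Scanner findings with severity "info" are dropped (plain file
    listings); all analyzer findings are kept.
    """
    scanner_findings = scan_results.get("findings", [])
    sensitive = [f for f in scanner_findings if f.get("severity") != "info"]
    return sensitive + analysis_results.get("findings", [])


# Invented sample data, for illustration only.
scan = {"findings": [
    {"id": "file-1", "severity": "info"},
    {"id": "file-2", "severity": "high"},
]}
analysis = {"findings": [{"id": "vuln-1", "severity": "medium"}]}

combined = combine_findings(scan, analysis)
print([f["id"] for f in combined])
```

Note that a scanner finding with no `severity` key is kept, because `f.get("severity")` returns `None`, which is not equal to `"info"`.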
||||
@@ -1,8 +1,8 @@
|
||||
name: security_assessment
|
||||
version: "2.0.0"
|
||||
vertical: rust
|
||||
description: "Comprehensive security assessment workflow that scans files, analyzes code for vulnerabilities, and generates SARIF reports"
|
||||
author: "FuzzForge Team"
|
||||
category: "comprehensive"
|
||||
tags:
|
||||
- "security"
|
||||
- "scanner"
|
||||
@@ -11,28 +11,14 @@ tags:
|
||||
- "sarif"
|
||||
- "comprehensive"
|
||||
|
||||
supported_volume_modes:
|
||||
- "ro"
|
||||
- "rw"
|
||||
|
||||
default_volume_mode: "ro"
|
||||
default_target_path: "/workspace"
|
||||
|
||||
requirements:
|
||||
tools:
|
||||
- "file_scanner"
|
||||
- "security_analyzer"
|
||||
- "sarif_reporter"
|
||||
resources:
|
||||
memory: "512Mi"
|
||||
cpu: "500m"
|
||||
timeout: 1800
|
||||
|
||||
has_docker: true
|
||||
# Workspace isolation mode (system-level configuration)
|
||||
# - "isolated" (default): Each workflow run gets its own isolated workspace (safe for concurrent fuzzing)
|
||||
# - "shared": All runs share the same workspace (for read-only analysis workflows)
|
||||
# - "copy-on-write": Download once, copy for each run (balances performance and isolation)
|
||||
# Using "shared" mode for read-only security analysis (no file modifications)
|
||||
workspace_isolation: "shared"
|
||||
|
||||
default_parameters:
|
||||
target_path: "/workspace"
|
||||
volume_mode: "ro"
|
||||
scanner_config: {}
|
||||
analyzer_config: {}
|
||||
reporter_config: {}
|
||||
@@ -40,15 +26,6 @@ default_parameters:
|
||||
parameters:
|
||||
type: object
|
||||
properties:
|
||||
target_path:
|
||||
type: string
|
||||
default: "/workspace"
|
||||
description: "Path to analyze"
|
||||
volume_mode:
|
||||
type: string
|
||||
enum: ["ro", "rw"]
|
||||
default: "ro"
|
||||
description: "Volume mount mode"
|
||||
scanner_config:
|
||||
type: object
|
||||
description: "File scanner configuration"
|
||||
|
||||
@@ -1,4 +0,0 @@
|
||||
# Requirements for security assessment workflow
|
||||
pydantic>=2.0.0
|
||||
pyyaml>=6.0
|
||||
aiofiles>=23.0.0
|
||||
@@ -1,5 +1,7 @@
|
||||
"""
|
||||
Security Assessment Workflow - Comprehensive security analysis using multiple modules
|
||||
Security Assessment Workflow - Temporal Version
|
||||
|
||||
Comprehensive security analysis using multiple modules.
|
||||
"""
|
||||
|
||||
# Copyright (c) 2025 FuzzingLabs
|
||||
@@ -13,240 +15,219 @@ Security Assessment Workflow - Comprehensive security analysis using multiple mo
|
||||
#
|
||||
# Additional attribution and requirements are provided in the NOTICE file.
|
||||
|
||||
import sys
|
||||
import logging
|
||||
from pathlib import Path
|
||||
from datetime import timedelta
|
||||
from typing import Dict, Any, Optional
|
||||
from prefect import flow, task
|
||||
import json
|
||||
|
||||
# Add modules to path
|
||||
sys.path.insert(0, '/app')
|
||||
from temporalio import workflow
|
||||
from temporalio.common import RetryPolicy
|
||||
|
||||
# Import modules
|
||||
from toolbox.modules.scanner import FileScanner
|
||||
from toolbox.modules.analyzer import SecurityAnalyzer
|
||||
from toolbox.modules.reporter import SARIFReporter
|
||||
# Import activity interfaces (will be executed by worker)
|
||||
with workflow.unsafe.imports_passed_through():
|
||||
import logging
|
||||
|
||||
# Configure logging
|
||||
logging.basicConfig(level=logging.INFO)
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
@task(name="file_scanning")
|
||||
async def scan_files_task(workspace: Path, config: Dict[str, Any]) -> Dict[str, Any]:
|
||||
@workflow.defn
|
||||
class SecurityAssessmentWorkflow:
|
||||
"""
|
||||
Task to scan files in the workspace.
|
||||
|
||||
Args:
|
||||
workspace: Path to the workspace
|
||||
config: Scanner configuration
|
||||
|
||||
Returns:
|
||||
Scanner results
|
||||
"""
|
||||
logger.info(f"Starting file scanning in {workspace}")
|
||||
scanner = FileScanner()
|
||||
|
||||
result = await scanner.execute(config, workspace)
|
||||
|
||||
logger.info(f"File scanning completed: {result.summary.get('total_files', 0)} files found")
|
||||
return result.dict()
|
||||
|
||||
|
||||
@task(name="security_analysis")
|
||||
async def analyze_security_task(workspace: Path, config: Dict[str, Any]) -> Dict[str, Any]:
|
||||
"""
|
||||
Task to analyze security vulnerabilities.
|
||||
|
||||
Args:
|
||||
workspace: Path to the workspace
|
||||
config: Analyzer configuration
|
||||
|
||||
Returns:
|
||||
Analysis results
|
||||
"""
|
||||
logger.info("Starting security analysis")
|
||||
analyzer = SecurityAnalyzer()
|
||||
|
||||
result = await analyzer.execute(config, workspace)
|
||||
|
||||
logger.info(
|
||||
f"Security analysis completed: {result.summary.get('total_findings', 0)} findings"
|
||||
)
|
||||
return result.dict()
|
||||
|
||||
|
||||
@task(name="report_generation")
|
||||
async def generate_report_task(
|
||||
scan_results: Dict[str, Any],
|
||||
analysis_results: Dict[str, Any],
|
||||
config: Dict[str, Any],
|
||||
workspace: Path
|
||||
) -> Dict[str, Any]:
|
||||
"""
|
||||
Task to generate SARIF report from all findings.
|
||||
|
||||
Args:
|
||||
scan_results: Results from scanner
|
||||
analysis_results: Results from analyzer
|
||||
config: Reporter configuration
|
||||
workspace: Path to the workspace
|
||||
|
||||
Returns:
|
||||
SARIF report
|
||||
"""
|
||||
logger.info("Generating SARIF report")
|
||||
reporter = SARIFReporter()
|
||||
|
||||
# Combine findings from all modules
|
||||
all_findings = []
|
||||
|
||||
# Add scanner findings (only sensitive files, not all files)
|
||||
scanner_findings = scan_results.get("findings", [])
|
||||
sensitive_findings = [f for f in scanner_findings if f.get("severity") != "info"]
|
||||
all_findings.extend(sensitive_findings)
|
||||
|
||||
# Add analyzer findings
|
||||
analyzer_findings = analysis_results.get("findings", [])
|
||||
all_findings.extend(analyzer_findings)
|
||||
|
||||
# Prepare reporter config
|
||||
reporter_config = {
|
||||
**config,
|
||||
"findings": all_findings,
|
||||
"tool_name": "FuzzForge Security Assessment",
|
||||
"tool_version": "1.0.0"
|
||||
}
|
||||
|
||||
result = await reporter.execute(reporter_config, workspace)
|
||||
|
||||
# Extract SARIF from result
|
||||
sarif = result.dict().get("sarif", {})
|
||||
|
||||
logger.info(f"Report generated with {len(all_findings)} total findings")
|
||||
return sarif
|
||||
|
||||
|
||||
@flow(name="security_assessment", log_prints=True)
|
||||
async def main_flow(
|
||||
target_path: str = "/workspace",
|
||||
volume_mode: str = "ro",
|
||||
scanner_config: Optional[Dict[str, Any]] = None,
|
||||
analyzer_config: Optional[Dict[str, Any]] = None,
|
||||
reporter_config: Optional[Dict[str, Any]] = None
|
||||
) -> Dict[str, Any]:
|
||||
"""
|
||||
Main security assessment workflow.
|
||||
Comprehensive security assessment workflow.
|
||||
|
||||
This workflow:
|
||||
1. Scans files in the workspace
|
||||
2. Analyzes code for security vulnerabilities
|
||||
3. Generates a SARIF report with all findings
|
||||
|
||||
Args:
|
||||
target_path: Path to the mounted workspace (default: /workspace)
|
||||
volume_mode: Volume mount mode (ro/rw)
|
||||
scanner_config: Configuration for file scanner
|
||||
analyzer_config: Configuration for security analyzer
|
||||
reporter_config: Configuration for SARIF reporter
|
||||
|
||||
Returns:
|
||||
SARIF-formatted findings report
|
||||
1. Downloads target from MinIO
|
||||
2. Scans files in the workspace
|
||||
3. Analyzes code for security vulnerabilities
|
||||
4. Generates a SARIF report with all findings
|
||||
5. Uploads results to MinIO
|
||||
6. Cleans up cache
|
||||
"""
|
||||
logger.info(f"Starting security assessment workflow")
|
||||
logger.info(f"Workspace: {target_path}, Mode: {volume_mode}")
|
||||
|
||||
# Set workspace path
|
||||
workspace = Path(target_path)
|
||||
@workflow.run
|
||||
async def run(
|
||||
self,
|
||||
target_id: str,
|
||||
scanner_config: Optional[Dict[str, Any]] = None,
|
||||
analyzer_config: Optional[Dict[str, Any]] = None,
|
||||
reporter_config: Optional[Dict[str, Any]] = None
|
||||
) -> Dict[str, Any]:
|
||||
"""
|
||||
Main workflow execution.
|
||||
|
||||
if not workspace.exists():
|
||||
logger.error(f"Workspace does not exist: {workspace}")
|
||||
return {
|
||||
"error": f"Workspace not found: {workspace}",
|
||||
"sarif": None
|
||||
}
|
||||
Args:
|
||||
target_id: UUID of the uploaded target in MinIO
|
||||
scanner_config: Configuration for file scanner
|
||||
analyzer_config: Configuration for security analyzer
|
||||
reporter_config: Configuration for SARIF reporter
|
||||
|
||||
# Default configurations
|
||||
if not scanner_config:
|
||||
scanner_config = {
|
||||
"patterns": ["*"],
|
||||
"check_sensitive": True,
|
||||
"calculate_hashes": False,
|
||||
"max_file_size": 10485760 # 10MB
|
||||
}
|
||||
Returns:
|
||||
Dictionary containing SARIF report and summary
|
||||
"""
|
||||
workflow_id = workflow.info().workflow_id
|
||||
|
||||
if not analyzer_config:
|
||||
analyzer_config = {
|
||||
"file_extensions": [".py", ".js", ".java", ".php", ".rb", ".go"],
|
||||
"check_secrets": True,
|
||||
"check_sql": True,
|
||||
"check_dangerous_functions": True
|
||||
}
|
||||
|
||||
if not reporter_config:
|
||||
reporter_config = {
|
||||
"include_code_flows": False
|
||||
}
|
||||
|
||||
try:
|
||||
# Execute workflow tasks
|
||||
logger.info("Phase 1: File scanning")
|
||||
scan_results = await scan_files_task(workspace, scanner_config)
|
||||
|
||||
logger.info("Phase 2: Security analysis")
|
||||
analysis_results = await analyze_security_task(workspace, analyzer_config)
|
||||
|
||||
logger.info("Phase 3: Report generation")
|
||||
sarif_report = await generate_report_task(
|
||||
scan_results,
|
||||
analysis_results,
|
||||
reporter_config,
|
||||
workspace
|
||||
workflow.logger.info(
|
||||
f"Starting SecurityAssessmentWorkflow "
|
||||
f"(workflow_id={workflow_id}, target_id={target_id})"
|
||||
)
|
||||
|
||||
# Log summary
|
||||
if sarif_report and "runs" in sarif_report:
|
||||
results_count = len(sarif_report["runs"][0].get("results", []))
|
||||
logger.info(f"Workflow completed successfully with {results_count} findings")
|
||||
else:
|
||||
logger.info("Workflow completed successfully")
|
||||
# Default configurations
|
||||
if not scanner_config:
|
||||
scanner_config = {
|
||||
"patterns": ["*"],
|
||||
"check_sensitive": True,
|
||||
"calculate_hashes": False,
|
||||
"max_file_size": 10485760 # 10MB
|
||||
}
|
||||
|
||||
return sarif_report
|
||||
if not analyzer_config:
|
||||
analyzer_config = {
|
||||
"file_extensions": [".py", ".js", ".java", ".php", ".rb", ".go"],
|
||||
"check_secrets": True,
|
||||
"check_sql": True,
|
||||
"check_dangerous_functions": True
|
||||
}
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Workflow failed: {e}")
|
||||
# Return error in SARIF format
|
||||
return {
|
||||
"$schema": "https://raw.githubusercontent.com/oasis-tcs/sarif-spec/master/Schemata/sarif-schema-2.1.0.json",
|
||||
"version": "2.1.0",
|
||||
"runs": [
|
||||
{
|
||||
"tool": {
|
||||
"driver": {
|
||||
"name": "FuzzForge Security Assessment",
|
||||
"version": "1.0.0"
|
||||
}
|
||||
},
|
||||
"results": [],
|
||||
"invocations": [
|
||||
{
|
||||
"executionSuccessful": False,
|
||||
"exitCode": 1,
|
||||
"exitCodeDescription": str(e)
|
||||
}
|
||||
]
|
||||
}
|
||||
]
|
||||
if not reporter_config:
|
||||
reporter_config = {
|
||||
"include_code_flows": False
|
||||
}
|
||||
|
||||
results = {
|
||||
"workflow_id": workflow_id,
|
||||
"target_id": target_id,
|
||||
"status": "running",
|
||||
"steps": []
|
||||
}
|
||||
|
||||
try:
|
||||
# Get run ID for workspace isolation (using shared mode for read-only analysis)
|
||||
run_id = workflow.info().run_id
|
||||
|
||||
if __name__ == "__main__":
|
||||
# For local testing
|
||||
import asyncio
|
||||
# Step 1: Download target from MinIO
|
||||
workflow.logger.info("Step 1: Downloading target from MinIO")
|
||||
target_path = await workflow.execute_activity(
|
||||
"get_target",
|
||||
args=[target_id, run_id, "shared"], # target_id, run_id, workspace_isolation
|
||||
start_to_close_timeout=timedelta(minutes=5),
|
||||
retry_policy=RetryPolicy(
|
||||
initial_interval=timedelta(seconds=1),
|
||||
maximum_interval=timedelta(seconds=30),
|
||||
maximum_attempts=3
|
||||
)
|
||||
)
|
||||
results["steps"].append({
|
||||
"step": "download_target",
|
||||
"status": "success",
|
||||
"target_path": target_path
|
||||
})
|
||||
workflow.logger.info(f"✓ Target downloaded to: {target_path}")
|
||||
|
||||
asyncio.run(main_flow(
|
||||
target_path="/tmp/test",
|
||||
scanner_config={"patterns": ["*.py"]},
|
||||
analyzer_config={"check_secrets": True}
|
||||
))
|
||||
# Step 2: File scanning
|
||||
workflow.logger.info("Step 2: Scanning files")
|
||||
scan_results = await workflow.execute_activity(
|
||||
"scan_files",
|
||||
args=[target_path, scanner_config],
|
||||
start_to_close_timeout=timedelta(minutes=10),
|
||||
retry_policy=RetryPolicy(
|
||||
initial_interval=timedelta(seconds=2),
|
||||
maximum_interval=timedelta(seconds=60),
|
||||
maximum_attempts=2
|
||||
)
|
||||
)
|
||||
results["steps"].append({
|
||||
"step": "file_scanning",
|
||||
"status": "success",
|
||||
"files_scanned": scan_results.get("summary", {}).get("total_files", 0)
|
||||
})
|
||||
workflow.logger.info(
|
||||
f"✓ File scanning completed: "
|
||||
f"{scan_results.get('summary', {}).get('total_files', 0)} files"
|
||||
)
|
||||
|
||||
# Step 3: Security analysis
|
||||
workflow.logger.info("Step 3: Analyzing security vulnerabilities")
|
||||
analysis_results = await workflow.execute_activity(
|
||||
"analyze_security",
|
||||
args=[target_path, analyzer_config],
|
||||
start_to_close_timeout=timedelta(minutes=15),
|
||||
retry_policy=RetryPolicy(
|
||||
initial_interval=timedelta(seconds=2),
|
||||
maximum_interval=timedelta(seconds=60),
|
||||
maximum_attempts=2
|
||||
)
|
||||
)
|
||||
results["steps"].append({
|
||||
"step": "security_analysis",
|
||||
"status": "success",
|
||||
"findings": analysis_results.get("summary", {}).get("total_findings", 0)
|
||||
})
|
||||
workflow.logger.info(
|
||||
f"✓ Security analysis completed: "
|
||||
f"{analysis_results.get('summary', {}).get('total_findings', 0)} findings"
|
||||
)
|
||||
|
||||
# Step 4: Generate SARIF report
|
||||
workflow.logger.info("Step 4: Generating SARIF report")
|
||||
sarif_report = await workflow.execute_activity(
|
||||
"generate_sarif_report",
|
||||
args=[scan_results, analysis_results, reporter_config, target_path],
|
||||
start_to_close_timeout=timedelta(minutes=5)
|
||||
)
|
||||
results["steps"].append({
|
||||
"step": "report_generation",
|
||||
"status": "success"
|
||||
})
|
||||
|
||||
# Count total findings in SARIF
|
||||
total_findings = 0
|
||||
if sarif_report and "runs" in sarif_report:
|
||||
total_findings = len(sarif_report["runs"][0].get("results", []))
|
||||
|
||||
workflow.logger.info(f"✓ SARIF report generated with {total_findings} findings")
|
||||
|
||||
# Step 5: Upload results to MinIO
|
||||
workflow.logger.info("Step 5: Uploading results")
|
||||
try:
|
||||
results_url = await workflow.execute_activity(
|
||||
"upload_results",
|
||||
args=[workflow_id, sarif_report, "sarif"],
|
||||
start_to_close_timeout=timedelta(minutes=2)
|
||||
)
|
||||
results["results_url"] = results_url
|
||||
workflow.logger.info(f"✓ Results uploaded to: {results_url}")
|
||||
except Exception as e:
|
||||
workflow.logger.warning(f"Failed to upload results: {e}")
|
||||
results["results_url"] = None
|
||||
|
||||
# Step 6: Cleanup cache
|
||||
workflow.logger.info("Step 6: Cleaning up cache")
|
||||
try:
|
||||
await workflow.execute_activity(
|
||||
"cleanup_cache",
|
||||
args=[target_path, "shared"], # target_path, workspace_isolation
|
||||
start_to_close_timeout=timedelta(minutes=1)
|
||||
)
|
||||
workflow.logger.info("✓ Cache cleaned up (skipped for shared mode)")
|
||||
except Exception as e:
|
||||
workflow.logger.warning(f"Cache cleanup failed: {e}")
|
||||
|
||||
# Mark workflow as successful
|
||||
results["status"] = "success"
|
||||
results["sarif"] = sarif_report
|
||||
results["summary"] = {
|
||||
"total_findings": total_findings,
|
||||
"files_scanned": scan_results.get("summary", {}).get("total_files", 0)
|
||||
}
|
||||
workflow.logger.info(f"✓ Workflow completed successfully: {workflow_id}")
|
||||
|
||||
return results
|
||||
|
||||
except Exception as e:
|
||||
workflow.logger.error(f"Workflow failed: {e}")
|
||||
results["status"] = "error"
|
||||
results["error"] = str(e)
|
||||
results["steps"].append({
|
||||
"step": "error",
|
||||
"status": "failed",
|
||||
"error": str(e)
|
||||
})
|
||||
raise
|
||||
|
||||
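The `RetryPolicy` values passed to `workflow.execute_activity` in the workflow above bound how long a failing activity keeps retrying. As a rough illustration only, the helper below computes the waits between attempts. It assumes Temporal's default backoff coefficient of 2.0, since the diff does not set one, and `retry_delays` is a hypothetical name that is not part of the Temporal SDK.

```python
from datetime import timedelta
from typing import List


def retry_delays(
    initial_interval: timedelta,
    maximum_interval: timedelta,
    maximum_attempts: int,
    backoff_coefficient: float = 2.0,  # assumed default; not set in the diff
) -> List[timedelta]:
    """Return the wait before each retry (attempts 2..maximum_attempts)."""
    delays: List[timedelta] = []
    current = initial_interval
    for _ in range(max(maximum_attempts - 1, 0)):
        # Each wait is capped at maximum_interval
        delays.append(min(current, maximum_interval))
        current = current * backoff_coefficient
    return delays


# get_target step: 3 attempts, so at most two waits
download_waits = retry_delays(timedelta(seconds=1), timedelta(seconds=30), 3)
# scan_files / analyze_security steps: 2 attempts, so a single wait
scan_waits = retry_delays(timedelta(seconds=2), timedelta(seconds=60), 2)
```

Under these assumptions the download step gives up after roughly three seconds of waiting plus the time spent in each attempt, while the scan and analysis steps retry only once.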
Generated
+310
-926
File diff suppressed because it is too large
+67
-2
@@ -153,10 +153,10 @@ fuzzforge workflows parameters security_assessment --no-interactive
### Workflow Execution

#### `fuzzforge workflow <workflow> <target-path>`
Execute a security testing workflow.
Execute a security testing workflow with **automatic file upload**.

```bash
# Basic execution
# Basic execution - CLI automatically detects local files and uploads them
fuzzforge workflow security_assessment /path/to/code

# With parameters
@@ -172,6 +172,49 @@ fuzzforge workflow security_assessment /path/to/code \
fuzzforge workflow security_assessment /path/to/code --wait
```

**Automatic File Upload Behavior:**

The CLI decides how to submit the target based on whether the path exists locally:

1. **Local file/directory exists** → **Automatic upload to MinIO**:
   - CLI creates a compressed tarball (`.tar.gz`) for directories
   - Uploads via HTTP to backend API
   - Backend stores in MinIO with unique `target_id`
   - Worker downloads from MinIO when ready to analyze
   - ✅ **Works from any machine** (no shared filesystem needed)

2. **Path doesn't exist locally** → **Path-based submission** (legacy):
   - Path is sent to backend as-is
   - Backend expects target to be accessible on its filesystem
   - ⚠️ Only works when CLI and backend share filesystem
|
||||
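The decision described in the list above can be sketched in a few lines of Python. This is an illustrative sketch, not the CLI's actual code: `prepare_target` is a hypothetical helper, and the HTTP upload itself is left out because the backend endpoint is not shown here.

```python
import tarfile
import tempfile
from pathlib import Path
from typing import Tuple


def prepare_target(target: str) -> Tuple[str, str]:
    """Return ("upload", file_to_upload) or ("path", original_path)."""
    path = Path(target)
    if not path.exists():
        # Legacy mode: the backend must be able to read this path itself.
        return ("path", target)
    if path.is_file():
        # Single files are uploaded as they are.
        return ("upload", str(path))
    # Directories are packed into a compressed tarball before upload.
    name = path.resolve().name
    archive = Path(tempfile.mkdtemp()) / f"{name}.tar.gz"
    with tarfile.open(str(archive), "w:gz") as tar:
        tar.add(str(path), arcname=name)
    return ("upload", str(archive))
```

A caller would then upload the returned file when the mode is `"upload"` and send the bare path otherwise.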
**Example workflow:**
```bash
$ ff workflow security_assessment ./my-project

🔧 Getting workflow information for: security_assessment
📦 Detected local directory: ./my-project (21 files)
🗜️ Creating compressed tarball...
📤 Uploading to backend (0.01 MB)...
✅ Upload complete! Target ID: 548193a1-f73f-4ec1-8068-19ec2660b8e4

🎯 Executing workflow:
Workflow: security_assessment
Target: my-project.tar.gz (uploaded)
Volume Mode: ro
Status: 🔄 RUNNING

✅ Workflow started successfully!
Execution ID: security_assessment-52781925
```

**Upload Details:**
- **Max file size**: 10 GB (configurable on backend)
- **Compression**: Automatic for directories (reduces upload time)
- **Storage**: Files stored in MinIO (S3-compatible)
- **Lifecycle**: Automatic cleanup after 7 days
- **Caching**: Workers cache downloaded targets for faster repeated workflows
|
||||
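The caching point above can be pictured with a small in-memory cache keyed by `target_id`. This is a sketch under assumptions: `TargetCache` and the `download` callable are hypothetical names, and the real worker cache may be organised quite differently.

```python
from pathlib import Path
from typing import Callable, Dict


class TargetCache:
    """Keep one local copy per target_id so repeated runs skip the download."""

    def __init__(self) -> None:
        self._paths: Dict[str, Path] = {}

    def get(self, target_id: str, download: Callable[[str], Path]) -> Path:
        cached = self._paths.get(target_id)
        if cached is not None and cached.exists():
            # Cache hit: reuse the earlier download
            return cached
        # Cache miss (or the local copy was removed): fetch again
        path = download(target_id)
        self._paths[target_id] = path
        return path
```

Checking `cached.exists()` means a copy removed by cleanup is simply downloaded again on the next run.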
**Options:**
- `--param, -p` - Parameter in key=value format (can be used multiple times)
- `--param-file, -f` - JSON file containing parameters
@@ -181,6 +224,22 @@ fuzzforge workflow security_assessment /path/to/code --wait
- `--wait, -w` - Wait for execution to complete
- `--live, -l` - Show live monitoring during execution

**Worker Lifecycle Options (v0.7.0):**
- `--auto-start/--no-auto-start` - Auto-start required worker (default: from config)
- `--auto-stop/--no-auto-stop` - Auto-stop worker after completion (default: from config)

**Examples:**
```bash
# Worker starts automatically (default behavior)
fuzzforge workflow ossfuzz_campaign . project_name=zlib

# Disable auto-start (worker must be running already)
fuzzforge workflow ossfuzz_campaign . --no-auto-start

# Auto-stop worker after completion
fuzzforge workflow ossfuzz_campaign . --wait --auto-stop
```
|
||||
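The "default: from config" behaviour of these flags amounts to a three-state value: explicitly on, explicitly off, or not given. A minimal sketch of that resolution follows. `resolve_flag` is a hypothetical helper, and the dictionary only mirrors the documented `auto_start_workers` and `auto_stop_workers` defaults rather than reading a real config file.

```python
from typing import Optional


def resolve_flag(cli_value: Optional[bool], config_value: bool) -> bool:
    """An explicit CLI flag wins; otherwise fall back to the config value."""
    return config_value if cli_value is None else cli_value


# Stand-in for the worker settings loaded from the CLI configuration
config = {"auto_start_workers": True, "auto_stop_workers": False}

# --no-auto-start given on the command line: overrides the config default
auto_start = resolve_flag(False, config["auto_start_workers"])
# --auto-stop / --no-auto-stop omitted: the config value is used
auto_stop = resolve_flag(None, config["auto_stop_workers"])
```

Using `None` for "flag not given" keeps an explicit `--no-auto-start` distinguishable from simply omitting the option.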
#### `fuzzforge workflow status [execution-id]`
Check the status of a workflow execution.

@@ -402,6 +461,12 @@ preferences:
  show_progress_bars: true
  table_style: "rich"
  color_output: true

workers:
  auto_start_workers: true      # Auto-start workers when needed
  auto_stop_workers: false      # Auto-stop workers after completion
  worker_startup_timeout: 60    # Worker startup timeout (seconds)
  docker_compose_file: null     # Custom docker-compose.yml path
```

## 🔧 Advanced Usage
|
||||
|
||||
@@ -207,7 +207,7 @@ def install_zsh_completion():
|
||||
|
||||
# Add fpath to .zshrc if not present
|
||||
zshrc = Path.home() / ".zshrc"
|
||||
fpath_line = f'fpath=(~/.zsh/completions $fpath)'
|
||||
fpath_line = 'fpath=(~/.zsh/completions $fpath)'
|
||||
autoload_line = 'autoload -U compinit && compinit'
|
||||
|
||||
if zshrc.exists():
|
||||
@@ -222,7 +222,7 @@ def install_zsh_completion():
|
||||
|
||||
if lines_to_add:
|
||||
with zshrc.open("a") as f:
|
||||
f.write(f"\n# FuzzForge CLI completion\n")
|
||||
f.write("\n# FuzzForge CLI completion\n")
|
||||
for line in lines_to_add:
|
||||
f.write(f"{line}\n")
|
||||
print("✅ Added completion setup to ~/.zshrc")
|
||||
|
||||
@@ -15,7 +15,6 @@ This module provides the main entry point for the FuzzForge CLI application.
|
||||
# Additional attribution and requirements are provided in the NOTICE file.
|
||||
|
||||
|
||||
import typer
|
||||
from src.fuzzforge_cli.main import app
|
||||
|
||||
if __name__ == "__main__":
|
||||
|
||||
@@ -14,10 +14,10 @@ API response validation and graceful degradation utilities.
|
||||
|
||||
|
||||
import logging
|
||||
from typing import Any, Dict, List, Optional, Union
|
||||
from typing import Any, Dict, List, Optional
|
||||
from pydantic import BaseModel, ValidationError as PydanticValidationError
|
||||
|
||||
from .exceptions import ValidationError, APIConnectionError
|
||||
from .exceptions import ValidationError
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
@@ -29,7 +29,6 @@ class WorkflowMetadata(BaseModel):
|
||||
author: Optional[str] = None
|
||||
description: Optional[str] = None
|
||||
parameters: Dict[str, Any] = {}
|
||||
supported_volume_modes: List[str] = ["ro", "rw"]
|
||||
|
||||
|
||||
class RunStatus(BaseModel):
|
||||
|
||||
@@ -15,15 +15,11 @@ from __future__ import annotations
|
||||
|
||||
import asyncio
|
||||
import os
|
||||
from datetime import datetime
|
||||
from typing import Optional
|
||||
|
||||
import typer
|
||||
from rich.console import Console
|
||||
from rich.panel import Panel
|
||||
from rich.table import Table
|
||||
|
||||
from ..config import ProjectConfigManager
|
||||
|
||||
console = Console()
|
||||
app = typer.Typer(name="ai", help="Interact with the FuzzForge AI system")
|
||||
|
||||
@@ -18,13 +18,11 @@ from pathlib import Path
|
||||
from rich.console import Console
|
||||
from rich.table import Table
|
||||
from rich.panel import Panel
|
||||
from rich.prompt import Prompt, Confirm
|
||||
from rich.prompt import Confirm
|
||||
from rich import box
|
||||
from typing import Optional
|
||||
|
||||
from ..config import (
|
||||
get_project_config,
|
||||
ensure_project_config,
|
||||
get_global_config,
|
||||
save_global_config,
|
||||
FuzzForgeConfig
|
||||
@@ -335,7 +333,6 @@ def edit_config(
|
||||
"""
|
||||
📝 Open configuration file in default editor
|
||||
"""
|
||||
import os
|
||||
import subprocess
|
||||
|
||||
if global_config:
|
||||
@@ -369,7 +366,7 @@ def edit_config(
|
||||
try:
|
||||
console.print(f"📝 Opening {config_type} configuration in {editor}...")
|
||||
subprocess.run([editor, str(config_path)], check=True)
|
||||
console.print(f"✅ Configuration file edited", style="green")
|
||||
console.print("✅ Configuration file edited", style="green")
|
||||
|
||||
except subprocess.CalledProcessError as e:
|
||||
console.print(f"❌ Failed to open editor: {e}", style="red")
|
||||
|
||||
@@ -21,18 +21,17 @@ from typing import Optional, Dict, Any, List
|
||||
|
||||
import typer
|
||||
from rich.console import Console
|
||||
from rich.table import Table, Column
|
||||
from rich.table import Table
|
||||
from rich.panel import Panel
|
||||
from rich.syntax import Syntax
|
||||
from rich.tree import Tree
|
||||
from rich.text import Text
|
||||
from rich import box
|
||||
|
||||
from ..config import get_project_config, FuzzForgeConfig
|
||||
from ..database import get_project_db, ensure_project_db, FindingRecord
|
||||
from ..exceptions import (
|
||||
handle_error, retry_on_network_error, validate_run_id,
|
||||
require_project, ValidationError, DatabaseError
|
||||
retry_on_network_error, validate_run_id,
|
||||
require_project, ValidationError
|
||||
)
|
||||
from fuzzforge_sdk import FuzzForgeClient
|
||||
|
||||
@@ -159,7 +158,7 @@ def display_findings_table(sarif_data: Dict[str, Any]):
|
||||
driver = tool.get("driver", {})
|
||||
|
||||
# Tool information
|
||||
console.print(f"\n🔍 [bold]Security Analysis Results[/bold]")
|
||||
console.print("\n🔍 [bold]Security Analysis Results[/bold]")
|
||||
if driver.get("name"):
|
||||
console.print(f"Tool: {driver.get('name')} v{driver.get('version', 'unknown')}")
|
||||
|
||||
@@ -241,7 +240,7 @@ def display_findings_table(sarif_data: Dict[str, Any]):
|
||||
location_text
|
||||
)
|
||||
|
||||
console.print(f"\n📋 [bold]Detailed Results[/bold]")
|
||||
console.print("\n📋 [bold]Detailed Results[/bold]")
|
||||
if len(results) > 50:
|
||||
console.print(f"Showing first 50 of {len(results)} results")
|
||||
console.print()
|
||||
@@ -297,7 +296,7 @@ def findings_history(
|
||||
console.print(f"\n📚 [bold]Findings History ({len(findings)})[/bold]\n")
|
||||
console.print(table)
|
||||
|
||||
console.print(f"\n💡 Use [bold cyan]fuzzforge finding <run-id>[/bold cyan] to view detailed findings")
|
||||
console.print("\n💡 Use [bold cyan]fuzzforge finding <run-id>[/bold cyan] to view detailed findings")
|
||||
|
||||
except Exception as e:
|
||||
console.print(f"❌ Failed to get findings history: {e}", style="red")
|
||||
@@ -710,10 +709,10 @@ def all_findings(
|
||||
if show_findings:
|
||||
display_detailed_findings(findings, max_findings)
|
||||
|
||||
console.print(f"\n💡 Use filters to refine results: --workflow, --severity, --since")
|
||||
console.print(f"💡 Show findings content: --show-findings")
|
||||
console.print(f"💡 Export findings: --export json --output report.json")
|
||||
console.print(f"💡 View specific findings: [bold cyan]fuzzforge finding <run-id>[/bold cyan]")
|
||||
console.print("\n💡 Use filters to refine results: --workflow, --severity, --since")
|
||||
console.print("💡 Show findings content: --show-findings")
|
||||
console.print("💡 Export findings: --export json --output report.json")
|
||||
console.print("💡 View specific findings: [bold cyan]fuzzforge finding <run-id>[/bold cyan]")
|
||||
|
||||
except Exception as e:
|
||||
console.print(f"❌ Failed to get all findings: {e}", style="red")
|
||||
|
||||
@@ -164,7 +164,7 @@ fuzzforge finding <run-id>
|
||||
console.print("📚 Created README.md")
|
||||
|
||||
console.print("\n✅ FuzzForge project initialized successfully!", style="green")
|
||||
console.print(f"\n🎯 Next steps:")
|
||||
console.print("\n🎯 Next steps:")
|
||||
console.print(" • ff workflows - See available workflows")
|
||||
console.print(" • ff status - Check API connectivity")
|
||||
console.print(" • ff workflow <workflow> <path> - Start your first analysis")
|
||||
|
||||
@@ -13,23 +13,18 @@ Real-time monitoring and statistics commands.
|
||||
# Additional attribution and requirements are provided in the NOTICE file.
|
||||
|
||||
|
||||
import asyncio
|
||||
import time
|
||||
from datetime import datetime, timedelta
|
||||
from typing import Optional
|
||||
from datetime import datetime
|
||||
|
||||
import typer
|
||||
from rich.console import Console
|
||||
from rich.table import Table
|
||||
from rich.panel import Panel
|
||||
from rich.live import Live
|
||||
from rich.layout import Layout
|
||||
from rich.progress import Progress, BarColumn, TextColumn, SpinnerColumn
|
||||
from rich.align import Align
|
||||
from rich import box
|
||||
|
||||
from ..config import get_project_config, FuzzForgeConfig
|
||||
from ..database import get_project_db, ensure_project_db, CrashRecord
|
||||
from ..database import ensure_project_db, CrashRecord
|
||||
from fuzzforge_sdk import FuzzForgeClient
|
||||
|
||||
console = Console()
|
||||
@@ -93,9 +88,21 @@ def fuzzing_stats(
|
||||
with Live(auto_refresh=False, console=console) as live:
|
||||
while True:
|
||||
try:
|
||||
# Check workflow status
|
||||
run_status = client.get_run_status(run_id)
|
||||
stats = client.get_fuzzing_stats(run_id)
|
||||
table = create_stats_table(stats)
|
||||
live.update(table, refresh=True)
|
||||
|
||||
# Exit if workflow completed or failed
|
||||
if getattr(run_status, 'is_completed', False) or getattr(run_status, 'is_failed', False):
|
||||
final_status = getattr(run_status, 'status', 'Unknown')
|
||||
if getattr(run_status, 'is_completed', False):
|
||||
console.print("\n✅ [bold green]Workflow completed[/bold green]", style="green")
|
||||
else:
|
||||
console.print(f"\n⚠️ [bold yellow]Workflow ended[/bold yellow] | Status: {final_status}", style="yellow")
|
||||
break
|
||||
|
||||
time.sleep(refresh)
|
||||
except KeyboardInterrupt:
|
||||
console.print("\n📊 Monitoring stopped", style="yellow")
|
||||
@@ -124,8 +131,8 @@ def create_stats_table(stats) -> Panel:
|
||||
stats_table.add_row("Total Crashes", format_number(stats.crashes))
|
||||
stats_table.add_row("Unique Crashes", format_number(stats.unique_crashes))
|
||||
|
||||
if stats.coverage is not None:
|
||||
stats_table.add_row("Code Coverage", f"{stats.coverage:.1f}%")
|
||||
if stats.coverage is not None and stats.coverage > 0:
|
||||
stats_table.add_row("Code Coverage", f"{stats.coverage} edges")
|
||||
|
||||
stats_table.add_row("Corpus Size", format_number(stats.corpus_size))
|
||||
stats_table.add_row("Elapsed Time", format_duration(stats.elapsed_time))
|
||||
@@ -206,7 +213,7 @@ def crash_reports(
|
||||
console.print(
|
||||
Panel.fit(
|
||||
summary_table,
|
||||
title=f"🐛 Crash Summary",
|
||||
title="🐛 Crash Summary",
|
||||
box=box.ROUNDED
|
||||
)
|
||||
)
|
||||
@@ -246,7 +253,7 @@ def crash_reports(
|
||||
input_display
|
||||
)
|
||||
|
||||
console.print(f"\n🐛 [bold]Crash Details[/bold]")
|
||||
console.print("\n🐛 [bold]Crash Details[/bold]")
|
||||
if len(crashes) > limit:
|
||||
console.print(f"Showing first {limit} of {len(crashes)} crashes")
|
||||
console.print()
|
||||
@@ -260,78 +267,70 @@ def crash_reports(
|
||||
|
||||
|
||||
def _live_monitor(run_id: str, refresh: int):
|
||||
"""Helper for live monitoring to allow for cleaner exit handling"""
|
||||
"""Helper for live monitoring with inline real-time display"""
|
||||
with get_client() as client:
|
||||
start_time = time.time()
|
||||
|
||||
def render_layout(run_status, stats):
|
||||
layout = Layout()
|
||||
layout.split_column(
|
||||
Layout(name="header", size=3),
|
||||
Layout(name="main", ratio=1),
|
||||
Layout(name="footer", size=3)
|
||||
)
|
||||
layout["main"].split_row(
|
||||
Layout(name="stats", ratio=1),
|
||||
Layout(name="progress", ratio=1)
|
||||
)
|
||||
header = Panel(
|
||||
f"[bold]FuzzForge Live Monitor[/bold]\n"
|
||||
f"Run: {run_id[:12]}... | Status: {run_status.status} | "
|
||||
f"Uptime: {format_duration(int(time.time() - start_time))}",
|
||||
box=box.ROUNDED,
|
||||
style="cyan"
|
||||
)
|
||||
layout["header"].update(header)
|
||||
layout["stats"].update(create_stats_table(stats))
|
||||
def render_inline_stats(run_status, stats):
|
||||
"""Render inline stats display (non-dashboard)"""
|
||||
lines = []
|
||||
|
||||
progress_table = Table(show_header=False, box=box.SIMPLE)
|
||||
progress_table.add_column("Metric", style="bold")
|
||||
progress_table.add_column("Progress")
|
||||
if stats.executions > 0:
|
||||
exec_rate_percent = min(100, (stats.executions_per_sec / 1000) * 100)
|
||||
progress_table.add_row("Exec Rate", create_progress_bar(exec_rate_percent, "green"))
|
||||
crash_rate = (stats.crashes / stats.executions) * 100000
|
||||
crash_rate_percent = min(100, crash_rate * 10)
|
||||
progress_table.add_row("Crash Rate", create_progress_bar(crash_rate_percent, "red"))
|
||||
if stats.coverage is not None:
|
||||
progress_table.add_row("Coverage", create_progress_bar(stats.coverage, "blue"))
|
||||
layout["progress"].update(Panel.fit(progress_table, title="📊 Progress Indicators", box=box.ROUNDED))
|
||||
# Header line
|
||||
workflow_name = getattr(stats, 'workflow', 'unknown')
|
||||
status_emoji = "🔄" if not getattr(run_status, 'is_completed', False) else "✅"
|
||||
status_color = "yellow" if not getattr(run_status, 'is_completed', False) else "green"
|
||||
|
||||
footer = Panel(
|
||||
f"Last updated: {datetime.now().strftime('%H:%M:%S')} | "
|
||||
f"Refresh interval: {refresh}s | Press Ctrl+C to exit",
|
||||
box=box.ROUNDED,
|
||||
style="dim"
|
||||
)
|
||||
layout["footer"].update(footer)
|
||||
return layout
|
||||
lines.append(f"\n[bold cyan]📊 Live Fuzzing Monitor[/bold cyan] - {workflow_name} (Run: {run_id[:12]}...)\n")
|
||||
|
||||
with Live(auto_refresh=False, console=console, screen=True) as live:
|
||||
# Stats lines with emojis
|
||||
lines.append(f" [bold]⚡ Executions[/bold] {format_number(stats.executions):>8} [dim]({stats.executions_per_sec:,.1f}/sec)[/dim]")
|
||||
lines.append(f" [bold]💥 Crashes[/bold] {stats.crashes:>8} [dim](unique: {stats.unique_crashes})[/dim]")
|
||||
lines.append(f" [bold]📦 Corpus[/bold] {stats.corpus_size:>8} inputs")
|
||||
|
||||
if stats.coverage is not None and stats.coverage > 0:
|
||||
lines.append(f" [bold]📈 Coverage[/bold] {stats.coverage:>8} edges")
|
||||
|
||||
lines.append(f" [bold]⏱️ Elapsed[/bold] {format_duration(stats.elapsed_time):>8}")
|
||||
|
||||
# Last crash info
|
||||
if stats.last_crash_time:
|
||||
time_since = datetime.now() - stats.last_crash_time
|
||||
crash_ago = format_duration(int(time_since.total_seconds()))
|
||||
lines.append(f" [bold red]🐛 Last Crash[/bold red] {crash_ago:>8} ago")
|
||||
|
||||
# Status line
|
||||
status_text = getattr(run_status, 'status', 'Unknown')
|
||||
current_time = datetime.now().strftime('%H:%M:%S')
|
||||
lines.append(f"\n[{status_color}]{status_emoji} Status: {status_text}[/{status_color}] | Last update: [dim]{current_time}[/dim] | Refresh: {refresh}s | [dim]Press Ctrl+C to stop[/dim]")
|
||||
|
||||
return "\n".join(lines)
|
||||
|
||||
# Fallback stats class
|
||||
class FallbackStats:
|
||||
def __init__(self, run_id):
|
||||
self.run_id = run_id
|
||||
self.workflow = "unknown"
|
||||
self.executions = 0
|
||||
self.executions_per_sec = 0.0
|
||||
self.crashes = 0
|
||||
self.unique_crashes = 0
|
||||
self.coverage = None
|
||||
self.corpus_size = 0
|
||||
self.elapsed_time = 0
|
||||
self.last_crash_time = None
|
||||
|
||||
with Live(auto_refresh=False, console=console) as live:
|
||||
# Initial fetch
|
||||
try:
|
||||
run_status = client.get_run_status(run_id)
|
||||
stats = client.get_fuzzing_stats(run_id)
|
||||
except Exception:
|
||||
# Minimal fallback stats
|
||||
class FallbackStats:
|
||||
def __init__(self, run_id):
|
||||
self.run_id = run_id
|
||||
self.workflow = "unknown"
|
||||
self.executions = 0
|
||||
self.executions_per_sec = 0.0
|
||||
self.crashes = 0
|
||||
self.unique_crashes = 0
|
||||
self.coverage = None
|
||||
self.corpus_size = 0
|
||||
self.elapsed_time = 0
|
||||
self.last_crash_time = None
|
||||
stats = FallbackStats(run_id)
|
||||
run_status = type("RS", (), {"status":"Unknown","is_completed":False,"is_failed":False})()
|
||||
|
||||
live.update(render_layout(run_status, stats), refresh=True)
|
||||
live.update(render_inline_stats(run_status, stats), refresh=True)
|
||||
|
||||
# Simple polling approach that actually works
|
||||
# Polling loop
|
||||
consecutive_errors = 0
|
||||
max_errors = 5
|
||||
|
||||
@@ -344,7 +343,7 @@ def _live_monitor(run_id: str, refresh: int):
|
||||
except Exception as e:
|
||||
consecutive_errors += 1
|
||||
if consecutive_errors >= max_errors:
|
||||
console.print(f"❌ Too many errors getting run status: {e}", style="red")
|
||||
console.print(f"\n❌ Too many errors getting run status: {e}", style="red")
|
||||
break
|
||||
time.sleep(refresh)
|
||||
continue
|
||||
@@ -352,18 +351,14 @@ def _live_monitor(run_id: str, refresh: int):
|
||||
# Try to get fuzzing stats
|
||||
try:
|
||||
stats = client.get_fuzzing_stats(run_id)
|
||||
except Exception as e:
|
||||
# Create fallback stats if not available
|
||||
except Exception:
|
||||
stats = FallbackStats(run_id)
|
||||
|
||||
# Update display
|
||||
live.update(render_layout(run_status, stats), refresh=True)
|
||||
live.update(render_inline_stats(run_status, stats), refresh=True)
|
||||
|
||||
# Check if completed
|
||||
if getattr(run_status, 'is_completed', False) or getattr(run_status, 'is_failed', False):
|
||||
# Show final state for a few seconds
|
||||
console.print("\n🏁 Run completed. Showing final state for 10 seconds...")
|
||||
time.sleep(10)
|
||||
break
|
||||
|
||||
# Wait before next poll
|
||||
@@ -372,17 +367,17 @@ def _live_monitor(run_id: str, refresh: int):
|
||||
except KeyboardInterrupt:
|
||||
raise
|
||||
except Exception as e:
|
||||
console.print(f"⚠️ Monitoring error: {e}", style="yellow")
|
||||
console.print(f"\n⚠️ Monitoring error: {e}", style="yellow")
|
||||
time.sleep(refresh)
|
||||
|
||||
# Completed status update
|
||||
final_message = (
|
||||
f"[bold]FuzzForge Live Monitor - COMPLETED[/bold]\n"
|
||||
f"Run: {run_id[:12]}... | Status: {run_status.status} | "
|
||||
f"Total runtime: {format_duration(int(time.time() - start_time))}"
|
||||
)
|
||||
style = "green" if getattr(run_status, 'is_completed', False) else "red"
|
||||
live.update(Panel(final_message, box=box.ROUNDED, style=style), refresh=True)
|
||||
# Final status
|
||||
final_status = getattr(run_status, 'status', 'Unknown')
|
||||
total_time = format_duration(int(time.time() - start_time))
|
||||
|
||||
if getattr(run_status, 'is_completed', False):
|
||||
console.print(f"\n✅ [bold green]Run completed successfully[/bold green] | Total runtime: {total_time}")
|
||||
else:
|
||||
console.print(f"\n⚠️ [bold yellow]Run ended[/bold yellow] | Status: {final_status} | Total runtime: {total_time}")
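The polling loop in `_live_monitor` gives up after `max_errors` consecutive status failures. A minimal, dependency-free sketch of that pattern follows. The names `poll_until_done` and `fake_fetch` are hypothetical, and resetting the counter after a successful poll is an assumption, since that part of the function is not visible in this diff.

```python
import time
from typing import Callable


def poll_until_done(
    fetch_status: Callable[[], str],
    interval: float = 0.0,
    max_errors: int = 5,
) -> str:
    """Poll until a terminal status, tolerating a bounded run of errors."""
    consecutive_errors = 0
    while True:
        try:
            status = fetch_status()
        except Exception:
            consecutive_errors += 1
            if consecutive_errors >= max_errors:
                return "unreachable"
            time.sleep(interval)
            continue
        # Assumption: a successful poll resets the error counter.
        consecutive_errors = 0
        if status in {"completed", "failed"}:
            return status
        time.sleep(interval)


# Hypothetical fetcher: fails twice, reports "running", then "completed".
responses = iter([RuntimeError("boom"), RuntimeError("boom"), "running", "completed"])


def fake_fetch() -> str:
    item = next(responses)
    if isinstance(item, Exception):
        raise item
    return item


print(poll_until_done(fake_fetch))  # completed
```

Counting only consecutive failures means a briefly unavailable backend does not end a long monitoring session, while a backend that never answers still terminates the loop.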
|
||||
|
||||
|
||||
@app.command("live")
|
||||
@@ -390,21 +385,18 @@ def live_monitor(
|
||||
run_id: str = typer.Argument(..., help="Run ID to monitor live"),
|
||||
refresh: int = typer.Option(
|
||||
2, "--refresh", "-r",
|
||||
help="Refresh interval in seconds (fallback when streaming unavailable)"
|
||||
help="Refresh interval in seconds"
|
||||
)
|
||||
):
|
||||
"""
|
||||
📺 Real-time monitoring dashboard with live updates (WebSocket/SSE with REST fallback)
|
||||
📺 Real-time inline monitoring with live statistics updates
|
||||
"""
|
||||
console.print(f"📺 [bold]Live Monitoring Dashboard[/bold]")
|
||||
console.print(f"Run: {run_id}")
|
||||
console.print(f"Press Ctrl+C to stop monitoring\n")
|
||||
try:
|
||||
_live_monitor(run_id, refresh)
|
||||
except KeyboardInterrupt:
|
||||
console.print("\n📊 Monitoring stopped by user.", style="yellow")
|
||||
console.print("\n\n📊 Monitoring stopped by user.", style="yellow")
|
||||
except Exception as e:
|
||||
console.print(f"❌ Failed to start live monitoring: {e}", style="red")
|
||||
console.print(f"\n❌ Failed to start live monitoring: {e}", style="red")
|
||||
raise typer.Exit(1)
|
||||
|
||||
|
||||
@@ -426,11 +418,11 @@ def monitor_callback(ctx: typer.Context):
|
||||
# Let the subcommand handle it
|
||||
return
|
||||
|
||||
# Show not implemented message for default command
|
||||
# Show help message for default command
|
||||
from rich.console import Console
|
||||
console = Console()
|
||||
console.print("🚧 [yellow]Monitor command is not fully implemented yet.[/yellow]")
|
||||
console.print("Please use specific subcommands:")
|
||||
console.print("📊 [bold cyan]Monitor Command[/bold cyan]")
|
||||
console.print("\nAvailable subcommands:")
|
||||
console.print(" • [cyan]ff monitor stats <run-id>[/cyan] - Show execution statistics")
|
||||
console.print(" • [cyan]ff monitor crashes <run-id>[/cyan] - Show crash reports")
|
||||
console.print(" • [cyan]ff monitor live <run-id>[/cyan] - Live monitoring dashboard")
|
||||
console.print(" • [cyan]ff monitor live <run-id>[/cyan] - Real-time inline monitoring")
|
||||
|
||||
@@ -115,7 +115,7 @@ def show_status():
|
||||
api_table.add_column("Property", style="bold cyan")
|
||||
api_table.add_column("Value")
|
||||
|
||||
api_table.add_row("Status", f"✅ Connected")
|
||||
api_table.add_row("Status", "✅ Connected")
|
||||
api_table.add_row("Service", f"{api_status.name} v{api_status.version}")
|
||||
api_table.add_row("Workflows", str(len(workflows)))
|
||||
|
||||
|
||||
@@ -24,27 +24,25 @@ import typer
|
||||
from rich.console import Console
|
||||
from rich.table import Table
|
||||
from rich.panel import Panel
|
||||
from rich.progress import Progress, SpinnerColumn, TextColumn, BarColumn, TaskProgressColumn
|
||||
from rich.prompt import Prompt, Confirm
|
||||
from rich.live import Live
|
||||
from rich import box
|
||||
|
||||
from ..config import get_project_config, FuzzForgeConfig
|
||||
from ..database import get_project_db, ensure_project_db, RunRecord
|
||||
from ..exceptions import (
|
||||
handle_error, retry_on_network_error, safe_json_load, require_project,
|
||||
APIConnectionError, ValidationError, DatabaseError, FileOperationError
|
||||
ValidationError, DatabaseError
|
||||
)
|
||||
from ..validation import (
|
||||
validate_run_id, validate_workflow_name, validate_target_path,
|
||||
validate_volume_mode, validate_parameters, validate_timeout
|
||||
validate_parameters, validate_timeout
|
||||
)
|
||||
from ..progress import progress_manager, spinner, step_progress
|
||||
from ..completion import WorkflowNameComplete, TargetPathComplete, VolumeModetComplete
|
||||
from ..progress import step_progress
|
||||
from ..constants import (
|
||||
STATUS_EMOJIS, MAX_RUN_ID_DISPLAY_LENGTH, DEFAULT_VOLUME_MODE,
|
||||
PROGRESS_STEP_DELAYS, MAX_RETRIES, RETRY_DELAY, POLL_INTERVAL
|
||||
)
|
||||
from ..worker_manager import WorkerManager
|
||||
from fuzzforge_sdk import FuzzForgeClient, WorkflowSubmission
|
||||
|
||||
console = Console()
|
||||
@@ -63,6 +61,47 @@ def status_emoji(status: str) -> str:
|
||||
return STATUS_EMOJIS.get(status.lower(), STATUS_EMOJIS["unknown"])
|
||||
|
||||
|
||||
def should_fail_build(sarif_data: Dict[str, Any], fail_on: str) -> bool:
    """
    Check if findings warrant build failure based on SARIF severity levels.

    Args:
        sarif_data: SARIF format findings data
        fail_on: Comma-separated SARIF levels (error,warning,note,info,all,none)

    Returns:
        True if build should fail, False otherwise
    """
    if fail_on == "none":
        return False

    # Parse fail_on parameter - accept SARIF levels
    if fail_on == "all":
        check_levels = {"error", "warning", "note", "info"}
    else:
        check_levels = {s.strip().lower() for s in fail_on.split(",")}

    # Validate levels (invalid entries are reported but do not abort the check)
    valid_levels = {"error", "warning", "note", "info", "none"}
    invalid = check_levels - valid_levels
    if invalid:
        console.print(f"⚠️ Invalid SARIF levels: {', '.join(invalid)}", style="yellow")
        console.print("Valid levels: error, warning, note, info, all, none")

    # Check SARIF results (only the first run is inspected)
    runs = sarif_data.get("runs", [])
    if not runs:
        return False

    results = runs[0].get("results", [])
    for result in results:
        level = result.get("level", "warning")  # SARIF 2.1.0 default level is "warning"
        if level in check_levels:
            return True

    return False
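To illustrate how a `--fail-on` value maps onto SARIF result levels, here is a small standalone sketch. It is not the function from this commit: `parse_fail_on` and `has_blocking_result` are hypothetical names, the sketch scans every run rather than only the first, and it treats a result without a `level` as `warning`.

```python
from typing import Any, Dict, Set


def parse_fail_on(fail_on: str) -> Set[str]:
    """Turn a comma-separated --fail-on value into a set of levels."""
    value = fail_on.strip().lower()
    if value == "none":
        return set()
    if value == "all":
        # Same set the diff uses for "all".
        return {"error", "warning", "note", "info"}
    return {part.strip() for part in value.split(",") if part.strip()}


def has_blocking_result(sarif: Dict[str, Any], fail_on: str) -> bool:
    """Return True if any result in any run has a level listed in fail_on."""
    levels = parse_fail_on(fail_on)
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # Assumption: a missing level is treated as "warning".
            if result.get("level", "warning") in levels:
                return True
    return False


sample = {
    "version": "2.1.0",
    "runs": [{"results": [{"ruleId": "demo-rule", "level": "warning"}]}],
}
print(has_blocking_result(sample, "error"))          # False
print(has_blocking_result(sample, "error,warning"))  # True
```

In a CI job the boolean would typically be turned into a non-zero exit code, which is what `raise typer.Exit(1)` does in `execute_workflow`.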
|
||||
|
||||
|
||||
def parse_inline_parameters(params: List[str]) -> Dict[str, Any]:
|
||||
"""Parse inline key=value parameters using improved validation"""
|
||||
return validate_parameters(params)
|
||||
@@ -77,17 +116,15 @@ def execute_workflow_submission(
|
||||
timeout: Optional[int],
|
||||
interactive: bool
|
||||
) -> Any:
|
||||
"""Handle the workflow submission process"""
|
||||
"""Handle the workflow submission process with file upload"""
|
||||
# Get workflow metadata for parameter validation
|
||||
console.print(f"🔧 Getting workflow information for: {workflow}")
|
||||
workflow_meta = client.get_workflow_metadata(workflow)
|
||||
param_response = client.get_workflow_parameters(workflow)
|
||||
|
||||
# Interactive parameter input
|
||||
if interactive and workflow_meta.parameters.get("properties"):
|
||||
properties = workflow_meta.parameters.get("properties", {})
|
||||
required_params = set(workflow_meta.parameters.get("required", []))
|
||||
defaults = param_response.defaults
|
||||
|
||||
missing_required = required_params - set(parameters.keys())
|
||||
|
||||
@@ -123,24 +160,10 @@ def execute_workflow_submission(
|
||||
except ValueError as e:
|
||||
console.print(f"❌ Invalid {param_type}: {e}", style="red")
|
||||
|
||||
# Validate volume mode
|
||||
validate_volume_mode(volume_mode)
|
||||
if volume_mode not in workflow_meta.supported_volume_modes:
|
||||
raise ValidationError(
|
||||
"volume mode", volume_mode,
|
||||
f"one of: {', '.join(workflow_meta.supported_volume_modes)}"
|
||||
)
|
||||
|
||||
# Create submission
|
||||
submission = WorkflowSubmission(
|
||||
target_path=target_path,
|
||||
volume_mode=volume_mode,
|
||||
parameters=parameters,
|
||||
timeout=timeout
|
||||
)
|
||||
# Note: volume_mode is no longer used (Temporal uses MinIO storage)
|
||||
|
||||
# Show submission summary
|
||||
console.print(f"\n🎯 [bold]Executing workflow:[/bold]")
|
||||
console.print("\n🎯 [bold]Executing workflow:[/bold]")
|
||||
console.print(f" Workflow: {workflow}")
|
||||
console.print(f" Target: {target_path}")
|
||||
console.print(f" Volume Mode: {volume_mode}")
|
||||
@@ -149,6 +172,22 @@ def execute_workflow_submission(
|
||||
if timeout:
|
||||
console.print(f" Timeout: {timeout}s")
|
||||
|
||||
# Check if target path exists locally
|
||||
target_path_obj = Path(target_path)
|
||||
use_upload = target_path_obj.exists()
|
||||
|
||||
if use_upload:
|
||||
# Show file/directory info
|
||||
if target_path_obj.is_dir():
|
||||
num_files = sum(1 for _ in target_path_obj.rglob("*") if _.is_file())
|
||||
console.print(f" Upload: Directory with {num_files} files")
|
||||
else:
|
||||
size_mb = target_path_obj.stat().st_size / (1024 * 1024)
|
||||
console.print(f" Upload: File ({size_mb:.2f} MB)")
|
||||
else:
|
||||
console.print(" [yellow]⚠️ Warning: Target path does not exist locally[/yellow]")
|
||||
console.print(" [yellow] Attempting to use path-based submission (backend must have access)[/yellow]")
|
||||
|
||||
# Only ask for confirmation in interactive mode
|
||||
if interactive:
|
||||
if not Confirm.ask("\nExecute workflow?", default=True, console=console):
|
||||
@@ -160,32 +199,74 @@ def execute_workflow_submission(
|
||||
# Submit the workflow with enhanced progress
|
||||
console.print(f"\n🚀 Executing workflow: [bold yellow]{workflow}[/bold yellow]")
|
||||
|
||||
steps = [
|
||||
"Validating workflow configuration",
|
||||
"Connecting to FuzzForge API",
|
||||
"Uploading parameters and settings",
|
||||
"Creating workflow deployment",
|
||||
"Initializing execution environment"
|
||||
]
|
||||
if use_upload:
|
||||
# Use new upload-based submission
|
||||
steps = [
|
||||
"Validating workflow configuration",
|
||||
"Creating tarball (if directory)",
|
||||
"Uploading target to backend",
|
||||
"Starting workflow execution",
|
||||
"Initializing execution environment"
|
||||
]
|
||||
|
||||
with step_progress(steps, f"Executing {workflow}") as progress:
|
||||
progress.next_step() # Validating
|
||||
time.sleep(PROGRESS_STEP_DELAYS["validating"])
|
||||
with step_progress(steps, f"Executing {workflow}") as progress:
|
||||
progress.next_step() # Validating
|
||||
time.sleep(PROGRESS_STEP_DELAYS["validating"])
|
||||
|
||||
progress.next_step() # Connecting
|
||||
time.sleep(PROGRESS_STEP_DELAYS["connecting"])
|
||||
progress.next_step() # Creating tarball
|
||||
time.sleep(PROGRESS_STEP_DELAYS["connecting"])
|
||||
|
||||
progress.next_step() # Uploading
|
||||
response = client.submit_workflow(workflow, submission)
|
||||
time.sleep(PROGRESS_STEP_DELAYS["uploading"])
|
||||
progress.next_step() # Uploading
|
||||
# Use the new upload method
|
||||
response = client.submit_workflow_with_upload(
|
||||
workflow_name=workflow,
|
||||
target_path=target_path,
|
||||
parameters=parameters,
|
||||
timeout=timeout
|
||||
)
|
||||
time.sleep(PROGRESS_STEP_DELAYS["uploading"])
|
||||
|
||||
progress.next_step() # Creating deployment
|
||||
time.sleep(PROGRESS_STEP_DELAYS["creating"])
|
||||
progress.next_step() # Starting
|
||||
time.sleep(PROGRESS_STEP_DELAYS["creating"])
|
||||
|
||||
progress.next_step() # Initializing
|
||||
time.sleep(PROGRESS_STEP_DELAYS["initializing"])
|
||||
progress.next_step() # Initializing
|
||||
time.sleep(PROGRESS_STEP_DELAYS["initializing"])
|
||||
|
||||
progress.complete(f"Workflow started successfully!")
|
||||
progress.complete("Workflow started successfully!")
|
||||
else:
|
||||
# Fall back to path-based submission (for backward compatibility)
|
||||
steps = [
|
||||
"Validating workflow configuration",
|
||||
"Connecting to FuzzForge API",
|
||||
"Submitting workflow parameters",
|
||||
"Creating workflow deployment",
|
||||
"Initializing execution environment"
|
||||
]
|
||||
|
||||
with step_progress(steps, f"Executing {workflow}") as progress:
|
||||
progress.next_step() # Validating
|
||||
time.sleep(PROGRESS_STEP_DELAYS["validating"])
|
||||
|
||||
progress.next_step() # Connecting
|
||||
time.sleep(PROGRESS_STEP_DELAYS["connecting"])
|
||||
|
||||
progress.next_step() # Submitting
|
||||
submission = WorkflowSubmission(
|
||||
target_path=target_path,
|
||||
volume_mode=volume_mode,
|
||||
parameters=parameters,
|
||||
timeout=timeout
|
||||
)
|
||||
response = client.submit_workflow(workflow, submission)
|
||||
time.sleep(PROGRESS_STEP_DELAYS["uploading"])
|
||||
|
||||
progress.next_step() # Creating deployment
|
||||
time.sleep(PROGRESS_STEP_DELAYS["creating"])
|
||||
|
||||
progress.next_step() # Initializing
|
||||
time.sleep(PROGRESS_STEP_DELAYS["initializing"])
|
||||
|
||||
progress.complete("Workflow started successfully!")
|
||||
|
||||
return response
|
||||
|
||||
@@ -219,6 +300,22 @@ def execute_workflow(
|
||||
live: bool = typer.Option(
|
||||
False, "--live", "-l",
|
||||
help="Start live monitoring after execution (useful for fuzzing workflows)"
|
||||
),
|
||||
auto_start: Optional[bool] = typer.Option(
|
||||
None, "--auto-start/--no-auto-start",
|
||||
help="Automatically start required worker if not running (default: from config)"
|
||||
),
|
||||
auto_stop: Optional[bool] = typer.Option(
|
||||
None, "--auto-stop/--no-auto-stop",
|
||||
help="Automatically stop worker after execution completes (default: from config)"
|
||||
),
|
||||
fail_on: Optional[str] = typer.Option(
|
||||
None, "--fail-on",
|
||||
help="Fail build if findings match SARIF level (error,warning,note,info,all,none). Use with --wait"
|
||||
),
|
||||
export_sarif: Optional[str] = typer.Option(
|
||||
None, "--export-sarif",
|
||||
help="Export SARIF results to file after completion. Use with --wait"
|
||||
)
|
||||
):
|
||||
"""
|
||||
@@ -226,6 +323,8 @@ def execute_workflow(
|
||||
|
||||
Use --live for fuzzing workflows to see real-time progress.
|
||||
Use --wait to wait for completion without live dashboard.
|
||||
Use --fail-on with --wait to fail CI builds based on finding severity.
|
||||
Use --export-sarif with --wait to export SARIF findings to a file.
|
||||
"""
|
||||
try:
|
||||
# Validate inputs
|
||||
@@ -261,14 +360,60 @@ def execute_workflow(
|
||||
except Exception as e:
|
||||
handle_error(e, "parsing parameters")
|
||||
|
||||
# Get config for worker management settings
|
||||
config = get_project_config() or FuzzForgeConfig()
|
||||
should_auto_start = auto_start if auto_start is not None else config.workers.auto_start_workers
|
||||
should_auto_stop = auto_stop if auto_stop is not None else config.workers.auto_stop_workers
|
||||
|
||||
worker_container = None # Track for cleanup
|
||||
worker_mgr = None
|
||||
wait_completed = False # Track if wait completed successfully
|
||||
|
||||
try:
|
||||
with get_client() as client:
|
||||
# Get worker information for this workflow
|
||||
try:
|
||||
console.print(f"🔍 Checking worker requirements for: {workflow}")
|
||||
worker_info = client.get_workflow_worker_info(workflow)
|
||||
|
||||
# Initialize worker manager
|
||||
compose_file = config.workers.docker_compose_file
|
||||
worker_mgr = WorkerManager(
|
||||
compose_file=Path(compose_file) if compose_file else None,
|
||||
startup_timeout=config.workers.worker_startup_timeout
|
||||
)
|
||||
|
||||
# Ensure worker is running
|
||||
worker_container = worker_info["worker_container"]
|
||||
if not worker_mgr.ensure_worker_running(worker_info, auto_start=should_auto_start):
|
||||
console.print(
|
||||
f"❌ Worker not available: {worker_info['vertical']}",
|
||||
style="red"
|
||||
)
|
||||
console.print(
|
||||
f"💡 Start the worker manually: docker-compose start {worker_container}"
|
||||
)
|
||||
raise typer.Exit(1)
|
||||
|
||||
except typer.Exit:
|
||||
raise # Re-raise Exit to preserve exit code
|
||||
except Exception as e:
|
||||
# If we can't get worker info, warn but continue (might be old backend)
|
||||
console.print(
|
||||
f"⚠️ Could not check worker requirements: {e}",
|
||||
style="yellow"
|
||||
)
|
||||
console.print(
|
||||
" Continuing without worker management...",
|
||||
style="yellow"
|
||||
)
|
||||
|
||||
response = execute_workflow_submission(
|
||||
client, workflow, target_path, parameters,
|
||||
volume_mode, timeout, interactive
|
||||
)
|
||||
|
||||
console.print(f"✅ Workflow execution started!", style="green")
|
||||
console.print("✅ Workflow execution started!", style="green")
|
||||
console.print(f" Execution ID: [bold cyan]{response.run_id}[/bold cyan]")
|
||||
console.print(f" Status: {status_emoji(response.status)} {response.status}")
|
||||
|
||||
@@ -288,22 +433,22 @@ def execute_workflow(
|
||||
# Don't fail the whole operation if database save fails
|
||||
console.print(f"⚠️ Failed to save execution to database: {e}", style="yellow")
|
||||
|
||||
console.print(f"\n💡 Monitor progress: [bold cyan]fuzzforge monitor {response.run_id}[/bold cyan]")
|
||||
console.print(f"\n💡 Monitor progress: [bold cyan]fuzzforge monitor stats {response.run_id}[/bold cyan]")
|
||||
console.print(f"💡 Check status: [bold cyan]fuzzforge workflow status {response.run_id}[/bold cyan]")
|
||||
|
||||
# Suggest --live for fuzzing workflows
|
||||
if not live and not wait and "fuzzing" in workflow.lower():
|
||||
console.print(f"💡 Next time try: [bold cyan]fuzzforge workflow {workflow} {target_path} --live[/bold cyan] for real-time fuzzing dashboard", style="dim")
|
||||
console.print(f"💡 Next time try: [bold cyan]fuzzforge workflow {workflow} {target_path} --live[/bold cyan] for real-time monitoring", style="dim")
|
||||
|
||||
# Start live monitoring if requested
|
||||
if live:
|
||||
# Check if this is a fuzzing workflow to show appropriate messaging
|
||||
is_fuzzing = "fuzzing" in workflow.lower()
|
||||
if is_fuzzing:
|
||||
console.print(f"\n📺 Starting live fuzzing dashboard...")
|
||||
console.print("\n📺 Starting live fuzzing monitor...")
|
||||
console.print("💡 You'll see real-time crash discovery, execution stats, and coverage data.")
|
||||
else:
|
||||
console.print(f"\n📺 Starting live monitoring dashboard...")
|
||||
console.print("\n📺 Starting live monitoring...")
|
||||
|
||||
console.print("Press Ctrl+C to stop monitoring (execution continues in background).\n")
|
||||
|
||||
@@ -312,14 +457,14 @@ def execute_workflow(
|
||||
# Import monitor command and run it
|
||||
live_monitor(response.run_id, refresh=3)
|
||||
except KeyboardInterrupt:
|
||||
console.print(f"\n⏹️ Live monitoring stopped (execution continues in background)", style="yellow")
|
||||
console.print("\n⏹️ Live monitoring stopped (execution continues in background)", style="yellow")
|
||||
except Exception as e:
|
||||
console.print(f"⚠️ Failed to start live monitoring: {e}", style="yellow")
|
||||
console.print(f"💡 You can still monitor manually: [bold cyan]fuzzforge monitor {response.run_id}[/bold cyan]")
|
||||
|
||||
# Wait for completion if requested
|
||||
elif wait:
|
||||
console.print(f"\n⏳ Waiting for execution to complete...")
|
||||
console.print("\n⏳ Waiting for execution to complete...")
|
||||
try:
|
||||
final_status = client.wait_for_completion(response.run_id, poll_interval=POLL_INTERVAL)
|
||||
|
||||
@@ -334,17 +479,63 @@ def execute_workflow(
|
||||
console.print(f"⚠️ Failed to update database: {e}", style="yellow")
|
||||
|
||||
console.print(f"🏁 Execution completed with status: {status_emoji(final_status.status)} {final_status.status}")
|
||||
wait_completed = True # Mark wait as completed
|
||||
|
||||
if final_status.is_completed:
|
||||
console.print(f"💡 View findings: [bold cyan]fuzzforge findings {response.run_id}[/bold cyan]")
|
||||
# Export SARIF if requested
|
||||
if export_sarif:
|
||||
try:
|
||||
console.print("\n📤 Exporting SARIF results...")
|
||||
findings = client.get_run_findings(response.run_id)
|
||||
output_path = Path(export_sarif)
|
||||
with open(output_path, 'w') as f:
|
||||
json.dump(findings.sarif, f, indent=2)
|
||||
console.print(f"✅ SARIF exported to: [bold cyan]{output_path}[/bold cyan]")
|
||||
except Exception as e:
|
||||
console.print(f"⚠️ Failed to export SARIF: {e}", style="yellow")
|
||||
|
||||
# Check if build should fail based on findings
|
||||
if fail_on:
|
||||
try:
|
||||
console.print(f"\n🔍 Checking findings against severity threshold: {fail_on}")
|
||||
findings = client.get_run_findings(response.run_id)
|
||||
if should_fail_build(findings.sarif, fail_on):
|
||||
console.print("❌ [bold red]Build failed: Found blocking security issues[/bold red]")
|
||||
console.print(f"💡 View details: [bold cyan]fuzzforge findings {response.run_id}[/bold cyan]")
|
||||
raise typer.Exit(1)
|
||||
else:
|
||||
console.print("✅ [bold green]No blocking security issues found[/bold green]")
|
||||
except typer.Exit:
|
||||
raise # Re-raise Exit to preserve exit code
|
||||
except Exception as e:
|
||||
console.print(f"⚠️ Failed to check findings: {e}", style="yellow")
|
||||
|
||||
if not fail_on and not export_sarif:
|
||||
console.print(f"💡 View findings: [bold cyan]fuzzforge findings {response.run_id}[/bold cyan]")
|
||||
|
||||
except KeyboardInterrupt:
|
||||
console.print(f"\n⏹️ Monitoring cancelled (execution continues in background)", style="yellow")
|
||||
console.print("\n⏹️ Monitoring cancelled (execution continues in background)", style="yellow")
|
||||
except typer.Exit:
|
||||
raise # Re-raise Exit to preserve exit code
|
||||
except Exception as e:
|
||||
handle_error(e, "waiting for completion")
|
||||
|
||||
except typer.Exit:
|
||||
raise # Re-raise Exit to preserve exit code
|
||||
except Exception as e:
|
||||
handle_error(e, "executing workflow")
|
||||
finally:
|
||||
# Stop worker if auto-stop is enabled and wait completed
|
||||
if should_auto_stop and worker_container and worker_mgr and wait_completed:
|
||||
try:
|
||||
console.print("\n🛑 Stopping worker (auto-stop enabled)...")
|
||||
if worker_mgr.stop_worker(worker_container):
|
||||
console.print(f"✅ Worker stopped: {worker_container}")
|
||||
except Exception as e:
|
||||
console.print(
|
||||
f"⚠️ Failed to stop worker: {e}",
|
||||
style="yellow"
|
||||
)
|
||||
|
||||
|
||||
@app.command("status")
|
||||
@@ -409,7 +600,7 @@ def workflow_status(
|
||||
console.print(
|
||||
Panel.fit(
|
||||
status_table,
|
||||
title=f"📊 Status Information",
|
||||
title="📊 Status Information",
|
||||
box=box.ROUNDED
|
||||
)
|
||||
)
|
||||
@@ -479,7 +670,7 @@ def workflow_history(
|
||||
console.print()
|
||||
console.print(table)
|
||||
|
||||
console.print(f"\n💡 Use [bold cyan]fuzzforge workflow status <execution-id>[/bold cyan] for detailed status")
|
||||
console.print("\n💡 Use [bold cyan]fuzzforge workflow status <execution-id>[/bold cyan] for detailed status")
|
||||
|
||||
except Exception as e:
|
||||
handle_error(e, "listing execution history")
|
||||
@@ -527,7 +718,7 @@ def retry_workflow(
|
||||
|
||||
# Modify parameters if requested
|
||||
if modify_params and parameters:
|
||||
console.print(f"\n📝 [bold]Current parameters:[/bold]")
|
||||
console.print("\n📝 [bold]Current parameters:[/bold]")
|
||||
for key, value in parameters.items():
|
||||
new_value = Prompt.ask(
|
||||
f"{key}",
|
||||
@@ -559,7 +750,7 @@ def retry_workflow(
|
||||
|
||||
response = client.submit_workflow(original_run.workflow, submission)
|
||||
|
||||
console.print(f"\n✅ Retry submitted successfully!", style="green")
|
||||
console.print("\n✅ Retry submitted successfully!", style="green")
|
||||
console.print(f" New Execution ID: [bold cyan]{response.run_id}[/bold cyan]")
|
||||
console.print(f" Status: {status_emoji(response.status)} {response.status}")
|
||||
|
||||
@@ -578,7 +769,7 @@ def retry_workflow(
|
||||
except Exception as e:
|
||||
console.print(f"⚠️ Failed to save execution to database: {e}", style="yellow")
|
||||
|
||||
console.print(f"\n💡 Monitor progress: [bold cyan]fuzzforge monitor {response.run_id}[/bold cyan]")
|
||||
console.print(f"\n💡 Monitor progress: [bold cyan]fuzzforge monitor stats {response.run_id}[/bold cyan]")
|
||||
|
||||
except Exception as e:
|
||||
handle_error(e, "retrying workflow")
|
||||
|
||||
@@ -18,10 +18,10 @@ import typer
|
||||
from rich.console import Console
|
||||
from rich.table import Table
|
||||
from rich.panel import Panel
|
||||
from rich.prompt import Prompt, Confirm
|
||||
from rich.prompt import Prompt
|
||||
from rich.syntax import Syntax
|
||||
from rich import box
|
||||
from typing import Optional, Dict, Any
|
||||
from typing import Optional
|
||||
|
||||
from ..config import get_project_config, FuzzForgeConfig
|
||||
from ..fuzzy import enhanced_workflow_not_found_handler
|
||||
@@ -68,7 +68,7 @@ def list_workflows():
|
||||
console.print(f"\n🔧 [bold]Available Workflows ({len(workflows)})[/bold]\n")
|
||||
console.print(table)
|
||||
|
||||
console.print(f"\n💡 Use [bold cyan]fuzzforge workflows info <name>[/bold cyan] for detailed information")
|
||||
console.print("\n💡 Use [bold cyan]fuzzforge workflows info <name>[/bold cyan] for detailed information")
|
||||
|
||||
except Exception as e:
|
||||
console.print(f"❌ Failed to fetch workflows: {e}", style="red")
|
||||
@@ -100,7 +100,6 @@ def workflow_info(
|
||||
info_table.add_row("Author", workflow.author)
|
||||
if workflow.tags:
|
||||
info_table.add_row("Tags", ", ".join(workflow.tags))
|
||||
info_table.add_row("Volume Modes", ", ".join(workflow.supported_volume_modes))
|
||||
info_table.add_row("Custom Docker", "✅ Yes" if workflow.has_custom_docker else "❌ No")
|
||||
|
||||
console.print(
|
||||
@@ -193,7 +192,7 @@ def workflow_parameters(
|
||||
parameters = {}
|
||||
properties = workflow.parameters.get("properties", {})
|
||||
required_params = set(workflow.parameters.get("required", []))
|
||||
defaults = param_response.defaults
|
||||
defaults = param_response.default_parameters
|
||||
|
||||
if interactive:
|
||||
console.print("🔧 Enter parameter values (press Enter for default):\n")
|
||||
|
||||
@@ -16,7 +16,7 @@ Provides intelligent tab completion for commands, workflows, run IDs, and parame
|
||||
|
||||
|
||||
import typer
|
||||
from typing import List, Optional
|
||||
from typing import List
|
||||
from pathlib import Path
|
||||
|
||||
from .config import get_project_config, FuzzForgeConfig
|
||||
|
||||
@@ -66,6 +66,15 @@ class PreferencesConfig(BaseModel):
|
||||
color_output: bool = True
|
||||
|
||||
|
||||
class WorkerConfig(BaseModel):
    """Worker lifecycle management configuration."""

    auto_start_workers: bool = True
    auto_stop_workers: bool = False
    worker_startup_timeout: int = 60
    docker_compose_file: Optional[str] = None
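These defaults feed the `--auto-start/--no-auto-start` and `--auto-stop/--no-auto-stop` options of `execute_workflow`, where an explicit flag overrides the configured value. A minimal sketch of that precedence rule, assuming a plain dataclass as a stand-in for the pydantic model (`WorkerSettings` and `resolve_flag` are hypothetical names):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WorkerSettings:
    """Stand-in for WorkerConfig, using a dataclass to stay dependency-free."""
    auto_start_workers: bool = True
    auto_stop_workers: bool = False


def resolve_flag(cli_value: Optional[bool], config_value: bool) -> bool:
    """An explicit CLI flag wins; None means the flag was not passed."""
    return cli_value if cli_value is not None else config_value


settings = WorkerSettings()
print(resolve_flag(None, settings.auto_start_workers))   # True
print(resolve_flag(False, settings.auto_start_workers))  # False
print(resolve_flag(True, settings.auto_stop_workers))    # True
```

Using `None` as the option default is what lets the CLI tell "flag not given" apart from an explicit `--no-auto-start`.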
|
||||
|
||||
|
||||
class CogneeConfig(BaseModel):
|
||||
"""Cognee integration metadata."""
|
||||
|
||||
@@ -84,6 +93,7 @@ class FuzzForgeConfig(BaseModel):
|
||||
project: ProjectConfig = Field(default_factory=ProjectConfig)
|
||||
retention: RetentionConfig = Field(default_factory=RetentionConfig)
|
||||
preferences: PreferencesConfig = Field(default_factory=PreferencesConfig)
|
||||
workers: WorkerConfig = Field(default_factory=WorkerConfig)
|
||||
cognee: CogneeConfig = Field(default_factory=CogneeConfig)
|
||||
|
||||
@classmethod
|
||||
|
||||
@@ -163,7 +163,7 @@ class FuzzForgeDatabase:
|
||||
"Database is corrupted. Use 'ff init --force' to reset."
|
||||
) from e
|
||||
raise
|
||||
except Exception as e:
|
||||
except Exception:
|
||||
if conn:
|
||||
try:
|
||||
conn.rollback()
|
||||
|
||||
@@ -15,7 +15,7 @@ Enhanced exception handling and error utilities for FuzzForge CLI with rich cont
|
||||
|
||||
import time
|
||||
import functools
|
||||
from typing import Any, Callable, Optional, Type, Union, List
|
||||
from typing import Any, Callable, Optional, Union, List
|
||||
from pathlib import Path
|
||||
|
||||
import typer
|
||||
@@ -24,20 +24,10 @@ from rich.console import Console
|
||||
from rich.panel import Panel
|
||||
from rich.text import Text
|
||||
from rich.table import Table
|
||||
from rich.columns import Columns
|
||||
from rich.syntax import Syntax
|
||||
from rich.markdown import Markdown
|
||||
|
||||
# Import SDK exceptions for rich handling
|
||||
from fuzzforge_sdk.exceptions import (
|
||||
FuzzForgeError as SDKFuzzForgeError,
|
||||
FuzzForgeHTTPError,
|
||||
DeploymentError,
|
||||
WorkflowExecutionError,
|
||||
ContainerError,
|
||||
VolumeError,
|
||||
ValidationError as SDKValidationError,
|
||||
ConnectionError as SDKConnectionError
|
||||
FuzzForgeError as SDKFuzzForgeError
|
||||
)
|
||||
|
||||
console = Console()
|
||||
@@ -335,7 +325,7 @@ def handle_error(error: Exception, context: str = "") -> None:
|
||||
|
||||
# Show error details for debugging
|
||||
console.print(f"\n[dim yellow]Error type: {type(error).__name__}[/dim yellow]")
|
||||
console.print(f"[dim yellow]Please report this issue if it persists[/dim yellow]")
|
||||
console.print("[dim yellow]Please report this issue if it persists[/dim yellow]")
|
||||
console.print()
|
||||
|
||||
raise typer.Exit(1)
|
||||
@@ -430,8 +420,9 @@ def validate_run_id(run_id: str) -> str:
|
||||
if not run_id or len(run_id) < 8:
|
||||
raise ValidationError("run_id", run_id, "at least 8 characters")
|
||||
|
||||
if not run_id.replace('-', '').isalnum():
|
||||
raise ValidationError("run_id", run_id, "alphanumeric characters and hyphens only")
|
||||
# Allow alphanumeric characters, hyphens, and underscores
|
||||
if not run_id.replace('-', '').replace('_', '').isalnum():
|
||||
raise ValidationError("run_id", run_id, "alphanumeric characters, hyphens, and underscores only")
|
||||
|
||||
return run_id
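The relaxed check now also accepts underscores in run IDs. A standalone sketch of the same rule follows, raising `ValueError` in place of the CLI's own `ValidationError`; `check_run_id` is a hypothetical name and the sample IDs are made up for illustration.

```python
def check_run_id(run_id: str) -> str:
    """Accept IDs of 8+ characters made of letters, digits, '-' and '_'."""
    if not run_id or len(run_id) < 8:
        raise ValueError("run_id must be at least 8 characters")
    # Strip the two allowed separators, then require alphanumerics only.
    if not run_id.replace("-", "").replace("_", "").isalnum():
        raise ValueError(
            "run_id may contain only alphanumeric characters, hyphens, and underscores"
        )
    return run_id


for candidate in ("security_assessment-1a2b3c", "short", "bad id 12345"):
    try:
        print("accepted:", check_run_id(candidate))
    except ValueError as exc:
        print("rejected:", exc)
```

An ID made up solely of separators, such as `--------`, is also rejected, because an empty string is not alphanumeric.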
|
||||
|
||||
|
||||
Some files were not shown because too many files have changed in this diff