Compare commits

...

19 Commits

Author SHA1 Message Date
tduhamel42
9ea4d66586 fix: update license badge to BSL 1.1 and add roadmap section to README 2026-02-10 18:36:30 +01:00
tduhamel42
ec16b37410 Merge fuzzforge-ai-new-version: complete rewrite with MCP-native module architecture
Fuzzforge ai new version
2026-02-10 18:28:46 +01:00
AFredefon
66a10d1bc4 docs: add ROADMAP.md with planned features 2026-02-09 10:36:33 +01:00
AFredefon
48ad2a59af refactor(modules): rename metadata fields and use natural 2026-02-09 10:17:16 +01:00
AFredefon
8b8662d7af feat(modules): add harness-tester module for Rust fuzzing pipeline 2026-02-03 18:12:28 +01:00
AFredefon
f099bd018d chore(modules): remove redundant harness-validator module 2026-02-03 18:12:20 +01:00
tduhamel42
d786c6dab1 fix: block Podman on macOS and remove ghcr.io default (#39)
* fix: block Podman on macOS and remove ghcr.io default

- Add platform check in PodmanCLI.__init__() that raises FuzzForgeError
  on macOS with instructions to use Docker instead
- Change RegistrySettings.url default from "ghcr.io/fuzzinglabs" to ""
  (empty string) for local-only mode since no images are published yet
- Update _ensure_module_image() to show helpful error when image not
  found locally and no registry configured
- Update tests to mock Linux platform for Podman tests
- Add root ruff.toml to fix broken configuration in fuzzforge-runner

* rewrite guides for module architecture and update repo links

---------

Co-authored-by: AFredefon <antoinefredefon@yahoo.fr>
2026-02-03 10:15:16 +01:00
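A minimal sketch of the platform guard this commit describes, assuming the `PodmanCLI` and `FuzzForgeError` names from the message (the actual implementation may differ):

```python
import platform


class FuzzForgeError(Exception):
    """Assumed FuzzForge error type; the real class lives elsewhere in the codebase."""


class PodmanCLI:
    def __init__(self) -> None:
        # Block Podman on macOS and point users to Docker instead,
        # as described in the commit message above.
        if platform.system() == "Darwin":
            raise FuzzForgeError(
                "Podman is not supported on macOS. Use Docker as the container engine instead."
            )
```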
AFredefon
e72c5fb201 docs: fix USAGE.md setup instructions for new users 2026-01-30 13:55:48 +01:00
AFredefon
404c89a742 feat: make Docker the default container engine 2026-01-30 13:20:03 +01:00
AFredefon
aea50ac42a fix: use SNAP detection for podman storage, update tests for OSS 2026-01-30 11:59:40 +01:00
AFredefon
5d300e5366 fix: remove enterprise SDK references from OSS tests 2026-01-30 10:36:33 +01:00
AFredefon
1186f57a5c chore: remove old fuzzforge_ai files 2026-01-30 10:06:21 +01:00
AFredefon
9a97cc0f31 merge old fuzzforge_ai for cleanup 2026-01-30 10:02:49 +01:00
AFredefon
b46f050aef feat: FuzzForge AI - complete rewrite for OSS release 2026-01-30 09:57:48 +01:00
vhash
50ffad46a4 fix: broken links (#35)
move fuzzinglabs.io to fuzzinglabs.ai
2025-11-14 09:44:57 +01:00
Steve
83244ee537 Fix Discord link in README.md (#34) 2025-11-06 11:11:03 +01:00
Songbird99
e1b0b1b178 Support flexible A2A agent registration and fix redirects (#33)
- Accept direct .json URLs (e.g., http://host/.well-known/agent-card.json)
- Accept base agent URLs (e.g., http://host/a2a/sentinel)
- Extract canonical URL from agent card response
- Try both agent-card.json and agent.json for compatibility
- Follow HTTP redirects for POST requests (fixes 307 redirects)
- Remove trailing slash from POST endpoint to avoid redirect loops

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <noreply@anthropic.com>
2025-11-06 11:08:05 +01:00
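A rough sketch of the URL handling described above; the helper name and return shape are illustrative, not the project's actual API:

```python
def candidate_agent_card_urls(url: str) -> list[str]:
    """Build the agent-card URLs to try for a registration URL (illustrative only)."""
    if url.endswith(".json"):
        # Direct card URL, e.g. http://host/.well-known/agent-card.json
        return [url]
    base = url.rstrip("/")  # trimming also avoids trailing-slash redirect loops on POST
    # Try both filenames for compatibility, as the commit describes.
    return [
        f"{base}/.well-known/agent-card.json",
        f"{base}/.well-known/agent.json",
    ]
```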
tduhamel42
943bc9a114 Release v0.7.3 - Android workflows, LiteLLM integration, ARM64 support (#32)
* ci: add worker validation and Docker build checks

Add automated validation to prevent worker-related issues:

**Worker Validation Script:**
- New script: .github/scripts/validate-workers.sh
- Validates all workers in docker-compose.yml exist
- Checks required files: Dockerfile, requirements.txt, worker.py
- Verifies files are tracked by git (not gitignored)
- Detects gitignore issues that could hide workers

**CI Workflow Updates:**
- Added validate-workers job (runs on every PR)
- Added build-workers job (runs if workers/ modified)
- Uses Docker Buildx for caching
- Validates Docker images build successfully
- Updated test-summary to check validation results

**PR Template:**
- New pull request template with comprehensive checklist
- Specific section for worker-related changes
- Reminds contributors to validate worker files
- Includes documentation and changelog reminders

These checks would have caught the secrets worker gitignore issue.

Implements Phase 1 improvements from CI/CD quality assessment.
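The validation script itself is a bash script; the following is only an illustrative Python equivalent of the file checks it performs (paths and structure are assumptions):

```python
from pathlib import Path

import yaml  # PyYAML


REQUIRED_FILES = ("Dockerfile", "requirements.txt", "worker.py")


def validate_workers(compose_file: str = "docker-compose.yml") -> list[str]:
    """Report worker services whose directories are missing required files (sketch)."""
    problems = []
    compose = yaml.safe_load(Path(compose_file).read_text())
    for name, service in compose.get("services", {}).items():
        build = service.get("build")
        context = build.get("context") if isinstance(build, dict) else build
        if not context or not str(context).startswith("workers/"):
            continue  # only worker services are checked here
        for filename in REQUIRED_FILES:
            if not (Path(context) / filename).is_file():
                problems.append(f"{name}: missing {filename}")
    return problems
```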

* fix: add dev branch to test workflow triggers

The test workflow was configured for 'develop' but the actual branch is named 'dev'.
This caused tests not to run on PRs to dev branch.

Now tests will run on:
- PRs to: main, master, dev, develop
- Pushes to: main, master, dev, develop, feature/**

* fix: properly detect worker file changes in CI

The previous condition used an invalid GitHub context field.
Now uses git diff to properly detect changes to workers/ or docker-compose.yml.

Behavior:
- Job always runs the check step
- Detects if workers/ or docker-compose.yml modified
- Only builds Docker images if workers actually changed
- Shows clear skip message when no worker changes detected

* feat: Add Python SAST workflow with three security analysis tools

Implements Issue #5 - Python SAST workflow that combines:
- Dependency scanning (pip-audit) for CVE detection
- Security linting (Bandit) for vulnerability patterns
- Type checking (Mypy) for type safety issues

## Changes

**New Modules:**
- `DependencyScanner`: Scans Python dependencies for known CVEs using pip-audit
- `BanditAnalyzer`: Analyzes Python code for security issues using Bandit
- `MypyAnalyzer`: Checks Python code for type safety issues using Mypy

**New Workflow:**
- `python_sast`: Temporal workflow that orchestrates all three SAST tools
  - Runs tools in parallel for fast feedback (3-5 min vs hours for fuzzing)
  - Generates unified SARIF report with findings from all tools
  - Supports configurable severity/confidence thresholds

**Updates:**
- Added SAST dependencies to Python worker (bandit, pip-audit, mypy)
- Updated module __init__.py files to export new analyzers
- Added type_errors.py test file to vulnerable_app for Mypy validation

## Testing

Workflow tested successfully on vulnerable_app:
- Bandit: Detected 9 security issues (command injection, unsafe functions)
- Mypy: Detected 5 type errors
- DependencyScanner: Ran successfully (no CVEs in test dependencies)
- SARIF export: Generated valid SARIF with 14 total findings
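A simplified sketch of the parallel orchestration described above; the stub functions are placeholders rather than the real module or activity APIs:

```python
import asyncio


async def run_pip_audit(path: str) -> list[dict]:
    return []  # placeholder: invoke pip-audit and parse its JSON output


async def run_bandit(path: str) -> list[dict]:
    return []  # placeholder: invoke Bandit and normalize its findings


async def run_mypy(path: str) -> list[dict]:
    return []  # placeholder: invoke Mypy and convert errors to findings


async def python_sast(path: str) -> list[dict]:
    # Run all three tools concurrently and merge findings into one report.
    results = await asyncio.gather(run_pip_audit(path), run_bandit(path), run_mypy(path))
    return [finding for tool_findings in results for finding in tool_findings]
```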

* fix: Remove unused imports to pass linter

* fix: resolve live monitoring bug, remove deprecated parameters, and auto-start Python worker

- Fix live monitoring style error by calling _live_monitor() helper directly
- Remove default_parameters duplication from 10 workflow metadata files
- Remove deprecated volume_mode parameter from 26 files across CLI, SDK, backend, and docs
- Configure Python worker to start automatically with docker compose up
- Clean up constants, validation, completion, and example files

Fixes #
- Live monitoring now works correctly with --live flag
- Workflow metadata follows JSON Schema standard
- Cleaner codebase without deprecated volume_mode
- Python worker (most commonly used) starts by default

* fix: resolve linter errors and optimize CI worker builds

- Remove unused Literal import from backend findings model
- Remove unnecessary f-string prefixes in CLI findings command
- Optimize GitHub Actions to build only modified workers
  - Detect specific worker changes (python, secrets, rust, android, ossfuzz)
  - Build only changed workers instead of all 5
  - Build all workers if docker-compose.yml changes
  - Significantly reduces CI build time

* feat: Add Android static analysis workflow with Jadx, OpenGrep, and MobSF

Comprehensive Android security testing workflow converted from Prefect to Temporal architecture:

Modules (3):
- JadxDecompiler: APK to Java source code decompilation
- OpenGrepAndroid: Static analysis with Android-specific security rules
- MobSFScanner: Comprehensive mobile security framework integration

Custom Rules (13):
- clipboard-sensitive-data, hardcoded-secrets, insecure-data-storage
- insecure-deeplink, insecure-logging, intent-redirection
- sensitive_data_sharedPreferences, sqlite-injection
- vulnerable-activity, vulnerable-content-provider, vulnerable-service
- webview-javascript-enabled, webview-load-arbitrary-url

Workflow:
- 6-phase Temporal workflow: download → Jadx → OpenGrep → MobSF → SARIF → upload
- 4 activities: decompile_with_jadx, scan_with_opengrep, scan_with_mobsf, generate_android_sarif
- SARIF output combining findings from all security tools

Docker Worker:
- ARM64 Mac compatibility via amd64 platform emulation
- Pre-installed: Android SDK, Jadx 1.4.7, OpenGrep 1.45.0, MobSF 3.9.7
- MobSF runs as background service with API key auto-generation
- Added aiohttp for async HTTP communication

Test APKs:
- BeetleBug.apk and shopnest.apk for workflow validation

* fix(android): correct activity names and MobSF API key generation

- Fix activity names in workflow.py (get_target, upload_results, cleanup_cache)
- Fix MobSF API key generation in Dockerfile startup script (cut delimiter)
- Update activity parameter signatures to match actual implementations
- Workflow now executes successfully with Jadx and OpenGrep

* feat: add platform-aware worker architecture with ARM64 support

Implement platform-specific Dockerfile selection and graceful tool degradation to support both x86_64 and ARM64 (Apple Silicon) platforms.

**Backend Changes:**
- Add system info API endpoint (/system/info) exposing host filesystem paths
- Add FUZZFORGE_HOST_ROOT environment variable to backend service
- Add graceful degradation in MobSF activity for ARM64 platforms

**CLI Changes:**
- Implement multi-strategy path resolution (backend API, .fuzzforge marker, env var)
- Add platform detection (linux/amd64 vs linux/arm64)
- Add worker metadata.yaml reading for platform capabilities
- Auto-select appropriate Dockerfile based on detected platform
- Pass platform-specific env vars to docker-compose

**Worker Changes:**
- Create workers/android/metadata.yaml defining platform capabilities
- Rename Dockerfile -> Dockerfile.amd64 (full toolchain with MobSF)
- Create Dockerfile.arm64 (excludes MobSF due to Rosetta 2 incompatibility)
- Update docker-compose.yml to use ${ANDROID_DOCKERFILE} variable

**Workflow Changes:**
- Handle MobSF "skipped" status gracefully in workflow
- Log clear warnings when tools are unavailable on platform

**Key Features:**
- Automatic platform detection and Dockerfile selection
- Graceful degradation when tools unavailable (MobSF on ARM64)
- Works from any directory (backend API provides paths)
- Manual override via environment variables
- Clear user feedback about platform and selected Dockerfile

**Benefits:**
- Android workflow now works on Apple Silicon Macs
- No code changes needed for other workflows
- Convention established for future platform-specific workers

Closes: MobSF Rosetta 2 incompatibility issue
Implements: Platform-aware worker architecture (Option B)
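A minimal sketch of the platform detection and Dockerfile selection described above (the function name is illustrative):

```python
import platform


def select_android_dockerfile() -> str:
    """Pick the platform-specific Dockerfile for the Android worker (sketch)."""
    machine = platform.machine().lower()
    if machine in ("arm64", "aarch64"):
        return "Dockerfile.arm64"  # excludes MobSF (Rosetta 2 incompatibility)
    return "Dockerfile.amd64"      # full toolchain including MobSF
```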

* fix: make MobSFScanner import conditional for ARM64 compatibility

- Add try-except block to conditionally import MobSFScanner in modules/android/__init__.py
- Allows Android worker to start on ARM64 without MobSF dependencies (aiohttp)
- MobSF activity gracefully skips on ARM64 with clear warning message
- Remove workflow path detection logic (not needed - workflows receive directories)

Platform-aware architecture fully functional on ARM64:
- CLI detects ARM64 and selects Dockerfile.arm64 automatically
- Worker builds and runs without MobSF on ARM64
- Jadx successfully decompiles APKs (4145 files from BeetleBug.apk)
- OpenGrep finds security vulnerabilities (8 issues found)
- MobSF gracefully skips with warning on ARM64
- Graceful degradation working as designed

Tested with:
  ff workflow run android_static_analysis test_projects/android_test/ \
    --wait --no-interactive apk_path=BeetleBug.apk decompile_apk=true

Results: 8 security findings (1 ERROR, 7 WARNINGS)
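The conditional import presumably looks something like this sketch (the module path is assumed):

```python
try:
    from .mobsf_scanner import MobSFScanner  # requires aiohttp, absent from the ARM64 image
except ImportError:
    MobSFScanner = None  # the MobSF activity checks for None and skips with a warning
```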

* docs: update CHANGELOG with Android workflow and ARM64 support

Added [Unreleased] section documenting:
- Android Static Analysis Workflow (Jadx, OpenGrep, MobSF)
- Platform-Aware Worker Architecture with ARM64 support
- Python SAST Workflow
- CI/CD improvements and worker validation
- CLI enhancements
- Bug fixes and technical changes

Fixed date typo: 2025-01-16 → 2025-10-16

* fix: resolve linter errors in Android modules

- Remove unused imports from mobsf_scanner.py (asyncio, hashlib, json, Optional)
- Remove unused variables from opengrep_android.py (start_col, end_col)
- Remove duplicate Path import from workflow.py

* ci: support multi-platform Dockerfiles in worker validation

Updated worker validation script to accept both:
- Single Dockerfile pattern (existing workers)
- Multi-platform Dockerfile pattern (Dockerfile.amd64, Dockerfile.arm64, etc.)

This enables platform-aware worker architectures like the Android worker
which uses different Dockerfiles for x86_64 and ARM64 platforms.

* Feature/litellm proxy (#27)

* feat: seed governance config and responses routing

* Add env-configurable timeout for proxy providers

* Integrate LiteLLM OTEL collector and update docs

* Make .env.litellm optional for LiteLLM proxy

* Add LiteLLM proxy integration with model-agnostic virtual keys

Changes:
- Bootstrap generates 3 virtual keys with individual budgets (CLI: $100, Task-Agent: $25, Cognee: $50)
- Task-agent loads config at runtime via entrypoint script to wait for bootstrap completion
- All keys are model-agnostic by default (no LITELLM_DEFAULT_MODELS restrictions)
- Bootstrap handles database/env mismatch after docker prune by deleting stale aliases
- CLI and Cognee configured to use LiteLLM proxy with virtual keys
- Added comprehensive documentation in volumes/env/README.md

Technical details:
- task-agent entrypoint waits for keys in .env file before starting uvicorn
- Bootstrap creates/updates TASK_AGENT_API_KEY, COGNEE_API_KEY, and OPENAI_API_KEY
- Removed hardcoded API keys from docker-compose.yml
- All services route through http://localhost:10999 proxy

* Fix CLI not loading virtual keys from global .env

Project .env files with empty OPENAI_API_KEY values were overriding
the global virtual keys. Updated _load_env_file_if_exists to only
override with non-empty values.
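A sketch of the non-empty-value rule; the real `_load_env_file_if_exists` may parse `.env` files differently:

```python
import os


def _load_env_file_if_exists(path: str) -> None:
    """Load KEY=VALUE pairs, but never override with empty values (sketch)."""
    if not os.path.exists(path):
        return
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, value = line.split("=", 1)
            value = value.strip().strip('"')
            if value:  # empty values no longer clobber the global virtual keys
                os.environ[key.strip()] = value
```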

* Fix agent executor not passing API key to LiteLLM

The agent was initializing LiteLlm without api_key or api_base,
causing authentication errors when using the LiteLLM proxy. Now
reads from OPENAI_API_KEY/LLM_API_KEY and LLM_ENDPOINT environment
variables and passes them to LiteLlm constructor.

* Auto-populate project .env with virtual key from global config

When running 'ff init', the command now checks for a global
volumes/env/.env file and automatically uses the OPENAI_API_KEY
virtual key if found. This ensures projects work with LiteLLM
proxy out of the box without manual key configuration.

* docs: Update README with LiteLLM configuration instructions

Add note about LITELLM_GEMINI_API_KEY configuration and clarify that OPENAI_API_KEY default value should not be changed as it's used for the LLM proxy.

* Refactor workflow parameters to use JSON Schema defaults

Consolidates parameter defaults into JSON Schema format, removing the separate default_parameters field. Adds extract_defaults_from_json_schema() helper to extract defaults from the standard schema structure. Updates LiteLLM proxy config to use LITELLM_OPENAI_API_KEY environment variable.
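A rough sketch of such a helper; the actual `extract_defaults_from_json_schema()` may differ in detail:

```python
def extract_defaults_from_json_schema(schema: dict) -> dict:
    """Collect default values from a JSON Schema 'properties' block (sketch)."""
    defaults = {}
    for name, spec in schema.get("properties", {}).items():
        if isinstance(spec, dict) and "default" in spec:
            defaults[name] = spec["default"]
    return defaults


def resolve_parameters(provided: dict, schema: dict) -> dict:
    # Precedence: provided value > schema default > omitted.
    merged = extract_defaults_from_json_schema(schema)
    merged.update({k: v for k, v in provided.items() if v is not None})
    return merged
```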

* Remove .env.example from task_agent

* Fix MDX syntax error in llm-proxy.md

* fix: apply default parameters from metadata.yaml automatically

Fixed TemporalManager.run_workflow() to correctly apply default parameter
values from workflow metadata.yaml files when parameters are not provided
by the caller.

Previous behavior:
- When workflow_params was empty {}, the condition
  `if workflow_params and 'parameters' in metadata` would fail
- Parameters would not be extracted from schema, resulting in workflows
  receiving only target_id with no other parameters

New behavior:
- Removed the `workflow_params and` requirement from the condition
- Now explicitly checks for defaults in parameter spec
- Applies defaults from metadata.yaml automatically when param not provided
- Workflows receive all parameters with proper fallback:
  provided value > metadata default > None

This makes metadata.yaml the single source of truth for parameter defaults,
removing the need for workflows to implement defensive default handling.

Affected workflows:
- llm_secret_detection (was failing with KeyError)
- All other workflows now benefit from automatic default application

Co-authored-by: tduhamel42 <tduhamel@fuzzinglabs.com>

* fix: add default values to llm_analysis workflow parameters

Resolves validation error where agent_url was None when not explicitly provided. The TemporalManager applies defaults from metadata.yaml, not from module input schemas, so all parameters need defaults in the workflow metadata.

Changes:
- Add default agent_url, llm_model (gpt-5-mini), llm_provider (openai)
- Expand file_patterns to 45 comprehensive patterns covering code, configs, secrets, and Docker files
- Increase default limits: max_files (10), max_file_size (100KB), timeout (90s)

* refactor: replace .env.example with .env.template in documentation

- Remove volumes/env/.env.example file
- Update all documentation references to use .env.template instead
- Update bootstrap script error message
- Update .gitignore comment

* feat(cli): add worker management commands with improved progress feedback

Add comprehensive CLI commands for managing Temporal workers:
- ff worker list - List workers with status and uptime
- ff worker start <name> - Start specific worker with optional rebuild
- ff worker stop - Safely stop all workers without affecting core services

Improvements:
- Live progress display during worker startup with Rich Status spinner
- Real-time elapsed time counter and container state updates
- Health check status tracking (starting → unhealthy → healthy)
- Helpful contextual hints at 10s, 30s, 60s intervals
- Better timeout messages showing last known state

Worker management enhancements:
- Use 'docker compose' (space) instead of 'docker-compose' (hyphen)
- Stop workers individually with 'docker stop' to avoid stopping core services
- Platform detection and Dockerfile selection (ARM64/AMD64)

Documentation:
- Updated docker-setup.md with CLI commands as primary method
- Created comprehensive cli-reference.md with all commands and examples
- Added worker management best practices

* fix: MobSF scanner now properly parses files dict structure

MobSF returns 'files' as a dict (not list):
{"filename": "line_numbers"}

The parser was treating it as a list, causing zero findings
to be extracted. Now properly iterates over the dict and
creates one finding per affected file with correct line numbers
and metadata (CWE, OWASP, MASVS, CVSS).

Fixed in both code_analysis and behaviour sections.
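A sketch of iterating that dict; the surrounding report structure is an assumption:

```python
def findings_from_mobsf_rules(rules: dict) -> list[dict]:
    """Create one finding per affected file from a {"files": {filename: line_numbers}} dict."""
    findings = []
    for rule_id, details in rules.items():
        for filename, line_numbers in (details.get("files") or {}).items():
            findings.append({
                "rule": rule_id,
                "file": filename,
                "lines": line_numbers,
                # CWE / OWASP / MASVS / CVSS metadata would be attached here as well.
                "metadata": details.get("metadata", {}),
            })
    return findings
```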

* chore: bump version to 0.7.3

* docs: fix broken documentation links in cli-reference

* chore: add worker startup documentation and cleanup .gitignore

- Add workflow-to-worker mapping tables across documentation
- Update troubleshooting guide with worker requirements section
- Enhance getting started guide with worker examples
- Add quick reference to docker setup guide
- Add WEEK_SUMMARY*.md pattern to .gitignore

* docs: update CHANGELOG with missing versions and recent changes

- Add Unreleased section for post-v0.7.3 documentation updates
- Add v0.7.2 entry with bug fixes and worker improvements
- Document that v0.7.1 was re-tagged as v0.7.2
- Fix v0.6.0 date to "Undocumented" (no tag exists)
- Add version comparison links for easier navigation

* chore: bump all package versions to 0.7.3 for consistency

* Update GitHub link to fuzzforge_ai

---------

Co-authored-by: Songbird99 <150154823+Songbird99@users.noreply.github.com>
Co-authored-by: Songbird <Songbirdx99@gmail.com>
2025-11-06 11:07:50 +01:00
Ectario
f6cdb1ae2e fix(docs): fixing workflow docs (#29) 2025-10-27 12:37:04 +01:00
551 changed files with 15926 additions and 80068 deletions


@@ -1,48 +0,0 @@
---
name: 🐛 Bug Report
about: Create a report to help us improve FuzzForge
title: "[BUG] "
labels: bug
assignees: ''
---
## Description
A clear and concise description of the bug you encountered.
## Environment
Please provide details about your environment:
- **OS**: (e.g., macOS 14.0, Ubuntu 22.04, Windows 11)
- **Python version**: (e.g., 3.9.7)
- **Docker version**: (e.g., 24.0.6)
- **FuzzForge version**: (e.g., 0.6.0)
## Steps to Reproduce
Clear steps to recreate the issue:
1. Go to '...'
2. Run command '...'
3. Click on '...'
4. See error
## Expected Behavior
A clear and concise description of what should happen.
## Actual Behavior
A clear and concise description of what actually happens.
## Logs
Please include relevant error messages and stack traces:
```
Paste logs here
```
## Screenshots
If applicable, add screenshots to help explain your problem.
## Additional Context
Add any other context about the problem here (workflow used, specific target, configuration, etc.).
---
💬 **Need help?** Join our [Discord Community](https://discord.com/invite/acqv9FVG) for real-time support.


@@ -1,8 +0,0 @@
blank_issues_enabled: false
contact_links:
- name: 💬 Community Discord
url: https://discord.com/invite/acqv9FVG
about: Join our Discord to discuss ideas, workflows, and security research with the community.
- name: 📖 Documentation
url: https://github.com/FuzzingLabs/fuzzforge_ai/tree/main/docs
about: Check our documentation for guides, tutorials, and API reference.


@@ -1,38 +0,0 @@
---
name: ✨ Feature Request
about: Suggest an idea for FuzzForge
title: "[FEATURE] "
labels: enhancement
assignees: ''
---
## Use Case
Why is this feature needed? Describe the problem you're trying to solve or the improvement you'd like to see.
## Proposed Solution
How should it work? Describe your ideal solution in detail.
## Alternatives
What other approaches have you considered? List any alternative solutions or features you've thought about.
## Implementation
**(Optional)** Do you have any technical considerations or implementation ideas?
## Category
What area of FuzzForge would this feature enhance?
- [ ] 🤖 AI Agents for Security
- [ ] 🛠 Workflow Automation
- [ ] 📈 Vulnerability Research
- [ ] 🔗 Fuzzer Integration
- [ ] 🌐 Community Marketplace
- [ ] 🔒 Enterprise Features
- [ ] 📚 Documentation
- [ ] 🎯 Other
## Additional Context
Add any other context, screenshots, references, or examples about the feature request here.
---
💬 **Want to discuss this idea?** Join our [Discord Community](https://discord.com/invite/acqv9FVG) to collaborate with other contributors!


@@ -1,67 +0,0 @@
---
name: 🔄 Workflow Submission
about: Contribute a security workflow or module to the FuzzForge community
title: "[WORKFLOW] "
labels: workflow, community
assignees: ''
---
## Workflow Name
Provide a short, descriptive name for your workflow.
## Description
Explain what this workflow does and what security problems it solves.
## Category
What type of security workflow is this?
- [ ] 🛡️ **Security Assessment** - Static analysis, vulnerability scanning
- [ ] 🔍 **Secret Detection** - Credential and secret scanning
- [ ] 🎯 **Fuzzing** - Dynamic testing and fuzz testing
- [ ] 🔄 **Reverse Engineering** - Binary analysis and decompilation
- [ ] 🌐 **Infrastructure Security** - Container, cloud, network security
- [ ] 🔒 **Penetration Testing** - Offensive security testing
- [ ] 📋 **Other** - Please describe
## Files
Please attach or provide links to your workflow files:
- [ ] `workflow.py` - Main Temporal flow implementation
- [ ] `Dockerfile` - Container definition
- [ ] `metadata.yaml` - Workflow metadata
- [ ] Test files or examples
- [ ] Documentation
## Testing
How did you test this workflow? Please describe:
- **Test targets used**: (e.g., vulnerable_app, custom test cases)
- **Expected outputs**: (e.g., SARIF format, specific vulnerabilities detected)
- **Validation results**: (e.g., X vulnerabilities found, Y false positives)
## SARIF Compliance
- [ ] My workflow outputs results in SARIF format
- [ ] Results include severity levels and descriptions
- [ ] Code flow information is provided where applicable
## Security Guidelines
- [ ] This workflow focuses on **defensive security** purposes only
- [ ] I have not included any malicious tools or capabilities
- [ ] All secrets/credentials are parameterized (no hardcoded values)
- [ ] I have followed responsible disclosure practices
## Registry Integration
Have you updated the workflow registry?
- [ ] Added import statement to `backend/toolbox/workflows/registry.py`
- [ ] Added registry entry with proper metadata
- [ ] Tested workflow registration and deployment
## Additional Notes
Anything else the maintainers should know about this workflow?
---
🚀 **Thank you for contributing to FuzzForge!** Your workflow will help the security community automate and scale their testing efforts.
💬 **Questions?** Join our [Discord Community](https://discord.com/invite/acqv9FVG) to discuss your contribution!


@@ -1,165 +0,0 @@
name: Benchmarks
on:
# Disabled automatic runs - benchmarks not ready for CI/CD yet
# schedule:
# - cron: '0 2 * * *' # 2 AM UTC every day
# Allow manual trigger for testing
workflow_dispatch:
inputs:
compare_with:
description: 'Baseline commit to compare against (optional)'
required: false
default: ''
# pull_request:
# paths:
# - 'backend/benchmarks/**'
# - 'backend/toolbox/modules/**'
# - '.github/workflows/benchmark.yml'
jobs:
benchmark:
name: Run Benchmarks
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0 # Fetch all history for comparison
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: '3.11'
- name: Install system dependencies
run: |
sudo apt-get update
sudo apt-get install -y build-essential
- name: Install Python dependencies
working-directory: ./backend
run: |
python -m pip install --upgrade pip
pip install -e ".[dev]"
pip install pytest pytest-asyncio pytest-benchmark pytest-benchmark[histogram]
pip install -e ../sdk # Install SDK for benchmarks
- name: Run benchmarks
working-directory: ./backend
run: |
pytest benchmarks/ \
-v \
--benchmark-only \
--benchmark-json=benchmark-results.json \
--benchmark-histogram=benchmark-histogram
- name: Store benchmark results
uses: actions/upload-artifact@v4
with:
name: benchmark-results-${{ github.run_number }}
path: |
backend/benchmark-results.json
backend/benchmark-histogram.svg
- name: Download baseline benchmarks
if: github.event_name == 'pull_request'
uses: dawidd6/action-download-artifact@v3
continue-on-error: true
with:
workflow: benchmark.yml
branch: ${{ github.base_ref }}
name: benchmark-results-*
path: ./baseline
search_artifacts: true
- name: Compare with baseline
if: github.event_name == 'pull_request' && hashFiles('baseline/benchmark-results.json') != ''
run: |
python -c "
import json
import sys
with open('backend/benchmark-results.json') as f:
current = json.load(f)
with open('baseline/benchmark-results.json') as f:
baseline = json.load(f)
print('\\n## Benchmark Comparison\\n')
print('| Benchmark | Current | Baseline | Change |')
print('|-----------|---------|----------|--------|')
regressions = []
for bench in current['benchmarks']:
name = bench['name']
current_time = bench['stats']['mean']
# Find matching baseline
baseline_bench = next((b for b in baseline['benchmarks'] if b['name'] == name), None)
if baseline_bench:
baseline_time = baseline_bench['stats']['mean']
change = ((current_time - baseline_time) / baseline_time) * 100
print(f'| {name} | {current_time:.4f}s | {baseline_time:.4f}s | {change:+.2f}% |')
# Flag regressions > 10%
if change > 10:
regressions.append((name, change))
else:
print(f'| {name} | {current_time:.4f}s | N/A | NEW |')
if regressions:
print('\\n⚠ **Performance Regressions Detected:**')
for name, change in regressions:
print(f'- {name}: +{change:.2f}%')
sys.exit(1)
else:
print('\\n✅ No significant performance regressions detected')
"
- name: Comment PR with results
if: github.event_name == 'pull_request'
uses: actions/github-script@v7
with:
script: |
const fs = require('fs');
const results = JSON.parse(fs.readFileSync('backend/benchmark-results.json', 'utf8'));
let body = '## Benchmark Results\\n\\n';
body += '| Category | Benchmark | Mean Time | Std Dev |\\n';
body += '|----------|-----------|-----------|---------|\\n';
for (const bench of results.benchmarks) {
const group = bench.group || 'ungrouped';
const name = bench.name.split('::').pop();
const mean = bench.stats.mean.toFixed(4);
const stddev = bench.stats.stddev.toFixed(4);
body += `| ${group} | ${name} | ${mean}s | ${stddev}s |\\n`;
}
body += '\\n📊 Full benchmark results available in artifacts.';
github.rest.issues.createComment({
issue_number: context.issue.number,
owner: context.repo.owner,
repo: context.repo.repo,
body: body
});
benchmark-summary:
name: Benchmark Summary
runs-on: ubuntu-latest
needs: benchmark
if: always()
steps:
- name: Check results
run: |
if [ "${{ needs.benchmark.result }}" != "success" ]; then
echo "Benchmarks failed or detected regressions"
exit 1
fi
echo "Benchmarks completed successfully!"


@@ -1,70 +0,0 @@
name: Python CI
# This is a dumb CI to ensure that the Python client and backend build correctly
# It could be optimized to run faster, building, testing and linting only changed code
# but for now it is good enough. It runs on every push and PR to any branch.
# It also runs on demand.
on:
workflow_dispatch:
push:
paths:
- "ai/**"
- "backend/**"
- "cli/**"
- "sdk/**"
- "src/**"
pull_request:
paths:
- "ai/**"
- "backend/**"
- "cli/**"
- "sdk/**"
- "src/**"
jobs:
ci:
name: ci
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v5
- name: Setup uv
uses: astral-sh/setup-uv@v6
with:
enable-cache: true
- name: Set up Python
run: uv python install
# Validate no obvious issues
# Quick hack because CLI returns non-zero exit code when no args are provided
- name: Run base command
run: |
set +e
uv run ff
status=$?
if [ $status -ne 2 ]; then
echo "Expected exit code 2 from 'uv run ff', got $status"
exit 1
fi
- name: Build fuzzforge_ai package
run: uv build
- name: Build ai package
working-directory: ai
run: uv build
- name: Build cli package
working-directory: cli
run: uv build
- name: Build sdk package
working-directory: sdk
run: uv build
- name: Build backend package
working-directory: backend
run: uv build


@@ -1,57 +0,0 @@
name: Deploy Docusaurus to GitHub Pages
on:
workflow_dispatch:
push:
branches:
- master
paths:
- "docs/**"
jobs:
build:
name: Build Docusaurus
runs-on: ubuntu-latest
defaults:
run:
working-directory: ./docs
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- uses: actions/setup-node@v4
with:
node-version: 24
cache: npm
cache-dependency-path: "**/package-lock.json"
- name: Install dependencies
run: npm ci
- name: Build website
run: npm run build
- name: Upload Build Artifact
uses: actions/upload-pages-artifact@v3
with:
path: ./docs/build
deploy:
name: Deploy to GitHub Pages
needs: build
# Grant GITHUB_TOKEN the permissions required to make a Pages deployment
permissions:
pages: write # to deploy to Pages
id-token: write # to verify the deployment originates from an appropriate source
# Deploy to the github-pages environment
environment:
name: github-pages
url: ${{ steps.deployment.outputs.page_url }}
runs-on: ubuntu-latest
steps:
- name: Deploy to GitHub Pages
id: deployment
uses: actions/deploy-pages@v4


@@ -1,33 +0,0 @@
name: Docusaurus test deployment
on:
workflow_dispatch:
push:
paths:
- "docs/**"
pull_request:
paths:
- "docs/**"
jobs:
test-deploy:
name: Test deployment
runs-on: ubuntu-latest
defaults:
run:
working-directory: ./docs
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- uses: actions/setup-node@v4
with:
node-version: 24
cache: npm
cache-dependency-path: "**/package-lock.json"
- name: Install dependencies
run: npm ci
- name: Test build website
run: npm run build


@@ -1,152 +0,0 @@
# FuzzForge CI/CD Example - Security Scanning
#
# This workflow demonstrates how to integrate FuzzForge into your CI/CD pipeline
# for automated security testing on pull requests and pushes.
#
# Features:
# - Runs entirely in GitHub Actions (no external infrastructure needed)
# - Auto-starts FuzzForge services on-demand
# - Fails builds on error-level SARIF findings
# - Uploads SARIF results to GitHub Security tab
# - Exports findings as artifacts
#
# Prerequisites:
# - Ubuntu runner with Docker support
# - At least 4GB RAM available
# - ~90 seconds startup time
name: Security Scan Example
on:
pull_request:
branches: [main, develop]
push:
branches: [main]
jobs:
security-scan:
name: Security Assessment
runs-on: ubuntu-latest
timeout-minutes: 30
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Start FuzzForge
run: |
bash scripts/ci-start.sh
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: '3.11'
- name: Install FuzzForge CLI
run: |
pip install ./cli
- name: Initialize FuzzForge
run: |
ff init --api-url http://localhost:8000 --name "GitHub Actions Security Scan"
- name: Run Security Assessment
run: |
ff workflow run security_assessment . \
--wait \
--fail-on error \
--export-sarif results.sarif
- name: Upload SARIF to GitHub Security
if: always()
uses: github/codeql-action/upload-sarif@v3
with:
sarif_file: results.sarif
- name: Upload findings as artifact
if: always()
uses: actions/upload-artifact@v4
with:
name: security-findings
path: results.sarif
retention-days: 30
- name: Stop FuzzForge
if: always()
run: |
bash scripts/ci-stop.sh
secret-scan:
name: Secret Detection
runs-on: ubuntu-latest
timeout-minutes: 15
steps:
- uses: actions/checkout@v4
- name: Start FuzzForge
run: bash scripts/ci-start.sh
- name: Install CLI
run: |
pip install ./cli
- name: Initialize & Scan
run: |
ff init --api-url http://localhost:8000 --name "Secret Detection"
ff workflow run secret_detection . \
--wait \
--fail-on all \
--export-sarif secrets.sarif
- name: Upload results
if: always()
uses: actions/upload-artifact@v4
with:
name: secret-scan-results
path: secrets.sarif
retention-days: 30
- name: Cleanup
if: always()
run: bash scripts/ci-stop.sh
# Example: Nightly fuzzing campaign (long-running)
nightly-fuzzing:
name: Nightly Fuzzing
runs-on: ubuntu-latest
timeout-minutes: 120
# Only run on schedule
if: github.event_name == 'schedule'
steps:
- uses: actions/checkout@v4
- name: Start FuzzForge
run: bash scripts/ci-start.sh
- name: Install CLI
run: pip install ./cli
- name: Run Fuzzing Campaign
run: |
ff init --api-url http://localhost:8000
ff workflow run atheris_fuzzing . \
max_iterations=100000000 \
timeout_seconds=7200 \
--wait \
--export-sarif fuzzing-results.sarif
# Don't fail on fuzzing findings, just report
continue-on-error: true
- name: Upload fuzzing results
if: always()
uses: actions/upload-artifact@v4
with:
name: fuzzing-results
path: fuzzing-results.sarif
retention-days: 90
- name: Cleanup
if: always()
run: bash scripts/ci-stop.sh


@@ -1,155 +0,0 @@
name: Tests
on:
push:
branches: [ main, master, develop, feature/** ]
pull_request:
branches: [ main, master, develop ]
jobs:
lint:
name: Lint
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: '3.11'
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install ruff mypy
- name: Run ruff
run: ruff check backend/src backend/toolbox backend/tests backend/benchmarks --output-format=github
- name: Run mypy (continue on error)
run: mypy backend/src backend/toolbox || true
continue-on-error: true
unit-tests:
name: Unit Tests
runs-on: ubuntu-latest
strategy:
matrix:
python-version: ['3.11', '3.12']
steps:
- uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}
- name: Install system dependencies
run: |
sudo apt-get update
sudo apt-get install -y build-essential
- name: Install Python dependencies
working-directory: ./backend
run: |
python -m pip install --upgrade pip
pip install -e ".[dev]"
pip install pytest pytest-asyncio pytest-cov pytest-xdist
- name: Run unit tests
working-directory: ./backend
run: |
pytest tests/unit/ \
-v \
--cov=toolbox/modules \
--cov=src \
--cov-report=xml \
--cov-report=term \
--cov-report=html \
-n auto
- name: Upload coverage to Codecov
if: matrix.python-version == '3.11'
uses: codecov/codecov-action@v4
with:
file: ./backend/coverage.xml
flags: unittests
name: codecov-backend
- name: Upload coverage HTML
if: matrix.python-version == '3.11'
uses: actions/upload-artifact@v4
with:
name: coverage-report
path: ./backend/htmlcov/
# integration-tests:
# name: Integration Tests
# runs-on: ubuntu-latest
# needs: unit-tests
#
# services:
# postgres:
# image: postgres:15
# env:
# POSTGRES_USER: postgres
# POSTGRES_PASSWORD: postgres
# POSTGRES_DB: fuzzforge_test
# options: >-
# --health-cmd pg_isready
# --health-interval 10s
# --health-timeout 5s
# --health-retries 5
# ports:
# - 5432:5432
#
# steps:
# - uses: actions/checkout@v4
#
# - name: Set up Python
# uses: actions/setup-python@v5
# with:
# python-version: '3.11'
#
# - name: Set up Docker Buildx
# uses: docker/setup-buildx-action@v3
#
# - name: Install Python dependencies
# working-directory: ./backend
# run: |
# python -m pip install --upgrade pip
# pip install -e ".[dev]"
# pip install pytest pytest-asyncio
#
# - name: Start services (Temporal, MinIO)
# run: |
# docker-compose -f docker-compose.yml up -d temporal minio
# sleep 30
#
# - name: Run integration tests
# working-directory: ./backend
# run: |
# pytest tests/integration/ -v --tb=short
# env:
# DATABASE_URL: postgresql://postgres:postgres@localhost:5432/fuzzforge_test
# TEMPORAL_ADDRESS: localhost:7233
# MINIO_ENDPOINT: localhost:9000
#
# - name: Shutdown services
# if: always()
# run: docker-compose down
test-summary:
name: Test Summary
runs-on: ubuntu-latest
needs: [lint, unit-tests]
if: always()
steps:
- name: Check test results
run: |
if [ "${{ needs.unit-tests.result }}" != "success" ]; then
echo "Unit tests failed"
exit 1
fi
echo "All tests passed!"

.gitignore (vendored): 313 changed lines

@@ -1,307 +1,12 @@
# ========================================
# FuzzForge Platform .gitignore
# ========================================
# -------------------- Python --------------------
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
# C extensions
*.so
# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
# PyInstaller
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
# Environments
*.egg-info
*.whl
.env
.mypy_cache
.pytest_cache
.ruff_cache
.venv
env/
venv/
ENV/
env.bak/
venv.bak/
.python-version
.vscode
__pycache__
# UV package manager
uv.lock
# But allow uv.lock in CLI and SDK for reproducible builds
!cli/uv.lock
!sdk/uv.lock
!backend/uv.lock
# MyPy
.mypy_cache/
.dmypy.json
dmypy.json
# Pyre type checker
.pyre/
# pytype static type analyzer
.pytype/
# Cython debug symbols
cython_debug/
# -------------------- IDE / Editor --------------------
# VSCode
.vscode/
*.code-workspace
# PyCharm
.idea/
# Vim
*.swp
*.swo
*~
# Emacs
*~
\#*\#
/.emacs.desktop
/.emacs.desktop.lock
*.elc
auto-save-list
tramp
.\#*
# Sublime Text
*.sublime-project
*.sublime-workspace
# -------------------- Operating System --------------------
# macOS
.DS_Store
.AppleDouble
.LSOverride
Icon
._*
.DocumentRevisions-V100
.fseventsd
.Spotlight-V100
.TemporaryItems
.Trashes
.VolumeIcon.icns
.com.apple.timemachine.donotpresent
.AppleDB
.AppleDesktop
Network Trash Folder
Temporary Items
.apdisk
# Windows
Thumbs.db
Thumbs.db:encryptable
ehthumbs.db
ehthumbs_vista.db
*.stackdump
[Dd]esktop.ini
$RECYCLE.BIN/
*.cab
*.msi
*.msix
*.msm
*.msp
*.lnk
# Linux
*~
.fuse_hidden*
.directory
.Trash-*
.nfs*
# -------------------- Docker --------------------
# Docker volumes and data
docker-volumes/
.dockerignore.bak
# Docker Compose override files
docker-compose.override.yml
docker-compose.override.yaml
# -------------------- Database --------------------
# SQLite
*.sqlite
*.sqlite3
*.db
*.db-journal
*.db-shm
*.db-wal
# PostgreSQL
*.sql.backup
# -------------------- Logs --------------------
# General logs
*.log
logs/
*.log.*
# -------------------- FuzzForge Specific --------------------
# FuzzForge project directories (user projects should manage their own .gitignore)
.fuzzforge/
# Docker volume configs (keep .env.example but ignore actual .env)
volumes/env/.env
# Test project databases and configurations
test_projects/*/.fuzzforge/
test_projects/*/findings.db*
test_projects/*/config.yaml
test_projects/*/.gitignore
# Local development configurations
local_config.yaml
dev_config.yaml
.env.local
.env.development
# Generated reports and outputs
reports/
output/
findings/
*.sarif
*.sarif.json
*.html.report
security_report.*
# Temporary files
tmp/
temp/
*.tmp
*.temp
# Backup files
*.bak
*.backup
*~
# -------------------- Node.js (for any JS tooling) --------------------
node_modules/
npm-debug.log*
yarn-debug.log*
yarn-error.log*
.npm
# -------------------- Security --------------------
# Never commit these files
*.pem
*.key
*.p12
*.pfx
# Exception: Secret detection benchmark test files (not real secrets)
!test_projects/secret_detection_benchmark/
!test_projects/secret_detection_benchmark/**
!**/secret_detection_benchmark_GROUND_TRUTH.json
!**/secret_detection/results/
# Exception: Allow workers/secrets/ directory (secrets detection worker)
!workers/secrets/
!workers/secrets/**
secret*
secrets/
credentials*
api_keys*
.env.production
.env.staging
# AWS credentials
.aws/
# -------------------- Build Artifacts --------------------
# Python builds
build/
dist/
*.wheel
# Documentation builds
docs/_build/
site/
# -------------------- Miscellaneous --------------------
# Jupyter Notebook checkpoints
.ipynb_checkpoints
# IPython history
.ipython/
# Rope project settings
.ropeproject
# spyderproject
.spyderproject
.spyproject
# mkdocs documentation
/site
# Local Netlify folder
.netlify
# -------------------- Project Specific Overrides --------------------
# Allow specific test project files that should be tracked
!test_projects/*/src/
!test_projects/*/scripts/
!test_projects/*/config/
!test_projects/*/data/
!test_projects/*/README.md
!test_projects/*/*.py
!test_projects/*/*.js
!test_projects/*/*.php
!test_projects/*/*.java
# But exclude their sensitive content
test_projects/*/.env
test_projects/*/private_key.pem
test_projects/*/wallet.json
test_projects/*/.npmrc
test_projects/*/.git-credentials
test_projects/*/credentials.*
test_projects/*/api_keys.*
test_projects/*/ci-*.sh
# Podman/Docker container storage artifacts
~/.fuzzforge/


@@ -1,121 +0,0 @@
# FuzzForge CI/CD Example - GitLab CI
#
# This file demonstrates how to integrate FuzzForge into your GitLab CI/CD pipeline.
# Copy this to `.gitlab-ci.yml` in your project root to enable security scanning.
#
# Features:
# - Runs entirely in GitLab runners (no external infrastructure)
# - Auto-starts FuzzForge services on-demand
# - Fails pipelines on critical/high severity findings
# - Uploads SARIF reports to GitLab Security Dashboard
# - Exports findings as artifacts
#
# Prerequisites:
# - GitLab Runner with Docker support (docker:dind)
# - At least 4GB RAM available
# - ~90 seconds startup time
stages:
- security
variables:
FUZZFORGE_API_URL: "http://localhost:8000"
DOCKER_DRIVER: overlay2
DOCKER_TLS_CERTDIR: ""
# Base template for all FuzzForge jobs
.fuzzforge_template:
image: docker:24
services:
- docker:24-dind
before_script:
# Install dependencies
- apk add --no-cache bash curl python3 py3-pip git
# Start FuzzForge
- bash scripts/ci-start.sh
# Install CLI
- pip3 install ./cli --break-system-packages
# Initialize project
- ff init --api-url $FUZZFORGE_API_URL --name "GitLab CI Security Scan"
after_script:
# Cleanup
- bash scripts/ci-stop.sh || true
# Security Assessment - Comprehensive code analysis
security:scan:
extends: .fuzzforge_template
stage: security
timeout: 30 minutes
script:
- ff workflow run security_assessment . --wait --fail-on error --export-sarif results.sarif
artifacts:
when: always
reports:
sast: results.sarif
paths:
- results.sarif
expire_in: 30 days
only:
- merge_requests
- main
- develop
# Secret Detection - Scan for exposed credentials
security:secrets:
extends: .fuzzforge_template
stage: security
timeout: 15 minutes
script:
- ff workflow run secret_detection . --wait --fail-on all --export-sarif secrets.sarif
artifacts:
when: always
paths:
- secrets.sarif
expire_in: 30 days
only:
- merge_requests
- main
# Nightly Fuzzing - Long-running fuzzing campaign (scheduled only)
security:fuzzing:
extends: .fuzzforge_template
stage: security
timeout: 2 hours
script:
- |
ff workflow run atheris_fuzzing . \
max_iterations=100000000 \
timeout_seconds=7200 \
--wait \
--export-sarif fuzzing-results.sarif
artifacts:
when: always
paths:
- fuzzing-results.sarif
expire_in: 90 days
allow_failure: true # Don't fail pipeline on fuzzing findings
only:
- schedules
# OSS-Fuzz Campaign (for supported projects)
security:ossfuzz:
extends: .fuzzforge_template
stage: security
timeout: 1 hour
script:
- |
ff workflow run ossfuzz_campaign . \
project_name=your-project-name \
campaign_duration_hours=0.5 \
--wait \
--export-sarif ossfuzz-results.sarif
artifacts:
when: always
paths:
- ossfuzz-results.sarif
expire_in: 90 days
allow_failure: true
only:
- schedules
# Uncomment and set your project name
# when: manual

.python-version (new file): 1 line

@@ -0,0 +1 @@
3.14.2

File diff suppressed because it is too large.


@@ -1,85 +0,0 @@
# Changelog
All notable changes to FuzzForge will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [0.7.0] - 2025-01-16
### 🎯 Major Features
#### Secret Detection Workflows
- **Added three secret detection workflows**:
- `gitleaks_detection` - Pattern-based secret scanning
- `trufflehog_detection` - Entropy-based secret detection with verification
- `llm_secret_detection` - AI-powered semantic secret detection using LLMs
- **Comprehensive benchmarking infrastructure**:
- 32-secret ground truth dataset for precision/recall testing
- Difficulty levels: 12 Easy, 10 Medium, 10 Hard secrets
- SARIF-formatted output for all workflows
- Achieved 100% recall with LLM-based detection on benchmark dataset
#### AI Module & Agent Integration
- Added A2A (Agent-to-Agent) wrapper for multi-agent orchestration
- Task agent implementation with Google ADK
- LLM analysis workflow for code security analysis
- Reactivated AI agent command (`ff ai agent`)
#### Temporal Migration Complete
- Fully migrated from Prefect to Temporal for workflow orchestration
- MinIO storage for unified file handling (replaces volume mounts)
- Vertical workers with pre-built security toolchains
- Improved worker lifecycle management
#### CI/CD Integration
- Ephemeral deployment model for testing
- Automated workflow validation in CI pipeline
### ✨ Enhancements
#### Documentation
- Updated README for Temporal + MinIO architecture
- Removed obsolete `volume_mode` references across all documentation
- Added `.env` configuration guide for AI agent API keys
- Fixed worker startup instructions with correct service names
- Updated docker compose commands to modern syntax
#### Worker Management
- Added `worker_service` field to API responses for correct service naming
- Improved error messages with actionable manual start commands
- Fixed default parameters for gitleaks (now uses `no_git=True` by default)
### 🐛 Bug Fixes
- Fixed gitleaks workflow failing on uploaded directories without Git history
- Fixed worker startup command suggestions (now uses `docker compose up -d` with service names)
- Fixed missing `cognify_text` method in CogneeProjectIntegration
### 🔧 Technical Changes
- Updated all package versions to 0.7.0
- Improved SARIF output formatting for secret detection workflows
- Enhanced benchmark validation with ground truth JSON
- Better integration between CLI and backend for worker management
### 📝 Test Projects
- Added `secret_detection_benchmark` with 32 documented secrets
- Ground truth JSON for automated precision/recall calculations
- Updated `vulnerable_app` for comprehensive security testing
---
## [0.6.0] - 2024-12-XX
### Features
- Initial Temporal migration
- Fuzzing workflows (Atheris, Cargo, OSS-Fuzz)
- Security assessment workflow
- Basic CLI commands
---
[0.7.0]: https://github.com/FuzzingLabs/fuzzforge_ai/compare/v0.6.0...v0.7.0
[0.6.0]: https://github.com/FuzzingLabs/fuzzforge_ai/releases/tag/v0.6.0


@@ -1,17 +1,21 @@
# Contributing to FuzzForge 🤝
# Contributing to FuzzForge OSS
Thank you for your interest in contributing to FuzzForge! We welcome contributions from the community and are excited to collaborate with you.
Thank you for your interest in contributing to FuzzForge OSS! We welcome contributions from the community and are excited to collaborate with you.
## 🌟 Ways to Contribute
**Our Vision**: FuzzForge aims to be a **universal platform for security research** across all cybersecurity domains. Through our modular architecture, any security tool—from fuzzing engines to cloud scanners, from mobile app analyzers to IoT security tools—can be integrated as a containerized module and controlled via AI agents.
- 🐛 **Bug Reports** - Help us identify and fix issues
- 💡 **Feature Requests** - Suggest new capabilities and improvements
- 🔧 **Code Contributions** - Submit bug fixes, features, and enhancements
- 📚 **Documentation** - Improve guides, tutorials, and API documentation
- 🧪 **Testing** - Help test new features and report issues
- 🛡️ **Security Workflows** - Contribute new security analysis workflows
## Ways to Contribute
## 📋 Contribution Guidelines
- **Security Modules** - Create modules for any cybersecurity domain (AppSec, NetSec, Cloud, IoT, etc.)
- **Bug Reports** - Help us identify and fix issues
- **Feature Requests** - Suggest new capabilities and improvements
- **Core Features** - Contribute to the MCP server, runner, or CLI
- **Documentation** - Improve guides, tutorials, and module documentation
- **Testing** - Help test new features and report issues
- **AI Integration** - Improve MCP tools and AI agent interactions
- **Tool Integrations** - Wrap existing security tools as FuzzForge modules
## Contribution Guidelines
### Code Style
@@ -44,9 +48,10 @@ We use conventional commits for clear history:
**Examples:**
```
feat(workflows): add new static analysis workflow for Go
fix(api): resolve authentication timeout issue
docs(readme): update installation instructions
feat(modules): add cloud security scanner module
fix(mcp): resolve module listing timeout
docs(sdk): update module development guide
test(runner): add container execution tests
```
### Pull Request Process
@@ -65,9 +70,14 @@ docs(readme): update installation instructions
3. **Test Your Changes**
```bash
# Test workflows
cd test_projects/vulnerable_app/
ff workflow security_assessment .
# Test modules
FUZZFORGE_MODULES_PATH=./fuzzforge-modules uv run fuzzforge modules list
# Run a module
uv run fuzzforge modules run your-module --assets ./test-assets
# Test MCP integration (if applicable)
uv run fuzzforge mcp status
```
4. **Submit Pull Request**
@@ -76,65 +86,353 @@ docs(readme): update installation instructions
- Link related issues using `Fixes #123` or `Closes #123`
- Ensure all CI checks pass
## 🛡️ Security Workflow Development
## Module Development
### Creating New Workflows
FuzzForge uses a modular architecture where security tools run as isolated containers. The `fuzzforge-modules-sdk` provides everything you need to create new modules.
1. **Workflow Structure**
```
backend/toolbox/workflows/your_workflow/
├── __init__.py
├── workflow.py # Main Temporal workflow
├── activities.py # Workflow activities (optional)
├── metadata.yaml # Workflow metadata (includes vertical field)
└── requirements.txt # Additional dependencies (optional)
**Documentation:**
- [Module SDK Documentation](fuzzforge-modules/fuzzforge-modules-sdk/README.md) - Complete SDK reference
- [Module Template](fuzzforge-modules/fuzzforge-module-template/) - Starting point for new modules
- [USAGE Guide](USAGE.md) - Setup and installation instructions
### Creating a New Module
1. **Use the Module Template**
```bash
# Generate a new module from template
cd fuzzforge-modules/
cp -r fuzzforge-module-template my-new-module
cd my-new-module
```
2. **Register Your Workflow**
Add your workflow to `backend/toolbox/workflows/registry.py`:
2. **Module Structure**
```
my-new-module/
├── Dockerfile # Container definition
├── Makefile # Build commands
├── README.md # Module documentation
├── pyproject.toml # Python dependencies
├── mypy.ini # Type checking config
├── ruff.toml # Linting config
└── src/
└── module/
├── __init__.py
├── __main__.py # Entry point
├── mod.py # Main module logic
├── models.py # Pydantic models
└── settings.py # Configuration
```
3. **Implement Your Module**
Edit `src/module/mod.py`:
```python
# Import your workflow
from .your_workflow.workflow import main_flow as your_workflow_flow
# Add to registry
WORKFLOW_REGISTRY["your_workflow"] = {
"flow": your_workflow_flow,
"module_path": "toolbox.workflows.your_workflow.workflow",
"function_name": "main_flow",
"description": "Description of your workflow",
"version": "1.0.0",
"author": "Your Name",
"tags": ["tag1", "tag2"]
}
from fuzzforge_modules_sdk.api.modules import BaseModule
from fuzzforge_modules_sdk.api.models import ModuleResult
from .models import MyModuleConfig, MyModuleOutput
class MyModule(BaseModule[MyModuleConfig, MyModuleOutput]):
"""Your module description."""
def execute(self) -> ModuleResult[MyModuleOutput]:
"""Main execution logic."""
# Access input assets
assets = self.input_path
# Your security tool logic here
results = self.run_analysis(assets)
# Return structured results
return ModuleResult(
success=True,
output=MyModuleOutput(
findings=results,
summary="Analysis complete"
)
)
```
3. **Testing Workflows**
- Create test cases in `test_projects/vulnerable_app/`
- Ensure SARIF output format compliance
- Test with various input scenarios
4. **Define Configuration Models**
Edit `src/module/models.py`:
```python
from pydantic import BaseModel, Field
from fuzzforge_modules_sdk.api.models import BaseModuleConfig, BaseModuleOutput
class MyModuleConfig(BaseModuleConfig):
    """Configuration for your module."""
    timeout: int = Field(default=300, description="Timeout in seconds")
    max_iterations: int = Field(default=1000, description="Max iterations")

class MyModuleOutput(BaseModuleOutput):
    """Output from your module."""
    findings: list[dict] = Field(default_factory=list)
    coverage: float = Field(default=0.0)
```
5. **Build Your Module**
```bash
# Build the SDK first (if not already done)
cd ../fuzzforge-modules-sdk
uv build
mkdir -p .wheels
cp ../../dist/fuzzforge_modules_sdk-*.whl .wheels/
cd ../..
docker build -t localhost/fuzzforge-modules-sdk:0.1.0 fuzzforge-modules/fuzzforge-modules-sdk/
# Build your module
cd fuzzforge-modules/my-new-module
docker build -t fuzzforge-my-new-module:0.1.0 .
```
6. **Test Your Module**
```bash
# Run with test assets
uv run fuzzforge modules run my-new-module --assets ./test-assets
# Check module info
uv run fuzzforge modules info my-new-module
```
### Module Development Guidelines
**Important Conventions:**
- **Input/Output**: Use `/fuzzforge/input` for assets and `/fuzzforge/output` for results
- **Configuration**: Support JSON configuration via stdin or file
- **Logging**: Use structured logging (structlog is pre-configured)
- **Error Handling**: Return proper exit codes and error messages
- **Security**: Run as non-root user when possible
- **Documentation**: Include clear README with usage examples
- **Dependencies**: Minimize container size, use multi-stage builds
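For illustration, the input/output conventions above boil down to something like the following sketch (the SDK's `BaseModule` normally handles these paths for you):

```python
import json
from pathlib import Path

# Conventional container paths used by FuzzForge modules.
INPUT_DIR = Path("/fuzzforge/input")
OUTPUT_DIR = Path("/fuzzforge/output")


def write_results(findings: list[dict]) -> None:
    """Write structured results to the conventional output location (sketch)."""
    OUTPUT_DIR.mkdir(parents=True, exist_ok=True)
    (OUTPUT_DIR / "results.json").write_text(json.dumps({"findings": findings}, indent=2))
```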
**See also:**
- [Module SDK API Reference](fuzzforge-modules/fuzzforge-modules-sdk/src/fuzzforge_modules_sdk/api/)
- [Dockerfile Best Practices](https://docs.docker.com/develop/develop-images/dockerfile_best-practices/)
### Module Types
FuzzForge is designed to support modules across **all cybersecurity domains**. The modular architecture allows any security tool to be containerized and integrated. Here are the main categories:
**Application Security**
- Fuzzing engines (coverage-guided, grammar-based, mutation-based)
- Static analysis (SAST, code quality, dependency scanning)
- Dynamic analysis (DAST, runtime analysis, instrumentation)
- Test validation and coverage analysis
- Crash analysis and exploit detection
**Network & Infrastructure Security**
- Network scanning and service enumeration
- Protocol analysis and fuzzing
- Firewall and configuration testing
- Cloud security (AWS/Azure/GCP misconfiguration detection, IAM analysis)
- Container security (image scanning, Kubernetes security)
**Web & API Security**
- Web vulnerability scanners (XSS, SQL injection, CSRF)
- Authentication and session testing
- API security (REST/GraphQL/gRPC testing, fuzzing)
- SSL/TLS analysis
**Binary & Reverse Engineering**
- Binary analysis and disassembly
- Malware sandboxing and behavior analysis
- Exploit development tools
- Firmware extraction and analysis
**Mobile & IoT Security**
- Mobile app analysis (Android/iOS static/dynamic analysis)
- IoT device security and firmware analysis
- SCADA/ICS and industrial protocol testing
- Automotive security (CAN bus, ECU testing)
**Data & Compliance**
- Database security testing
- Encryption and cryptography analysis
- Secrets and credential detection
- Privacy tools (PII detection, GDPR compliance)
- Compliance checkers (PCI-DSS, HIPAA, SOC2, ISO27001)
**Threat Intelligence & Risk**
- OSINT and reconnaissance tools
- Threat hunting and IOC correlation
- Risk assessment and attack surface mapping
- Security audit and policy validation
**Emerging Technologies**
- AI/ML security (model poisoning, adversarial testing)
- Blockchain and smart contract analysis
- Quantum-safe cryptography testing
**Custom & Integration**
- Domain-specific security tools
- Bridges to existing security tools
- Multi-tool orchestration and result aggregation
### Example: Simple Security Scanner Module
```python
# src/module/mod.py
from pathlib import Path

from fuzzforge_modules_sdk.api.modules import BaseModule
from fuzzforge_modules_sdk.api.models import ModuleResult

from .models import ScannerConfig, ScannerOutput


class SecurityScanner(BaseModule[ScannerConfig, ScannerOutput]):
    """Scans for common security issues in code."""

    def execute(self) -> ModuleResult[ScannerOutput]:
        findings = []

        # Scan all source files
        for file_path in self.input_path.rglob("*"):
            if file_path.is_file():
                findings.extend(self.scan_file(file_path))

        return ModuleResult(
            success=True,
            output=ScannerOutput(
                findings=findings,
                files_scanned=len(list(self.input_path.rglob("*"))),
            ),
        )

    def scan_file(self, path: Path) -> list[dict]:
        """Scan a single file for security issues."""
        # Your scanning logic here
        return []
```
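The example imports `ScannerConfig` and `ScannerOutput` from `.models`. A matching sketch, following the same `BaseModuleConfig`/`BaseModuleOutput` pattern shown above (the specific option fields are assumptions, not SDK requirements):
```python
# src/module/models.py (sketch to pair with the scanner example)
from pydantic import Field

from fuzzforge_modules_sdk.api.models import BaseModuleConfig, BaseModuleOutput


class ScannerConfig(BaseModuleConfig):
    """Configuration for the example scanner."""

    # Hypothetical options -- adjust to whatever your scanner actually needs
    include_extensions: list[str] = Field(default_factory=lambda: [".py", ".rs", ".js"])
    max_file_size: int = Field(default=1_000_000, description="Skip files larger than this many bytes")


class ScannerOutput(BaseModuleOutput):
    """Output of the example scanner."""

    findings: list[dict] = Field(default_factory=list)
    files_scanned: int = Field(default=0)
```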
### Testing Modules
Create tests in `tests/`:
```python
from pathlib import Path

import pytest

from module.mod import MyModule
from module.models import MyModuleConfig


def test_module_execution():
    config = MyModuleConfig(timeout=60)
    module = MyModule(config=config, input_path=Path("test_assets"))
    result = module.execute()

    assert result.success
    assert len(result.output.findings) >= 0
```
Run tests:
```bash
uv run pytest
```
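If you prefer not to keep static `test_assets` in the repository, pytest's built-in `tmp_path` fixture lets each test build a throwaway input directory. A sketch that reuses the imports from the test file above (the expectation of zero findings on empty input is an assumption about your module's behavior):
```python
def test_module_handles_empty_input(tmp_path):
    """The module should succeed (with no findings) on a near-empty asset tree."""
    # tmp_path is a per-test temporary directory provided by pytest
    (tmp_path / "src").mkdir()
    (tmp_path / "src" / "empty.py").write_text("")

    config = MyModuleConfig(timeout=60)
    module = MyModule(config=config, input_path=tmp_path)
    result = module.execute()

    assert result.success
    assert result.output.findings == []
```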
### Security Guidelines
**Critical Requirements:**
- Never commit secrets, API keys, or credentials
- Focus on **defensive security** tools and analysis
- Do not create tools for malicious purposes
- Test modules thoroughly before submission
- Follow responsible disclosure for security issues
- Use minimal, secure base images for containers
- Avoid running containers as root when possible
**Security Resources:**
- [OWASP Container Security](https://cheatsheetseries.owasp.org/cheatsheets/Docker_Security_Cheat_Sheet.html)
- [CIS Docker Benchmarks](https://www.cisecurity.org/benchmark/docker)
## Contributing to Core Features
Beyond modules, you can contribute to FuzzForge's core components.
**Useful Resources:**
- [Project Structure](README.md) - Overview of the codebase
- [USAGE Guide](USAGE.md) - Installation and setup
- Python best practices: [PEP 8](https://pep8.org/)
### Core Components
- **fuzzforge-mcp** - MCP server for AI agent integration
- **fuzzforge-runner** - Module execution engine
- **fuzzforge-cli** - Command-line interface
- **fuzzforge-common** - Shared utilities and sandbox engines
- **fuzzforge-types** - Type definitions and schemas
### Development Setup
1. **Clone and Install**
```bash
git clone https://github.com/FuzzingLabs/fuzzforge-oss.git
cd fuzzforge-oss
uv sync --all-extras
```
2. **Run Tests**
```bash
# Run all tests
make test
# Run specific package tests
cd fuzzforge-mcp
uv run pytest
```
3. **Type Checking**
```bash
# Type check all packages
make typecheck
# Type check specific package
cd fuzzforge-runner
uv run mypy .
```
4. **Linting and Formatting**
```bash
# Format code
make format
# Lint code
make lint
```
## Bug Reports
When reporting bugs, please include:
- **Environment**: OS, Python version, Docker version, uv version
- **FuzzForge Version**: Output of `uv run fuzzforge --version`
- **Module**: Which module or component is affected
- **Steps to Reproduce**: Clear steps to recreate the issue
- **Expected Behavior**: What should happen
- **Actual Behavior**: What actually happens
- **Logs**: Relevant error messages and stack traces
- **Container Logs**: For module issues, include Docker/Podman logs
- **Screenshots**: If applicable
Use our [Bug Report Template](.github/ISSUE_TEMPLATE/bug_report.md).
**Example:**
````markdown
**Environment:**
- OS: Ubuntu 22.04
- Python: 3.14.2
- Docker: 24.0.7
- uv: 0.5.13

**Module:** my-custom-scanner

**Steps to Reproduce:**
1. Run `uv run fuzzforge modules run my-scanner --assets ./test-target`
2. Module fails with timeout error

**Expected:** Module completes analysis
**Actual:** Times out after 30 seconds

**Logs:**
```
ERROR: Module execution timeout
...
```
````
## Feature Requests
For new features, please provide:
- **Proposed Solution**: How should it work?
- **Alternatives**: Other approaches considered
- **Implementation**: Technical considerations (optional)
- **Module vs Core**: Should this be a module or core feature?
Use our [Feature Request Template](.github/ISSUE_TEMPLATE/feature_request.md).
**Example Feature Requests:**
- New module for cloud security posture management (CSPM)
- Module for analyzing smart contract vulnerabilities
- MCP tool for orchestrating multi-module workflows
- CLI command for batch module execution across multiple targets
- Support for distributed fuzzing campaigns
- Integration with CI/CD pipelines
- Module marketplace/registry features
## Documentation
Help improve our documentation:
- **Module Documentation**: Document your modules in their README.md
- **API Documentation**: Update docstrings and type hints
- **User Guides**: Improve USAGE.md, tutorials, and how-to guides
- **Module SDK Guides**: Help document the SDK for module developers
- **MCP Integration**: Document AI agent integration patterns
- **Examples**: Add practical usage examples and workflows
### Documentation Standards
- Use clear, concise language
- Include code examples
- Add command-line examples with expected output
- Document all configuration options
- Explain error messages and troubleshooting
### Module README Template
```markdown
# Module Name
Brief description of what this module does.
## Features
- Feature 1
- Feature 2
## Configuration
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| timeout | int | 300 | Timeout in seconds |
## Usage
\`\`\`bash
uv run fuzzforge modules run module-name --assets ./path/to/assets
\`\`\`
## Output
Describes the output structure and format.
## Examples
Practical usage examples.
```
## Recognition
Contributors will be:
- Listed in our [Contributors](CONTRIBUTORS.md) file
- Mentioned in release notes for significant contributions
- Credited in module documentation (for module authors)
- Eligible for FuzzingLabs Academy courses and swag
- Invited to join our [Discord community](https://discord.gg/8XEX33UUwZ)
## Module Submission Checklist
Before submitting a new module:
- [ ] Module follows SDK structure and conventions
- [ ] Dockerfile builds successfully
- [ ] Module executes without errors
- [ ] Configuration options are documented
- [ ] README.md is complete with examples
- [ ] Tests are included (pytest)
- [ ] Type hints are used throughout
- [ ] Linting passes (ruff)
- [ ] Security best practices followed
- [ ] No secrets or credentials in code
- [ ] License headers included
## Review Process
1. **Initial Review** - Maintainers review for completeness
2. **Technical Review** - Code quality and security assessment
3. **Testing** - Module tested in isolated environment
4. **Documentation Review** - Ensure docs are clear and complete
5. **Approval** - Module merged and included in next release
## License
By contributing to FuzzForge OSS, you agree that your contributions will be licensed under the same license as the project (see [LICENSE](LICENSE)).
For module contributions:
- Modules you create remain under the project license
- You retain credit as the module author
- Your module may be used by others under the project license terms
---
## Getting Help
Need help contributing?
- Join our [Discord](https://discord.gg/8XEX33UUwZ)
- Read the [Module SDK Documentation](fuzzforge-modules/fuzzforge-modules-sdk/README.md)
- Check the module template for examples
- Contact: contact@fuzzinglabs.com
---
**Thank you for making FuzzForge better!**
Every contribution, no matter how small, helps build a stronger security research platform. Whether you're creating a module for web security, cloud scanning, mobile analysis, or any other cybersecurity domain, your work makes FuzzForge more powerful and versatile for the entire security community!

Makefile (new file)
.PHONY: help install sync format lint typecheck test build-modules clean
SHELL := /bin/bash
# Default target
help:
@echo "FuzzForge OSS Development Commands"
@echo ""
@echo " make install - Install all dependencies"
@echo " make sync - Sync shared packages from upstream"
@echo " make format - Format code with ruff"
@echo " make lint - Lint code with ruff"
@echo " make typecheck - Type check with mypy"
@echo " make test - Run all tests"
@echo " make build-modules - Build all module container images"
@echo " make clean - Clean build artifacts"
@echo ""
# Install all dependencies
install:
uv sync
# Sync shared packages from upstream fuzzforge-core
sync:
@if [ -z "$(UPSTREAM)" ]; then \
echo "Usage: make sync UPSTREAM=/path/to/fuzzforge-core"; \
exit 1; \
fi
./scripts/sync-upstream.sh $(UPSTREAM)
# Format all packages
format:
@for pkg in packages/fuzzforge-*/; do \
if [ -f "$$pkg/pyproject.toml" ]; then \
echo "Formatting $$pkg..."; \
cd "$$pkg" && uv run ruff format . && cd -; \
fi \
done
# Lint all packages
lint:
@for pkg in packages/fuzzforge-*/; do \
if [ -f "$$pkg/pyproject.toml" ]; then \
echo "Linting $$pkg..."; \
cd "$$pkg" && uv run ruff check . && cd -; \
fi \
done
# Type check all packages
typecheck:
@for pkg in packages/fuzzforge-*/; do \
if [ -f "$$pkg/pyproject.toml" ] && [ -f "$$pkg/mypy.ini" ]; then \
echo "Type checking $$pkg..."; \
cd "$$pkg" && uv run mypy . && cd -; \
fi \
done
# Run all tests
test:
@for pkg in packages/fuzzforge-*/; do \
if [ -f "$$pkg/pytest.ini" ]; then \
echo "Testing $$pkg..."; \
cd "$$pkg" && uv run pytest && cd -; \
fi \
done
# Build all module container images
# Uses Docker by default, or Podman if FUZZFORGE_ENGINE=podman
build-modules:
@echo "Building FuzzForge module images..."
@if [ "$$FUZZFORGE_ENGINE" = "podman" ]; then \
if [ -n "$$SNAP" ]; then \
echo "Using Podman with isolated storage (Snap detected)"; \
CONTAINER_CMD="podman --root ~/.fuzzforge/containers/storage --runroot ~/.fuzzforge/containers/run"; \
else \
echo "Using Podman"; \
CONTAINER_CMD="podman"; \
fi; \
else \
echo "Using Docker"; \
CONTAINER_CMD="docker"; \
fi; \
for module in fuzzforge-modules/*/; do \
if [ -f "$$module/Dockerfile" ] && \
[ "$$module" != "fuzzforge-modules/fuzzforge-modules-sdk/" ] && \
[ "$$module" != "fuzzforge-modules/fuzzforge-module-template/" ]; then \
name=$$(basename $$module); \
version=$$(grep 'version' "$$module/pyproject.toml" 2>/dev/null | head -1 | sed 's/.*"\(.*\)".*/\1/' || echo "0.1.0"); \
echo "Building $$name:$$version..."; \
$$CONTAINER_CMD build -t "fuzzforge-$$name:$$version" "$$module" || exit 1; \
fi \
done
@echo ""
@echo "✓ All modules built successfully!"
# Clean build artifacts
clean:
find . -type d -name "__pycache__" -exec rm -rf {} + 2>/dev/null || true
find . -type d -name ".pytest_cache" -exec rm -rf {} + 2>/dev/null || true
find . -type d -name ".mypy_cache" -exec rm -rf {} + 2>/dev/null || true
find . -type d -name ".ruff_cache" -exec rm -rf {} + 2>/dev/null || true
find . -type d -name "*.egg-info" -exec rm -rf {} + 2>/dev/null || true
find . -type f -name "*.pyc" -delete 2>/dev/null || true

(deleted file)
# FuzzForge Temporal Architecture - Quick Start Guide
This guide walks you through starting and testing the new Temporal-based architecture.
## Prerequisites
- Docker and Docker Compose installed
- At least 2GB free RAM (core services only, workers start on-demand)
- Ports available: 7233, 8233, 9000, 9001, 8000
## Step 1: Start Core Services
```bash
# From project root
cd /path/to/fuzzforge_ai
# Start core services (Temporal, MinIO, Backend)
docker-compose up -d
# Workers are pre-built but don't auto-start (saves ~6-7GB RAM)
# They'll start automatically when workflows need them
# Check status
docker-compose ps
```
**Expected output:**
```
NAME STATUS PORTS
fuzzforge-minio healthy 0.0.0.0:9000-9001->9000-9001/tcp
fuzzforge-temporal healthy 0.0.0.0:7233->7233/tcp
fuzzforge-temporal-postgresql healthy 5432/tcp
fuzzforge-backend healthy 0.0.0.0:8000->8000/tcp
fuzzforge-minio-setup exited (0)
# Workers NOT running (will start on-demand)
```
**First startup takes ~30-60 seconds** for health checks to pass.
## Step 2: Verify Worker Discovery
Check worker logs to ensure workflows are discovered:
```bash
docker logs fuzzforge-worker-rust
```
**Expected output:**
```
============================================================
FuzzForge Vertical Worker: rust
============================================================
Temporal Address: temporal:7233
Task Queue: rust-queue
Max Concurrent Activities: 5
============================================================
Discovering workflows for vertical: rust
Importing workflow module: toolbox.workflows.rust_test.workflow
✓ Discovered workflow: RustTestWorkflow from rust_test (vertical: rust)
Discovered 1 workflows for vertical 'rust'
Connecting to Temporal at temporal:7233...
✓ Connected to Temporal successfully
Creating worker on task queue: rust-queue
✓ Worker created successfully
============================================================
🚀 Worker started for vertical 'rust'
📦 Registered 1 workflows
⚙️ Registered 3 activities
📨 Listening on task queue: rust-queue
============================================================
Worker is ready to process tasks...
```
## Step 2.5: Worker Lifecycle Management (New in v0.7.0)
Workers start on-demand when workflows need them:
```bash
# Check worker status (should show Exited or not running)
docker ps -a --filter "name=fuzzforge-worker"
# Run a workflow - worker starts automatically
ff workflow run ossfuzz_campaign . project_name=zlib
# Worker is now running
docker ps --filter "name=fuzzforge-worker-ossfuzz"
```
**Configuration** (`.fuzzforge/config.yaml`):
```yaml
workers:
auto_start_workers: true # Default: auto-start
auto_stop_workers: false # Default: keep running
worker_startup_timeout: 60 # Startup timeout in seconds
```
**CLI Control**:
```bash
# Disable auto-start
ff workflow run ossfuzz_campaign . --no-auto-start
# Enable auto-stop after completion
ff workflow run ossfuzz_campaign . --wait --auto-stop
```
## Step 3: Access Web UIs
### Temporal Web UI
- URL: http://localhost:8233
- View workflows, executions, and task queues
### MinIO Console
- URL: http://localhost:9001
- Login: `fuzzforge` / `fuzzforge123`
- View uploaded targets and results
## Step 4: Test Workflow Execution
### Option A: Using Temporal CLI (tctl)
```bash
# Install tctl (if not already installed)
brew install temporal # macOS
# or download from https://github.com/temporalio/tctl/releases
# Execute test workflow
tctl workflow run \
--address localhost:7233 \
--taskqueue rust-queue \
--workflow_type RustTestWorkflow \
--input '{"target_id": "test-123", "test_message": "Hello Temporal!"}'
```
### Option B: Using Python Client
Create `test_workflow.py`:
```python
import asyncio
from temporalio.client import Client


async def main():
    # Connect to Temporal
    client = await Client.connect("localhost:7233")

    # Start workflow
    result = await client.execute_workflow(
        "RustTestWorkflow",
        {"target_id": "test-123", "test_message": "Hello Temporal!"},
        id="test-workflow-1",
        task_queue="rust-queue",
    )
    print("Workflow result:", result)


if __name__ == "__main__":
    asyncio.run(main())
```
```bash
python test_workflow.py
```
### Option C: Upload Target and Run (Full Flow)
```python
# upload_and_run.py
import asyncio
import boto3
from pathlib import Path
from temporalio.client import Client


async def main():
    # 1. Upload target to MinIO
    s3 = boto3.client(
        's3',
        endpoint_url='http://localhost:9000',
        aws_access_key_id='fuzzforge',
        aws_secret_access_key='fuzzforge123',
        region_name='us-east-1',
    )

    # Create a test file
    test_file = Path('/tmp/test_target.txt')
    test_file.write_text('This is a test target file')

    # Upload to MinIO
    target_id = 'my-test-target-001'
    s3.upload_file(
        str(test_file),
        'targets',
        f'{target_id}/target',
    )
    print(f"✓ Uploaded target: {target_id}")

    # 2. Run workflow
    client = await Client.connect("localhost:7233")
    result = await client.execute_workflow(
        "RustTestWorkflow",
        {"target_id": target_id, "test_message": "Full flow test!"},
        id=f"workflow-{target_id}",
        task_queue="rust-queue",
    )
    print("✓ Workflow completed!")
    print("Results:", result)


if __name__ == "__main__":
    asyncio.run(main())
```
```bash
# Install dependencies
pip install temporalio boto3
# Run test
python upload_and_run.py
```
## Step 5: Monitor Execution
### View in Temporal UI
1. Open http://localhost:8233
2. Click on "Workflows"
3. Find your workflow by ID
4. Click to see:
- Execution history
- Activity results
- Error stack traces (if any)
### View Logs
```bash
# Worker logs (shows activity execution)
docker logs -f fuzzforge-worker-rust
# Temporal server logs
docker logs -f fuzzforge-temporal
```
### Check MinIO Storage
1. Open http://localhost:9001
2. Login: `fuzzforge` / `fuzzforge123`
3. Browse buckets:
- `targets/` - Uploaded target files
- `results/` - Workflow results (if uploaded)
- `cache/` - Worker cache (temporary)
## Troubleshooting
### Services Not Starting
```bash
# Check logs for all services
docker-compose -f docker-compose.temporal.yaml logs
# Check specific service
docker-compose -f docker-compose.temporal.yaml logs temporal
docker-compose -f docker-compose.temporal.yaml logs minio
docker-compose -f docker-compose.temporal.yaml logs worker-rust
```
### Worker Not Discovering Workflows
**Issue**: Worker logs show "No workflows found for vertical: rust"
**Solution**:
1. Check toolbox mount: `docker exec fuzzforge-worker-rust ls /app/toolbox/workflows`
2. Verify metadata.yaml exists and has `vertical: rust`
3. Check that `workflow.py` has the `@workflow.defn` decorator (see the sketch below)
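For reference, a minimal workflow definition that discovery can register looks roughly like this (the file path, class, and activity name are illustrative; real workflows live under `backend/toolbox/workflows/`):
```python
# toolbox/workflows/rust_test/workflow.py (illustrative sketch)
from datetime import timedelta

from temporalio import workflow


@workflow.defn  # without this decorator the worker will not register the workflow
class RustTestWorkflow:
    @workflow.run
    async def run(self, params: dict) -> dict:
        # "run_rust_test" is a placeholder activity name registered by the worker
        return await workflow.execute_activity(
            "run_rust_test",
            params,
            start_to_close_timeout=timedelta(minutes=10),
        )
```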
### Cannot Connect to Temporal
**Issue**: `Failed to connect to Temporal`
**Solution**:
```bash
# Wait for Temporal to be healthy
docker-compose -f docker-compose.temporal.yaml ps
# Check Temporal health manually
curl http://localhost:8233
# Restart Temporal if needed
docker-compose -f docker-compose.temporal.yaml restart temporal
```
### MinIO Connection Failed
**Issue**: `Failed to download target`
**Solution**:
```bash
# Check MinIO is running
docker ps | grep minio
# Check buckets exist
docker exec fuzzforge-minio mc ls fuzzforge/
# Verify target was uploaded
docker exec fuzzforge-minio mc ls fuzzforge/targets/
```
### Workflow Hangs
**Issue**: Workflow starts but never completes
**Check**:
1. Worker logs for errors: `docker logs fuzzforge-worker-rust`
2. Activity timeouts in workflow code
3. Target file actually exists in MinIO
## Scaling
### Add More Workers
```bash
# Scale rust workers horizontally
docker-compose -f docker-compose.temporal.yaml up -d --scale worker-rust=3
# Verify all workers are running
docker ps | grep worker-rust
```
### Increase Concurrent Activities
Edit `docker-compose.temporal.yaml`:
```yaml
worker-rust:
environment:
MAX_CONCURRENT_ACTIVITIES: 10 # Increase from 5
```
```bash
# Apply changes
docker-compose -f docker-compose.temporal.yaml up -d worker-rust
```
## Cleanup
```bash
# Stop all services
docker-compose -f docker-compose.temporal.yaml down
# Remove volumes (WARNING: deletes all data)
docker-compose -f docker-compose.temporal.yaml down -v
# Remove everything including images
docker-compose -f docker-compose.temporal.yaml down -v --rmi all
```
## Next Steps
1. **Add More Workflows**: Create workflows in `backend/toolbox/workflows/`
2. **Add More Verticals**: Create new worker types (android, web, etc.) - see `workers/README.md`
3. **Integrate with Backend**: Update FastAPI backend to use Temporal client
4. **Update CLI**: Modify `ff` CLI to work with Temporal workflows
## Useful Commands
```bash
# View all logs
docker-compose -f docker-compose.temporal.yaml logs -f
# View specific service logs
docker-compose -f docker-compose.temporal.yaml logs -f worker-rust
# Restart a service
docker-compose -f docker-compose.temporal.yaml restart worker-rust
# Check service status
docker-compose -f docker-compose.temporal.yaml ps
# Execute command in worker
docker exec -it fuzzforge-worker-rust bash
# View worker Python environment
docker exec fuzzforge-worker-rust pip list
# Check workflow discovery manually
docker exec fuzzforge-worker-rust python -c "
from pathlib import Path
import yaml
for w in Path('/app/toolbox/workflows').iterdir():
    if w.is_dir():
        meta = w / 'metadata.yaml'
        if meta.exists():
            print(f'{w.name}: {yaml.safe_load(meta.read_text()).get(\"vertical\")}')"
```
## Architecture Overview
```
┌─────────────┐ ┌──────────────┐ ┌──────────────┐
│ Temporal │────▶│ Task Queue │────▶│ Worker-Rust │
│ Server │ │ rust-queue │ │ (Long-lived)│
└─────────────┘ └──────────────┘ └──────┬───────┘
│ │
│ │
▼ ▼
┌─────────────┐ ┌──────────────┐
│ Postgres │ │ MinIO │
│ (State) │ │ (Storage) │
└─────────────┘ └──────────────┘
┌──────┴──────┐
│ │
┌────▼────┐ ┌─────▼────┐
│ Targets │ │ Results │
└─────────┘ └──────────┘
```
## Support
- **Documentation**: See `ARCHITECTURE.md` for detailed design
- **Worker Guide**: See `workers/README.md` for adding verticals
- **Issues**: Open GitHub issue with logs and steps to reproduce

README.md (modified)
<h1 align="center"> FuzzForge OSS</h1>
<h3 align="center">AI-Powered Security Research Orchestration via MCP</h3>
<p align="center">
<img src="docs/static/img/fuzzforge_banner_github.png" alt="FuzzForge Banner" width="100%">
<a href="https://discord.gg/8XEX33UUwZ"><img src="https://img.shields.io/discord/1420767905255133267?logo=discord&label=Discord" alt="Discord"></a>
<a href="LICENSE"><img src="https://img.shields.io/badge/license-BSL%201.1-blue" alt="License: BSL 1.1"></a>
<a href="https://www.python.org/downloads/"><img src="https://img.shields.io/badge/python-3.12%2B-blue" alt="Python 3.12+"/></a>
<a href="https://modelcontextprotocol.io"><img src="https://img.shields.io/badge/MCP-compatible-green" alt="MCP Compatible"/></a>
<a href="https://fuzzforge.ai"><img src="https://img.shields.io/badge/Website-fuzzforge.ai-purple" alt="Website"/></a>
</p>
<h1 align="center">🚧 FuzzForge is under active development</h1>
<p align="center"><strong>AI-powered workflow automation and AI Agents for AppSec, Fuzzing & Offensive Security</strong></p>
<p align="center">
<a href="https://discord.gg/8XEX33UUwZ/"><img src="https://img.shields.io/discord/1420767905255133267?logo=discord&label=Discord" alt="Discord"></a>
<a href="LICENSE"><img src="https://img.shields.io/badge/license-BSL%20%2B%20Apache-orange" alt="License: BSL + Apache"></a>
<a href="https://www.python.org/downloads/"><img src="https://img.shields.io/badge/python-3.11%2B-blue" alt="Python 3.11+"/></a>
<a href="https://fuzzforge.ai"><img src="https://img.shields.io/badge/Website-fuzzforge.ai-blue" alt="Website"/></a>
<img src="https://img.shields.io/badge/version-0.7.0-green" alt="Version">
<a href="https://github.com/FuzzingLabs/fuzzforge_ai/stargazers"><img src="https://img.shields.io/github/stars/FuzzingLabs/fuzzforge_ai?style=social" alt="GitHub Stars"></a>
<strong>Let AI agents orchestrate your security research workflows locally</strong>
</p>
<p align="center">
<sub>
<a href="#-overview"><b>Overview</b></a>
<a href="#-features"><b>Features</b></a>
<a href="#-installation"><b>Installation</b></a>
<a href="USAGE.md"><b>Usage Guide</b></a>
<a href="#-modules"><b>Modules</b></a>
<a href="#-contributing"><b>Contributing</b></a>
</sub>
</p>
---
> 🚧 **FuzzForge OSS is under active development.** Expect breaking changes and new features!
---
## 🚀 Overview
**FuzzForge** helps security researchers and engineers automate **application security** and **offensive security** workflows with the power of AI and fuzzing frameworks.
**FuzzForge OSS** is an open-source runtime that enables AI agents (GitHub Copilot, Claude, etc.) to orchestrate security research workflows through the **Model Context Protocol (MCP)**.
- Orchestrate static & dynamic analysis
- Automate vulnerability research
- Scale AppSec testing with AI agents
- Build, share & reuse workflows across teams
### The Core: Modules
FuzzForge is **open source**, built to empower security teams, researchers, and the community.
At the heart of FuzzForge are **modules** - containerized security tools that AI agents can discover, configure, and orchestrate. Each module encapsulates a specific security capability (static analysis, fuzzing, crash analysis, etc.) and runs in an isolated container.
> 🚧 FuzzForge is under active development. Expect breaking changes.
>
> **Note:** Fuzzing workflows (`atheris_fuzzing`, `cargo_fuzzing`, `ossfuzz_campaign`) are in early development. OSS-Fuzz integration is under heavy active development. For stable workflows, use: `security_assessment`, `gitleaks_detection`, `trufflehog_detection`, or `llm_secret_detection`.
- **🔌 Plug & Play**: Modules are self-contained - just pull and run
- **🤖 AI-Native**: Designed for AI agent orchestration via MCP
- **🔗 Composable**: Chain modules together into automated workflows
- **📦 Extensible**: Build custom modules with the Python SDK
---
The OSS runtime handles module discovery, execution, and result collection. Security modules (developed separately) provide the actual security tooling - from static analyzers to fuzzers to crash triagers.
## Demo - Manual Workflow Setup
Instead of manually running security tools, describe what you want and let your AI assistant handle it.
![Manual Workflow Demo](docs/static/videos/manual_workflow.gif)
### 🎬 Use Case: Rust Fuzzing Pipeline
_Setting up and running security workflows through the interface_
> **Scenario**: Fuzz a Rust crate to discover vulnerabilities using AI-assisted harness generation and parallel fuzzing.
👉 More installation options in the [Documentation](https://docs.fuzzforge.ai).
---
## ✨ Key Features
- 🤖 **AI Agents for Security** Specialized agents for AppSec, reversing, and fuzzing
- 🛠 **Workflow Automation** Define & execute AppSec workflows as code
- 📈 **Vulnerability Research at Scale** Rediscover 1-days & find 0-days with automation
- 🔗 **Fuzzer Integration** Atheris (Python), cargo-fuzz (Rust), OSS-Fuzz campaigns
- 🌐 **Community Marketplace** Share workflows, corpora, PoCs, and modules
- 🔒 **Enterprise Ready** Team/Corp cloud tiers for scaling offensive security
<table align="center">
<tr>
<th>1⃣ Analyze, Generate & Validate Harnesses</th>
<th>2⃣ Run Parallel Continuous Fuzzing</th>
</tr>
<tr>
<td><img src="assets/demopart2.gif" alt="FuzzForge Demo - Analysis Pipeline" width="100%"></td>
<td><img src="assets/demopart1.gif" alt="FuzzForge Demo - Parallel Fuzzing" width="100%"></td>
</tr>
<tr>
<td align="center"><sub>AI agent analyzes code, generates harnesses, and validates they compile</sub></td>
<td align="center"><sub>Multiple fuzzing sessions run in parallel with live metrics</sub></td>
</tr>
</table>
---
## ⭐ Support the Project
<a href="https://github.com/FuzzingLabs/fuzzforge_ai/stargazers">
<img src="https://img.shields.io/github/stars/FuzzingLabs/fuzzforge_ai?style=social" alt="GitHub Stars">
</a>
If you find FuzzForge useful, please star the repo to support development 🚀
---
## ✨ Features
| Feature | Description |
|---------|-------------|
| 🤖 **AI-Native** | Built for MCP - works with GitHub Copilot, Claude, and any MCP-compatible agent |
| 📦 **Containerized** | Each module runs in isolation via Docker or Podman |
| 🔄 **Continuous Mode** | Long-running tasks (fuzzing) with real-time metrics streaming |
| 🔗 **Workflows** | Chain multiple modules together in automated pipelines |
| 🛠️ **Extensible** | Create custom modules with the Python SDK |
| 🏠 **Local First** | All execution happens on your machine - no cloud required |
| 🔒 **Secure** | Sandboxed containers with no network access by default |
---
## 🔍 Secret Detection Benchmarks
FuzzForge includes three secret detection workflows benchmarked on a controlled dataset of **32 documented secrets** (12 Easy, 10 Medium, 10 Hard):
| Tool | Recall | Secrets Found | Speed |
|------|--------|---------------|-------|
| **LLM (gpt-5-mini)** | **84.4%** | 41 | 618s |
| **LLM (gpt-4o-mini)** | 56.2% | 30 | 297s |
| **Gitleaks** | 37.5% | 12 | 5s |
| **TruffleHog** | 0.0% | 1 | 5s |
📊 [Full benchmark results and analysis](backend/benchmarks/by_category/secret_detection/results/comparison_report.md)
The LLM-based detector excels at finding obfuscated and hidden secrets through semantic analysis, while pattern-based tools (Gitleaks) offer speed for standard secret formats.
## 🏗️ Architecture
```
┌─────────────────────────────────────────────────────────────────┐
│ AI Agent (Copilot/Claude) │
└───────────────────────────┬─────────────────────────────────────┘
│ MCP Protocol (stdio)
┌─────────────────────────────────────────────────────────────────┐
│ FuzzForge MCP Server │
│ ┌─────────────┐ ┌──────────────┐ ┌────────────────────────┐ │
│ │list_modules │ │execute_module│ │start_continuous_module │ │
│ └─────────────┘ └──────────────┘ └────────────────────────┘ │
└───────────────────────────┬─────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ FuzzForge Runner │
│ Container Engine (Docker/Podman) │
└───────────────────────────┬─────────────────────────────────────┘
┌───────────────────┼───────────────────┐
▼ ▼ ▼
┌───────────────┐ ┌───────────────┐ ┌───────────────┐
│ Module A │ │ Module B │ │ Module C │
│ (Container) │ │ (Container) │ │ (Container) │
└───────────────┘ └───────────────┘ └───────────────┘
```
---
## 📦 Installation
### Prerequisites
- **Python 3.12+**
- **[uv](https://docs.astral.sh/uv/)** package manager
- **Docker** ([Install Docker](https://docs.docker.com/get-docker/)) or Podman

Install uv with:
```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```
#### Configure AI Agent API Keys (Optional)
For AI-powered workflows, configure your LLM API keys:
```bash
cp volumes/env/.env.example volumes/env/.env
# Edit volumes/env/.env and add your API keys (OpenAI, Anthropic, Google, etc.)
```
This is required for:
- `llm_secret_detection` workflow
- AI agent features (`ff ai agent`)
Basic security workflows (gitleaks, trufflehog, security_assessment) work without this configuration.
### Quick Install
```bash
# Clone the repository
git clone https://github.com/FuzzingLabs/fuzzforge_ai.git
cd fuzzforge_ai

# Install dependencies
uv sync

# Build module images
make build-modules

# Optionally install the CLI globally (from the root directory)
uv tool install --python python3.12 .
```
### Configure MCP for Your AI Agent
```bash
# For GitHub Copilot
uv run fuzzforge mcp install copilot
# For Claude Code (CLI)
uv run fuzzforge mcp install claude-code
# For Claude Desktop (standalone app)
uv run fuzzforge mcp install claude-desktop
# Verify installation
uv run fuzzforge mcp status
```
**Restart your editor** and your AI agent will have access to FuzzForge tools!
---
## 📦 Modules
FuzzForge modules are containerized security tools that AI agents can orchestrate. The module ecosystem is designed around a simple principle: **the OSS runtime orchestrates, enterprise modules execute**.
### Module Ecosystem
| | FuzzForge OSS | FuzzForge Enterprise Modules |
|---|---|---|
| **What** | Runtime & MCP server | Security research modules |
| **License** | Apache 2.0 | BSL 1.1 (Business Source License) |
| **Compatibility** | ✅ Runs any compatible module | ✅ Works with OSS runtime |
**Enterprise modules** are developed separately and provide production-ready security tooling:
| Category | Modules | Description |
|----------|---------|-------------|
| 🔍 **Static Analysis** | Rust Analyzer, Solidity Analyzer, Cairo Analyzer | Code analysis and fuzzable function detection |
| 🎯 **Fuzzing** | Cargo Fuzzer, Honggfuzz, AFL++ | Coverage-guided fuzz testing |
| 💥 **Crash Analysis** | Crash Triager, Root Cause Analyzer | Automated crash deduplication and analysis |
| 🔐 **Vulnerability Detection** | Pattern Matcher, Taint Analyzer | Security vulnerability scanning |
| 📝 **Reporting** | Report Generator, SARIF Exporter | Automated security report generation |
> 💡 **Build your own modules!** The FuzzForge SDK allows you to create custom modules that integrate seamlessly with the OSS runtime. See [Creating Custom Modules](#-creating-custom-modules).
### Execution Modes
Modules run in two execution modes:
#### One-shot Execution
Run a module once and get results:
```python
result = execute_module("my-analyzer", assets_path="/path/to/project")
```
#### Continuous Execution
For long-running tasks like fuzzing, with real-time metrics:
```python
# Start continuous execution
session = start_continuous_module("my-fuzzer",
assets_path="/path/to/project",
configuration={"target": "my_target"})
# Check status with live metrics
status = get_continuous_status(session["session_id"])
# Stop and collect results
stop_continuous_module(session["session_id"])
```
---
## ⚡ Quickstart
Run your first workflow with **Temporal orchestration** and **automatic file upload**:
```bash
# 1. Clone the repo
git clone https://github.com/fuzzinglabs/fuzzforge_ai.git
cd fuzzforge_ai

# 2. Copy the default LLM env config
cp volumes/env/.env.example volumes/env/.env

# 3. Start FuzzForge with Temporal
docker compose up -d

# 4. Start the Python worker (needed for security_assessment workflow)
docker compose up -d worker-python
```
> The first launch can take 2-3 minutes for services to initialize ☕
>
> Workers don't auto-start by default (saves RAM). Start the worker you need before running workflows.
```bash
# 5. Run your first workflow (files are automatically uploaded)
cd test_projects/vulnerable_app/
fuzzforge init                          # Initialize FuzzForge project
ff workflow run security_assessment .   # Start workflow - CLI uploads files automatically!

# The CLI will:
# - Detect the local directory
# - Create a compressed tarball
# - Upload to backend (via MinIO)
# - Start the workflow on vertical worker
```
**What's running:**
- **Temporal**: Workflow orchestration (UI at http://localhost:8080)
- **MinIO**: File storage for targets (Console at http://localhost:9001)
- **Vertical Workers**: Pre-built workers with security toolchains
- **Backend API**: FuzzForge REST API (http://localhost:8000)
---
## 🛠️ Creating Custom Modules
Build your own security modules with the FuzzForge SDK:
```python
from fuzzforge_modules_sdk import FuzzForgeModule, FuzzForgeModuleResults


class MySecurityModule(FuzzForgeModule):
    def _run(self, resources):
        self.emit_event("started", target=resources[0].path)

        # Your analysis logic here
        results = self.analyze(resources)

        self.emit_progress(100, status="completed",
                           message="Analysis complete")
        return FuzzForgeModuleResults.SUCCESS
```
📖 See the [Module SDK Guide](fuzzforge-modules/fuzzforge-modules-sdk/README.md) for details.
---
## 📁 Project Structure
```
fuzzforge_ai/
├── fuzzforge-cli/             # Command-line interface
├── fuzzforge-common/          # Shared abstractions (containers, storage)
├── fuzzforge-mcp/             # MCP server for AI agents
├── fuzzforge-modules/         # Security modules
│   └── fuzzforge-modules-sdk/ # Module development SDK
├── fuzzforge-runner/          # Local execution engine
├── fuzzforge-types/           # Type definitions & schemas
└── demo/                      # Demo projects for testing
```
---
## AI-Powered Workflow Execution
![LLM Workflow Demo](docs/static/videos/llm_workflow.gif)
_AI agents automatically analyzing code and providing security insights_
## 🗺️ What's Next
**[MCP Security Hub](https://github.com/FuzzingLabs/mcp-security-hub) integration** — Bridge 175+ offensive security tools (Nmap, Nuclei, Ghidra, and more) into FuzzForge workflows, all orchestrated by AI agents.
See [ROADMAP.md](ROADMAP.md) for the full roadmap.
## 📚 Resources
- 🌐 [Website](https://fuzzforge.ai)
- 📖 [Documentation](https://docs.fuzzforge.ai)
- 💬 [Community Discord](https://discord.gg/8XEX33UUwZ)
- 🎓 [FuzzingLabs Academy](https://academy.fuzzinglabs.com/?coupon=GITHUB_FUZZFORGE)
---
## 🤝 Contributing
We welcome contributions from the community! There are many ways to help:
- 🐛 Report bugs via [GitHub Issues](../../issues)
- 💡 Suggest features or improvements
- 🔧 Submit pull requests with fixes or enhancements
- 📦 Share your custom modules, workflows, and corpora with the community
See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.
---
## 🗺️ Roadmap
Planned features and improvements:
- 📦 Public workflow & module marketplace
- 🤖 New specialized AI agents (Rust, Go, Android, Automotive)
- 🔗 Expanded fuzzer integrations (LibFuzzer, Jazzer, more network fuzzers)
- ☁️ Multi-tenant SaaS platform with team collaboration
- 📊 Advanced reporting & analytics
👉 Follow updates in the [GitHub issues](../../issues) and [Discord](https://discord.gg/8XEX33UUwZ)
---
## 📄 License
FuzzForge is released under the **Business Source License (BSL) 1.1**, with an automatic fallback to **Apache 2.0** after 4 years.
See [LICENSE](LICENSE) and [LICENSE-APACHE](LICENSE-APACHE) for details.
<p align="center">
<strong>Built with ❤️ by <a href="https://fuzzinglabs.com">FuzzingLabs</a></strong>
</p>

ROADMAP.md (new file)
# FuzzForge OSS Roadmap
This document outlines the planned features and development direction for FuzzForge OSS.
---
## 🎯 Upcoming Features
### 1. MCP Security Hub Integration
**Status:** 🔄 Planned
Integrate [mcp-security-hub](https://github.com/FuzzingLabs/mcp-security-hub) tools into FuzzForge, giving AI agents access to 28 MCP servers and 163+ security tools through a unified interface.
#### How It Works
Unlike native FuzzForge modules (built with the SDK), mcp-security-hub tools are **standalone MCP servers**. The integration will bridge these tools so they can be:
- Discovered via `list_modules` alongside native modules
- Executed through FuzzForge's orchestration layer
- Chained with native modules in workflows
| Aspect | Native Modules | MCP Hub Tools |
|--------|----------------|---------------|
| **Runtime** | FuzzForge SDK container | Standalone MCP server container |
| **Protocol** | Direct execution | MCP-to-MCP bridge |
| **Configuration** | Module config | Tool-specific args |
| **Output** | FuzzForge results format | Tool-native format (normalized) |
#### Goals
- Unified discovery of all available tools (native + hub)
- Orchestrate hub tools through FuzzForge's workflow engine
- Normalize outputs for consistent result handling
- No modification required to mcp-security-hub tools
#### Planned Tool Categories
| Category | Tools | Example Use Cases |
|----------|-------|-------------------|
| **Reconnaissance** | nmap, masscan, whatweb, shodan | Network scanning, service discovery |
| **Web Security** | nuclei, sqlmap, ffuf, nikto | Vulnerability scanning, fuzzing |
| **Binary Analysis** | radare2, binwalk, yara, capa, ghidra | Reverse engineering, malware analysis |
| **Cloud Security** | trivy, prowler | Container scanning, cloud auditing |
| **Secrets Detection** | gitleaks | Credential scanning |
| **OSINT** | maigret, dnstwist | Username tracking, typosquatting |
| **Threat Intel** | virustotal, otx | Malware analysis, IOC lookup |
#### Example Workflow
```
You: "Scan example.com for vulnerabilities and analyze any suspicious binaries"
AI Agent:
1. Uses nmap module for port discovery
2. Uses nuclei module for vulnerability scanning
3. Uses binwalk module to extract firmware
4. Uses yara module for malware detection
5. Generates consolidated report
```
---
### 2. User Interface
**Status:** 🔄 Planned
A graphical interface to manage FuzzForge without the command line.
#### Goals
- Provide an alternative to CLI for users who prefer visual tools
- Make configuration and monitoring more accessible
- Complement (not replace) the CLI experience
#### Planned Capabilities
| Capability | Description |
|------------|-------------|
| **Configuration** | Change MCP server settings, engine options, paths |
| **Module Management** | Browse, configure, and launch modules |
| **Execution Monitoring** | View running tasks, logs, progress, metrics |
| **Project Overview** | Manage projects and browse execution results |
| **Workflow Management** | Create and run multi-module workflows |
---
## 📋 Backlog
Features under consideration for future releases:
| Feature | Description |
|---------|-------------|
| **Module Marketplace** | Browse and install community modules |
| **Scheduled Executions** | Run modules on a schedule (cron-style) |
| **Team Collaboration** | Share projects, results, and workflows |
| **Reporting Engine** | Generate PDF/HTML security reports |
| **Notifications** | Slack, Discord, email alerts for findings |
---
## ✅ Completed
| Feature | Version | Date |
|---------|---------|------|
| Docker as default engine | 0.1.0 | Jan 2026 |
| MCP server for AI agents | 0.1.0 | Jan 2026 |
| CLI for project management | 0.1.0 | Jan 2026 |
| Continuous execution mode | 0.1.0 | Jan 2026 |
| Workflow orchestration | 0.1.0 | Jan 2026 |
---
## 💬 Feedback
Have suggestions for the roadmap?
- Open an issue on [GitHub](https://github.com/FuzzingLabs/fuzzforge_ai/issues)
- Join our [Discord](https://discord.gg/8XEX33UUwZ)
---
<p align="center">
<strong>Built with ❤️ by <a href="https://fuzzinglabs.com">FuzzingLabs</a></strong>
</p>

USAGE.md (new file)
# FuzzForge OSS Usage Guide
This guide covers everything you need to know to get started with FuzzForge OSS - from installation to running your first security research workflow with AI.
> **FuzzForge is designed to be used with AI agents** (GitHub Copilot, Claude, etc.) via MCP.
> The CLI is available for advanced users but the primary experience is through natural language interaction with your AI assistant.
---
## Table of Contents
- [Quick Start](#quick-start)
- [Prerequisites](#prerequisites)
- [Installation](#installation)
- [Building Modules](#building-modules)
- [MCP Server Configuration](#mcp-server-configuration)
- [GitHub Copilot](#github-copilot)
- [Claude Code (CLI)](#claude-code-cli)
- [Claude Desktop](#claude-desktop)
- [Using FuzzForge with AI](#using-fuzzforge-with-ai)
- [CLI Reference](#cli-reference)
- [Environment Variables](#environment-variables)
- [Troubleshooting](#troubleshooting)
---
## Quick Start
> **Prerequisites:** You need [uv](https://docs.astral.sh/uv/) and [Docker](https://docs.docker.com/get-docker/) installed.
> See the [Prerequisites](#prerequisites) section for installation instructions.
```bash
# 1. Clone and install
git clone https://github.com/FuzzingLabs/fuzzforge-oss.git
cd fuzzforge-oss
uv sync
# 2. Build the module images (one-time setup)
make build-modules
# 3. Install MCP for your AI agent
uv run fuzzforge mcp install copilot # For VS Code + GitHub Copilot
# OR
uv run fuzzforge mcp install claude-code # For Claude Code CLI
# 4. Restart your AI agent (VS Code, Claude, etc.)
# 5. Start talking to your AI:
# "List available FuzzForge modules"
# "Analyze this Rust crate for fuzzable functions"
# "Start fuzzing the parse_input function"
```
> **Note:** FuzzForge uses Docker by default. Podman is also supported via `--engine podman`.
---
## Prerequisites
Before installing FuzzForge OSS, ensure you have:
- **Python 3.12+** - [Download Python](https://www.python.org/downloads/)
- **uv** package manager - [Install uv](https://docs.astral.sh/uv/)
- **Docker** - Container runtime ([Install Docker](https://docs.docker.com/get-docker/))
### Installing uv
```bash
# Linux/macOS
curl -LsSf https://astral.sh/uv/install.sh | sh
# Or with pip
pip install uv
```
### Installing Docker
```bash
# Linux (Ubuntu/Debian)
curl -fsSL https://get.docker.com | sh
sudo usermod -aG docker $USER
# Log out and back in for group changes to take effect
# macOS/Windows
# Install Docker Desktop from https://docs.docker.com/get-docker/
```
> **Note:** Podman is also supported. Use `--engine podman` with CLI commands
> or set `FUZZFORGE_ENGINE=podman` environment variable.
---
## Installation
### 1. Clone the Repository
```bash
git clone https://github.com/FuzzingLabs/fuzzforge-oss.git
cd fuzzforge-oss
```
### 2. Install Dependencies
```bash
uv sync
```
This installs all FuzzForge components in a virtual environment.
### 3. Verify Installation
```bash
uv run fuzzforge --help
```
---
## Building Modules
FuzzForge modules are containerized security tools. After cloning, you need to build them once:
### Build All Modules
```bash
# From the fuzzforge-oss directory
make build-modules
```
This builds all available modules:
- `fuzzforge-rust-analyzer` - Analyzes Rust code for fuzzable functions
- `fuzzforge-cargo-fuzzer` - Runs cargo-fuzz on Rust crates
- `fuzzforge-harness-validator` - Validates generated fuzzing harnesses
- `fuzzforge-crash-analyzer` - Analyzes crash inputs
### Build a Single Module
```bash
# Build a specific module
cd fuzzforge-modules/rust-analyzer
make build
```
### Verify Modules are Built
```bash
# List built module images
docker images | grep fuzzforge
```
You should see something like:
```
fuzzforge-rust-analyzer 0.1.0 abc123def456 2 minutes ago 850 MB
fuzzforge-cargo-fuzzer 0.1.0 789ghi012jkl 2 minutes ago 1.2 GB
...
```
---
## MCP Server Configuration
FuzzForge integrates with AI agents through the Model Context Protocol (MCP). Configure your preferred AI agent to use FuzzForge tools.
### GitHub Copilot
```bash
# That's it! Just run this command:
uv run fuzzforge mcp install copilot
```
The command auto-detects everything:
- **FuzzForge root** - Where FuzzForge is installed
- **Modules path** - Defaults to `fuzzforge-oss/fuzzforge-modules`
- **Docker socket** - Auto-detects `/var/run/docker.sock`
**Optional overrides** (usually not needed):
```bash
uv run fuzzforge mcp install copilot \
--modules /path/to/modules \
--engine podman # if using Podman instead of Docker
```
**After installation:**
1. Restart VS Code
2. Open GitHub Copilot Chat
3. FuzzForge tools are now available!
### Claude Code (CLI)
```bash
uv run fuzzforge mcp install claude-code
```
Installs to `~/.claude.json` so FuzzForge tools are available from any directory.
**After installation:**
1. Run `claude` from any directory
2. FuzzForge tools are now available!
### Claude Desktop
```bash
# Automatic installation
uv run fuzzforge mcp install claude-desktop
# Verify
uv run fuzzforge mcp status
```
**After installation:**
1. Restart Claude Desktop
2. FuzzForge tools are now available!
### Check MCP Status
```bash
uv run fuzzforge mcp status
```
Shows configuration status for all supported AI agents:
```
┏━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Agent ┃ Config Path ┃ Status ┃ FuzzForge Configured ┃
┡━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ GitHub Copilot │ ~/.config/Code/User/mcp.json │ ✓ Exists │ ✓ Yes │
│ Claude Desktop │ ~/.config/Claude/claude_desktop_config... │ Not found │ - │
│ Claude Code │ ~/.claude.json │ ✓ Exists │ ✓ Yes │
└──────────────────────┴───────────────────────────────────────────┴──────────────┴─────────────────────────┘
```
### Generate Config Without Installing
```bash
# Preview the configuration that would be installed
uv run fuzzforge mcp generate copilot
uv run fuzzforge mcp generate claude-desktop
uv run fuzzforge mcp generate claude-code
```
### Remove MCP Configuration
```bash
uv run fuzzforge mcp uninstall copilot
uv run fuzzforge mcp uninstall claude-desktop
uv run fuzzforge mcp uninstall claude-code
```
---
## Using FuzzForge with AI
Once MCP is configured, you interact with FuzzForge through natural language with your AI assistant.
### Example Conversations
**Discover available tools:**
```
You: "What FuzzForge modules are available?"
AI: Uses list_modules → "I found 4 modules: rust-analyzer, cargo-fuzzer,
harness-validator, and crash-analyzer..."
```
**Analyze code for fuzzing targets:**
```
You: "Analyze this Rust crate for functions I should fuzz"
AI: Uses execute_module("rust-analyzer") → "I found 3 good fuzzing candidates:
- parse_input() in src/parser.rs - handles untrusted input
- decode_message() in src/codec.rs - complex parsing logic
..."
```
**Generate and validate harnesses:**
```
You: "Generate a fuzzing harness for the parse_input function"
AI: Creates harness code, then uses execute_module("harness-validator")
→ "Here's a harness that compiles successfully..."
```
**Run continuous fuzzing:**
```
You: "Start fuzzing parse_input for 10 minutes"
AI: Uses start_continuous_module("cargo-fuzzer") → "Started fuzzing session abc123"
You: "How's the fuzzing going?"
AI: Uses get_continuous_status("abc123") → "Running for 5 minutes:
- 150,000 executions
- 2 crashes found
- 45% edge coverage"
You: "Stop and show me the crashes"
AI: Uses stop_continuous_module("abc123") → "Found 2 unique crashes..."
```
### Available MCP Tools
| Tool | Description |
|------|-------------|
| `list_modules` | List all available security modules |
| `execute_module` | Run a module once and get results |
| `start_continuous_module` | Start a long-running module (e.g., fuzzing) |
| `get_continuous_status` | Check status of a continuous session |
| `stop_continuous_module` | Stop a continuous session |
| `list_continuous_sessions` | List all active sessions |
| `get_execution_results` | Retrieve results from an execution |
| `execute_workflow` | Run a multi-step workflow |
---
## CLI Reference
> **Note:** The CLI is for advanced users. Most users should interact with FuzzForge through their AI assistant.
### MCP Commands
```bash
uv run fuzzforge mcp status # Check configuration status
uv run fuzzforge mcp install <agent> # Install MCP config
uv run fuzzforge mcp uninstall <agent> # Remove MCP config
uv run fuzzforge mcp generate <agent> # Preview config without installing
```
### Module Commands
```bash
uv run fuzzforge modules list # List available modules
uv run fuzzforge modules info <module> # Show module details
uv run fuzzforge modules run <module> --assets . # Run a module
```
### Project Commands
```bash
uv run fuzzforge project init # Initialize a project
uv run fuzzforge project info # Show project info
uv run fuzzforge project executions # List executions
uv run fuzzforge project results <id> # Get execution results
```
---
## Environment Variables
Configure FuzzForge using environment variables:
```bash
# Project paths
export FUZZFORGE_MODULES_PATH=/path/to/modules
export FUZZFORGE_STORAGE_PATH=/path/to/storage
# Container engine (Docker is default)
export FUZZFORGE_ENGINE__TYPE=docker # or podman
# Podman-specific settings (only needed if using Podman under Snap)
export FUZZFORGE_ENGINE__GRAPHROOT=~/.fuzzforge/containers/storage
export FUZZFORGE_ENGINE__RUNROOT=~/.fuzzforge/containers/run
```
---
## Troubleshooting
### Docker Not Running
```
Error: Cannot connect to Docker daemon
```
**Solution:**
```bash
# Linux: Start Docker service
sudo systemctl start docker
# macOS/Windows: Start Docker Desktop application
# Verify Docker is running
docker run --rm hello-world
```
### Permission Denied on Docker Socket
```
Error: Permission denied connecting to Docker socket
```
**Solution:**
```bash
# Add your user to the docker group
sudo usermod -aG docker $USER
# Log out and back in for changes to take effect
# Then verify:
docker run --rm hello-world
```
### No Modules Found
```
No modules found.
```
**Solution:**
1. Build the modules first: `make build-modules`
2. Check the modules path: `uv run fuzzforge modules list`
3. Verify images exist: `docker images | grep fuzzforge`
### MCP Server Not Starting
Check the MCP configuration:
```bash
uv run fuzzforge mcp status
```
Verify the configuration file path exists and contains valid JSON.
### Module Container Fails to Build
```bash
# Build module container manually to see errors
cd fuzzforge-modules/<module-name>
docker build -t <module-name> .
```
### Using Podman Instead of Docker
If you prefer Podman:
```bash
# Use --engine podman with CLI
uv run fuzzforge mcp install copilot --engine podman
# Or set environment variable
export FUZZFORGE_ENGINE=podman
```
### Check Logs
FuzzForge stores execution logs in the storage directory:
```bash
ls -la ~/.fuzzforge/storage/<project-id>/<execution-id>/
```
---
## Next Steps
- 📖 Read the [Module SDK Guide](fuzzforge-modules/fuzzforge-modules-sdk/README.md) to create custom modules
- 🎬 Check the demos in the [README](README.md)
- 💬 Join our [Discord](https://discord.gg/8XEX33UUwZ) for support
---
<p align="center">
<strong>Built with ❤️ by <a href="https://fuzzinglabs.com">FuzzingLabs</a></strong>
</p>

ai/.gitignore (deleted)
.env
__pycache__/
*.pyc
fuzzforge_sessions.db
agentops.log
*.log

(deleted file)
# FuzzForge AI Module
FuzzForge AI is the multi-agent layer that lets you operate the FuzzForge security platform through natural language. It orchestrates local tooling, registered Agent-to-Agent (A2A) peers, and the Temporal-powered backend while keeping long-running context in memory and project knowledge graphs.
## Quick Start
1. **Initialise a project**
```bash
cd /path/to/project
fuzzforge init
```
2. **Review environment settings**: copy `.fuzzforge/.env.template` to `.fuzzforge/.env`, then edit the values to match your provider. The template ships with commented defaults for OpenAI-style usage and placeholders for Cognee keys.
```env
LLM_PROVIDER=openai
LITELLM_MODEL=gpt-5-mini
OPENAI_API_KEY=sk-your-key
FUZZFORGE_MCP_URL=http://localhost:8010/mcp
SESSION_PERSISTENCE=sqlite
```
Optional flags you may want to enable early:
```env
MEMORY_SERVICE=inmemory
AGENTOPS_API_KEY=sk-your-agentops-key # Enable hosted tracing
LOG_LEVEL=INFO # CLI / server log level
```
3. **Populate the knowledge graph**
```bash
fuzzforge ingest --path . --recursive
# alias: fuzzforge rag ingest --path . --recursive
```
4. **Launch the agent shell**
```bash
fuzzforge ai agent
```
Keep the backend running (Temporal API at `FUZZFORGE_MCP_URL`) so workflow commands succeed.
## Everyday Workflow
- Run `fuzzforge ai agent` and start with `list available fuzzforge workflows` or `/memory status` to confirm everything is wired.
- Use natural prompts for automation (`run fuzzforge workflow …`, `search project knowledge for …`) and fall back to slash commands for precision (`/recall`, `/sendfile`).
- Keep `/memory datasets` handy to see which Cognee datasets are available after each ingest.
- Start the HTTP surface with `python -m fuzzforge_ai` when external agents need access to artifacts or graph queries. The CLI stays usable at the same time.
- Refresh the knowledge graph regularly: `fuzzforge ingest --path . --recursive --force` keeps responses aligned with recent code changes.
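A typical two-terminal setup using the commands above:
```bash
# Terminal 1: expose the agent over HTTP for external A2A peers
python -m fuzzforge_ai

# Terminal 2: keep the interactive shell and refresh the knowledge graph as needed
fuzzforge ingest --path . --recursive --force
fuzzforge ai agent
```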
## What the Agent Can Do
- **Route requests**: automatically selects the right local tool or remote agent using the A2A capability registry.
- **Run security workflows**: list, submit, and monitor FuzzForge workflows via MCP wrappers.
- **Manage artifacts**: create downloadable files for reports, code edits, and shared attachments.
- **Maintain context**: stores session history, semantic recall, and Cognee project graphs.
- **Serve over HTTP**: expose the same agent as an A2A server using `python -m fuzzforge_ai`.
## Essential Commands
Inside `fuzzforge ai agent` you can mix slash commands and free-form prompts:
```text
/list # Show registered A2A agents
/register http://<host>:10201 # Add a remote agent
/artifacts # List generated files
/sendfile SecurityAgent src/report.md "Please review"
You> route_to SecurityAnalyzer: scan ./backend for secrets
You> run fuzzforge workflow static_analysis_scan on ./test_projects/demo
You> search project knowledge for "temporal status" using INSIGHTS
```
Artifacts created during the conversation are served from `.fuzzforge/artifacts/` and exposed through the A2A HTTP API.
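As a rough sketch, an external client could download one of those artifacts over HTTP; the artifact ID, and the default `FUZZFORGE_PORT` of 10100, are assumptions you should adjust to your setup:
```bash
# Hypothetical artifact download over the A2A HTTP surface
curl -o report.md http://localhost:10100/artifacts/<artifact-id>
```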
## Memory & Knowledge
The module layers three storage systems:
- **Session persistence** (SQLite or in-memory) for chat transcripts.
- **Semantic recall** via the ADK memory service for fuzzy search.
- **Cognee graphs** for project-wide knowledge built from ingestion runs.
Re-run ingestion after major code changes to keep graph answers relevant. If Cognee variables are not set, graph-specific tools automatically respond with a polite "not configured" message.
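A minimal `.fuzzforge/.env` fragment covering the three layers might look like this (the Cognee lines are only needed if you want project graphs; variable names follow the LLM configuration guide):
```env
SESSION_PERSISTENCE=sqlite
MEMORY_SERVICE=inmemory
LLM_COGNEE_PROVIDER=openai
LLM_COGNEE_MODEL=gpt-5-mini
LLM_COGNEE_API_KEY=sk-your-key
```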
## Sample Prompts
Use these to validate the setup once the agent shell is running:
- `list available fuzzforge workflows`
- `run fuzzforge workflow static_analysis_scan on ./backend with target_branch=main`
- `show findings for that run once it finishes`
- `refresh the project knowledge graph for ./backend`
- `search project knowledge for "temporal readiness" using INSIGHTS`
- `/recall terraform secrets`
- `/memory status`
- `ROUTE_TO SecurityAnalyzer: audit infrastructure_vulnerable`
## Need More Detail?
Dive into the dedicated guides under `ai/docs/advanced/`:
- [Architecture](https://docs.fuzzforge.ai/docs/ai/intro): High-level architecture with diagrams and component breakdowns.
- [Ingestion](https://docs.fuzzforge.ai/docs/ai/ingestion.md): Command options, Cognee persistence, and prompt examples.
- [Configuration](https://docs.fuzzforge.ai/docs/ai/configuration.md): LLM provider matrix, local model setup, and tracing options.
- [Prompts](https://docs.fuzzforge.ai/docs/ai/prompts.md): Slash commands, workflow prompts, and routing tips.
- [A2A Services](https://docs.fuzzforge.ai/docs/ai/a2a-services.md): HTTP endpoints, agent card, and collaboration flow.
- [Memory Persistence](https://docs.fuzzforge.ai/docs/ai/architecture.md#memory--persistence): Deep dive on memory storage, datasets, and how `/memory status` inspects them.
## Development Notes
- Entry point for the CLI: `ai/src/fuzzforge_ai/cli.py`
- A2A HTTP server: `ai/src/fuzzforge_ai/a2a_server.py`
- Tool routing & workflow glue: `ai/src/fuzzforge_ai/agent_executor.py`
- Ingestion helpers: `ai/src/fuzzforge_ai/ingest_utils.py`
Install the module in editable mode (`pip install -e ai`) while iterating so CLI changes are picked up immediately.
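For example:
```bash
# From the repository root: editable install, then a quick smoke test of the CLI
pip install -e ai
fuzzforge --help   # any CLI invocation works; it just confirms the editable install is picked up
```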


@@ -1,9 +0,0 @@
__pycache__
*.pyc
*.pyo
*.pytest_cache
*.coverage
coverage.xml
build/
dist/
.env


@@ -1,10 +0,0 @@
# Default LiteLLM configuration
LITELLM_MODEL=gemini/gemini-2.0-flash-001
# LITELLM_PROVIDER=gemini
# API keys (uncomment and fill as needed)
# GOOGLE_API_KEY=
# OPENAI_API_KEY=
# ANTHROPIC_API_KEY=
# OPENROUTER_API_KEY=
# MISTRAL_API_KEY=


@@ -1,82 +0,0 @@
# Architecture Overview
This package is a minimal ADK agent that keeps runtime behaviour and A2A access in separate layers so it can double as boilerplate.
## Directory Layout
```text
agent_with_adk_format/
├── __init__.py # Exposes root_agent for ADK runners
├── a2a_hot_swap.py # JSON-RPC helper for model/prompt swaps
├── README.md, QUICKSTART.md # Operational docs
├── ARCHITECTURE.md # This document
├── .env # Active environment (gitignored)
├── .env.example # Environment template
└── litellm_agent/
├── agent.py # Root Agent definition (LiteLLM shell)
├── callbacks.py # before_agent / before_model hooks
├── config.py # Defaults, state keys, control prefix
├── control.py # HOTSWAP command parsing/serialization
├── state.py # Session state wrapper + LiteLLM factory
├── tools.py # set_model / set_prompt / get_config
├── prompts.py # Base instruction text
└── agent.json # A2A agent card (served under /.well-known)
```
```mermaid
flowchart TD
subgraph ADK Runner
A["adk api_server / adk web / adk run"]
B["agent_with_adk_format/__init__.py"]
C["litellm_agent/agent.py (root_agent)"]
D["HotSwapState (state.py)"]
E["LiteLlm(model, provider)"]
end
subgraph Session State
S1[app:litellm_agent/model]
S2[app:litellm_agent/provider]
S3[app:litellm_agent/prompt]
end
A --> B --> C
C --> D
D -->|instantiate| E
D --> S1
D --> S2
D --> S3
E --> C
```
## Runtime Flow (ADK Runners)
1. **Startup**: `adk api_server`/`adk web` imports `agent_with_adk_format`, which exposes `root_agent` from `litellm_agent/agent.py`. `.env` at package root is loaded before the runner constructs the agent.
2. **Session State**: `callbacks.py` and `tools.py` read/write through `state.py`. We store `model`, `provider`, and `prompt` keys (prefixed `app:litellm_agent/…`) which persist across turns.
3. **Instruction Generation**: `provide_instruction` composes the base persona from `prompts.py` plus any stored prompt override. The current model/provider is appended for observability.
4. **Model Hot-Swap**: When a control message is detected (`[HOTSWAP:MODEL:…]`) the callback parses it via `control.py`, updates the session state, and calls `state.apply_state_to_agent` to instantiate a new `LiteLlm(model=…, custom_llm_provider=…)`. ADK runners reuse that instance for subsequent turns.
5. **Prompt Hot-Swap**: Similar path (`set_prompt` tool/callback) updates state; the dynamic instruction immediately reflects the change.
6. **Config Reporting**: Both the callback and the tool surface the summary string produced by `HotSwapState.describe()`, ensuring CLI, A2A, and UI all show the same data.
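To make step 4 concrete, here is a small sketch of how a control message maps onto a model spec using this package's own helpers (run from the package directory so `litellm_agent` is importable):
```python
# Illustrative walkthrough of the hot-swap parsing path; no ADK runner needed.
from litellm_agent.control import parse_control_message, parse_model_spec

command, payload = parse_control_message("[HOTSWAP:MODEL:openai/gpt-4o]")
# command is HotSwapCommand.MODEL, payload is "openai/gpt-4o"

spec = parse_model_spec(payload)
# spec.provider == "openai", spec.model == "gpt-4o"
# callbacks.py then persists these values in session state and rebuilds the
# LiteLlm instance via apply_state_to_agent(...)
```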
## A2A Integration
- `agent.json` defines the agent card and enables ADK to register `/a2a/litellm_agent` routes when launched with `--a2a`.
- `a2a_hot_swap.py` uses `a2a.client.A2AClient` to programmatically send control messages and user text via JSON-RPC. It supports streaming when available and falls back to blocking requests otherwise.
```mermaid
sequenceDiagram
participant Client as a2a_hot_swap.py
participant Server as ADK API Server
participant Agent as root_agent
Client->>Server: POST /a2a/litellm_agent (message/stream or message/send)
Server->>Agent: Invoke callbacks/tools
Agent->>Server: Status / artifacts / final message
Server->>Client: Streamed Task events
Client->>Client: Extract text & print summary
```
## Extending the Boilerplate
- Add tools under `litellm_agent/tools.py` and register them in `agent.py` to expose new capabilities.
- Use `state.py` to track additional configuration or session data (store under your own prefix to avoid collisions).
- When layering business logic, prefer expanding callbacks or adding higher-level agents while leaving the hot-swap mechanism untouched for reuse.
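For instance, an extra tool added under `litellm_agent/tools.py` and registered in `agent.py` could mirror the shape of the existing ones; the tool name and state key below are hypothetical:
```python
# Hypothetical extra tool, following the same pattern as set_model/set_prompt.
from google.adk.tools import FunctionTool, ToolContext


async def remember_note(note: str, *, tool_context: ToolContext) -> str:
    """Store a short note in session state under a dedicated prefix."""
    tool_context.state["app:my_extension/note"] = note  # own prefix avoids collisions
    return f"Noted: {note}"


# In agent.py, append it to the tools passed to the root Agent:
# tools=HOTSWAP_TOOLS + [FunctionTool(remember_note)]
```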


@@ -1,71 +0,0 @@
# Docker & Kubernetes Deployment
## Local Docker
Build from the repository root:
```bash
docker build -t litellm-hot-swap:latest agent_with_adk_format
```
Run the container (port 8000, inject provider keys via env file or flags):
```bash
docker run \
-p 8000:8000 \
--env-file agent_with_adk_format/.env \
litellm-hot-swap:latest
```
The container serves Uvicorn on `http://localhost:8000`. Update `.env` (or pass `-e KEY=value`) before launching if you plan to hot-swap providers.
## Kubernetes (example manifest)
Use the same image, optionally pushed to a registry (`docker tag` + `docker push`). A simple Deployment/Service pair:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: litellm-hot-swap
spec:
replicas: 1
selector:
matchLabels:
app: litellm-hot-swap
template:
metadata:
labels:
app: litellm-hot-swap
spec:
containers:
- name: server
image: <REGISTRY_URI>/litellm-hot-swap:latest
ports:
- containerPort: 8000
env:
- name: PORT
value: "8000"
- name: LITELLM_MODEL
value: gemini/gemini-2.0-flash-001
# Add provider keys as needed
# - name: OPENAI_API_KEY
# valueFrom:
# secretKeyRef:
# name: litellm-secrets
# key: OPENAI_API_KEY
---
apiVersion: v1
kind: Service
metadata:
name: litellm-hot-swap
spec:
type: LoadBalancer
selector:
app: litellm-hot-swap
ports:
- port: 80
targetPort: 8000
```
Apply with `kubectl apply -f deployment.yaml`. Provide secrets via `env` or Kubernetes Secrets.
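If you use Kubernetes Secrets, a minimal sketch matching the commented `litellm-secrets` reference in the manifest above:
```bash
# Create the secret referenced (commented out) in the Deployment,
# then uncomment the secretKeyRef block and re-apply.
kubectl create secret generic litellm-secrets \
  --from-literal=OPENAI_API_KEY=sk-your-key
kubectl apply -f deployment.yaml
```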


@@ -1,19 +0,0 @@
# syntax=docker/dockerfile:1
FROM python:3.11-slim AS base
ENV PYTHONUNBUFFERED=1 \
PYTHONDONTWRITEBYTECODE=1 \
PIP_NO_CACHE_DIR=1 \
PORT=8000
WORKDIR /app
COPY requirements.txt ./requirements.txt
RUN pip install --upgrade pip && pip install -r requirements.txt
COPY . /app/agent_with_adk_format
WORKDIR /app/agent_with_adk_format
ENV PYTHONPATH=/app
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]


@@ -1,61 +0,0 @@
# Quick Start Guide
## Launch the Agent
From the repository root you can expose the agent through any ADK entry point:
```bash
# A2A / HTTP server
adk api_server --a2a --port 8000 agent_with_adk_format
# Browser UI
adk web agent_with_adk_format
# Interactive terminal
adk run agent_with_adk_format
```
The A2A server exposes the JSON-RPC endpoint at `http://localhost:8000/a2a/litellm_agent`.
## Hot-Swap from the Command Line
Use the bundled helper to change model and prompt via A2A without touching the UI:
```bash
python agent_with_adk_format/a2a_hot_swap.py \
--model openai gpt-4o \
--prompt "You are concise." \
--config \
--context demo-session
```
The script sends the control messages for you and prints the server's responses. The `--context` flag lets you reuse the same conversation across multiple invocations.
### Follow-up Messages
Once the swaps are applied you can send a user message on the same session:
```bash
python agent_with_adk_format/a2a_hot_swap.py \
--context demo-session \
--message "Summarise the current configuration in five words."
```
### Clearing the Prompt
```bash
python agent_with_adk_format/a2a_hot_swap.py \
--context demo-session \
--prompt "" \
--config
```
## Control Messages (for reference)
Behind the scenes the helper sends plain text messages understood by the callbacks:
- `[HOTSWAP:MODEL:provider/model]`
- `[HOTSWAP:PROMPT:text]`
- `[HOTSWAP:GET_CONFIG]`
You can craft the same messages from any A2A client if you prefer.
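For reference, the exact strings can also be produced with the package's own helper (run from the package directory, as `a2a_hot_swap.py` does):
```python
# Building the raw control strings programmatically (illustrative)
from litellm_agent.control import HotSwapCommand, build_control_message

build_control_message(HotSwapCommand.MODEL, "openai/gpt-4o")
# -> "[HOTSWAP:MODEL:openai/gpt-4o]"
build_control_message(HotSwapCommand.PROMPT, "You are concise.")
# -> "[HOTSWAP:PROMPT:You are concise.]"
build_control_message(HotSwapCommand.GET_CONFIG)
# -> "[HOTSWAP:GET_CONFIG]"
```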


@@ -1,349 +0,0 @@
# LiteLLM Agent with Hot-Swap Support
A flexible AI agent powered by LiteLLM that supports runtime hot-swapping of models and system prompts. Compatible with ADK and A2A protocols.
## Features
- 🔄 **Hot-Swap Models**: Change LLM models on-the-fly without restarting
- 📝 **Dynamic Prompts**: Update system prompts during conversation
- 🌐 **Multi-Provider Support**: Works with OpenAI, Anthropic, Google, OpenRouter, and more
- 🔌 **A2A Compatible**: Can be served as an A2A agent
- 🛠️ **ADK Integration**: Run with `adk web`, `adk run`, or `adk api_server`
## Architecture
```
task_agent/
├── __init__.py # Exposes root_agent for ADK
├── a2a_hot_swap.py # JSON-RPC helper for hot-swapping
├── README.md # This guide
├── QUICKSTART.md # Quick-start walkthrough
├── .env # Active environment (gitignored)
├── .env.example # Environment template
└── litellm_agent/
├── __init__.py
├── agent.py # Main agent implementation
├── agent.json # A2A agent card
├── callbacks.py # ADK callbacks
├── config.py # Defaults and state keys
├── control.py # HOTSWAP message helpers
├── prompts.py # Base instruction
├── state.py # Session state utilities
└── tools.py # set_model / set_prompt / get_config
```
## Setup
### 1. Environment Configuration
Copying the example file is optional—the repository already ships with a root-level `.env` seeded with defaults. Adjust the values at the package root:
```bash
cd task_agent
# Optionally refresh from the template
# cp .env.example .env
```
Edit `.env` (or `.env.example`) and add your API keys. The agent must be restarted after changes so the values are picked up:
```bash
# Set default model
LITELLM_MODEL=gemini/gemini-2.0-flash-001
# Add API keys for providers you want to use
GOOGLE_API_KEY=your_google_api_key
OPENAI_API_KEY=your_openai_api_key
ANTHROPIC_API_KEY=your_anthropic_api_key
OPENROUTER_API_KEY=your_openrouter_api_key
```
### 2. Install Dependencies
```bash
pip install "google-adk" "a2a-sdk[all]" "python-dotenv" "litellm"
```
### 3. Run in Docker
Build the container (this image can be pushed to any registry or run locally):
```bash
docker build -t litellm-hot-swap:latest task_agent
```
Provide environment configuration at runtime (either pass variables individually or mount a file):
```bash
docker run \
-p 8000:8000 \
--env-file task_agent/.env \
litellm-hot-swap:latest
```
The container starts Uvicorn with the ADK app (`main.py`) listening on port 8000.
## Running the Agent
### Option 1: ADK Web UI (Recommended for Testing)
Start the web interface:
```bash
adk web task_agent
```
> **Tip:** before launching `adk web`/`adk run`/`adk api_server`, ensure the root-level `.env` contains valid API keys for any provider you plan to hot-swap to (e.g. set `OPENAI_API_KEY` before switching to `openai/gpt-4o`).
Open http://localhost:8000 in your browser and interact with the agent.
### Option 2: ADK Terminal
Run in terminal mode:
```bash
adk run task_agent
```
### Option 3: A2A API Server
Start as an A2A-compatible API server:
```bash
adk api_server --a2a --port 8000 task_agent
```
The agent will be available at: `http://localhost:8000/a2a/litellm_agent`
### Command-line helper
Use the bundled script to drive hot-swaps and user messages over A2A:
```bash
python task_agent/a2a_hot_swap.py \
--url http://127.0.0.1:8000/a2a/litellm_agent \
--model openai gpt-4o \
--prompt "You are concise." \
--config \
--context demo-session
```
To send a follow-up prompt in the same session (with a larger timeout for long answers):
```bash
python task_agent/a2a_hot_swap.py \
--url http://127.0.0.1:8000/a2a/litellm_agent \
--model openai gpt-4o \
--prompt "You are concise." \
--message "Give me a fuzzing harness." \
--context demo-session \
--timeout 120
```
> Ensure the corresponding provider keys are present in `.env` (or passed via environment variables) before issuing model swaps.
## Hot-Swap Tools
The agent provides three special tools:
### 1. `set_model` - Change the LLM Model
Change the model during conversation:
```
User: Use the set_model tool to change to gpt-4o with openai provider
Agent: ✅ Model configured to: openai/gpt-4o
This change is active now!
```
**Parameters:**
- `model`: Model name (e.g., "gpt-4o", "claude-3-sonnet-20240229")
- `custom_llm_provider`: Optional provider prefix (e.g., "openai", "anthropic", "openrouter")
**Examples:**
- OpenAI: `set_model(model="gpt-4o", custom_llm_provider="openai")`
- Anthropic: `set_model(model="claude-3-sonnet-20240229", custom_llm_provider="anthropic")`
- Google: `set_model(model="gemini-2.0-flash-001", custom_llm_provider="gemini")`
### 2. `set_prompt` - Change System Prompt
Update the system instructions:
```
User: Use set_prompt to change my behavior to "You are a helpful coding assistant"
Agent: ✅ System prompt updated:
You are a helpful coding assistant
This change is active now!
```
### 3. `get_config` - View Configuration
Check current model and prompt:
```
User: Use get_config to show me your configuration
Agent: 📊 Current Configuration:
━━━━━━━━━━━━━━━━━━━━━━
Model: openai/gpt-4o
System Prompt: You are a helpful coding assistant
━━━━━━━━━━━━━━━━━━━━━━
```
## Testing
### Basic A2A Client Test
```bash
python agent/test_a2a_client.py
```
### Hot-Swap Functionality Test
```bash
python agent/test_hotswap.py
```
This will:
1. Check initial configuration
2. Query with default model
3. Hot-swap to GPT-4o
4. Verify model changed
5. Change system prompt
6. Test new prompt behavior
7. Hot-swap to Claude
8. Verify final configuration
### Command-Line Hot-Swap Helper
You can trigger model and prompt changes directly against the A2A endpoint without the interactive CLI:
```bash
# Start the agent first (in another terminal):
adk api_server --a2a --port 8000 task_agent
# Apply swaps via pure A2A calls
python task_agent/a2a_hot_swap.py --model openai gpt-4o --prompt "You are concise." --config
python task_agent/a2a_hot_swap.py --model anthropic claude-3-sonnet-20240229 --context shared-session --config
python task_agent/a2a_hot_swap.py --prompt "" --context shared-session --config # Clear the prompt and show current state
```
`--model` accepts either a single `provider/model` string or a separate `<provider> <model>` pair. Add `--context` if you want to reuse the same conversation across invocations. Use `--config` to dump the agent's configuration after the changes are applied.
## Supported Models
### OpenAI
- `openai/gpt-4o`
- `openai/gpt-4-turbo`
- `openai/gpt-3.5-turbo`
### Anthropic
- `anthropic/claude-3-opus-20240229`
- `anthropic/claude-3-sonnet-20240229`
- `anthropic/claude-3-haiku-20240307`
### Google
- `gemini/gemini-2.0-flash-001`
- `gemini/gemini-2.5-pro-exp-03-25`
- `vertex_ai/gemini-2.0-flash-001`
### OpenRouter
- `openrouter/anthropic/claude-3-opus`
- `openrouter/openai/gpt-4`
- Any model from the OpenRouter catalog
## How It Works
### Session State
- Model and prompt settings are stored in session state
- Each session maintains its own configuration
- Settings persist across messages in the same session
### Hot-Swap Mechanism
1. Tools update session state with new model/prompt
2. `before_agent_callback` checks for changes
3. If model changed, directly updates: `agent.model = LiteLlm(model=new_model)`
4. Dynamic instruction function reads custom prompt from session state
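Under the hood the settings live in plain session-state keys; here is a rough sketch of the round-trip using the package's state helper (the plain dict below stands in for ADK's real session state):
```python
# Illustrative: how hot-swap settings persist in session state (see state.py)
from litellm_agent.state import HotSwapState

session_state = {}  # stand-in for the ADK session state mapping

state = HotSwapState.from_mapping(session_state)   # defaults on the first turn
state.model, state.provider = "gpt-4o", "openai"
state.persist(session_state)                       # writes the app:litellm_agent/* keys

# A later turn reads the same keys back:
print(HotSwapState.from_mapping(session_state).display_model)  # "openai/gpt-4o"
```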
### A2A Compatibility
- Agent card at `agent.json` defines A2A metadata
- Served at `/a2a/litellm_agent` endpoint
- Compatible with A2A client protocol
## Example Usage
### Interactive Session
```python
from a2a.client import A2AClient
import asyncio
async def chat():
client = A2AClient("http://localhost:8000/a2a/litellm_agent")
context_id = "my-session-123"
# Start with default model
async for msg in client.send_message("Hello!", context_id=context_id):
print(msg)
# Switch to GPT-4
async for msg in client.send_message(
"Use set_model with model gpt-4o and provider openai",
context_id=context_id
):
print(msg)
# Continue with new model
async for msg in client.send_message(
"Help me write a function",
context_id=context_id
):
print(msg)
asyncio.run(chat())
```
## Troubleshooting
### Model Not Found
- Ensure API key for the provider is set in `.env`
- Check model name is correct for the provider
- Verify LiteLLM supports the model (https://docs.litellm.ai/docs/providers)
### Connection Refused
- Ensure the agent is running (`adk api_server --a2a task_agent`)
- Check the port matches (default: 8000)
- Verify no firewall blocking localhost
### Hot-Swap Not Working
- Check that you're using the same `context_id` across messages
- Ensure the tool is being called (not just asked to switch)
- Look for `🔄 Hot-swapped model to:` in server logs
## Development
### Adding New Tools
```python
async def my_tool(tool_ctx: ToolContext, param: str) -> str:
"""Your tool description."""
# Access session state
tool_ctx.state["my_key"] = "my_value"
return "Tool result"
# Add to agent
root_agent = LlmAgent(
# ...
tools=[set_model, set_prompt, get_config, my_tool],
)
```
### Modifying Callbacks
```python
async def after_model_callback(
callback_context: CallbackContext,
llm_response: LlmResponse
) -> Optional[LlmResponse]:
"""Modify response after model generates it."""
# Your logic here
return llm_response
```
## License
Apache 2.0


@@ -1,5 +0,0 @@
"""Package entry point for the ADK-formatted hot swap agent."""
from .litellm_agent.agent import root_agent
__all__ = ["root_agent"]


@@ -1,224 +0,0 @@
#!/usr/bin/env python3
"""Minimal A2A client utility for hot-swapping LiteLLM model/prompt."""
from __future__ import annotations
import argparse
import asyncio
from typing import Optional
from uuid import uuid4
import httpx
from a2a.client import A2AClient
from a2a.client.errors import A2AClientHTTPError
from a2a.types import (
JSONRPCErrorResponse,
Message,
MessageSendConfiguration,
MessageSendParams,
Part,
Role,
SendMessageRequest,
SendStreamingMessageRequest,
Task,
TaskArtifactUpdateEvent,
TaskStatusUpdateEvent,
TextPart,
)
from litellm_agent.control import (
HotSwapCommand,
build_control_message,
parse_model_spec,
serialize_model_spec,
)
DEFAULT_URL = "http://localhost:8000/a2a/litellm_agent"
async def _collect_text(client: A2AClient, message: str, context_id: str) -> str:
"""Send a message and collect streamed agent text into a single string."""
params = MessageSendParams(
configuration=MessageSendConfiguration(blocking=True),
message=Message(
context_id=context_id,
message_id=str(uuid4()),
role=Role.user,
parts=[Part(root=TextPart(text=message))],
),
)
stream_request = SendStreamingMessageRequest(id=str(uuid4()), params=params)
buffer: list[str] = []
try:
async for response in client.send_message_streaming(stream_request):
root = response.root
if isinstance(root, JSONRPCErrorResponse):
raise RuntimeError(f"A2A error: {root.error}")
payload = root.result
buffer.extend(_extract_text(payload))
except A2AClientHTTPError as exc:
if "text/event-stream" not in str(exc):
raise
send_request = SendMessageRequest(id=str(uuid4()), params=params)
response = await client.send_message(send_request)
root = response.root
if isinstance(root, JSONRPCErrorResponse):
raise RuntimeError(f"A2A error: {root.error}")
payload = root.result
buffer.extend(_extract_text(payload))
if buffer:
buffer = list(dict.fromkeys(buffer))
return "\n".join(buffer).strip()
def _extract_text(
result: Message | Task | TaskStatusUpdateEvent | TaskArtifactUpdateEvent,
) -> list[str]:
texts: list[str] = []
if isinstance(result, Message):
if result.role is Role.agent:
for part in result.parts:
root_part = part.root
text = getattr(root_part, "text", None)
if text:
texts.append(text)
elif isinstance(result, Task) and result.history:
for msg in result.history:
if msg.role is Role.agent:
for part in msg.parts:
root_part = part.root
text = getattr(root_part, "text", None)
if text:
texts.append(text)
elif isinstance(result, TaskStatusUpdateEvent):
message = result.status.message
if message:
texts.extend(_extract_text(message))
elif isinstance(result, TaskArtifactUpdateEvent):
artifact = result.artifact
if artifact and artifact.parts:
for part in artifact.parts:
root_part = part.root
text = getattr(root_part, "text", None)
if text:
texts.append(text)
return texts
def _split_model_args(model_args: Optional[list[str]]) -> tuple[Optional[str], Optional[str]]:
if not model_args:
return None, None
if len(model_args) == 1:
return model_args[0], None
provider = model_args[0]
model = " ".join(model_args[1:])
return model, provider
async def hot_swap(
url: str,
*,
model_args: Optional[list[str]],
provider: Optional[str],
prompt: Optional[str],
message: Optional[str],
show_config: bool,
context_id: Optional[str],
timeout: float,
) -> None:
"""Execute the requested hot-swap operations against the A2A endpoint."""
timeout_config = httpx.Timeout(timeout)
async with httpx.AsyncClient(timeout=timeout_config) as http_client:
client = A2AClient(url=url, httpx_client=http_client)
session_id = context_id or str(uuid4())
model, derived_provider = _split_model_args(model_args)
if model:
spec = parse_model_spec(model, provider=provider or derived_provider)
payload = serialize_model_spec(spec)
control_msg = build_control_message(HotSwapCommand.MODEL, payload)
result = await _collect_text(client, control_msg, session_id)
print(f"Model response: {result or '(no response)'}")
if prompt is not None:
control_msg = build_control_message(HotSwapCommand.PROMPT, prompt)
result = await _collect_text(client, control_msg, session_id)
print(f"Prompt response: {result or '(no response)'}")
if show_config:
control_msg = build_control_message(HotSwapCommand.GET_CONFIG)
result = await _collect_text(client, control_msg, session_id)
print(f"Config:\n{result or '(no response)'}")
if message:
result = await _collect_text(client, message, session_id)
print(f"Message response: {result or '(no response)'}")
print(f"Context ID: {session_id}")
def parse_args() -> argparse.Namespace:
parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument(
"--url",
default=DEFAULT_URL,
help=f"A2A endpoint for the agent (default: {DEFAULT_URL})",
)
parser.add_argument(
"--model",
nargs="+",
help="LiteLLM model spec: either 'provider/model' or '<provider> <model>'.",
)
parser.add_argument(
"--provider",
help="Optional LiteLLM provider when --model lacks a prefix.")
parser.add_argument(
"--prompt",
help="Set the system prompt (omit to leave unchanged; empty string clears it).",
)
parser.add_argument(
"--message",
help="Send an additional user message after the swaps complete.")
parser.add_argument(
"--config",
action="store_true",
help="Print the agent configuration after performing swaps.")
parser.add_argument(
"--context",
help="Optional context/session identifier to reuse across calls.")
parser.add_argument(
"--timeout",
type=float,
default=60.0,
help="Request timeout (seconds) for A2A calls (default: 60).",
)
return parser.parse_args()
def main() -> None:
args = parse_args()
asyncio.run(
hot_swap(
args.url,
model_args=args.model,
provider=args.provider,
prompt=args.prompt,
message=args.message,
show_config=args.config,
context_id=args.context,
timeout=args.timeout,
)
)
if __name__ == "__main__":
main()


@@ -1,24 +0,0 @@
version: '3.8'
services:
task-agent:
build:
context: .
dockerfile: Dockerfile
container_name: fuzzforge-task-agent
ports:
- "10900:8000"
env_file:
- ../../../volumes/env/.env
environment:
- PORT=8000
- PYTHONUNBUFFERED=1
volumes:
# Mount volumes/env for runtime config access
- ../../../volumes/env:/app/config:ro
restart: unless-stopped
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
interval: 30s
timeout: 10s
retries: 3


@@ -1,55 +0,0 @@
"""LiteLLM hot-swap agent package exports."""
from .agent import root_agent
from .callbacks import (
before_agent_callback,
before_model_callback,
provide_instruction,
)
from .config import (
AGENT_DESCRIPTION,
AGENT_NAME,
CONTROL_PREFIX,
DEFAULT_MODEL,
DEFAULT_PROVIDER,
STATE_MODEL_KEY,
STATE_PROVIDER_KEY,
STATE_PROMPT_KEY,
)
from .control import (
HotSwapCommand,
ModelSpec,
build_control_message,
parse_control_message,
parse_model_spec,
serialize_model_spec,
)
from .state import HotSwapState, apply_state_to_agent
from .tools import HOTSWAP_TOOLS, get_config, set_model, set_prompt
__all__ = [
"root_agent",
"before_agent_callback",
"before_model_callback",
"provide_instruction",
"AGENT_DESCRIPTION",
"AGENT_NAME",
"CONTROL_PREFIX",
"DEFAULT_MODEL",
"DEFAULT_PROVIDER",
"STATE_MODEL_KEY",
"STATE_PROVIDER_KEY",
"STATE_PROMPT_KEY",
"HotSwapCommand",
"ModelSpec",
"HotSwapState",
"apply_state_to_agent",
"build_control_message",
"parse_control_message",
"parse_model_spec",
"serialize_model_spec",
"HOTSWAP_TOOLS",
"get_config",
"set_model",
"set_prompt",
]


@@ -1,24 +0,0 @@
{
"name": "litellm_agent",
"description": "A flexible AI agent powered by LiteLLM with hot-swappable models from OpenRouter and other providers",
"url": "http://localhost:8000",
"version": "1.0.0",
"defaultInputModes": ["text/plain"],
"defaultOutputModes": ["text/plain"],
"capabilities": {
"streaming": true
},
"skills": [
{
"id": "litellm-general-purpose",
"name": "General Purpose AI Assistant",
"description": "A flexible AI assistant that can help with various tasks using any LiteLLM-supported model. Supports runtime model and prompt hot-swapping.",
"tags": ["ai", "assistant", "litellm", "flexible", "hot-swap"],
"examples": [
"Help me write a Python function",
"Explain quantum computing",
"Switch to Claude model and help me code"
]
}
]
}


@@ -1,29 +0,0 @@
"""Root agent definition for the LiteLLM hot-swap shell."""
from __future__ import annotations
from google.adk.agents import Agent
from .callbacks import (
before_agent_callback,
before_model_callback,
provide_instruction,
)
from .config import AGENT_DESCRIPTION, AGENT_NAME, DEFAULT_MODEL, DEFAULT_PROVIDER
from .state import HotSwapState
from .tools import HOTSWAP_TOOLS
_initial_state = HotSwapState(model=DEFAULT_MODEL, provider=DEFAULT_PROVIDER)
root_agent = Agent(
name=AGENT_NAME,
model=_initial_state.instantiate_llm(),
description=AGENT_DESCRIPTION,
instruction=provide_instruction,
tools=HOTSWAP_TOOLS,
before_agent_callback=before_agent_callback,
before_model_callback=before_model_callback,
)
__all__ = ["root_agent"]


@@ -1,137 +0,0 @@
"""Callbacks and instruction providers for the LiteLLM hot-swap agent."""
from __future__ import annotations
import logging
from typing import Optional
from google.adk.agents.callback_context import CallbackContext
from google.adk.agents.readonly_context import ReadonlyContext
from google.adk.models.llm_request import LlmRequest
from google.genai import types
from .config import CONTROL_PREFIX, DEFAULT_MODEL
from .control import HotSwapCommand, parse_control_message, parse_model_spec
from .prompts import BASE_INSTRUCTION
from .state import HotSwapState, apply_state_to_agent
_LOGGER = logging.getLogger(__name__)
def provide_instruction(ctx: ReadonlyContext | None = None) -> str:
"""Compose the system instruction using the stored state."""
state_mapping = getattr(ctx, "state", None)
state = HotSwapState.from_mapping(state_mapping)
prompt = state.prompt or BASE_INSTRUCTION
return f"{prompt}\n\nActive model: {state.display_model}"
def _ensure_state(callback_context: CallbackContext) -> HotSwapState:
state = HotSwapState.from_mapping(callback_context.state)
state.persist(callback_context.state)
return state
def _session_id(callback_context: CallbackContext) -> str:
session = getattr(callback_context, "session", None)
if session is None:
session = getattr(callback_context._invocation_context, "session", None)
return getattr(session, "id", "unknown-session")
async def before_model_callback(
callback_context: CallbackContext,
llm_request: LlmRequest,
) -> Optional[types.Content]:
"""Ensure outgoing requests use the active model from session state."""
state = _ensure_state(callback_context)
try:
apply_state_to_agent(callback_context._invocation_context, state)
except Exception: # pragma: no cover - defensive logging
_LOGGER.exception(
"Failed to apply LiteLLM model '%s' (provider=%s) for session %s",
state.model,
state.provider,
callback_context.session.id,
)
llm_request.model = state.model or DEFAULT_MODEL
return None
async def before_agent_callback(
callback_context: CallbackContext,
) -> Optional[types.Content]:
"""Intercept hot-swap control messages and update session state."""
user_content = callback_context.user_content
if not user_content or not user_content.parts:
return None
first_part = user_content.parts[0]
message_text = (first_part.text or "").strip()
if not message_text.startswith(CONTROL_PREFIX):
return None
parsed = parse_control_message(message_text)
if not parsed:
return None
command, payload = parsed
state = _ensure_state(callback_context)
if command is HotSwapCommand.MODEL:
if not payload:
return _render("❌ Missing model specification for hot-swap.")
try:
spec = parse_model_spec(payload)
except ValueError as exc:
return _render(f"❌ Invalid model specification: {exc}")
state.model = spec.model
state.provider = spec.provider
state.persist(callback_context.state)
try:
apply_state_to_agent(callback_context._invocation_context, state)
except Exception: # pragma: no cover - defensive logging
_LOGGER.exception(
"Failed to apply LiteLLM model '%s' (provider=%s) for session %s",
state.model,
state.provider,
_session_id(callback_context),
)
_LOGGER.info(
"Hot-swapped model to %s (provider=%s, session=%s)",
state.model,
state.provider,
_session_id(callback_context),
)
label = state.display_model
return _render(f"✅ Model switched to: {label}")
if command is HotSwapCommand.PROMPT:
prompt_value = (payload or "").strip()
state.prompt = prompt_value or None
state.persist(callback_context.state)
if state.prompt:
_LOGGER.info(
"Updated prompt for session %s", _session_id(callback_context)
)
return _render(
"✅ System prompt updated. This change takes effect immediately."
)
return _render("✅ System prompt cleared. Reverting to default instruction.")
if command is HotSwapCommand.GET_CONFIG:
return _render(state.describe())
expected = ", ".join(HotSwapCommand.choices())
return _render(
"⚠️ Unsupported hot-swap command. Available verbs: "
f"{expected}."
)
def _render(message: str) -> types.ModelContent:
return types.ModelContent(parts=[types.Part(text=message)])


@@ -1,20 +0,0 @@
"""Configuration constants for the LiteLLM hot-swap agent."""
from __future__ import annotations
import os
AGENT_NAME = "litellm_agent"
AGENT_DESCRIPTION = (
"A LiteLLM-backed shell that exposes hot-swappable model and prompt controls."
)
DEFAULT_MODEL = os.getenv("LITELLM_MODEL", "gemini-2.0-flash-001")
DEFAULT_PROVIDER = os.getenv("LITELLM_PROVIDER")
STATE_PREFIX = "app:litellm_agent/"
STATE_MODEL_KEY = f"{STATE_PREFIX}model"
STATE_PROVIDER_KEY = f"{STATE_PREFIX}provider"
STATE_PROMPT_KEY = f"{STATE_PREFIX}prompt"
CONTROL_PREFIX = "[HOTSWAP"


@@ -1,99 +0,0 @@
"""Control message helpers for hot-swapping model and prompt."""
from __future__ import annotations
import re
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple
from .config import DEFAULT_PROVIDER
class HotSwapCommand(str, Enum):
"""Supported control verbs embedded in user messages."""
MODEL = "MODEL"
PROMPT = "PROMPT"
GET_CONFIG = "GET_CONFIG"
@classmethod
def choices(cls) -> tuple[str, ...]:
return tuple(item.value for item in cls)
@dataclass(frozen=True)
class ModelSpec:
"""Represents a LiteLLM model and optional provider."""
model: str
provider: Optional[str] = None
_COMMAND_PATTERN = re.compile(
r"^\[HOTSWAP:(?P<verb>[A-Z_]+)(?::(?P<payload>.*))?\]$",
)
def parse_control_message(text: str) -> Optional[Tuple[HotSwapCommand, Optional[str]]]:
"""Return hot-swap command tuple when the string matches the control format."""
match = _COMMAND_PATTERN.match(text.strip())
if not match:
return None
verb = match.group("verb")
if verb not in HotSwapCommand.choices():
return None
payload = match.group("payload")
return HotSwapCommand(verb), payload if payload else None
def build_control_message(command: HotSwapCommand, payload: Optional[str] = None) -> str:
"""Serialise a control command for downstream clients."""
if command not in HotSwapCommand:
raise ValueError(f"Unsupported hot-swap command: {command}")
if payload is None or payload == "":
return f"[HOTSWAP:{command.value}]"
return f"[HOTSWAP:{command.value}:{payload}]"
def parse_model_spec(model: str, provider: Optional[str] = None) -> ModelSpec:
"""Parse model/provider inputs into a structured ModelSpec."""
candidate = (model or "").strip()
if not candidate:
raise ValueError("Model name cannot be empty")
if provider:
provider_clean = provider.strip()
if not provider_clean:
raise ValueError("Provider cannot be empty when supplied")
if "/" in candidate:
raise ValueError(
"Provide either provider/model or use provider argument, not both",
)
return ModelSpec(model=candidate, provider=provider_clean)
if "/" in candidate:
provider_part, model_part = candidate.split("/", 1)
provider_part = provider_part.strip()
model_part = model_part.strip()
if not provider_part or not model_part:
raise ValueError("Model spec must include provider and model when using '/' format")
return ModelSpec(model=model_part, provider=provider_part)
if DEFAULT_PROVIDER:
return ModelSpec(model=candidate, provider=DEFAULT_PROVIDER.strip())
return ModelSpec(model=candidate, provider=None)
def serialize_model_spec(spec: ModelSpec) -> str:
"""Render a ModelSpec to provider/model string for control messages."""
if spec.provider:
return f"{spec.provider}/{spec.model}"
return spec.model


@@ -1,9 +0,0 @@
"""System prompt templates for the LiteLLM agent."""
BASE_INSTRUCTION = (
"You are a focused orchestration layer that relays between the user and a"
" LiteLLM managed model."
"\n- Keep answers concise and actionable."
"\n- Prefer plain language; reveal intermediate reasoning only when helpful."
"\n- Surface any tool results clearly with short explanations."
)


@@ -1,86 +0,0 @@
"""Session state utilities for the LiteLLM hot-swap agent."""
from __future__ import annotations
from dataclasses import dataclass
from typing import Any, Mapping, MutableMapping, Optional
from .config import (
DEFAULT_MODEL,
DEFAULT_PROVIDER,
STATE_MODEL_KEY,
STATE_PROMPT_KEY,
STATE_PROVIDER_KEY,
)
@dataclass(slots=True)
class HotSwapState:
"""Lightweight view of the hot-swap session state."""
model: str = DEFAULT_MODEL
provider: Optional[str] = None
prompt: Optional[str] = None
@classmethod
def from_mapping(cls, mapping: Optional[Mapping[str, Any]]) -> "HotSwapState":
if not mapping:
return cls()
raw_model = mapping.get(STATE_MODEL_KEY, DEFAULT_MODEL)
raw_provider = mapping.get(STATE_PROVIDER_KEY)
raw_prompt = mapping.get(STATE_PROMPT_KEY)
model = raw_model.strip() if isinstance(raw_model, str) else DEFAULT_MODEL
provider = raw_provider.strip() if isinstance(raw_provider, str) else None
if not provider and DEFAULT_PROVIDER:
provider = DEFAULT_PROVIDER.strip() or None
prompt = raw_prompt.strip() if isinstance(raw_prompt, str) else None
return cls(
model=model or DEFAULT_MODEL,
provider=provider or None,
prompt=prompt or None,
)
def persist(self, store: MutableMapping[str, object]) -> None:
store[STATE_MODEL_KEY] = self.model
if self.provider:
store[STATE_PROVIDER_KEY] = self.provider
else:
store[STATE_PROVIDER_KEY] = None
store[STATE_PROMPT_KEY] = self.prompt
def describe(self) -> str:
prompt_value = self.prompt if self.prompt else "(default prompt)"
provider_value = self.provider if self.provider else "(default provider)"
return (
"📊 Current Configuration\n"
"━━━━━━━━━━━━━━━━━━━━━━\n"
f"Model: {self.model}\n"
f"Provider: {provider_value}\n"
f"System Prompt: {prompt_value}\n"
"━━━━━━━━━━━━━━━━━━━━━━"
)
def instantiate_llm(self):
"""Create a LiteLlm instance for the current state."""
from google.adk.models.lite_llm import LiteLlm # Lazy import to avoid cycle
kwargs = {"model": self.model}
if self.provider:
kwargs["custom_llm_provider"] = self.provider
return LiteLlm(**kwargs)
@property
def display_model(self) -> str:
if self.provider:
return f"{self.provider}/{self.model}"
return self.model
def apply_state_to_agent(invocation_context, state: HotSwapState) -> None:
"""Update the provided agent with a LiteLLM instance matching state."""
agent = invocation_context.agent
agent.model = state.instantiate_llm()


@@ -1,64 +0,0 @@
"""Tool definitions exposed to the LiteLLM agent."""
from __future__ import annotations
from typing import Optional
from google.adk.tools import FunctionTool, ToolContext
from .control import parse_model_spec
from .state import HotSwapState, apply_state_to_agent
async def set_model(
model: str,
*,
provider: Optional[str] = None,
tool_context: ToolContext,
) -> str:
"""Hot-swap the active LiteLLM model for this session."""
spec = parse_model_spec(model, provider=provider)
state = HotSwapState.from_mapping(tool_context.state)
state.model = spec.model
state.provider = spec.provider
state.persist(tool_context.state)
try:
apply_state_to_agent(tool_context._invocation_context, state)
except Exception as exc: # pragma: no cover - defensive reporting
return f"❌ Failed to apply model '{state.display_model}': {exc}"
return f"✅ Model switched to: {state.display_model}"
async def set_prompt(prompt: str, *, tool_context: ToolContext) -> str:
"""Update or clear the system prompt used for this session."""
state = HotSwapState.from_mapping(tool_context.state)
prompt_value = prompt.strip()
state.prompt = prompt_value or None
state.persist(tool_context.state)
if state.prompt:
return "✅ System prompt updated. This change takes effect immediately."
return "✅ System prompt cleared. Reverting to default instruction."
async def get_config(*, tool_context: ToolContext) -> str:
"""Return a summary of the current model and prompt configuration."""
state = HotSwapState.from_mapping(tool_context.state)
return state.describe()
HOTSWAP_TOOLS = [
FunctionTool(set_model),
FunctionTool(set_prompt),
FunctionTool(get_config),
]
__all__ = [
"set_model",
"set_prompt",
"get_config",
"HOTSWAP_TOOLS",
]


@@ -1,13 +0,0 @@
"""ASGI entrypoint for containerized deployments."""
from pathlib import Path
from google.adk.cli.fast_api import get_fast_api_app
AGENT_DIR = Path(__file__).resolve().parent
app = get_fast_api_app(
agents_dir=str(AGENT_DIR),
web=False,
a2a=True,
)


@@ -1,4 +0,0 @@
google-adk
a2a-sdk[all]
litellm
python-dotenv


@@ -1,93 +0,0 @@
FuzzForge AI LLM Configuration Guide
===================================
This note summarises the environment variables and libraries that drive LiteLLM (via the Google ADK runtime) inside the FuzzForge AI module. For complete matrices and advanced examples, read `docs/advanced/configuration.md`.
Core Libraries
--------------
- `google-adk` hosts the agent runtime, memory services, and LiteLLM bridge.
- `litellm`: provider-agnostic LLM client used by ADK and the executor.
- Provider SDKs: install the SDK that matches your target backend (`openai`, `anthropic`, `google-cloud-aiplatform`, `groq`, etc.).
- Optional extras: `agentops` for tracing, `cognee[all]` for knowledge-graph ingestion, `ollama` CLI for running local models.
Quick install foundation::
```
pip install google-adk litellm openai
```
Add any provider-specific SDKs (for example `pip install anthropic groq`) on top of that base.
Baseline Setup
--------------
Copy `.fuzzforge/.env.template` to `.fuzzforge/.env` and set the core fields:
```
LLM_PROVIDER=openai
LITELLM_MODEL=gpt-5-mini
OPENAI_API_KEY=sk-your-key
FUZZFORGE_MCP_URL=http://localhost:8010/mcp
SESSION_PERSISTENCE=sqlite
MEMORY_SERVICE=inmemory
```
LiteLLM Provider Examples
-------------------------
OpenAI-compatible (Azure, etc.)::
```
LLM_PROVIDER=azure_openai
LITELLM_MODEL=gpt-4o-mini
LLM_API_KEY=sk-your-azure-key
LLM_ENDPOINT=https://your-resource.openai.azure.com
```
Anthropic::
```
LLM_PROVIDER=anthropic
LITELLM_MODEL=claude-3-haiku-20240307
ANTHROPIC_API_KEY=sk-your-key
```
Ollama (local)::
```
LLM_PROVIDER=ollama_chat
LITELLM_MODEL=codellama:latest
OLLAMA_API_BASE=http://localhost:11434
```
Run `ollama pull codellama:latest` so the adapter can respond immediately.
Vertex AI::
```
LLM_PROVIDER=vertex_ai
LITELLM_MODEL=gemini-1.5-pro
GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
```
Provider Checklist
------------------
- **OpenAI / Azure OpenAI**: `LLM_PROVIDER`, `LITELLM_MODEL`, API key, optional endpoint + API version (Azure).
- **Anthropic**: `LLM_PROVIDER=anthropic`, `LITELLM_MODEL`, `ANTHROPIC_API_KEY`.
- **Google Vertex AI**: `LLM_PROVIDER=vertex_ai`, `LITELLM_MODEL`, `GOOGLE_APPLICATION_CREDENTIALS`, `GOOGLE_CLOUD_PROJECT`.
- **Groq**: `LLM_PROVIDER=groq`, `LITELLM_MODEL`, `GROQ_API_KEY`.
- **Ollama / Local**: `LLM_PROVIDER=ollama_chat`, `LITELLM_MODEL`, `OLLAMA_API_BASE`, and the model pulled locally (`ollama pull <model>`).
Knowledge Graph Add-ons
-----------------------
Set these only if you plan to use Cognee project graphs:
```
LLM_COGNEE_PROVIDER=openai
LLM_COGNEE_MODEL=gpt-5-mini
LLM_COGNEE_API_KEY=sk-your-key
```
Tracing & Debugging
-------------------
- Provide `AGENTOPS_API_KEY` to enable hosted traces for every conversation.
- Set `FUZZFORGE_DEBUG=1` (and optionally `LOG_LEVEL=DEBUG`) for verbose executor output.
- Restart the agent after changing environment variables; LiteLLM loads configuration on boot.
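A typical debugging block in `.fuzzforge/.env` therefore looks like (values illustrative)::
```
AGENTOPS_API_KEY=sk-your-agentops-key
FUZZFORGE_DEBUG=1
LOG_LEVEL=DEBUG
```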
Further Reading
---------------
`docs/advanced/configuration.md`: provider comparison, debugging flags, and referenced modules.


@@ -1,44 +0,0 @@
[project]
name = "fuzzforge-ai"
version = "0.7.0"
description = "FuzzForge AI orchestration module"
readme = "README.md"
requires-python = ">=3.11"
dependencies = [
"google-adk",
"a2a-sdk",
"litellm",
"python-dotenv",
"httpx",
"uvicorn",
"rich",
"agentops",
"fastmcp",
"mcp",
"typing-extensions",
"cognee>=0.3.0",
]
[project.optional-dependencies]
dev = [
"pytest",
"pytest-asyncio",
"black",
"ruff",
]
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
[tool.hatch.build.targets.wheel]
packages = ["src/fuzzforge_ai"]
[tool.hatch.metadata]
allow-direct-references = true
[tool.uv]
dev-dependencies = [
"pytest",
"pytest-asyncio",
]


@@ -1,24 +0,0 @@
"""
FuzzForge AI Module - Agent-to-Agent orchestration system
This module integrates the fuzzforge_ai components into FuzzForge,
providing intelligent AI agent capabilities for security analysis.
Usage:
from fuzzforge_ai.a2a_wrapper import send_agent_task
from fuzzforge_ai.agent import FuzzForgeAgent
from fuzzforge_ai.config_manager import ConfigManager
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
__version__ = "0.6.0"


@@ -1,110 +0,0 @@
# ruff: noqa: E402 # Imports delayed for environment/logging setup
"""
FuzzForge A2A Server
Run this to expose FuzzForge as an A2A-compatible agent
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import os
import warnings
import logging
from dotenv import load_dotenv
from fuzzforge_ai.config_bridge import ProjectConfigManager
# Suppress warnings
warnings.filterwarnings("ignore")
logging.getLogger("google.adk").setLevel(logging.ERROR)
logging.getLogger("google.adk.tools.base_authenticated_tool").setLevel(logging.ERROR)
# Load .env from .fuzzforge directory first, then fallback
from pathlib import Path
# Ensure Cognee logs stay inside the project workspace
project_root = Path.cwd()
default_log_dir = project_root / ".fuzzforge" / "logs"
default_log_dir.mkdir(parents=True, exist_ok=True)
log_path = default_log_dir / "cognee.log"
os.environ.setdefault("COGNEE_LOG_PATH", str(log_path))
fuzzforge_env = Path.cwd() / ".fuzzforge" / ".env"
if fuzzforge_env.exists():
load_dotenv(fuzzforge_env, override=True)
else:
load_dotenv(override=True)
# Ensure Cognee uses the project-specific storage paths when available
try:
project_config = ProjectConfigManager()
project_config.setup_cognee_environment()
except Exception:
# Project may not be initialized; fall through with default settings
pass
# Check configuration
if not os.getenv('LITELLM_MODEL'):
print("[ERROR] LITELLM_MODEL not set in .env file")
print("Please set LITELLM_MODEL to your desired model (e.g., gpt-4o-mini)")
exit(1)
from .agent import get_fuzzforge_agent
from .a2a_server import create_a2a_app as create_custom_a2a_app
def create_a2a_app():
"""Create the A2A application"""
# Get configuration
port = int(os.getenv('FUZZFORGE_PORT', 10100))
# Get the FuzzForge agent
fuzzforge = get_fuzzforge_agent()
# Print ASCII banner
print("\033[95m") # Purple color
print(" ███████╗██╗ ██╗███████╗███████╗███████╗ ██████╗ ██████╗ ██████╗ ███████╗ █████╗ ██╗")
print(" ██╔════╝██║ ██║╚══███╔╝╚══███╔╝██╔════╝██╔═══██╗██╔══██╗██╔════╝ ██╔════╝ ██╔══██╗██║")
print(" █████╗ ██║ ██║ ███╔╝ ███╔╝ █████╗ ██║ ██║██████╔╝██║ ███╗█████╗ ███████║██║")
print(" ██╔══╝ ██║ ██║ ███╔╝ ███╔╝ ██╔══╝ ██║ ██║██╔══██╗██║ ██║██╔══╝ ██╔══██║██║")
print(" ██║ ╚██████╔╝███████╗███████╗██║ ╚██████╔╝██║ ██║╚██████╔╝███████╗ ██║ ██║██║")
print(" ╚═╝ ╚═════╝ ╚══════╝╚══════╝╚═╝ ╚═════╝ ╚═╝ ╚═╝ ╚═════╝ ╚══════╝ ╚═╝ ╚═╝╚═╝")
print("\033[0m") # Reset color
# Create A2A app
print("🚀 Starting FuzzForge A2A Server")
print(f" Model: {fuzzforge.model}")
if fuzzforge.cognee_url:
print(f" Memory: Cognee at {fuzzforge.cognee_url}")
print(f" Port: {port}")
app = create_custom_a2a_app(fuzzforge.adk_agent, port=port, executor=fuzzforge.executor)
print("\n✅ FuzzForge A2A Server ready!")
print(f" Agent card: http://localhost:{port}/.well-known/agent-card.json")
print(f" A2A endpoint: http://localhost:{port}/")
print(f"\n📡 Other agents can register FuzzForge at: http://localhost:{port}")
return app
def main():
"""Start the A2A server using uvicorn."""
import uvicorn
app = create_a2a_app()
port = int(os.getenv('FUZZFORGE_PORT', 10100))
print("\n🎯 Starting server with uvicorn...")
uvicorn.run(app, host="127.0.0.1", port=port)
if __name__ == "__main__":
main()


@@ -1,229 +0,0 @@
"""Custom A2A wiring so we can access task store and queue manager."""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
from __future__ import annotations
import logging
from typing import Optional, Union
from starlette.applications import Starlette
from starlette.responses import Response, FileResponse
from google.adk.a2a.executor.a2a_agent_executor import A2aAgentExecutor
from google.adk.a2a.utils.agent_card_builder import AgentCardBuilder
from google.adk.a2a.experimental import a2a_experimental
from google.adk.agents.base_agent import BaseAgent
from google.adk.artifacts.in_memory_artifact_service import InMemoryArtifactService
from google.adk.auth.credential_service.in_memory_credential_service import InMemoryCredentialService
from google.adk.cli.utils.logs import setup_adk_logger
from google.adk.memory.in_memory_memory_service import InMemoryMemoryService
from google.adk.runners import Runner
from google.adk.sessions.in_memory_session_service import InMemorySessionService
from a2a.server.apps import A2AStarletteApplication
from a2a.server.request_handlers.default_request_handler import DefaultRequestHandler
from a2a.server.tasks.inmemory_task_store import InMemoryTaskStore
from a2a.server.events.in_memory_queue_manager import InMemoryQueueManager
from a2a.types import AgentCard
from .agent_executor import FuzzForgeExecutor
import json
async def serve_artifact(request):
"""Serve artifact files via HTTP for A2A agents"""
artifact_id = request.path_params["artifact_id"]
# Try to get the executor instance to access artifact cache
# We'll store a reference to it during app creation
executor = getattr(serve_artifact, '_executor', None)
if not executor:
return Response("Artifact service not available", status_code=503)
try:
# Look in the artifact cache directory
artifact_cache_dir = executor._artifact_cache_dir
artifact_dir = artifact_cache_dir / artifact_id
if not artifact_dir.exists():
return Response("Artifact not found", status_code=404)
# Find the artifact file (should be only one file in the directory)
artifact_files = list(artifact_dir.glob("*"))
if not artifact_files:
return Response("Artifact file not found", status_code=404)
artifact_file = artifact_files[0] # Take the first (and should be only) file
# Determine mime type from file extension or default to octet-stream
import mimetypes
mime_type, _ = mimetypes.guess_type(str(artifact_file))
if not mime_type:
mime_type = 'application/octet-stream'
return FileResponse(
path=str(artifact_file),
media_type=mime_type,
filename=artifact_file.name
)
except Exception as e:
return Response(f"Error serving artifact: {str(e)}", status_code=500)
async def knowledge_query(request):
"""Expose knowledge graph search over HTTP for external agents."""
executor = getattr(knowledge_query, '_executor', None)
if not executor:
return Response("Knowledge service not available", status_code=503)
try:
payload = await request.json()
except Exception:
return Response("Invalid JSON body", status_code=400)
query = payload.get("query")
if not query:
return Response("'query' is required", status_code=400)
search_type = payload.get("search_type", "INSIGHTS")
dataset = payload.get("dataset")
result = await executor.query_project_knowledge_api(
query=query,
search_type=search_type,
dataset=dataset,
)
status = 200 if not isinstance(result, dict) or "error" not in result else 400
return Response(
json.dumps(result, default=str),
status_code=status,
media_type="application/json",
)
async def create_file_artifact(request):
"""Create an artifact from a project file via HTTP."""
executor = getattr(create_file_artifact, '_executor', None)
if not executor:
return Response("File service not available", status_code=503)
try:
payload = await request.json()
except Exception:
return Response("Invalid JSON body", status_code=400)
path = payload.get("path")
if not path:
return Response("'path' is required", status_code=400)
result = await executor.create_project_file_artifact_api(path)
status = 200 if not isinstance(result, dict) or "error" not in result else 400
return Response(
json.dumps(result, default=str),
status_code=status,
media_type="application/json",
)
def _load_agent_card(agent_card: Optional[Union[AgentCard, str]]) -> Optional[AgentCard]:
if agent_card is None:
return None
if isinstance(agent_card, AgentCard):
return agent_card
import json
from pathlib import Path
path = Path(agent_card)
with path.open('r', encoding='utf-8') as handle:
data = json.load(handle)
return AgentCard(**data)
@a2a_experimental
def create_a2a_app(
agent: BaseAgent,
*,
host: str = "localhost",
port: int = 8000,
protocol: str = "http",
agent_card: Optional[Union[AgentCard, str]] = None,
executor=None, # Accept executor reference
) -> Starlette:
"""Variant of google.adk.a2a.utils.to_a2a that exposes task-store handles."""
setup_adk_logger(logging.INFO)
async def create_runner() -> Runner:
return Runner(
agent=agent,
app_name=agent.name or "fuzzforge",
artifact_service=InMemoryArtifactService(),
session_service=InMemorySessionService(),
memory_service=InMemoryMemoryService(),
credential_service=InMemoryCredentialService(),
)
task_store = InMemoryTaskStore()
queue_manager = InMemoryQueueManager()
agent_executor = A2aAgentExecutor(runner=create_runner)
request_handler = DefaultRequestHandler(
agent_executor=agent_executor,
task_store=task_store,
queue_manager=queue_manager,
)
rpc_url = f"{protocol}://{host}:{port}/"
provided_card = _load_agent_card(agent_card)
card_builder = AgentCardBuilder(agent=agent, rpc_url=rpc_url)
app = Starlette()
async def setup() -> None:
if provided_card is not None:
final_card = provided_card
else:
final_card = await card_builder.build()
a2a_app = A2AStarletteApplication(
agent_card=final_card,
http_handler=request_handler,
)
a2a_app.add_routes_to_app(app)
# Add artifact serving route
app.router.add_route("/artifacts/{artifact_id}", serve_artifact, methods=["GET"])
app.router.add_route("/graph/query", knowledge_query, methods=["POST"])
app.router.add_route("/project/files", create_file_artifact, methods=["POST"])
app.add_event_handler("startup", setup)
# Expose handles so the executor can emit task updates later
FuzzForgeExecutor.task_store = task_store
FuzzForgeExecutor.queue_manager = queue_manager
# Store reference to executor for artifact serving
serve_artifact._executor = executor
knowledge_query._executor = executor
create_file_artifact._executor = executor
return app
__all__ = ["create_a2a_app"]
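A serving sketch, under stated assumptions: the model string, agent name, and port are placeholders, and the demo agent is built the same way FuzzForgeAgent constructs its ADK agent elsewhere in this changeset.

if __name__ == "__main__":
    import uvicorn
    from google.adk import Agent
    from google.adk.models.lite_llm import LiteLlm

    demo_agent = Agent(
        model=LiteLlm(model="gpt-4o-mini"),  # assumed model name
        name="fuzzforge_demo",
        description="Demo A2A agent",
        instruction="You are a concise assistant.",
        tools=[],
    )
    # Expose the agent over A2A and serve the resulting Starlette app
    app = create_a2a_app(demo_agent, host="localhost", port=8000)
    uvicorn.run(app, host="localhost", port=8000)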


@@ -1,288 +0,0 @@
"""
A2A Wrapper Module for FuzzForge
Programmatic interface to send tasks to A2A agents with custom model/prompt/context
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
from __future__ import annotations
from typing import Optional, Any
from uuid import uuid4
import httpx
from a2a.client import A2AClient
from a2a.client.errors import A2AClientHTTPError
from a2a.types import (
JSONRPCErrorResponse,
Message,
MessageSendConfiguration,
MessageSendParams,
Part,
Role,
SendMessageRequest,
SendStreamingMessageRequest,
Task,
TaskArtifactUpdateEvent,
TaskStatusUpdateEvent,
TextPart,
)
class A2ATaskResult:
"""Result from an A2A agent task"""
def __init__(self, text: str, context_id: str, raw_response: Any = None):
self.text = text
self.context_id = context_id
self.raw_response = raw_response
def __str__(self) -> str:
return self.text
def __repr__(self) -> str:
return f"A2ATaskResult(text={self.text[:50]}..., context_id={self.context_id})"
def _build_control_message(command: str, payload: Optional[str] = None) -> str:
"""Build a control message for hot-swapping agent configuration"""
if payload is None or payload == "":
return f"[HOTSWAP:{command}]"
return f"[HOTSWAP:{command}:{payload}]"
def _extract_text(
result: Message | Task | TaskStatusUpdateEvent | TaskArtifactUpdateEvent,
) -> list[str]:
"""Extract text content from A2A response objects"""
texts: list[str] = []
if isinstance(result, Message):
if result.role is Role.agent:
for part in result.parts:
root_part = part.root
text = getattr(root_part, "text", None)
if text:
texts.append(text)
elif isinstance(result, Task) and result.history:
for msg in result.history:
if msg.role is Role.agent:
for part in msg.parts:
root_part = part.root
text = getattr(root_part, "text", None)
if text:
texts.append(text)
elif isinstance(result, TaskStatusUpdateEvent):
message = result.status.message
if message:
texts.extend(_extract_text(message))
elif isinstance(result, TaskArtifactUpdateEvent):
artifact = result.artifact
if artifact and artifact.parts:
for part in artifact.parts:
root_part = part.root
text = getattr(root_part, "text", None)
if text:
texts.append(text)
return texts
async def _send_message(
client: A2AClient,
message: str,
context_id: str,
) -> str:
"""Send a message to the A2A agent and collect the response"""
params = MessageSendParams(
configuration=MessageSendConfiguration(blocking=True),
message=Message(
context_id=context_id,
message_id=str(uuid4()),
role=Role.user,
parts=[Part(root=TextPart(text=message))],
),
)
stream_request = SendStreamingMessageRequest(id=str(uuid4()), params=params)
buffer: list[str] = []
try:
async for response in client.send_message_streaming(stream_request):
root = response.root
if isinstance(root, JSONRPCErrorResponse):
raise RuntimeError(f"A2A error: {root.error}")
payload = root.result
buffer.extend(_extract_text(payload))
except A2AClientHTTPError as exc:
if "text/event-stream" not in str(exc):
raise
# Fallback to non-streaming
send_request = SendMessageRequest(id=str(uuid4()), params=params)
response = await client.send_message(send_request)
root = response.root
if isinstance(root, JSONRPCErrorResponse):
raise RuntimeError(f"A2A error: {root.error}")
payload = root.result
buffer.extend(_extract_text(payload))
if buffer:
buffer = list(dict.fromkeys(buffer)) # Remove duplicates
return "\n".join(buffer).strip()
async def send_agent_task(
url: str,
message: str,
*,
model: Optional[str] = None,
provider: Optional[str] = None,
prompt: Optional[str] = None,
context: Optional[str] = None,
timeout: float = 120.0,
) -> A2ATaskResult:
"""
Send a task to an A2A agent with optional model/prompt configuration.
Args:
url: A2A endpoint URL (e.g., "http://127.0.0.1:8000/a2a/litellm_agent")
message: The task message to send to the agent
model: Optional model name (e.g., "gpt-4o", "gemini-2.0-flash")
provider: Optional provider name (e.g., "openai", "gemini")
prompt: Optional system prompt to set before sending the message
context: Optional context/session ID (generated if not provided)
timeout: Request timeout in seconds (default: 120)
Returns:
A2ATaskResult with the agent's response text and context ID
Example:
>>> result = await send_agent_task(
... url="http://127.0.0.1:8000/a2a/litellm_agent",
... model="gpt-4o",
... provider="openai",
... prompt="You are concise.",
... message="Give me a fuzzing harness.",
... context="fuzzing",
... timeout=120
... )
>>> print(result.text)
"""
timeout_config = httpx.Timeout(timeout)
context_id = context or str(uuid4())
async with httpx.AsyncClient(timeout=timeout_config) as http_client:
client = A2AClient(url=url, httpx_client=http_client)
# Set model if provided
if model:
model_spec = f"{provider}/{model}" if provider else model
control_msg = _build_control_message("MODEL", model_spec)
await _send_message(client, control_msg, context_id)
# Set prompt if provided
if prompt is not None:
control_msg = _build_control_message("PROMPT", prompt)
await _send_message(client, control_msg, context_id)
# Send the actual task message
response_text = await _send_message(client, message, context_id)
return A2ATaskResult(
text=response_text,
context_id=context_id,
)
async def get_agent_config(
url: str,
context: Optional[str] = None,
timeout: float = 60.0,
) -> str:
"""
Get the current configuration of an A2A agent.
Args:
url: A2A endpoint URL
context: Optional context/session ID
timeout: Request timeout in seconds
Returns:
Configuration string from the agent
"""
timeout_config = httpx.Timeout(timeout)
context_id = context or str(uuid4())
async with httpx.AsyncClient(timeout=timeout_config) as http_client:
client = A2AClient(url=url, httpx_client=http_client)
control_msg = _build_control_message("GET_CONFIG")
config_text = await _send_message(client, control_msg, context_id)
return config_text
async def hot_swap_model(
url: str,
model: str,
provider: Optional[str] = None,
context: Optional[str] = None,
timeout: float = 60.0,
) -> str:
"""
Hot-swap the model of an A2A agent without sending a task.
Args:
url: A2A endpoint URL
model: Model name to switch to
provider: Optional provider name
context: Optional context/session ID
timeout: Request timeout in seconds
Returns:
Response from the agent
"""
timeout_config = httpx.Timeout(timeout)
context_id = context or str(uuid4())
async with httpx.AsyncClient(timeout=timeout_config) as http_client:
client = A2AClient(url=url, httpx_client=http_client)
model_spec = f"{provider}/{model}" if provider else model
control_msg = _build_control_message("MODEL", model_spec)
response = await _send_message(client, control_msg, context_id)
return response
async def hot_swap_prompt(
url: str,
prompt: str,
context: Optional[str] = None,
timeout: float = 60.0,
) -> str:
"""
Hot-swap the system prompt of an A2A agent.
Args:
url: A2A endpoint URL
prompt: System prompt to set
context: Optional context/session ID
timeout: Request timeout in seconds
Returns:
Response from the agent
"""
timeout_config = httpx.Timeout(timeout)
context_id = context or str(uuid4())
async with httpx.AsyncClient(timeout=timeout_config) as http_client:
client = A2AClient(url=url, httpx_client=http_client)
control_msg = _build_control_message("PROMPT", prompt)
response = await _send_message(client, control_msg, context_id)
return response
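A usage sketch tying the helpers together; the endpoint URL mirrors the docstring example above and the task text is illustrative.

import asyncio

async def _demo_hot_swap() -> None:
    url = "http://127.0.0.1:8000/a2a/litellm_agent"  # illustrative endpoint
    # Swap the remote agent's model, then confirm via the GET_CONFIG control message
    await hot_swap_model(url, model="gpt-4o", provider="openai")
    print(await get_agent_config(url))
    # Send an actual task with a one-off system prompt
    result = await send_agent_task(
        url,
        "Summarise the latest fuzzing run.",
        prompt="You are concise.",
    )
    print(result.text)

if __name__ == "__main__":
    asyncio.run(_demo_hot_swap())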


@@ -1,133 +0,0 @@
"""
FuzzForge Agent Definition
The core agent that combines all components
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import os
from pathlib import Path
from typing import Dict, Any, List
from google.adk import Agent
from google.adk.models.lite_llm import LiteLlm
from .agent_card import get_fuzzforge_agent_card
from .agent_executor import FuzzForgeExecutor
from .memory_service import FuzzForgeMemoryService, HybridMemoryManager
# Load environment variables from the AI module's .env file
try:
from dotenv import load_dotenv
_ai_dir = Path(__file__).parent
_env_file = _ai_dir / ".env"
if _env_file.exists():
load_dotenv(_env_file, override=False) # Don't override existing env vars
except ImportError:
# dotenv not available, skip loading
pass
class FuzzForgeAgent:
"""The main FuzzForge agent that combines card, executor, and ADK agent"""
def __init__(
self,
model: str = None,
cognee_url: str = None,
port: int = 10100,
):
"""Initialize FuzzForge agent with configuration"""
self.model = model or os.getenv('LITELLM_MODEL', 'gpt-4o-mini')
self.cognee_url = cognee_url or os.getenv('COGNEE_MCP_URL')
self.port = port
# Initialize ADK Memory Service for conversational memory
memory_type = os.getenv('MEMORY_SERVICE', 'inmemory')
self.memory_service = FuzzForgeMemoryService(memory_type=memory_type)
# Create the executor (the brain) with memory and session services
self.executor = FuzzForgeExecutor(
model=self.model,
cognee_url=self.cognee_url,
debug=os.getenv('FUZZFORGE_DEBUG', '0') == '1',
memory_service=self.memory_service,
session_persistence=os.getenv('SESSION_PERSISTENCE', 'inmemory'),
fuzzforge_mcp_url=None, # Disabled
)
# Create Hybrid Memory Manager (ADK + Cognee direct integration)
# MCP tools removed - using direct Cognee integration only
self.memory_manager = HybridMemoryManager(
memory_service=self.memory_service,
cognee_tools=None # No MCP tools, direct integration used instead
)
# Get the agent card (the identity)
self.agent_card = get_fuzzforge_agent_card(f"http://localhost:{self.port}")
# Create the ADK agent (for A2A server mode)
self.adk_agent = self._create_adk_agent()
def _create_adk_agent(self) -> Agent:
"""Create the ADK agent for A2A server mode"""
# Build instruction
instruction = f"""You are {self.agent_card.name}, {self.agent_card.description}
Your capabilities include:
"""
for skill in self.agent_card.skills:
instruction += f"\n- {skill.name}: {skill.description}"
instruction += """
When responding to requests:
1. Use your registered agents when appropriate
2. Use Cognee memory tools when available
3. Provide helpful, concise responses
4. Maintain context across conversations
"""
# Create ADK agent
return Agent(
model=LiteLlm(model=self.model),
name=self.agent_card.name,
description=self.agent_card.description,
instruction=instruction,
tools=self.executor.agent.tools if hasattr(self.executor.agent, 'tools') else []
)
async def process_message(self, message: str, context_id: str = None) -> str:
"""Process a message using the executor"""
result = await self.executor.execute(message, context_id or "default")
return result.get("response", "No response generated")
async def register_agent(self, url: str) -> Dict[str, Any]:
"""Register a new agent"""
return await self.executor.register_agent(url)
def list_agents(self) -> List[Dict[str, Any]]:
"""List registered agents"""
return self.executor.list_agents()
async def cleanup(self):
"""Clean up resources"""
await self.executor.cleanup()
# Create a singleton instance for import
_instance = None
def get_fuzzforge_agent() -> FuzzForgeAgent:
"""Get the singleton FuzzForge agent instance"""
global _instance
if _instance is None:
_instance = FuzzForgeAgent()
return _instance
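A minimal driving sketch, assuming LITELLM_MODEL and the relevant API keys are already set in the environment.

import asyncio

async def _demo_agent() -> None:
    agent = get_fuzzforge_agent()
    # process_message() routes through the executor; the prompt is illustrative
    reply = await agent.process_message("List the registered agents.", context_id="demo")
    print(reply)
    await agent.cleanup()

if __name__ == "__main__":
    asyncio.run(_demo_agent())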


@@ -1,182 +0,0 @@
"""
FuzzForge Agent Card and Skills Definition
Defines what FuzzForge can do and how others can discover it
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
from dataclasses import dataclass
from typing import List, Dict, Any
@dataclass
class AgentSkill:
"""Represents a specific capability of the agent"""
id: str
name: str
description: str
tags: List[str]
examples: List[str]
input_modes: List[str] = None
output_modes: List[str] = None
def to_dict(self) -> Dict[str, Any]:
"""Convert to dictionary for JSON serialization"""
return {
"id": self.id,
"name": self.name,
"description": self.description,
"tags": self.tags,
"examples": self.examples,
"inputModes": self.input_modes or ["text/plain"],
"outputModes": self.output_modes or ["text/plain"]
}
@dataclass
class AgentCapabilities:
"""Defines agent capabilities for A2A protocol"""
streaming: bool = False
push_notifications: bool = False
multi_turn: bool = True
context_retention: bool = True
def to_dict(self) -> Dict[str, Any]:
return {
"streaming": self.streaming,
"pushNotifications": self.push_notifications,
"multiTurn": self.multi_turn,
"contextRetention": self.context_retention
}
@dataclass
class AgentCard:
"""The agent's business card - tells others what this agent can do"""
name: str
description: str
version: str
url: str
skills: List[AgentSkill]
capabilities: AgentCapabilities
default_input_modes: List[str] = None
default_output_modes: List[str] = None
preferred_transport: str = "JSONRPC"
protocol_version: str = "0.3.0"
def to_dict(self) -> Dict[str, Any]:
"""Convert to A2A-compliant agent card JSON"""
return {
"name": self.name,
"description": self.description,
"version": self.version,
"url": self.url,
"protocolVersion": self.protocol_version,
"preferredTransport": self.preferred_transport,
"defaultInputModes": self.default_input_modes or ["text/plain"],
"defaultOutputModes": self.default_output_modes or ["text/plain"],
"capabilities": self.capabilities.to_dict(),
"skills": [skill.to_dict() for skill in self.skills]
}
# Define FuzzForge's skills
orchestration_skill = AgentSkill(
id="orchestration",
name="Agent Orchestration",
description="Route requests to appropriate registered agents based on their capabilities",
tags=["orchestration", "routing", "coordination"],
examples=[
"Route this to the calculator",
"Send this to the appropriate agent",
"Which agent should handle this?"
]
)
memory_skill = AgentSkill(
id="memory",
name="Memory Management",
description="Store and retrieve information using Cognee knowledge graph",
tags=["memory", "knowledge", "storage", "cognee"],
examples=[
"Remember that my favorite color is blue",
"What do you remember about me?",
"Search your memory for project details"
]
)
conversation_skill = AgentSkill(
id="conversation",
name="General Conversation",
description="Engage in general conversation and answer questions using LLM",
tags=["chat", "conversation", "qa", "llm"],
examples=[
"What is the meaning of life?",
"Explain quantum computing",
"Help me understand this concept"
]
)
workflow_automation_skill = AgentSkill(
id="workflow_automation",
name="Workflow Automation",
description="Operate project workflows via MCP, monitor runs, and share results",
tags=["workflow", "automation", "mcp", "orchestration"],
examples=[
"Submit the security assessment workflow",
"Kick off the infrastructure scan and monitor it",
"Summarise findings for run abc123"
]
)
agent_management_skill = AgentSkill(
id="agent_management",
name="Agent Registry Management",
description="Register, list, and manage connections to other A2A agents",
tags=["registry", "management", "discovery"],
examples=[
"Register agent at http://localhost:10201",
"List all registered agents",
"Show agent capabilities"
]
)
# Define FuzzForge's capabilities
fuzzforge_capabilities = AgentCapabilities(
streaming=False,
push_notifications=True,
multi_turn=True, # We support multi-turn conversations
context_retention=True # We maintain context across turns
)
# Create the public agent card
def get_fuzzforge_agent_card(url: str = "http://localhost:10100") -> AgentCard:
"""Get FuzzForge's agent card with current configuration"""
return AgentCard(
name="ProjectOrchestrator",
description=(
"An A2A-capable project agent that can launch and monitor FuzzForge workflows, "
"consult the project knowledge graph, and coordinate with speciality agents."
),
version="project-agent",
url=url,
skills=[
orchestration_skill,
memory_skill,
conversation_skill,
agent_management_skill
],
capabilities=fuzzforge_capabilities,
default_input_modes=["text/plain", "application/json"],
default_output_modes=["text/plain", "application/json"],
preferred_transport="JSONRPC",
protocol_version="0.3.0"
)
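A serialisation sketch; writing the card to the well-known agent-card location is an assumption for illustration, not behaviour of this module.

import json
from pathlib import Path

if __name__ == "__main__":
    card = get_fuzzforge_agent_card("http://localhost:10100")
    out = Path(".well-known/agent-card.json")  # assumed discovery path
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(json.dumps(card.to_dict(), indent=2), encoding="utf-8")
    print(f"Wrote {out}")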

File diff suppressed because it is too large


@@ -1,971 +0,0 @@
#!/usr/bin/env python3
# ruff: noqa: E402 # Imports delayed for environment/logging setup
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
"""
FuzzForge CLI - Clean modular version
Uses the separated agent components
"""
import asyncio
import shlex
import os
import sys
import signal
import warnings
import logging
import random
from datetime import datetime
from contextlib import contextmanager
from pathlib import Path
from dotenv import load_dotenv
# Ensure Cognee writes logs inside the project workspace
project_root = Path.cwd()
default_log_dir = project_root / ".fuzzforge" / "logs"
default_log_dir.mkdir(parents=True, exist_ok=True)
log_path = default_log_dir / "cognee.log"
os.environ.setdefault("COGNEE_LOG_PATH", str(log_path))
# Suppress warnings
warnings.filterwarnings("ignore")
logging.basicConfig(level=logging.ERROR)
# Load .env file with explicit path handling
# 1. First check current working directory for .fuzzforge/.env
fuzzforge_env = Path.cwd() / ".fuzzforge" / ".env"
if fuzzforge_env.exists():
load_dotenv(fuzzforge_env, override=True)
else:
# 2. Then check parent directories for .fuzzforge projects
current_path = Path.cwd()
for parent in [current_path] + list(current_path.parents):
fuzzforge_dir = parent / ".fuzzforge"
if fuzzforge_dir.exists():
project_env = fuzzforge_dir / ".env"
if project_env.exists():
load_dotenv(project_env, override=True)
break
else:
# 3. Fallback to generic load_dotenv
load_dotenv(override=True)
# Enhanced readline configuration for Rich Console input compatibility
try:
import readline
# Enable Rich-compatible input features
readline.parse_and_bind("tab: complete")
readline.parse_and_bind("set editing-mode emacs")
readline.parse_and_bind("set show-all-if-ambiguous on")
readline.parse_and_bind("set completion-ignore-case on")
readline.parse_and_bind("set colored-completion-prefix on")
readline.parse_and_bind("set enable-bracketed-paste on") # Better paste support
# Navigation bindings for better editing
readline.parse_and_bind("Control-a: beginning-of-line")
readline.parse_and_bind("Control-e: end-of-line")
readline.parse_and_bind("Control-u: unix-line-discard")
readline.parse_and_bind("Control-k: kill-line")
readline.parse_and_bind("Control-w: unix-word-rubout")
readline.parse_and_bind("Meta-Backspace: backward-kill-word")
# History and completion
readline.set_history_length(2000)
readline.set_startup_hook(None)
# Enable multiline editing hints
readline.parse_and_bind("set horizontal-scroll-mode off")
readline.parse_and_bind("set mark-symlinked-directories on")
READLINE_AVAILABLE = True
except ImportError:
READLINE_AVAILABLE = False
from rich.console import Console
from rich.table import Table
from rich.panel import Panel
from rich import box
from .agent import FuzzForgeAgent
from .config_manager import ConfigManager
from .config_bridge import ProjectConfigManager
console = Console()
# Global shutdown flag
shutdown_requested = False
# Dynamic status messages for better UX
THINKING_MESSAGES = [
"Thinking", "Processing", "Computing", "Analyzing", "Working",
"Pondering", "Deliberating", "Calculating", "Reasoning", "Evaluating"
]
WORKING_MESSAGES = [
"Working", "Processing", "Handling", "Executing", "Running",
"Operating", "Performing", "Conducting", "Managing", "Coordinating"
]
SEARCH_MESSAGES = [
"Searching", "Scanning", "Exploring", "Investigating", "Hunting",
"Seeking", "Probing", "Examining", "Inspecting", "Browsing"
]
# Cool prompt symbols
PROMPT_STYLES = [
"", "", "", "", "»", "", "", "", "", ""
]
def get_dynamic_status(action_type="thinking"):
"""Get a random status message based on action type"""
if action_type == "thinking":
return f"{random.choice(THINKING_MESSAGES)}..."
elif action_type == "working":
return f"{random.choice(WORKING_MESSAGES)}..."
elif action_type == "searching":
return f"{random.choice(SEARCH_MESSAGES)}..."
else:
return f"{random.choice(THINKING_MESSAGES)}..."
def get_prompt_symbol():
"""Get prompt symbol indicating where to write"""
return ">>"
def signal_handler(signum, frame):
"""Handle Ctrl+C gracefully"""
global shutdown_requested
shutdown_requested = True
console.print("\n\n[yellow]Shutting down gracefully...[/yellow]")
sys.exit(0)
signal.signal(signal.SIGINT, signal_handler)
@contextmanager
def safe_status(message: str):
"""Safe status context manager"""
status = console.status(message, spinner="dots")
try:
status.start()
yield
finally:
status.stop()
class FuzzForgeCLI:
"""Command-line interface for FuzzForge"""
def __init__(self):
"""Initialize the CLI"""
# Ensure .env is loaded from .fuzzforge directory
fuzzforge_env = Path.cwd() / ".fuzzforge" / ".env"
if fuzzforge_env.exists():
load_dotenv(fuzzforge_env, override=True)
# Load configuration for agent registry
self.config_manager = ConfigManager()
# Check environment configuration
if not os.getenv('LITELLM_MODEL'):
console.print("[red]ERROR: LITELLM_MODEL not set in .env file[/red]")
console.print("Please set LITELLM_MODEL to your desired model")
sys.exit(1)
# Create the agent (uses env vars directly)
self.agent = FuzzForgeAgent()
# Create a consistent context ID for this CLI session
self.context_id = f"cli_{datetime.now().strftime('%Y%m%d_%H%M%S')}"
# Track registered agents for config persistence
self.agents_modified = False
# Command handlers
self.commands = {
"/help": self.cmd_help,
"/register": self.cmd_register,
"/unregister": self.cmd_unregister,
"/list": self.cmd_list,
"/memory": self.cmd_memory,
"/recall": self.cmd_recall,
"/artifacts": self.cmd_artifacts,
"/tasks": self.cmd_tasks,
"/skills": self.cmd_skills,
"/sessions": self.cmd_sessions,
"/clear": self.cmd_clear,
"/sendfile": self.cmd_sendfile,
"/quit": self.cmd_quit,
"/exit": self.cmd_quit,
}
self.background_tasks: set[asyncio.Task] = set()
def print_banner(self):
"""Print welcome banner"""
card = self.agent.agent_card
# Print ASCII banner
console.print("[medium_purple3] ███████╗██╗ ██╗███████╗███████╗███████╗ ██████╗ ██████╗ ██████╗ ███████╗ █████╗ ██╗[/medium_purple3]")
console.print("[medium_purple3] ██╔════╝██║ ██║╚══███╔╝╚══███╔╝██╔════╝██╔═══██╗██╔══██╗██╔════╝ ██╔════╝ ██╔══██╗██║[/medium_purple3]")
console.print("[medium_purple3] █████╗ ██║ ██║ ███╔╝ ███╔╝ █████╗ ██║ ██║██████╔╝██║ ███╗█████╗ ███████║██║[/medium_purple3]")
console.print("[medium_purple3] ██╔══╝ ██║ ██║ ███╔╝ ███╔╝ ██╔══╝ ██║ ██║██╔══██╗██║ ██║██╔══╝ ██╔══██║██║[/medium_purple3]")
console.print("[medium_purple3] ██║ ╚██████╔╝███████╗███████╗██║ ╚██████╔╝██║ ██║╚██████╔╝███████╗ ██║ ██║██║[/medium_purple3]")
console.print("[medium_purple3] ╚═╝ ╚═════╝ ╚══════╝╚══════╝╚═╝ ╚═════╝ ╚═╝ ╚═╝ ╚═════╝ ╚══════╝ ╚═╝ ╚═╝╚═╝[/medium_purple3]")
console.print(f"\n[dim]{card.description}[/dim]\n")
provider = (
os.getenv("LLM_PROVIDER")
or os.getenv("LLM_COGNEE_PROVIDER")
or os.getenv("COGNEE_LLM_PROVIDER")
or "unknown"
)
console.print(
"LLM Provider: [medium_purple1]{provider}[/medium_purple1]".format(
provider=provider
)
)
console.print(
"LLM Model: [medium_purple1]{model}[/medium_purple1]".format(
model=self.agent.model
)
)
if self.agent.executor.agentops_trace:
console.print("Tracking: [medium_purple1]AgentOps active[/medium_purple1]")
# Show skills
console.print("\nSkills:")
for skill in card.skills:
console.print(
f" • [deep_sky_blue1]{skill.name}[/deep_sky_blue1] {skill.description}"
)
console.print("\nType /help for commands or just chat\n")
async def cmd_help(self, args: str = "") -> None:
"""Show help"""
help_text = """
[bold]Commands:[/bold]
/register <url> - Register an A2A agent (saves to config)
/unregister <name> - Remove agent from registry and config
/list - List registered agents
[bold]Memory Systems:[/bold]
/recall <query> - Search past conversations (ADK Memory)
/memory - Show knowledge graph (Cognee)
/memory save - Save to knowledge graph
/memory search - Search knowledge graph
[bold]Other:[/bold]
/artifacts - List created artifacts
/artifacts <id> - Show artifact content
/tasks [id] - Show task list or details
/skills - Show FuzzForge skills
/sessions - List active sessions
/sendfile <agent> <path> [message] - Attach file as artifact and route to agent
/clear - Clear screen
/help - Show this help
/quit - Exit
[bold]Sample prompts:[/bold]
run fuzzforge workflow security_assessment on /absolute/path --volume-mode ro
list fuzzforge runs limit=5
get fuzzforge summary <run_id>
query project knowledge about "unsafe Rust" using GRAPH_COMPLETION
export project file src/lib.rs as artifact
/memory search "recent findings"
[bold]Input Editing:[/bold]
Arrow keys - Move cursor
Ctrl+A/E - Start/end of line
Up/Down - Command history
"""
console.print(help_text)
async def cmd_register(self, args: str) -> None:
"""Register an agent"""
if not args:
console.print("Usage: /register <url>")
return
with safe_status(f"{get_dynamic_status('working')} Registering {args}"):
result = await self.agent.register_agent(args.strip())
if result["success"]:
console.print(f"✅ Registered: [bold]{result['name']}[/bold]")
console.print(f" Capabilities: {result['capabilities']} skills")
# Get description from the agent's card
agents = self.agent.list_agents()
description = ""
for agent in agents:
if agent['name'] == result['name']:
description = agent.get('description', '')
break
# Add to config for persistence
self.config_manager.add_registered_agent(
name=result['name'],
url=args.strip(),
description=description
)
console.print(" [dim]Saved to config for auto-registration[/dim]")
else:
console.print(f"[red]Failed: {result['error']}[/red]")
async def cmd_unregister(self, args: str) -> None:
"""Unregister an agent and remove from config"""
if not args:
console.print("Usage: /unregister <name or url>")
return
# Try to find the agent
agents = self.agent.list_agents()
agent_to_remove = None
for agent in agents:
if agent['name'].lower() == args.lower() or agent['url'] == args:
agent_to_remove = agent
break
if not agent_to_remove:
console.print(f"[yellow]Agent '{args}' not found[/yellow]")
return
# Remove from config
if self.config_manager.remove_registered_agent(name=agent_to_remove['name'], url=agent_to_remove['url']):
console.print(f"✅ Unregistered: [bold]{agent_to_remove['name']}[/bold]")
console.print(" [dim]Removed from config (won't auto-register next time)[/dim]")
else:
console.print("[yellow]Agent unregistered from session but not found in config[/yellow]")
async def cmd_list(self, args: str = "") -> None:
"""List registered agents"""
agents = self.agent.list_agents()
if not agents:
console.print("No agents registered. Use /register <url>")
return
table = Table(title="Registered Agents", box=box.ROUNDED)
table.add_column("Name", style="medium_purple3")
table.add_column("URL", style="deep_sky_blue3")
table.add_column("Skills", style="plum3")
table.add_column("Description", style="dim")
for agent in agents:
desc = agent['description']
if len(desc) > 40:
desc = desc[:37] + "..."
table.add_row(
agent['name'],
agent['url'],
str(agent['skills']),
desc
)
console.print(table)
async def cmd_recall(self, args: str = "") -> None:
"""Search conversational memory (past conversations)"""
if not args:
console.print("Usage: /recall <query>")
return
await self._sync_conversational_memory()
# First try MemoryService (for ingested memories)
with safe_status(get_dynamic_status('searching')):
results = await self.agent.memory_manager.search_conversational_memory(args)
if results and results.memories:
console.print(f"[bold]Found {len(results.memories)} memories:[/bold]\n")
for i, memory in enumerate(results.memories, 1):
# MemoryEntry has 'text' field, not 'content'
text = getattr(memory, 'text', str(memory))
if len(text) > 200:
text = text[:200] + "..."
console.print(f"{i}. {text}")
else:
# If MemoryService is empty, search SQLite directly
console.print("[yellow]No memories in MemoryService, searching SQLite sessions...[/yellow]")
# Check if using DatabaseSessionService
if hasattr(self.agent.executor, 'session_service'):
service_type = type(self.agent.executor.session_service).__name__
if service_type == 'DatabaseSessionService':
# Search SQLite database directly
import sqlite3
import os
db_path = os.getenv('SESSION_DB_PATH', './fuzzforge_sessions.db')
if os.path.exists(db_path):
conn = sqlite3.connect(db_path)
cursor = conn.cursor()
# Search in events table
query = f"%{args}%"
cursor.execute(
"SELECT content FROM events WHERE content LIKE ? LIMIT 10",
(query,)
)
rows = cursor.fetchall()
conn.close()
if rows:
console.print(f"[green]Found {len(rows)} matches in SQLite sessions:[/green]\n")
for i, (content,) in enumerate(rows, 1):
# Parse JSON content
import json
try:
data = json.loads(content)
if 'parts' in data and data['parts']:
text = data['parts'][0].get('text', '')[:150]
role = data.get('role', 'unknown')
console.print(f"{i}. [{role}]: {text}...")
except Exception:
console.print(f"{i}. {content[:150]}...")
else:
console.print("[yellow]No matches found in SQLite either[/yellow]")
else:
console.print("[yellow]SQLite database not found[/yellow]")
else:
console.print(f"[dim]Using {service_type} (not searchable)[/dim]")
else:
console.print("[yellow]No session history available[/yellow]")
async def cmd_memory(self, args: str = "") -> None:
"""Inspect conversational memory and knowledge graph state."""
raw_args = (args or "").strip()
lower_args = raw_args.lower()
if not raw_args or lower_args in {"status", "info"}:
await self._show_memory_status()
return
if lower_args == "datasets":
await self._show_dataset_summary()
return
if lower_args.startswith("search ") or lower_args.startswith("recall "):
query = raw_args.split(" ", 1)[1].strip() if " " in raw_args else ""
if not query:
console.print("Usage: /memory search <query>")
return
await self.cmd_recall(query)
return
console.print("Usage: /memory [status|datasets|search <query>]")
console.print("[dim]/memory search <query> is an alias for /recall <query>[/dim]")
async def _sync_conversational_memory(self) -> None:
"""Ensure the ADK memory service ingests any completed sessions."""
memory_service = getattr(self.agent.memory_manager, "memory_service", None)
executor_sessions = getattr(self.agent.executor, "sessions", {})
metadata_map = getattr(self.agent.executor, "session_metadata", {})
if not memory_service or not executor_sessions:
return
for context_id, session in list(executor_sessions.items()):
meta = metadata_map.get(context_id, {})
if meta.get('memory_synced'):
continue
add_session = getattr(memory_service, "add_session_to_memory", None)
if not callable(add_session):
return
try:
await add_session(session)
meta['memory_synced'] = True
metadata_map[context_id] = meta
except Exception as exc: # pragma: no cover - defensive logging
if os.getenv('FUZZFORGE_DEBUG', '0') == '1':
console.print(f"[yellow]Memory sync failed:[/yellow] {exc}")
async def _show_memory_status(self) -> None:
"""Render conversational memory, session store, and knowledge graph status."""
await self._sync_conversational_memory()
status = self.agent.memory_manager.get_status()
conversational = status.get("conversational_memory", {})
conv_type = conversational.get("type", "unknown")
conv_active = "yes" if conversational.get("active") else "no"
conv_details = conversational.get("details", "")
session_service = getattr(self.agent.executor, "session_service", None)
session_service_name = type(session_service).__name__ if session_service else "Unavailable"
session_lines = [
f"[bold]Service:[/bold] {session_service_name}"
]
session_count = None
event_count = None
db_path_display = None
if session_service_name == "DatabaseSessionService":
import sqlite3
db_path = os.getenv('SESSION_DB_PATH', './fuzzforge_sessions.db')
session_path = Path(db_path).expanduser().resolve()
db_path_display = str(session_path)
if session_path.exists():
try:
with sqlite3.connect(session_path) as conn:
cursor = conn.cursor()
cursor.execute("SELECT COUNT(*) FROM sessions")
session_count = cursor.fetchone()[0]
cursor.execute("SELECT COUNT(*) FROM events")
event_count = cursor.fetchone()[0]
except Exception as exc:
session_lines.append(f"[yellow]Warning:[/yellow] Unable to read session database ({exc})")
else:
session_lines.append("[yellow]SQLite session database not found yet[/yellow]")
elif session_service_name == "InMemorySessionService":
session_lines.append("[dim]Session data persists for the current process only[/dim]")
if db_path_display:
session_lines.append(f"[bold]Database:[/bold] {db_path_display}")
if session_count is not None:
session_lines.append(f"[bold]Sessions Recorded:[/bold] {session_count}")
if event_count is not None:
session_lines.append(f"[bold]Events Logged:[/bold] {event_count}")
conv_lines = [
f"[bold]Type:[/bold] {conv_type}",
f"[bold]Active:[/bold] {conv_active}"
]
if conv_details:
conv_lines.append(f"[bold]Details:[/bold] {conv_details}")
console.print(Panel("\n".join(conv_lines), title="Conversation Memory", border_style="medium_purple3"))
console.print(Panel("\n".join(session_lines), title="Session Store", border_style="deep_sky_blue3"))
# Knowledge graph section
knowledge = status.get("knowledge_graph", {})
kg_active = knowledge.get("active", False)
kg_lines = [
f"[bold]Active:[/bold] {'yes' if kg_active else 'no'}",
f"[bold]Purpose:[/bold] {knowledge.get('purpose', 'N/A')}"
]
cognee_data = None
cognee_error = None
try:
project_config = ProjectConfigManager()
cognee_data = project_config.get_cognee_config()
except Exception as exc: # pragma: no cover - defensive
cognee_error = str(exc)
if cognee_data:
data_dir = cognee_data.get('data_directory')
system_dir = cognee_data.get('system_directory')
if data_dir:
kg_lines.append(f"[bold]Data dir:[/bold] {data_dir}")
if system_dir:
kg_lines.append(f"[bold]System dir:[/bold] {system_dir}")
elif cognee_error:
kg_lines.append(f"[yellow]Config unavailable:[/yellow] {cognee_error}")
dataset_summary = None
if kg_active:
try:
integration = await self.agent.executor._get_knowledge_integration()
if integration:
dataset_summary = await integration.list_datasets()
except Exception as exc: # pragma: no cover - defensive
kg_lines.append(f"[yellow]Dataset listing failed:[/yellow] {exc}")
if dataset_summary:
if dataset_summary.get("error"):
kg_lines.append(f"[yellow]Dataset listing failed:[/yellow] {dataset_summary['error']}")
else:
datasets = dataset_summary.get("datasets", [])
total = dataset_summary.get("total_datasets")
if total is not None:
kg_lines.append(f"[bold]Datasets:[/bold] {total}")
if datasets:
preview = ", ".join(sorted(datasets)[:5])
if len(datasets) > 5:
preview += ", …"
kg_lines.append(f"[bold]Samples:[/bold] {preview}")
else:
kg_lines.append("[dim]Run `fuzzforge ingest` to populate the knowledge graph[/dim]")
console.print(Panel("\n".join(kg_lines), title="Knowledge Graph", border_style="spring_green4"))
console.print("\n[dim]Subcommands: /memory datasets | /memory search <query>[/dim]")
async def _show_dataset_summary(self) -> None:
"""List datasets available in the Cognee knowledge graph."""
try:
integration = await self.agent.executor._get_knowledge_integration()
except Exception as exc:
console.print(f"[yellow]Knowledge graph unavailable:[/yellow] {exc}")
return
if not integration:
console.print("[yellow]Knowledge graph is not initialised yet.[/yellow]")
console.print("[dim]Run `fuzzforge ingest --path . --recursive` to create the project dataset.[/dim]")
return
with safe_status(get_dynamic_status('searching')):
dataset_info = await integration.list_datasets()
if dataset_info.get("error"):
console.print(f"[red]{dataset_info['error']}[/red]")
return
datasets = dataset_info.get("datasets", [])
if not datasets:
console.print("[yellow]No datasets found.[/yellow]")
console.print("[dim]Run `fuzzforge ingest` to populate the knowledge graph.[/dim]")
return
table = Table(title="Cognee Datasets", box=box.ROUNDED)
table.add_column("Dataset", style="medium_purple3")
table.add_column("Notes", style="dim")
for name in sorted(datasets):
note = ""
if name.endswith("_codebase"):
note = "primary project dataset"
table.add_row(name, note)
console.print(table)
console.print(
"[dim]Use knowledge graph prompts (e.g. `search project knowledge for \"topic\" using INSIGHTS`) to query these datasets.[/dim]"
)
async def cmd_artifacts(self, args: str = "") -> None:
"""List or show artifacts"""
if args:
# Show specific artifact
artifacts = await self.agent.executor.get_artifacts(self.context_id)
for artifact in artifacts:
if artifact['id'] == args or args in artifact['id']:
console.print(Panel(
f"[bold]{artifact['title']}[/bold]\n"
f"Type: {artifact['type']} | Created: {artifact['created_at'][:19]}\n\n"
f"[code]{artifact['content']}[/code]",
title=f"Artifact: {artifact['id']}",
border_style="medium_purple3"
))
return
console.print(f"[yellow]Artifact {args} not found[/yellow]")
return
# List all artifacts
artifacts = await self.agent.executor.get_artifacts(self.context_id)
if not artifacts:
console.print("No artifacts created yet")
console.print("[dim]Artifacts are created when generating code, configs, or documents[/dim]")
return
table = Table(title="Artifacts", box=box.ROUNDED)
table.add_column("ID", style="medium_purple3")
table.add_column("Type", style="deep_sky_blue3")
table.add_column("Title", style="plum3")
table.add_column("Size", style="dim")
table.add_column("Created", style="dim")
for artifact in artifacts:
size = f"{len(artifact['content'])} chars"
created = artifact['created_at'][:19] # Just date and time
table.add_row(
artifact['id'],
artifact['type'],
artifact['title'][:40] + "..." if len(artifact['title']) > 40 else artifact['title'],
size,
created
)
console.print(table)
console.print("\n[dim]Use /artifacts <id> to view artifact content[/dim]")
async def cmd_tasks(self, args: str = "") -> None:
"""List tasks or show details for a specific task."""
store = getattr(self.agent.executor, "task_store", None)
if not store or not hasattr(store, "tasks"):
console.print("Task store not available")
return
task_id = args.strip()
async with store.lock:
tasks = dict(store.tasks)
if not tasks:
console.print("No tasks recorded yet")
return
if task_id:
task = tasks.get(task_id)
if not task:
console.print(f"Task '{task_id}' not found")
return
state_str = task.status.state.value if hasattr(task.status.state, "value") else str(task.status.state)
console.print(f"\n[bold]Task {task.id}[/bold]")
console.print(f"Context: {task.context_id}")
console.print(f"State: {state_str}")
console.print(f"Timestamp: {task.status.timestamp}")
if task.metadata:
console.print("Metadata:")
for key, value in task.metadata.items():
console.print(f"{key}: {value}")
if task.history:
console.print("History:")
for entry in task.history[-5:]:
text = getattr(entry, "text", None)
if not text and hasattr(entry, "parts"):
text = " ".join(
getattr(part, "text", "") for part in getattr(entry, "parts", [])
)
console.print(f" - {text}")
return
table = Table(title="FuzzForge Tasks", box=box.ROUNDED)
table.add_column("ID", style="medium_purple3")
table.add_column("State", style="white")
table.add_column("Workflow", style="deep_sky_blue3")
table.add_column("Updated", style="green")
for task in tasks.values():
state_value = task.status.state.value if hasattr(task.status.state, "value") else str(task.status.state)
workflow = ""
if task.metadata:
workflow = task.metadata.get("workflow") or task.metadata.get("workflow_name") or ""
timestamp = task.status.timestamp if task.status else ""
table.add_row(task.id, state_value, workflow, timestamp)
console.print(table)
console.print("\n[dim]Use /tasks <id> to view task details[/dim]")
async def cmd_sessions(self, args: str = "") -> None:
"""List active sessions"""
sessions = self.agent.executor.sessions
if not sessions:
console.print("No active sessions")
return
table = Table(title="Active Sessions", box=box.ROUNDED)
table.add_column("Context ID", style="medium_purple3")
table.add_column("Session ID", style="deep_sky_blue3")
table.add_column("User ID", style="plum3")
table.add_column("State", style="dim")
for context_id, session in sessions.items():
# Get session info
session_id = getattr(session, 'id', 'N/A')
user_id = getattr(session, 'user_id', 'N/A')
state = getattr(session, 'state', {})
# Format state info
agents_count = len(state.get('registered_agents', []))
state_info = f"{agents_count} agents registered"
table.add_row(
context_id[:20] + "..." if len(context_id) > 20 else context_id,
session_id[:20] + "..." if len(str(session_id)) > 20 else str(session_id),
user_id,
state_info
)
console.print(table)
console.print(f"\n[dim]Current session: {self.context_id}[/dim]")
async def cmd_skills(self, args: str = "") -> None:
"""Show FuzzForge skills"""
card = self.agent.agent_card
table = Table(title=f"{card.name} Skills", box=box.ROUNDED)
table.add_column("Skill", style="medium_purple3")
table.add_column("Description", style="white")
table.add_column("Tags", style="deep_sky_blue3")
for skill in card.skills:
table.add_row(
skill.name,
skill.description,
", ".join(skill.tags[:3])
)
console.print(table)
async def cmd_clear(self, args: str = "") -> None:
"""Clear screen"""
console.clear()
self.print_banner()
async def cmd_sendfile(self, args: str) -> None:
"""Encode a local file as an artifact and route it to a registered agent."""
tokens = shlex.split(args)
if len(tokens) < 2:
console.print("Usage: /sendfile <agent_name> <path> [message]")
return
agent_name = tokens[0]
file_arg = tokens[1]
note = " ".join(tokens[2:]).strip()
file_path = Path(file_arg).expanduser()
if not file_path.exists():
console.print(f"[red]File not found:[/red] {file_path}")
return
session = self.agent.executor.sessions.get(self.context_id)
if not session:
console.print("[red]No active session available. Try sending a prompt first.[/red]")
return
console.print(f"[dim]Delegating {file_path.name} to {agent_name}...[/dim]")
async def _delegate() -> None:
try:
response = await self.agent.executor.delegate_file_to_agent(
agent_name,
str(file_path),
note,
session=session,
context_id=self.context_id,
)
console.print(f"[{agent_name}]: {response}")
except Exception as exc:
console.print(f"[red]Failed to delegate file:[/red] {exc}")
finally:
self.background_tasks.discard(asyncio.current_task())
task = asyncio.create_task(_delegate())
self.background_tasks.add(task)
console.print("[dim]Delegation in progress… you can continue working.[/dim]")
async def cmd_quit(self, args: str = "") -> None:
"""Exit the CLI"""
console.print("\n[green]Shutting down...[/green]")
await self.agent.cleanup()
if self.background_tasks:
for task in list(self.background_tasks):
task.cancel()
await asyncio.gather(*self.background_tasks, return_exceptions=True)
console.print("Goodbye!\n")
sys.exit(0)
async def process_command(self, text: str) -> bool:
"""Process slash commands"""
if not text.startswith('/'):
return False
parts = text.split(maxsplit=1)
cmd = parts[0].lower()
args = parts[1] if len(parts) > 1 else ""
if cmd in self.commands:
await self.commands[cmd](args)
return True
console.print(f"Unknown command: {cmd}")
return True
async def auto_register_agents(self):
"""Auto-register agents from config on startup"""
agents_to_register = self.config_manager.get_registered_agents()
if agents_to_register:
console.print(f"\n[dim]Auto-registering {len(agents_to_register)} agents from config...[/dim]")
for agent_config in agents_to_register:
url = agent_config.get('url')
name = agent_config.get('name', 'Unknown')
if url:
try:
with safe_status(f"Registering {name}..."):
result = await self.agent.register_agent(url)
if result["success"]:
console.print(f"{name}: [green]Connected[/green]")
else:
console.print(f" ⚠️ {name}: [yellow]Failed - {result.get('error', 'Unknown error')}[/yellow]")
except Exception as e:
console.print(f" ⚠️ {name}: [yellow]Failed - {e}[/yellow]")
console.print("") # Empty line for spacing
async def run(self):
"""Main CLI loop"""
self.print_banner()
# Auto-register agents from config
await self.auto_register_agents()
while not shutdown_requested:
try:
# Use standard input with non-deletable colored prompt
prompt_symbol = get_prompt_symbol()
try:
# Print colored prompt then use input() for non-deletable behavior
console.print(f"[medium_purple3]{prompt_symbol}[/medium_purple3] ", end="")
user_input = input().strip()
except (EOFError, KeyboardInterrupt):
raise
if not user_input:
continue
# Check for commands
if await self.process_command(user_input):
continue
# Process message
with safe_status(get_dynamic_status('thinking')):
response = await self.agent.process_message(user_input, self.context_id)
# Display response
console.print(f"\n{response}\n")
except KeyboardInterrupt:
await self.cmd_quit()
except EOFError:
await self.cmd_quit()
except Exception as e:
console.print(f"[red]Error: {e}[/red]")
if os.getenv('FUZZFORGE_DEBUG') == '1':
console.print_exception()
console.print("")
await self.agent.cleanup()
def main():
"""Main entry point"""
try:
cli = FuzzForgeCLI()
asyncio.run(cli.run())
except KeyboardInterrupt:
console.print("\n[yellow]Interrupted[/yellow]")
sys.exit(0)
except Exception as e:
console.print(f"[red]Fatal error: {e}[/red]")
if os.getenv('FUZZFORGE_DEBUG') == '1':
console.print_exception()
sys.exit(1)
if __name__ == "__main__":
main()

View File

@@ -1,469 +0,0 @@
"""
Cognee Integration Module for FuzzForge
Provides standardized access to project-specific knowledge graphs
Can be reused by external agents and other components
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import os
from typing import Dict, Any, Optional
from pathlib import Path
class CogneeProjectIntegration:
"""
Standardized Cognee integration that can be reused across agents
Automatically detects project context and provides knowledge graph access
"""
def __init__(self, project_dir: Optional[str] = None):
"""
Initialize with project directory (defaults to current working directory)
Args:
project_dir: Path to project directory (optional, defaults to cwd)
"""
self.project_dir = Path(project_dir) if project_dir else Path.cwd()
self.config_file = self.project_dir / ".fuzzforge" / "config.yaml"
self.project_context = None
self._cognee = None
self._initialized = False
async def initialize(self) -> bool:
"""
Initialize Cognee with project context
Returns:
bool: True if initialization successful
"""
try:
# Import Cognee
import cognee
self._cognee = cognee
# Load project context
if not self._load_project_context():
return False
# Configure Cognee for this project
await self._setup_cognee_config()
self._initialized = True
return True
except ImportError:
print("Cognee not installed. Install with: pip install cognee")
return False
except Exception as e:
print(f"Failed to initialize Cognee: {e}")
return False
def _load_project_context(self) -> bool:
"""Load project context from FuzzForge config"""
try:
if not self.config_file.exists():
print(f"No FuzzForge config found at {self.config_file}")
return False
import yaml
with open(self.config_file, 'r') as f:
config = yaml.safe_load(f)
self.project_context = {
"project_name": config.get("project", {}).get("name", "default"),
"project_id": config.get("project", {}).get("id", "default"),
"tenant_id": config.get("cognee", {}).get("tenant", "default")
}
return True
except Exception as e:
print(f"Error loading project context: {e}")
return False
async def _setup_cognee_config(self):
"""Configure Cognee for project-specific access"""
# Set API key and model
api_key = os.getenv('OPENAI_API_KEY')
model = os.getenv('LITELLM_MODEL', 'gpt-4o-mini')
if not api_key:
raise ValueError("OPENAI_API_KEY required for Cognee operations")
# Configure Cognee
self._cognee.config.set_llm_api_key(api_key)
self._cognee.config.set_llm_model(model)
self._cognee.config.set_llm_provider("openai")
# Set project-specific directories
project_cognee_dir = self.project_dir / ".fuzzforge" / "cognee" / f"project_{self.project_context['project_id']}"
self._cognee.config.data_root_directory(str(project_cognee_dir / "data"))
self._cognee.config.system_root_directory(str(project_cognee_dir / "system"))
# Ensure directories exist
project_cognee_dir.mkdir(parents=True, exist_ok=True)
(project_cognee_dir / "data").mkdir(exist_ok=True)
(project_cognee_dir / "system").mkdir(exist_ok=True)
async def search_knowledge_graph(self, query: str, search_type: str = "GRAPH_COMPLETION", dataset: str = None) -> Dict[str, Any]:
"""
Search the project's knowledge graph
Args:
query: Search query
search_type: Type of search ("GRAPH_COMPLETION", "INSIGHTS", "CHUNKS", etc.)
dataset: Specific dataset to search (optional)
Returns:
Dict containing search results
"""
if not self._initialized:
await self.initialize()
if not self._initialized:
return {"error": "Cognee not initialized"}
try:
from cognee.modules.search.types import SearchType
# Resolve search type dynamically; fallback to GRAPH_COMPLETION
try:
search_type_enum = getattr(SearchType, search_type.upper())
except AttributeError:
search_type_enum = SearchType.GRAPH_COMPLETION
search_type = "GRAPH_COMPLETION"
# Prepare search kwargs
search_kwargs = {
"query_type": search_type_enum,
"query_text": query
}
# Add dataset filter if specified
if dataset:
search_kwargs["datasets"] = [dataset]
results = await self._cognee.search(**search_kwargs)
return {
"query": query,
"search_type": search_type,
"dataset": dataset,
"results": results,
"project": self.project_context["project_name"]
}
except Exception as e:
return {"error": f"Search failed: {e}"}
async def list_knowledge_data(self) -> Dict[str, Any]:
"""
List available data in the knowledge graph
Returns:
Dict containing available data
"""
if not self._initialized:
await self.initialize()
if not self._initialized:
return {"error": "Cognee not initialized"}
try:
data = await self._cognee.list_data()
return {
"project": self.project_context["project_name"],
"available_data": data
}
except Exception as e:
return {"error": f"Failed to list data: {e}"}
async def cognify_text(self, text: str, dataset: str = None) -> Dict[str, Any]:
"""
Cognify text content into knowledge graph
Args:
text: Text to cognify
dataset: Dataset name (defaults to project_name_codebase)
Returns:
Dict containing cognify results
"""
if not self._initialized:
await self.initialize()
if not self._initialized:
return {"error": "Cognee not initialized"}
if not dataset:
dataset = f"{self.project_context['project_name']}_codebase"
try:
# Add text to dataset
await self._cognee.add([text], dataset_name=dataset)
# Process (cognify) the dataset
await self._cognee.cognify([dataset])
return {
"text_length": len(text),
"dataset": dataset,
"project": self.project_context["project_name"],
"status": "success"
}
except Exception as e:
return {"error": f"Cognify failed: {e}"}
async def ingest_text_to_dataset(self, text: str, dataset: str = None) -> Dict[str, Any]:
"""
Ingest text content into a specific dataset
Args:
text: Text to ingest
dataset: Dataset name (defaults to project_name_codebase)
Returns:
Dict containing ingest results
"""
if not self._initialized:
await self.initialize()
if not self._initialized:
return {"error": "Cognee not initialized"}
if not dataset:
dataset = f"{self.project_context['project_name']}_codebase"
try:
# Add text to dataset
await self._cognee.add([text], dataset_name=dataset)
# Process (cognify) the dataset
await self._cognee.cognify([dataset])
return {
"text_length": len(text),
"dataset": dataset,
"project": self.project_context["project_name"],
"status": "success"
}
except Exception as e:
return {"error": f"Ingest failed: {e}"}
async def ingest_files_to_dataset(self, file_paths: list, dataset: str = None) -> Dict[str, Any]:
"""
Ingest multiple files into a specific dataset
Args:
file_paths: List of file paths to ingest
dataset: Dataset name (defaults to project_name_codebase)
Returns:
Dict containing ingest results
"""
if not self._initialized:
await self.initialize()
if not self._initialized:
return {"error": "Cognee not initialized"}
if not dataset:
dataset = f"{self.project_context['project_name']}_codebase"
try:
# Validate and filter readable files
valid_files = []
for file_path in file_paths:
try:
path = Path(file_path)
if path.exists() and path.is_file():
# Test if file is readable
with open(path, 'r', encoding='utf-8') as f:
f.read(1)
valid_files.append(str(path))
except (UnicodeDecodeError, PermissionError, OSError):
continue
if not valid_files:
return {"error": "No valid files found to ingest"}
# Add files to dataset
await self._cognee.add(valid_files, dataset_name=dataset)
# Process (cognify) the dataset
await self._cognee.cognify([dataset])
return {
"files_processed": len(valid_files),
"total_files_requested": len(file_paths),
"dataset": dataset,
"project": self.project_context["project_name"],
"status": "success"
}
except Exception as e:
return {"error": f"Ingest failed: {e}"}
async def list_datasets(self) -> Dict[str, Any]:
"""
List all datasets available in the project
Returns:
Dict containing available datasets
"""
if not self._initialized:
await self.initialize()
if not self._initialized:
return {"error": "Cognee not initialized"}
try:
# Get available datasets by searching for data
data = await self._cognee.list_data()
# Extract unique dataset names from the data
datasets = set()
if isinstance(data, list):
for item in data:
if isinstance(item, dict) and 'dataset_name' in item:
datasets.add(item['dataset_name'])
return {
"project": self.project_context["project_name"],
"datasets": list(datasets),
"total_datasets": len(datasets)
}
except Exception as e:
return {"error": f"Failed to list datasets: {e}"}
async def create_dataset(self, dataset: str) -> Dict[str, Any]:
"""
Create a new dataset (dataset is created automatically when data is added)
Args:
dataset: Dataset name to create
Returns:
Dict containing creation result
"""
if not self._initialized:
await self.initialize()
if not self._initialized:
return {"error": "Cognee not initialized"}
try:
# In Cognee, datasets are created implicitly when data is added
# Add a small placeholder entry so the dataset is materialised
await self._cognee.add([f"Dataset {dataset} initialized for project {self.project_context['project_name']}"],
dataset_name=dataset)
return {
"dataset": dataset,
"project": self.project_context["project_name"],
"status": "created"
}
except Exception as e:
return {"error": f"Failed to create dataset: {e}"}
def get_project_context(self) -> Optional[Dict[str, str]]:
"""Get current project context"""
return self.project_context
def is_initialized(self) -> bool:
"""Check if Cognee is initialized"""
return self._initialized
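A direct-usage sketch of the class above; the file list is an illustrative assumption and OPENAI_API_KEY must be set for cognify to run.

import asyncio

async def _demo_direct_integration() -> None:
    integration = CogneeProjectIntegration()  # defaults to the current project
    if await integration.initialize():
        # Ingest a couple of files into the default <project>_codebase dataset
        await integration.ingest_files_to_dataset(["src/main.rs", "README.md"])
        print(await integration.list_datasets())

if __name__ == "__main__":
    asyncio.run(_demo_direct_integration())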
# Convenience functions for easy integration
async def search_project_codebase(query: str, project_dir: Optional[str] = None, dataset: str = None, search_type: str = "GRAPH_COMPLETION") -> str:
"""
Convenience function to search project codebase
Args:
query: Search query
project_dir: Project directory (optional, defaults to cwd)
dataset: Specific dataset to search (optional)
search_type: Type of search ("GRAPH_COMPLETION", "INSIGHTS", "CHUNKS")
Returns:
Formatted search results as string
"""
cognee_integration = CogneeProjectIntegration(project_dir)
result = await cognee_integration.search_knowledge_graph(query, search_type, dataset)
if "error" in result:
return f"Error searching codebase: {result['error']}"
project_name = result.get("project", "Unknown")
results = result.get("results", [])
if not results:
return f"No results found for '{query}' in project {project_name}"
output = f"Search results for '{query}' in project {project_name}:\n\n"
# Format results
if isinstance(results, list):
for i, item in enumerate(results, 1):
if isinstance(item, dict):
# Handle structured results
output += f"{i}. "
if "search_result" in item:
output += f"Dataset: {item.get('dataset_name', 'Unknown')}\n"
for result_item in item["search_result"]:
if isinstance(result_item, dict):
if "name" in result_item:
output += f" - {result_item['name']}: {result_item.get('description', '')}\n"
elif "text" in result_item:
text = result_item["text"][:200] + "..." if len(result_item["text"]) > 200 else result_item["text"]
output += f" - {text}\n"
else:
output += f" - {str(result_item)[:200]}...\n"
else:
output += f"{str(item)[:200]}...\n"
output += "\n"
else:
output += f"{i}. {str(item)[:200]}...\n\n"
else:
output += f"{str(results)[:500]}..."
return output
async def list_project_knowledge(project_dir: Optional[str] = None) -> str:
"""
Convenience function to list project knowledge
Args:
project_dir: Project directory (optional, defaults to cwd)
Returns:
Formatted list of available data
"""
cognee_integration = CogneeProjectIntegration(project_dir)
result = await cognee_integration.list_knowledge_data()
if "error" in result:
return f"Error listing knowledge: {result['error']}"
project_name = result.get("project", "Unknown")
data = result.get("available_data", [])
output = f"Available knowledge in project {project_name}:\n\n"
if not data:
output += "No data available in knowledge graph"
else:
for i, item in enumerate(data, 1):
output += f"{i}. {item}\n"
return output
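# Usage sketch (assumption, not part of the original module): both helpers are
# coroutines, so they need an event loop; the query string and project directory
# below are placeholder values.
if __name__ == "__main__":
    import asyncio

    async def _demo() -> None:
        print(await list_project_knowledge(project_dir="."))
        print(await search_project_codebase("where is authentication handled?"))

    asyncio.run(_demo())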

View File

@@ -1,414 +0,0 @@
"""
Cognee Service for FuzzForge
Provides integrated Cognee functionality for codebase analysis and knowledge graphs
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import os
import logging
from pathlib import Path
from typing import Dict, List, Any
logger = logging.getLogger(__name__)
class CogneeService:
"""
Service for managing Cognee integration with FuzzForge
Handles multi-tenant isolation and project-specific knowledge graphs
"""
def __init__(self, config):
"""Initialize with FuzzForge config"""
self.config = config
self.cognee_config = config.get_cognee_config()
self.project_context = config.get_project_context()
self._cognee = None
self._user = None
self._initialized = False
async def initialize(self):
"""Initialize Cognee with project-specific configuration"""
try:
# Ensure environment variables for Cognee are set before import
self.config.setup_cognee_environment()
logger.debug(
"Cognee environment configured",
extra={
"data": self.cognee_config.get("data_directory"),
"system": self.cognee_config.get("system_directory"),
},
)
import cognee
self._cognee = cognee
# Configure LLM with API key BEFORE any other cognee operations
provider = os.getenv("LLM_PROVIDER", "openai")
model = os.getenv("LLM_MODEL") or os.getenv("LITELLM_MODEL", "gpt-4o-mini")
api_key = os.getenv("LLM_API_KEY") or os.getenv("OPENAI_API_KEY")
endpoint = os.getenv("LLM_ENDPOINT")
api_version = os.getenv("LLM_API_VERSION")
max_tokens = os.getenv("LLM_MAX_TOKENS")
if provider.lower() in {"openai", "azure_openai", "custom"} and not api_key:
raise ValueError(
"OpenAI-compatible API key is required for Cognee LLM operations. "
"Set OPENAI_API_KEY, LLM_API_KEY, or COGNEE_LLM_API_KEY in your .env"
)
# Expose environment variables for downstream libraries
os.environ["LLM_PROVIDER"] = provider
os.environ["LITELLM_MODEL"] = model
os.environ["LLM_MODEL"] = model
if api_key:
os.environ["LLM_API_KEY"] = api_key
# Maintain compatibility with components still expecting OPENAI_API_KEY
if provider.lower() in {"openai", "azure_openai", "custom"}:
os.environ.setdefault("OPENAI_API_KEY", api_key)
if endpoint:
os.environ["LLM_ENDPOINT"] = endpoint
if api_version:
os.environ["LLM_API_VERSION"] = api_version
if max_tokens:
os.environ["LLM_MAX_TOKENS"] = str(max_tokens)
# Configure Cognee's runtime using its configuration helpers when available
if hasattr(cognee.config, "set_llm_provider"):
cognee.config.set_llm_provider(provider)
if hasattr(cognee.config, "set_llm_model"):
cognee.config.set_llm_model(model)
if api_key and hasattr(cognee.config, "set_llm_api_key"):
cognee.config.set_llm_api_key(api_key)
if endpoint and hasattr(cognee.config, "set_llm_endpoint"):
cognee.config.set_llm_endpoint(endpoint)
if api_version and hasattr(cognee.config, "set_llm_api_version"):
cognee.config.set_llm_api_version(api_version)
if max_tokens and hasattr(cognee.config, "set_llm_max_tokens"):
cognee.config.set_llm_max_tokens(int(max_tokens))
# Configure graph database
cognee.config.set_graph_db_config({
"graph_database_provider": self.cognee_config.get("graph_database_provider", "kuzu"),
})
# Set data directories
data_dir = self.cognee_config.get("data_directory")
system_dir = self.cognee_config.get("system_directory")
if data_dir:
logger.debug("Setting cognee data root", extra={"path": data_dir})
cognee.config.data_root_directory(data_dir)
if system_dir:
logger.debug("Setting cognee system root", extra={"path": system_dir})
cognee.config.system_root_directory(system_dir)
# Setup multi-tenant user context
await self._setup_user_context()
self._initialized = True
logger.info(f"Cognee initialized for project {self.project_context['project_name']} "
f"with Kuzu at {system_dir}")
except ImportError:
logger.error("Cognee not installed. Install with: pip install cognee")
raise
except Exception as e:
logger.error(f"Failed to initialize Cognee: {e}")
raise
async def create_dataset(self):
"""Create dataset for this project if it doesn't exist"""
if not self._initialized:
await self.initialize()
try:
# Dataset creation is handled automatically by Cognee when adding files
# We just ensure we have the right context set up
dataset_name = f"{self.project_context['project_name']}_codebase"
logger.info(f"Dataset {dataset_name} ready for project {self.project_context['project_name']}")
return dataset_name
except Exception as e:
logger.error(f"Failed to create dataset: {e}")
raise
async def _setup_user_context(self):
"""Setup user context for multi-tenant isolation"""
try:
from cognee.modules.users.methods import create_user, get_user
# Always try fallback email first to avoid validation issues
fallback_email = f"project_{self.project_context['project_id']}@fuzzforge.example"
user_tenant = self.project_context['tenant_id']
# Try to get existing fallback user first
try:
self._user = await get_user(fallback_email)
logger.info(f"Using existing user: {fallback_email}")
return
except Exception:
# User doesn't exist, try to create fallback
pass
# Create fallback user
try:
self._user = await create_user(fallback_email, user_tenant)
logger.info(f"Created fallback user: {fallback_email} for tenant: {user_tenant}")
return
except Exception as fallback_error:
logger.warning(f"Fallback user creation failed: {fallback_error}")
self._user = None
return
except Exception as e:
logger.warning(f"Could not setup multi-tenant user context: {e}")
logger.info("Proceeding with default context")
self._user = None
def get_project_dataset_name(self, dataset_suffix: str = "codebase") -> str:
"""Get project-specific dataset name"""
return f"{self.project_context['project_name']}_{dataset_suffix}"
async def ingest_text(self, content: str, dataset: str = "fuzzforge") -> bool:
"""Ingest text content into knowledge graph"""
if not self._initialized:
await self.initialize()
try:
await self._cognee.add([content], dataset)
await self._cognee.cognify([dataset])
return True
except Exception as e:
logger.error(f"Failed to ingest text: {e}")
return False
async def ingest_files(self, file_paths: List[Path], dataset: str = "fuzzforge") -> Dict[str, Any]:
"""Ingest multiple files into knowledge graph"""
if not self._initialized:
await self.initialize()
results = {
"success": 0,
"failed": 0,
"errors": []
}
try:
ingest_paths: List[str] = []
for file_path in file_paths:
try:
with open(file_path, 'r', encoding='utf-8'):
ingest_paths.append(str(file_path))
results["success"] += 1
except (UnicodeDecodeError, PermissionError) as exc:
results["failed"] += 1
results["errors"].append(f"{file_path}: {exc}")
logger.warning("Skipping %s: %s", file_path, exc)
if ingest_paths:
await self._cognee.add(ingest_paths, dataset_name=dataset)
await self._cognee.cognify([dataset])
except Exception as e:
logger.error(f"Failed to ingest files: {e}")
results["errors"].append(f"Cognify error: {str(e)}")
return results
async def search_insights(self, query: str, dataset: str = None) -> List[str]:
"""Search for insights in the knowledge graph"""
if not self._initialized:
await self.initialize()
try:
from cognee.modules.search.types import SearchType
kwargs = {
"query_type": SearchType.INSIGHTS,
"query_text": query
}
if dataset:
kwargs["datasets"] = [dataset]
results = await self._cognee.search(**kwargs)
return results if isinstance(results, list) else []
except Exception as e:
logger.error(f"Failed to search insights: {e}")
return []
async def search_chunks(self, query: str, dataset: str = None) -> List[str]:
"""Search for relevant text chunks"""
if not self._initialized:
await self.initialize()
try:
from cognee.modules.search.types import SearchType
kwargs = {
"query_type": SearchType.CHUNKS,
"query_text": query
}
if dataset:
kwargs["datasets"] = [dataset]
results = await self._cognee.search(**kwargs)
return results if isinstance(results, list) else []
except Exception as e:
logger.error(f"Failed to search chunks: {e}")
return []
async def search_graph_completion(self, query: str) -> List[str]:
"""Search for graph completion (relationships)"""
if not self._initialized:
await self.initialize()
try:
from cognee.modules.search.types import SearchType
results = await self._cognee.search(
query_type=SearchType.GRAPH_COMPLETION,
query_text=query
)
return results if isinstance(results, list) else []
except Exception as e:
logger.error(f"Failed to search graph completion: {e}")
return []
async def get_status(self) -> Dict[str, Any]:
"""Get service status and statistics"""
status = {
"initialized": self._initialized,
"enabled": self.cognee_config.get("enabled", True),
"provider": self.cognee_config.get("graph_database_provider", "kuzu"),
"data_directory": self.cognee_config.get("data_directory"),
"system_directory": self.cognee_config.get("system_directory"),
}
if self._initialized:
try:
# Check if directories exist and get sizes
data_dir = Path(status["data_directory"])
system_dir = Path(status["system_directory"])
status.update({
"data_dir_exists": data_dir.exists(),
"system_dir_exists": system_dir.exists(),
"kuzu_db_exists": (system_dir / "kuzu_db").exists(),
"lancedb_exists": (system_dir / "lancedb").exists(),
})
except Exception as e:
status["status_error"] = str(e)
return status
async def clear_data(self, confirm: bool = False):
"""Clear all ingested data (dangerous!)"""
if not confirm:
raise ValueError("Must confirm data clearing with confirm=True")
if not self._initialized:
await self.initialize()
try:
await self._cognee.prune.prune_data()
await self._cognee.prune.prune_system(metadata=True)
logger.info("Cognee data cleared")
except Exception as e:
logger.error(f"Failed to clear data: {e}")
raise
class FuzzForgeCogneeIntegration:
"""
Main integration class for FuzzForge + Cognee
Provides high-level operations for security analysis
"""
def __init__(self, config):
self.service = CogneeService(config)
async def analyze_codebase(self, path: Path, recursive: bool = True) -> Dict[str, Any]:
"""
Analyze a codebase and extract security-relevant insights
"""
# Collect code files
from fuzzforge_ai.ingest_utils import collect_ingest_files
files = collect_ingest_files(path, recursive, None, [])
if not files:
return {"error": "No files found to analyze"}
# Ingest files
results = await self.service.ingest_files(files, "security_analysis")
if results["success"] == 0:
return {"error": "Failed to ingest any files", "details": results}
# Extract security insights
security_queries = [
"vulnerabilities security risks",
"authentication authorization",
"input validation sanitization",
"encryption cryptography",
"error handling exceptions",
"logging sensitive data"
]
insights = {}
for query in security_queries:
insight_results = await self.service.search_insights(query, "security_analysis")
if insight_results:
insights[query.replace(" ", "_")] = insight_results
return {
"files_processed": results["success"],
"files_failed": results["failed"],
"errors": results["errors"],
"security_insights": insights
}
async def query_codebase(self, query: str, search_type: str = "insights") -> List[str]:
"""Query the ingested codebase"""
if search_type == "insights":
return await self.service.search_insights(query)
elif search_type == "chunks":
return await self.service.search_chunks(query)
elif search_type == "graph":
return await self.service.search_graph_completion(query)
else:
raise ValueError(f"Unknown search type: {search_type}")
async def get_project_summary(self) -> Dict[str, Any]:
"""Get a summary of the analyzed project"""
# Search for general project insights
summary_queries = [
"project structure components",
"main functionality features",
"programming languages frameworks",
"dependencies libraries"
]
summary = {}
for query in summary_queries:
results = await self.service.search_insights(query)
if results:
summary[query.replace(" ", "_")] = results[:3] # Top 3 results
return summary
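# Usage sketch (assumption, not part of the original module): `config` stands in
# for the FuzzForge config object described in CogneeService.__init__ and must be
# supplied by the caller.
async def _demo(config) -> None:
    integration = FuzzForgeCogneeIntegration(config)
    report = await integration.analyze_codebase(Path("."), recursive=True)
    print(report.get("files_processed", 0), "files ingested")
    insights = await integration.query_codebase("input validation", search_type="insights")
    print(insights[:3])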

View File

@@ -1,9 +0,0 @@
# FuzzForge Registered Agents
# These agents will be automatically registered on startup
registered_agents:
# Example entries:
# - name: Calculator
# url: http://localhost:10201
# description: Mathematical calculations agent

View File

@@ -1,31 +0,0 @@
"""Bridge module providing access to the host CLI configuration manager."""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
try:
from fuzzforge_cli.config import ProjectConfigManager as _ProjectConfigManager
except ImportError: # pragma: no cover - used when CLI not available
class _ProjectConfigManager: # type: ignore[no-redef]
"""Fallback implementation that raises a helpful error."""
def __init__(self, *args, **kwargs):
raise ImportError(
"ProjectConfigManager is unavailable. Install the FuzzForge CLI "
"package or supply a compatible configuration object."
)
def __getattr__(name): # pragma: no cover - defensive
raise ImportError("ProjectConfigManager unavailable")
ProjectConfigManager = _ProjectConfigManager
__all__ = ["ProjectConfigManager"]

View File

@@ -1,134 +0,0 @@
"""
Configuration manager for FuzzForge
Handles loading and saving registered agents
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import os
import yaml
from typing import Dict, Any, List
class ConfigManager:
"""Manages FuzzForge agent registry configuration"""
def __init__(self, config_path: str = None):
"""Initialize config manager"""
if config_path:
self.config_path = config_path
else:
# Check for local .fuzzforge/agents.yaml first, then fall back to global
local_config = os.path.join(os.getcwd(), '.fuzzforge', 'agents.yaml')
global_config = os.path.join(os.path.dirname(__file__), 'config.yaml')
if os.path.exists(local_config):
self.config_path = local_config
if os.getenv("FUZZFORGE_DEBUG", "0") == "1":
print(f"[CONFIG] Using local config: {local_config}")
else:
self.config_path = global_config
if os.getenv("FUZZFORGE_DEBUG", "0") == "1":
print(f"[CONFIG] Using global config: {global_config}")
self.config = self.load_config()
def load_config(self) -> Dict[str, Any]:
"""Load configuration from YAML file"""
if not os.path.exists(self.config_path):
# Create default config if it doesn't exist
return {'registered_agents': []}
try:
with open(self.config_path, 'r') as f:
config = yaml.safe_load(f) or {}
# Ensure registered_agents is a list
if 'registered_agents' not in config or config['registered_agents'] is None:
config['registered_agents'] = []
return config
except Exception as e:
print(f"[WARNING] Failed to load config: {e}")
return {'registered_agents': []}
def save_config(self):
"""Save current configuration to file"""
try:
# Create a clean config with comments
config_content = """# FuzzForge Registered Agents
# These agents will be automatically registered on startup
"""
# Add the agents list
if self.config.get('registered_agents'):
config_content += yaml.dump({'registered_agents': self.config['registered_agents']},
default_flow_style=False, sort_keys=False)
else:
config_content += "registered_agents: []\n"
config_content += """
# Example entries:
# - name: Calculator
# url: http://localhost:10201
# description: Mathematical calculations agent
"""
with open(self.config_path, 'w') as f:
f.write(config_content)
return True
except Exception as e:
print(f"[ERROR] Failed to save config: {e}")
return False
def get_registered_agents(self) -> List[Dict[str, Any]]:
"""Get list of registered agents from config"""
return self.config.get('registered_agents', [])
def add_registered_agent(self, name: str, url: str, description: str = "") -> bool:
"""Add a new registered agent to config"""
if 'registered_agents' not in self.config:
self.config['registered_agents'] = []
# Check if agent already exists
for agent in self.config['registered_agents']:
if agent.get('url') == url:
# Update existing agent
agent['name'] = name
agent['description'] = description
return self.save_config()
# Add new agent
self.config['registered_agents'].append({
'name': name,
'url': url,
'description': description
})
return self.save_config()
def remove_registered_agent(self, name: str = None, url: str = None) -> bool:
"""Remove a registered agent from config"""
if 'registered_agents' not in self.config:
return False
original_count = len(self.config['registered_agents'])
# Filter out the agent
self.config['registered_agents'] = [
agent for agent in self.config['registered_agents']
if not ((name and agent.get('name') == name) or
(url and agent.get('url') == url))
]
if len(self.config['registered_agents']) < original_count:
return self.save_config()
return False
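# Usage sketch (assumption, not part of the original module): register an agent
# and list what is currently configured; the URL is a placeholder.
if __name__ == "__main__":
    manager = ConfigManager()
    manager.add_registered_agent("Calculator", "http://localhost:10201", "Math agent")
    for agent in manager.get_registered_agents():
        print(agent["name"], agent["url"])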

View File

@@ -1,104 +0,0 @@
"""Utilities for collecting files to ingest into Cognee."""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
from __future__ import annotations
import fnmatch
from pathlib import Path
from typing import Iterable, List, Optional
_DEFAULT_FILE_TYPES = [
".py",
".js",
".ts",
".java",
".cpp",
".c",
".h",
".rs",
".go",
".rb",
".php",
".cs",
".swift",
".kt",
".scala",
".clj",
".hs",
".md",
".txt",
".yaml",
".yml",
".json",
".toml",
".cfg",
".ini",
]
_DEFAULT_EXCLUDE = [
"*.pyc",
"__pycache__",
".git",
".svn",
".hg",
"node_modules",
".venv",
"venv",
".env",
"dist",
"build",
".pytest_cache",
".mypy_cache",
".tox",
"coverage",
"*.log",
"*.tmp",
]
def collect_ingest_files(
path: Path,
recursive: bool = True,
file_types: Optional[Iterable[str]] = None,
exclude: Optional[Iterable[str]] = None,
) -> List[Path]:
"""Return a list of files eligible for ingestion."""
path = path.resolve()
files: List[Path] = []
extensions = list(file_types) if file_types else list(_DEFAULT_FILE_TYPES)
exclusions = list(exclude) if exclude else []
exclusions.extend(_DEFAULT_EXCLUDE)
def should_exclude(file_path: Path) -> bool:
file_str = str(file_path)
for pattern in exclusions:
if fnmatch.fnmatch(file_str, f"*{pattern}*") or fnmatch.fnmatch(file_path.name, pattern):
return True
return False
if path.is_file():
if not should_exclude(path) and any(str(path).endswith(ext) for ext in extensions):
files.append(path)
return files
pattern = "**/*" if recursive else "*"
for file_path in path.glob(pattern):
if file_path.is_file() and not should_exclude(file_path):
if any(str(file_path).endswith(ext) for ext in extensions):
files.append(file_path)
return files
__all__ = ["collect_ingest_files"]
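# Usage sketch (assumption, not part of the original module): collect Python and
# Markdown files under the current directory, honouring the default exclusions.
if __name__ == "__main__":
    for ingest_file in collect_ingest_files(Path("."), recursive=True, file_types=[".py", ".md"]):
        print(ingest_file)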

View File

@@ -1,244 +0,0 @@
"""
FuzzForge Memory Service
Implements ADK MemoryService pattern for conversational memory
Separate from Cognee which will be used for RAG/codebase analysis
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import os
from typing import Dict, Any
import logging
# ADK Memory imports
from google.adk.memory import InMemoryMemoryService, BaseMemoryService
from google.adk.memory.base_memory_service import SearchMemoryResponse
# Optional VertexAI Memory Bank
try:
from google.adk.memory import VertexAiMemoryBankService
VERTEX_AVAILABLE = True
except ImportError:
VERTEX_AVAILABLE = False
logger = logging.getLogger(__name__)
class FuzzForgeMemoryService:
"""
Manages conversational memory using ADK patterns
This is separate from Cognee which will handle RAG/codebase
"""
def __init__(self, memory_type: str = "inmemory", **kwargs):
"""
Initialize memory service
Args:
memory_type: "inmemory" or "vertexai"
**kwargs: Additional args for specific memory service
For vertexai: project, location, agent_engine_id
"""
self.memory_type = memory_type
self.service = self._create_service(memory_type, **kwargs)
def _create_service(self, memory_type: str, **kwargs) -> BaseMemoryService:
"""Create the appropriate memory service"""
if memory_type == "inmemory":
# Use ADK's InMemoryMemoryService for local development
logger.info("Using InMemory MemoryService for conversational memory")
return InMemoryMemoryService()
elif memory_type == "vertexai" and VERTEX_AVAILABLE:
# Use VertexAI Memory Bank for production
project = kwargs.get('project') or os.getenv('GOOGLE_CLOUD_PROJECT')
location = kwargs.get('location') or os.getenv('GOOGLE_CLOUD_LOCATION', 'us-central1')
agent_engine_id = kwargs.get('agent_engine_id') or os.getenv('AGENT_ENGINE_ID')
if not all([project, location, agent_engine_id]):
logger.warning("VertexAI config missing, falling back to InMemory")
return InMemoryMemoryService()
logger.info(f"Using VertexAI MemoryBank: {agent_engine_id}")
return VertexAiMemoryBankService(
project=project,
location=location,
agent_engine_id=agent_engine_id
)
else:
# Default to in-memory
logger.info("Defaulting to InMemory MemoryService")
return InMemoryMemoryService()
async def add_session_to_memory(self, session: Any) -> None:
"""
Add a completed session to long-term memory
This extracts meaningful information from the conversation
Args:
session: The session object to process
"""
try:
# Let the underlying service handle the ingestion
# It will extract relevant information based on the implementation
await self.service.add_session_to_memory(session)
logger.debug(f"Added session {session.id} to {self.memory_type} memory")
except Exception as e:
logger.error(f"Failed to add session to memory: {e}")
async def search_memory(self,
query: str,
app_name: str = "fuzzforge",
user_id: str = None,
max_results: int = 10) -> SearchMemoryResponse:
"""
Search long-term memory for relevant information
Args:
query: The search query
app_name: Application name for filtering
user_id: User ID for filtering (optional)
max_results: Maximum number of results
Returns:
SearchMemoryResponse with relevant memories
"""
try:
# Search the memory service
results = await self.service.search_memory(
app_name=app_name,
user_id=user_id,
query=query
)
logger.debug(f"Memory search for '{query}' returned {len(results.memories)} results")
return results
except Exception as e:
logger.error(f"Memory search failed: {e}")
# Return empty results on error
return SearchMemoryResponse(memories=[])
async def ingest_completed_sessions(self, session_service) -> int:
"""
Batch ingest all completed sessions into memory
Useful for initial memory population
Args:
session_service: The session service containing sessions
Returns:
Number of sessions ingested
"""
ingested = 0
try:
# Get all sessions from the session service
sessions = await session_service.list_sessions(app_name="fuzzforge")
for session_info in sessions:
# Load full session
session = await session_service.load_session(
app_name="fuzzforge",
user_id=session_info.get('user_id'),
session_id=session_info.get('id')
)
if session and len(session.get_events()) > 0:
await self.add_session_to_memory(session)
ingested += 1
logger.info(f"Ingested {ingested} sessions into {self.memory_type} memory")
except Exception as e:
logger.error(f"Failed to batch ingest sessions: {e}")
return ingested
def get_status(self) -> Dict[str, Any]:
"""Get memory service status"""
return {
"type": self.memory_type,
"active": self.service is not None,
"vertex_available": VERTEX_AVAILABLE,
"details": {
"inmemory": "Non-persistent, keyword search",
"vertexai": "Persistent, semantic search with LLM extraction"
}.get(self.memory_type, "Unknown")
}
class HybridMemoryManager:
"""
Manages both ADK MemoryService (conversational) and Cognee (RAG/codebase)
Provides unified interface for both memory systems
"""
def __init__(self,
memory_service: FuzzForgeMemoryService = None,
cognee_tools = None):
"""
Initialize with both memory systems
Args:
memory_service: ADK-pattern memory for conversations
cognee_tools: Cognee MCP tools for RAG/codebase
"""
# ADK memory for conversations
self.memory_service = memory_service or FuzzForgeMemoryService()
# Cognee for knowledge graphs and RAG (future)
self.cognee_tools = cognee_tools
async def search_conversational_memory(self, query: str) -> SearchMemoryResponse:
"""Search past conversations using ADK memory"""
return await self.memory_service.search_memory(query)
async def search_knowledge_graph(self, query: str, search_type: str = "GRAPH_COMPLETION"):
"""Search Cognee knowledge graph (for RAG/codebase in future)"""
if not self.cognee_tools:
return None
try:
# Use Cognee's graph search
return await self.cognee_tools.search(
query=query,
search_type=search_type
)
except Exception as e:
logger.debug(f"Cognee search failed: {e}")
return None
async def store_in_graph(self, content: str):
"""Store in Cognee knowledge graph (for codebase analysis later)"""
if not self.cognee_tools:
return None
try:
# Use cognify to create graph structures
return await self.cognee_tools.cognify(content)
except Exception as e:
logger.debug(f"Cognee store failed: {e}")
return None
def get_status(self) -> Dict[str, Any]:
"""Get status of both memory systems"""
return {
"conversational_memory": self.memory_service.get_status(),
"knowledge_graph": {
"active": self.cognee_tools is not None,
"purpose": "RAG/codebase analysis (future)"
}
}
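# Usage sketch (assumption, not part of the original module): conversational
# memory works without Cognee tools; the query below is a placeholder.
if __name__ == "__main__":
    import asyncio

    async def _demo() -> None:
        manager = HybridMemoryManager(FuzzForgeMemoryService("inmemory"))
        response = await manager.search_conversational_memory("previous fuzzing targets")
        print(len(response.memories), "memories found")

    asyncio.run(_demo())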

View File

@@ -1,148 +0,0 @@
"""
Remote Agent Connection Handler
Handles A2A protocol communication with remote agents
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import httpx
import uuid
from typing import Dict, Any, Optional, List
class RemoteAgentConnection:
"""Handles A2A protocol communication with remote agents"""
def __init__(self, url: str):
"""Initialize connection to a remote agent"""
self.url = url.rstrip('/')
self.agent_card = None
self.client = httpx.AsyncClient(timeout=120.0)
self.context_id = None
async def get_agent_card(self) -> Optional[Dict[str, Any]]:
"""Get the agent card from the remote agent"""
try:
# Try new path first (A2A 0.3.0+)
response = await self.client.get(f"{self.url}/.well-known/agent-card.json")
response.raise_for_status()
self.agent_card = response.json()
return self.agent_card
except Exception:
# Try old path for compatibility
try:
response = await self.client.get(f"{self.url}/.well-known/agent.json")
response.raise_for_status()
self.agent_card = response.json()
return self.agent_card
except Exception as e:
print(f"Failed to get agent card from {self.url}: {e}")
return None
async def send_message(self, message: str | Dict[str, Any] | List[Dict[str, Any]]) -> str:
"""Send a message to the remote agent using A2A protocol"""
try:
parts: List[Dict[str, Any]]
metadata: Dict[str, Any] | None = None
if isinstance(message, dict):
metadata = message.get("metadata") if isinstance(message.get("metadata"), dict) else None
raw_parts = message.get("parts", [])
if not raw_parts:
text_value = message.get("text") or message.get("message")
if isinstance(text_value, str):
raw_parts = [{"type": "text", "text": text_value}]
parts = [raw_part for raw_part in raw_parts if isinstance(raw_part, dict)]
elif isinstance(message, list):
parts = [part for part in message if isinstance(part, dict)]
metadata = None
else:
parts = [{"type": "text", "text": message}]
metadata = None
if not parts:
parts = [{"type": "text", "text": ""}]
# Build JSON-RPC request per A2A spec
payload = {
"jsonrpc": "2.0",
"method": "message/send",
"params": {
"message": {
"messageId": str(uuid.uuid4()),
"role": "user",
"parts": parts,
}
},
"id": 1
}
if metadata:
payload["params"]["message"]["metadata"] = metadata
# Include context if we have one
if self.context_id:
payload["params"]["contextId"] = self.context_id
# Send to root endpoint per A2A protocol
response = await self.client.post(f"{self.url}/", json=payload)
response.raise_for_status()
result = response.json()
# Extract response based on A2A JSON-RPC format
if isinstance(result, dict):
# Update context for continuity
if "result" in result and isinstance(result["result"], dict):
if "contextId" in result["result"]:
self.context_id = result["result"]["contextId"]
# Extract text from artifacts
if "artifacts" in result["result"]:
texts = []
for artifact in result["result"]["artifacts"]:
if isinstance(artifact, dict) and "parts" in artifact:
for part in artifact["parts"]:
if isinstance(part, dict) and "text" in part:
texts.append(part["text"])
if texts:
return " ".join(texts)
# Extract from message format
if "message" in result["result"]:
msg = result["result"]["message"]
if isinstance(msg, dict) and "parts" in msg:
texts = []
for part in msg["parts"]:
if isinstance(part, dict) and "text" in part:
texts.append(part["text"])
return " ".join(texts) if texts else str(msg)
return str(msg)
return str(result["result"])
# Handle error response
elif "error" in result:
error = result["error"]
if isinstance(error, dict):
return f"Error: {error.get('message', str(error))}"
return f"Error: {error}"
# Fallback
return result.get("response", result.get("message", str(result)))
return str(result)
except Exception as e:
return f"Error communicating with agent: {e}"
async def close(self):
"""Close the connection properly"""
await self.client.aclose()
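# Usage sketch (assumption, not part of the original module): fetch the agent
# card and send one message; the URL is a placeholder for a running A2A agent.
if __name__ == "__main__":
    import asyncio

    async def _demo() -> None:
        conn = RemoteAgentConnection("http://localhost:10201")
        card = await conn.get_agent_card()
        print("agent:", (card or {}).get("name"))
        print(await conn.send_message("hello"))
        await conn.close()

    asyncio.run(_demo())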

BIN
assets/demopart1.gif (new binary file, 360 KiB, not shown)

BIN
assets/demopart2.gif (new binary file, 2.1 MiB, not shown)

View File

@@ -1,37 +0,0 @@
FROM python:3.11-slim
WORKDIR /app
# Install system dependencies including Docker client and rsync
RUN apt-get update && apt-get install -y \
curl \
ca-certificates \
gnupg \
lsb-release \
rsync \
&& curl -fsSL https://download.docker.com/linux/debian/gpg | gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg \
&& echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/debian $(lsb_release -cs) stable" | tee /etc/apt/sources.list.d/docker.list > /dev/null \
&& apt-get update \
&& apt-get install -y docker-ce-cli \
&& rm -rf /var/lib/apt/lists/*
# Docker client configuration removed - localhost:5001 doesn't require insecure registry config
# Copy project files
COPY pyproject.toml ./
# Install dependencies with pip
RUN pip install --no-cache-dir -e .
# Copy source code
COPY . .
# Expose ports (API on 8000, MCP on 8010)
EXPOSE 8000 8010
# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
CMD curl -f http://localhost:8000/health || exit 1
# Start the application
CMD ["uvicorn", "src.main:app", "--host", "0.0.0.0", "--port", "8000"]

View File

@@ -1,316 +0,0 @@
# FuzzForge Backend
A stateless API server for security testing workflow orchestration using Temporal. This system dynamically discovers workflows, executes them in isolated worker environments, and returns findings in SARIF format.
## Architecture Overview
### Core Components
1. **Workflow Discovery System**: Automatically discovers workflows at startup
2. **Module System**: Reusable components (scanner, analyzer, reporter) with a common interface
3. **Temporal Integration**: Handles workflow orchestration, execution, and monitoring with vertical workers
4. **File Upload & Storage**: HTTP multipart upload to MinIO for target files
5. **SARIF Output**: Standardized security findings format
### Key Features
- **Stateless**: No persistent data, fully scalable
- **Generic**: No hardcoded workflows, automatic discovery
- **Isolated**: Each workflow runs in specialized vertical workers
- **Extensible**: Easy to add new workflows and modules
- **Secure**: File upload with MinIO storage, automatic cleanup via lifecycle policies
- **Observable**: Comprehensive logging and status tracking
## Quick Start
### Prerequisites
- Docker and Docker Compose
### Installation
From the project root, start all services:
```bash
docker-compose -f docker-compose.temporal.yaml up -d
```
This will start:
- Temporal server (Web UI at http://localhost:8233, gRPC at :7233)
- MinIO (S3 storage at http://localhost:9000, Console at http://localhost:9001)
- PostgreSQL database (for Temporal state)
- Vertical workers (worker-rust, worker-android, worker-web, etc.)
- FuzzForge backend API (port 8000)
**Note**: MinIO console login: `fuzzforge` / `fuzzforge123`
## API Endpoints
### Workflows
- `GET /workflows` - List all discovered workflows
- `GET /workflows/{name}/metadata` - Get workflow metadata and parameters
- `GET /workflows/{name}/parameters` - Get workflow parameter schema
- `GET /workflows/metadata/schema` - Get metadata.yaml schema
- `POST /workflows/{name}/submit` - Submit a workflow for execution (path-based, legacy)
- `POST /workflows/{name}/upload-and-submit` - **Upload local files and submit workflow** (recommended)
### Runs
- `GET /runs/{run_id}/status` - Get run status
- `GET /runs/{run_id}/findings` - Get SARIF findings from completed run
- `GET /runs/{workflow_name}/findings/{run_id}` - Alternative findings endpoint with workflow name
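Run status can be polled from a small client while a workflow executes. A minimal sketch using `requests`, assuming a run has already been submitted and its `run_id` is known; the terminal status strings checked here are assumptions, not the backend's documented values:
```python
import time
import requests

BASE_URL = "http://localhost:8000"

def wait_for_findings(run_id: str, poll_interval: float = 5.0) -> dict:
    """Poll run status until it reaches a terminal state, then fetch SARIF findings."""
    while True:
        status = requests.get(f"{BASE_URL}/runs/{run_id}/status").json()
        # Terminal state names are illustrative; adjust to what the backend actually returns
        if status.get("status") in {"completed", "failed"}:
            break
        time.sleep(poll_interval)
    return requests.get(f"{BASE_URL}/runs/{run_id}/findings").json()

findings = wait_for_findings("abc-123")
print(findings.get("sarif", {}).get("version"))
```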
## Workflow Structure
Each workflow must have:
```
toolbox/workflows/{workflow_name}/
workflow.py # Temporal workflow definition
metadata.yaml # Mandatory metadata (parameters, version, vertical, etc.)
requirements.txt # Optional Python dependencies (installed in vertical worker)
```
**Note**: With Temporal architecture, workflows run in pre-built vertical workers (e.g., `worker-rust`, `worker-android`), not individual Docker containers. The workflow code is mounted as a volume and discovered at runtime.
### Example metadata.yaml
```yaml
name: security_assessment
version: "1.0.0"
description: "Comprehensive security analysis workflow"
author: "FuzzForge Team"
category: "comprehensive"
vertical: "rust" # Routes to worker-rust
tags:
- "security"
- "analysis"
- "comprehensive"
requirements:
tools:
- "file_scanner"
- "security_analyzer"
- "sarif_reporter"
resources:
memory: "512Mi"
cpu: "500m"
timeout: 1800
has_docker: true
parameters:
type: object
properties:
target_path:
type: string
default: "/workspace"
description: "Path to analyze"
scanner_config:
type: object
description: "Scanner configuration"
properties:
max_file_size:
type: integer
description: "Maximum file size to scan (bytes)"
output_schema:
type: object
properties:
sarif:
type: object
description: "SARIF-formatted security findings"
summary:
type: object
description: "Scan execution summary"
```
### Metadata Field Descriptions
- **name**: Workflow identifier (must match directory name)
- **version**: Semantic version (x.y.z format)
- **description**: Human-readable description of the workflow
- **author**: Workflow author/maintainer
- **category**: Workflow category (comprehensive, specialized, fuzzing, focused)
- **vertical**: Worker vertical the workflow routes to (e.g., `rust` routes to `worker-rust`)
- **tags**: Array of descriptive tags for categorization
- **requirements.tools**: Required security tools that the workflow uses
- **requirements.resources**: Resource requirements enforced at runtime:
- `memory`: Memory limit (e.g., "512Mi", "1Gi")
- `cpu`: CPU limit (e.g., "500m" for 0.5 cores, "1" for 1 core)
- `timeout`: Maximum execution time in seconds
- **parameters**: JSON Schema object defining workflow parameters
- **output_schema**: Expected output format (typically SARIF)
### Resource Requirements
Resource requirements defined in workflow metadata are automatically enforced. Users can override defaults when submitting workflows:
```bash
curl -X POST "http://localhost:8000/workflows/security_assessment/submit" \
-H "Content-Type: application/json" \
-d '{
"target_path": "/tmp/project",
"resource_limits": {
"memory_limit": "1Gi",
"cpu_limit": "1"
}
}'
```
Resource precedence: User limits > Workflow requirements > System defaults
## File Upload and Target Access
### Upload Endpoint
The backend provides an upload endpoint for submitting workflows with local files:
```
POST /workflows/{workflow_name}/upload-and-submit
Content-Type: multipart/form-data
Parameters:
file: File upload (supports .tar.gz for directories)
parameters: JSON string of workflow parameters (optional)
timeout: Execution timeout in seconds (optional)
```
Example using curl:
```bash
# Upload a directory (create tarball first)
tar -czf project.tar.gz /path/to/project
curl -X POST "http://localhost:8000/workflows/security_assessment/upload-and-submit" \
-F "file=@project.tar.gz" \
-F "parameters={\"check_secrets\":true}"
# Upload a single file
curl -X POST "http://localhost:8000/workflows/security_assessment/upload-and-submit" \
-F "file=@binary.elf"
```
### Storage Flow
1. **CLI/API uploads file** via HTTP multipart
2. **Backend receives file** and streams to temporary location (max 10GB)
3. **Backend uploads to MinIO** with generated `target_id`
4. **Workflow is submitted** to Temporal with `target_id`
5. **Worker downloads target** from MinIO to local cache
6. **Workflow processes target** from cache
7. **MinIO lifecycle policy** deletes files after 7 days
### Advantages
- **No host filesystem access required** - workers can run anywhere
- **Automatic cleanup** - lifecycle policies prevent disk exhaustion
- **Caching** - repeated workflows reuse cached targets
- **Multi-host ready** - targets accessible from any worker
- **Secure** - isolated storage, no arbitrary host path access
## Module Development
Modules implement the `BaseModule` interface:
```python
from src.toolbox.modules.base import BaseModule, ModuleMetadata, ModuleResult
class MyModule(BaseModule):
def get_metadata(self) -> ModuleMetadata:
return ModuleMetadata(
name="my_module",
version="1.0.0",
description="Module description",
category="scanner",
...
)
async def execute(self, config: Dict, workspace: Path) -> ModuleResult:
# Module logic here
findings = [...]
return self.create_result(findings=findings)
def validate_config(self, config: Dict) -> bool:
# Validate configuration
return True
```
## Submitting a Workflow
### With File Upload (Recommended)
```bash
# Automatic tarball and upload
tar -czf project.tar.gz /home/user/project
curl -X POST "http://localhost:8000/workflows/security_assessment/upload-and-submit" \
-F "file=@project.tar.gz" \
-F "parameters={\"scanner_config\":{\"patterns\":[\"*.py\"]},\"analyzer_config\":{\"check_secrets\":true}}"
```
### Legacy Path-Based Submission
```bash
# Only works if backend and target are on same machine
curl -X POST "http://localhost:8000/workflows/security_assessment/submit" \
-H "Content-Type: application/json" \
-d '{
"target_path": "/home/user/project",
"parameters": {
"scanner_config": {"patterns": ["*.py"]},
"analyzer_config": {"check_secrets": true}
}
}'
```
## Getting Findings
```bash
curl "http://localhost:8000/runs/{run_id}/findings"
```
Returns SARIF-formatted findings:
```json
{
"workflow": "security_assessment",
"run_id": "abc-123",
"sarif": {
"version": "2.1.0",
"runs": [{
"tool": {...},
"results": [...]
}]
}
}
```
## Security Considerations
1. **File Upload Security**: Files uploaded to MinIO with isolated storage
2. **Read-Only Default**: Target files accessed as read-only unless explicitly set
3. **Worker Isolation**: Each workflow runs in isolated vertical workers
4. **Resource Limits**: Can set CPU/memory limits per worker
5. **Automatic Cleanup**: MinIO lifecycle policies delete old files after 7 days
## Development
### Adding a New Workflow
1. Create directory: `toolbox/workflows/my_workflow/`
2. Add `workflow.py` with a Temporal workflow (using `@workflow.defn`); a minimal sketch follows this list
3. Add mandatory `metadata.yaml` with `vertical` field
4. Restart the appropriate worker: `docker-compose -f docker-compose.temporal.yaml restart worker-rust`
5. Worker will automatically discover and register the new workflow
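A minimal `workflow.py` sketch using the Temporal Python SDK; the class name, activity name, and returned structure are placeholders, and a real workflow would wrap the toolbox modules it needs:
```python
from datetime import timedelta

from temporalio import workflow


@workflow.defn
class MyWorkflow:
    @workflow.run
    async def run(self, params: dict) -> dict:
        # "scan_target" is a placeholder activity name registered by the vertical worker
        findings = await workflow.execute_activity(
            "scan_target",
            params,
            start_to_close_timeout=timedelta(minutes=30),
        )
        # Return SARIF-style output so the findings endpoint can serve it
        return {"sarif": findings}
```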
### Adding a New Module
1. Create module in `toolbox/modules/{category}/`
2. Implement `BaseModule` interface
3. Use in workflows via import
### Adding a New Vertical Worker
1. Create worker directory: `workers/{vertical}/`
2. Create `Dockerfile` with required tools
3. Add worker to `docker-compose.temporal.yaml`
4. Worker will automatically discover workflows with matching `vertical` in metadata

View File

@@ -1,184 +0,0 @@
# FuzzForge Benchmark Suite
Performance benchmarking infrastructure organized by module category.
## Directory Structure
```
benchmarks/
├── conftest.py # Benchmark fixtures
├── category_configs.py # Category-specific thresholds
├── by_category/ # Benchmarks organized by category
│ ├── fuzzer/
│ │ ├── bench_cargo_fuzz.py
│ │ └── bench_atheris.py
│ ├── scanner/
│ │ └── bench_file_scanner.py
│ ├── secret_detection/
│ │ ├── bench_gitleaks.py
│ │ └── bench_trufflehog.py
│ └── analyzer/
│ └── bench_security_analyzer.py
├── fixtures/ # Benchmark test data
│ ├── small/ # ~1K LOC
│ ├── medium/ # ~10K LOC
│ └── large/ # ~100K LOC
└── results/ # Benchmark results (JSON)
```
## Module Categories
### Fuzzer
**Expected Metrics**: execs/sec, coverage_rate, time_to_crash, memory_usage
**Performance Thresholds**:
- Min 1000 execs/sec
- Max 10s for small projects
- Max 2GB memory
### Scanner
**Expected Metrics**: files/sec, LOC/sec, findings_count
**Performance Thresholds**:
- Min 100 files/sec
- Min 10K LOC/sec
- Max 512MB memory
### Secret Detection
**Expected Metrics**: patterns/sec, precision, recall, F1
**Performance Thresholds**:
- Min 90% precision
- Min 95% recall
- Max 5 false positives per 100 secrets
### Analyzer
**Expected Metrics**: analysis_depth, files/sec, accuracy
**Performance Thresholds**:
- Min 10 files/sec (deep analysis)
- Min 85% accuracy
- Max 2GB memory
## Running Benchmarks
### All Benchmarks
```bash
cd backend
pytest benchmarks/ --benchmark-only -v
```
### Specific Category
```bash
pytest benchmarks/by_category/fuzzer/ --benchmark-only -v
```
### With Comparison
```bash
# Run and save baseline
pytest benchmarks/ --benchmark-only --benchmark-save=baseline
# Compare against baseline
pytest benchmarks/ --benchmark-only --benchmark-compare=baseline
```
### Generate Histogram
```bash
pytest benchmarks/ --benchmark-only --benchmark-histogram=histogram
```
## Benchmark Results
Results are saved as JSON and include:
- Mean execution time
- Standard deviation
- Min/Max values
- Iterations per second
- Memory usage
Example output:
```
------------------------ benchmark: fuzzer --------------------------
Name                           Mean      StdDev    Ops/Sec
bench_cargo_fuzz[discovery]    0.0012s   0.0001s   833.33
bench_cargo_fuzz[execution]    0.1250s   0.0050s   8.00
bench_cargo_fuzz[memory]       0.0100s   0.0005s   100.00
---------------------------------------------------------------------
```
## CI/CD Integration
Benchmarks run:
- **Nightly**: Full benchmark suite, track trends
- **On PR**: When benchmarks/ or modules/ changed
- **Manual**: Via workflow_dispatch
### Regression Detection
Benchmarks automatically fail if:
- Performance degrades >10%
- Memory usage exceeds thresholds
- Throughput drops below minimum
See `.github/workflows/benchmark.yml` for configuration.
## Adding New Benchmarks
### 1. Create benchmark file in category directory
```python
# benchmarks/by_category/fuzzer/bench_new_fuzzer.py
import pytest
from benchmarks.category_configs import ModuleCategory, get_threshold
@pytest.mark.benchmark(group="fuzzer")
def test_execution_performance(benchmark, new_fuzzer, test_workspace):
    """Benchmark execution speed"""
    config = {"max_iterations": 10000, "timeout_seconds": 30}  # minimal illustrative config
    result = benchmark(new_fuzzer.execute, config, test_workspace)
    # Validate against threshold
    threshold = get_threshold(ModuleCategory.FUZZER, "max_execution_time_small")
    assert result.execution_time < threshold
```
### 2. Update category_configs.py if needed
Add new thresholds or metrics for your module.
### 3. Run locally
```bash
pytest benchmarks/by_category/fuzzer/bench_new_fuzzer.py --benchmark-only -v
```
## Best Practices
1. **Use mocking** for external dependencies (network, disk I/O)
2. **Fixed iterations** for consistent benchmarking
3. **Warm-up runs** for JIT-compiled code
4. **Category-specific metrics** aligned with module purpose
5. **Realistic fixtures** that represent actual use cases
6. **Memory profiling** using tracemalloc
7. **Compare apples to apples** within the same category
## Interpreting Results
### Good Performance
- ✅ Execution time below threshold
- ✅ Memory usage within limits
- ✅ Throughput meets minimum
- ✅ <5% variance across runs
### Performance Issues
- ⚠️ Execution time 10-20% over threshold
- ❌ Execution time >20% over threshold
- ❌ Memory leaks (increasing over iterations)
- ❌ High variance (>10%) indicates instability
## Tracking Performance Over Time
Benchmark results are stored as artifacts with:
- Commit SHA
- Timestamp
- Environment details (Python version, OS)
- Full metrics
Use these to track long-term performance trends and detect gradual degradation.

View File

@@ -1,221 +0,0 @@
"""
Benchmarks for CargoFuzzer module
Tests performance characteristics of Rust fuzzing:
- Execution throughput (execs/sec)
- Coverage rate
- Memory efficiency
- Time to first crash
"""
import pytest
import asyncio
from pathlib import Path
from unittest.mock import AsyncMock, patch
import sys
sys.path.insert(0, str(Path(__file__).resolve().parents[3] / "toolbox"))
from modules.fuzzer.cargo_fuzzer import CargoFuzzer
from benchmarks.category_configs import ModuleCategory, get_threshold
@pytest.fixture
def cargo_fuzzer():
"""Create CargoFuzzer instance for benchmarking"""
return CargoFuzzer()
@pytest.fixture
def benchmark_config():
"""Benchmark-optimized configuration"""
return {
"target_name": None,
"max_iterations": 10000, # Fixed iterations for consistent benchmarking
"timeout_seconds": 30,
"sanitizer": "address"
}
@pytest.fixture
def mock_rust_workspace(tmp_path):
"""Create a minimal Rust workspace for benchmarking"""
workspace = tmp_path / "rust_project"
workspace.mkdir()
# Cargo.toml
(workspace / "Cargo.toml").write_text("""[package]
name = "bench_project"
version = "0.1.0"
edition = "2021"
""")
# src/lib.rs
src = workspace / "src"
src.mkdir()
(src / "lib.rs").write_text("""
pub fn benchmark_function(data: &[u8]) -> Vec<u8> {
data.to_vec()
}
""")
# fuzz structure
fuzz = workspace / "fuzz"
fuzz.mkdir()
(fuzz / "Cargo.toml").write_text("""[package]
name = "bench_project-fuzz"
version = "0.0.0"
edition = "2021"
[dependencies]
libfuzzer-sys = "0.4"
[dependencies.bench_project]
path = ".."
[[bin]]
name = "fuzz_target_1"
path = "fuzz_targets/fuzz_target_1.rs"
""")
targets = fuzz / "fuzz_targets"
targets.mkdir()
(targets / "fuzz_target_1.rs").write_text("""#![no_main]
use libfuzzer_sys::fuzz_target;
use bench_project::benchmark_function;
fuzz_target!(|data: &[u8]| {
let _ = benchmark_function(data);
});
""")
return workspace
class TestCargoFuzzerPerformance:
"""Benchmark CargoFuzzer performance metrics"""
@pytest.mark.benchmark(group="fuzzer")
def test_target_discovery_performance(self, benchmark, cargo_fuzzer, mock_rust_workspace):
"""Benchmark fuzz target discovery speed"""
def discover():
return asyncio.run(cargo_fuzzer._discover_fuzz_targets(mock_rust_workspace))
result = benchmark(discover)
assert len(result) > 0
@pytest.mark.benchmark(group="fuzzer")
def test_config_validation_performance(self, benchmark, cargo_fuzzer, benchmark_config):
"""Benchmark configuration validation speed"""
result = benchmark(cargo_fuzzer.validate_config, benchmark_config)
assert result is True
@pytest.mark.benchmark(group="fuzzer")
def test_module_initialization_performance(self, benchmark):
"""Benchmark module instantiation time"""
def init_module():
return CargoFuzzer()
module = benchmark(init_module)
assert module is not None
class TestCargoFuzzerThroughput:
"""Benchmark execution throughput"""
@pytest.mark.benchmark(group="fuzzer")
def test_execution_throughput(self, benchmark, cargo_fuzzer, mock_rust_workspace, benchmark_config):
"""Benchmark fuzzing execution throughput"""
# Mock actual fuzzing to focus on orchestration overhead
async def mock_run(workspace, target, config, callback):
# Simulate 10K execs at 1000 execs/sec
if callback:
await callback({
"total_execs": 10000,
"execs_per_sec": 1000.0,
"crashes": 0,
"coverage": 50,
"corpus_size": 10,
"elapsed_time": 10
})
return [], {"total_executions": 10000, "execution_time": 10.0}
with patch.object(cargo_fuzzer, '_build_fuzz_target', new_callable=AsyncMock, return_value=True):
with patch.object(cargo_fuzzer, '_run_fuzzing', side_effect=mock_run):
with patch.object(cargo_fuzzer, '_parse_crash_artifacts', new_callable=AsyncMock, return_value=[]):
def run_fuzzer():
# Run in new event loop
loop = asyncio.new_event_loop()
try:
return loop.run_until_complete(
cargo_fuzzer.execute(benchmark_config, mock_rust_workspace)
)
finally:
loop.close()
result = benchmark(run_fuzzer)
assert result.status == "success"
# Verify performance threshold
threshold = get_threshold(ModuleCategory.FUZZER, "max_execution_time_small")
assert result.execution_time < threshold, \
f"Execution time {result.execution_time}s exceeds threshold {threshold}s"
class TestCargoFuzzerMemory:
"""Benchmark memory efficiency"""
@pytest.mark.benchmark(group="fuzzer")
def test_memory_overhead(self, benchmark, cargo_fuzzer, mock_rust_workspace, benchmark_config):
"""Benchmark memory usage during execution"""
import tracemalloc
def measure_memory():
tracemalloc.start()
# Simulate operations
cargo_fuzzer.validate_config(benchmark_config)
asyncio.run(cargo_fuzzer._discover_fuzz_targets(mock_rust_workspace))
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()
return peak / 1024 / 1024 # Convert to MB
peak_mb = benchmark(measure_memory)
# Check against threshold
max_memory = get_threshold(ModuleCategory.FUZZER, "max_memory_mb")
assert peak_mb < max_memory, \
f"Peak memory {peak_mb:.2f}MB exceeds threshold {max_memory}MB"
class TestCargoFuzzerScalability:
"""Benchmark scalability characteristics"""
@pytest.mark.benchmark(group="fuzzer")
def test_multiple_target_discovery(self, benchmark, cargo_fuzzer, tmp_path):
"""Benchmark discovery with multiple targets"""
workspace = tmp_path / "multi_target"
workspace.mkdir()
# Create workspace with 10 fuzz targets
(workspace / "Cargo.toml").write_text("[package]\nname = \"test\"\nversion = \"0.1.0\"\nedition = \"2021\"")
src = workspace / "src"
src.mkdir()
(src / "lib.rs").write_text("pub fn test() {}")
fuzz = workspace / "fuzz"
fuzz.mkdir()
targets = fuzz / "fuzz_targets"
targets.mkdir()
for i in range(10):
(targets / f"fuzz_target_{i}.rs").write_text("// Target")
def discover():
return asyncio.run(cargo_fuzzer._discover_fuzz_targets(workspace))
result = benchmark(discover)
assert len(result) == 10

View File

@@ -1,240 +0,0 @@
# Secret Detection Benchmarks
Comprehensive benchmarking suite comparing secret detection tools via complete workflow execution:
- **Gitleaks** - Fast pattern-based detection
- **TruffleHog** - Entropy analysis with verification
- **LLM Detector** - AI-powered semantic analysis (gpt-4o-mini, gpt-5-mini)
## Quick Start
### Run All Comparisons
```bash
cd backend
python benchmarks/by_category/secret_detection/compare_tools.py
```
This will run all workflows on `test_projects/secret_detection_benchmark/` and generate comparison reports.
### Run Benchmark Tests
```bash
# All benchmarks (Gitleaks, TruffleHog, LLM with 3 models)
pytest benchmarks/by_category/secret_detection/bench_comparison.py --benchmark-only -v
# Specific tool only
pytest benchmarks/by_category/secret_detection/bench_comparison.py::TestSecretDetectionComparison::test_gitleaks_workflow --benchmark-only -v
# Performance tests only
pytest benchmarks/by_category/secret_detection/bench_comparison.py::TestSecretDetectionPerformance --benchmark-only -v
```
## Ground Truth Dataset
**Controlled Benchmark** (`test_projects/secret_detection_benchmark/`)
**Exactly 32 documented secrets** for accurate precision/recall testing:
- **12 Easy**: Standard patterns (AWS keys, GitHub PATs, Stripe keys, SSH keys)
- **10 Medium**: Obfuscated (Base64, hex, concatenated, in comments, Unicode)
- **10 Hard**: Well hidden (ROT13, binary, XOR, reversed, template strings, regex patterns)
All secrets documented in `secret_detection_benchmark_GROUND_TRUTH.json` with exact file paths and line numbers.
See `test_projects/secret_detection_benchmark/README.md` for details.
## Metrics Measured
### Accuracy Metrics
- **Precision**: TP / (TP + FP) - How many detected secrets are real?
- **Recall**: TP / (TP + FN) - How many real secrets were found?
- **F1 Score**: Harmonic mean of precision and recall
- **False Positive Rate**: FP / Total Detected
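These metrics follow directly from the counts above; a minimal sketch, assuming true positives, false positives, and false negatives have already been tallied against `secret_detection_benchmark_GROUND_TRUTH.json`:
```python
def accuracy_metrics(tp: int, fp: int, fn: int) -> dict:
    """Compute precision, recall, and F1 from raw counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

# Example: a tool finds 30 of the 32 documented secrets and reports 3 false positives
print(accuracy_metrics(tp=30, fp=3, fn=2))
# {'precision': 0.909..., 'recall': 0.9375, 'f1': 0.923...}
```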
### Performance Metrics
- **Execution Time**: Total time to scan all files
- **Throughput**: Files/secrets scanned per second
- **Memory Usage**: Peak memory during execution
### Thresholds (from `category_configs.py`)
- Minimum Precision: 90%
- Minimum Recall: 95%
- Max Execution Time (small): 2.0s
- Max False Positives: 5 per 100 secrets
## Tool Comparison
### Gitleaks
**Strengths:**
- Fastest execution
- Git-aware (commit history scanning)
- Low false positive rate
- No API required
- Works offline
**Weaknesses:**
- Pattern-based only
- May miss obfuscated secrets
- Limited to known patterns
### TruffleHog
**Strengths:**
- Secret verification (validates if active)
- High detection rate with entropy analysis
- Multiple detectors (600+ secret types)
- Catches high-entropy strings
**Weaknesses:**
- Slower than Gitleaks
- Higher false positive rate
- Verification requires network calls
### LLM Detector
**Strengths:**
- Semantic understanding of context
- Catches novel/custom secret patterns
- Can reason about what "looks like" a secret
- Multiple model options (GPT-4, Claude, etc.)
- Understands code context
**Weaknesses:**
- Slowest (API latency + LLM processing)
- Most expensive (LLM API costs)
- Requires A2A agent infrastructure
- Accuracy varies by model
- May miss well-disguised secrets
## Results Directory
After running comparisons, results are saved to:
```
benchmarks/by_category/secret_detection/results/
├── comparison_report.md # Human-readable comparison with:
│ # - Summary table with secrets/files/avg per file/time
│ # - Agreement analysis (secrets found by N tools)
│ # - Tool agreement matrix (overlap between pairs)
│ # - Per-file detailed comparison table
│ # - File type breakdown
│ # - Files analyzed by each tool
│ # - Overlap analysis and performance summary
└── comparison_results.json # Machine-readable data with findings_by_file
```
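The JSON report can also be post-processed programmatically. A minimal sketch, assuming the per-tool fields (`tool_name`, `findings_count`, `unique_files`, `execution_time`) written by `compare_tools.py`:
```python
import json
from pathlib import Path

# Relative to the backend/ directory used in the examples above
results_path = Path("benchmarks/by_category/secret_detection/results/comparison_results.json")
data = json.loads(results_path.read_text())

# One summary line per tool, sorted by number of secrets found
for tool in sorted(data["results"], key=lambda r: r["findings_count"], reverse=True):
    print(f"{tool['tool_name']:<20} {tool['findings_count']:>3} secrets "
          f"in {tool['unique_files']} files ({tool['execution_time']:.1f}s)")
```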
## Latest Benchmark Results
Run the benchmark to generate results:
```bash
cd backend
python benchmarks/by_category/secret_detection/compare_tools.py
```
Results are saved to `results/comparison_report.md` with:
- Summary table (secrets found, files scanned, time)
- Agreement analysis (how many tools found each secret)
- Tool agreement matrix (overlap between tools)
- Per-file detailed comparison
- File type breakdown
## CI/CD Integration
Add to your CI pipeline:
```yaml
# .github/workflows/benchmark-secrets.yml
name: Secret Detection Benchmark
on:
  schedule:
    - cron: '0 0 * * 0'  # Weekly
  workflow_dispatch:
jobs:
  benchmark:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.11'
      - name: Install dependencies
        run: |
          pip install -r backend/requirements.txt
          pip install pytest-benchmark
      - name: Run benchmarks
        env:
          GITGUARDIAN_API_KEY: ${{ secrets.GITGUARDIAN_API_KEY }}
        run: |
          cd backend
          pytest benchmarks/by_category/secret_detection/bench_comparison.py \
            --benchmark-only \
            --benchmark-json=results.json \
            --gitguardian-api-key
      - name: Upload results
        uses: actions/upload-artifact@v3
        with:
          name: benchmark-results
          path: backend/results.json
```
## Adding New Tools
To benchmark a new secret detection tool:
1. Create module in `toolbox/modules/secret_detection/`
2. Register in `__init__.py`
3. Add to `compare_tools.py` in `run_all_tools()` (see the sketch below)
4. Add test in `bench_comparison.py`
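Step 3 amounts to one more `run_workflow(...)` call inside `SecretDetectionComparison.run_all_tools()`; a minimal sketch, where the `new_tool_detection` workflow name and its parameters are placeholders for your own module:
```python
# Inside SecretDetectionComparison.run_all_tools(), alongside the existing tools
result = await self.run_workflow("new_tool_detection", "NewTool", {
    # Parameters are forwarded as-is to the workflow; adjust to your module
    "some_option": True,
})
if result:
    results.append(result)
```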
## Interpreting Results
### High Precision, Low Recall
The tool is conservative: it produces few false positives but misses some real secrets.
**Use case**: Production environments where false positives are costly.
### Low Precision, High Recall
The tool is aggressive: it finds most secrets but also reports many false positives.
**Use case**: Initial scans where manual review is acceptable.
### Balanced (High F1)
The tool strikes a good balance between precision and recall.
**Use case**: General purpose scanning.
### Fast Execution
Suitable for CI/CD pipelines and pre-commit hooks.
### Slow but Accurate
Better for comprehensive security audits.
## Best Practices
1. **Use multiple tools**: Each has strengths/weaknesses
2. **Combine results**: Union of all findings for maximum coverage (see the sketch after this list)
3. **Filter intelligently**: Remove known false positives
4. **Verify findings**: Check if secrets are actually valid
5. **Track over time**: Monitor precision/recall trends
6. **Update regularly**: Patterns evolve, tools improve
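For practice 2, combining results is a set union over `(file, line)` pairs; a minimal sketch using the `findings_by_file` mapping that `compare_tools.py` produces (the sample data below is illustrative):
```python
# findings_by_file maps file path -> sorted list of line numbers, one mapping per tool
gitleaks = {".env": [3], "config/settings.py": [6, 9]}
trufflehog = {"config/database.yaml": [6]}

def to_locations(findings_by_file):
    """Flatten a findings_by_file mapping into a set of (file, line) pairs."""
    return {(path, line) for path, lines in findings_by_file.items() for line in lines}

# Union of all findings gives maximum coverage across tools
combined = to_locations(gitleaks) | to_locations(trufflehog)
print(f"{len(combined)} unique secret locations across tools")
```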
## Troubleshooting
### GitGuardian Tests Skipped
- Set the `GITGUARDIAN_API_KEY` environment variable
- Pass the `--gitguardian-api-key` flag
### LLM Tests Skipped
- Ensure the A2A agent is running
- Check the agent URL in the config
- Pass the `--llm-enabled` flag
### Low Recall
- Check if ground truth is up to date
- Verify tool is configured correctly
- Review missed secrets manually
### High False Positives
- Adjust tool sensitivity
- Add exclusion patterns
- Review false positive list

View File

@@ -1,285 +0,0 @@
"""
Secret Detection Tool Comparison Benchmark
Compares Gitleaks, TruffleHog, and LLM-based detection
on the vulnerable_app ground truth dataset via workflow execution.
"""
import pytest
import json
from pathlib import Path
from typing import Dict, List, Any
import sys
sys.path.insert(0, str(Path(__file__).resolve().parents[3] / "sdk" / "src"))
from fuzzforge_sdk import FuzzForgeClient
from benchmarks.category_configs import ModuleCategory, get_threshold
@pytest.fixture
def target_path():
"""Path to vulnerable_app"""
path = Path(__file__).parent.parent.parent.parent.parent / "test_projects" / "vulnerable_app"
assert path.exists(), f"Target not found: {path}"
return path
@pytest.fixture
def ground_truth(target_path):
"""Load ground truth data"""
metadata_file = target_path / "SECRETS_GROUND_TRUTH.json"
assert metadata_file.exists(), f"Ground truth not found: {metadata_file}"
with open(metadata_file) as f:
return json.load(f)
@pytest.fixture
def sdk_client():
"""FuzzForge SDK client"""
client = FuzzForgeClient(base_url="http://localhost:8000")
yield client
client.close()
def calculate_metrics(sarif_results: List[Dict], ground_truth: Dict[str, Any]) -> Dict[str, float]:
"""Calculate precision, recall, and F1 score"""
# Extract expected secrets from ground truth
expected_secrets = set()
for file_info in ground_truth["files"]:
if "secrets" in file_info:
for secret in file_info["secrets"]:
expected_secrets.add((file_info["filename"], secret["line"]))
# Extract detected secrets from SARIF
detected_secrets = set()
for result in sarif_results:
locations = result.get("locations", [])
for location in locations:
physical_location = location.get("physicalLocation", {})
artifact_location = physical_location.get("artifactLocation", {})
region = physical_location.get("region", {})
uri = artifact_location.get("uri", "")
line = region.get("startLine", 0)
if uri and line:
file_path = Path(uri)
filename = file_path.name
detected_secrets.add((filename, line))
# Also try with relative path
if len(file_path.parts) > 1:
rel_path = str(Path(*file_path.parts[-2:]))
detected_secrets.add((rel_path, line))
# Calculate metrics
true_positives = len(expected_secrets & detected_secrets)
false_positives = len(detected_secrets - expected_secrets)
false_negatives = len(expected_secrets - detected_secrets)
precision = true_positives / (true_positives + false_positives) if (true_positives + false_positives) > 0 else 0
recall = true_positives / (true_positives + false_negatives) if (true_positives + false_negatives) > 0 else 0
f1 = 2 * (precision * recall) / (precision + recall) if (precision + recall) > 0 else 0
return {
"precision": precision,
"recall": recall,
"f1": f1,
"true_positives": true_positives,
"false_positives": false_positives,
"false_negatives": false_negatives
}
class TestSecretDetectionComparison:
"""Compare all secret detection tools"""
@pytest.mark.benchmark(group="secret_detection")
def test_gitleaks_workflow(self, benchmark, sdk_client, target_path, ground_truth):
"""Benchmark Gitleaks workflow accuracy and performance"""
def run_gitleaks():
run = sdk_client.submit_workflow_with_upload(
workflow_name="gitleaks_detection",
target_path=str(target_path),
parameters={
"scan_mode": "detect",
"no_git": True,
"redact": False
}
)
result = sdk_client.wait_for_completion(run.run_id, timeout=300)
assert result.status == "completed", f"Workflow failed: {result.status}"
findings = sdk_client.get_run_findings(run.run_id)
assert findings and findings.sarif, "No findings returned"
return findings
findings = benchmark(run_gitleaks)
# Extract SARIF results
sarif_results = []
for run_data in findings.sarif.get("runs", []):
sarif_results.extend(run_data.get("results", []))
# Calculate metrics
metrics = calculate_metrics(sarif_results, ground_truth)
# Log results
print(f"\n=== Gitleaks Workflow Results ===")
print(f"Precision: {metrics['precision']:.2%}")
print(f"Recall: {metrics['recall']:.2%}")
print(f"F1 Score: {metrics['f1']:.2%}")
print(f"True Positives: {metrics['true_positives']}")
print(f"False Positives: {metrics['false_positives']}")
print(f"False Negatives: {metrics['false_negatives']}")
print(f"Findings Count: {len(sarif_results)}")
# Assert meets thresholds
min_precision = get_threshold(ModuleCategory.SECRET_DETECTION, "min_precision")
min_recall = get_threshold(ModuleCategory.SECRET_DETECTION, "min_recall")
assert metrics['precision'] >= min_precision, \
f"Precision {metrics['precision']:.2%} below threshold {min_precision:.2%}"
assert metrics['recall'] >= min_recall, \
f"Recall {metrics['recall']:.2%} below threshold {min_recall:.2%}"
@pytest.mark.benchmark(group="secret_detection")
def test_trufflehog_workflow(self, benchmark, sdk_client, target_path, ground_truth):
"""Benchmark TruffleHog workflow accuracy and performance"""
def run_trufflehog():
run = sdk_client.submit_workflow_with_upload(
workflow_name="trufflehog_detection",
target_path=str(target_path),
parameters={
"verify": False,
"max_depth": 10
}
)
result = sdk_client.wait_for_completion(run.run_id, timeout=300)
assert result.status == "completed", f"Workflow failed: {result.status}"
findings = sdk_client.get_run_findings(run.run_id)
assert findings and findings.sarif, "No findings returned"
return findings
findings = benchmark(run_trufflehog)
sarif_results = []
for run_data in findings.sarif.get("runs", []):
sarif_results.extend(run_data.get("results", []))
metrics = calculate_metrics(sarif_results, ground_truth)
print(f"\n=== TruffleHog Workflow Results ===")
print(f"Precision: {metrics['precision']:.2%}")
print(f"Recall: {metrics['recall']:.2%}")
print(f"F1 Score: {metrics['f1']:.2%}")
print(f"True Positives: {metrics['true_positives']}")
print(f"False Positives: {metrics['false_positives']}")
print(f"False Negatives: {metrics['false_negatives']}")
print(f"Findings Count: {len(sarif_results)}")
min_precision = get_threshold(ModuleCategory.SECRET_DETECTION, "min_precision")
min_recall = get_threshold(ModuleCategory.SECRET_DETECTION, "min_recall")
assert metrics['precision'] >= min_precision
assert metrics['recall'] >= min_recall
@pytest.mark.benchmark(group="secret_detection")
@pytest.mark.parametrize("model", [
"gpt-4o-mini",
"gpt-4o",
"claude-3-5-sonnet-20241022"
])
def test_llm_workflow(self, benchmark, sdk_client, target_path, ground_truth, model):
"""Benchmark LLM workflow with different models"""
def run_llm():
provider = "openai" if "gpt" in model else "anthropic"
run = sdk_client.submit_workflow_with_upload(
workflow_name="llm_secret_detection",
target_path=str(target_path),
parameters={
"agent_url": "http://fuzzforge-task-agent:8000/a2a/litellm_agent",
"llm_model": model,
"llm_provider": provider,
"max_files": 20,
"timeout": 60
}
)
result = sdk_client.wait_for_completion(run.run_id, timeout=300)
assert result.status == "completed", f"Workflow failed: {result.status}"
findings = sdk_client.get_run_findings(run.run_id)
assert findings and findings.sarif, "No findings returned"
return findings
findings = benchmark(run_llm)
sarif_results = []
for run_data in findings.sarif.get("runs", []):
sarif_results.extend(run_data.get("results", []))
metrics = calculate_metrics(sarif_results, ground_truth)
print(f"\n=== LLM ({model}) Workflow Results ===")
print(f"Precision: {metrics['precision']:.2%}")
print(f"Recall: {metrics['recall']:.2%}")
print(f"F1 Score: {metrics['f1']:.2%}")
print(f"True Positives: {metrics['true_positives']}")
print(f"False Positives: {metrics['false_positives']}")
print(f"False Negatives: {metrics['false_negatives']}")
print(f"Findings Count: {len(sarif_results)}")
class TestSecretDetectionPerformance:
"""Performance benchmarks for each tool"""
@pytest.mark.benchmark(group="secret_detection")
def test_gitleaks_performance(self, benchmark, sdk_client, target_path):
"""Benchmark Gitleaks workflow execution speed"""
def run():
run = sdk_client.submit_workflow_with_upload(
workflow_name="gitleaks_detection",
target_path=str(target_path),
parameters={"scan_mode": "detect", "no_git": True}
)
result = sdk_client.wait_for_completion(run.run_id, timeout=300)
return result
result = benchmark(run)
max_time = get_threshold(ModuleCategory.SECRET_DETECTION, "max_execution_time_small")
# Note: Workflow execution time includes orchestration overhead
# so we allow 2x the module threshold
assert result.execution_time < max_time * 2
@pytest.mark.benchmark(group="secret_detection")
def test_trufflehog_performance(self, benchmark, sdk_client, target_path):
"""Benchmark TruffleHog workflow execution speed"""
def run():
run = sdk_client.submit_workflow_with_upload(
workflow_name="trufflehog_detection",
target_path=str(target_path),
parameters={"verify": False}
)
result = sdk_client.wait_for_completion(run.run_id, timeout=300)
return result
result = benchmark(run)
max_time = get_threshold(ModuleCategory.SECRET_DETECTION, "max_execution_time_small")
assert result.execution_time < max_time * 2

View File

@@ -1,547 +0,0 @@
"""
Secret Detection Tools Comparison Report Generator
Generates comparison reports showing strengths/weaknesses of each tool.
Uses workflow execution via SDK to test complete pipeline.
"""
import asyncio
import json
import time
from pathlib import Path
from typing import Dict, List, Any, Optional
from dataclasses import dataclass, asdict
import sys
sys.path.insert(0, str(Path(__file__).resolve().parents[3] / "sdk" / "src"))
from fuzzforge_sdk import FuzzForgeClient
@dataclass
class ToolResult:
"""Results from running a tool"""
tool_name: str
execution_time: float
findings_count: int
findings_by_file: Dict[str, List[int]] # file_path -> [line_numbers]
unique_files: int
unique_locations: int # unique (file, line) pairs
secret_density: float # average secrets per file
file_types: Dict[str, int] # file extension -> count of files with secrets
class SecretDetectionComparison:
"""Compare secret detection tools"""
def __init__(self, target_path: Path, api_url: str = "http://localhost:8000"):
self.target_path = target_path
self.client = FuzzForgeClient(base_url=api_url)
async def run_workflow(self, workflow_name: str, tool_name: str, config: Dict[str, Any] = None) -> Optional[ToolResult]:
"""Run a workflow and extract findings"""
print(f"\nRunning {tool_name} workflow...")
start_time = time.time()
try:
# Start workflow
run = self.client.submit_workflow_with_upload(
workflow_name=workflow_name,
target_path=str(self.target_path),
parameters=config or {}
)
print(f" Started run: {run.run_id}")
# Wait for completion (up to 30 minutes for slow LLMs)
print(f" Waiting for completion...")
result = self.client.wait_for_completion(run.run_id, timeout=1800)
execution_time = time.time() - start_time
if result.status != "COMPLETED":
print(f"{tool_name} workflow failed: {result.status}")
return None
# Get findings from SARIF
findings = self.client.get_run_findings(run.run_id)
if not findings or not findings.sarif:
print(f"⚠️ {tool_name} produced no findings")
return None
# Extract results from SARIF and group by file
findings_by_file = {}
unique_locations = set()
for run_data in findings.sarif.get("runs", []):
for result in run_data.get("results", []):
locations = result.get("locations", [])
for location in locations:
physical_location = location.get("physicalLocation", {})
artifact_location = physical_location.get("artifactLocation", {})
region = physical_location.get("region", {})
uri = artifact_location.get("uri", "")
line = region.get("startLine", 0)
if uri and line:
if uri not in findings_by_file:
findings_by_file[uri] = []
findings_by_file[uri].append(line)
unique_locations.add((uri, line))
# Sort line numbers for each file
for file_path in findings_by_file:
findings_by_file[file_path] = sorted(set(findings_by_file[file_path]))
# Calculate file type distribution
file_types = {}
for file_path in findings_by_file:
ext = Path(file_path).suffix or Path(file_path).name # Use full name for files like .env
if ext.startswith('.'):
file_types[ext] = file_types.get(ext, 0) + 1
else:
file_types['[no extension]'] = file_types.get('[no extension]', 0) + 1
# Calculate secret density
secret_density = len(unique_locations) / len(findings_by_file) if findings_by_file else 0
print(f" ✓ Found {len(unique_locations)} secrets in {len(findings_by_file)} files (avg {secret_density:.1f} per file)")
return ToolResult(
tool_name=tool_name,
execution_time=execution_time,
findings_count=len(unique_locations),
findings_by_file=findings_by_file,
unique_files=len(findings_by_file),
unique_locations=len(unique_locations),
secret_density=secret_density,
file_types=file_types
)
except Exception as e:
print(f"{tool_name} error: {e}")
return None
async def run_all_tools(self, llm_models: List[str] = None) -> List[ToolResult]:
"""Run all available tools"""
results = []
if llm_models is None:
llm_models = ["gpt-4o-mini"]
# Gitleaks
result = await self.run_workflow("gitleaks_detection", "Gitleaks", {
"scan_mode": "detect",
"no_git": True,
"redact": False
})
if result:
results.append(result)
# TruffleHog
result = await self.run_workflow("trufflehog_detection", "TruffleHog", {
"verify": False,
"max_depth": 10
})
if result:
results.append(result)
# LLM Detector with multiple models
for model in llm_models:
tool_name = f"LLM ({model})"
result = await self.run_workflow("llm_secret_detection", tool_name, {
"agent_url": "http://fuzzforge-task-agent:8000/a2a/litellm_agent",
"llm_model": model,
"llm_provider": "openai" if "gpt" in model else "anthropic",
"max_files": 20,
"timeout": 60,
"file_patterns": [
"*.py", "*.js", "*.ts", "*.java", "*.go", "*.env", "*.yaml", "*.yml",
"*.json", "*.xml", "*.ini", "*.sql", "*.properties", "*.sh", "*.bat",
"*.config", "*.conf", "*.toml", "*id_rsa*", "*.txt"
]
})
if result:
results.append(result)
return results
def _calculate_agreement_matrix(self, results: List[ToolResult]) -> Dict[str, Dict[str, int]]:
"""Calculate overlap matrix showing common secrets between tool pairs"""
matrix = {}
for i, result1 in enumerate(results):
matrix[result1.tool_name] = {}
# Convert to set of (file, line) tuples
secrets1 = set()
for file_path, lines in result1.findings_by_file.items():
for line in lines:
secrets1.add((file_path, line))
for result2 in results:
secrets2 = set()
for file_path, lines in result2.findings_by_file.items():
for line in lines:
secrets2.add((file_path, line))
# Count common secrets
common = len(secrets1 & secrets2)
matrix[result1.tool_name][result2.tool_name] = common
return matrix
def _get_per_file_comparison(self, results: List[ToolResult]) -> Dict[str, Dict[str, int]]:
"""Get per-file breakdown of findings across all tools"""
all_files = set()
for result in results:
all_files.update(result.findings_by_file.keys())
comparison = {}
for file_path in sorted(all_files):
comparison[file_path] = {}
for result in results:
comparison[file_path][result.tool_name] = len(result.findings_by_file.get(file_path, []))
return comparison
def _get_agreement_stats(self, results: List[ToolResult]) -> Dict[int, int]:
"""Calculate how many secrets are found by 1, 2, 3, or all tools"""
# Collect all unique (file, line) pairs across all tools
all_secrets = {} # (file, line) -> list of tools that found it
for result in results:
for file_path, lines in result.findings_by_file.items():
for line in lines:
key = (file_path, line)
if key not in all_secrets:
all_secrets[key] = []
all_secrets[key].append(result.tool_name)
# Count by number of tools
agreement_counts = {}
for secret, tools in all_secrets.items():
count = len(set(tools)) # Unique tools
agreement_counts[count] = agreement_counts.get(count, 0) + 1
return agreement_counts
def generate_markdown_report(self, results: List[ToolResult]) -> str:
"""Generate markdown comparison report"""
report = []
report.append("# Secret Detection Tools Comparison\n")
report.append(f"**Target**: {self.target_path.name}")
report.append(f"**Tools**: {', '.join([r.tool_name for r in results])}\n")
# Summary table with extended metrics
report.append("\n## Summary\n")
report.append("| Tool | Secrets | Files | Avg/File | Time (s) |")
report.append("|------|---------|-------|----------|----------|")
for result in results:
report.append(
f"| {result.tool_name} | "
f"{result.findings_count} | "
f"{result.unique_files} | "
f"{result.secret_density:.1f} | "
f"{result.execution_time:.2f} |"
)
# Agreement Analysis
agreement_stats = self._get_agreement_stats(results)
report.append("\n## Agreement Analysis\n")
report.append("Secrets found by different numbers of tools:\n")
for num_tools in sorted(agreement_stats.keys(), reverse=True):
count = agreement_stats[num_tools]
if num_tools == len(results):
report.append(f"- **All {num_tools} tools agree**: {count} secrets")
elif num_tools == 1:
report.append(f"- **Only 1 tool found**: {count} secrets")
else:
report.append(f"- **{num_tools} tools agree**: {count} secrets")
# Agreement Matrix
agreement_matrix = self._calculate_agreement_matrix(results)
report.append("\n## Tool Agreement Matrix\n")
report.append("Number of common secrets found by tool pairs:\n")
# Header row
header = "| Tool |"
separator = "|------|"
for result in results:
short_name = result.tool_name.replace("LLM (", "").replace(")", "")
header += f" {short_name} |"
separator += "------|"
report.append(header)
report.append(separator)
# Data rows
for result in results:
short_name = result.tool_name.replace("LLM (", "").replace(")", "")
row = f"| {short_name} |"
for result2 in results:
count = agreement_matrix[result.tool_name][result2.tool_name]
row += f" {count} |"
report.append(row)
# Per-File Comparison
per_file = self._get_per_file_comparison(results)
report.append("\n## Per-File Detailed Comparison\n")
report.append("Secrets found per file by each tool:\n")
# Header
header = "| File |"
separator = "|------|"
for result in results:
short_name = result.tool_name.replace("LLM (", "").replace(")", "")
header += f" {short_name} |"
separator += "------|"
header += " Total |"
separator += "------|"
report.append(header)
report.append(separator)
# Show top 15 files by total findings
file_totals = [(f, sum(counts.values())) for f, counts in per_file.items()]
file_totals.sort(key=lambda x: x[1], reverse=True)
for file_path, total in file_totals[:15]:
row = f"| `{file_path}` |"
for result in results:
count = per_file[file_path].get(result.tool_name, 0)
row += f" {count} |"
row += f" **{total}** |"
report.append(row)
if len(file_totals) > 15:
report.append(f"| ... and {len(file_totals) - 15} more files | ... | ... | ... | ... | ... |")
# File Type Breakdown
report.append("\n## File Type Breakdown\n")
all_extensions = set()
for result in results:
all_extensions.update(result.file_types.keys())
if all_extensions:
header = "| Type |"
separator = "|------|"
for result in results:
short_name = result.tool_name.replace("LLM (", "").replace(")", "")
header += f" {short_name} |"
separator += "------|"
report.append(header)
report.append(separator)
for ext in sorted(all_extensions):
row = f"| `{ext}` |"
for result in results:
count = result.file_types.get(ext, 0)
row += f" {count} files |"
report.append(row)
# File analysis
report.append("\n## Files Analyzed\n")
# Collect all unique files across all tools
all_files = set()
for result in results:
all_files.update(result.findings_by_file.keys())
report.append(f"**Total unique files with secrets**: {len(all_files)}\n")
for result in results:
report.append(f"\n### {result.tool_name}\n")
report.append(f"Found secrets in **{result.unique_files} files**:\n")
# Sort files by number of findings (descending)
sorted_files = sorted(
result.findings_by_file.items(),
key=lambda x: len(x[1]),
reverse=True
)
# Show top 10 files
for file_path, lines in sorted_files[:10]:
report.append(f"- `{file_path}`: {len(lines)} secrets (lines: {', '.join(map(str, lines[:5]))}{'...' if len(lines) > 5 else ''})")
if len(sorted_files) > 10:
report.append(f"- ... and {len(sorted_files) - 10} more files")
# Overlap analysis
if len(results) >= 2:
report.append("\n## Overlap Analysis\n")
# Find common files
file_sets = [set(r.findings_by_file.keys()) for r in results]
common_files = set.intersection(*file_sets) if file_sets else set()
if common_files:
report.append(f"\n**Files found by all tools** ({len(common_files)}):\n")
for file_path in sorted(common_files)[:10]:
report.append(f"- `{file_path}`")
else:
report.append("\n**No files were found by all tools**\n")
# Find tool-specific files
for i, result in enumerate(results):
unique_to_tool = set(result.findings_by_file.keys())
for j, other_result in enumerate(results):
if i != j:
unique_to_tool -= set(other_result.findings_by_file.keys())
if unique_to_tool:
report.append(f"\n**Unique to {result.tool_name}** ({len(unique_to_tool)} files):\n")
for file_path in sorted(unique_to_tool)[:5]:
report.append(f"- `{file_path}`")
if len(unique_to_tool) > 5:
report.append(f"- ... and {len(unique_to_tool) - 5} more")
# Ground Truth Analysis (if available)
ground_truth_path = Path(__file__).parent / "secret_detection_benchmark_GROUND_TRUTH.json"
if ground_truth_path.exists():
report.append("\n## Ground Truth Analysis\n")
try:
with open(ground_truth_path) as f:
gt_data = json.load(f)
gt_total = gt_data.get("total_secrets", 30)
report.append(f"**Expected secrets**: {gt_total} (documented in ground truth)\n")
# Build ground truth set of (file, line) tuples
gt_secrets = set()
for secret in gt_data.get("secrets", []):
gt_secrets.add((secret["file"], secret["line"]))
report.append("### Tool Performance vs Ground Truth\n")
report.append("| Tool | Found | Expected | Recall | Extra Findings |")
report.append("|------|-------|----------|--------|----------------|")
for result in results:
# Build tool findings set
tool_secrets = set()
for file_path, lines in result.findings_by_file.items():
for line in lines:
tool_secrets.add((file_path, line))
# Calculate metrics
true_positives = len(gt_secrets & tool_secrets)
recall = (true_positives / gt_total * 100) if gt_total > 0 else 0
extra = len(tool_secrets - gt_secrets)
report.append(
f"| {result.tool_name} | "
f"{result.findings_count} | "
f"{gt_total} | "
f"{recall:.1f}% | "
f"{extra} |"
)
# Analyze LLM extra findings
llm_results = [r for r in results if "LLM" in r.tool_name]
if llm_results:
report.append("\n### LLM Extra Findings Explanation\n")
report.append("LLMs may find more than 30 secrets because they detect:\n")
report.append("- **Split secret components**: Each part of `DB_PASS_PART1 + PART2 + PART3` counted separately")
report.append("- **Join operations**: Lines like `''.join(AWS_SECRET_CHARS)` flagged as additional exposure")
report.append("- **Decoding functions**: Code that reveals secrets (e.g., `base64.b64decode()`, `codecs.decode()`)")
report.append("- **Comment identifiers**: Lines marking secret locations without plaintext values")
report.append("\nThese are *technically correct* detections of secret exposure points, not false positives.")
report.append("The ground truth documents 30 'primary' secrets, but the codebase has additional derivative exposures.\n")
except Exception as e:
report.append(f"*Could not load ground truth: {e}*\n")
# Performance summary
if results:
report.append("\n## Performance Summary\n")
most_findings = max(results, key=lambda r: r.findings_count)
most_files = max(results, key=lambda r: r.unique_files)
fastest = min(results, key=lambda r: r.execution_time)
report.append(f"- **Most secrets found**: {most_findings.tool_name} ({most_findings.findings_count} secrets)")
report.append(f"- **Most files covered**: {most_files.tool_name} ({most_files.unique_files} files)")
report.append(f"- **Fastest**: {fastest.tool_name} ({fastest.execution_time:.2f}s)")
return "\n".join(report)
def save_json_report(self, results: List[ToolResult], output_path: Path):
"""Save results as JSON"""
data = {
"target_path": str(self.target_path),
"results": [asdict(r) for r in results]
}
with open(output_path, 'w') as f:
json.dump(data, f, indent=2)
print(f"\n✅ JSON report saved to: {output_path}")
def cleanup(self):
"""Cleanup SDK client"""
self.client.close()
async def main():
"""Run comparison and generate reports"""
# Get target path (secret_detection_benchmark)
target_path = Path(__file__).parent.parent.parent.parent.parent / "test_projects" / "secret_detection_benchmark"
if not target_path.exists():
print(f"❌ Target not found at: {target_path}")
return 1
print("=" * 80)
print("Secret Detection Tools Comparison")
print("=" * 80)
print(f"Target: {target_path}")
# LLM models to test
llm_models = [
"gpt-4o-mini",
"gpt-5-mini"
]
print(f"LLM models: {', '.join(llm_models)}\n")
# Run comparison
comparison = SecretDetectionComparison(target_path)
try:
results = await comparison.run_all_tools(llm_models=llm_models)
if not results:
print("❌ No tools ran successfully")
return 1
# Generate reports
print("\n" + "=" * 80)
markdown_report = comparison.generate_markdown_report(results)
print(markdown_report)
# Save reports
output_dir = Path(__file__).parent / "results"
output_dir.mkdir(exist_ok=True)
markdown_path = output_dir / "comparison_report.md"
with open(markdown_path, 'w') as f:
f.write(markdown_report)
print(f"\n✅ Markdown report saved to: {markdown_path}")
json_path = output_dir / "comparison_results.json"
comparison.save_json_report(results, json_path)
print("\n" + "=" * 80)
print("✅ Comparison complete!")
print("=" * 80)
return 0
finally:
comparison.cleanup()
if __name__ == "__main__":
exit_code = asyncio.run(main())
sys.exit(exit_code)

View File

@@ -1,169 +0,0 @@
# Secret Detection Tools Comparison
**Target**: secret_detection_benchmark
**Tools**: Gitleaks, TruffleHog, LLM (gpt-4o-mini), LLM (gpt-5-mini)
## Summary
| Tool | Secrets | Files | Avg/File | Time (s) |
|------|---------|-------|----------|----------|
| Gitleaks | 12 | 10 | 1.2 | 5.18 |
| TruffleHog | 1 | 1 | 1.0 | 5.06 |
| LLM (gpt-4o-mini) | 30 | 15 | 2.0 | 296.85 |
| LLM (gpt-5-mini) | 41 | 16 | 2.6 | 618.55 |
## Agreement Analysis
Secrets found by different numbers of tools:
- **3 tools agree**: 6 secrets
- **2 tools agree**: 22 secrets
- **Only 1 tool found**: 22 secrets
## Tool Agreement Matrix
Number of common secrets found by tool pairs:
| Tool | Gitleaks | TruffleHog | gpt-4o-mini | gpt-5-mini |
|------|------|------|------|------|
| Gitleaks | 12 | 0 | 7 | 11 |
| TruffleHog | 0 | 1 | 0 | 0 |
| gpt-4o-mini | 7 | 0 | 30 | 22 |
| gpt-5-mini | 11 | 0 | 22 | 41 |
## Per-File Detailed Comparison
Secrets found per file by each tool:
| File | Gitleaks | TruffleHog | gpt-4o-mini | gpt-5-mini | Total |
|------|------|------|------|------|------|
| `src/obfuscated.py` | 2 | 0 | 6 | 7 | **15** |
| `src/advanced.js` | 0 | 0 | 5 | 7 | **12** |
| `src/config.py` | 1 | 0 | 0 | 6 | **7** |
| `.env` | 1 | 0 | 2 | 2 | **5** |
| `config/keys.yaml` | 1 | 0 | 2 | 2 | **5** |
| `config/oauth.json` | 1 | 0 | 2 | 2 | **5** |
| `config/settings.py` | 2 | 0 | 0 | 3 | **5** |
| `scripts/deploy.sh` | 1 | 0 | 2 | 2 | **5** |
| `config/legacy.ini` | 0 | 0 | 2 | 2 | **4** |
| `src/Crypto.go` | 0 | 0 | 2 | 2 | **4** |
| `config/app.properties` | 1 | 0 | 1 | 1 | **3** |
| `config/database.yaml` | 0 | 1 | 1 | 1 | **3** |
| `src/Main.java` | 1 | 0 | 1 | 1 | **3** |
| `id_rsa` | 1 | 0 | 1 | 0 | **2** |
| `scripts/webhook.js` | 0 | 0 | 1 | 1 | **2** |
| ... and 2 more files | ... | ... | ... | ... | ... |
## File Type Breakdown
| Type | Gitleaks | TruffleHog | gpt-4o-mini | gpt-5-mini |
|------|------|------|------|------|
| `.env` | 1 files | 0 files | 1 files | 1 files |
| `.go` | 0 files | 0 files | 1 files | 1 files |
| `.ini` | 0 files | 0 files | 1 files | 1 files |
| `.java` | 1 files | 0 files | 1 files | 1 files |
| `.js` | 0 files | 0 files | 2 files | 2 files |
| `.json` | 1 files | 0 files | 1 files | 1 files |
| `.properties` | 1 files | 0 files | 1 files | 1 files |
| `.py` | 3 files | 0 files | 2 files | 4 files |
| `.sh` | 1 files | 0 files | 1 files | 1 files |
| `.sql` | 0 files | 0 files | 1 files | 1 files |
| `.yaml` | 1 files | 1 files | 2 files | 2 files |
| `[no extension]` | 1 files | 0 files | 1 files | 0 files |
## Files Analyzed
**Total unique files with secrets**: 17
### Gitleaks
Found secrets in **10 files**:
- `config/settings.py`: 2 secrets (lines: 6, 9)
- `src/obfuscated.py`: 2 secrets (lines: 7, 17)
- `.env`: 1 secrets (lines: 3)
- `config/app.properties`: 1 secrets (lines: 6)
- `config/keys.yaml`: 1 secrets (lines: 6)
- `id_rsa`: 1 secrets (lines: 1)
- `config/oauth.json`: 1 secrets (lines: 4)
- `scripts/deploy.sh`: 1 secrets (lines: 5)
- `src/Main.java`: 1 secrets (lines: 5)
- `src/config.py`: 1 secrets (lines: 7)
### TruffleHog
Found secrets in **1 files**:
- `config/database.yaml`: 1 secrets (lines: 6)
### LLM (gpt-4o-mini)
Found secrets in **15 files**:
- `src/obfuscated.py`: 6 secrets (lines: 7, 10, 13, 18, 20...)
- `src/advanced.js`: 5 secrets (lines: 4, 7, 10, 12, 17)
- `src/Crypto.go`: 2 secrets (lines: 6, 10)
- `.env`: 2 secrets (lines: 3, 4)
- `config/keys.yaml`: 2 secrets (lines: 6, 12)
- `config/oauth.json`: 2 secrets (lines: 3, 4)
- `config/legacy.ini`: 2 secrets (lines: 4, 7)
- `scripts/deploy.sh`: 2 secrets (lines: 6, 9)
- `src/app.py`: 1 secrets (lines: 7)
- `scripts/webhook.js`: 1 secrets (lines: 4)
- ... and 5 more files
### LLM (gpt-5-mini)
Found secrets in **16 files**:
- `src/obfuscated.py`: 7 secrets (lines: 7, 10, 13, 14, 17...)
- `src/advanced.js`: 7 secrets (lines: 4, 7, 9, 10, 13...)
- `src/config.py`: 6 secrets (lines: 7, 10, 13, 14, 15...)
- `config/settings.py`: 3 secrets (lines: 6, 9, 20)
- `src/Crypto.go`: 2 secrets (lines: 10, 15)
- `.env`: 2 secrets (lines: 3, 4)
- `config/keys.yaml`: 2 secrets (lines: 6, 12)
- `config/oauth.json`: 2 secrets (lines: 3, 4)
- `config/legacy.ini`: 2 secrets (lines: 3, 7)
- `scripts/deploy.sh`: 2 secrets (lines: 5, 10)
- ... and 6 more files
## Overlap Analysis
**No files were found by all tools**
## Ground Truth Analysis
**Expected secrets**: 32 (documented in ground truth)
### Tool Performance vs Ground Truth
| Tool | Found | Expected | Recall | Extra Findings |
|------|-------|----------|--------|----------------|
| Gitleaks | 12 | 32 | 37.5% | 0 |
| TruffleHog | 1 | 32 | 0.0% | 1 |
| LLM (gpt-4o-mini) | 30 | 32 | 56.2% | 12 |
| LLM (gpt-5-mini) | 41 | 32 | 84.4% | 14 |
### LLM Extra Findings Explanation
LLMs may find more than 30 secrets because they detect:
- **Split secret components**: Each part of `DB_PASS_PART1 + PART2 + PART3` counted separately
- **Join operations**: Lines like `''.join(AWS_SECRET_CHARS)` flagged as additional exposure
- **Decoding functions**: Code that reveals secrets (e.g., `base64.b64decode()`, `codecs.decode()`)
- **Comment identifiers**: Lines marking secret locations without plaintext values
These are *technically correct* detections of secret exposure points, not false positives.
The ground truth documents 30 'primary' secrets, but the codebase has additional derivative exposures.
## Performance Summary
- **Most secrets found**: LLM (gpt-5-mini) (41 secrets)
- **Most files covered**: LLM (gpt-5-mini) (16 files)
- **Fastest**: TruffleHog (5.06s)

View File

@@ -1,253 +0,0 @@
{
"target_path": "/Users/tduhamel/Documents/FuzzingLabs/fuzzforge_ai/test_projects/secret_detection_benchmark",
"results": [
{
"tool_name": "Gitleaks",
"execution_time": 5.177123069763184,
"findings_count": 12,
"findings_by_file": {
".env": [
3
],
"config/app.properties": [
6
],
"config/keys.yaml": [
6
],
"id_rsa": [
1
],
"config/oauth.json": [
4
],
"scripts/deploy.sh": [
5
],
"config/settings.py": [
6,
9
],
"src/Main.java": [
5
],
"src/obfuscated.py": [
7,
17
],
"src/config.py": [
7
]
},
"unique_files": 10,
"unique_locations": 12,
"secret_density": 1.2,
"file_types": {
".env": 1,
".properties": 1,
".yaml": 1,
"[no extension]": 1,
".json": 1,
".sh": 1,
".py": 3,
".java": 1
}
},
{
"tool_name": "TruffleHog",
"execution_time": 5.061383008956909,
"findings_count": 1,
"findings_by_file": {
"config/database.yaml": [
6
]
},
"unique_files": 1,
"unique_locations": 1,
"secret_density": 1.0,
"file_types": {
".yaml": 1
}
},
{
"tool_name": "LLM (gpt-4o-mini)",
"execution_time": 296.8492441177368,
"findings_count": 30,
"findings_by_file": {
"src/obfuscated.py": [
7,
10,
13,
18,
20,
23
],
"src/app.py": [
7
],
"scripts/webhook.js": [
4
],
"src/advanced.js": [
4,
7,
10,
12,
17
],
"src/Main.java": [
5
],
"src/Crypto.go": [
6,
10
],
".env": [
3,
4
],
"config/keys.yaml": [
6,
12
],
"config/database.yaml": [
7
],
"config/oauth.json": [
3,
4
],
"config/legacy.ini": [
4,
7
],
"src/database.sql": [
4
],
"config/app.properties": [
6
],
"scripts/deploy.sh": [
6,
9
],
"id_rsa": [
1
]
},
"unique_files": 15,
"unique_locations": 30,
"secret_density": 2.0,
"file_types": {
".py": 2,
".js": 2,
".java": 1,
".go": 1,
".env": 1,
".yaml": 2,
".json": 1,
".ini": 1,
".sql": 1,
".properties": 1,
".sh": 1,
"[no extension]": 1
}
},
{
"tool_name": "LLM (gpt-5-mini)",
"execution_time": 618.5462851524353,
"findings_count": 41,
"findings_by_file": {
"config/settings.py": [
6,
9,
20
],
"src/obfuscated.py": [
7,
10,
13,
14,
17,
20,
23
],
"src/app.py": [
7
],
"src/config.py": [
7,
10,
13,
14,
15,
16
],
"scripts/webhook.js": [
4
],
"src/advanced.js": [
4,
7,
9,
10,
13,
17,
19
],
"src/Main.java": [
5
],
"src/Crypto.go": [
10,
15
],
".env": [
3,
4
],
"config/keys.yaml": [
6,
12
],
"config/database.yaml": [
7
],
"config/oauth.json": [
3,
4
],
"config/legacy.ini": [
3,
7
],
"src/database.sql": [
6
],
"config/app.properties": [
6
],
"scripts/deploy.sh": [
5,
10
]
},
"unique_files": 16,
"unique_locations": 41,
"secret_density": 2.5625,
"file_types": {
".py": 4,
".js": 2,
".java": 1,
".go": 1,
".env": 1,
".yaml": 2,
".json": 1,
".ini": 1,
".sql": 1,
".properties": 1,
".sh": 1
}
}
]
}

View File

@@ -1,344 +0,0 @@
{
"description": "Ground truth dataset for secret detection benchmarking - Exactly 32 secrets",
"version": "1.1.0",
"total_secrets": 32,
"secrets_by_difficulty": {
"easy": 12,
"medium": 10,
"hard": 10
},
"secrets": [
{
"id": 1,
"file": ".env",
"line": 3,
"difficulty": "easy",
"type": "aws_access_key",
"value": "AKIAIOSFODNN7EXAMPLE",
"severity": "critical"
},
{
"id": 2,
"file": ".env",
"line": 4,
"difficulty": "easy",
"type": "aws_secret_access_key",
"value": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
"severity": "critical"
},
{
"id": 3,
"file": "config/settings.py",
"line": 6,
"difficulty": "easy",
"type": "github_pat",
"value": "ghp_vR8jK2mN4pQ6tX9bC3wY7zA1eF5hI8kL",
"severity": "critical"
},
{
"id": 4,
"file": "config/settings.py",
"line": 9,
"difficulty": "easy",
"type": "stripe_api_key",
"value": "sk_live_51MabcdefghijklmnopqrstuvwxyzABCDEF123456789",
"severity": "critical"
},
{
"id": 5,
"file": "config/settings.py",
"line": 17,
"difficulty": "easy",
"type": "database_password",
"value": "ProdDB_P@ssw0rd_2024_Secure!",
"severity": "critical"
},
{
"id": 6,
"file": "src/app.py",
"line": 6,
"difficulty": "easy",
"type": "jwt_secret",
"value": "my-super-secret-jwt-key-do-not-share-2024",
"severity": "critical"
},
{
"id": 7,
"file": "config/database.yaml",
"line": 7,
"difficulty": "easy",
"type": "azure_storage_key",
"value": "DefaultEndpointsProtocol=https;AccountName=prodstore;AccountKey=Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==;EndpointSuffix=core.windows.net",
"severity": "critical"
},
{
"id": 8,
"file": "scripts/webhook.js",
"line": 4,
"difficulty": "easy",
"type": "slack_webhook",
"value": "https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXX",
"severity": "high"
},
{
"id": 9,
"file": "config/app.properties",
"line": 6,
"difficulty": "easy",
"type": "api_key",
"value": "sk_test_4eC39HqLyjWDarjtT1zdp7dc",
"severity": "high"
},
{
"id": 10,
"file": "id_rsa",
"line": 1,
"difficulty": "easy",
"type": "ssh_private_key",
"value": "-----BEGIN OPENSSH PRIVATE KEY-----",
"severity": "critical"
},
{
"id": 11,
"file": "config/oauth.json",
"line": 4,
"difficulty": "easy",
"type": "oauth_client_secret",
"value": "GOCSPX-Ab12Cd34Ef56Gh78Ij90Kl12",
"severity": "critical"
},
{
"id": 12,
"file": "src/Main.java",
"line": 5,
"difficulty": "easy",
"type": "google_oauth_secret",
"value": "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",
"severity": "critical"
},
{
"id": 13,
"file": "src/config.py",
"line": 7,
"difficulty": "medium",
"type": "aws_access_key_base64",
"value": "QUtJQUlPU0ZPRE5ON0VYQU1QTEU=",
"decoded": "AKIAIOSFODNN7EXAMPLE",
"severity": "critical"
},
{
"id": 14,
"file": "src/config.py",
"line": 10,
"difficulty": "medium",
"type": "api_token_hex",
"value": "6170695f746f6b656e5f616263313233787977373839",
"decoded": "api_token_abc123xyz789",
"severity": "high"
},
{
"id": 15,
"file": "src/config.py",
"line": 16,
"difficulty": "medium",
"type": "database_password_concatenated",
"value": "MySecurePassword2024!",
"note": "Built from DB_PASS_PART1 + DB_PASS_PART2 + DB_PASS_PART3",
"severity": "critical"
},
{
"id": 16,
"file": "scripts/deploy.sh",
"line": 5,
"difficulty": "medium",
"type": "api_key_export",
"value": "sk_prod_1234567890abcdefghijklmnopqrstuvwxyz",
"severity": "critical"
},
{
"id": 17,
"file": "scripts/deploy.sh",
"line": 11,
"difficulty": "medium",
"type": "database_password_url_encoded",
"value": "mysql://admin:MyP%40ssw0rd%21@db.example.com:3306/prod",
"decoded": "mysql://admin:MyP@ssw0rd!@db.example.com:3306/prod",
"note": "In comment",
"severity": "critical"
},
{
"id": 18,
"file": "config/keys.yaml",
"line": 6,
"difficulty": "medium",
"type": "rsa_private_key_multiline",
"value": "-----BEGIN RSA PRIVATE KEY-----",
"note": "Multi-line YAML literal block",
"severity": "critical"
},
{
"id": 19,
"file": "config/keys.yaml",
"line": 11,
"difficulty": "medium",
"type": "api_token_unicode",
"value": "tøkęn_śęçrėt_ẃïth_ŭñïçődė_123456",
"severity": "high"
},
{
"id": 20,
"file": "src/database.sql",
"line": 6,
"difficulty": "medium",
"type": "database_connection_string",
"value": "postgresql://admin:Pr0dDB_S3cr3t_P@ss@db.prod.example.com:5432/prod_db",
"note": "In SQL comment",
"severity": "critical"
},
{
"id": 21,
"file": "config/legacy.ini",
"line": 3,
"difficulty": "medium",
"type": "database_password",
"value": "L3g@cy_DB_P@ssw0rd_2023",
"severity": "critical"
},
{
"id": 22,
"file": "config/legacy.ini",
"line": 7,
"difficulty": "medium",
"type": "api_key_commented",
"value": "backup_key_xyz789abc123def456ghi",
"note": "Commented backup key",
"severity": "high"
},
{
"id": 23,
"file": "src/obfuscated.py",
"line": 7,
"difficulty": "hard",
"type": "stripe_key_rot13",
"value": "fx_yvir_frperg_xrl_12345",
"decoded": "sk_live_secret_key_12345",
"severity": "critical"
},
{
"id": 24,
"file": "src/obfuscated.py",
"line": 10,
"difficulty": "hard",
"type": "github_token_binary",
"value": "b'\\x67\\x68\\x70\\x5f\\x4d\\x79\\x47\\x69\\x74\\x48\\x75\\x62\\x54\\x6f\\x6b\\x65\\x6e\\x31\\x32\\x33\\x34\\x35\\x36'",
"decoded": "ghp_MyGitHubToken123456",
"severity": "critical"
},
{
"id": 25,
"file": "src/obfuscated.py",
"line": 13,
"difficulty": "hard",
"type": "aws_secret_char_array",
"value": "['A','W','S','_','S','E','C','R','E','T','_','K','E','Y','_','X','Y','Z','7','8','9']",
"decoded": "AWS_SECRET_KEY_XYZ789",
"severity": "critical"
},
{
"id": 26,
"file": "src/obfuscated.py",
"line": 17,
"difficulty": "hard",
"type": "api_token_reversed",
"value": "321cba_desrever_nekot_ipa",
"decoded": "api_token_reversed_abc123",
"severity": "high"
},
{
"id": 27,
"file": "src/advanced.js",
"line": 4,
"difficulty": "hard",
"type": "secret_template_string",
"value": "sk_prod_template_key_xyz",
"note": "Built from template literals",
"severity": "critical"
},
{
"id": 28,
"file": "src/advanced.js",
"line": 7,
"difficulty": "hard",
"type": "password_in_regex",
"value": "password_regex_secret_789",
"note": "Inside regex pattern",
"severity": "medium"
},
{
"id": 29,
"file": "src/advanced.js",
"line": 10,
"difficulty": "hard",
"type": "api_key_xor",
"value": "[65,82,90,75,94,91,92,75,93,67,65,90,67,92,75,91,67,95]",
"decoded": "api_xor_secret_key",
"note": "XOR encrypted with key 42",
"severity": "critical"
},
{
"id": 30,
"file": "src/advanced.js",
"line": 17,
"difficulty": "hard",
"type": "api_key_escaped_json",
"value": "sk_escaped_json_key_456",
"note": "Escaped JSON within string",
"severity": "high"
},
{
"id": 31,
"file": "src/Crypto.go",
"line": 10,
"difficulty": "hard",
"type": "secret_in_heredoc",
"value": "golang_heredoc_secret_999",
"note": "In heredoc/multi-line string",
"severity": "high"
},
{
"id": 32,
"file": "src/Crypto.go",
"line": 15,
"difficulty": "hard",
"type": "stripe_key_typo",
"value": "strippe_sk_live_corrected_key",
"decoded": "stripe_sk_live_corrected_key",
"note": "Intentional typo corrected programmatically",
"severity": "critical"
}
],
"file_summary": {
".env": 2,
"config/settings.py": 3,
"src/app.py": 1,
"config/database.yaml": 1,
"scripts/webhook.js": 1,
"config/app.properties": 1,
"id_rsa": 1,
"config/oauth.json": 1,
"src/Main.java": 1,
"src/config.py": 3,
"scripts/deploy.sh": 2,
"config/keys.yaml": 2,
"src/database.sql": 1,
"config/legacy.ini": 2,
"src/obfuscated.py": 4,
"src/advanced.js": 4,
"src/Crypto.go": 2
},
"notes": {
"easy_secrets": "Standard patterns that any decent secret scanner should detect",
"medium_secrets": "Slightly obfuscated - base64, hex, concatenated, or in comments",
"hard_secrets": "Well hidden - ROT13, binary, XOR, reversed, split across constructs"
}
}

View File

@@ -1,151 +0,0 @@
"""
Category-specific benchmark configurations
Defines expected metrics and performance thresholds for each module category.
"""
from dataclasses import dataclass
from typing import List, Dict
from enum import Enum
class ModuleCategory(str, Enum):
"""Module categories for benchmarking"""
FUZZER = "fuzzer"
SCANNER = "scanner"
ANALYZER = "analyzer"
SECRET_DETECTION = "secret_detection"
REPORTER = "reporter"
@dataclass
class CategoryBenchmarkConfig:
"""Benchmark configuration for a module category"""
category: ModuleCategory
expected_metrics: List[str]
performance_thresholds: Dict[str, float]
description: str
# Fuzzer category configuration
FUZZER_CONFIG = CategoryBenchmarkConfig(
category=ModuleCategory.FUZZER,
expected_metrics=[
"execs_per_sec",
"coverage_rate",
"time_to_first_crash",
"corpus_efficiency",
"execution_time",
"peak_memory_mb"
],
performance_thresholds={
"min_execs_per_sec": 1000, # Minimum executions per second
"max_execution_time_small": 10.0, # Max time for small project (seconds)
"max_execution_time_medium": 60.0, # Max time for medium project
"max_memory_mb": 2048, # Maximum memory usage
"min_coverage_rate": 1.0, # Minimum new coverage per second
},
description="Fuzzing modules: coverage-guided fuzz testing"
)
# Scanner category configuration
SCANNER_CONFIG = CategoryBenchmarkConfig(
category=ModuleCategory.SCANNER,
expected_metrics=[
"files_per_sec",
"loc_per_sec",
"execution_time",
"peak_memory_mb",
"findings_count"
],
performance_thresholds={
"min_files_per_sec": 100, # Minimum files scanned per second
"min_loc_per_sec": 10000, # Minimum lines of code per second
"max_execution_time_small": 1.0,
"max_execution_time_medium": 10.0,
"max_memory_mb": 512,
},
description="File scanning modules: fast pattern-based scanning"
)
# Secret detection category configuration
SECRET_DETECTION_CONFIG = CategoryBenchmarkConfig(
category=ModuleCategory.SECRET_DETECTION,
expected_metrics=[
"patterns_per_sec",
"precision",
"recall",
"f1_score",
"false_positive_rate",
"execution_time",
"peak_memory_mb"
],
performance_thresholds={
"min_patterns_per_sec": 1000,
"min_precision": 0.90, # 90% precision target
"min_recall": 0.95, # 95% recall target
"max_false_positives": 5, # Max false positives per 100 secrets
"max_execution_time_small": 2.0,
"max_execution_time_medium": 20.0,
"max_memory_mb": 1024,
},
description="Secret detection modules: high precision pattern matching"
)
# Analyzer category configuration
ANALYZER_CONFIG = CategoryBenchmarkConfig(
category=ModuleCategory.ANALYZER,
expected_metrics=[
"analysis_depth",
"files_analyzed_per_sec",
"execution_time",
"peak_memory_mb",
"findings_count",
"accuracy"
],
performance_thresholds={
"min_files_per_sec": 10, # Slower than scanners due to deep analysis
"max_execution_time_small": 5.0,
"max_execution_time_medium": 60.0,
"max_memory_mb": 2048,
"min_accuracy": 0.85, # 85% accuracy target
},
description="Code analysis modules: deep semantic analysis"
)
# Reporter category configuration
REPORTER_CONFIG = CategoryBenchmarkConfig(
category=ModuleCategory.REPORTER,
expected_metrics=[
"report_generation_time",
"findings_per_sec",
"peak_memory_mb"
],
performance_thresholds={
"max_report_time_100_findings": 1.0, # Max 1 second for 100 findings
"max_report_time_1000_findings": 10.0, # Max 10 seconds for 1000 findings
"max_memory_mb": 256,
},
description="Reporting modules: fast report generation"
)
# Category configurations map
CATEGORY_CONFIGS = {
ModuleCategory.FUZZER: FUZZER_CONFIG,
ModuleCategory.SCANNER: SCANNER_CONFIG,
ModuleCategory.SECRET_DETECTION: SECRET_DETECTION_CONFIG,
ModuleCategory.ANALYZER: ANALYZER_CONFIG,
ModuleCategory.REPORTER: REPORTER_CONFIG,
}
def get_category_config(category: ModuleCategory) -> CategoryBenchmarkConfig:
"""Get benchmark configuration for a category"""
return CATEGORY_CONFIGS[category]
def get_threshold(category: ModuleCategory, metric: str) -> float:
"""Get performance threshold for a specific metric"""
config = get_category_config(category)
return config.performance_thresholds.get(metric, 0.0)

View File

@@ -1,60 +0,0 @@
"""
Benchmark fixtures and configuration
"""
import sys
from pathlib import Path
import pytest
# Add parent directories to path
BACKEND_ROOT = Path(__file__).resolve().parents[1]
TOOLBOX = BACKEND_ROOT / "toolbox"
if str(BACKEND_ROOT) not in sys.path:
sys.path.insert(0, str(BACKEND_ROOT))
if str(TOOLBOX) not in sys.path:
sys.path.insert(0, str(TOOLBOX))
# ============================================================================
# Benchmark Fixtures
# ============================================================================
@pytest.fixture(scope="session")
def benchmark_fixtures_dir():
"""Path to benchmark fixtures directory"""
return Path(__file__).parent / "fixtures"
@pytest.fixture(scope="session")
def small_project_fixture(benchmark_fixtures_dir):
"""Small project fixture (~1K LOC)"""
return benchmark_fixtures_dir / "small"
@pytest.fixture(scope="session")
def medium_project_fixture(benchmark_fixtures_dir):
"""Medium project fixture (~10K LOC)"""
return benchmark_fixtures_dir / "medium"
@pytest.fixture(scope="session")
def large_project_fixture(benchmark_fixtures_dir):
"""Large project fixture (~100K LOC)"""
return benchmark_fixtures_dir / "large"
# ============================================================================
# pytest-benchmark Configuration
# ============================================================================
def pytest_configure(config):
"""Configure pytest-benchmark"""
config.addinivalue_line(
"markers", "benchmark: mark test as a benchmark"
)
def pytest_benchmark_group_stats(config, benchmarks, group_by):
"""Group benchmark results by category"""
return group_by

View File

@@ -1,122 +0,0 @@
{
"name": "FuzzForge Security Testing Platform",
"description": "MCP server for FuzzForge security testing workflows via Docker Compose",
"version": "0.6.0",
"connection": {
"type": "http",
"host": "localhost",
"port": 8010,
"base_url": "http://localhost:8010",
"mcp_endpoint": "/mcp"
},
"docker_compose": {
"service": "fuzzforge-backend",
"command": "docker compose up -d",
"health_check": "http://localhost:8000/health"
},
"capabilities": {
"tools": [
{
"name": "submit_security_scan_mcp",
"description": "Submit a security scanning workflow for execution",
"parameters": {
"workflow_name": "string",
"target_path": "string",
"volume_mode": "string (ro|rw)",
"parameters": "object"
}
},
{
"name": "get_comprehensive_scan_summary",
"description": "Get a comprehensive summary of scan results with analysis",
"parameters": {
"run_id": "string"
}
}
],
"fastapi_routes": [
{
"method": "GET",
"path": "/",
"description": "Get API status and loaded workflows count"
},
{
"method": "GET",
"path": "/workflows/",
"description": "List all available security testing workflows"
},
{
"method": "POST",
"path": "/workflows/{workflow_name}/submit",
"description": "Submit a security scanning workflow for execution"
},
{
"method": "GET",
"path": "/runs/{run_id}/status",
"description": "Get the current status of a security scan run"
},
{
"method": "GET",
"path": "/runs/{run_id}/findings",
"description": "Get security findings from a completed scan"
},
{
"method": "GET",
"path": "/fuzzing/{run_id}/stats",
"description": "Get fuzzing statistics for a run"
}
]
},
"examples": {
"start_infrastructure_scan": {
"description": "Run infrastructure security scan on a project",
"steps": [
"1. Start Docker Compose: docker compose up -d",
"2. Submit scan via MCP tool: submit_security_scan_mcp",
"3. Monitor status and get results"
],
"workflow_name": "infrastructure_scan",
"target_path": "/Users/tduhamel/Documents/FuzzingLabs/fuzzforge_alpha/test_projects/infrastructure_vulnerable",
"parameters": {
"checkov_config": {
"severity": ["HIGH", "MEDIUM", "LOW"]
},
"hadolint_config": {
"severity": ["error", "warning", "info", "style"]
}
}
},
"static_analysis_scan": {
"description": "Run static analysis security scan",
"workflow_name": "static_analysis_scan",
"target_path": "/Users/tduhamel/Documents/FuzzingLabs/fuzzforge_alpha/test_projects/static_analysis_vulnerable",
"parameters": {
"bandit_config": {
"severity": ["HIGH", "MEDIUM", "LOW"]
},
"opengrep_config": {
"severity": ["HIGH", "MEDIUM", "LOW"]
}
}
},
"secret_detection_scan": {
"description": "Run secret detection scan",
"workflow_name": "secret_detection_scan",
"target_path": "/Users/tduhamel/Documents/FuzzingLabs/fuzzforge_alpha/test_projects/secret_detection_vulnerable",
"parameters": {
"trufflehog_config": {
"verified_only": false
},
"gitleaks_config": {
"no_git": true
}
}
}
},
"usage": {
"via_mcp": "Connect MCP client to http://localhost:8010/mcp after starting Docker Compose",
"via_api": "Use FastAPI endpoints directly at http://localhost:8000",
"start_system": "docker compose up -d",
"stop_system": "docker compose down"
}
}

View File

@@ -1,41 +0,0 @@
[project]
name = "backend"
version = "0.7.0"
description = "FuzzForge OSS backend"
authors = []
readme = "README.md"
requires-python = ">=3.11"
dependencies = [
"fastapi>=0.116.1",
"temporalio>=1.6.0",
"boto3>=1.34.0",
"pydantic>=2.0.0",
"pyyaml>=6.0",
"docker>=7.0.0",
"aiofiles>=23.0.0",
"uvicorn>=0.30.0",
"aiohttp>=3.12.15",
"fastmcp",
]
[project.optional-dependencies]
dev = [
"pytest>=8.0.0",
"pytest-asyncio>=0.23.0",
"pytest-benchmark>=4.0.0",
"pytest-cov>=5.0.0",
"pytest-xdist>=3.5.0",
"pytest-mock>=3.12.0",
"httpx>=0.27.0",
"ruff>=0.1.0",
]
[tool.pytest.ini_options]
asyncio_mode = "auto"
testpaths = ["tests", "benchmarks"]
python_files = ["test_*.py", "bench_*.py"]
python_classes = ["Test*"]
python_functions = ["test_*"]
markers = [
"benchmark: mark test as a benchmark",
]

View File

@@ -1,11 +0,0 @@
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.

View File

@@ -1,11 +0,0 @@
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.

View File

@@ -1,325 +0,0 @@
"""
API endpoints for fuzzing workflow management and real-time monitoring
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import logging
from typing import List, Dict
from fastapi import APIRouter, HTTPException, WebSocket, WebSocketDisconnect
from fastapi.responses import StreamingResponse
import asyncio
import json
from datetime import datetime
from src.models.findings import (
FuzzingStats,
CrashReport
)
logger = logging.getLogger(__name__)
router = APIRouter(prefix="/fuzzing", tags=["fuzzing"])
# In-memory storage for real-time stats (in production, use Redis or similar)
fuzzing_stats: Dict[str, FuzzingStats] = {}
crash_reports: Dict[str, List[CrashReport]] = {}
active_connections: Dict[str, List[WebSocket]] = {}
def initialize_fuzzing_tracking(run_id: str, workflow_name: str):
"""
Initialize fuzzing tracking for a new run.
This function should be called when a workflow is submitted to enable
real-time monitoring and stats collection.
Args:
run_id: The run identifier
workflow_name: Name of the workflow
"""
fuzzing_stats[run_id] = FuzzingStats(
run_id=run_id,
workflow=workflow_name
)
crash_reports[run_id] = []
active_connections[run_id] = []
@router.get("/{run_id}/stats", response_model=FuzzingStats)
async def get_fuzzing_stats(run_id: str) -> FuzzingStats:
"""
Get current fuzzing statistics for a run.
Args:
run_id: The fuzzing run ID
Returns:
Current fuzzing statistics
Raises:
HTTPException: 404 if run not found
"""
if run_id not in fuzzing_stats:
raise HTTPException(
status_code=404,
detail=f"Fuzzing run not found: {run_id}"
)
return fuzzing_stats[run_id]
@router.get("/{run_id}/crashes", response_model=List[CrashReport])
async def get_crash_reports(run_id: str) -> List[CrashReport]:
"""
Get crash reports for a fuzzing run.
Args:
run_id: The fuzzing run ID
Returns:
List of crash reports
Raises:
HTTPException: 404 if run not found
"""
if run_id not in crash_reports:
raise HTTPException(
status_code=404,
detail=f"Fuzzing run not found: {run_id}"
)
return crash_reports[run_id]
@router.post("/{run_id}/stats")
async def update_fuzzing_stats(run_id: str, stats: FuzzingStats):
"""
Update fuzzing statistics (called by fuzzing workflows).
Args:
run_id: The fuzzing run ID
stats: Updated statistics
Raises:
HTTPException: 404 if run not found
"""
if run_id not in fuzzing_stats:
raise HTTPException(
status_code=404,
detail=f"Fuzzing run not found: {run_id}"
)
# Update stats
fuzzing_stats[run_id] = stats
# Debug: log reception for live instrumentation
try:
logger.info(
"Received fuzzing stats update: run_id=%s exec=%s eps=%.2f crashes=%s corpus=%s coverage=%s elapsed=%ss",
run_id,
stats.executions,
stats.executions_per_sec,
stats.crashes,
stats.corpus_size,
stats.coverage,
stats.elapsed_time,
)
except Exception:
pass
# Notify connected WebSocket clients
if run_id in active_connections:
message = {
"type": "stats_update",
"data": stats.model_dump()
}
for websocket in active_connections[run_id][:]: # Copy to avoid modification during iteration
try:
await websocket.send_text(json.dumps(message))
except Exception:
# Remove disconnected clients
active_connections[run_id].remove(websocket)
@router.post("/{run_id}/crash")
async def report_crash(run_id: str, crash: CrashReport):
"""
Report a new crash (called by fuzzing workflows).
Args:
run_id: The fuzzing run ID
crash: Crash report details
"""
if run_id not in crash_reports:
crash_reports[run_id] = []
# Add crash report
crash_reports[run_id].append(crash)
# Update stats
if run_id in fuzzing_stats:
fuzzing_stats[run_id].crashes += 1
fuzzing_stats[run_id].last_crash_time = crash.timestamp
# Notify connected WebSocket clients
if run_id in active_connections:
message = {
"type": "crash_report",
"data": crash.model_dump()
}
for websocket in active_connections[run_id][:]:
try:
await websocket.send_text(json.dumps(message))
except Exception:
active_connections[run_id].remove(websocket)
@router.websocket("/{run_id}/live")
async def websocket_endpoint(websocket: WebSocket, run_id: str):
"""
WebSocket endpoint for real-time fuzzing updates.
Args:
websocket: WebSocket connection
run_id: The fuzzing run ID to monitor
"""
await websocket.accept()
# Initialize connection tracking
if run_id not in active_connections:
active_connections[run_id] = []
active_connections[run_id].append(websocket)
try:
# Send current stats on connection
if run_id in fuzzing_stats:
current = fuzzing_stats[run_id]
if isinstance(current, dict):
payload = current
elif hasattr(current, "model_dump"):
payload = current.model_dump()
elif hasattr(current, "dict"):
payload = current.dict()
else:
payload = getattr(current, "__dict__", {"run_id": run_id})
message = {"type": "stats_update", "data": payload}
await websocket.send_text(json.dumps(message))
# Keep connection alive
while True:
try:
# Wait for ping or handle disconnect
data = await asyncio.wait_for(websocket.receive_text(), timeout=30.0)
# Echo back for ping-pong
if data == "ping":
await websocket.send_text("pong")
except asyncio.TimeoutError:
# Send periodic heartbeat
await websocket.send_text(json.dumps({"type": "heartbeat"}))
except WebSocketDisconnect:
# Clean up connection
if run_id in active_connections and websocket in active_connections[run_id]:
active_connections[run_id].remove(websocket)
except Exception as e:
logger.error(f"WebSocket error for run {run_id}: {e}")
if run_id in active_connections and websocket in active_connections[run_id]:
active_connections[run_id].remove(websocket)
@router.get("/{run_id}/stream")
async def stream_fuzzing_updates(run_id: str):
"""
Server-Sent Events endpoint for real-time fuzzing updates.
Args:
run_id: The fuzzing run ID to monitor
Returns:
Streaming response with real-time updates
"""
if run_id not in fuzzing_stats:
raise HTTPException(
status_code=404,
detail=f"Fuzzing run not found: {run_id}"
)
async def event_stream():
"""Generate server-sent events for fuzzing updates"""
last_stats_time = datetime.utcnow()
while True:
try:
# Send current stats
if run_id in fuzzing_stats:
current_stats = fuzzing_stats[run_id]
if isinstance(current_stats, dict):
stats_payload = current_stats
elif hasattr(current_stats, "model_dump"):
stats_payload = current_stats.model_dump()
elif hasattr(current_stats, "dict"):
stats_payload = current_stats.dict()
else:
stats_payload = getattr(current_stats, "__dict__", {"run_id": run_id})
event_data = f"data: {json.dumps({'type': 'stats', 'data': stats_payload})}\n\n"
yield event_data
# Send recent crashes
if run_id in crash_reports:
recent_crashes = [
crash for crash in crash_reports[run_id]
if crash.timestamp > last_stats_time
]
for crash in recent_crashes:
event_data = f"data: {json.dumps({'type': 'crash', 'data': crash.model_dump()})}\n\n"
yield event_data
last_stats_time = datetime.utcnow()
await asyncio.sleep(5) # Update every 5 seconds
except Exception as e:
logger.error(f"Error in event stream for run {run_id}: {e}")
break
return StreamingResponse(
event_stream(),
media_type="text/event-stream",
headers={
"Cache-Control": "no-cache",
"Connection": "keep-alive",
}
)
@router.delete("/{run_id}")
async def cleanup_fuzzing_run(run_id: str):
"""
Clean up fuzzing run data.
Args:
run_id: The fuzzing run ID to clean up
"""
# Clean up tracking data
fuzzing_stats.pop(run_id, None)
crash_reports.pop(run_id, None)
# Close any active WebSocket connections
if run_id in active_connections:
for websocket in active_connections[run_id]:
try:
await websocket.close()
except Exception:
pass
del active_connections[run_id]
return {"message": f"Cleaned up fuzzing run {run_id}"}

View File

@@ -1,183 +0,0 @@
"""
API endpoints for workflow run management and findings retrieval
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import logging
from fastapi import APIRouter, HTTPException, Depends
from src.models.findings import WorkflowFindings, WorkflowStatus
logger = logging.getLogger(__name__)
router = APIRouter(prefix="/runs", tags=["runs"])
def get_temporal_manager():
"""Dependency to get the Temporal manager instance"""
from src.main import temporal_mgr
return temporal_mgr
@router.get("/{run_id}/status", response_model=WorkflowStatus)
async def get_run_status(
run_id: str,
temporal_mgr=Depends(get_temporal_manager)
) -> WorkflowStatus:
"""
Get the current status of a workflow run.
Args:
run_id: The workflow run ID
Returns:
Status information including state, timestamps, and completion flags
Raises:
HTTPException: 404 if run not found
"""
try:
status = await temporal_mgr.get_workflow_status(run_id)
# Map Temporal status to response format
workflow_status = status.get("status", "UNKNOWN")
is_completed = workflow_status in ["COMPLETED", "FAILED", "CANCELLED"]
is_failed = workflow_status == "FAILED"
is_running = workflow_status == "RUNNING"
# Extract workflow name from run_id (format: workflow_name-unique_id)
workflow_name = run_id.rsplit('-', 1)[0] if '-' in run_id else "unknown"
return WorkflowStatus(
run_id=run_id,
workflow=workflow_name,
status=workflow_status,
is_completed=is_completed,
is_failed=is_failed,
is_running=is_running,
created_at=status.get("start_time"),
updated_at=status.get("close_time") or status.get("execution_time")
)
except Exception as e:
logger.error(f"Failed to get status for run {run_id}: {e}")
raise HTTPException(
status_code=404,
detail=f"Run not found: {run_id}"
)
@router.get("/{run_id}/findings", response_model=WorkflowFindings)
async def get_run_findings(
run_id: str,
temporal_mgr=Depends(get_temporal_manager)
) -> WorkflowFindings:
"""
Get the findings from a completed workflow run.
Args:
run_id: The workflow run ID
Returns:
SARIF-formatted findings from the workflow execution
Raises:
HTTPException: 404 if run not found, 400 if run not completed
"""
try:
# Get run status first
status = await temporal_mgr.get_workflow_status(run_id)
workflow_status = status.get("status", "UNKNOWN")
if workflow_status not in ["COMPLETED", "FAILED", "CANCELLED"]:
if workflow_status == "RUNNING":
raise HTTPException(
status_code=400,
detail=f"Run {run_id} is still running. Current status: {workflow_status}"
)
else:
raise HTTPException(
status_code=400,
detail=f"Run {run_id} not completed. Status: {workflow_status}"
)
if workflow_status == "FAILED":
raise HTTPException(
status_code=400,
detail=f"Run {run_id} failed. Status: {workflow_status}"
)
# Get the workflow result
result = await temporal_mgr.get_workflow_result(run_id)
# Extract SARIF from result (handle None for backwards compatibility)
if isinstance(result, dict):
sarif = result.get("sarif") or {}
else:
sarif = {}
# Extract workflow name from run_id (format: workflow_name-unique_id)
workflow_name = run_id.rsplit('-', 1)[0] if '-' in run_id else "unknown"
# Metadata
metadata = {
"completion_time": status.get("close_time"),
"workflow_version": "unknown"
}
return WorkflowFindings(
workflow=workflow_name,
run_id=run_id,
sarif=sarif,
metadata=metadata
)
except HTTPException:
raise
except Exception as e:
logger.error(f"Failed to get findings for run {run_id}: {e}")
raise HTTPException(
status_code=500,
detail=f"Failed to retrieve findings: {str(e)}"
)
@router.get("/{workflow_name}/findings/{run_id}", response_model=WorkflowFindings)
async def get_workflow_findings(
workflow_name: str,
run_id: str,
temporal_mgr=Depends(get_temporal_manager)
) -> WorkflowFindings:
"""
Get findings for a specific workflow run.
Alternative endpoint that includes workflow name in the path for clarity.
Args:
workflow_name: Name of the workflow
run_id: The workflow run ID
Returns:
SARIF-formatted findings from the workflow execution
Raises:
HTTPException: 404 if workflow or run not found, 400 if run not completed
"""
if workflow_name not in temporal_mgr.workflows:
raise HTTPException(
status_code=404,
detail=f"Workflow not found: {workflow_name}"
)
# Delegate to the main findings endpoint
return await get_run_findings(run_id, temporal_mgr)
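The two endpoints above support a simple poll-then-fetch pattern. A minimal sketch, assuming httpx and a placeholder run_id:

# Poll run status until completion, then fetch the SARIF findings.
import time
import httpx

BASE_URL = "http://localhost:8000"
run_id = "security_assessment-1234"  # placeholder identifier

while True:
    status = httpx.get(f"{BASE_URL}/runs/{run_id}/status").json()
    if status["is_completed"]:
        break
    time.sleep(5)

if not status["is_failed"]:
    findings = httpx.get(f"{BASE_URL}/runs/{run_id}/findings").json()
    print("SARIF runs:", len(findings["sarif"].get("runs", [])))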

View File

@@ -1,634 +0,0 @@
"""
API endpoints for workflow management with enhanced error handling
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import logging
import traceback
import tempfile
from typing import List, Dict, Any, Optional
from fastapi import APIRouter, HTTPException, Depends, UploadFile, File, Form
from pathlib import Path
from src.models.findings import (
WorkflowSubmission,
WorkflowMetadata,
WorkflowListItem,
RunSubmissionResponse
)
from src.temporal.discovery import WorkflowDiscovery
logger = logging.getLogger(__name__)
# Configuration for file uploads
MAX_UPLOAD_SIZE = 10 * 1024 * 1024 * 1024 # 10 GB
ALLOWED_CONTENT_TYPES = [
"application/gzip",
"application/x-gzip",
"application/x-tar",
"application/x-compressed-tar",
"application/octet-stream", # Generic binary
]
router = APIRouter(prefix="/workflows", tags=["workflows"])
def create_structured_error_response(
error_type: str,
message: str,
workflow_name: Optional[str] = None,
run_id: Optional[str] = None,
container_info: Optional[Dict[str, Any]] = None,
deployment_info: Optional[Dict[str, Any]] = None,
suggestions: Optional[List[str]] = None
) -> Dict[str, Any]:
"""Create a structured error response with rich context."""
error_response = {
"error": {
"type": error_type,
"message": message,
"timestamp": __import__("datetime").datetime.utcnow().isoformat() + "Z"
}
}
if workflow_name:
error_response["error"]["workflow_name"] = workflow_name
if run_id:
error_response["error"]["run_id"] = run_id
if container_info:
error_response["error"]["container"] = container_info
if deployment_info:
error_response["error"]["deployment"] = deployment_info
if suggestions:
error_response["error"]["suggestions"] = suggestions
return error_response
def get_temporal_manager():
"""Dependency to get the Temporal manager instance"""
from src.main import temporal_mgr
return temporal_mgr
@router.get("/", response_model=List[WorkflowListItem])
async def list_workflows(
temporal_mgr=Depends(get_temporal_manager)
) -> List[WorkflowListItem]:
"""
List all discovered workflows with their metadata.
Returns a summary of each workflow including name, version, description,
author, and tags.
"""
workflows = []
for name, info in temporal_mgr.workflows.items():
workflows.append(WorkflowListItem(
name=name,
version=info.metadata.get("version", "0.6.0"),
description=info.metadata.get("description", ""),
author=info.metadata.get("author"),
tags=info.metadata.get("tags", [])
))
return workflows
@router.get("/metadata/schema")
async def get_metadata_schema() -> Dict[str, Any]:
"""
Get the JSON schema for workflow metadata files.
This schema defines the structure and requirements for metadata.yaml files
that must accompany each workflow.
"""
return WorkflowDiscovery.get_metadata_schema()
@router.get("/{workflow_name}/metadata", response_model=WorkflowMetadata)
async def get_workflow_metadata(
workflow_name: str,
temporal_mgr=Depends(get_temporal_manager)
) -> WorkflowMetadata:
"""
Get complete metadata for a specific workflow.
Args:
workflow_name: Name of the workflow
Returns:
Complete metadata including parameters schema, supported volume modes,
required modules, and more.
Raises:
HTTPException: 404 if workflow not found
"""
if workflow_name not in temporal_mgr.workflows:
available_workflows = list(temporal_mgr.workflows.keys())
error_response = create_structured_error_response(
error_type="WorkflowNotFound",
message=f"Workflow '{workflow_name}' not found",
workflow_name=workflow_name,
suggestions=[
f"Available workflows: {', '.join(available_workflows)}",
"Use GET /workflows/ to see all available workflows",
"Check workflow name spelling and case sensitivity"
]
)
raise HTTPException(
status_code=404,
detail=error_response
)
info = temporal_mgr.workflows[workflow_name]
metadata = info.metadata
return WorkflowMetadata(
name=workflow_name,
version=metadata.get("version", "0.6.0"),
description=metadata.get("description", ""),
author=metadata.get("author"),
tags=metadata.get("tags", []),
parameters=metadata.get("parameters", {}),
default_parameters=metadata.get("default_parameters", {}),
required_modules=metadata.get("required_modules", [])
)
@router.post("/{workflow_name}/submit", response_model=RunSubmissionResponse)
async def submit_workflow(
workflow_name: str,
submission: WorkflowSubmission,
temporal_mgr=Depends(get_temporal_manager)
) -> RunSubmissionResponse:
"""
Submit a workflow for execution.
Args:
workflow_name: Name of the workflow to execute
submission: Submission parameters including target path and parameters
Returns:
Run submission response with run_id and initial status
Raises:
HTTPException: 404 if workflow not found, 400 for invalid parameters
"""
if workflow_name not in temporal_mgr.workflows:
available_workflows = list(temporal_mgr.workflows.keys())
error_response = create_structured_error_response(
error_type="WorkflowNotFound",
message=f"Workflow '{workflow_name}' not found",
workflow_name=workflow_name,
suggestions=[
f"Available workflows: {', '.join(available_workflows)}",
"Use GET /workflows/ to see all available workflows",
"Check workflow name spelling and case sensitivity"
]
)
raise HTTPException(
status_code=404,
detail=error_response
)
try:
# Upload target file to MinIO and get target_id
target_path = Path(submission.target_path)
if not target_path.exists():
raise ValueError(f"Target path does not exist: {submission.target_path}")
# Upload target (using anonymous user for now)
target_id = await temporal_mgr.upload_target(
file_path=target_path,
user_id="api-user",
metadata={"workflow": workflow_name}
)
# Merge default parameters with user parameters
workflow_info = temporal_mgr.workflows[workflow_name]
metadata = workflow_info.metadata or {}
defaults = metadata.get("default_parameters", {})
user_params = submission.parameters or {}
workflow_params = {**defaults, **user_params}
# Start workflow execution
handle = await temporal_mgr.run_workflow(
workflow_name=workflow_name,
target_id=target_id,
workflow_params=workflow_params
)
run_id = handle.id
# Initialize fuzzing tracking if this looks like a fuzzing workflow
workflow_info = temporal_mgr.workflows.get(workflow_name, {})
workflow_tags = workflow_info.metadata.get("tags", []) if hasattr(workflow_info, 'metadata') else []
if "fuzzing" in workflow_tags or "fuzz" in workflow_name.lower():
from src.api.fuzzing import initialize_fuzzing_tracking
initialize_fuzzing_tracking(run_id, workflow_name)
return RunSubmissionResponse(
run_id=run_id,
status="RUNNING",
workflow=workflow_name,
message=f"Workflow '{workflow_name}' submitted successfully"
)
except ValueError as e:
# Parameter validation errors
error_response = create_structured_error_response(
error_type="ValidationError",
message=str(e),
workflow_name=workflow_name,
suggestions=[
"Check parameter types and values",
"Use GET /workflows/{workflow_name}/parameters for schema",
"Ensure all required parameters are provided"
]
)
raise HTTPException(status_code=400, detail=error_response)
except Exception as e:
logger.error(f"Failed to submit workflow '{workflow_name}': {e}")
logger.error(f"Traceback: {traceback.format_exc()}")
# Try to get more context about the error
container_info = None
deployment_info = None
suggestions = []
error_message = str(e)
error_type = "WorkflowSubmissionError"
# Detect specific error patterns
if "workflow" in error_message.lower() and "not found" in error_message.lower():
error_type = "WorkflowError"
suggestions.extend([
"Check if Temporal server is running and accessible",
"Verify workflow workers are running",
"Check if workflow is registered with correct vertical",
"Ensure Docker is running and has sufficient resources"
])
elif "volume" in error_message.lower() or "mount" in error_message.lower():
error_type = "VolumeError"
suggestions.extend([
"Check if the target path exists and is accessible",
"Verify file permissions (Docker needs read access)",
"Ensure the path is not in use by another process",
"Try using an absolute path instead of relative path"
])
elif "memory" in error_message.lower() or "resource" in error_message.lower():
error_type = "ResourceError"
suggestions.extend([
"Check system memory and CPU availability",
"Consider reducing resource limits or dataset size",
"Monitor Docker resource usage",
"Increase Docker memory limits if needed"
])
elif "image" in error_message.lower():
error_type = "ImageError"
suggestions.extend([
"Check if the workflow image exists",
"Verify Docker registry access",
"Try rebuilding the workflow image",
"Check network connectivity to registries"
])
else:
suggestions.extend([
"Check FuzzForge backend logs for details",
"Verify all services are running (docker-compose up -d)",
"Try restarting the workflow deployment",
"Contact support if the issue persists"
])
error_response = create_structured_error_response(
error_type=error_type,
message=f"Failed to submit workflow: {error_message}",
workflow_name=workflow_name,
container_info=container_info,
deployment_info=deployment_info,
suggestions=suggestions
)
raise HTTPException(
status_code=500,
detail=error_response
)
@router.post("/{workflow_name}/upload-and-submit", response_model=RunSubmissionResponse)
async def upload_and_submit_workflow(
workflow_name: str,
file: UploadFile = File(..., description="Target file or tarball to analyze"),
parameters: Optional[str] = Form(None, description="JSON-encoded workflow parameters"),
timeout: Optional[int] = Form(None, description="Timeout in seconds"),
temporal_mgr=Depends(get_temporal_manager)
) -> RunSubmissionResponse:
"""
Upload a target file/tarball and submit workflow for execution.
This endpoint accepts multipart/form-data uploads and is the recommended
way to submit workflows from remote CLI clients.
Args:
workflow_name: Name of the workflow to execute
file: Target file or tarball (compressed directory)
parameters: JSON string of workflow parameters (optional)
timeout: Execution timeout in seconds (optional)
Returns:
Run submission response with run_id and initial status
Raises:
HTTPException: 404 if workflow not found, 400 for invalid parameters,
413 if file too large
"""
if workflow_name not in temporal_mgr.workflows:
available_workflows = list(temporal_mgr.workflows.keys())
error_response = create_structured_error_response(
error_type="WorkflowNotFound",
message=f"Workflow '{workflow_name}' not found",
workflow_name=workflow_name,
suggestions=[
f"Available workflows: {', '.join(available_workflows)}",
"Use GET /workflows/ to see all available workflows"
]
)
raise HTTPException(status_code=404, detail=error_response)
temp_file_path = None
try:
# Validate file size
file_size = 0
chunk_size = 1024 * 1024 # 1MB chunks
# Create temporary file
temp_fd, temp_file_path = tempfile.mkstemp(suffix=".tar.gz")
logger.info(f"Receiving file upload for workflow '{workflow_name}': {file.filename}")
# Stream file to disk
with open(temp_fd, 'wb') as temp_file:
while True:
chunk = await file.read(chunk_size)
if not chunk:
break
file_size += len(chunk)
# Check size limit
if file_size > MAX_UPLOAD_SIZE:
raise HTTPException(
status_code=413,
detail=create_structured_error_response(
error_type="FileTooLarge",
message=f"File size exceeds maximum allowed size of {MAX_UPLOAD_SIZE / (1024**3):.1f} GB",
workflow_name=workflow_name,
suggestions=[
"Reduce the size of your target directory",
"Exclude unnecessary files (build artifacts, dependencies, etc.)",
"Consider splitting into smaller analysis targets"
]
)
)
temp_file.write(chunk)
logger.info(f"Received file: {file_size / (1024**2):.2f} MB")
# Parse parameters
workflow_params = {}
if parameters:
try:
import json
workflow_params = json.loads(parameters)
if not isinstance(workflow_params, dict):
raise ValueError("Parameters must be a JSON object")
except (json.JSONDecodeError, ValueError) as e:
raise HTTPException(
status_code=400,
detail=create_structured_error_response(
error_type="InvalidParameters",
message=f"Invalid parameters JSON: {e}",
workflow_name=workflow_name,
suggestions=["Ensure parameters is valid JSON object"]
)
)
# Upload to MinIO
target_id = await temporal_mgr.upload_target(
file_path=Path(temp_file_path),
user_id="api-user",
metadata={
"workflow": workflow_name,
"original_filename": file.filename,
"upload_method": "multipart"
}
)
logger.info(f"Uploaded to MinIO with target_id: {target_id}")
# Merge default parameters with user parameters
workflow_info = temporal_mgr.workflows.get(workflow_name)
metadata = workflow_info.metadata or {}
defaults = metadata.get("default_parameters", {})
workflow_params = {**defaults, **workflow_params}
# Start workflow execution
handle = await temporal_mgr.run_workflow(
workflow_name=workflow_name,
target_id=target_id,
workflow_params=workflow_params
)
run_id = handle.id
# Initialize fuzzing tracking if needed
workflow_info = temporal_mgr.workflows.get(workflow_name, {})
workflow_tags = workflow_info.metadata.get("tags", []) if hasattr(workflow_info, 'metadata') else []
if "fuzzing" in workflow_tags or "fuzz" in workflow_name.lower():
from src.api.fuzzing import initialize_fuzzing_tracking
initialize_fuzzing_tracking(run_id, workflow_name)
return RunSubmissionResponse(
run_id=run_id,
status="RUNNING",
workflow=workflow_name,
message=f"Workflow '{workflow_name}' submitted successfully with uploaded target"
)
except HTTPException:
raise
except Exception as e:
logger.error(f"Failed to upload and submit workflow '{workflow_name}': {e}")
logger.error(f"Traceback: {traceback.format_exc()}")
error_response = create_structured_error_response(
error_type="WorkflowSubmissionError",
message=f"Failed to process upload and submit workflow: {str(e)}",
workflow_name=workflow_name,
suggestions=[
"Check if the uploaded file is a valid tarball",
"Verify MinIO storage is accessible",
"Check backend logs for detailed error information",
"Ensure Temporal workers are running"
]
)
raise HTTPException(status_code=500, detail=error_response)
finally:
# Cleanup temporary file
if temp_file_path and Path(temp_file_path).exists():
try:
Path(temp_file_path).unlink()
logger.debug(f"Cleaned up temp file: {temp_file_path}")
except Exception as e:
logger.warning(f"Failed to cleanup temp file {temp_file_path}: {e}")
@router.get("/{workflow_name}/worker-info")
async def get_workflow_worker_info(
workflow_name: str,
temporal_mgr=Depends(get_temporal_manager)
) -> Dict[str, Any]:
"""
Get worker information for a workflow.
Returns details about which worker is required to execute this workflow,
including container name, task queue, and vertical.
Args:
workflow_name: Name of the workflow
Returns:
Worker information including vertical, container name, and task queue
Raises:
HTTPException: 404 if workflow not found
"""
if workflow_name not in temporal_mgr.workflows:
available_workflows = list(temporal_mgr.workflows.keys())
error_response = create_structured_error_response(
error_type="WorkflowNotFound",
message=f"Workflow '{workflow_name}' not found",
workflow_name=workflow_name,
suggestions=[
f"Available workflows: {', '.join(available_workflows)}",
"Use GET /workflows/ to see all available workflows"
]
)
raise HTTPException(
status_code=404,
detail=error_response
)
info = temporal_mgr.workflows[workflow_name]
metadata = info.metadata
# Extract vertical from metadata
vertical = metadata.get("vertical")
if not vertical:
error_response = create_structured_error_response(
error_type="MissingVertical",
message=f"Workflow '{workflow_name}' does not specify a vertical in metadata",
workflow_name=workflow_name,
suggestions=[
"Check workflow metadata.yaml for 'vertical' field",
"Contact workflow author for support"
]
)
raise HTTPException(
status_code=500,
detail=error_response
)
return {
"workflow": workflow_name,
"vertical": vertical,
"worker_service": f"worker-{vertical}",
"task_queue": f"{vertical}-queue",
"required": True
}
@router.get("/{workflow_name}/parameters")
async def get_workflow_parameters(
workflow_name: str,
temporal_mgr=Depends(get_temporal_manager)
) -> Dict[str, Any]:
"""
Get the parameters schema for a workflow.
Args:
workflow_name: Name of the workflow
Returns:
Parameters schema with types, descriptions, and defaults
Raises:
HTTPException: 404 if workflow not found
"""
if workflow_name not in temporal_mgr.workflows:
available_workflows = list(temporal_mgr.workflows.keys())
error_response = create_structured_error_response(
error_type="WorkflowNotFound",
message=f"Workflow '{workflow_name}' not found",
workflow_name=workflow_name,
suggestions=[
f"Available workflows: {', '.join(available_workflows)}",
"Use GET /workflows/ to see all available workflows"
]
)
raise HTTPException(
status_code=404,
detail=error_response
)
info = temporal_mgr.workflows[workflow_name]
metadata = info.metadata
# Return parameters with enhanced schema information
parameters_schema = metadata.get("parameters", {})
# Extract the actual parameter definitions from JSON schema structure
if "properties" in parameters_schema:
param_definitions = parameters_schema["properties"]
else:
param_definitions = parameters_schema
# Add default values to the schema
default_params = metadata.get("default_parameters", {})
for param_name, param_schema in param_definitions.items():
if isinstance(param_schema, dict) and param_name in default_params:
param_schema["default"] = default_params[param_name]
return {
"workflow": workflow_name,
"parameters": param_definitions,
"default_parameters": default_params,
"required_parameters": [
name for name, schema in param_definitions.items()
if isinstance(schema, dict) and schema.get("required", False)
]
}
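For remote clients, the multipart endpoint above is the intended entry point. A hedged client-side sketch, assuming httpx, a pre-packed tarball, and the secret_detection_scan parameters shown earlier in this changeset:

# Upload a tarball and submit it to a workflow in one request.
import json
import httpx

BASE_URL = "http://localhost:8000"
workflow = "secret_detection_scan"

with open("target.tar.gz", "rb") as fh:  # tarball of the target directory (illustrative path)
    response = httpx.post(
        f"{BASE_URL}/workflows/{workflow}/upload-and-submit",
        files={"file": ("target.tar.gz", fh, "application/gzip")},
        data={"parameters": json.dumps({"gitleaks_config": {"no_git": True}})},
        timeout=None,
    )

run_id = response.json()["run_id"]  # feeds the /runs and /fuzzing endpoints
print(run_id)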

View File

@@ -1,11 +0,0 @@
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.

View File

@@ -1,45 +0,0 @@
"""
Setup utilities for FuzzForge infrastructure
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import logging
logger = logging.getLogger(__name__)
async def setup_result_storage():
"""
Setup result storage (MinIO).
MinIO is used for both target upload and result storage.
This is a placeholder for any MinIO-specific setup that may be needed.

"""
logger.info("Result storage (MinIO) configured")
# MinIO is configured via environment variables in docker-compose
# No additional setup needed here
return True
async def validate_infrastructure():
"""
Validate all required infrastructure components.
This should be called during startup to ensure everything is ready.
"""
logger.info("Validating infrastructure...")
# Setup storage (MinIO)
await setup_result_storage()
logger.info("Infrastructure validation completed")

View File

@@ -1,725 +0,0 @@
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import asyncio
import logging
import os
from contextlib import AsyncExitStack, asynccontextmanager, suppress
from typing import Any, Dict, Optional, List
import uvicorn
from fastapi import FastAPI
from starlette.applications import Starlette
from starlette.routing import Mount
from fastmcp.server.http import create_sse_app
from src.temporal.manager import TemporalManager
from src.core.setup import setup_result_storage, validate_infrastructure
from src.api import workflows, runs, fuzzing
from fastmcp import FastMCP
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
temporal_mgr = TemporalManager()
class TemporalBootstrapState:
"""Tracks Temporal initialization progress for API and MCP consumers."""
def __init__(self) -> None:
self.ready: bool = False
self.status: str = "not_started"
self.last_error: Optional[str] = None
self.task_running: bool = False
def as_dict(self) -> Dict[str, Any]:
return {
"ready": self.ready,
"status": self.status,
"last_error": self.last_error,
"task_running": self.task_running,
}
temporal_bootstrap_state = TemporalBootstrapState()
# Configure retry strategy for bootstrapping Temporal + infrastructure
STARTUP_RETRY_SECONDS = max(1, int(os.getenv("FUZZFORGE_STARTUP_RETRY_SECONDS", "5")))
STARTUP_RETRY_MAX_SECONDS = max(
STARTUP_RETRY_SECONDS,
int(os.getenv("FUZZFORGE_STARTUP_RETRY_MAX_SECONDS", "60")),
)
temporal_bootstrap_task: Optional[asyncio.Task] = None
# ---------------------------------------------------------------------------
# FastAPI application (REST API)
# ---------------------------------------------------------------------------
app = FastAPI(
title="FuzzForge API",
description="Security testing workflow orchestration API with fuzzing support",
version="0.6.0",
)
app.include_router(workflows.router)
app.include_router(runs.router)
app.include_router(fuzzing.router)
def get_temporal_status() -> Dict[str, Any]:
"""Return a snapshot of Temporal bootstrap state for diagnostics."""
status = temporal_bootstrap_state.as_dict()
status["workflows_loaded"] = len(temporal_mgr.workflows)
status["bootstrap_task_running"] = (
temporal_bootstrap_task is not None and not temporal_bootstrap_task.done()
)
return status
def _temporal_not_ready_status() -> Optional[Dict[str, Any]]:
"""Return status details if Temporal is not ready yet."""
status = get_temporal_status()
if status.get("ready"):
return None
return status
@app.get("/")
async def root() -> Dict[str, Any]:
status = get_temporal_status()
return {
"name": "FuzzForge API",
"version": "0.6.0",
"status": "ready" if status.get("ready") else "initializing",
"workflows_loaded": status.get("workflows_loaded", 0),
"temporal": status,
}
@app.get("/health")
async def health() -> Dict[str, str]:
status = get_temporal_status()
health_status = "healthy" if status.get("ready") else "initializing"
return {"status": health_status}
# Map FastAPI OpenAPI operationIds to readable MCP tool names
FASTAPI_MCP_NAME_OVERRIDES: Dict[str, str] = {
"list_workflows_workflows__get": "api_list_workflows",
"get_metadata_schema_workflows_metadata_schema_get": "api_get_metadata_schema",
"get_workflow_metadata_workflows__workflow_name__metadata_get": "api_get_workflow_metadata",
"submit_workflow_workflows__workflow_name__submit_post": "api_submit_workflow",
"get_workflow_parameters_workflows__workflow_name__parameters_get": "api_get_workflow_parameters",
"get_run_status_runs__run_id__status_get": "api_get_run_status",
"get_run_findings_runs__run_id__findings_get": "api_get_run_findings",
"get_workflow_findings_runs__workflow_name__findings__run_id__get": "api_get_workflow_findings",
"get_fuzzing_stats_fuzzing__run_id__stats_get": "api_get_fuzzing_stats",
"update_fuzzing_stats_fuzzing__run_id__stats_post": "api_update_fuzzing_stats",
"get_crash_reports_fuzzing__run_id__crashes_get": "api_get_crash_reports",
"report_crash_fuzzing__run_id__crash_post": "api_report_crash",
"stream_fuzzing_updates_fuzzing__run_id__stream_get": "api_stream_fuzzing_updates",
"cleanup_fuzzing_run_fuzzing__run_id__delete": "api_cleanup_fuzzing_run",
"root__get": "api_root",
"health_health_get": "api_health",
}
# Create an MCP adapter exposing all FastAPI endpoints via OpenAPI parsing
FASTAPI_MCP_ADAPTER = FastMCP.from_fastapi(
app,
name="FuzzForge FastAPI",
mcp_names=FASTAPI_MCP_NAME_OVERRIDES,
)
_fastapi_mcp_imported = False
# ---------------------------------------------------------------------------
# FastMCP server (runs on dedicated port outside FastAPI)
# ---------------------------------------------------------------------------
mcp = FastMCP(name="FuzzForge MCP")
async def _bootstrap_temporal_with_retries() -> None:
"""Initialize Temporal infrastructure with exponential backoff retries."""
attempt = 0
while True:
attempt += 1
temporal_bootstrap_state.task_running = True
temporal_bootstrap_state.status = "starting"
temporal_bootstrap_state.ready = False
temporal_bootstrap_state.last_error = None
try:
logger.info("Bootstrapping Temporal infrastructure...")
await validate_infrastructure()
await setup_result_storage()
await temporal_mgr.initialize()
temporal_bootstrap_state.ready = True
temporal_bootstrap_state.status = "ready"
temporal_bootstrap_state.task_running = False
logger.info("Temporal infrastructure ready")
return
except asyncio.CancelledError:
temporal_bootstrap_state.status = "cancelled"
temporal_bootstrap_state.task_running = False
logger.info("Temporal bootstrap task cancelled")
raise
except Exception as exc: # pragma: no cover - defensive logging on infra startup
logger.exception("Temporal bootstrap failed")
temporal_bootstrap_state.ready = False
temporal_bootstrap_state.status = "error"
temporal_bootstrap_state.last_error = str(exc)
# Ensure partial initialization does not leave stale state behind
temporal_mgr.workflows.clear()
wait_time = min(
STARTUP_RETRY_SECONDS * (2 ** (attempt - 1)),
STARTUP_RETRY_MAX_SECONDS,
)
logger.info("Retrying Temporal bootstrap in %s second(s)", wait_time)
try:
await asyncio.sleep(wait_time)
except asyncio.CancelledError:
temporal_bootstrap_state.status = "cancelled"
temporal_bootstrap_state.task_running = False
raise
def _lookup_workflow(workflow_name: str):
info = temporal_mgr.workflows.get(workflow_name)
if not info:
return None
metadata = info.metadata
defaults = metadata.get("default_parameters", {})
default_target_path = metadata.get("default_target_path") or defaults.get("target_path")
supported_modes = metadata.get("supported_volume_modes") or ["ro", "rw"]
if not isinstance(supported_modes, list) or not supported_modes:
supported_modes = ["ro", "rw"]
default_volume_mode = (
metadata.get("default_volume_mode")
or defaults.get("volume_mode")
or supported_modes[0]
)
return {
"name": workflow_name,
"version": metadata.get("version", "0.6.0"),
"description": metadata.get("description", ""),
"author": metadata.get("author"),
"tags": metadata.get("tags", []),
"parameters": metadata.get("parameters", {}),
"default_parameters": metadata.get("default_parameters", {}),
"required_modules": metadata.get("required_modules", []),
"supported_volume_modes": supported_modes,
"default_target_path": default_target_path,
"default_volume_mode": default_volume_mode
}
@mcp.tool
async def list_workflows_mcp() -> Dict[str, Any]:
"""List all discovered workflows and their metadata summary."""
not_ready = _temporal_not_ready_status()
if not_ready:
return {
"workflows": [],
"temporal": not_ready,
"message": "Temporal infrastructure is still initializing",
}
workflows_summary = []
for name, info in temporal_mgr.workflows.items():
metadata = info.metadata
defaults = metadata.get("default_parameters", {})
workflows_summary.append({
"name": name,
"version": metadata.get("version", "0.6.0"),
"description": metadata.get("description", ""),
"author": metadata.get("author"),
"tags": metadata.get("tags", []),
"supported_volume_modes": metadata.get("supported_volume_modes", ["ro", "rw"]),
"default_volume_mode": metadata.get("default_volume_mode")
or defaults.get("volume_mode")
or "ro",
"default_target_path": metadata.get("default_target_path")
or defaults.get("target_path")
})
return {"workflows": workflows_summary, "temporal": get_temporal_status()}
@mcp.tool
async def get_workflow_metadata_mcp(workflow_name: str) -> Dict[str, Any]:
"""Fetch detailed metadata for a workflow."""
not_ready = _temporal_not_ready_status()
if not_ready:
return {
"error": "Temporal infrastructure not ready",
"temporal": not_ready,
}
data = _lookup_workflow(workflow_name)
if not data:
return {"error": f"Workflow not found: {workflow_name}"}
return data
@mcp.tool
async def get_workflow_parameters_mcp(workflow_name: str) -> Dict[str, Any]:
"""Return the parameter schema and defaults for a workflow."""
not_ready = _temporal_not_ready_status()
if not_ready:
return {
"error": "Temporal infrastructure not ready",
"temporal": not_ready,
}
data = _lookup_workflow(workflow_name)
if not data:
return {"error": f"Workflow not found: {workflow_name}"}
return {
"parameters": data.get("parameters", {}),
"defaults": data.get("default_parameters", {}),
}
@mcp.tool
async def get_workflow_metadata_schema_mcp() -> Dict[str, Any]:
"""Return the JSON schema describing workflow metadata files."""
from src.temporal.discovery import WorkflowDiscovery
return WorkflowDiscovery.get_metadata_schema()
@mcp.tool
async def submit_security_scan_mcp(
workflow_name: str,
target_id: str,
parameters: Dict[str, Any] | None = None,
) -> Dict[str, Any] | Dict[str, str]:
"""Submit a Temporal workflow via MCP."""
try:
not_ready = _temporal_not_ready_status()
if not_ready:
return {
"error": "Temporal infrastructure not ready",
"temporal": not_ready,
}
workflow_info = temporal_mgr.workflows.get(workflow_name)
if not workflow_info:
return {"error": f"Workflow '{workflow_name}' not found"}
metadata = workflow_info.metadata or {}
defaults = metadata.get("default_parameters", {})
parameters = parameters or {}
cleaned_parameters: Dict[str, Any] = {**defaults, **parameters}
# Ensure *_config structures default to dicts
for key, value in list(cleaned_parameters.items()):
if isinstance(key, str) and key.endswith("_config") and value is None:
cleaned_parameters[key] = {}
# Some workflows expect configuration dictionaries even when the caller omits them
parameter_definitions = (
metadata.get("parameters", {}).get("properties", {})
if isinstance(metadata.get("parameters"), dict)
else {}
)
for key, definition in parameter_definitions.items():
if not isinstance(key, str) or not key.endswith("_config"):
continue
if key not in cleaned_parameters:
default_value = definition.get("default") if isinstance(definition, dict) else None
cleaned_parameters[key] = default_value if default_value is not None else {}
elif cleaned_parameters[key] is None:
cleaned_parameters[key] = {}
# Start workflow
handle = await temporal_mgr.run_workflow(
workflow_name=workflow_name,
target_id=target_id,
workflow_params=cleaned_parameters,
)
return {
"run_id": handle.id,
"status": "RUNNING",
"workflow": workflow_name,
"message": f"Workflow '{workflow_name}' submitted successfully",
"target_id": target_id,
"parameters": cleaned_parameters,
"mcp_enabled": True,
}
except Exception as exc: # pragma: no cover - defensive logging
logger.exception("MCP submit failed")
return {"error": f"Failed to submit workflow: {exc}"}
@mcp.tool
async def get_comprehensive_scan_summary(run_id: str) -> Dict[str, Any] | Dict[str, str]:
"""Return a summary for the given workflow run via MCP."""
try:
not_ready = _temporal_not_ready_status()
if not_ready:
return {
"error": "Temporal infrastructure not ready",
"temporal": not_ready,
}
status = await temporal_mgr.get_workflow_status(run_id)
# Try to get result if completed
total_findings = 0
severity_summary = {"critical": 0, "high": 0, "medium": 0, "low": 0, "info": 0}
if status.get("status") == "COMPLETED":
try:
result = await temporal_mgr.get_workflow_result(run_id)
if isinstance(result, dict):
summary = result.get("summary", {})
total_findings = summary.get("total_findings", 0)
except Exception as e:
logger.debug(f"Could not retrieve result for {run_id}: {e}")
return {
"run_id": run_id,
"workflow": "unknown", # Temporal doesn't track workflow name in status
"status": status.get("status", "unknown"),
"is_completed": status.get("status") == "COMPLETED",
"total_findings": total_findings,
"severity_summary": severity_summary,
"scan_duration": status.get("close_time", "In progress"),
"recommendations": (
[
"Review high and critical severity findings first",
"Implement security fixes based on finding recommendations",
"Re-run scan after applying fixes to verify remediation",
]
if total_findings > 0
else ["No security issues found"]
),
"mcp_analysis": True,
}
except Exception as exc: # pragma: no cover
logger.exception("MCP summary failed")
return {"error": f"Failed to summarize run: {exc}"}
@mcp.tool
async def get_run_status_mcp(run_id: str) -> Dict[str, Any]:
"""Return current status information for a Temporal run."""
try:
not_ready = _temporal_not_ready_status()
if not_ready:
return {
"error": "Temporal infrastructure not ready",
"temporal": not_ready,
}
status = await temporal_mgr.get_workflow_status(run_id)
return {
"run_id": run_id,
"workflow": "unknown",
"status": status["status"],
"is_completed": status["status"] in ["COMPLETED", "FAILED", "CANCELLED"],
"is_failed": status["status"] == "FAILED",
"is_running": status["status"] == "RUNNING",
"created_at": status.get("start_time"),
"updated_at": status.get("close_time") or status.get("execution_time"),
}
except Exception as exc:
logger.exception("MCP run status failed")
return {"error": f"Failed to get run status: {exc}"}
@mcp.tool
async def get_run_findings_mcp(run_id: str) -> Dict[str, Any]:
"""Return SARIF findings for a completed run."""
try:
not_ready = _temporal_not_ready_status()
if not_ready:
return {
"error": "Temporal infrastructure not ready",
"temporal": not_ready,
}
status = await temporal_mgr.get_workflow_status(run_id)
if status.get("status") != "COMPLETED":
return {"error": f"Run {run_id} not completed. Status: {status.get('status')}"}
result = await temporal_mgr.get_workflow_result(run_id)
metadata = {
"completion_time": status.get("close_time"),
"workflow_version": "unknown",
}
sarif = result.get("sarif", {}) if isinstance(result, dict) else {}
return {
"workflow": "unknown",
"run_id": run_id,
"sarif": sarif,
"metadata": metadata,
}
except Exception as exc:
logger.exception("MCP findings failed")
return {"error": f"Failed to retrieve findings: {exc}"}
@mcp.tool
async def list_recent_runs_mcp(
limit: int = 10,
workflow_name: str | None = None,
) -> Dict[str, Any]:
"""List recent Temporal runs with optional workflow filter."""
not_ready = _temporal_not_ready_status()
if not_ready:
return {
"runs": [],
"temporal": not_ready,
"message": "Temporal infrastructure is still initializing",
}
try:
limit_value = int(limit)
except (TypeError, ValueError):
limit_value = 10
limit_value = max(1, min(limit_value, 100))
try:
# Build filter query
filter_query = None
if workflow_name:
workflow_info = temporal_mgr.workflows.get(workflow_name)
if workflow_info:
filter_query = f'WorkflowType="{workflow_info.workflow_type}"'
workflows = await temporal_mgr.list_workflows(filter_query, limit_value)
results: List[Dict[str, Any]] = []
for wf in workflows:
results.append({
"run_id": wf["workflow_id"],
"workflow": workflow_name or "unknown",
"state": wf["status"],
"state_type": wf["status"],
"is_completed": wf["status"] in ["COMPLETED", "FAILED", "CANCELLED"],
"is_running": wf["status"] == "RUNNING",
"is_failed": wf["status"] == "FAILED",
"created_at": wf.get("start_time"),
"updated_at": wf.get("close_time"),
})
return {"runs": results, "temporal": get_temporal_status()}
except Exception as exc:
logger.exception("Failed to list runs")
return {
"runs": [],
"temporal": get_temporal_status(),
"error": str(exc)
}
@mcp.tool
async def get_fuzzing_stats_mcp(run_id: str) -> Dict[str, Any]:
"""Return fuzzing statistics for a run if available."""
not_ready = _temporal_not_ready_status()
if not_ready:
return {
"error": "Temporal infrastructure not ready",
"temporal": not_ready,
}
stats = fuzzing.fuzzing_stats.get(run_id)
if not stats:
return {"error": f"Fuzzing run not found: {run_id}"}
# Be resilient if a plain dict slipped into the cache
if isinstance(stats, dict):
return stats
if hasattr(stats, "model_dump"):
return stats.model_dump()
if hasattr(stats, "dict"):
return stats.dict()
# Last resort
return getattr(stats, "__dict__", {"run_id": run_id})
@mcp.tool
async def get_fuzzing_crash_reports_mcp(run_id: str) -> Dict[str, Any]:
"""Return crash reports collected for a fuzzing run."""
not_ready = _temporal_not_ready_status()
if not_ready:
return {
"error": "Temporal infrastructure not ready",
"temporal": not_ready,
}
reports = fuzzing.crash_reports.get(run_id)
if reports is None:
return {"error": f"Fuzzing run not found: {run_id}"}
return {"run_id": run_id, "crashes": [report.model_dump() for report in reports]}
@mcp.tool
async def get_backend_status_mcp() -> Dict[str, Any]:
"""Expose backend readiness, workflows, and registered MCP tools."""
status = get_temporal_status()
response: Dict[str, Any] = {"temporal": status}
if status.get("ready"):
response["workflows"] = list(temporal_mgr.workflows.keys())
try:
tools = await mcp._tool_manager.list_tools()
response["mcp_tools"] = sorted(tool.name for tool in tools)
except Exception as exc: # pragma: no cover - defensive logging
logger.debug("Failed to enumerate MCP tools: %s", exc)
return response
def create_mcp_transport_app() -> Starlette:
"""Build a Starlette app serving HTTP + SSE transports on one port."""
http_app = mcp.http_app(path="/", transport="streamable-http")
sse_app = create_sse_app(
server=mcp,
message_path="/messages",
sse_path="/",
auth=mcp.auth,
)
routes = [
Mount("/mcp", app=http_app),
Mount("/mcp/sse", app=sse_app),
]
@asynccontextmanager
async def lifespan(app: Starlette): # pragma: no cover - integration wiring
async with AsyncExitStack() as stack:
await stack.enter_async_context(
http_app.router.lifespan_context(http_app)
)
await stack.enter_async_context(
sse_app.router.lifespan_context(sse_app)
)
yield
combined_app = Starlette(routes=routes, lifespan=lifespan)
combined_app.state.fastmcp_server = mcp
combined_app.state.http_app = http_app
combined_app.state.sse_app = sse_app
return combined_app
# ---------------------------------------------------------------------------
# Combined lifespan: Temporal init + dedicated MCP transports
# ---------------------------------------------------------------------------
@asynccontextmanager
async def combined_lifespan(app: FastAPI):
global temporal_bootstrap_task, _fastapi_mcp_imported
logger.info("Starting FuzzForge backend...")
# Ensure FastAPI endpoints are exposed via MCP once
if not _fastapi_mcp_imported:
try:
await mcp.import_server(FASTAPI_MCP_ADAPTER)
_fastapi_mcp_imported = True
logger.info("Mounted FastAPI endpoints as MCP tools")
except Exception as exc:
logger.exception("Failed to import FastAPI endpoints into MCP", exc_info=exc)
# Kick off Temporal bootstrap in the background if needed
if temporal_bootstrap_task is None or temporal_bootstrap_task.done():
temporal_bootstrap_task = asyncio.create_task(_bootstrap_temporal_with_retries())
logger.info("Temporal bootstrap task started")
else:
logger.info("Temporal bootstrap task already running")
# Start MCP transports on shared port (HTTP + SSE)
mcp_app = create_mcp_transport_app()
mcp_config = uvicorn.Config(
app=mcp_app,
host="0.0.0.0",
port=8010,
log_level="info",
lifespan="on",
)
mcp_server = uvicorn.Server(mcp_config)
mcp_server.install_signal_handlers = lambda: None # type: ignore[assignment]
mcp_task = asyncio.create_task(mcp_server.serve())
async def _wait_for_uvicorn_startup() -> None:
started_attr = getattr(mcp_server, "started", None)
if hasattr(started_attr, "wait"):
await asyncio.wait_for(started_attr.wait(), timeout=10)
return
# Fallback for uvicorn versions where "started" is a bool
poll_interval = 0.1
checks = int(10 / poll_interval)
for _ in range(checks):
if getattr(mcp_server, "started", False):
return
await asyncio.sleep(poll_interval)
raise asyncio.TimeoutError
try:
await _wait_for_uvicorn_startup()
except asyncio.TimeoutError: # pragma: no cover - defensive logging
if mcp_task.done():
raise RuntimeError("MCP server failed to start") from mcp_task.exception()
logger.warning("Timed out waiting for MCP server startup; continuing anyway")
logger.info("MCP HTTP available at http://0.0.0.0:8010/mcp")
logger.info("MCP SSE available at http://0.0.0.0:8010/mcp/sse")
try:
yield
finally:
logger.info("Shutting down MCP transports...")
mcp_server.should_exit = True
mcp_server.force_exit = True
await asyncio.gather(mcp_task, return_exceptions=True)
if temporal_bootstrap_task and not temporal_bootstrap_task.done():
temporal_bootstrap_task.cancel()
with suppress(asyncio.CancelledError):
await temporal_bootstrap_task
temporal_bootstrap_state.task_running = False
if not temporal_bootstrap_state.ready:
temporal_bootstrap_state.status = "stopped"
temporal_bootstrap_task = None
# Close Temporal client
await temporal_mgr.close()
logger.info("Shutting down FuzzForge backend...")
app.router.lifespan_context = combined_lifespan
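With the transports mounted above, any MCP-capable client can reach the backend on port 8010. A rough sketch using the FastMCP client class; the exact client API depends on the installed fastmcp version, so treat this as illustrative only:

# Connect to the streamable-HTTP transport and invoke one of the tools above.
import asyncio
from fastmcp import Client

async def main() -> None:
    async with Client("http://localhost:8010/mcp") as client:
        tools = await client.list_tools()
        print(sorted(tool.name for tool in tools))
        result = await client.call_tool("list_workflows_mcp", {})
        print(result)

asyncio.run(main())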

View File

@@ -1,11 +0,0 @@
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.

View File

@@ -1,124 +0,0 @@
"""
Models for workflow findings and submissions
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
from pydantic import BaseModel, Field
from typing import Dict, Any, Optional, Literal, List
from datetime import datetime
class WorkflowFindings(BaseModel):
"""Findings from a workflow execution in SARIF format"""
workflow: str = Field(..., description="Workflow name")
run_id: str = Field(..., description="Unique run identifier")
sarif: Dict[str, Any] = Field(..., description="SARIF formatted findings")
metadata: Dict[str, Any] = Field(default_factory=dict, description="Additional metadata")
class WorkflowSubmission(BaseModel):
"""
Submit a workflow with configurable settings.
Note: This model is deprecated in favor of the /upload-and-submit endpoint
which handles file uploads directly.
"""
parameters: Dict[str, Any] = Field(
default_factory=dict,
description="Workflow-specific parameters"
)
timeout: Optional[int] = Field(
default=None, # Allow workflow-specific defaults
description="Timeout in seconds (None for workflow default)",
ge=1,
le=604800 # Max 7 days to support fuzzing campaigns
)
class WorkflowStatus(BaseModel):
"""Status of a workflow run"""
run_id: str = Field(..., description="Unique run identifier")
workflow: str = Field(..., description="Workflow name")
status: str = Field(..., description="Current status")
is_completed: bool = Field(..., description="Whether the run is completed")
is_failed: bool = Field(..., description="Whether the run failed")
is_running: bool = Field(..., description="Whether the run is currently running")
created_at: datetime = Field(..., description="Run creation time")
updated_at: datetime = Field(..., description="Last update time")
class WorkflowMetadata(BaseModel):
"""Complete metadata for a workflow"""
name: str = Field(..., description="Workflow name")
version: str = Field(..., description="Semantic version")
description: str = Field(..., description="Workflow description")
author: Optional[str] = Field(None, description="Workflow author")
tags: List[str] = Field(default_factory=list, description="Workflow tags")
parameters: Dict[str, Any] = Field(..., description="Parameters schema")
default_parameters: Dict[str, Any] = Field(
default_factory=dict,
description="Default parameter values"
)
required_modules: List[str] = Field(
default_factory=list,
description="Required module names"
)
supported_volume_modes: List[Literal["ro", "rw"]] = Field(
default=["ro", "rw"],
description="Supported volume mount modes"
)
class WorkflowListItem(BaseModel):
"""Summary information for a workflow in list views"""
name: str = Field(..., description="Workflow name")
version: str = Field(..., description="Semantic version")
description: str = Field(..., description="Workflow description")
author: Optional[str] = Field(None, description="Workflow author")
tags: List[str] = Field(default_factory=list, description="Workflow tags")
class RunSubmissionResponse(BaseModel):
"""Response after submitting a workflow"""
run_id: str = Field(..., description="Unique run identifier")
status: str = Field(..., description="Initial status")
workflow: str = Field(..., description="Workflow name")
message: str = Field(default="Workflow submitted successfully")
class FuzzingStats(BaseModel):
"""Real-time fuzzing statistics"""
run_id: str = Field(..., description="Unique run identifier")
workflow: str = Field(..., description="Workflow name")
executions: int = Field(default=0, description="Total executions")
executions_per_sec: float = Field(default=0.0, description="Current execution rate")
crashes: int = Field(default=0, description="Total crashes found")
unique_crashes: int = Field(default=0, description="Unique crashes")
coverage: Optional[float] = Field(None, description="Code coverage percentage")
corpus_size: int = Field(default=0, description="Current corpus size")
elapsed_time: int = Field(default=0, description="Elapsed time in seconds")
last_crash_time: Optional[datetime] = Field(None, description="Time of last crash")
class CrashReport(BaseModel):
"""Individual crash report from fuzzing"""
run_id: str = Field(..., description="Run identifier")
crash_id: str = Field(..., description="Unique crash identifier")
timestamp: datetime = Field(default_factory=datetime.utcnow)
signal: Optional[str] = Field(None, description="Crash signal (SIGSEGV, etc.)")
crash_type: Optional[str] = Field(None, description="Type of crash")
stack_trace: Optional[str] = Field(None, description="Stack trace")
input_file: Optional[str] = Field(None, description="Path to crashing input")
reproducer: Optional[str] = Field(None, description="Minimized reproducer")
severity: str = Field(default="medium", description="Crash severity")
exploitability: Optional[str] = Field(None, description="Exploitability assessment")
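
For reference, a minimal sketch of how these Pydantic models might be instantiated and serialized; the values are made up purely to illustrate the field shapes defined above.

```python
# Hypothetical values, shown only to illustrate the models above.
from datetime import datetime, timezone

stats = FuzzingStats(
    run_id="run-1234",
    workflow="rust_fuzzing",
    executions=1_000_000,
    executions_per_sec=2500.0,
    crashes=3,
    unique_crashes=1,
    corpus_size=128,
    elapsed_time=420,
)

crash = CrashReport(
    run_id=stats.run_id,
    crash_id="crash-0001",
    timestamp=datetime.now(timezone.utc),
    signal="SIGSEGV",
    severity="high",
)

print(stats.model_dump_json())
print(crash.model_dump_json())
```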

View File

@@ -1,10 +0,0 @@
"""
Storage abstraction layer for FuzzForge.
Provides unified interface for storing and retrieving targets and results.
"""
from .base import StorageBackend
from .s3_cached import S3CachedStorage
__all__ = ["StorageBackend", "S3CachedStorage"]

View File

@@ -1,153 +0,0 @@
"""
Base storage backend interface.
All storage implementations must implement this interface.
"""
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Optional, Dict, Any
class StorageBackend(ABC):
"""
Abstract base class for storage backends.
Implementations handle storage and retrieval of:
- Uploaded targets (code, binaries, etc.)
- Workflow results
- Temporary files
"""
@abstractmethod
async def upload_target(
self,
file_path: Path,
user_id: str,
metadata: Optional[Dict[str, Any]] = None
) -> str:
"""
Upload a target file to storage.
Args:
file_path: Local path to file to upload
user_id: ID of user uploading the file
metadata: Optional metadata to store with file
Returns:
Target ID (unique identifier for retrieval)
Raises:
FileNotFoundError: If file_path doesn't exist
StorageError: If upload fails
"""
pass
@abstractmethod
async def get_target(self, target_id: str) -> Path:
"""
Get target file from storage.
Args:
target_id: Unique identifier from upload_target()
Returns:
Local path to cached file
Raises:
FileNotFoundError: If target doesn't exist
StorageError: If download fails
"""
pass
@abstractmethod
async def delete_target(self, target_id: str) -> None:
"""
Delete target from storage.
Args:
target_id: Unique identifier to delete
Raises:
StorageError: If deletion fails (doesn't raise if not found)
"""
pass
@abstractmethod
async def upload_results(
self,
workflow_id: str,
results: Dict[str, Any],
results_format: str = "json"
) -> str:
"""
Upload workflow results to storage.
Args:
workflow_id: Workflow execution ID
results: Results dictionary
results_format: Format (json, sarif, etc.)
Returns:
URL to uploaded results
Raises:
StorageError: If upload fails
"""
pass
@abstractmethod
async def get_results(self, workflow_id: str) -> Dict[str, Any]:
"""
Get workflow results from storage.
Args:
workflow_id: Workflow execution ID
Returns:
Results dictionary
Raises:
FileNotFoundError: If results don't exist
StorageError: If download fails
"""
pass
@abstractmethod
async def list_targets(
self,
user_id: Optional[str] = None,
limit: int = 100
) -> list[Dict[str, Any]]:
"""
List uploaded targets.
Args:
user_id: Filter by user ID (None = all users)
limit: Maximum number of results
Returns:
List of target metadata dictionaries
Raises:
StorageError: If listing fails
"""
pass
@abstractmethod
async def cleanup_cache(self) -> int:
"""
Clean up local cache (LRU eviction).
Returns:
Number of files removed
Raises:
StorageError: If cleanup fails
"""
pass
class StorageError(Exception):
"""Base exception for storage operations."""
pass
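
To make the abstract contract concrete, here is a rough sketch of a local-filesystem implementation of this interface. The class name and on-disk layout are hypothetical and only meant to show how a backend could satisfy the abstract methods above.

```python
# Hypothetical local-filesystem backend; layout and names are illustrative.
import json
import shutil
from pathlib import Path
from typing import Any, Dict, Optional
from uuid import uuid4

from .base import StorageBackend


class LocalStorage(StorageBackend):
    def __init__(self, root: Path):
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    async def upload_target(self, file_path: Path, user_id: str,
                            metadata: Optional[Dict[str, Any]] = None) -> str:
        if not file_path.exists():
            raise FileNotFoundError(f"File not found: {file_path}")
        target_id = str(uuid4())
        dest = self.root / target_id
        dest.mkdir(parents=True)
        shutil.copy2(file_path, dest / "target")
        (dest / "metadata.json").write_text(
            json.dumps({"user_id": user_id, **(metadata or {})})
        )
        return target_id

    async def get_target(self, target_id: str) -> Path:
        path = self.root / target_id / "target"
        if not path.exists():
            raise FileNotFoundError(f"Target {target_id} not found")
        return path

    async def delete_target(self, target_id: str) -> None:
        shutil.rmtree(self.root / target_id, ignore_errors=True)

    async def upload_results(self, workflow_id: str, results: Dict[str, Any],
                             results_format: str = "json") -> str:
        path = self.root / "results" / f"{workflow_id}.{results_format}"
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(json.dumps(results, indent=2))
        return str(path)

    async def get_results(self, workflow_id: str) -> Dict[str, Any]:
        path = self.root / "results" / f"{workflow_id}.json"
        if not path.exists():
            raise FileNotFoundError(f"Results for workflow {workflow_id} not found")
        return json.loads(path.read_text())

    async def list_targets(self, user_id: Optional[str] = None,
                           limit: int = 100) -> list[Dict[str, Any]]:
        targets = []
        for meta_file in self.root.glob("*/metadata.json"):
            meta = json.loads(meta_file.read_text())
            if user_id and meta.get("user_id") != user_id:
                continue
            targets.append({"target_id": meta_file.parent.name, "metadata": meta})
            if len(targets) >= limit:
                break
        return targets

    async def cleanup_cache(self) -> int:
        # Local backend keeps no separate download cache, so nothing to evict.
        return 0
```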

View File

@@ -1,423 +0,0 @@
"""
S3-compatible storage backend with local caching.
Works with MinIO (dev/prod) or AWS S3 (cloud).
"""
import json
import logging
import os
import shutil
from datetime import datetime
from pathlib import Path
from typing import Optional, Dict, Any
from uuid import uuid4
import boto3
from botocore.exceptions import ClientError
from .base import StorageBackend, StorageError
logger = logging.getLogger(__name__)
class S3CachedStorage(StorageBackend):
"""
S3-compatible storage with local caching.
Features:
- Upload targets to S3/MinIO
- Download with local caching (LRU eviction)
- Lifecycle management (auto-cleanup old files)
- Metadata tracking
"""
def __init__(
self,
endpoint_url: Optional[str] = None,
access_key: Optional[str] = None,
secret_key: Optional[str] = None,
bucket: str = "targets",
region: str = "us-east-1",
use_ssl: bool = False,
cache_dir: Optional[Path] = None,
cache_max_size_gb: int = 10
):
"""
Initialize S3 storage backend.
Args:
endpoint_url: S3 endpoint (None = AWS S3, or MinIO URL)
access_key: S3 access key (None = from env)
secret_key: S3 secret key (None = from env)
bucket: S3 bucket name
region: AWS region
use_ssl: Use HTTPS
cache_dir: Local cache directory
cache_max_size_gb: Maximum cache size in GB
"""
# Use environment variables as defaults
self.endpoint_url = endpoint_url or os.getenv('S3_ENDPOINT', 'http://minio:9000')
self.access_key = access_key or os.getenv('S3_ACCESS_KEY', 'fuzzforge')
self.secret_key = secret_key or os.getenv('S3_SECRET_KEY', 'fuzzforge123')
self.bucket = bucket or os.getenv('S3_BUCKET', 'targets')
self.region = region or os.getenv('S3_REGION', 'us-east-1')
self.use_ssl = use_ssl or os.getenv('S3_USE_SSL', 'false').lower() == 'true'
# Cache configuration
self.cache_dir = cache_dir or Path(os.getenv('CACHE_DIR', '/tmp/fuzzforge-cache'))
self.cache_max_size = cache_max_size_gb * (1024 ** 3) # Convert to bytes
# Ensure cache directory exists
self.cache_dir.mkdir(parents=True, exist_ok=True)
# Initialize S3 client
try:
self.s3_client = boto3.client(
's3',
endpoint_url=self.endpoint_url,
aws_access_key_id=self.access_key,
aws_secret_access_key=self.secret_key,
region_name=self.region,
use_ssl=self.use_ssl
)
logger.info(f"Initialized S3 storage: {self.endpoint_url}/{self.bucket}")
except Exception as e:
logger.error(f"Failed to initialize S3 client: {e}")
raise StorageError(f"S3 initialization failed: {e}")
async def upload_target(
self,
file_path: Path,
user_id: str,
metadata: Optional[Dict[str, Any]] = None
) -> str:
"""Upload target file to S3/MinIO."""
if not file_path.exists():
raise FileNotFoundError(f"File not found: {file_path}")
# Generate unique target ID
target_id = str(uuid4())
# Prepare metadata
upload_metadata = {
'user_id': user_id,
'uploaded_at': datetime.now().isoformat(),
'filename': file_path.name,
'size': str(file_path.stat().st_size)
}
if metadata:
upload_metadata.update(metadata)
# Upload to S3
s3_key = f'{target_id}/target'
try:
logger.info(f"Uploading target to s3://{self.bucket}/{s3_key}")
self.s3_client.upload_file(
str(file_path),
self.bucket,
s3_key,
ExtraArgs={
'Metadata': upload_metadata
}
)
file_size_mb = file_path.stat().st_size / (1024 * 1024)
logger.info(
f"✓ Uploaded target {target_id} "
f"({file_path.name}, {file_size_mb:.2f} MB)"
)
return target_id
except ClientError as e:
logger.error(f"S3 upload failed: {e}", exc_info=True)
raise StorageError(f"Failed to upload target: {e}")
except Exception as e:
logger.error(f"Upload failed: {e}", exc_info=True)
raise StorageError(f"Upload error: {e}")
async def get_target(self, target_id: str) -> Path:
"""Get target from cache or download from S3/MinIO."""
# Check cache first
cache_path = self.cache_dir / target_id
cached_file = cache_path / "target"
if cached_file.exists():
# Update access time for LRU
cached_file.touch()
logger.info(f"Cache HIT: {target_id}")
return cached_file
# Cache miss - download from S3
logger.info(f"Cache MISS: {target_id}, downloading from S3...")
try:
# Create cache directory
cache_path.mkdir(parents=True, exist_ok=True)
# Download from S3
s3_key = f'{target_id}/target'
logger.info(f"Downloading s3://{self.bucket}/{s3_key}")
self.s3_client.download_file(
self.bucket,
s3_key,
str(cached_file)
)
# Verify download
if not cached_file.exists():
raise StorageError(f"Downloaded file not found: {cached_file}")
file_size_mb = cached_file.stat().st_size / (1024 * 1024)
logger.info(f"✓ Downloaded target {target_id} ({file_size_mb:.2f} MB)")
return cached_file
except ClientError as e:
error_code = e.response.get('Error', {}).get('Code')
if error_code in ['404', 'NoSuchKey']:
logger.error(f"Target not found: {target_id}")
raise FileNotFoundError(f"Target {target_id} not found in storage")
else:
logger.error(f"S3 download failed: {e}", exc_info=True)
raise StorageError(f"Download failed: {e}")
except Exception as e:
logger.error(f"Download error: {e}", exc_info=True)
# Cleanup partial download
if cache_path.exists():
shutil.rmtree(cache_path, ignore_errors=True)
raise StorageError(f"Download error: {e}")
async def delete_target(self, target_id: str) -> None:
"""Delete target from S3/MinIO."""
try:
s3_key = f'{target_id}/target'
logger.info(f"Deleting s3://{self.bucket}/{s3_key}")
self.s3_client.delete_object(
Bucket=self.bucket,
Key=s3_key
)
# Also delete from cache if present
cache_path = self.cache_dir / target_id
if cache_path.exists():
shutil.rmtree(cache_path, ignore_errors=True)
logger.info(f"✓ Deleted target {target_id} from S3 and cache")
else:
logger.info(f"✓ Deleted target {target_id} from S3")
except ClientError as e:
logger.error(f"S3 delete failed: {e}", exc_info=True)
# Don't raise error if object doesn't exist
if e.response.get('Error', {}).get('Code') not in ['404', 'NoSuchKey']:
raise StorageError(f"Delete failed: {e}")
except Exception as e:
logger.error(f"Delete error: {e}", exc_info=True)
raise StorageError(f"Delete error: {e}")
async def upload_results(
self,
workflow_id: str,
results: Dict[str, Any],
results_format: str = "json"
) -> str:
"""Upload workflow results to S3/MinIO."""
try:
# Prepare results content
if results_format == "json":
content = json.dumps(results, indent=2).encode('utf-8')
content_type = 'application/json'
file_ext = 'json'
elif results_format == "sarif":
content = json.dumps(results, indent=2).encode('utf-8')
content_type = 'application/sarif+json'
file_ext = 'sarif'
else:
content = json.dumps(results, indent=2).encode('utf-8')
content_type = 'application/json'
file_ext = 'json'
# Upload to results bucket
results_bucket = 'results'
s3_key = f'{workflow_id}/results.{file_ext}'
logger.info(f"Uploading results to s3://{results_bucket}/{s3_key}")
self.s3_client.put_object(
Bucket=results_bucket,
Key=s3_key,
Body=content,
ContentType=content_type,
Metadata={
'workflow_id': workflow_id,
'format': results_format,
'uploaded_at': datetime.now().isoformat()
}
)
# Construct URL
results_url = f"{self.endpoint_url}/{results_bucket}/{s3_key}"
logger.info(f"✓ Uploaded results: {results_url}")
return results_url
except Exception as e:
logger.error(f"Results upload failed: {e}", exc_info=True)
raise StorageError(f"Results upload failed: {e}")
async def get_results(self, workflow_id: str) -> Dict[str, Any]:
"""Get workflow results from S3/MinIO."""
try:
results_bucket = 'results'
s3_key = f'{workflow_id}/results.json'
logger.info(f"Downloading results from s3://{results_bucket}/{s3_key}")
response = self.s3_client.get_object(
Bucket=results_bucket,
Key=s3_key
)
content = response['Body'].read().decode('utf-8')
results = json.loads(content)
logger.info(f"✓ Downloaded results for workflow {workflow_id}")
return results
except ClientError as e:
error_code = e.response.get('Error', {}).get('Code')
if error_code in ['404', 'NoSuchKey']:
logger.error(f"Results not found: {workflow_id}")
raise FileNotFoundError(f"Results for workflow {workflow_id} not found")
else:
logger.error(f"Results download failed: {e}", exc_info=True)
raise StorageError(f"Results download failed: {e}")
except Exception as e:
logger.error(f"Results download error: {e}", exc_info=True)
raise StorageError(f"Results download error: {e}")
async def list_targets(
self,
user_id: Optional[str] = None,
limit: int = 100
) -> list[Dict[str, Any]]:
"""List uploaded targets."""
try:
targets = []
paginator = self.s3_client.get_paginator('list_objects_v2')
for page in paginator.paginate(Bucket=self.bucket, PaginationConfig={'MaxItems': limit}):
for obj in page.get('Contents', []):
# Get object metadata
try:
metadata_response = self.s3_client.head_object(
Bucket=self.bucket,
Key=obj['Key']
)
metadata = metadata_response.get('Metadata', {})
# Filter by user_id if specified
if user_id and metadata.get('user_id') != user_id:
continue
targets.append({
'target_id': obj['Key'].split('/')[0],
'key': obj['Key'],
'size': obj['Size'],
'last_modified': obj['LastModified'].isoformat(),
'metadata': metadata
})
except Exception as e:
logger.warning(f"Failed to get metadata for {obj['Key']}: {e}")
continue
logger.info(f"Listed {len(targets)} targets (user_id={user_id})")
return targets
except Exception as e:
logger.error(f"List targets failed: {e}", exc_info=True)
raise StorageError(f"List targets failed: {e}")
async def cleanup_cache(self) -> int:
"""Clean up local cache using LRU eviction."""
try:
cache_files = []
total_size = 0
# Gather all cached files with metadata
for cache_file in self.cache_dir.rglob('*'):
if cache_file.is_file():
try:
stat = cache_file.stat()
cache_files.append({
'path': cache_file,
'size': stat.st_size,
'atime': stat.st_atime # Last access time
})
total_size += stat.st_size
except Exception as e:
logger.warning(f"Failed to stat {cache_file}: {e}")
continue
# Check if cleanup is needed
if total_size <= self.cache_max_size:
logger.info(
f"Cache size OK: {total_size / (1024**3):.2f} GB / "
f"{self.cache_max_size / (1024**3):.2f} GB"
)
return 0
# Sort by access time (oldest first)
cache_files.sort(key=lambda x: x['atime'])
# Remove files until under limit
removed_count = 0
for file_info in cache_files:
if total_size <= self.cache_max_size:
break
try:
file_info['path'].unlink()
total_size -= file_info['size']
removed_count += 1
logger.debug(f"Evicted from cache: {file_info['path']}")
except Exception as e:
logger.warning(f"Failed to delete {file_info['path']}: {e}")
continue
logger.info(
f"✓ Cache cleanup: removed {removed_count} files, "
f"new size: {total_size / (1024**3):.2f} GB"
)
return removed_count
except Exception as e:
logger.error(f"Cache cleanup failed: {e}", exc_info=True)
raise StorageError(f"Cache cleanup failed: {e}")
def get_cache_stats(self) -> Dict[str, Any]:
"""Get cache statistics."""
try:
total_size = 0
file_count = 0
for cache_file in self.cache_dir.rglob('*'):
if cache_file.is_file():
total_size += cache_file.stat().st_size
file_count += 1
return {
'total_size_bytes': total_size,
'total_size_gb': total_size / (1024 ** 3),
'file_count': file_count,
'max_size_gb': self.cache_max_size / (1024 ** 3),
'usage_percent': (total_size / self.cache_max_size) * 100
}
except Exception as e:
logger.error(f"Failed to get cache stats: {e}")
return {'error': str(e)}
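
A short usage sketch of the backend above, assuming a local MinIO instance reachable on localhost with the default development credentials used in the constructor; paths and IDs are illustrative.

```python
# Illustrative only: endpoint, credentials, and file paths are assumptions.
import asyncio
from pathlib import Path

async def main() -> None:
    storage = S3CachedStorage(
        endpoint_url="http://localhost:9000",
        access_key="fuzzforge",
        secret_key="fuzzforge123",
        bucket="targets",
    )

    target_id = await storage.upload_target(
        Path("./my_target.tar.gz"), user_id="dev"
    )
    local_path = await storage.get_target(target_id)   # cache MISS -> download
    local_path = await storage.get_target(target_id)   # cache HIT

    print(storage.get_cache_stats())
    removed = await storage.cleanup_cache()
    print(f"Evicted {removed} cached files")

asyncio.run(main())
```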

View File

@@ -1,10 +0,0 @@
"""
Temporal integration for FuzzForge.
Handles workflow execution, monitoring, and management.
"""
from .manager import TemporalManager
from .discovery import WorkflowDiscovery
__all__ = ["TemporalManager", "WorkflowDiscovery"]

View File

@@ -1,257 +0,0 @@
"""
Workflow Discovery for Temporal
Discovers workflows from the toolbox/workflows directory
and provides metadata about available workflows.
"""
import logging
import yaml
from pathlib import Path
from typing import Dict, Any
from pydantic import BaseModel, Field, ConfigDict
logger = logging.getLogger(__name__)
class WorkflowInfo(BaseModel):
"""Information about a discovered workflow"""
name: str = Field(..., description="Workflow name")
path: Path = Field(..., description="Path to workflow directory")
workflow_file: Path = Field(..., description="Path to workflow.py file")
metadata: Dict[str, Any] = Field(..., description="Workflow metadata from YAML")
workflow_type: str = Field(..., description="Workflow class name")
vertical: str = Field(..., description="Vertical (worker type) for this workflow")
model_config = ConfigDict(arbitrary_types_allowed=True)
class WorkflowDiscovery:
"""
Discovers workflows from the filesystem.
Scans toolbox/workflows/ for directories containing:
- metadata.yaml (required)
- workflow.py (required)
Each workflow declares its vertical (rust, android, web, etc.)
which determines which worker pool will execute it.
"""
def __init__(self, workflows_dir: Path):
"""
Initialize workflow discovery.
Args:
workflows_dir: Path to the workflows directory
"""
self.workflows_dir = workflows_dir
if not self.workflows_dir.exists():
self.workflows_dir.mkdir(parents=True, exist_ok=True)
logger.info(f"Created workflows directory: {self.workflows_dir}")
async def discover_workflows(self) -> Dict[str, WorkflowInfo]:
"""
Discover workflows by scanning the workflows directory.
Returns:
Dictionary mapping workflow names to their information
"""
workflows = {}
logger.info(f"Scanning for workflows in: {self.workflows_dir}")
for workflow_dir in self.workflows_dir.iterdir():
if not workflow_dir.is_dir():
continue
# Skip special directories
if workflow_dir.name.startswith('.') or workflow_dir.name == '__pycache__':
continue
metadata_file = workflow_dir / "metadata.yaml"
if not metadata_file.exists():
logger.debug(f"No metadata.yaml in {workflow_dir.name}, skipping")
continue
workflow_file = workflow_dir / "workflow.py"
if not workflow_file.exists():
logger.warning(
f"Workflow {workflow_dir.name} has metadata but no workflow.py, skipping"
)
continue
try:
# Parse metadata
with open(metadata_file) as f:
metadata = yaml.safe_load(f)
# Validate required fields
if 'name' not in metadata:
logger.warning(f"Workflow {workflow_dir.name} metadata missing 'name' field")
metadata['name'] = workflow_dir.name
if 'vertical' not in metadata:
logger.warning(
f"Workflow {workflow_dir.name} metadata missing 'vertical' field"
)
continue
# Infer workflow class name from metadata or use convention
workflow_type = metadata.get('workflow_class')
if not workflow_type:
# Convention: convert snake_case to PascalCase + Workflow
# e.g., rust_test -> RustTestWorkflow
parts = workflow_dir.name.split('_')
workflow_type = ''.join(part.capitalize() for part in parts) + 'Workflow'
# Create workflow info
info = WorkflowInfo(
name=metadata['name'],
path=workflow_dir,
workflow_file=workflow_file,
metadata=metadata,
workflow_type=workflow_type,
vertical=metadata['vertical']
)
workflows[info.name] = info
logger.info(
f"✓ Discovered workflow: {info.name} "
f"(vertical: {info.vertical}, class: {info.workflow_type})"
)
except Exception as e:
logger.error(
f"Error discovering workflow {workflow_dir.name}: {e}",
exc_info=True
)
continue
logger.info(f"Discovered {len(workflows)} workflows")
return workflows
def get_workflows_by_vertical(
self,
workflows: Dict[str, WorkflowInfo],
vertical: str
) -> Dict[str, WorkflowInfo]:
"""
Filter workflows by vertical.
Args:
workflows: All discovered workflows
vertical: Vertical name to filter by
Returns:
Filtered workflows dictionary
"""
return {
name: info
for name, info in workflows.items()
if info.vertical == vertical
}
def get_available_verticals(self, workflows: Dict[str, WorkflowInfo]) -> list[str]:
"""
Get list of all verticals from discovered workflows.
Args:
workflows: All discovered workflows
Returns:
List of unique vertical names
"""
return list(set(info.vertical for info in workflows.values()))
@staticmethod
def get_metadata_schema() -> Dict[str, Any]:
"""
Get the JSON schema for workflow metadata.
Returns:
JSON schema dictionary
"""
return {
"type": "object",
"required": ["name", "version", "description", "author", "vertical", "parameters"],
"properties": {
"name": {
"type": "string",
"description": "Workflow name"
},
"version": {
"type": "string",
"pattern": "^\\d+\\.\\d+\\.\\d+$",
"description": "Semantic version (x.y.z)"
},
"vertical": {
"type": "string",
"description": "Vertical worker type (rust, android, web, etc.)"
},
"description": {
"type": "string",
"description": "Workflow description"
},
"author": {
"type": "string",
"description": "Workflow author"
},
"category": {
"type": "string",
"enum": ["comprehensive", "specialized", "fuzzing", "focused"],
"description": "Workflow category"
},
"tags": {
"type": "array",
"items": {"type": "string"},
"description": "Workflow tags for categorization"
},
"requirements": {
"type": "object",
"required": ["tools", "resources"],
"properties": {
"tools": {
"type": "array",
"items": {"type": "string"},
"description": "Required security tools"
},
"resources": {
"type": "object",
"required": ["memory", "cpu", "timeout"],
"properties": {
"memory": {
"type": "string",
"pattern": "^\\d+[GMK]i$",
"description": "Memory limit (e.g., 1Gi, 512Mi)"
},
"cpu": {
"type": "string",
"pattern": "^\\d+m?$",
"description": "CPU limit (e.g., 1000m, 2)"
},
"timeout": {
"type": "integer",
"minimum": 60,
"maximum": 7200,
"description": "Workflow timeout in seconds"
}
}
}
}
},
"parameters": {
"type": "object",
"description": "Workflow parameters schema"
},
"default_parameters": {
"type": "object",
"description": "Default parameter values"
},
"required_modules": {
"type": "array",
"items": {"type": "string"},
"description": "Required module names"
}
}
}
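
A hypothetical end-to-end sketch of the discovery flow above: it writes a minimal `metadata.yaml` and `workflow.py` matching the schema, then runs discovery. The metadata values and the `src.temporal` import path are assumptions based on the package layout shown in this diff.

```python
# Hypothetical example; metadata values are illustrative.
import asyncio
import tempfile
from pathlib import Path

from src.temporal import WorkflowDiscovery  # assumes the package layout above

EXAMPLE_METADATA = """\
name: rust_test
version: 1.0.0
description: Fuzz a Rust crate with cargo-fuzz
author: FuzzingLabs
vertical: rust
parameters:
  properties:
    max_iterations:
      type: integer
"""

async def main() -> None:
    workflows_dir = Path(tempfile.mkdtemp()) / "workflows"
    wf_dir = workflows_dir / "rust_test"
    wf_dir.mkdir(parents=True)
    (wf_dir / "metadata.yaml").write_text(EXAMPLE_METADATA)
    (wf_dir / "workflow.py").write_text("# RustTestWorkflow would live here\n")

    discovery = WorkflowDiscovery(workflows_dir)
    workflows = await discovery.discover_workflows()
    info = workflows["rust_test"]
    print(info.vertical, info.workflow_type)  # -> rust RustTestWorkflow

asyncio.run(main())
```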

View File

@@ -1,376 +0,0 @@
"""
Temporal Manager - Workflow execution and management
Handles:
- Workflow discovery from toolbox
- Workflow execution (submit to Temporal)
- Status monitoring
- Results retrieval
"""
import logging
import os
from pathlib import Path
from typing import Dict, Optional, Any
from uuid import uuid4
from temporalio.client import Client, WorkflowHandle
from temporalio.common import RetryPolicy
from datetime import timedelta
from .discovery import WorkflowDiscovery, WorkflowInfo
from src.storage import S3CachedStorage
logger = logging.getLogger(__name__)
class TemporalManager:
"""
Manages Temporal workflow execution for FuzzForge.
This class:
- Discovers available workflows from toolbox
- Submits workflow executions to Temporal
- Monitors workflow status
- Retrieves workflow results
"""
def __init__(
self,
workflows_dir: Optional[Path] = None,
temporal_address: Optional[str] = None,
temporal_namespace: str = "default",
storage: Optional[S3CachedStorage] = None
):
"""
Initialize Temporal manager.
Args:
workflows_dir: Path to workflows directory (default: toolbox/workflows)
temporal_address: Temporal server address (default: from env or localhost:7233)
temporal_namespace: Temporal namespace
storage: Storage backend for file uploads (default: S3CachedStorage)
"""
if workflows_dir is None:
workflows_dir = Path("toolbox/workflows")
self.temporal_address = temporal_address or os.getenv(
'TEMPORAL_ADDRESS',
'localhost:7233'
)
self.temporal_namespace = temporal_namespace
self.discovery = WorkflowDiscovery(workflows_dir)
self.workflows: Dict[str, WorkflowInfo] = {}
self.client: Optional[Client] = None
# Initialize storage backend
self.storage = storage or S3CachedStorage()
logger.info(
f"TemporalManager initialized: {self.temporal_address} "
f"(namespace: {self.temporal_namespace})"
)
async def initialize(self):
"""Initialize the manager by discovering workflows and connecting to Temporal."""
try:
# Discover workflows
self.workflows = await self.discovery.discover_workflows()
if not self.workflows:
logger.warning("No workflows discovered")
else:
logger.info(
f"Discovered {len(self.workflows)} workflows: "
f"{list(self.workflows.keys())}"
)
# Connect to Temporal
self.client = await Client.connect(
self.temporal_address,
namespace=self.temporal_namespace
)
logger.info(f"✓ Connected to Temporal: {self.temporal_address}")
except Exception as e:
logger.error(f"Failed to initialize Temporal manager: {e}", exc_info=True)
raise
async def close(self):
"""Close Temporal client connection."""
if self.client:
# Temporal client doesn't need explicit close in Python SDK
pass
async def get_workflows(self) -> Dict[str, WorkflowInfo]:
"""
Get all discovered workflows.
Returns:
Dictionary mapping workflow names to their info
"""
return self.workflows
async def get_workflow(self, name: str) -> Optional[WorkflowInfo]:
"""
Get workflow info by name.
Args:
name: Workflow name
Returns:
WorkflowInfo or None if not found
"""
return self.workflows.get(name)
async def upload_target(
self,
file_path: Path,
user_id: str,
metadata: Optional[Dict[str, Any]] = None
) -> str:
"""
Upload target file to storage.
Args:
file_path: Local path to file
user_id: User ID
metadata: Optional metadata
Returns:
Target ID for use in workflow execution
"""
target_id = await self.storage.upload_target(file_path, user_id, metadata)
logger.info(f"Uploaded target: {target_id}")
return target_id
async def run_workflow(
self,
workflow_name: str,
target_id: str,
workflow_params: Optional[Dict[str, Any]] = None,
workflow_id: Optional[str] = None
) -> WorkflowHandle:
"""
Execute a workflow.
Args:
workflow_name: Name of workflow to execute
target_id: Target ID (from upload_target)
workflow_params: Additional workflow parameters
workflow_id: Optional workflow ID (generated if not provided)
Returns:
WorkflowHandle for monitoring/results
Raises:
ValueError: If workflow not found or client not initialized
"""
if not self.client:
raise ValueError("Temporal client not initialized. Call initialize() first.")
# Get workflow info
workflow_info = self.workflows.get(workflow_name)
if not workflow_info:
raise ValueError(f"Workflow not found: {workflow_name}")
# Generate workflow ID if not provided
if not workflow_id:
workflow_id = f"{workflow_name}-{str(uuid4())[:8]}"
# Prepare workflow input arguments
workflow_params = workflow_params or {}
# Build args list: [target_id, ...workflow_params in schema order]
# The workflow parameters are passed as individual positional args
workflow_args = [target_id]
# Add parameters in order based on metadata schema
# This ensures parameters match the workflow signature order
if workflow_params and 'parameters' in workflow_info.metadata:
param_schema = workflow_info.metadata['parameters'].get('properties', {})
# Iterate parameters in schema order and add values
for param_name in param_schema.keys():
param_value = workflow_params.get(param_name)
workflow_args.append(param_value)
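# Illustrative (hypothetical values): with a schema declaring properties
# {"target_name", "max_iterations"} and workflow_params={"max_iterations": 1000},
# workflow_args ends up as [target_id, None, 1000] - parameters missing from
# workflow_params are forwarded as None so the positional order is preserved.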
# Determine task queue from workflow vertical
vertical = workflow_info.metadata.get("vertical", "default")
task_queue = f"{vertical}-queue"
logger.info(
f"Starting workflow: {workflow_name} "
f"(id={workflow_id}, queue={task_queue}, target={target_id})"
)
logger.info(f"DEBUG: workflow_args = {workflow_args}")
logger.info(f"DEBUG: workflow_params received = {workflow_params}")
try:
# Start workflow execution with positional arguments
handle = await self.client.start_workflow(
workflow=workflow_info.workflow_type, # Workflow class name
args=workflow_args, # Positional arguments
id=workflow_id,
task_queue=task_queue,
retry_policy=RetryPolicy(
initial_interval=timedelta(seconds=1),
maximum_interval=timedelta(minutes=1),
maximum_attempts=3
)
)
logger.info(f"✓ Workflow started: {workflow_id}")
return handle
except Exception as e:
logger.error(f"Failed to start workflow {workflow_name}: {e}", exc_info=True)
raise
async def get_workflow_status(self, workflow_id: str) -> Dict[str, Any]:
"""
Get workflow execution status.
Args:
workflow_id: Workflow execution ID
Returns:
Status dictionary with workflow state
Raises:
ValueError: If client not initialized or workflow not found
"""
if not self.client:
raise ValueError("Temporal client not initialized")
try:
# Get workflow handle
handle = self.client.get_workflow_handle(workflow_id)
# Describe the workflow (non-blocking status query; does not wait for the result)
description = await handle.describe()
status = {
"workflow_id": workflow_id,
"status": description.status.name,
"start_time": description.start_time.isoformat() if description.start_time else None,
"execution_time": description.execution_time.isoformat() if description.execution_time else None,
"close_time": description.close_time.isoformat() if description.close_time else None,
"task_queue": description.task_queue,
}
logger.info(f"Workflow {workflow_id} status: {status['status']}")
return status
except Exception as e:
logger.error(f"Failed to get workflow status: {e}", exc_info=True)
raise
async def get_workflow_result(
self,
workflow_id: str,
timeout: Optional[timedelta] = None
) -> Any:
"""
Get workflow execution result (blocking).
Args:
workflow_id: Workflow execution ID
timeout: Maximum time to wait for result
Returns:
Workflow result
Raises:
ValueError: If client not initialized
TimeoutError: If timeout exceeded
"""
if not self.client:
raise ValueError("Temporal client not initialized")
try:
handle = self.client.get_workflow_handle(workflow_id)
logger.info(f"Waiting for workflow result: {workflow_id}")
# Wait for workflow to complete and get result
if timeout:
# Use asyncio timeout if provided
import asyncio
result = await asyncio.wait_for(handle.result(), timeout=timeout.total_seconds())
else:
result = await handle.result()
logger.info(f"✓ Workflow {workflow_id} completed")
return result
except Exception as e:
logger.error(f"Failed to get workflow result: {e}", exc_info=True)
raise
async def cancel_workflow(self, workflow_id: str) -> None:
"""
Cancel a running workflow.
Args:
workflow_id: Workflow execution ID
Raises:
ValueError: If client not initialized
"""
if not self.client:
raise ValueError("Temporal client not initialized")
try:
handle = self.client.get_workflow_handle(workflow_id)
await handle.cancel()
logger.info(f"✓ Workflow cancelled: {workflow_id}")
except Exception as e:
logger.error(f"Failed to cancel workflow: {e}", exc_info=True)
raise
async def list_workflows(
self,
filter_query: Optional[str] = None,
limit: int = 100
) -> list[Dict[str, Any]]:
"""
List workflow executions.
Args:
filter_query: Optional Temporal list filter query
limit: Maximum number of results
Returns:
List of workflow execution info
Raises:
ValueError: If client not initialized
"""
if not self.client:
raise ValueError("Temporal client not initialized")
try:
workflows = []
# Use Temporal's list API
async for workflow in self.client.list_workflows(filter_query):
workflows.append({
"workflow_id": workflow.id,
"workflow_type": workflow.workflow_type,
"status": workflow.status.name,
"start_time": workflow.start_time.isoformat() if workflow.start_time else None,
"close_time": workflow.close_time.isoformat() if workflow.close_time else None,
"task_queue": workflow.task_queue,
})
if len(workflows) >= limit:
break
logger.info(f"Listed {len(workflows)} workflows")
return workflows
except Exception as e:
logger.error(f"Failed to list workflows: {e}", exc_info=True)
raise
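
A hypothetical end-to-end sketch of the manager API above, assuming a Temporal server and a matching worker pool are running; the file name, workflow name, and parameters are illustrative.

```python
# Illustrative only: addresses, names, and parameters are assumptions.
import asyncio
from pathlib import Path

from src.temporal import TemporalManager  # assumes the package layout above

async def main() -> None:
    manager = TemporalManager(temporal_address="localhost:7233")
    await manager.initialize()

    target_id = await manager.upload_target(
        Path("./my_crate.tar.gz"), user_id="dev"
    )
    handle = await manager.run_workflow(
        workflow_name="rust_test",
        target_id=target_id,
        workflow_params={"max_iterations": 10_000},
    )

    status = await manager.get_workflow_status(handle.id)
    print(status["status"])

    result = await manager.get_workflow_result(handle.id)
    print(result)

asyncio.run(main())
```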

View File

@@ -1,119 +0,0 @@
# FuzzForge Test Suite
Comprehensive test infrastructure for FuzzForge modules and workflows.
## Directory Structure
```
tests/
├── conftest.py # Shared pytest fixtures
├── unit/ # Fast, isolated unit tests
│ ├── test_modules/ # Module-specific tests
│ │ ├── test_cargo_fuzzer.py
│ │ └── test_atheris_fuzzer.py
│ ├── test_workflows/ # Workflow tests
│ └── test_api/ # API endpoint tests
├── integration/ # Integration tests (requires Docker)
└── fixtures/ # Test data and projects
├── test_projects/ # Vulnerable projects for testing
└── expected_results/ # Expected output for validation
```
## Running Tests
### All Tests
```bash
cd backend
pytest tests/ -v
```
### Unit Tests Only (Fast)
```bash
pytest tests/unit/ -v
```
### Integration Tests (Requires Docker)
```bash
# Start services
docker-compose up -d
# Run integration tests
pytest tests/integration/ -v
# Cleanup
docker-compose down
```
### With Coverage
```bash
pytest tests/ --cov=toolbox/modules --cov=src --cov-report=html
```
### Parallel Execution
```bash
pytest tests/unit/ -n auto
```
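Parallel runs assume the `pytest-xdist` plugin is installed.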
## Available Fixtures
### Workspace Fixtures
- `temp_workspace`: Empty temporary workspace
- `python_test_workspace`: Python project with vulnerabilities
- `rust_test_workspace`: Rust project with fuzz targets
### Module Fixtures
- `atheris_fuzzer`: AtherisFuzzer instance
- `cargo_fuzzer`: CargoFuzzer instance
- `file_scanner`: FileScanner instance
### Configuration Fixtures
- `atheris_config`: Default Atheris configuration
- `cargo_fuzz_config`: Default cargo-fuzz configuration
- `gitleaks_config`: Default Gitleaks configuration
### Mock Fixtures
- `mock_stats_callback`: Mock stats callback for fuzzing
- `mock_temporal_context`: Mock Temporal activity context
## Writing Tests
### Unit Test Example
```python
import pytest
@pytest.mark.asyncio
async def test_module_execution(cargo_fuzzer, rust_test_workspace, cargo_fuzz_config):
"""Test module execution"""
result = await cargo_fuzzer.execute(cargo_fuzz_config, rust_test_workspace)
assert result.status == "success"
assert result.execution_time > 0
```
### Integration Test Example
```python
@pytest.mark.integration
async def test_end_to_end_workflow():
"""Test complete workflow execution"""
# Test full workflow with real services
pass
```
## CI/CD Integration
Tests run automatically on:
- **Push to main/develop**: Full test suite
- **Pull requests**: Full test suite + coverage
- **Nightly**: Extended integration tests
See `.github/workflows/test.yml` for configuration.
## Code Coverage
Target coverage: **80%+** for core modules
View coverage report:
```bash
pytest tests/ --cov --cov-report=html
open htmlcov/index.html
```

View File

@@ -1,230 +0,0 @@
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import sys
from pathlib import Path
from typing import Dict, Any
import pytest
# Ensure project root is on sys.path so `src` is importable
ROOT = Path(__file__).resolve().parents[1]
if str(ROOT) not in sys.path:
sys.path.insert(0, str(ROOT))
# Add toolbox to path for module imports
TOOLBOX = ROOT / "toolbox"
if str(TOOLBOX) not in sys.path:
sys.path.insert(0, str(TOOLBOX))
# ============================================================================
# Workspace Fixtures
# ============================================================================
@pytest.fixture
def temp_workspace(tmp_path):
"""Create a temporary workspace directory for testing"""
workspace = tmp_path / "workspace"
workspace.mkdir()
return workspace
@pytest.fixture
def python_test_workspace(temp_workspace):
"""Create a Python test workspace with sample files"""
# Create a simple Python project structure
(temp_workspace / "main.py").write_text("""
def process_data(data):
# Intentional bug: no bounds checking
return data[0:100]
def divide(a, b):
# Division by zero vulnerability
return a / b
""")
(temp_workspace / "config.py").write_text("""
# Hardcoded secrets for testing
API_KEY = "sk_test_1234567890abcdef"
DATABASE_URL = "postgresql://admin:password123@localhost/db"
AWS_SECRET = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
""")
return temp_workspace
@pytest.fixture
def rust_test_workspace(temp_workspace):
"""Create a Rust test workspace with fuzz targets"""
# Create Cargo.toml
(temp_workspace / "Cargo.toml").write_text("""[package]
name = "test_project"
version = "0.1.0"
edition = "2021"
[dependencies]
""")
# Create src/lib.rs
src_dir = temp_workspace / "src"
src_dir.mkdir()
(src_dir / "lib.rs").write_text("""
pub fn process_buffer(data: &[u8]) -> Vec<u8> {
if data.len() < 4 {
return Vec::new();
}
// Vulnerability: bounds checking issue
let size = data[0] as usize;
let mut result = Vec::new();
for i in 0..size {
result.push(data[i]);
}
result
}
""")
# Create fuzz directory structure
fuzz_dir = temp_workspace / "fuzz"
fuzz_dir.mkdir()
(fuzz_dir / "Cargo.toml").write_text("""[package]
name = "test_project-fuzz"
version = "0.0.0"
edition = "2021"
[dependencies]
libfuzzer-sys = "0.4"
[dependencies.test_project]
path = ".."
[[bin]]
name = "fuzz_target_1"
path = "fuzz_targets/fuzz_target_1.rs"
""")
fuzz_targets_dir = fuzz_dir / "fuzz_targets"
fuzz_targets_dir.mkdir()
(fuzz_targets_dir / "fuzz_target_1.rs").write_text("""#![no_main]
use libfuzzer_sys::fuzz_target;
use test_project::process_buffer;
fuzz_target!(|data: &[u8]| {
let _ = process_buffer(data);
});
""")
return temp_workspace
# ============================================================================
# Module Configuration Fixtures
# ============================================================================
@pytest.fixture
def atheris_config():
"""Default Atheris fuzzer configuration"""
return {
"target_file": "auto-discover",
"max_iterations": 1000,
"timeout_seconds": 10,
"corpus_dir": None
}
@pytest.fixture
def cargo_fuzz_config():
"""Default cargo-fuzz configuration"""
return {
"target_name": None,
"max_iterations": 1000,
"timeout_seconds": 10,
"sanitizer": "address"
}
@pytest.fixture
def gitleaks_config():
"""Default Gitleaks configuration"""
return {
"config_path": None,
"scan_uncommitted": True
}
@pytest.fixture
def file_scanner_config():
"""Default file scanner configuration"""
return {
"scan_patterns": ["*.py", "*.rs", "*.js"],
"exclude_patterns": ["*.test.*", "*.spec.*"],
"max_file_size": 1048576 # 1MB
}
# ============================================================================
# Module Instance Fixtures
# ============================================================================
@pytest.fixture
def atheris_fuzzer():
"""Create an AtherisFuzzer instance"""
from modules.fuzzer.atheris_fuzzer import AtherisFuzzer
return AtherisFuzzer()
@pytest.fixture
def cargo_fuzzer():
"""Create a CargoFuzzer instance"""
from modules.fuzzer.cargo_fuzzer import CargoFuzzer
return CargoFuzzer()
@pytest.fixture
def file_scanner():
"""Create a FileScanner instance"""
from modules.scanner.file_scanner import FileScanner
return FileScanner()
# ============================================================================
# Mock Fixtures
# ============================================================================
@pytest.fixture
def mock_stats_callback():
"""Mock stats callback for fuzzing"""
stats_received = []
async def callback(stats: Dict[str, Any]):
stats_received.append(stats)
callback.stats_received = stats_received
return callback
@pytest.fixture
def mock_temporal_context():
"""Mock Temporal activity context"""
class MockActivityInfo:
def __init__(self):
self.workflow_id = "test-workflow-123"
self.activity_id = "test-activity-1"
self.attempt = 1
class MockContext:
def __init__(self):
self.info = MockActivityInfo()
return MockContext()

View File

@@ -1,177 +0,0 @@
"""
Unit tests for AtherisFuzzer module
"""
import pytest
from unittest.mock import AsyncMock, patch
@pytest.mark.asyncio
class TestAtherisFuzzerMetadata:
"""Test AtherisFuzzer metadata"""
async def test_metadata_structure(self, atheris_fuzzer):
"""Test that module metadata is properly defined"""
metadata = atheris_fuzzer.get_metadata()
assert metadata.name == "atheris_fuzzer"
assert metadata.category == "fuzzer"
assert "fuzzing" in metadata.tags
assert "python" in metadata.tags
@pytest.mark.asyncio
class TestAtherisFuzzerConfigValidation:
"""Test configuration validation"""
async def test_valid_config(self, atheris_fuzzer, atheris_config):
"""Test validation of valid configuration"""
assert atheris_fuzzer.validate_config(atheris_config) is True
async def test_invalid_max_iterations(self, atheris_fuzzer):
"""Test validation fails with invalid max_iterations"""
config = {
"target_file": "fuzz_target.py",
"max_iterations": -1,
"timeout_seconds": 10
}
with pytest.raises(ValueError, match="max_iterations"):
atheris_fuzzer.validate_config(config)
async def test_invalid_timeout(self, atheris_fuzzer):
"""Test validation fails with invalid timeout"""
config = {
"target_file": "fuzz_target.py",
"max_iterations": 1000,
"timeout_seconds": 0
}
with pytest.raises(ValueError, match="timeout_seconds"):
atheris_fuzzer.validate_config(config)
@pytest.mark.asyncio
class TestAtherisFuzzerDiscovery:
"""Test fuzz target discovery"""
async def test_auto_discover(self, atheris_fuzzer, python_test_workspace):
"""Test auto-discovery of Python fuzz targets"""
# Create a fuzz target file
(python_test_workspace / "fuzz_target.py").write_text("""
import atheris
import sys
def TestOneInput(data):
pass
if __name__ == "__main__":
atheris.Setup(sys.argv, TestOneInput)
atheris.Fuzz()
""")
# Pass None for auto-discovery
target = atheris_fuzzer._discover_target(python_test_workspace, None)
assert target is not None
assert "fuzz_target.py" in str(target)
@pytest.mark.asyncio
class TestAtherisFuzzerExecution:
"""Test fuzzer execution logic"""
async def test_execution_creates_result(self, atheris_fuzzer, python_test_workspace, atheris_config):
"""Test that execution returns a ModuleResult"""
# Create a simple fuzz target
(python_test_workspace / "fuzz_target.py").write_text("""
import atheris
import sys
def TestOneInput(data):
if len(data) > 0:
pass
if __name__ == "__main__":
atheris.Setup(sys.argv, TestOneInput)
atheris.Fuzz()
""")
# Use a very short timeout for testing
test_config = {
"target_file": "fuzz_target.py",
"max_iterations": 10,
"timeout_seconds": 1
}
# Mock the fuzzing subprocess to avoid actual execution
with patch.object(atheris_fuzzer, '_run_fuzzing', new_callable=AsyncMock, return_value=([], {"total_executions": 10})):
result = await atheris_fuzzer.execute(test_config, python_test_workspace)
assert result.module == "atheris_fuzzer"
assert result.status in ["success", "partial", "failed"]
assert isinstance(result.execution_time, float)
@pytest.mark.asyncio
class TestAtherisFuzzerStatsCallback:
"""Test stats callback functionality"""
async def test_stats_callback_invoked(self, atheris_fuzzer, python_test_workspace, atheris_config, mock_stats_callback):
"""Test that stats callback is invoked during fuzzing"""
(python_test_workspace / "fuzz_target.py").write_text("""
import atheris
import sys
def TestOneInput(data):
pass
if __name__ == "__main__":
atheris.Setup(sys.argv, TestOneInput)
atheris.Fuzz()
""")
# Mock fuzzing to simulate stats
async def mock_run_fuzzing(test_one_input, target_path, workspace, max_iterations, timeout_seconds, stats_callback):
if stats_callback:
await stats_callback({
"total_execs": 100,
"execs_per_sec": 10.0,
"crashes": 0,
"coverage": 5,
"corpus_size": 2,
"elapsed_time": 10
})
return
with patch.object(atheris_fuzzer, '_run_fuzzing', side_effect=mock_run_fuzzing):
with patch.object(atheris_fuzzer, '_load_target_module', return_value=lambda x: None):
# Put stats_callback in config dict, not as kwarg
atheris_config["target_file"] = "fuzz_target.py"
atheris_config["stats_callback"] = mock_stats_callback
await atheris_fuzzer.execute(atheris_config, python_test_workspace)
# Verify callback was invoked
assert len(mock_stats_callback.stats_received) > 0
@pytest.mark.asyncio
class TestAtherisFuzzerFindingGeneration:
"""Test finding generation from crashes"""
async def test_create_crash_finding(self, atheris_fuzzer):
"""Test crash finding creation"""
finding = atheris_fuzzer.create_finding(
title="Crash: Exception in TestOneInput",
description="IndexError: list index out of range",
severity="high",
category="crash",
file_path="fuzz_target.py",
metadata={
"crash_type": "IndexError",
"stack_trace": "Traceback..."
}
)
assert finding.title == "Crash: Exception in TestOneInput"
assert finding.severity == "high"
assert finding.category == "crash"
assert "IndexError" in finding.metadata["crash_type"]

View File

@@ -1,177 +0,0 @@
"""
Unit tests for CargoFuzzer module
"""
import pytest
from unittest.mock import AsyncMock, patch
@pytest.mark.asyncio
class TestCargoFuzzerMetadata:
"""Test CargoFuzzer metadata"""
async def test_metadata_structure(self, cargo_fuzzer):
"""Test that module metadata is properly defined"""
metadata = cargo_fuzzer.get_metadata()
assert metadata.name == "cargo_fuzz"
assert metadata.version == "0.11.2"
assert metadata.category == "fuzzer"
assert "fuzzing" in metadata.tags
assert "rust" in metadata.tags
@pytest.mark.asyncio
class TestCargoFuzzerConfigValidation:
"""Test configuration validation"""
async def test_valid_config(self, cargo_fuzzer, cargo_fuzz_config):
"""Test validation of valid configuration"""
assert cargo_fuzzer.validate_config(cargo_fuzz_config) is True
async def test_invalid_max_iterations(self, cargo_fuzzer):
"""Test validation fails with invalid max_iterations"""
config = {
"max_iterations": -1,
"timeout_seconds": 10,
"sanitizer": "address"
}
with pytest.raises(ValueError, match="max_iterations"):
cargo_fuzzer.validate_config(config)
async def test_invalid_timeout(self, cargo_fuzzer):
"""Test validation fails with invalid timeout"""
config = {
"max_iterations": 1000,
"timeout_seconds": 0,
"sanitizer": "address"
}
with pytest.raises(ValueError, match="timeout_seconds"):
cargo_fuzzer.validate_config(config)
async def test_invalid_sanitizer(self, cargo_fuzzer):
"""Test validation fails with invalid sanitizer"""
config = {
"max_iterations": 1000,
"timeout_seconds": 10,
"sanitizer": "invalid_sanitizer"
}
with pytest.raises(ValueError, match="sanitizer"):
cargo_fuzzer.validate_config(config)
@pytest.mark.asyncio
class TestCargoFuzzerWorkspaceValidation:
"""Test workspace validation"""
async def test_valid_workspace(self, cargo_fuzzer, rust_test_workspace):
"""Test validation of valid workspace"""
assert cargo_fuzzer.validate_workspace(rust_test_workspace) is True
async def test_nonexistent_workspace(self, cargo_fuzzer, tmp_path):
"""Test validation fails with nonexistent workspace"""
nonexistent = tmp_path / "does_not_exist"
with pytest.raises(ValueError, match="does not exist"):
cargo_fuzzer.validate_workspace(nonexistent)
async def test_workspace_is_file(self, cargo_fuzzer, tmp_path):
"""Test validation fails when workspace is a file"""
file_path = tmp_path / "file.txt"
file_path.write_text("test")
with pytest.raises(ValueError, match="not a directory"):
cargo_fuzzer.validate_workspace(file_path)
@pytest.mark.asyncio
class TestCargoFuzzerDiscovery:
"""Test fuzz target discovery"""
async def test_discover_targets(self, cargo_fuzzer, rust_test_workspace):
"""Test discovery of fuzz targets"""
targets = await cargo_fuzzer._discover_fuzz_targets(rust_test_workspace)
assert len(targets) == 1
assert "fuzz_target_1" in targets
async def test_no_fuzz_directory(self, cargo_fuzzer, temp_workspace):
"""Test discovery with no fuzz directory"""
targets = await cargo_fuzzer._discover_fuzz_targets(temp_workspace)
assert targets == []
@pytest.mark.asyncio
class TestCargoFuzzerExecution:
"""Test fuzzer execution logic"""
async def test_execution_creates_result(self, cargo_fuzzer, rust_test_workspace, cargo_fuzz_config):
"""Test that execution returns a ModuleResult"""
# Mock the build and run methods to avoid actual fuzzing
with patch.object(cargo_fuzzer, '_build_fuzz_target', new_callable=AsyncMock, return_value=True):
with patch.object(cargo_fuzzer, '_run_fuzzing', new_callable=AsyncMock, return_value=([], {"total_executions": 0, "crashes_found": 0})):
with patch.object(cargo_fuzzer, '_parse_crash_artifacts', new_callable=AsyncMock, return_value=[]):
result = await cargo_fuzzer.execute(cargo_fuzz_config, rust_test_workspace)
assert result.module == "cargo_fuzz"
assert result.status == "success"
assert isinstance(result.execution_time, float)
assert result.execution_time >= 0
async def test_execution_with_no_targets(self, cargo_fuzzer, temp_workspace, cargo_fuzz_config):
"""Test execution fails gracefully with no fuzz targets"""
result = await cargo_fuzzer.execute(cargo_fuzz_config, temp_workspace)
assert result.status == "failed"
assert "No fuzz targets found" in result.error
@pytest.mark.asyncio
class TestCargoFuzzerStatsCallback:
"""Test stats callback functionality"""
async def test_stats_callback_invoked(self, cargo_fuzzer, rust_test_workspace, cargo_fuzz_config, mock_stats_callback):
"""Test that stats callback is invoked during fuzzing"""
# Mock build/run to simulate stats generation
async def mock_run_fuzzing(workspace, target, config, callback):
# Simulate stats callback
if callback:
await callback({
"total_execs": 1000,
"execs_per_sec": 100.0,
"crashes": 0,
"coverage": 10,
"corpus_size": 5,
"elapsed_time": 10
})
return [], {"total_executions": 1000}
with patch.object(cargo_fuzzer, '_build_fuzz_target', new_callable=AsyncMock, return_value=True):
with patch.object(cargo_fuzzer, '_run_fuzzing', side_effect=mock_run_fuzzing):
with patch.object(cargo_fuzzer, '_parse_crash_artifacts', new_callable=AsyncMock, return_value=[]):
await cargo_fuzzer.execute(cargo_fuzz_config, rust_test_workspace, stats_callback=mock_stats_callback)
# Verify callback was invoked
assert len(mock_stats_callback.stats_received) > 0
assert mock_stats_callback.stats_received[0]["total_execs"] == 1000
@pytest.mark.asyncio
class TestCargoFuzzerFindingGeneration:
"""Test finding generation from crashes"""
async def test_create_finding_from_crash(self, cargo_fuzzer):
"""Test finding creation"""
finding = cargo_fuzzer.create_finding(
title="Crash: Segmentation Fault",
description="Test crash",
severity="critical",
category="crash",
file_path="fuzz/fuzz_targets/fuzz_target_1.rs",
metadata={"crash_type": "SIGSEGV"}
)
assert finding.title == "Crash: Segmentation Fault"
assert finding.severity == "critical"
assert finding.category == "crash"
assert finding.file_path == "fuzz/fuzz_targets/fuzz_target_1.rs"
assert finding.metadata["crash_type"] == "SIGSEGV"

View File

@@ -1,349 +0,0 @@
"""
Unit tests for FileScanner module
"""
import sys
from pathlib import Path
import pytest
sys.path.insert(0, str(Path(__file__).resolve().parents[3] / "toolbox"))
@pytest.mark.asyncio
class TestFileScannerMetadata:
"""Test FileScanner metadata"""
async def test_metadata_structure(self, file_scanner):
"""Test that metadata has correct structure"""
metadata = file_scanner.get_metadata()
assert metadata.name == "file_scanner"
assert metadata.version == "1.0.0"
assert metadata.category == "scanner"
assert "files" in metadata.tags
assert "enumeration" in metadata.tags
assert metadata.requires_workspace is True
@pytest.mark.asyncio
class TestFileScannerConfigValidation:
"""Test configuration validation"""
async def test_valid_config(self, file_scanner):
"""Test that valid config passes validation"""
config = {
"patterns": ["*.py", "*.js"],
"max_file_size": 1048576,
"check_sensitive": True,
"calculate_hashes": False
}
assert file_scanner.validate_config(config) is True
async def test_default_config(self, file_scanner):
"""Test that empty config uses defaults"""
config = {}
assert file_scanner.validate_config(config) is True
async def test_invalid_patterns_type(self, file_scanner):
"""Test that non-list patterns raises error"""
config = {"patterns": "*.py"}
with pytest.raises(ValueError, match="patterns must be a list"):
file_scanner.validate_config(config)
async def test_invalid_max_file_size(self, file_scanner):
"""Test that invalid max_file_size raises error"""
config = {"max_file_size": -1}
with pytest.raises(ValueError, match="max_file_size must be a positive integer"):
file_scanner.validate_config(config)
async def test_invalid_max_file_size_type(self, file_scanner):
"""Test that non-integer max_file_size raises error"""
config = {"max_file_size": "large"}
with pytest.raises(ValueError, match="max_file_size must be a positive integer"):
file_scanner.validate_config(config)
@pytest.mark.asyncio
class TestFileScannerExecution:
"""Test scanner execution"""
async def test_scan_python_files(self, file_scanner, python_test_workspace):
"""Test scanning Python files"""
config = {
"patterns": ["*.py"],
"check_sensitive": False,
"calculate_hashes": False
}
result = await file_scanner.execute(config, python_test_workspace)
assert result.module == "file_scanner"
assert result.status == "success"
assert len(result.findings) > 0
# Check that Python files were found
python_files = [f for f in result.findings if f.file_path.endswith('.py')]
assert len(python_files) > 0
async def test_scan_all_files(self, file_scanner, python_test_workspace):
"""Test scanning all files with wildcard"""
config = {
"patterns": ["*"],
"check_sensitive": False,
"calculate_hashes": False
}
result = await file_scanner.execute(config, python_test_workspace)
assert result.status == "success"
assert len(result.findings) > 0
assert result.summary["total_files"] > 0
async def test_scan_with_multiple_patterns(self, file_scanner, python_test_workspace):
"""Test scanning with multiple patterns"""
config = {
"patterns": ["*.py", "*.txt"],
"check_sensitive": False,
"calculate_hashes": False
}
result = await file_scanner.execute(config, python_test_workspace)
assert result.status == "success"
assert len(result.findings) > 0
async def test_empty_workspace(self, file_scanner, temp_workspace):
"""Test scanning empty workspace"""
config = {
"patterns": ["*.py"],
"check_sensitive": False
}
result = await file_scanner.execute(config, temp_workspace)
assert result.status == "success"
assert len(result.findings) == 0
assert result.summary["total_files"] == 0
@pytest.mark.asyncio
class TestFileScannerSensitiveDetection:
"""Test sensitive file detection"""
async def test_detect_env_file(self, file_scanner, temp_workspace):
"""Test detection of .env file"""
# Create .env file
(temp_workspace / ".env").write_text("API_KEY=secret123")
config = {
"patterns": ["*"],
"check_sensitive": True,
"calculate_hashes": False
}
result = await file_scanner.execute(config, temp_workspace)
assert result.status == "success"
# Check for sensitive file finding
sensitive_findings = [f for f in result.findings if f.category == "sensitive_file"]
assert len(sensitive_findings) > 0
assert any(".env" in f.title for f in sensitive_findings)
async def test_detect_private_key(self, file_scanner, temp_workspace):
"""Test detection of private key file"""
# Create private key file
(temp_workspace / "id_rsa").write_text("-----BEGIN RSA PRIVATE KEY-----")
config = {
"patterns": ["*"],
"check_sensitive": True
}
result = await file_scanner.execute(config, temp_workspace)
assert result.status == "success"
sensitive_findings = [f for f in result.findings if f.category == "sensitive_file"]
assert len(sensitive_findings) > 0
async def test_no_sensitive_detection_when_disabled(self, file_scanner, temp_workspace):
"""Test that sensitive detection can be disabled"""
(temp_workspace / ".env").write_text("API_KEY=secret123")
config = {
"patterns": ["*"],
"check_sensitive": False
}
result = await file_scanner.execute(config, temp_workspace)
assert result.status == "success"
sensitive_findings = [f for f in result.findings if f.category == "sensitive_file"]
assert len(sensitive_findings) == 0
@pytest.mark.asyncio
class TestFileScannerHashing:
"""Test file hashing functionality"""
async def test_hash_calculation(self, file_scanner, temp_workspace):
"""Test SHA256 hash calculation"""
# Create test file
test_file = temp_workspace / "test.txt"
test_file.write_text("Hello World")
config = {
"patterns": ["*.txt"],
"calculate_hashes": True
}
result = await file_scanner.execute(config, temp_workspace)
assert result.status == "success"
# Find the test.txt finding
txt_findings = [f for f in result.findings if "test.txt" in f.file_path]
assert len(txt_findings) > 0
# Check that hash was calculated
finding = txt_findings[0]
assert finding.metadata.get("file_hash") is not None
assert len(finding.metadata["file_hash"]) == 64 # SHA256 hex length
async def test_no_hash_when_disabled(self, file_scanner, temp_workspace):
"""Test that hashing can be disabled"""
test_file = temp_workspace / "test.txt"
test_file.write_text("Hello World")
config = {
"patterns": ["*.txt"],
"calculate_hashes": False
}
result = await file_scanner.execute(config, temp_workspace)
assert result.status == "success"
txt_findings = [f for f in result.findings if "test.txt" in f.file_path]
if len(txt_findings) > 0:
finding = txt_findings[0]
assert finding.metadata.get("file_hash") is None
@pytest.mark.asyncio
class TestFileScannerFileTypes:
"""Test file type detection"""
async def test_detect_python_type(self, file_scanner, temp_workspace):
"""Test detection of Python file type"""
(temp_workspace / "script.py").write_text("print('hello')")
config = {"patterns": ["*.py"]}
result = await file_scanner.execute(config, temp_workspace)
assert result.status == "success"
py_findings = [f for f in result.findings if "script.py" in f.file_path]
assert len(py_findings) > 0
assert "python" in py_findings[0].metadata["file_type"]
async def test_detect_javascript_type(self, file_scanner, temp_workspace):
"""Test detection of JavaScript file type"""
(temp_workspace / "app.js").write_text("console.log('hello')")
config = {"patterns": ["*.js"]}
result = await file_scanner.execute(config, temp_workspace)
assert result.status == "success"
js_findings = [f for f in result.findings if "app.js" in f.file_path]
assert len(js_findings) > 0
assert "javascript" in js_findings[0].metadata["file_type"]
async def test_file_type_summary(self, file_scanner, temp_workspace):
"""Test that file type summary is generated"""
(temp_workspace / "script.py").write_text("print('hello')")
(temp_workspace / "app.js").write_text("console.log('hello')")
(temp_workspace / "readme.txt").write_text("Documentation")
config = {"patterns": ["*"]}
result = await file_scanner.execute(config, temp_workspace)
assert result.status == "success"
assert "file_types" in result.summary
assert len(result.summary["file_types"]) > 0
@pytest.mark.asyncio
class TestFileScannerSizeLimits:
"""Test file size handling"""
async def test_skip_large_files(self, file_scanner, temp_workspace):
"""Test that large files are skipped"""
# Create a "large" file
large_file = temp_workspace / "large.txt"
large_file.write_text("x" * 1000)
config = {
"patterns": ["*.txt"],
"max_file_size": 500 # Set limit smaller than file
}
result = await file_scanner.execute(config, temp_workspace)
# Should succeed but skip the large file
assert result.status == "success"
# The file should still be counted but not have a detailed finding
assert result.summary["total_files"] > 0
async def test_process_small_files(self, file_scanner, temp_workspace):
"""Test that small files are processed"""
small_file = temp_workspace / "small.txt"
small_file.write_text("small content")
config = {
"patterns": ["*.txt"],
"max_file_size": 1048576 # 1MB
}
result = await file_scanner.execute(config, temp_workspace)
assert result.status == "success"
txt_findings = [f for f in result.findings if "small.txt" in f.file_path]
assert len(txt_findings) > 0
@pytest.mark.asyncio
class TestFileScannerSummary:
"""Test result summary generation"""
async def test_summary_structure(self, file_scanner, python_test_workspace):
"""Test that summary has correct structure"""
config = {"patterns": ["*"]}
result = await file_scanner.execute(config, python_test_workspace)
assert result.status == "success"
assert "total_files" in result.summary
assert "total_size_bytes" in result.summary
assert "file_types" in result.summary
assert "patterns_scanned" in result.summary
assert isinstance(result.summary["total_files"], int)
assert isinstance(result.summary["total_size_bytes"], int)
assert isinstance(result.summary["file_types"], dict)
assert isinstance(result.summary["patterns_scanned"], list)
async def test_summary_counts(self, file_scanner, temp_workspace):
"""Test that summary counts are accurate"""
# Create known files
(temp_workspace / "file1.py").write_text("content1")
(temp_workspace / "file2.py").write_text("content2")
(temp_workspace / "file3.txt").write_text("content3")
config = {"patterns": ["*"]}
result = await file_scanner.execute(config, temp_workspace)
assert result.status == "success"
assert result.summary["total_files"] == 3
assert result.summary["total_size_bytes"] > 0


@@ -1,493 +0,0 @@
"""
Unit tests for SecurityAnalyzer module
"""
import pytest
import sys
from pathlib import Path
sys.path.insert(0, str(Path(__file__).resolve().parents[3] / "toolbox"))
from modules.analyzer.security_analyzer import SecurityAnalyzer
@pytest.fixture
def security_analyzer():
"""Create SecurityAnalyzer instance"""
return SecurityAnalyzer()
@pytest.mark.asyncio
class TestSecurityAnalyzerMetadata:
"""Test SecurityAnalyzer metadata"""
async def test_metadata_structure(self, security_analyzer):
"""Test that metadata has correct structure"""
metadata = security_analyzer.get_metadata()
assert metadata.name == "security_analyzer"
assert metadata.version == "1.0.0"
assert metadata.category == "analyzer"
assert "security" in metadata.tags
assert "vulnerabilities" in metadata.tags
assert metadata.requires_workspace is True
@pytest.mark.asyncio
class TestSecurityAnalyzerConfigValidation:
"""Test configuration validation"""
async def test_valid_config(self, security_analyzer):
"""Test that valid config passes validation"""
config = {
"file_extensions": [".py", ".js"],
"check_secrets": True,
"check_sql": True,
"check_dangerous_functions": True
}
assert security_analyzer.validate_config(config) is True
async def test_default_config(self, security_analyzer):
"""Test that empty config uses defaults"""
config = {}
assert security_analyzer.validate_config(config) is True
async def test_invalid_extensions_type(self, security_analyzer):
"""Test that non-list extensions raises error"""
config = {"file_extensions": ".py"}
with pytest.raises(ValueError, match="file_extensions must be a list"):
security_analyzer.validate_config(config)
@pytest.mark.asyncio
class TestSecurityAnalyzerSecretDetection:
"""Test hardcoded secret detection"""
async def test_detect_api_key(self, security_analyzer, temp_workspace):
"""Test detection of hardcoded API key"""
code_file = temp_workspace / "config.py"
code_file.write_text("""
# Configuration file
api_key = "apikey_live_abcdefghijklmnopqrstuvwxyzabcdefghijk"
database_url = "postgresql://localhost/db"
""")
config = {
"file_extensions": [".py"],
"check_secrets": True,
"check_sql": False,
"check_dangerous_functions": False
}
result = await security_analyzer.execute(config, temp_workspace)
assert result.status == "success"
secret_findings = [f for f in result.findings if f.category == "hardcoded_secret"]
assert len(secret_findings) > 0
assert any("API Key" in f.title for f in secret_findings)
async def test_detect_password(self, security_analyzer, temp_workspace):
"""Test detection of hardcoded password"""
code_file = temp_workspace / "auth.py"
code_file.write_text("""
def connect():
password = "mySecretP@ssw0rd"
return connect_db(password)
""")
config = {
"file_extensions": [".py"],
"check_secrets": True,
"check_sql": False,
"check_dangerous_functions": False
}
result = await security_analyzer.execute(config, temp_workspace)
assert result.status == "success"
secret_findings = [f for f in result.findings if f.category == "hardcoded_secret"]
assert len(secret_findings) > 0
async def test_detect_aws_credentials(self, security_analyzer, temp_workspace):
"""Test detection of AWS credentials"""
code_file = temp_workspace / "aws_config.py"
code_file.write_text("""
aws_access_key = "AKIAIOSFODNN7REALKEY"
aws_secret_key = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYREALKEY"
""")
config = {
"file_extensions": [".py"],
"check_secrets": True
}
result = await security_analyzer.execute(config, temp_workspace)
assert result.status == "success"
aws_findings = [f for f in result.findings if "AWS" in f.title]
assert len(aws_findings) >= 2 # Both access key and secret key
async def test_no_secret_detection_when_disabled(self, security_analyzer, temp_workspace):
"""Test that secret detection can be disabled"""
code_file = temp_workspace / "config.py"
code_file.write_text('api_key = "sk_live_1234567890abcdef"')
config = {
"file_extensions": [".py"],
"check_secrets": False
}
result = await security_analyzer.execute(config, temp_workspace)
assert result.status == "success"
secret_findings = [f for f in result.findings if f.category == "hardcoded_secret"]
assert len(secret_findings) == 0
@pytest.mark.asyncio
class TestSecurityAnalyzerSQLInjection:
"""Test SQL injection detection"""
async def test_detect_string_concatenation(self, security_analyzer, temp_workspace):
"""Test detection of SQL string concatenation"""
code_file = temp_workspace / "db.py"
code_file.write_text("""
def get_user(user_id):
query = "SELECT * FROM users WHERE id = " + user_id
return execute(query)
""")
config = {
"file_extensions": [".py"],
"check_secrets": False,
"check_sql": True,
"check_dangerous_functions": False
}
result = await security_analyzer.execute(config, temp_workspace)
assert result.status == "success"
sql_findings = [f for f in result.findings if f.category == "sql_injection"]
assert len(sql_findings) > 0
async def test_detect_f_string_sql(self, security_analyzer, temp_workspace):
"""Test detection of f-string in SQL"""
code_file = temp_workspace / "db.py"
code_file.write_text("""
def get_user(name):
query = f"SELECT * FROM users WHERE name = '{name}'"
return execute(query)
""")
config = {
"file_extensions": [".py"],
"check_sql": True
}
result = await security_analyzer.execute(config, temp_workspace)
assert result.status == "success"
sql_findings = [f for f in result.findings if f.category == "sql_injection"]
assert len(sql_findings) > 0
async def test_detect_dynamic_query_building(self, security_analyzer, temp_workspace):
"""Test detection of dynamic query building"""
code_file = temp_workspace / "queries.py"
code_file.write_text("""
def search(keyword):
query = "SELECT * FROM products WHERE name LIKE " + keyword
execute(query + " ORDER BY price")
""")
config = {
"file_extensions": [".py"],
"check_sql": True
}
result = await security_analyzer.execute(config, temp_workspace)
assert result.status == "success"
sql_findings = [f for f in result.findings if f.category == "sql_injection"]
assert len(sql_findings) > 0
async def test_no_sql_detection_when_disabled(self, security_analyzer, temp_workspace):
"""Test that SQL detection can be disabled"""
code_file = temp_workspace / "db.py"
code_file.write_text('query = "SELECT * FROM users WHERE id = " + user_id')
config = {
"file_extensions": [".py"],
"check_sql": False
}
result = await security_analyzer.execute(config, temp_workspace)
assert result.status == "success"
sql_findings = [f for f in result.findings if f.category == "sql_injection"]
assert len(sql_findings) == 0
@pytest.mark.asyncio
class TestSecurityAnalyzerDangerousFunctions:
"""Test dangerous function detection"""
async def test_detect_eval(self, security_analyzer, temp_workspace):
"""Test detection of eval() usage"""
code_file = temp_workspace / "dangerous.py"
code_file.write_text("""
def process_input(user_input):
result = eval(user_input)
return result
""")
config = {
"file_extensions": [".py"],
"check_secrets": False,
"check_sql": False,
"check_dangerous_functions": True
}
result = await security_analyzer.execute(config, temp_workspace)
assert result.status == "success"
dangerous_findings = [f for f in result.findings if f.category == "dangerous_function"]
assert len(dangerous_findings) > 0
assert any("eval" in f.title.lower() for f in dangerous_findings)
async def test_detect_exec(self, security_analyzer, temp_workspace):
"""Test detection of exec() usage"""
code_file = temp_workspace / "runner.py"
code_file.write_text("""
def run_code(code):
exec(code)
""")
config = {
"file_extensions": [".py"],
"check_dangerous_functions": True
}
result = await security_analyzer.execute(config, temp_workspace)
assert result.status == "success"
dangerous_findings = [f for f in result.findings if f.category == "dangerous_function"]
assert len(dangerous_findings) > 0
async def test_detect_os_system(self, security_analyzer, temp_workspace):
"""Test detection of os.system() usage"""
code_file = temp_workspace / "commands.py"
code_file.write_text("""
import os
def run_command(cmd):
os.system(cmd)
""")
config = {
"file_extensions": [".py"],
"check_dangerous_functions": True
}
result = await security_analyzer.execute(config, temp_workspace)
assert result.status == "success"
dangerous_findings = [f for f in result.findings if f.category == "dangerous_function"]
assert len(dangerous_findings) > 0
assert any("os.system" in f.title for f in dangerous_findings)
async def test_detect_pickle_loads(self, security_analyzer, temp_workspace):
"""Test detection of pickle.loads() usage"""
code_file = temp_workspace / "serializer.py"
code_file.write_text("""
import pickle
def deserialize(data):
return pickle.loads(data)
""")
config = {
"file_extensions": [".py"],
"check_dangerous_functions": True
}
result = await security_analyzer.execute(config, temp_workspace)
assert result.status == "success"
dangerous_findings = [f for f in result.findings if f.category == "dangerous_function"]
assert len(dangerous_findings) > 0
async def test_detect_javascript_eval(self, security_analyzer, temp_workspace):
"""Test detection of eval() in JavaScript"""
code_file = temp_workspace / "app.js"
code_file.write_text("""
function processInput(userInput) {
return eval(userInput);
}
""")
config = {
"file_extensions": [".js"],
"check_dangerous_functions": True
}
result = await security_analyzer.execute(config, temp_workspace)
assert result.status == "success"
dangerous_findings = [f for f in result.findings if f.category == "dangerous_function"]
assert len(dangerous_findings) > 0
async def test_detect_innerHTML(self, security_analyzer, temp_workspace):
"""Test detection of innerHTML (XSS risk)"""
code_file = temp_workspace / "dom.js"
code_file.write_text("""
function updateContent(html) {
document.getElementById("content").innerHTML = html;
}
""")
config = {
"file_extensions": [".js"],
"check_dangerous_functions": True
}
result = await security_analyzer.execute(config, temp_workspace)
assert result.status == "success"
dangerous_findings = [f for f in result.findings if f.category == "dangerous_function"]
assert len(dangerous_findings) > 0
async def test_no_dangerous_detection_when_disabled(self, security_analyzer, temp_workspace):
"""Test that dangerous function detection can be disabled"""
code_file = temp_workspace / "code.py"
code_file.write_text('result = eval(user_input)')
config = {
"file_extensions": [".py"],
"check_dangerous_functions": False
}
result = await security_analyzer.execute(config, temp_workspace)
assert result.status == "success"
dangerous_findings = [f for f in result.findings if f.category == "dangerous_function"]
assert len(dangerous_findings) == 0
@pytest.mark.asyncio
class TestSecurityAnalyzerMultipleIssues:
"""Test detection of multiple issues in same file"""
async def test_detect_multiple_vulnerabilities(self, security_analyzer, temp_workspace):
"""Test detection of multiple vulnerability types"""
code_file = temp_workspace / "vulnerable.py"
code_file.write_text("""
import os
# Hardcoded credentials
api_key = "apikey_live_abcdefghijklmnopqrstuvwxyzabcdef"
password = "MySecureP@ssw0rd"
def process_query(user_input):
# SQL injection
query = "SELECT * FROM users WHERE name = " + user_input
# Dangerous function
result = eval(user_input)
# Command injection
os.system(user_input)
return result
""")
config = {
"file_extensions": [".py"],
"check_secrets": True,
"check_sql": True,
"check_dangerous_functions": True
}
result = await security_analyzer.execute(config, temp_workspace)
assert result.status == "success"
# Should find multiple types of issues
secret_findings = [f for f in result.findings if f.category == "hardcoded_secret"]
sql_findings = [f for f in result.findings if f.category == "sql_injection"]
dangerous_findings = [f for f in result.findings if f.category == "dangerous_function"]
assert len(secret_findings) > 0
assert len(sql_findings) > 0
assert len(dangerous_findings) > 0
@pytest.mark.asyncio
class TestSecurityAnalyzerSummary:
"""Test result summary generation"""
async def test_summary_structure(self, security_analyzer, temp_workspace):
"""Test that summary has correct structure"""
(temp_workspace / "test.py").write_text("print('hello')")
config = {"file_extensions": [".py"]}
result = await security_analyzer.execute(config, temp_workspace)
assert result.status == "success"
assert "files_analyzed" in result.summary
assert "total_findings" in result.summary
assert "extensions_scanned" in result.summary
assert isinstance(result.summary["files_analyzed"], int)
assert isinstance(result.summary["total_findings"], int)
assert isinstance(result.summary["extensions_scanned"], list)
async def test_empty_workspace(self, security_analyzer, temp_workspace):
"""Test analyzing empty workspace"""
config = {"file_extensions": [".py"]}
result = await security_analyzer.execute(config, temp_workspace)
assert result.status == "partial" # No files found
assert result.summary["files_analyzed"] == 0
async def test_analyze_multiple_file_types(self, security_analyzer, temp_workspace):
"""Test analyzing multiple file types"""
(temp_workspace / "app.py").write_text("print('hello')")
(temp_workspace / "script.js").write_text("console.log('hello')")
(temp_workspace / "index.php").write_text("<?php echo 'hello'; ?>")
config = {"file_extensions": [".py", ".js", ".php"]}
result = await security_analyzer.execute(config, temp_workspace)
assert result.status == "success"
assert result.summary["files_analyzed"] == 3
@pytest.mark.asyncio
class TestSecurityAnalyzerFalsePositives:
"""Test false positive filtering"""
async def test_skip_test_secrets(self, security_analyzer, temp_workspace):
"""Test that test/example secrets are filtered"""
code_file = temp_workspace / "test_config.py"
code_file.write_text("""
# Test configuration - should be filtered
api_key = "test_key_example"
password = "dummy_password_123"
token = "sample_token_placeholder"
""")
config = {
"file_extensions": [".py"],
"check_secrets": True
}
result = await security_analyzer.execute(config, temp_workspace)
assert result.status == "success"
# These should be filtered as false positives
secret_findings = [f for f in result.findings if f.category == "hardcoded_secret"]
# Should have fewer or no findings due to false positive filtering
assert len(secret_findings) == 0 or all(
not any(fp in f.description.lower() for fp in ['test', 'example', 'dummy', 'sample'])
for f in secret_findings
)

Some files were not shown because too many files have changed in this diff.