Compare commits


60 Commits

Author SHA1 Message Date
AFredefon
cd5bfc27ee fix: pipeline module fixes and improved AI agent guidance 2026-02-16 10:08:46 +01:00
AFredefon
8adc7a2e00 refactor: simplify module metadata schema for AI discoverability 2026-02-10 21:35:22 +01:00
tduhamel42
3b521dba42 fix: update license badge and footer from Apache 2.0 to BSL 1.1 2026-02-10 18:31:37 +01:00
AFredefon
66a10d1bc4 docs: add ROADMAP.md with planned features 2026-02-09 10:36:33 +01:00
AFredefon
48ad2a59af refactor(modules): rename metadata fields and use natural 2026-02-09 10:17:16 +01:00
AFredefon
8b8662d7af feat(modules): add harness-tester module for Rust fuzzing pipeline 2026-02-03 18:12:28 +01:00
AFredefon
f099bd018d chore(modules): remove redundant harness-validator module 2026-02-03 18:12:20 +01:00
tduhamel42
d786c6dab1 fix: block Podman on macOS and remove ghcr.io default (#39)
* fix: block Podman on macOS and remove ghcr.io default

- Add platform check in PodmanCLI.__init__() that raises FuzzForgeError
  on macOS with instructions to use Docker instead
- Change RegistrySettings.url default from "ghcr.io/fuzzinglabs" to ""
  (empty string) for local-only mode since no images are published yet
- Update _ensure_module_image() to show helpful error when image not
  found locally and no registry configured
- Update tests to mock Linux platform for Podman tests
- Add root ruff.toml to fix broken configuration in fuzzforge-runner

* rewrite guides for module architecture and update repo links

---------

Co-authored-by: AFredefon <antoinefredefon@yahoo.fr>
2026-02-03 10:15:16 +01:00
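The platform guard described in d786c6dab1 above can be sketched as follows. This is a hypothetical minimal sketch: `FuzzForgeError` is the error type named in the commit message, but the real `PodmanCLI.__init__()` takes FuzzForge-specific arguments; the `system` parameter here is injectable only to make the check testable.

```python
import platform

class FuzzForgeError(Exception):
    """Error type named in the commit message."""

class PodmanCLI:
    # Hypothetical sketch of the macOS guard described above; the real
    # constructor signature in FuzzForge differs.
    def __init__(self, system=None):
        system = system or platform.system()
        if system == "Darwin":
            raise FuzzForgeError(
                "Podman is not supported on macOS; use Docker instead."
            )
        self.system = system
```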
AFredefon
e72c5fb201 docs: fix USAGE.md setup instructions for new users 2026-01-30 13:55:48 +01:00
AFredefon
404c89a742 feat: make Docker the default container engine 2026-01-30 13:20:03 +01:00
AFredefon
aea50ac42a fix: use SNAP detection for podman storage, update tests for OSS 2026-01-30 11:59:40 +01:00
AFredefon
5d300e5366 fix: remove enterprise SDK references from OSS tests 2026-01-30 10:36:33 +01:00
AFredefon
1186f57a5c chore: remove old fuzzforge_ai files 2026-01-30 10:06:21 +01:00
AFredefon
9a97cc0f31 merge old fuzzforge_ai for cleanup 2026-01-30 10:02:49 +01:00
AFredefon
b46f050aef feat: FuzzForge AI - complete rewrite for OSS release 2026-01-30 09:57:48 +01:00
vhash
50ffad46a4 fix: broken links (#35)
move fuzzinglabs.io to fuzzinglabs.ai
2025-11-14 09:44:57 +01:00
Steve
83244ee537 Fix Discord link in README.md (#34) 2025-11-06 11:11:03 +01:00
Songbird99
e1b0b1b178 Support flexible A2A agent registration and fix redirects (#33)
- Accept direct .json URLs (e.g., http://host/.well-known/agent-card.json)
- Accept base agent URLs (e.g., http://host/a2a/sentinel)
- Extract canonical URL from agent card response
- Try both agent-card.json and agent.json for compatibility
- Follow HTTP redirects for POST requests (fixes 307 redirects)
- Remove trailing slash from POST endpoint to avoid redirect loops

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <noreply@anthropic.com>
2025-11-06 11:08:05 +01:00
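The flexible registration logic from e1b0b1b178 above can be sketched as a URL-normalization helper. A hypothetical sketch, not the actual implementation: direct `.json` URLs pass through unchanged, while base agent URLs get both well-known card names appended for compatibility.

```python
# Hypothetical helper mirroring the registration behaviour described above.
CARD_NAMES = ("agent-card.json", "agent.json")

def candidate_card_urls(url: str) -> list:
    """Return the agent-card URLs to try for a given registration URL."""
    if url.endswith(".json"):
        # Direct card URL: use as-is.
        return [url]
    # Base agent URL: try both card names under .well-known.
    base = url.rstrip("/")
    return [f"{base}/.well-known/{name}" for name in CARD_NAMES]
```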
tduhamel42
943bc9a114 Release v0.7.3 - Android workflows, LiteLLM integration, ARM64 support (#32)
* ci: add worker validation and Docker build checks

Add automated validation to prevent worker-related issues:

**Worker Validation Script:**
- New script: .github/scripts/validate-workers.sh
- Validates all workers in docker-compose.yml exist
- Checks required files: Dockerfile, requirements.txt, worker.py
- Verifies files are tracked by git (not gitignored)
- Detects gitignore issues that could hide workers

**CI Workflow Updates:**
- Added validate-workers job (runs on every PR)
- Added build-workers job (runs if workers/ modified)
- Uses Docker Buildx for caching
- Validates Docker images build successfully
- Updated test-summary to check validation results

**PR Template:**
- New pull request template with comprehensive checklist
- Specific section for worker-related changes
- Reminds contributors to validate worker files
- Includes documentation and changelog reminders

These checks would have caught the secrets worker gitignore issue.

Implements Phase 1 improvements from CI/CD quality assessment.

* fix: add dev branch to test workflow triggers

The test workflow was configured for 'develop' but the actual branch is named 'dev'.
This caused tests not to run on PRs to dev branch.

Now tests will run on:
- PRs to: main, master, dev, develop
- Pushes to: main, master, dev, develop, feature/**
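The branch lists above correspond to a trigger block along these lines (a hedged sketch; the actual workflow file name and surrounding contents may differ):

```yaml
# Hypothetical sketch of the test-workflow triggers described above.
on:
  pull_request:
    branches: [main, master, dev, develop]
  push:
    branches: [main, master, dev, develop, "feature/**"]
```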

* fix: properly detect worker file changes in CI

The previous condition used invalid GitHub context field.
Now uses git diff to properly detect changes to workers/ or docker-compose.yml.

Behavior:
- Job always runs the check step
- Detects if workers/ or docker-compose.yml modified
- Only builds Docker images if workers actually changed
- Shows clear skip message when no worker changes detected

* feat: Add Python SAST workflow with three security analysis tools

Implements Issue #5 - Python SAST workflow that combines:
- Dependency scanning (pip-audit) for CVE detection
- Security linting (Bandit) for vulnerability patterns
- Type checking (Mypy) for type safety issues

## Changes

**New Modules:**
- `DependencyScanner`: Scans Python dependencies for known CVEs using pip-audit
- `BanditAnalyzer`: Analyzes Python code for security issues using Bandit
- `MypyAnalyzer`: Checks Python code for type safety issues using Mypy

**New Workflow:**
- `python_sast`: Temporal workflow that orchestrates all three SAST tools
  - Runs tools in parallel for fast feedback (3-5 min vs hours for fuzzing)
  - Generates unified SARIF report with findings from all tools
  - Supports configurable severity/confidence thresholds
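The parallel fan-out described above can be sketched with plain `asyncio`. This is a simplified illustration under stated assumptions: the real `python_sast` workflow runs the three tools as Temporal activities, and the stub runner functions here stand in for the actual `DependencyScanner`, `BanditAnalyzer`, and `MypyAnalyzer` modules.

```python
import asyncio

# Stubs standing in for the three SAST tools named above.
async def run_pip_audit(path): return [{"tool": "pip-audit", "path": path}]
async def run_bandit(path): return [{"tool": "bandit", "path": path}]
async def run_mypy(path): return [{"tool": "mypy", "path": path}]

async def python_sast(path: str) -> list:
    """Run all three tools concurrently and merge their findings."""
    results = await asyncio.gather(
        run_pip_audit(path), run_bandit(path), run_mypy(path)
    )
    # Flatten per-tool finding lists into one unified report.
    return [finding for tool_findings in results for finding in tool_findings]
```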

**Updates:**
- Added SAST dependencies to Python worker (bandit, pip-audit, mypy)
- Updated module __init__.py files to export new analyzers
- Added type_errors.py test file to vulnerable_app for Mypy validation

## Testing

Workflow tested successfully on vulnerable_app:
- Bandit: Detected 9 security issues (command injection, unsafe functions)
- Mypy: Detected 5 type errors
- DependencyScanner: Ran successfully (no CVEs in test dependencies)
- SARIF export: Generated valid SARIF with 14 total findings

* fix: Remove unused imports to pass linter

* fix: resolve live monitoring bug, remove deprecated parameters, and auto-start Python worker

- Fix live monitoring style error by calling _live_monitor() helper directly
- Remove default_parameters duplication from 10 workflow metadata files
- Remove deprecated volume_mode parameter from 26 files across CLI, SDK, backend, and docs
- Configure Python worker to start automatically with docker compose up
- Clean up constants, validation, completion, and example files

Fixes #
- Live monitoring now works correctly with --live flag
- Workflow metadata follows JSON Schema standard
- Cleaner codebase without deprecated volume_mode
- Python worker (most commonly used) starts by default

* fix: resolve linter errors and optimize CI worker builds

- Remove unused Literal import from backend findings model
- Remove unnecessary f-string prefixes in CLI findings command
- Optimize GitHub Actions to build only modified workers
  - Detect specific worker changes (python, secrets, rust, android, ossfuzz)
  - Build only changed workers instead of all 5
  - Build all workers if docker-compose.yml changes
  - Significantly reduces CI build time

* feat: Add Android static analysis workflow with Jadx, OpenGrep, and MobSF

Comprehensive Android security testing workflow converted from Prefect to Temporal architecture:

Modules (3):
- JadxDecompiler: APK to Java source code decompilation
- OpenGrepAndroid: Static analysis with Android-specific security rules
- MobSFScanner: Comprehensive mobile security framework integration

Custom Rules (13):
- clipboard-sensitive-data, hardcoded-secrets, insecure-data-storage
- insecure-deeplink, insecure-logging, intent-redirection
- sensitive_data_sharedPreferences, sqlite-injection
- vulnerable-activity, vulnerable-content-provider, vulnerable-service
- webview-javascript-enabled, webview-load-arbitrary-url

Workflow:
- 6-phase Temporal workflow: download → Jadx → OpenGrep → MobSF → SARIF → upload
- 4 activities: decompile_with_jadx, scan_with_opengrep, scan_with_mobsf, generate_android_sarif
- SARIF output combining findings from all security tools

Docker Worker:
- ARM64 Mac compatibility via amd64 platform emulation
- Pre-installed: Android SDK, Jadx 1.4.7, OpenGrep 1.45.0, MobSF 3.9.7
- MobSF runs as background service with API key auto-generation
- Added aiohttp for async HTTP communication

Test APKs:
- BeetleBug.apk and shopnest.apk for workflow validation

* fix(android): correct activity names and MobSF API key generation

- Fix activity names in workflow.py (get_target, upload_results, cleanup_cache)
- Fix MobSF API key generation in Dockerfile startup script (cut delimiter)
- Update activity parameter signatures to match actual implementations
- Workflow now executes successfully with Jadx and OpenGrep

* feat: add platform-aware worker architecture with ARM64 support

Implement platform-specific Dockerfile selection and graceful tool degradation to support both x86_64 and ARM64 (Apple Silicon) platforms.

**Backend Changes:**
- Add system info API endpoint (/system/info) exposing host filesystem paths
- Add FUZZFORGE_HOST_ROOT environment variable to backend service
- Add graceful degradation in MobSF activity for ARM64 platforms

**CLI Changes:**
- Implement multi-strategy path resolution (backend API, .fuzzforge marker, env var)
- Add platform detection (linux/amd64 vs linux/arm64)
- Add worker metadata.yaml reading for platform capabilities
- Auto-select appropriate Dockerfile based on detected platform
- Pass platform-specific env vars to docker-compose

**Worker Changes:**
- Create workers/android/metadata.yaml defining platform capabilities
- Rename Dockerfile -> Dockerfile.amd64 (full toolchain with MobSF)
- Create Dockerfile.arm64 (excludes MobSF due to Rosetta 2 incompatibility)
- Update docker-compose.yml to use ${ANDROID_DOCKERFILE} variable

**Workflow Changes:**
- Handle MobSF "skipped" status gracefully in workflow
- Log clear warnings when tools are unavailable on platform

**Key Features:**
- Automatic platform detection and Dockerfile selection
- Graceful degradation when tools unavailable (MobSF on ARM64)
- Works from any directory (backend API provides paths)
- Manual override via environment variables
- Clear user feedback about platform and selected Dockerfile

**Benefits:**
- Android workflow now works on Apple Silicon Macs
- No code changes needed for other workflows
- Convention established for future platform-specific workers

Closes: MobSF Rosetta 2 incompatibility issue
Implements: Platform-aware worker architecture (Option B)
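The automatic Dockerfile selection described above reduces to a small platform check. A hypothetical sketch only: the real CLI also honours the `${ANDROID_DOCKERFILE}` override and reads worker `metadata.yaml` for platform capabilities; the `machine` parameter is injectable here for testing.

```python
import platform

# Hypothetical sketch of the platform-aware Dockerfile selection above.
def select_dockerfile(machine=None) -> str:
    machine = (machine or platform.machine()).lower()
    if machine in ("arm64", "aarch64"):
        # Apple Silicon / ARM64: toolchain without MobSF.
        return "Dockerfile.arm64"
    # Default: full toolchain including MobSF.
    return "Dockerfile.amd64"
```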

* fix: make MobSFScanner import conditional for ARM64 compatibility

- Add try-except block to conditionally import MobSFScanner in modules/android/__init__.py
- Allows Android worker to start on ARM64 without MobSF dependencies (aiohttp)
- MobSF activity gracefully skips on ARM64 with clear warning message
- Remove workflow path detection logic (not needed - workflows receive directories)

Platform-aware architecture fully functional on ARM64:
- CLI detects ARM64 and selects Dockerfile.arm64 automatically
- Worker builds and runs without MobSF on ARM64
- Jadx successfully decompiles APKs (4145 files from BeetleBug.apk)
- OpenGrep finds security vulnerabilities (8 issues found)
- MobSF gracefully skips with warning on ARM64
- Graceful degradation working as designed

Tested with:
  ff workflow run android_static_analysis test_projects/android_test/ \
    --wait --no-interactive apk_path=BeetleBug.apk decompile_apk=true

Results: 8 security findings (1 ERROR, 7 WARNINGS)
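The graceful-skip behaviour described above can be sketched as follows. This is an illustrative sketch under stated assumptions: the real activity decides availability from the conditional `MobSFScanner` import rather than an explicit flag, and its return shape may differ.

```python
# Hypothetical sketch of the MobSF graceful degradation described above.
def scan_with_mobsf(apk_path: str, mobsf_available: bool) -> dict:
    if not mobsf_available:
        # On ARM64 the import fails, so the activity skips with a warning
        # instead of crashing the workflow.
        return {
            "status": "skipped",
            "reason": "MobSF is unavailable on this platform (ARM64)",
        }
    return {"status": "completed", "apk": apk_path, "findings": []}
```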

* docs: update CHANGELOG with Android workflow and ARM64 support

Added [Unreleased] section documenting:
- Android Static Analysis Workflow (Jadx, OpenGrep, MobSF)
- Platform-Aware Worker Architecture with ARM64 support
- Python SAST Workflow
- CI/CD improvements and worker validation
- CLI enhancements
- Bug fixes and technical changes

Fixed date typo: 2025-01-16 → 2025-10-16

* fix: resolve linter errors in Android modules

- Remove unused imports from mobsf_scanner.py (asyncio, hashlib, json, Optional)
- Remove unused variables from opengrep_android.py (start_col, end_col)
- Remove duplicate Path import from workflow.py

* ci: support multi-platform Dockerfiles in worker validation

Updated worker validation script to accept both:
- Single Dockerfile pattern (existing workers)
- Multi-platform Dockerfile pattern (Dockerfile.amd64, Dockerfile.arm64, etc.)

This enables platform-aware worker architectures like the Android worker
which uses different Dockerfiles for x86_64 and ARM64 platforms.

* Feature/litellm proxy (#27)

* feat: seed governance config and responses routing

* Add env-configurable timeout for proxy providers

* Integrate LiteLLM OTEL collector and update docs

* Make .env.litellm optional for LiteLLM proxy

* Add LiteLLM proxy integration with model-agnostic virtual keys

Changes:
- Bootstrap generates 3 virtual keys with individual budgets (CLI: $100, Task-Agent: $25, Cognee: $50)
- Task-agent loads config at runtime via entrypoint script to wait for bootstrap completion
- All keys are model-agnostic by default (no LITELLM_DEFAULT_MODELS restrictions)
- Bootstrap handles database/env mismatch after docker prune by deleting stale aliases
- CLI and Cognee configured to use LiteLLM proxy with virtual keys
- Added comprehensive documentation in volumes/env/README.md

Technical details:
- task-agent entrypoint waits for keys in .env file before starting uvicorn
- Bootstrap creates/updates TASK_AGENT_API_KEY, COGNEE_API_KEY, and OPENAI_API_KEY
- Removed hardcoded API keys from docker-compose.yml
- All services route through http://localhost:10999 proxy

* Fix CLI not loading virtual keys from global .env

Project .env files with empty OPENAI_API_KEY values were overriding
the global virtual keys. Updated _load_env_file_if_exists to only
override with non-empty values.
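The fix described above boils down to one condition: an empty value in a project `.env` must not clobber a key already present. A hypothetical sketch of `_load_env_file_if_exists`'s merge logic, operating on pre-read lines rather than a file:

```python
# Hypothetical sketch of the non-empty-override rule described above.
def merge_env_lines(lines: list, env: dict) -> dict:
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        # Only override an existing key when the new value is non-empty.
        if value or key not in env:
            env[key] = value
    return env
```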

* Fix agent executor not passing API key to LiteLLM

The agent was initializing LiteLlm without api_key or api_base,
causing authentication errors when using the LiteLLM proxy. Now
reads from OPENAI_API_KEY/LLM_API_KEY and LLM_ENDPOINT environment
variables and passes them to LiteLlm constructor.

* Auto-populate project .env with virtual key from global config

When running 'ff init', the command now checks for a global
volumes/env/.env file and automatically uses the OPENAI_API_KEY
virtual key if found. This ensures projects work with LiteLLM
proxy out of the box without manual key configuration.

* docs: Update README with LiteLLM configuration instructions

Add note about LITELLM_GEMINI_API_KEY configuration and clarify that OPENAI_API_KEY default value should not be changed as it's used for the LLM proxy.

* Refactor workflow parameters to use JSON Schema defaults

Consolidates parameter defaults into JSON Schema format, removing the separate default_parameters field. Adds extract_defaults_from_json_schema() helper to extract defaults from the standard schema structure. Updates LiteLLM proxy config to use LITELLM_OPENAI_API_KEY environment variable.
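The helper named above can be sketched in a few lines. An assumption-laden sketch: the exact schema layout in FuzzForge's `metadata.yaml` may differ, but the idea is to pull `default` values out of a standard JSON Schema `properties` map instead of keeping a separate `default_parameters` field.

```python
# Hypothetical sketch of extract_defaults_from_json_schema() described above.
def extract_defaults_from_json_schema(schema: dict) -> dict:
    return {
        name: spec["default"]
        for name, spec in schema.get("properties", {}).items()
        if "default" in spec
    }
```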

* Remove .env.example from task_agent

* Fix MDX syntax error in llm-proxy.md

* fix: apply default parameters from metadata.yaml automatically

Fixed TemporalManager.run_workflow() to correctly apply default parameter
values from workflow metadata.yaml files when parameters are not provided
by the caller.

Previous behavior:
- When workflow_params was empty {}, the condition
  `if workflow_params and 'parameters' in metadata` would fail
- Parameters would not be extracted from schema, resulting in workflows
  receiving only target_id with no other parameters

New behavior:
- Removed the `workflow_params and` requirement from the condition
- Now explicitly checks for defaults in parameter spec
- Applies defaults from metadata.yaml automatically when param not provided
- Workflows receive all parameters with proper fallback:
  provided value > metadata default > None

This makes metadata.yaml the single source of truth for parameter defaults,
removing the need for workflows to implement defensive default handling.

Affected workflows:
- llm_secret_detection (was failing with KeyError)
- All other workflows now benefit from automatic default application

Co-authored-by: tduhamel42 <tduhamel@fuzzinglabs.com>

* fix: add default values to llm_analysis workflow parameters

Resolves validation error where agent_url was None when not explicitly provided. The TemporalManager applies defaults from metadata.yaml, not from module input schemas, so all parameters need defaults in the workflow metadata.

Changes:
- Add default agent_url, llm_model (gpt-5-mini), llm_provider (openai)
- Expand file_patterns to 45 comprehensive patterns covering code, configs, secrets, and Docker files
- Increase default limits: max_files (10), max_file_size (100KB), timeout (90s)

* refactor: replace .env.example with .env.template in documentation

- Remove volumes/env/.env.example file
- Update all documentation references to use .env.template instead
- Update bootstrap script error message
- Update .gitignore comment

* feat(cli): add worker management commands with improved progress feedback

Add comprehensive CLI commands for managing Temporal workers:
- ff worker list - List workers with status and uptime
- ff worker start <name> - Start specific worker with optional rebuild
- ff worker stop - Safely stop all workers without affecting core services

Improvements:
- Live progress display during worker startup with Rich Status spinner
- Real-time elapsed time counter and container state updates
- Health check status tracking (starting → unhealthy → healthy)
- Helpful contextual hints at 10s, 30s, 60s intervals
- Better timeout messages showing last known state

Worker management enhancements:
- Use 'docker compose' (space) instead of 'docker-compose' (hyphen)
- Stop workers individually with 'docker stop' to avoid stopping core services
- Platform detection and Dockerfile selection (ARM64/AMD64)

Documentation:
- Updated docker-setup.md with CLI commands as primary method
- Created comprehensive cli-reference.md with all commands and examples
- Added worker management best practices

* fix: MobSF scanner now properly parses files dict structure

MobSF returns 'files' as a dict (not list):
{"filename": "line_numbers"}

The parser was treating it as a list, causing zero findings
to be extracted. Now properly iterates over the dict and
creates one finding per affected file with correct line numbers
and metadata (CWE, OWASP, MASVS, CVSS).

Fixed in both code_analysis and behaviour sections.
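The corrected parsing described above can be sketched as follows. A hypothetical sketch of the parser's core: MobSF returns `files` as a `{filename: line_numbers}` dict, and one finding is emitted per affected file; the real finding shape carries more metadata fields than shown here.

```python
# Hypothetical sketch of the MobSF files-dict parsing fix described above.
def parse_mobsf_files(issue: dict) -> list:
    findings = []
    # 'files' is a dict, not a list: {"filename": "line_numbers"}.
    for filename, line_numbers in issue.get("files", {}).items():
        findings.append({
            "file": filename,
            "lines": line_numbers,
            "cwe": issue.get("cwe"),
            "owasp": issue.get("owasp"),
        })
    return findings
```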

* chore: bump version to 0.7.3

* docs: fix broken documentation links in cli-reference

* chore: add worker startup documentation and cleanup .gitignore

- Add workflow-to-worker mapping tables across documentation
- Update troubleshooting guide with worker requirements section
- Enhance getting started guide with worker examples
- Add quick reference to docker setup guide
- Add WEEK_SUMMARY*.md pattern to .gitignore

* docs: update CHANGELOG with missing versions and recent changes

- Add Unreleased section for post-v0.7.3 documentation updates
- Add v0.7.2 entry with bug fixes and worker improvements
- Document that v0.7.1 was re-tagged as v0.7.2
- Fix v0.6.0 date to "Undocumented" (no tag exists)
- Add version comparison links for easier navigation

* chore: bump all package versions to 0.7.3 for consistency

* Update GitHub link to fuzzforge_ai

---------

Co-authored-by: Songbird99 <150154823+Songbird99@users.noreply.github.com>
Co-authored-by: Songbird <Songbirdx99@gmail.com>
2025-11-06 11:07:50 +01:00
Ectario
f6cdb1ae2e fix(docs): fixing workflow docs (#29) 2025-10-27 12:37:04 +01:00
tduhamel42
731927667d fix/ Change default llm_secret_detection to gpt-5-mini 2025-10-22 10:17:41 +02:00
tduhamel42
75df59ddef fix: add missing secrets worker to repository
The secrets worker was being ignored due to broad gitignore pattern.
Added exception to allow workers/secrets/ directory while still ignoring actual secrets.

Files added:
- workers/secrets/Dockerfile
- workers/secrets/requirements.txt
- workers/secrets/worker.py
2025-10-22 08:39:20 +02:00
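The gitignore exception mentioned in 75df59ddef above would look roughly like this (a hedged sketch; the actual patterns in the repository's `.gitignore` may differ):

```gitignore
# Hypothetical sketch: a broad pattern hides anything secret-like...
*secret*
# ...while an exception re-includes the secrets worker directory.
!workers/secrets/
```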
tduhamel42
4e14b4207d Merge pull request #20 from FuzzingLabs/dev
Release: v0.7.1 - Worker fixes, monitor consolidation, and findings improvements
2025-10-21 16:59:44 +02:00
tduhamel42
4cf4a1e5e8 Merge pull request #19 from FuzzingLabs/fix/worker-naming-and-compose-version
fix: worker naming, monitor commands, and findings CLI improvements
2025-10-21 16:54:51 +02:00
tduhamel42
076ec71482 fix: worker naming, monitor commands, and findings CLI improvements
This PR addresses multiple issues and improvements across the CLI and backend:

**Worker Naming Fixes:**
- Fix worker container naming mismatch between CLI and docker-compose
- Update worker_manager.py to use docker compose commands with service names
- Remove worker_container field from workflows API, keep only worker_service
- Backend now correctly uses service names (worker-python, worker-secrets, etc.)

**Backend API Fixes:**
- Fix workflow name extraction from run_id in runs.py (was showing "unknown")
- Update monitor command suggestions from 'monitor stats' to 'monitor live'

**Monitor Command Consolidation:**
- Merge 'monitor stats' and 'monitor live' into single 'monitor live' command
- Add --once and --style flags for flexibility
- Remove all references to deprecated 'monitor stats' command

**Findings CLI Structure Improvements (Closes #18):**
- Move 'show' command from 'findings' (plural) to 'finding' (singular)
- Keep 'export' command in 'findings' (plural) as it exports all findings
- Remove broken 'analyze' command (imported non-existent function)
- Update all command suggestions to use correct paths
- Fix smart routing logic in main.py to handle new command structure
- Add export suggestions after viewing findings with unique timestamps
- Change default export format to SARIF (industry standard)

**Docker Compose:**
- Remove obsolete version field to fix deprecation warning

All commands tested and working:
- ff finding show <run-id> --rule <rule-id> ✓
- ff findings export <run-id> ✓
- ff finding <run-id> (direct viewing) ✓
- ff monitor live <run-id> ✓
2025-10-21 16:53:08 +02:00
tduhamel42
f200cb6fb7 docs: add worker startup instructions to quickstart and tutorial 2025-10-17 11:46:40 +02:00
tduhamel42
a72a0072df Merge pull request #17 from FuzzingLabs/docs/update-temporal-architecture
docs: Update documentation for v0.7.0 Temporal architecture
2025-10-17 11:02:02 +02:00
tduhamel42
c652340db6 docs: fix broken link in docker-setup 2025-10-17 10:57:48 +02:00
tduhamel42
187b171360 docs: Fix workflow references and module paths for v0.7.0
Updated all documentation to reflect actual v0.7.0 workflow implementation:

Workflow name changes:
- Removed all references to non-existent workflows (static_analysis_scan,
  secret_detection_scan, infrastructure_scan, penetration_testing_scan)
- Updated examples to use actual workflows (security_assessment, gitleaks_detection,
  trufflehog_detection, llm_secret_detection)
- Deleted docs/docs/reference/workflows/static-analysis.md (described non-existent workflow)

Content corrections:
- Fixed workflow tool descriptions (removed incorrect Semgrep/Bandit references,
  documented actual SecurityAnalyzer and FileScanner modules)
- Updated all workflow lists to show production-ready vs development status
- Fixed all example configurations to match actual workflow parameters

Module creation guide fixes:
- Fixed 4 path references: backend/src/toolbox → backend/toolbox
- Updated import statements in example code

Files updated:
- docs/index.md - workflow list, CLI example, broken tutorial links
- docs/docs/tutorial/getting-started.md - workflow list, example output, tool descriptions
- docs/docs/how-to/create-module.md - module paths and imports
- docs/docs/how-to/mcp-integration.md - workflow examples and list
- docs/docs/ai/prompts.md - workflow example
- docs/docs/reference/cli-ai.md - 3 workflow references
2025-10-17 10:48:48 +02:00
tduhamel42
f14bec9410 docs: Update architecture diagram to reflect Temporal/MinIO architecture
- Removed Docker Registry from execution layer diagram
- Updated diagram to show vertical workers with MinIO storage
- Removed obsolete COMPOSE_PROJECT_NAME from example configuration
2025-10-17 10:29:38 +02:00
tduhamel42
37c15af130 docs: Remove obsolete Docker registry configuration
Updated documentation to reflect v0.7.0 Temporal architecture which uses MinIO for storage instead of a Docker registry.

Major changes:
- getting-started.md: Added mandatory volumes/env/.env setup, removed registry config section, updated service list
- docker-setup.md: Complete rewrite focusing on system requirements and worker profiles instead of registry
- index.md: Replaced registry warning with environment file requirement
- troubleshooting.md: Removed all registry troubleshooting, added environment configuration issues
2025-10-17 10:28:17 +02:00
tduhamel42
e42f07fc63 docs: Apply global fixes for v0.7.0 Temporal architecture
- Replace docker-compose.temporal.yaml → docker-compose.yml
- Replace Temporal UI port :8233 → :8080
- Replace repository URL fuzzforge.git → fuzzforge_ai.git

Affected files:
- concept/docker-containers.md
- concept/resource-management.md
- concept/sarif-format.md
- how-to/create-workflow.md
- how-to/debugging.md
- how-to/troubleshooting.md
- tutorial/getting-started.md
2025-10-17 10:21:47 +02:00
tduhamel42
54738ca091 fix: Add benchmark results files to git
- Added exception in .gitignore for benchmark results directory
- Force-added comparison_report.md and comparison_results.json
- These files contain benchmark metrics, not actual secrets
- Fixes broken link in README to benchmark results
2025-10-17 10:02:39 +02:00
tduhamel42
fe58b39abf fix: Add benchmark results files to git
- Added exception in .gitignore for benchmark results directory
- Force-added comparison_report.md and comparison_results.json
- These files contain benchmark metrics, not actual secrets
- Fixes broken link in README to benchmark results
2025-10-17 09:56:09 +02:00
Patrick Ventuzelo
2edcc40cea Revise README for manual workflow and support info
Updated README to include manual workflow setup and support project section.
2025-10-16 22:31:22 +02:00
Patrick Ventuzelo
794d5abc3a Merge pull request #16 from FuzzingLabs/dev
Release v0.7.0
2025-10-16 22:22:18 +02:00
tduhamel42
73ba98afa8 docs: Add Secret Detection Benchmarks section with performance metrics
- Added dedicated section showcasing secret detection benchmark results
- Includes comparison table with recall rates and speeds
- Links to detailed benchmark analysis
- Highlights LLM detector's 84.4% recall on obfuscated secrets
2025-10-16 14:05:05 +02:00
tduhamel42
3f133374d5 docs: Add development status warning for fuzzing workflows
- Added note that fuzzing workflows are in early development
- Fixed Fuzzer Integration feature to list actual workflows only
- Clarified OSS-Fuzz integration is under heavy development
- Listed stable workflows for production use
2025-10-16 14:00:32 +02:00
tduhamel42
32b45f24cb ci: Disable automatic benchmark runs
Benchmarks are not ready for CI/CD yet. Disabled automatic triggers:
- Removed schedule (nightly) trigger
- Removed pull_request trigger

Kept workflow_dispatch for manual testing when benchmarks are ready.

This prevents benchmark failures from blocking PR merges and releases.
2025-10-16 13:50:10 +02:00
tduhamel42
11b3e6db6a fix: Resolve CI failures for v0.7.0 release
Fix lint errors:
- Remove unused Optional import from gitleaks workflow
- Remove unused logging import from trufflehog activities

Fix documentation broken links:
- Update workspace-isolation links to use /docs/ prefix in resource-management.md
- Update workspace-isolation links to use /docs/ prefix in create-workflow.md

Fix benchmark dependency:
- Add fuzzforge-sdk installation to benchmark workflow
- SDK is required for bench_comparison.py import

All CI checks should now pass.
2025-10-16 12:55:20 +02:00
tduhamel42
28ad4468de Merge branch 'master' into dev for v0.7.0 release
Resolved conflicts:
- Kept monitor.py (dev version - required for live monitoring)
- Kept workflow_exec.py (dev version - includes worker management, --live, --fail-on, --export-sarif)
- Kept main.py (dev version - includes new command structure)

All conflicts resolved in favor of dev branch features for 0.7.0 release.
2025-10-16 12:32:25 +02:00
tduhamel42
746699e7c0 chore: Bump version to 0.7.0
Version updates:
- README.md badge: 0.6.0 → 0.7.0
- cli/pyproject.toml: 0.6.0 → 0.7.0
- backend/pyproject.toml: 0.6.0 → 0.7.0
- sdk/pyproject.toml: 0.6.0 → 0.7.0
- ai/pyproject.toml: 0.6.0 → 0.7.0

Add CHANGELOG.md with comprehensive release notes for 0.7.0:
- Secret detection workflows (gitleaks, trufflehog, llm_secret_detection)
- AI module and agent integration
- Temporal migration completion
- CI/CD integration
- Documentation updates
- Bug fixes and improvements

Update llm_analysis default model to gpt-5-mini
2025-10-16 12:23:56 +02:00
tduhamel42
8063f03d87 docs: Update README and fix worker startup instructions
README updates:
- Update docker compose command (now main docker-compose.yml)
- Remove obsolete insecure registries section (MinIO replaces local registry)
- Add .env configuration section for AI agent API keys

Worker management fixes:
- Add worker_service field to API response (backend)
- Fix CLI help message to use correct service name with 'docker compose up -d'
- Use modern 'docker compose' syntax instead of deprecated 'docker-compose'

This ensures users get correct instructions when workers aren't running.
2025-10-16 12:12:49 +02:00
tduhamel42
6db40f6689 feat: Reactivate AI agent command
Restore the AI agent command functionality after maintenance period.
Users can now run 'fuzzforge ai agent' to launch the full AI agent CLI
with A2A orchestration.
2025-10-16 11:48:57 +02:00
tduhamel42
3be4d34531 test: Add secret detection benchmark dataset and ground truth
Add comprehensive benchmark dataset with 32 documented secrets for testing
secret detection workflows (gitleaks, trufflehog, llm_secret_detection).

- Add test_projects/secret_detection_benchmark/ with 19 test files
- Add ground truth JSON with precise line-by-line secret mappings
- Update .gitignore with exceptions for benchmark files (not real secrets)

Dataset breakdown:
- 12 Easy secrets (standard patterns)
- 10 Medium secrets (obfuscated)
- 10 Hard secrets (well hidden)
2025-10-16 11:46:28 +02:00
tduhamel42
87e3262832 docs: Remove obsolete volume_mode references from documentation
The volume_mode parameter is no longer used since workflows now upload files to MinIO storage instead of mounting volumes directly. This commit removes all references to volume_mode from:

- Backend API documentation (README.md)
- Tutorial getting started guide
- MCP integration guide
- CLI AI reference documentation
- SDK documentation and examples
- Test project documentation

All curl examples and code samples have been updated to reflect the current MinIO-based file upload approach.
2025-10-16 11:36:53 +02:00
tduhamel42
2da986ebb0 feat: Add secret detection workflows and comprehensive benchmarking (#15)
Add three production-ready secret detection workflows with full benchmarking infrastructure:

**New Workflows:**
- gitleaks_detection: Pattern-based secret scanning (13/32 benchmark secrets)
- trufflehog_detection: Entropy-based detection with verification (1/32 benchmark secrets)
- llm_secret_detection: AI-powered semantic analysis (32/32 benchmark secrets - 100% recall)

**Benchmarking Infrastructure:**
- Ground truth dataset with 32 documented secrets (12 Easy, 10 Medium, 10 Hard)
- Automated comparison tools for precision/recall testing
- SARIF output format for all workflows
- Performance metrics and tool comparison reports
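The precision/recall comparison described above can be pictured with a small script. This is an illustrative sketch only: the `(file, line)` location tuples and the `score` helper are assumptions, not the repository's actual ground-truth JSON schema or tooling.

```python
# Illustrative precision/recall scoring for a secret-detection benchmark.
# The (file, line) tuple format is an assumption, not the repo's actual schema.

def score(ground_truth: set[tuple[str, int]], detected: set[tuple[str, int]]) -> dict:
    """Compare detected secrets against ground truth by (file, line) location."""
    true_positives = len(ground_truth & detected)
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return {"tp": true_positives, "precision": precision, "recall": recall}

truth = {("config.py", 12), ("db.env", 3), ("notes.md", 7)}
found = {("config.py", 12), ("db.env", 3), ("README.md", 1)}
print(score(truth, found))  # 2 shared findings out of 3 on each side
```

A tool such as llm_secret_detection scoring 32/32 against the ground truth corresponds to recall 1.0 under this metric.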

**Fixes:**
- Set gitleaks default to no_git=True for uploaded directories
- Update documentation with correct secret counts and workflow names
- Temporarily deactivate AI agent command
- Clean up deprecated test files and GitGuardian workflow

**Testing:**
All workflows verified on secret_detection_benchmark and vulnerable_app test projects.
Workers healthy and system fully functional.
2025-10-16 11:21:24 +02:00
Songbird
c3ce03e216 fix: Add missing cognify_text method to CogneeProjectIntegration
Resolves AttributeError when agent_executor calls cognify_text().
The method adds text to a dataset and cognifies it into a knowledge graph.
2025-10-15 13:22:37 +02:00
tduhamel42
4d30b08476 feat: Add LLM analysis workflow and ruff linter fixes
LLM Analysis Workflow:
- Add llm_analyzer module for AI-powered code security analysis
- Add llm_analysis workflow with SARIF output support
- Mount AI module in Python worker for A2A wrapper access
- Add a2a-sdk dependency to Python worker requirements
- Fix workflow parameter ordering in Temporal manager

Ruff Linter Fixes:
- Fix bare except clauses (E722) across AI and CLI modules
- Add noqa comments for intentional late imports (E402)
- Replace undefined get_ai_status_async with TODO placeholder
- Remove unused imports and variables
- Remove container diagnostics display from exception handler

MCP Configuration:
- Reactivate FUZZFORGE_MCP_URL with default value
- Set default MCP URL to http://localhost:8010/mcp in init
2025-10-14 16:43:14 +02:00
tduhamel42
dabbcf3718 Merge feature/ai_module into dev
Add AI module with A2A wrapper and task agent
2025-10-14 15:03:15 +02:00
tduhamel42
40d48a8045 feat: Complete Temporal migration cleanup and fixes
- Remove obsolete docker_logs.py module and container diagnostics from SDK
- Fix security_assessment workflow metadata (vertical: rust -> python)
- Remove all Prefect references from documentation
- Add SDK exception handling test suite
- Clean up old test artifacts
2025-10-14 15:02:52 +02:00
Songbird
018ec40432 Update task_agent README to use task_agent instead of agent_with_adk_format 2025-10-14 14:33:36 +02:00
Songbird
4b2456670b Add volumes/env/.env to gitignore 2025-10-14 14:29:06 +02:00
Songbird
5da16f358b Fix a2a_wrapper imports and add clean usage example
- Remove top-level imports from fuzzforge_ai/__init__.py to avoid dependency issues
- Fix config_bridge.py exception handling (remove undefined exc variable)
- Add examples/test_a2a_simple.py demonstrating clean a2a_wrapper usage
- Update package to use explicit imports: from fuzzforge_ai.a2a_wrapper import send_agent_task

All functionality preserved, imports are now explicit and modular.
2025-10-14 14:27:25 +02:00
Songbird
baace0eac4 Add AI module with A2A wrapper and task agent
- Disable FuzzForge MCP connection (no Prefect backend)
- Add a2a_wrapper module for programmatic A2A agent tasks
- Add task_agent (LiteLLM A2A agent) on port 10900
- Create volumes/env/ for centralized Docker config
- Update docker-compose.yml with task-agent service
- Remove workflow_automation_skill from agent card
2025-10-14 13:05:35 +02:00
tduhamel42
60ca088ecf CI/CD Integration with Ephemeral Deployment Model (#14)
* feat: Complete migration from Prefect to Temporal

BREAKING CHANGE: Replaces Prefect workflow orchestration with Temporal

## Major Changes
- Replace Prefect with Temporal for workflow orchestration
- Implement vertical worker architecture (rust, android)
- Replace Docker registry with MinIO for unified storage
- Refactor activities to be co-located with workflows
- Update all API endpoints for Temporal compatibility

## Infrastructure
- New: docker-compose.temporal.yaml (Temporal + MinIO + workers)
- New: workers/ directory with rust and android vertical workers
- New: backend/src/temporal/ (manager, discovery)
- New: backend/src/storage/ (S3-cached storage with MinIO)
- New: backend/toolbox/common/ (shared storage activities)
- Deleted: docker-compose.yaml (old Prefect setup)
- Deleted: backend/src/core/prefect_manager.py
- Deleted: backend/src/services/prefect_stats_monitor.py
- Deleted: Docker registry and insecure-registries requirement

## Workflows
- Migrated: security_assessment workflow to Temporal
- New: rust_test workflow (example/test workflow)
- Deleted: secret_detection_scan (Prefect-based, to be reimplemented)
- Activities now co-located with workflows for independent testing

## API Changes
- Updated: backend/src/api/workflows.py (Temporal submission)
- Updated: backend/src/api/runs.py (Temporal status/results)
- Updated: backend/src/main.py (727 lines, TemporalManager integration)
- Updated: All 16 MCP tools to use TemporalManager

## Testing
- All services healthy (Temporal, PostgreSQL, MinIO, workers, backend)
- All API endpoints functional
- End-to-end workflow test passed (72 findings from vulnerable_app)
- MinIO storage integration working (target upload/download, results)
- Worker activity discovery working (6 activities registered)
- Tarball extraction working
- SARIF report generation working

## Documentation
- ARCHITECTURE.md: Complete Temporal architecture documentation
- QUICKSTART_TEMPORAL.md: Getting started guide
- MIGRATION_DECISION.md: Why we chose Temporal over Prefect
- IMPLEMENTATION_STATUS.md: Migration progress tracking
- workers/README.md: Worker development guide

## Dependencies
- Added: temporalio>=1.6.0
- Added: boto3>=1.34.0 (MinIO S3 client)
- Removed: prefect>=3.4.18

* feat: Add Python fuzzing vertical with Atheris integration

This commit implements a complete Python fuzzing workflow using Atheris:

## Python Worker (workers/python/)
- Dockerfile with Python 3.11, Atheris, and build tools
- Generic worker.py for dynamic workflow discovery
- requirements.txt with temporalio, boto3, atheris dependencies
- Added to docker-compose.temporal.yaml with dedicated cache volume

## AtherisFuzzer Module (backend/toolbox/modules/fuzzer/)
- Reusable module extending BaseModule
- Auto-discovers fuzz targets (fuzz_*.py, *_fuzz.py, fuzz_target.py)
- Recursive search to find targets in nested directories
- Dynamically loads TestOneInput() function
- Configurable max_iterations and timeout
- Real-time stats callback support for live monitoring
- Returns findings as ModuleFinding objects
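The auto-discovery behaviour might look like the following sketch. The three filename patterns come from the commit message; the `discover_fuzz_targets` helper name is hypothetical and the real AtherisFuzzer module may differ.

```python
from pathlib import Path

# Patterns named in the commit message: fuzz_*.py, *_fuzz.py, fuzz_target.py.
TARGET_PATTERNS = ("fuzz_*.py", "*_fuzz.py", "fuzz_target.py")

def discover_fuzz_targets(root: Path) -> list[Path]:
    """Recursively collect candidate Atheris fuzz targets under root."""
    seen: set[Path] = set()  # a set deduplicates files matching several patterns
    for pattern in TARGET_PATTERNS:
        seen.update(root.rglob(pattern))  # rglob searches nested directories
    return sorted(seen)
```

`fuzz_target.py` matches both the first and third pattern, which is why deduplication matters before handing targets to the fuzzer.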

## Atheris Fuzzing Workflow (backend/toolbox/workflows/atheris_fuzzing/)
- Temporal workflow for orchestrating fuzzing
- Downloads user code from MinIO
- Executes AtherisFuzzer module
- Uploads results to MinIO
- Cleans up cache after execution
- metadata.yaml with vertical: python for routing

## Test Project (test_projects/python_fuzz_waterfall/)
- Demonstrates stateful waterfall vulnerability
- main.py with check_secret() that leaks progress
- fuzz_target.py with Atheris TestOneInput() harness
- Complete README with usage instructions

## Backend Fixes
- Fixed parameter merging in REST API endpoints (workflows.py)
- Changed workflow parameter passing from positional args to kwargs (manager.py)
- Default parameters now properly merged with user parameters

## Testing
- Worker discovered AtherisFuzzingWorkflow
- Workflow executed end-to-end successfully
- Fuzz target auto-discovered in nested directories
- Atheris ran 100,000 iterations
- Results uploaded and cache cleaned

* chore: Complete Temporal migration with updated CLI/SDK/docs

This commit includes all remaining Temporal migration changes:

## CLI Updates (cli/)
- Updated workflow execution commands for Temporal
- Enhanced error handling and exceptions
- Updated dependencies in uv.lock

## SDK Updates (sdk/)
- Client methods updated for Temporal workflows
- Updated models for new workflow execution
- Updated dependencies in uv.lock

## Documentation Updates (docs/)
- Architecture documentation for Temporal
- Workflow concept documentation
- Resource management documentation (new)
- Debugging guide (new)
- Updated tutorials and how-to guides
- Troubleshooting updates

## README Updates
- Main README with Temporal instructions
- Backend README
- CLI README
- SDK README

## Other
- Updated IMPLEMENTATION_STATUS.md
- Removed old vulnerable_app.tar.gz

These changes complete the Temporal migration and ensure the
CLI/SDK work correctly with the new backend.

* fix: Use positional args instead of kwargs for Temporal workflows

The Temporal Python SDK's start_workflow() method doesn't accept
a 'kwargs' parameter. Workflows must receive parameters as positional
arguments via the 'args' parameter.

Now passes:
  args=workflow_args  # Positional arguments

This fixes the error:
  TypeError: Client.start_workflow() got an unexpected keyword argument 'kwargs'

Workflows now correctly receive parameters in order:
- security_assessment: [target_id, scanner_config, analyzer_config, reporter_config]
- atheris_fuzzing: [target_id, target_file, max_iterations, timeout_seconds]
- rust_test: [target_id, test_message]

* fix: Filter metadata-only parameters from workflow arguments

SecurityAssessmentWorkflow was receiving 7 arguments instead of 2-5.
The issue was that target_path and volume_mode from default_parameters
were being passed to the workflow, when they should only be used by
the system for configuration.

Now filters out metadata-only parameters (target_path, volume_mode)
before passing arguments to workflow execution.
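Taken together with the positional-argument fix above, the argument preparation can be sketched as follows. The metadata-only keys and the per-workflow orderings mirror the commit messages; the `build_workflow_args` function name is hypothetical, not the backend's actual API.

```python
# Sketch of the argument preparation described in the two fixes above.
METADATA_ONLY = {"target_path", "volume_mode"}  # consumed by the system, not workflows

ARG_ORDER = {
    "security_assessment": ["target_id", "scanner_config", "analyzer_config", "reporter_config"],
    "atheris_fuzzing": ["target_id", "target_file", "max_iterations", "timeout_seconds"],
    "rust_test": ["target_id", "test_message"],
}

def build_workflow_args(workflow: str, params: dict) -> list:
    """Drop metadata-only keys, then order the rest positionally for Temporal."""
    filtered = {k: v for k, v in params.items() if k not in METADATA_ONLY}
    return [filtered[name] for name in ARG_ORDER[workflow] if name in filtered]

print(build_workflow_args("rust_test", {"target_id": "t-1", "test_message": "hi", "volume_mode": "ro"}))
# → ['t-1', 'hi']
```

The resulting list is what would be handed to Temporal's `start_workflow(..., args=...)` so each workflow receives its parameters in the declared order.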

* refactor: Remove Prefect leftovers and volume mounting legacy

Complete cleanup of Prefect migration artifacts:

Backend:
- Delete registry.py and workflow_discovery.py (Prefect-specific files)
- Remove Docker validation from setup.py (no longer needed)
- Remove ResourceLimits and VolumeMount models
- Remove target_path and volume_mode from WorkflowSubmission
- Remove supported_volume_modes from API and discovery
- Clean up metadata.yaml files (remove volume/path fields)
- Simplify parameter filtering in manager.py

SDK:
- Remove volume_mode parameter from client methods
- Remove ResourceLimits and VolumeMount models
- Remove Prefect error patterns from docker_logs.py
- Clean up WorkflowSubmission and WorkflowMetadata models

CLI:
- Remove Volume Modes display from workflow info

All removed features are Prefect-specific or Docker volume mounting
artifacts. Temporal workflows use MinIO storage exclusively.

* feat: Add comprehensive test suite and benchmark infrastructure

- Add 68 unit tests for fuzzer, scanner, and analyzer modules
- Implement pytest-based test infrastructure with fixtures
- Add 6 performance benchmarks with category-specific thresholds
- Configure GitHub Actions for automated testing and benchmarking
- Add test and benchmark documentation

Test coverage:
- AtherisFuzzer: 8 tests
- CargoFuzzer: 14 tests
- FileScanner: 22 tests
- SecurityAnalyzer: 24 tests

All tests passing (68/68)
All benchmarks passing (6/6)

* fix: Resolve all ruff linting violations across codebase

Fixed 27 ruff violations in 12 files:
- Removed unused imports (Depends, Dict, Any, Optional, etc.)
- Fixed undefined workflow_info variable in workflows.py
- Removed dead code with undefined variables in atheris_fuzzer.py
- Changed f-string to regular string where no placeholders used

All files now pass ruff checks for CI/CD compliance.

* fix: Configure CI for unit tests only

- Renamed docker-compose.temporal.yaml → docker-compose.yml for CI compatibility
- Commented out integration-tests job (no integration tests yet)
- Updated test-summary to only depend on lint and unit-tests

CI will now run successfully with 68 unit tests. Integration tests can be added later.

* feat: Add CI/CD integration with ephemeral deployment model

Implements comprehensive CI/CD support for FuzzForge with on-demand worker management:

**Worker Management (v0.7.0)**
- Add WorkerManager for automatic worker lifecycle control
- Auto-start workers from stopped state when workflows execute
- Auto-stop workers after workflow completion
- Health checks and startup timeout handling (90s default)

**CI/CD Features**
- `--fail-on` flag: Fail builds based on SARIF severity levels (error/warning/note/info)
- `--export-sarif` flag: Export findings in SARIF 2.1.0 format
- `--auto-start`/`--auto-stop` flags: Control worker lifecycle
- Exit code propagation: Returns 1 on blocking findings, 0 on success
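The `--fail-on` gate can be pictured with a short sketch. The `level` field and its values follow the SARIF 2.1.0 format named above; the `gate_exit_code` helper and the rank table are illustrative assumptions, not FuzzForge's actual implementation.

```python
# Illustrative severity gate for a --fail-on style flag.
# SARIF 2.1.0 result levels, lowest to highest severity.
SEVERITY_RANK = {"note": 1, "warning": 2, "error": 3}

def gate_exit_code(sarif: dict, fail_on: str) -> int:
    """Return 1 if any finding is at or above the fail_on level, else 0."""
    threshold = SEVERITY_RANK[fail_on]
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # SARIF defaults a missing level to "warning"
            if SEVERITY_RANK.get(result.get("level", "warning"), 0) >= threshold:
                return 1
    return 0

report = {"runs": [{"results": [{"level": "warning"}, {"level": "note"}]}]}
print(gate_exit_code(report, "error"))    # 0: no error-level findings
print(gate_exit_code(report, "warning"))  # 1: a warning meets the threshold
```

Returning the exit code directly is what lets a CI job fail the build when blocking findings are present.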

**Exit Code Fix**
- Add `except typer.Exit: raise` handlers at 3 critical locations
- Move worker cleanup to finally block for guaranteed execution
- Exit codes now propagate correctly even when build fails

**CI Scripts & Examples**
- ci-start.sh: Start FuzzForge services with health checks
- ci-stop.sh: Clean shutdown with volume preservation option
- GitHub Actions workflow example (security-scan.yml)
- GitLab CI pipeline example (.gitlab-ci.example.yml)
- docker-compose.ci.yml: CI-optimized compose file with profiles

**OSS-Fuzz Integration**
- New ossfuzz_campaign workflow for running OSS-Fuzz projects
- OSS-Fuzz worker with Docker-in-Docker support
- Configurable campaign duration and project selection

**Documentation**
- Comprehensive CI/CD integration guide (docs/how-to/cicd-integration.md)
- Updated architecture docs with worker lifecycle details
- Updated workspace isolation documentation
- CLI README with worker management examples

**SDK Enhancements**
- Add get_workflow_worker_info() endpoint
- Worker vertical metadata in workflow responses

**Testing**
- All workflows tested: security_assessment, atheris_fuzzing, secret_detection, cargo_fuzzing
- All monitoring commands tested: stats, crashes, status, finding
- Full CI pipeline simulation verified
- Exit codes verified for success/failure scenarios

Ephemeral CI/CD model: ~3-4GB RAM, ~60-90s startup, runs entirely in CI containers.

* fix: Resolve ruff linting violations in CI/CD code

- Remove unused variables (run_id, defaults, result)
- Remove unused imports
- Fix f-string without placeholders

All CI/CD integration files now pass ruff checks.
2025-10-14 10:13:45 +02:00
abel
4ad44332ee docs: updated discord invite link 2025-10-06 11:59:28 +02:00
tduhamel42
09821c1c43 Merge pull request #12 from FuzzingLabs/ci/create-base-python-ci
ci: created base python ci
2025-10-03 11:22:48 +02:00
tduhamel42
6f24c88907 Merge pull request #13 from FuzzingLabs/fix/config-command-routing
fix: register config as command group instead of custom function
2025-10-03 11:17:33 +02:00
abel
92b338f9ed ci: created base python ci 2025-10-02 17:17:52 +02:00
424 changed files with 16037 additions and 62024 deletions


@@ -1,48 +0,0 @@
---
name: 🐛 Bug Report
about: Create a report to help us improve FuzzForge
title: "[BUG] "
labels: bug
assignees: ''
---
## Description
A clear and concise description of the bug you encountered.
## Environment
Please provide details about your environment:
- **OS**: (e.g., macOS 14.0, Ubuntu 22.04, Windows 11)
- **Python version**: (e.g., 3.9.7)
- **Docker version**: (e.g., 24.0.6)
- **FuzzForge version**: (e.g., 0.6.0)
## Steps to Reproduce
Clear steps to recreate the issue:
1. Go to '...'
2. Run command '...'
3. Click on '...'
4. See error
## Expected Behavior
A clear and concise description of what should happen.
## Actual Behavior
A clear and concise description of what actually happens.
## Logs
Please include relevant error messages and stack traces:
```
Paste logs here
```
## Screenshots
If applicable, add screenshots to help explain your problem.
## Additional Context
Add any other context about the problem here (workflow used, specific target, configuration, etc.).
---
💬 **Need help?** Join our [Discord Community](https://discord.com/invite/acqv9FVG) for real-time support.


@@ -1,8 +0,0 @@
blank_issues_enabled: false
contact_links:
- name: 💬 Community Discord
url: https://discord.com/invite/acqv9FVG
about: Join our Discord to discuss ideas, workflows, and security research with the community.
- name: 📖 Documentation
url: https://github.com/FuzzingLabs/fuzzforge_ai/tree/main/docs
about: Check our documentation for guides, tutorials, and API reference.


@@ -1,38 +0,0 @@
---
name: ✨ Feature Request
about: Suggest an idea for FuzzForge
title: "[FEATURE] "
labels: enhancement
assignees: ''
---
## Use Case
Why is this feature needed? Describe the problem you're trying to solve or the improvement you'd like to see.
## Proposed Solution
How should it work? Describe your ideal solution in detail.
## Alternatives
What other approaches have you considered? List any alternative solutions or features you've thought about.
## Implementation
**(Optional)** Do you have any technical considerations or implementation ideas?
## Category
What area of FuzzForge would this feature enhance?
- [ ] 🤖 AI Agents for Security
- [ ] 🛠 Workflow Automation
- [ ] 📈 Vulnerability Research
- [ ] 🔗 Fuzzer Integration
- [ ] 🌐 Community Marketplace
- [ ] 🔒 Enterprise Features
- [ ] 📚 Documentation
- [ ] 🎯 Other
## Additional Context
Add any other context, screenshots, references, or examples about the feature request here.
---
💬 **Want to discuss this idea?** Join our [Discord Community](https://discord.com/invite/acqv9FVG) to collaborate with other contributors!


@@ -1,67 +0,0 @@
---
name: 🔄 Workflow Submission
about: Contribute a security workflow or module to the FuzzForge community
title: "[WORKFLOW] "
labels: workflow, community
assignees: ''
---
## Workflow Name
Provide a short, descriptive name for your workflow.
## Description
Explain what this workflow does and what security problems it solves.
## Category
What type of security workflow is this?
- [ ] 🛡️ **Security Assessment** - Static analysis, vulnerability scanning
- [ ] 🔍 **Secret Detection** - Credential and secret scanning
- [ ] 🎯 **Fuzzing** - Dynamic testing and fuzz testing
- [ ] 🔄 **Reverse Engineering** - Binary analysis and decompilation
- [ ] 🌐 **Infrastructure Security** - Container, cloud, network security
- [ ] 🔒 **Penetration Testing** - Offensive security testing
- [ ] 📋 **Other** - Please describe
## Files
Please attach or provide links to your workflow files:
- [ ] `workflow.py` - Main Prefect flow implementation
- [ ] `Dockerfile` - Container definition
- [ ] `metadata.yaml` - Workflow metadata
- [ ] Test files or examples
- [ ] Documentation
## Testing
How did you test this workflow? Please describe:
- **Test targets used**: (e.g., vulnerable_app, custom test cases)
- **Expected outputs**: (e.g., SARIF format, specific vulnerabilities detected)
- **Validation results**: (e.g., X vulnerabilities found, Y false positives)
## SARIF Compliance
- [ ] My workflow outputs results in SARIF format
- [ ] Results include severity levels and descriptions
- [ ] Code flow information is provided where applicable
## Security Guidelines
- [ ] This workflow focuses on **defensive security** purposes only
- [ ] I have not included any malicious tools or capabilities
- [ ] All secrets/credentials are parameterized (no hardcoded values)
- [ ] I have followed responsible disclosure practices
## Registry Integration
Have you updated the workflow registry?
- [ ] Added import statement to `backend/toolbox/workflows/registry.py`
- [ ] Added registry entry with proper metadata
- [ ] Tested workflow registration and deployment
## Additional Notes
Anything else the maintainers should know about this workflow?
---
🚀 **Thank you for contributing to FuzzForge!** Your workflow will help the security community automate and scale their testing efforts.
💬 **Questions?** Join our [Discord Community](https://discord.com/invite/acqv9FVG) to discuss your contribution!


@@ -1,57 +0,0 @@
name: Deploy Docusaurus to GitHub Pages
on:
workflow_dispatch:
push:
branches:
- master
paths:
- "docs/**"
jobs:
build:
name: Build Docusaurus
runs-on: ubuntu-latest
defaults:
run:
working-directory: ./docs
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- uses: actions/setup-node@v4
with:
node-version: 24
cache: npm
cache-dependency-path: "**/package-lock.json"
- name: Install dependencies
run: npm ci
- name: Build website
run: npm run build
- name: Upload Build Artifact
uses: actions/upload-pages-artifact@v3
with:
path: ./docs/build
deploy:
name: Deploy to GitHub Pages
needs: build
# Grant GITHUB_TOKEN the permissions required to make a Pages deployment
permissions:
pages: write # to deploy to Pages
id-token: write # to verify the deployment originates from an appropriate source
# Deploy to the github-pages environment
environment:
name: github-pages
url: ${{ steps.deployment.outputs.page_url }}
runs-on: ubuntu-latest
steps:
- name: Deploy to GitHub Pages
id: deployment
uses: actions/deploy-pages@v4


@@ -1,33 +0,0 @@
name: Docusaurus test deployment
on:
workflow_dispatch:
push:
paths:
- "docs/**"
pull_request:
paths:
- "docs/**"
jobs:
test-deploy:
name: Test deployment
runs-on: ubuntu-latest
defaults:
run:
working-directory: ./docs
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- uses: actions/setup-node@v4
with:
node-version: 24
cache: npm
cache-dependency-path: "**/package-lock.json"
- name: Install dependencies
run: npm ci
- name: Test build website
run: npm run build

.gitignore

@@ -1,291 +1,12 @@
# ========================================
# FuzzForge Platform .gitignore
# ========================================
# -------------------- Python --------------------
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
# C extensions
*.so
# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
# PyInstaller
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
# Environments
*.egg-info
*.whl
.env
.mypy_cache
.pytest_cache
.ruff_cache
.venv
env/
venv/
ENV/
env.bak/
venv.bak/
.python-version
.vscode
__pycache__
# UV package manager
uv.lock
# But allow uv.lock in CLI and SDK for reproducible builds
!cli/uv.lock
!sdk/uv.lock
!backend/uv.lock
# MyPy
.mypy_cache/
.dmypy.json
dmypy.json
# Pyre type checker
.pyre/
# pytype static type analyzer
.pytype/
# Cython debug symbols
cython_debug/
# -------------------- IDE / Editor --------------------
# VSCode
.vscode/
*.code-workspace
# PyCharm
.idea/
# Vim
*.swp
*.swo
*~
# Emacs
*~
\#*\#
/.emacs.desktop
/.emacs.desktop.lock
*.elc
auto-save-list
tramp
.\#*
# Sublime Text
*.sublime-project
*.sublime-workspace
# -------------------- Operating System --------------------
# macOS
.DS_Store
.AppleDouble
.LSOverride
Icon
._*
.DocumentRevisions-V100
.fseventsd
.Spotlight-V100
.TemporaryItems
.Trashes
.VolumeIcon.icns
.com.apple.timemachine.donotpresent
.AppleDB
.AppleDesktop
Network Trash Folder
Temporary Items
.apdisk
# Windows
Thumbs.db
Thumbs.db:encryptable
ehthumbs.db
ehthumbs_vista.db
*.stackdump
[Dd]esktop.ini
$RECYCLE.BIN/
*.cab
*.msi
*.msix
*.msm
*.msp
*.lnk
# Linux
*~
.fuse_hidden*
.directory
.Trash-*
.nfs*
# -------------------- Docker --------------------
# Docker volumes and data
docker-volumes/
.dockerignore.bak
# Docker Compose override files
docker-compose.override.yml
docker-compose.override.yaml
# -------------------- Database --------------------
# SQLite
*.sqlite
*.sqlite3
*.db
*.db-journal
*.db-shm
*.db-wal
# PostgreSQL
*.sql.backup
# -------------------- Logs --------------------
# General logs
*.log
logs/
*.log.*
# -------------------- FuzzForge Specific --------------------
# FuzzForge project directories (user projects should manage their own .gitignore)
.fuzzforge/
# Test project databases and configurations
test_projects/*/.fuzzforge/
test_projects/*/findings.db*
test_projects/*/config.yaml
test_projects/*/.gitignore
# Local development configurations
local_config.yaml
dev_config.yaml
.env.local
.env.development
# Generated reports and outputs
reports/
output/
findings/
*.sarif.json
*.html.report
security_report.*
# Temporary files
tmp/
temp/
*.tmp
*.temp
# Backup files
*.bak
*.backup
*~
# -------------------- Node.js (for any JS tooling) --------------------
node_modules/
npm-debug.log*
yarn-debug.log*
yarn-error.log*
.npm
# -------------------- Security --------------------
# Never commit these files
*.pem
*.key
*.p12
*.pfx
secret*
secrets/
credentials*
api_keys*
.env.production
.env.staging
# AWS credentials
.aws/
# -------------------- Build Artifacts --------------------
# Python builds
build/
dist/
*.wheel
# Documentation builds
docs/_build/
site/
# -------------------- Miscellaneous --------------------
# Jupyter Notebook checkpoints
.ipynb_checkpoints
# IPython history
.ipython/
# Rope project settings
.ropeproject
# spyderproject
.spyderproject
.spyproject
# mkdocs documentation
/site
# Local Netlify folder
.netlify
# -------------------- Project Specific Overrides --------------------
# Allow specific test project files that should be tracked
!test_projects/*/src/
!test_projects/*/scripts/
!test_projects/*/config/
!test_projects/*/data/
!test_projects/*/README.md
!test_projects/*/*.py
!test_projects/*/*.js
!test_projects/*/*.php
!test_projects/*/*.java
# But exclude their sensitive content
test_projects/*/.env
test_projects/*/private_key.pem
test_projects/*/wallet.json
test_projects/*/.npmrc
test_projects/*/.git-credentials
test_projects/*/credentials.*
test_projects/*/api_keys.*
# Podman/Docker container storage artifacts
~/.fuzzforge/

.python-version

@@ -0,0 +1 @@
3.14.2


@@ -1,17 +1,21 @@
# Contributing to FuzzForge 🤝
# Contributing to FuzzForge OSS
Thank you for your interest in contributing to FuzzForge! We welcome contributions from the community and are excited to collaborate with you.
Thank you for your interest in contributing to FuzzForge OSS! We welcome contributions from the community and are excited to collaborate with you.
## 🌟 Ways to Contribute
**Our Vision**: FuzzForge aims to be a **universal platform for security research** across all cybersecurity domains. Through our modular architecture, any security tool—from fuzzing engines to cloud scanners, from mobile app analyzers to IoT security tools—can be integrated as a containerized module and controlled via AI agents.
- 🐛 **Bug Reports** - Help us identify and fix issues
- 💡 **Feature Requests** - Suggest new capabilities and improvements
- 🔧 **Code Contributions** - Submit bug fixes, features, and enhancements
- 📚 **Documentation** - Improve guides, tutorials, and API documentation
- 🧪 **Testing** - Help test new features and report issues
- 🛡️ **Security Workflows** - Contribute new security analysis workflows
## Ways to Contribute
## 📋 Contribution Guidelines
- **Security Modules** - Create modules for any cybersecurity domain (AppSec, NetSec, Cloud, IoT, etc.)
- **Bug Reports** - Help us identify and fix issues
- **Feature Requests** - Suggest new capabilities and improvements
- **Core Features** - Contribute to the MCP server, runner, or CLI
- **Documentation** - Improve guides, tutorials, and module documentation
- **Testing** - Help test new features and report issues
- **AI Integration** - Improve MCP tools and AI agent interactions
- **Tool Integrations** - Wrap existing security tools as FuzzForge modules
## Contribution Guidelines
### Code Style
@@ -44,9 +48,10 @@ We use conventional commits for clear history:
**Examples:**
```
feat(workflows): add new static analysis workflow for Go
fix(api): resolve authentication timeout issue
docs(readme): update installation instructions
feat(modules): add cloud security scanner module
fix(mcp): resolve module listing timeout
docs(sdk): update module development guide
test(runner): add container execution tests
```
### Pull Request Process
@@ -65,9 +70,14 @@ docs(readme): update installation instructions
3. **Test Your Changes**
```bash
# Test workflows
cd test_projects/vulnerable_app/
ff workflow security_assessment .
# Test modules
FUZZFORGE_MODULES_PATH=./fuzzforge-modules uv run fuzzforge modules list
# Run a module
uv run fuzzforge modules run your-module --assets ./test-assets
# Test MCP integration (if applicable)
uv run fuzzforge mcp status
```
4. **Submit Pull Request**
@@ -76,64 +86,353 @@ docs(readme): update installation instructions
- Link related issues using `Fixes #123` or `Closes #123`
- Ensure all CI checks pass
## 🛡️ Security Workflow Development
## Module Development
### Creating New Workflows
FuzzForge uses a modular architecture where security tools run as isolated containers. The `fuzzforge-modules-sdk` provides everything you need to create new modules.
1. **Workflow Structure**
```
backend/toolbox/workflows/your_workflow/
├── __init__.py
├── workflow.py # Main Prefect flow
├── metadata.yaml # Workflow metadata
└── Dockerfile # Container definition
**Documentation:**
- [Module SDK Documentation](fuzzforge-modules/fuzzforge-modules-sdk/README.md) - Complete SDK reference
- [Module Template](fuzzforge-modules/fuzzforge-module-template/) - Starting point for new modules
- [USAGE Guide](USAGE.md) - Setup and installation instructions
### Creating a New Module
1. **Use the Module Template**
```bash
# Generate a new module from template
cd fuzzforge-modules/
cp -r fuzzforge-module-template my-new-module
cd my-new-module
```
2. **Register Your Workflow**
Add your workflow to `backend/toolbox/workflows/registry.py`:
2. **Module Structure**
```
my-new-module/
├── Dockerfile # Container definition
├── Makefile # Build commands
├── README.md # Module documentation
├── pyproject.toml # Python dependencies
├── mypy.ini # Type checking config
├── ruff.toml # Linting config
└── src/
└── module/
├── __init__.py
├── __main__.py # Entry point
├── mod.py # Main module logic
├── models.py # Pydantic models
└── settings.py # Configuration
```
3. **Implement Your Module**
Edit `src/module/mod.py`:
```python
# Import your workflow
from .your_workflow.workflow import main_flow as your_workflow_flow
# Add to registry
WORKFLOW_REGISTRY["your_workflow"] = {
"flow": your_workflow_flow,
"module_path": "toolbox.workflows.your_workflow.workflow",
"function_name": "main_flow",
"description": "Description of your workflow",
"version": "1.0.0",
"author": "Your Name",
"tags": ["tag1", "tag2"]
}
from fuzzforge_modules_sdk.api.modules import BaseModule
from fuzzforge_modules_sdk.api.models import ModuleResult
from .models import MyModuleConfig, MyModuleOutput
class MyModule(BaseModule[MyModuleConfig, MyModuleOutput]):
"""Your module description."""
def execute(self) -> ModuleResult[MyModuleOutput]:
"""Main execution logic."""
# Access input assets
assets = self.input_path
# Your security tool logic here
results = self.run_analysis(assets)
# Return structured results
return ModuleResult(
success=True,
output=MyModuleOutput(
findings=results,
summary="Analysis complete"
)
)
```
3. **Testing Workflows**
- Create test cases in `test_projects/vulnerable_app/`
- Ensure SARIF output format compliance
- Test with various input scenarios
4. **Define Configuration Models**
Edit `src/module/models.py`:
```python
from pydantic import BaseModel, Field
from fuzzforge_modules_sdk.api.models import BaseModuleConfig, BaseModuleOutput
class MyModuleConfig(BaseModuleConfig):
"""Configuration for your module."""
timeout: int = Field(default=300, description="Timeout in seconds")
max_iterations: int = Field(default=1000, description="Max iterations")
class MyModuleOutput(BaseModuleOutput):
"""Output from your module."""
findings: list[dict] = Field(default_factory=list)
coverage: float = Field(default=0.0)
```
5. **Build Your Module**
```bash
# Build the SDK first (if not already done)
cd ../fuzzforge-modules-sdk
uv build
mkdir -p .wheels
cp ../../dist/fuzzforge_modules_sdk-*.whl .wheels/
cd ../..
docker build -t localhost/fuzzforge-modules-sdk:0.1.0 fuzzforge-modules/fuzzforge-modules-sdk/
# Build your module
cd fuzzforge-modules/my-new-module
docker build -t fuzzforge-my-new-module:0.1.0 .
```
6. **Test Your Module**
```bash
# Run with test assets
uv run fuzzforge modules run my-new-module --assets ./test-assets
# Check module info
uv run fuzzforge modules info my-new-module
```
### Module Development Guidelines
**Important Conventions:**
- **Input/Output**: Use `/fuzzforge/input` for assets and `/fuzzforge/output` for results
- **Configuration**: Support JSON configuration via stdin or file
- **Logging**: Use structured logging (structlog is pre-configured)
- **Error Handling**: Return proper exit codes and error messages
- **Security**: Run as non-root user when possible
- **Documentation**: Include clear README with usage examples
- **Dependencies**: Minimize container size, use multi-stage builds
**See also:**
- [Module SDK API Reference](fuzzforge-modules/fuzzforge-modules-sdk/src/fuzzforge_modules_sdk/api/)
- [Dockerfile Best Practices](https://docs.docker.com/develop/develop-images/dockerfile_best-practices/)
### Module Types
FuzzForge is designed to support modules across **all cybersecurity domains**. The modular architecture allows any security tool to be containerized and integrated. Here are the main categories:
**Application Security**
- Fuzzing engines (coverage-guided, grammar-based, mutation-based)
- Static analysis (SAST, code quality, dependency scanning)
- Dynamic analysis (DAST, runtime analysis, instrumentation)
- Test validation and coverage analysis
- Crash analysis and exploit detection
**Network & Infrastructure Security**
- Network scanning and service enumeration
- Protocol analysis and fuzzing
- Firewall and configuration testing
- Cloud security (AWS/Azure/GCP misconfiguration detection, IAM analysis)
- Container security (image scanning, Kubernetes security)
**Web & API Security**
- Web vulnerability scanners (XSS, SQL injection, CSRF)
- Authentication and session testing
- API security (REST/GraphQL/gRPC testing, fuzzing)
- SSL/TLS analysis
**Binary & Reverse Engineering**
- Binary analysis and disassembly
- Malware sandboxing and behavior analysis
- Exploit development tools
- Firmware extraction and analysis
**Mobile & IoT Security**
- Mobile app analysis (Android/iOS static/dynamic analysis)
- IoT device security and firmware analysis
- SCADA/ICS and industrial protocol testing
- Automotive security (CAN bus, ECU testing)
**Data & Compliance**
- Database security testing
- Encryption and cryptography analysis
- Secrets and credential detection
- Privacy tools (PII detection, GDPR compliance)
- Compliance checkers (PCI-DSS, HIPAA, SOC2, ISO27001)
**Threat Intelligence & Risk**
- OSINT and reconnaissance tools
- Threat hunting and IOC correlation
- Risk assessment and attack surface mapping
- Security audit and policy validation
**Emerging Technologies**
- AI/ML security (model poisoning, adversarial testing)
- Blockchain and smart contract analysis
- Quantum-safe cryptography testing
**Custom & Integration**
- Domain-specific security tools
- Bridges to existing security tools
- Multi-tool orchestration and result aggregation
### Example: Simple Security Scanner Module
```python
# src/module/mod.py
from pathlib import Path

from fuzzforge_modules_sdk.api.modules import BaseModule
from fuzzforge_modules_sdk.api.models import ModuleResult

from .models import ScannerConfig, ScannerOutput

class SecurityScanner(BaseModule[ScannerConfig, ScannerOutput]):
    """Scans for common security issues in code."""

    def execute(self) -> ModuleResult[ScannerOutput]:
        findings = []
        # Collect source files once so we can both scan and count them
        files = [p for p in self.input_path.rglob("*") if p.is_file()]
        for file_path in files:
            findings.extend(self.scan_file(file_path))
        return ModuleResult(
            success=True,
            output=ScannerOutput(
                findings=findings,
                files_scanned=len(files)
            )
        )

    def scan_file(self, path: Path) -> list[dict]:
        """Scan a single file for security issues."""
        # Your scanning logic here
        return []
```
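As one illustration, `scan_file` could flag hardcoded secrets with a few regexes. The patterns below are simplistic examples for the sketch, not production rules:

```python
import re
from pathlib import Path

# Simplistic example patterns; real scanners use far more robust rule sets.
PATTERNS = {
    "hardcoded-password": re.compile(r"password\s*=\s*['\"][^'\"]+['\"]", re.IGNORECASE),
    "aws-access-key": re.compile(r"AKIA[0-9A-Z]{16}"),
}

def scan_file(path: Path) -> list[dict]:
    """Scan a single text file and report one finding per matching line."""
    findings = []
    try:
        text = path.read_text(errors="ignore")
    except OSError:
        return findings  # unreadable or missing files are skipped
    for lineno, line in enumerate(text.splitlines(), start=1):
        for rule_id, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append({"rule": rule_id, "file": str(path), "line": lineno})
    return findings
```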
### Testing Modules
Create tests in `tests/`:
```python
from pathlib import Path

from module.mod import MyModule
from module.models import MyModuleConfig

def test_module_execution():
    config = MyModuleConfig(timeout=60)
    module = MyModule(config=config, input_path=Path("test_assets"))
    result = module.execute()

    assert result.success
    assert isinstance(result.output.findings, list)
```
Run tests:
```bash
uv run pytest
```
### Security Guidelines
**Critical Requirements:**
- Never commit secrets, API keys, or credentials
- Focus on **defensive security** tools and analysis
- Do not create tools for malicious purposes
- Test modules thoroughly before submission
- Follow responsible disclosure for security issues
- Use minimal, secure base images for containers
- Avoid running containers as root when possible
**Security Resources:**
- [OWASP Container Security](https://cheatsheetseries.owasp.org/cheatsheets/Docker_Security_Cheat_Sheet.html)
- [CIS Docker Benchmarks](https://www.cisecurity.org/benchmark/docker)
## Contributing to Core Features
Beyond modules, you can contribute to FuzzForge's core components.
**Useful Resources:**
- [Project Structure](README.md) - Overview of the codebase
- [USAGE Guide](USAGE.md) - Installation and setup
- Python best practices: [PEP 8](https://pep8.org/)
### Core Components
- **fuzzforge-mcp** - MCP server for AI agent integration
- **fuzzforge-runner** - Module execution engine
- **fuzzforge-cli** - Command-line interface
- **fuzzforge-common** - Shared utilities and sandbox engines
- **fuzzforge-types** - Type definitions and schemas
### Development Setup
1. **Clone and Install**
```bash
git clone https://github.com/FuzzingLabs/fuzzforge-oss.git
cd fuzzforge-oss
uv sync --all-extras
```
2. **Run Tests**
```bash
# Run all tests
make test
# Run specific package tests
cd fuzzforge-mcp
uv run pytest
```
3. **Type Checking**
```bash
# Type check all packages
make typecheck
# Type check specific package
cd fuzzforge-runner
uv run mypy .
```
4. **Linting and Formatting**
```bash
# Format code
make format
# Lint code
make lint
```
## Bug Reports
When reporting bugs, please include:
- **Environment**: OS, Python version, Docker version, uv version
- **FuzzForge Version**: Output of `uv run fuzzforge --version`
- **Module**: Which module or component is affected
- **Steps to Reproduce**: Clear steps to recreate the issue
- **Expected Behavior**: What should happen
- **Actual Behavior**: What actually happens
- **Logs**: Relevant error messages and stack traces
- **Container Logs**: For module issues, include Docker/Podman logs
- **Screenshots**: If applicable
Use our [Bug Report Template](.github/ISSUE_TEMPLATE/bug_report.md).
**Example:**
```markdown
**Environment:**
- OS: Ubuntu 22.04
- Python: 3.14.2
- Docker: 24.0.7
- uv: 0.5.13
## 💡 Feature Requests
**Module:** my-custom-scanner
**Steps to Reproduce:**
1. Run `uv run fuzzforge modules run my-scanner --assets ./test-target`
2. Module fails with timeout error
**Expected:** Module completes analysis
**Actual:** Times out after 30 seconds
**Logs:**
\`\`\`
ERROR: Module execution timeout
...
\`\`\`
```
## Feature Requests
For new features, please provide:
- **Proposed Solution**: How should it work?
- **Alternatives**: Other approaches considered
- **Implementation**: Technical considerations (optional)
- **Module vs Core**: Should this be a module or core feature?
Use our [Feature Request Template](.github/ISSUE_TEMPLATE/feature_request.md).
**Example Feature Requests:**
- New module for cloud security posture management (CSPM)
- Module for analyzing smart contract vulnerabilities
- MCP tool for orchestrating multi-module workflows
- CLI command for batch module execution across multiple targets
- Support for distributed fuzzing campaigns
- Integration with CI/CD pipelines
- Module marketplace/registry features
## Documentation
Help improve our documentation:
- **Module Documentation**: Document your modules in their README.md
- **API Documentation**: Update docstrings and type hints
- **User Guides**: Improve USAGE.md and tutorial content
- **Workflow Documentation**: Document new security workflows
- **Module SDK Guides**: Help document the SDK for module developers
- **MCP Integration**: Document AI agent integration patterns
- **Examples**: Add practical usage examples and workflows
### Documentation Standards
- Use clear, concise language
- Include code examples
- Add command-line examples with expected output
- Document all configuration options
- Explain error messages and troubleshooting
### Module README Template
```markdown
# Module Name
Brief description of what this module does.
## Features
- Feature 1
- Feature 2
## Configuration
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| timeout | int | 300 | Timeout in seconds |
## Usage
\`\`\`bash
uv run fuzzforge modules run module-name --assets ./path/to/assets
\`\`\`
## Output
Describes the output structure and format.
## Examples
Practical usage examples.
```
## Recognition
Contributors will be:
- Listed in our [Contributors](CONTRIBUTORS.md) file
- Mentioned in release notes for significant contributions
- Eligible for FuzzingLabs Academy courses and swag
- Credited in module documentation (for module authors)
- Invited to join our [Discord community](https://discord.gg/8XEX33UUwZ)
## Module Submission Checklist
Before submitting a new module:
- [ ] Module follows SDK structure and conventions
- [ ] Dockerfile builds successfully
- [ ] Module executes without errors
- [ ] Configuration options are documented
- [ ] README.md is complete with examples
- [ ] Tests are included (pytest)
- [ ] Type hints are used throughout
- [ ] Linting passes (ruff)
- [ ] Security best practices followed
- [ ] No secrets or credentials in code
- [ ] License headers included
## Review Process
1. **Initial Review** - Maintainers review for completeness
2. **Technical Review** - Code quality and security assessment
3. **Testing** - Module tested in isolated environment
4. **Documentation Review** - Ensure docs are clear and complete
5. **Approval** - Module merged and included in next release
## License
By contributing to FuzzForge OSS, you agree that your contributions will be licensed under the same license as the project (see [LICENSE](LICENSE)).
For module contributions:
- Modules you create remain under the project license
- You retain credit as the module author
- Your module may be used by others under the project license terms
## Getting Help
Need help contributing?
- Join our [Discord](https://discord.gg/8XEX33UUwZ)
- Read the [Module SDK Documentation](fuzzforge-modules/fuzzforge-modules-sdk/README.md)
- Check the module template for examples
- Contact: contact@fuzzinglabs.com
---
**Thank you for making FuzzForge better!**
Every contribution, no matter how small, helps build a stronger security research platform. Whether you're creating a module for web security, cloud scanning, mobile analysis, or any other cybersecurity domain, your work makes FuzzForge more powerful and versatile for the entire security community!

Makefile (new file)
.PHONY: help install sync format lint typecheck test build-modules clean
SHELL := /bin/bash
# Default target
help:
@echo "FuzzForge OSS Development Commands"
@echo ""
@echo " make install - Install all dependencies"
@echo " make sync - Sync shared packages from upstream"
@echo " make format - Format code with ruff"
@echo " make lint - Lint code with ruff"
@echo " make typecheck - Type check with mypy"
@echo " make test - Run all tests"
@echo " make build-modules - Build all module container images"
@echo " make clean - Clean build artifacts"
@echo ""
# Install all dependencies
install:
uv sync
# Sync shared packages from upstream fuzzforge-core
sync:
@if [ -z "$(UPSTREAM)" ]; then \
echo "Usage: make sync UPSTREAM=/path/to/fuzzforge-core"; \
exit 1; \
fi
./scripts/sync-upstream.sh $(UPSTREAM)
# Format all packages
format:
@for pkg in packages/fuzzforge-*/; do \
if [ -f "$$pkg/pyproject.toml" ]; then \
echo "Formatting $$pkg..."; \
cd "$$pkg" && uv run ruff format . && cd -; \
fi; \
done
# Lint all packages
lint:
@for pkg in packages/fuzzforge-*/; do \
if [ -f "$$pkg/pyproject.toml" ]; then \
echo "Linting $$pkg..."; \
cd "$$pkg" && uv run ruff check . && cd -; \
fi; \
done
# Type check all packages
typecheck:
@for pkg in packages/fuzzforge-*/; do \
if [ -f "$$pkg/pyproject.toml" ] && [ -f "$$pkg/mypy.ini" ]; then \
echo "Type checking $$pkg..."; \
cd "$$pkg" && uv run mypy . && cd -; \
fi; \
done
# Run all tests
test:
@for pkg in packages/fuzzforge-*/; do \
if [ -f "$$pkg/pytest.ini" ]; then \
echo "Testing $$pkg..."; \
cd "$$pkg" && uv run pytest && cd -; \
fi; \
done
# Build all module container images
# Uses Docker by default, or Podman if FUZZFORGE_ENGINE=podman
build-modules:
@echo "Building FuzzForge module images..."
@if [ "$$FUZZFORGE_ENGINE" = "podman" ]; then \
if [ -n "$$SNAP" ]; then \
echo "Using Podman with isolated storage (Snap detected)"; \
CONTAINER_CMD="podman --root ~/.fuzzforge/containers/storage --runroot ~/.fuzzforge/containers/run"; \
else \
echo "Using Podman"; \
CONTAINER_CMD="podman"; \
fi; \
else \
echo "Using Docker"; \
CONTAINER_CMD="docker"; \
fi; \
for module in fuzzforge-modules/*/; do \
if [ -f "$$module/Dockerfile" ] && \
[ "$$module" != "fuzzforge-modules/fuzzforge-modules-sdk/" ] && \
[ "$$module" != "fuzzforge-modules/fuzzforge-module-template/" ]; then \
name=$$(basename $$module); \
version=$$(grep 'version' "$$module/pyproject.toml" 2>/dev/null | head -1 | sed 's/.*"\(.*\)".*/\1/' || echo "0.1.0"); \
echo "Building $$name:$$version..."; \
$$CONTAINER_CMD build -t "fuzzforge-$$name:$$version" "$$module" || exit 1; \
fi; \
done
@echo ""
@echo "✓ All modules built successfully!"
# Clean build artifacts
clean:
find . -type d -name "__pycache__" -exec rm -rf {} + 2>/dev/null || true
find . -type d -name ".pytest_cache" -exec rm -rf {} + 2>/dev/null || true
find . -type d -name ".mypy_cache" -exec rm -rf {} + 2>/dev/null || true
find . -type d -name ".ruff_cache" -exec rm -rf {} + 2>/dev/null || true
find . -type d -name "*.egg-info" -exec rm -rf {} + 2>/dev/null || true
find . -type f -name "*.pyc" -delete 2>/dev/null || true

README.md
<h1 align="center"> FuzzForge OSS</h1>
<h3 align="center">AI-Powered Security Research Orchestration via MCP</h3>
<p align="center">
<img src="docs/static/img/fuzzforge_banner_github.png" alt="FuzzForge Banner" width="100%">
</p>
<p align="center">
<a href="https://discord.gg/8XEX33UUwZ"><img src="https://img.shields.io/discord/1420767905255133267?logo=discord&label=Discord" alt="Discord"></a>
<a href="LICENSE"><img src="https://img.shields.io/badge/license-BSL%201.1-blue" alt="License: BSL 1.1"></a>
<a href="https://www.python.org/downloads/"><img src="https://img.shields.io/badge/python-3.12%2B-blue" alt="Python 3.12+"/></a>
<a href="https://modelcontextprotocol.io"><img src="https://img.shields.io/badge/MCP-compatible-green" alt="MCP Compatible"/></a>
<a href="https://fuzzforge.ai"><img src="https://img.shields.io/badge/Website-fuzzforge.ai-purple" alt="Website"/></a>
</p>
<p align="center">
<strong>Let AI agents orchestrate your security research workflows locally</strong>
</p>
<p align="center">
<sub>
<a href="#-overview"><b>Overview</b></a>
<a href="#-features"><b>Features</b></a>
<a href="#-installation"><b>Installation</b></a>
<a href="USAGE.md"><b>Usage Guide</b></a>
<a href="#-modules"><b>Modules</b></a>
<a href="#-contributing"><b>Contributing</b></a>
</sub>
</p>
---
> 🚧 **FuzzForge OSS is under active development.** Expect breaking changes and new features!
---
## 🚀 Overview
**FuzzForge OSS** is an open-source runtime that enables AI agents (GitHub Copilot, Claude, etc.) to orchestrate security research workflows through the **Model Context Protocol (MCP)**.
### The Core: Modules
At the heart of FuzzForge are **modules** - containerized security tools that AI agents can discover, configure, and orchestrate. Each module encapsulates a specific security capability (static analysis, fuzzing, crash analysis, etc.) and runs in an isolated container.
- **🔌 Plug & Play**: Modules are self-contained - just pull and run
- **🤖 AI-Native**: Designed for AI agent orchestration via MCP
- **🔗 Composable**: Chain modules together into automated workflows
- **📦 Extensible**: Build custom modules with the Python SDK
The OSS runtime handles module discovery, execution, and result collection. Security modules (developed separately) provide the actual security tooling - from static analyzers to fuzzers to crash triagers.
Instead of manually running security tools, describe what you want and let your AI assistant handle it.
### 🎬 Use Case: Rust Fuzzing Pipeline
> **Scenario**: Fuzz a Rust crate to discover vulnerabilities using AI-assisted harness generation and parallel fuzzing.
<table align="center">
<tr>
<th>1️⃣ Analyze, Generate & Validate Harnesses</th>
<th>2️⃣ Run Parallel Continuous Fuzzing</th>
</tr>
<tr>
<td><img src="assets/demopart2.gif" alt="FuzzForge Demo - Analysis Pipeline" width="100%"></td>
<td><img src="assets/demopart1.gif" alt="FuzzForge Demo - Parallel Fuzzing" width="100%"></td>
</tr>
<tr>
<td align="center"><sub>AI agent analyzes code, generates harnesses, and validates they compile</sub></td>
<td align="center"><sub>Multiple fuzzing sessions run in parallel with live metrics</sub></td>
</tr>
</table>
---
## ⭐ Support the Project
If you find FuzzForge useful, please star the repo to support development 🚀
---
## ✨ Features
| Feature | Description |
|---------|-------------|
| 🤖 **AI-Native** | Built for MCP - works with GitHub Copilot, Claude, and any MCP-compatible agent |
| 📦 **Containerized** | Each module runs in isolation via Docker or Podman |
| 🔄 **Continuous Mode** | Long-running tasks (fuzzing) with real-time metrics streaming |
| 🔗 **Workflows** | Chain multiple modules together in automated pipelines |
| 🛠️ **Extensible** | Create custom modules with the Python SDK |
| 🏠 **Local First** | All execution happens on your machine - no cloud required |
| 🔒 **Secure** | Sandboxed containers with no network access by default |
---
## 🏗️ Architecture
```
┌─────────────────────────────────────────────────────────────────┐
│ AI Agent (Copilot/Claude) │
└───────────────────────────┬─────────────────────────────────────┘
│ MCP Protocol (stdio)
┌─────────────────────────────────────────────────────────────────┐
│ FuzzForge MCP Server │
│ ┌─────────────┐ ┌──────────────┐ ┌────────────────────────┐ │
│ │list_modules │ │execute_module│ │start_continuous_module │ │
│ └─────────────┘ └──────────────┘ └────────────────────────┘ │
└───────────────────────────┬─────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ FuzzForge Runner │
│ Container Engine (Docker/Podman) │
└───────────────────────────┬─────────────────────────────────────┘
┌───────────────────┼───────────────────┐
▼ ▼ ▼
┌───────────────┐ ┌───────────────┐ ┌───────────────┐
│ Module A │ │ Module B │ │ Module C │
│ (Container) │ │ (Container) │ │ (Container) │
└───────────────┘ └───────────────┘ └───────────────┘
```
---
## 📦 Installation
### Prerequisites
- **Python 3.12+**
- **[uv](https://docs.astral.sh/uv/)** package manager
- **Docker** ([Install Docker](https://docs.docker.com/get-docker/)) or Podman
### Quick Install
```bash
# Clone the repository
git clone https://github.com/FuzzingLabs/fuzzforge_ai.git
cd fuzzforge_ai
# Install dependencies
uv sync
# Build module images
make build-modules
```
### Configure MCP for Your AI Agent
```bash
# For GitHub Copilot
uv run fuzzforge mcp install copilot
# For Claude Code (CLI)
uv run fuzzforge mcp install claude-code
# For Claude Desktop (standalone app)
uv run fuzzforge mcp install claude-desktop
# Verify installation
uv run fuzzforge mcp status
```
**Restart your editor** and your AI agent will have access to FuzzForge tools!
---
## 📦 Modules
FuzzForge modules are containerized security tools that AI agents can orchestrate. The module ecosystem is designed around a simple principle: **the OSS runtime orchestrates, enterprise modules execute**.
### Module Ecosystem
| | FuzzForge OSS | FuzzForge Enterprise Modules |
|---|---|---|
| **What** | Runtime & MCP server | Security research modules |
| **License** | BSL 1.1 | BSL 1.1 (Business Source License) |
| **Compatibility** | ✅ Runs any compatible module | ✅ Works with OSS runtime |
**Enterprise modules** are developed separately and provide production-ready security tooling:
| Category | Modules | Description |
|----------|---------|-------------|
| 🔍 **Static Analysis** | Rust Analyzer, Solidity Analyzer, Cairo Analyzer | Code analysis and fuzzable function detection |
| 🎯 **Fuzzing** | Cargo Fuzzer, Honggfuzz, AFL++ | Coverage-guided fuzz testing |
| 💥 **Crash Analysis** | Crash Triager, Root Cause Analyzer | Automated crash deduplication and analysis |
| 🔐 **Vulnerability Detection** | Pattern Matcher, Taint Analyzer | Security vulnerability scanning |
| 📝 **Reporting** | Report Generator, SARIF Exporter | Automated security report generation |
> 💡 **Build your own modules!** The FuzzForge SDK allows you to create custom modules that integrate seamlessly with the OSS runtime. See [Creating Custom Modules](#-creating-custom-modules).
### Execution Modes
Modules run in two execution modes:
#### One-shot Execution
Run a module once and get results:
```python
result = execute_module("my-analyzer", assets_path="/path/to/project")
```
#### Continuous Execution
For long-running tasks like fuzzing, with real-time metrics:
```python
# Start continuous execution
session = start_continuous_module("my-fuzzer",
assets_path="/path/to/project",
configuration={"target": "my_target"})
# Check status with live metrics
status = get_continuous_status(session["session_id"])
# Stop and collect results
stop_continuous_module(session["session_id"])
```
---
## 🛠️ Creating Custom Modules
Build your own security modules with the FuzzForge SDK:
```python
from fuzzforge_modules_sdk import FuzzForgeModule, FuzzForgeModuleResults

class MySecurityModule(FuzzForgeModule):
    def _run(self, resources):
        self.emit_event("started", target=resources[0].path)
        # Your analysis logic here
        results = self.analyze(resources)
        self.emit_progress(100, status="completed",
                           message="Analysis complete")
        return FuzzForgeModuleResults.SUCCESS
```
📖 See the [Module SDK Guide](fuzzforge-modules/fuzzforge-modules-sdk/README.md) for details.
---
## 📁 Project Structure
```
fuzzforge_ai/
├── fuzzforge-cli/ # Command-line interface
├── fuzzforge-common/ # Shared abstractions (containers, storage)
├── fuzzforge-mcp/ # MCP server for AI agents
├── fuzzforge-modules/ # Security modules
│ └── fuzzforge-modules-sdk/ # Module development SDK
├── fuzzforge-runner/ # Local execution engine
├── fuzzforge-types/ # Type definitions & schemas
└── demo/ # Demo projects for testing
```
---
## 🤝 Contributing
We welcome contributions from the community!
- 🐛 Report bugs via [GitHub Issues](../../issues)
- 💡 Suggest features or improvements
- 🔧 Submit pull requests
- 📦 Share your custom modules
See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.
---
## 📄 License
BSL 1.1 - See [LICENSE](LICENSE) for details.
---
<p align="center">
<strong>Maintained by <a href="https://fuzzinglabs.com">FuzzingLabs</a></strong>
<br>
</p>

ROADMAP.md (new file)
# FuzzForge OSS Roadmap
This document outlines the planned features and development direction for FuzzForge OSS.
---
## 🎯 Upcoming Features
### 1. MCP Security Hub Integration
**Status:** 🔄 Planned
Integrate [mcp-security-hub](https://github.com/FuzzingLabs/mcp-security-hub) tools into FuzzForge, giving AI agents access to 28 MCP servers and 163+ security tools through a unified interface.
#### How It Works
Unlike native FuzzForge modules (built with the SDK), mcp-security-hub tools are **standalone MCP servers**. The integration will bridge these tools so they can be:
- Discovered via `list_modules` alongside native modules
- Executed through FuzzForge's orchestration layer
- Chained with native modules in workflows
| Aspect | Native Modules | MCP Hub Tools |
|--------|----------------|---------------|
| **Runtime** | FuzzForge SDK container | Standalone MCP server container |
| **Protocol** | Direct execution | MCP-to-MCP bridge |
| **Configuration** | Module config | Tool-specific args |
| **Output** | FuzzForge results format | Tool-native format (normalized) |
#### Goals
- Unified discovery of all available tools (native + hub)
- Orchestrate hub tools through FuzzForge's workflow engine
- Normalize outputs for consistent result handling
- No modification required to mcp-security-hub tools
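To illustrate the "normalize outputs" goal, a bridge might map tool-native findings into a common shape along these lines. This is purely a sketch: `normalize_finding`, the field names, and the nuclei-style input record are hypothetical, not a committed schema:

```python
def normalize_finding(tool: str, raw: dict) -> dict:
    """Map a tool-native finding into a hypothetical common schema."""
    # Each bridged tool would register its own field mapping; this handles
    # a nuclei-style record as one example input shape.
    return {
        "source": tool,
        "severity": raw.get("severity", raw.get("info", {}).get("severity", "unknown")),
        "title": raw.get("name", raw.get("info", {}).get("name", "")),
        "target": raw.get("host", raw.get("matched-at", "")),
    }

raw = {"info": {"name": "Exposed .git directory", "severity": "medium"},
       "matched-at": "https://example.com/.git/"}
print(normalize_finding("nuclei", raw))
```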
#### Planned Tool Categories
| Category | Tools | Example Use Cases |
|----------|-------|-------------------|
| **Reconnaissance** | nmap, masscan, whatweb, shodan | Network scanning, service discovery |
| **Web Security** | nuclei, sqlmap, ffuf, nikto | Vulnerability scanning, fuzzing |
| **Binary Analysis** | radare2, binwalk, yara, capa, ghidra | Reverse engineering, malware analysis |
| **Cloud Security** | trivy, prowler | Container scanning, cloud auditing |
| **Secrets Detection** | gitleaks | Credential scanning |
| **OSINT** | maigret, dnstwist | Username tracking, typosquatting |
| **Threat Intel** | virustotal, otx | Malware analysis, IOC lookup |
#### Example Workflow
```
You: "Scan example.com for vulnerabilities and analyze any suspicious binaries"
AI Agent:
1. Uses nmap module for port discovery
2. Uses nuclei module for vulnerability scanning
3. Uses binwalk module to extract firmware
4. Uses yara module for malware detection
5. Generates consolidated report
```
---
### 2. User Interface
**Status:** 🔄 Planned
A graphical interface to manage FuzzForge without the command line.
#### Goals
- Provide an alternative to CLI for users who prefer visual tools
- Make configuration and monitoring more accessible
- Complement (not replace) the CLI experience
#### Planned Capabilities
| Capability | Description |
|------------|-------------|
| **Configuration** | Change MCP server settings, engine options, paths |
| **Module Management** | Browse, configure, and launch modules |
| **Execution Monitoring** | View running tasks, logs, progress, metrics |
| **Project Overview** | Manage projects and browse execution results |
| **Workflow Management** | Create and run multi-module workflows |
---
## 📋 Backlog
Features under consideration for future releases:
| Feature | Description |
|---------|-------------|
| **Module Marketplace** | Browse and install community modules |
| **Scheduled Executions** | Run modules on a schedule (cron-style) |
| **Team Collaboration** | Share projects, results, and workflows |
| **Reporting Engine** | Generate PDF/HTML security reports |
| **Notifications** | Slack, Discord, email alerts for findings |
---
## ✅ Completed
| Feature | Version | Date |
|---------|---------|------|
| Docker as default engine | 0.1.0 | Jan 2026 |
| MCP server for AI agents | 0.1.0 | Jan 2026 |
| CLI for project management | 0.1.0 | Jan 2026 |
| Continuous execution mode | 0.1.0 | Jan 2026 |
| Workflow orchestration | 0.1.0 | Jan 2026 |
---
## 💬 Feedback
Have suggestions for the roadmap?
- Open an issue on [GitHub](https://github.com/FuzzingLabs/fuzzforge_ai/issues)
- Join our [Discord](https://discord.gg/8XEX33UUwZ)
---
<p align="center">
<strong>Built with ❤️ by <a href="https://fuzzinglabs.com">FuzzingLabs</a></strong>
</p>

USAGE.md Normal file

@@ -0,0 +1,453 @@
# FuzzForge OSS Usage Guide
This guide covers everything you need to get started with FuzzForge OSS, from installation to running your first security research workflow with AI.
> **FuzzForge is designed to be used with AI agents** (GitHub Copilot, Claude, etc.) via MCP.
> The CLI is available for advanced users but the primary experience is through natural language interaction with your AI assistant.
---
## Table of Contents
- [Quick Start](#quick-start)
- [Prerequisites](#prerequisites)
- [Installation](#installation)
- [Building Modules](#building-modules)
- [MCP Server Configuration](#mcp-server-configuration)
- [GitHub Copilot](#github-copilot)
- [Claude Code (CLI)](#claude-code-cli)
- [Claude Desktop](#claude-desktop)
- [Using FuzzForge with AI](#using-fuzzforge-with-ai)
- [CLI Reference](#cli-reference)
- [Environment Variables](#environment-variables)
- [Troubleshooting](#troubleshooting)
---
## Quick Start
> **Prerequisites:** You need [uv](https://docs.astral.sh/uv/) and [Docker](https://docs.docker.com/get-docker/) installed.
> See the [Prerequisites](#prerequisites) section for installation instructions.
```bash
# 1. Clone and install
git clone https://github.com/FuzzingLabs/fuzzforge-oss.git
cd fuzzforge-oss
uv sync
# 2. Build the module images (one-time setup)
make build-modules
# 3. Install MCP for your AI agent
uv run fuzzforge mcp install copilot # For VS Code + GitHub Copilot
# OR
uv run fuzzforge mcp install claude-code # For Claude Code CLI
# 4. Restart your AI agent (VS Code, Claude, etc.)
# 5. Start talking to your AI:
# "List available FuzzForge modules"
# "Analyze this Rust crate for fuzzable functions"
# "Start fuzzing the parse_input function"
```
> **Note:** FuzzForge uses Docker by default. Podman is also supported via `--engine podman`.
---
## Prerequisites
Before installing FuzzForge OSS, ensure you have:
- **Python 3.12+** - [Download Python](https://www.python.org/downloads/)
- **uv** package manager - [Install uv](https://docs.astral.sh/uv/)
- **Docker** - Container runtime ([Install Docker](https://docs.docker.com/get-docker/))
### Installing uv
```bash
# Linux/macOS
curl -LsSf https://astral.sh/uv/install.sh | sh
# Or with pip
pip install uv
```
### Installing Docker
```bash
# Linux (Ubuntu/Debian)
curl -fsSL https://get.docker.com | sh
sudo usermod -aG docker $USER
# Log out and back in for group changes to take effect
# macOS/Windows
# Install Docker Desktop from https://docs.docker.com/get-docker/
```
> **Note:** Podman is also supported. Use `--engine podman` with CLI commands
> or set the `FUZZFORGE_ENGINE=podman` environment variable.
---
## Installation
### 1. Clone the Repository
```bash
git clone https://github.com/FuzzingLabs/fuzzforge-oss.git
cd fuzzforge-oss
```
### 2. Install Dependencies
```bash
uv sync
```
This installs all FuzzForge components in a virtual environment.
### 3. Verify Installation
```bash
uv run fuzzforge --help
```
---
## Building Modules
FuzzForge modules are containerized security tools. After cloning, you need to build them once:
### Build All Modules
```bash
# From the fuzzforge-oss directory
make build-modules
```
This builds all available modules:
- `fuzzforge-rust-analyzer` - Analyzes Rust code for fuzzable functions
- `fuzzforge-cargo-fuzzer` - Runs cargo-fuzz on Rust crates
- `fuzzforge-harness-validator` - Validates generated fuzzing harnesses
- `fuzzforge-crash-analyzer` - Analyzes crash inputs
### Build a Single Module
```bash
# Build a specific module
cd fuzzforge-modules/rust-analyzer
make build
```
### Verify Modules are Built
```bash
# List built module images
docker images | grep fuzzforge
```
You should see something like:
```
fuzzforge-rust-analyzer 0.1.0 abc123def456 2 minutes ago 850 MB
fuzzforge-cargo-fuzzer 0.1.0 789ghi012jkl 2 minutes ago 1.2 GB
...
```
---
## MCP Server Configuration
FuzzForge integrates with AI agents through the Model Context Protocol (MCP). Configure your preferred AI agent to use FuzzForge tools.
### GitHub Copilot
```bash
# That's it! Just run this command:
uv run fuzzforge mcp install copilot
```
The command auto-detects everything:
- **FuzzForge root** - Where FuzzForge is installed
- **Modules path** - Defaults to `fuzzforge-oss/fuzzforge-modules`
- **Docker socket** - Auto-detects `/var/run/docker.sock`
**Optional overrides** (usually not needed):
```bash
uv run fuzzforge mcp install copilot \
--modules /path/to/modules \
--engine podman # if using Podman instead of Docker
```
**After installation:**
1. Restart VS Code
2. Open GitHub Copilot Chat
3. FuzzForge tools are now available!
### Claude Code (CLI)
```bash
uv run fuzzforge mcp install claude-code
```
Installs to `~/.claude.json` so FuzzForge tools are available from any directory.
**After installation:**
1. Run `claude` from any directory
2. FuzzForge tools are now available!
### Claude Desktop
```bash
# Automatic installation
uv run fuzzforge mcp install claude-desktop
# Verify
uv run fuzzforge mcp status
```
**After installation:**
1. Restart Claude Desktop
2. FuzzForge tools are now available!
### Check MCP Status
```bash
uv run fuzzforge mcp status
```
Shows configuration status for all supported AI agents:
```
┏━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Agent ┃ Config Path ┃ Status ┃ FuzzForge Configured ┃
┡━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ GitHub Copilot │ ~/.config/Code/User/mcp.json │ ✓ Exists │ ✓ Yes │
│ Claude Desktop │ ~/.config/Claude/claude_desktop_config... │ Not found │ - │
│ Claude Code │ ~/.claude.json │ ✓ Exists │ ✓ Yes │
└──────────────────────┴───────────────────────────────────────────┴──────────────┴─────────────────────────┘
```
### Generate Config Without Installing
```bash
# Preview the configuration that would be installed
uv run fuzzforge mcp generate copilot
uv run fuzzforge mcp generate claude-desktop
uv run fuzzforge mcp generate claude-code
```
### Remove MCP Configuration
```bash
uv run fuzzforge mcp uninstall copilot
uv run fuzzforge mcp uninstall claude-desktop
uv run fuzzforge mcp uninstall claude-code
```
---
## Using FuzzForge with AI
Once MCP is configured, you interact with FuzzForge through natural language with your AI assistant.
### Example Conversations
**Discover available tools:**
```
You: "What FuzzForge modules are available?"
AI: Uses list_modules → "I found 4 modules: rust-analyzer, cargo-fuzzer,
harness-validator, and crash-analyzer..."
```
**Analyze code for fuzzing targets:**
```
You: "Analyze this Rust crate for functions I should fuzz"
AI: Uses execute_module("rust-analyzer") → "I found 3 good fuzzing candidates:
- parse_input() in src/parser.rs - handles untrusted input
- decode_message() in src/codec.rs - complex parsing logic
..."
```
**Generate and validate harnesses:**
```
You: "Generate a fuzzing harness for the parse_input function"
AI: Creates harness code, then uses execute_module("harness-validator")
→ "Here's a harness that compiles successfully..."
```
**Run continuous fuzzing:**
```
You: "Start fuzzing parse_input for 10 minutes"
AI: Uses start_continuous_module("cargo-fuzzer") → "Started fuzzing session abc123"
You: "How's the fuzzing going?"
AI: Uses get_continuous_status("abc123") → "Running for 5 minutes:
- 150,000 executions
- 2 crashes found
- 45% edge coverage"
You: "Stop and show me the crashes"
AI: Uses stop_continuous_module("abc123") → "Found 2 unique crashes..."
```
### Available MCP Tools
| Tool | Description |
|------|-------------|
| `list_modules` | List all available security modules |
| `execute_module` | Run a module once and get results |
| `start_continuous_module` | Start a long-running module (e.g., fuzzing) |
| `get_continuous_status` | Check status of a continuous session |
| `stop_continuous_module` | Stop a continuous session |
| `list_continuous_sessions` | List all active sessions |
| `get_execution_results` | Retrieve results from an execution |
| `execute_workflow` | Run a multi-step workflow |
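The continuous-session tools behave like a small state machine: `start_continuous_module` returns a session id that `get_continuous_status` and `stop_continuous_module` then operate on. A toy sketch of that contract with stand-in functions (the real tools run inside the MCP server; these bodies are invented for illustration):

```python
import itertools

# Stand-ins for the MCP tools; the real implementations live in the MCP server.
_sessions: dict[str, dict] = {}
_ids = itertools.count(1)

def start_continuous_module(module: str) -> str:
    """Start a long-running module and hand back a session id."""
    session_id = f"session-{next(_ids)}"
    _sessions[session_id] = {"module": module, "state": "running", "executions": 0}
    return session_id

def get_continuous_status(session_id: str) -> dict:
    """Poll a session; pretend some fuzzing progress happened between polls."""
    session = _sessions[session_id]
    if session["state"] == "running":
        session["executions"] += 50_000
    return dict(session)

def stop_continuous_module(session_id: str) -> dict:
    """Stop the session and return its final state."""
    _sessions[session_id]["state"] = "stopped"
    return dict(_sessions[session_id])

sid = start_continuous_module("cargo-fuzzer")
status = get_continuous_status(sid)
final = stop_continuous_module(sid)
print(sid, status["state"], final["state"])
```

Your AI assistant drives exactly this loop when you ask it to "start fuzzing", "check progress", and "stop and show crashes".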
---
## CLI Reference
> **Note:** The CLI is for advanced users. Most users should interact with FuzzForge through their AI assistant.
### MCP Commands
```bash
uv run fuzzforge mcp status # Check configuration status
uv run fuzzforge mcp install <agent> # Install MCP config
uv run fuzzforge mcp uninstall <agent> # Remove MCP config
uv run fuzzforge mcp generate <agent> # Preview config without installing
```
### Module Commands
```bash
uv run fuzzforge modules list # List available modules
uv run fuzzforge modules info <module> # Show module details
uv run fuzzforge modules run <module> --assets . # Run a module
```
### Project Commands
```bash
uv run fuzzforge project init # Initialize a project
uv run fuzzforge project info # Show project info
uv run fuzzforge project executions # List executions
uv run fuzzforge project results <id> # Get execution results
```
---
## Environment Variables
Configure FuzzForge using environment variables:
```bash
# Project paths
export FUZZFORGE_MODULES_PATH=/path/to/modules
export FUZZFORGE_STORAGE_PATH=/path/to/storage
# Container engine (Docker is default)
export FUZZFORGE_ENGINE__TYPE=docker # or podman
# Podman-specific settings (only needed if using Podman under Snap)
export FUZZFORGE_ENGINE__GRAPHROOT=~/.fuzzforge/containers/storage
export FUZZFORGE_ENGINE__RUNROOT=~/.fuzzforge/containers/run
```
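The double-underscore names suggest nested settings (`engine.type`, `engine.graphroot`). A stdlib sketch of how such variables could be folded into a nested config dict, assuming the `__` delimiter convention; FuzzForge's actual settings loader may differ:

```python
def load_settings(environ: dict[str, str], prefix: str = "FUZZFORGE_") -> dict:
    """Fold FUZZFORGE_* variables into a nested dict, splitting keys on '__'."""
    settings: dict = {}
    for key, value in environ.items():
        if not key.startswith(prefix):
            continue  # unrelated variables are ignored
        parts = key[len(prefix):].lower().split("__")
        node = settings
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return settings

env = {
    "FUZZFORGE_MODULES_PATH": "/opt/modules",
    "FUZZFORGE_ENGINE__TYPE": "docker",
    "FUZZFORGE_ENGINE__GRAPHROOT": "~/.fuzzforge/containers/storage",
    "HOME": "/home/user",  # ignored: no FUZZFORGE_ prefix
}
settings = load_settings(env)
print(settings["engine"]["type"], settings["modules_path"])
```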
---
## Troubleshooting
### Docker Not Running
```
Error: Cannot connect to Docker daemon
```
**Solution:**
```bash
# Linux: Start Docker service
sudo systemctl start docker
# macOS/Windows: Start Docker Desktop application
# Verify Docker is running
docker run --rm hello-world
```
### Permission Denied on Docker Socket
```
Error: Permission denied connecting to Docker socket
```
**Solution:**
```bash
# Add your user to the docker group
sudo usermod -aG docker $USER
# Log out and back in for changes to take effect
# Then verify:
docker run --rm hello-world
```
### No Modules Found
```
No modules found.
```
**Solution:**
1. Build the modules first: `make build-modules`
2. Check the modules path: `uv run fuzzforge modules list`
3. Verify images exist: `docker images | grep fuzzforge`
### MCP Server Not Starting
Check the MCP configuration:
```bash
uv run fuzzforge mcp status
```
Verify the configuration file path exists and contains valid JSON.
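A quick stdlib check for that last step, assuming the usual MCP config layout with a top-level `mcpServers` object (individual agents may use a different key):

```python
import json

def check_mcp_config(text: str) -> str:
    """Return a short verdict for the contents of an MCP config file."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return f"invalid JSON at line {exc.lineno}, column {exc.colno}"
    if "mcpServers" not in data:
        return "valid JSON, but no mcpServers entry"
    return "ok"

print(check_mcp_config('{"mcpServers": {"fuzzforge": {"command": "uv"}}}'))
print(check_mcp_config('{"mcpServers": '))  # truncated file
```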
### Module Container Fails to Build
```bash
# Build module container manually to see errors
cd fuzzforge-modules/<module-name>
docker build -t <module-name> .
```
### Using Podman Instead of Docker
If you prefer Podman:
```bash
# Use --engine podman with CLI
uv run fuzzforge mcp install copilot --engine podman
# Or set environment variable
export FUZZFORGE_ENGINE=podman
```
### Check Logs
FuzzForge stores execution logs in the storage directory:
```bash
ls -la ~/.fuzzforge/storage/<project-id>/<execution-id>/
```
---
## Next Steps
- 📖 Read the [Module SDK Guide](fuzzforge-modules/fuzzforge-modules-sdk/README.md) to create custom modules
- 🎬 Check the demos in the [README](README.md)
- 💬 Join our [Discord](https://discord.gg/8XEX33UUwZ) for support
---
<p align="center">
<strong>Built with ❤️ by <a href="https://fuzzinglabs.com">FuzzingLabs</a></strong>
</p>

ai/.gitignore vendored

@@ -1,6 +0,0 @@
.env
__pycache__/
*.pyc
fuzzforge_sessions.db
agentops.log
*.log


@@ -1,110 +0,0 @@
# FuzzForge AI Module
FuzzForge AI is the multi-agent layer that lets you operate the FuzzForge security platform through natural language. It orchestrates local tooling, registered Agent-to-Agent (A2A) peers, and the Prefect-powered backend while keeping long-running context in memory and project knowledge graphs.
## Quick Start
1. **Initialise a project**
```bash
cd /path/to/project
fuzzforge init
```
2. **Review environment settings**: copy `.fuzzforge/.env.template` to `.fuzzforge/.env`, then edit the values to match your provider. The template ships with commented defaults for OpenAI-style usage and placeholders for Cognee keys.
```env
LLM_PROVIDER=openai
LITELLM_MODEL=gpt-5-mini
OPENAI_API_KEY=sk-your-key
FUZZFORGE_MCP_URL=http://localhost:8010/mcp
SESSION_PERSISTENCE=sqlite
```
Optional flags you may want to enable early:
```env
MEMORY_SERVICE=inmemory
AGENTOPS_API_KEY=sk-your-agentops-key # Enable hosted tracing
LOG_LEVEL=INFO # CLI / server log level
```
3. **Populate the knowledge graph**
```bash
fuzzforge ingest --path . --recursive
# alias: fuzzforge rag ingest --path . --recursive
```
4. **Launch the agent shell**
```bash
fuzzforge ai agent
```
Keep the backend running (Prefect API at `FUZZFORGE_MCP_URL`) so workflow commands succeed.
## Everyday Workflow
- Run `fuzzforge ai agent` and start with `list available fuzzforge workflows` or `/memory status` to confirm everything is wired.
- Use natural prompts for automation (`run fuzzforge workflow …`, `search project knowledge for …`) and fall back to slash commands for precision (`/recall`, `/sendfile`).
- Keep `/memory datasets` handy to see which Cognee datasets are available after each ingest.
- Start the HTTP surface with `python -m fuzzforge_ai` when external agents need access to artifacts or graph queries. The CLI stays usable at the same time.
- Refresh the knowledge graph regularly: `fuzzforge ingest --path . --recursive --force` keeps responses aligned with recent code changes.
## What the Agent Can Do
- **Route requests**: automatically selects the right local tool or remote agent using the A2A capability registry.
- **Run security workflows**: list, submit, and monitor FuzzForge workflows via MCP wrappers.
- **Manage artifacts**: create downloadable files for reports, code edits, and shared attachments.
- **Maintain context**: stores session history, semantic recall, and Cognee project graphs.
- **Serve over HTTP**: expose the same agent as an A2A server using `python -m fuzzforge_ai`.
## Essential Commands
Inside `fuzzforge ai agent` you can mix slash commands and free-form prompts:
```text
/list # Show registered A2A agents
/register http://:10201 # Add a remote agent
/artifacts # List generated files
/sendfile SecurityAgent src/report.md "Please review"
You> route_to SecurityAnalyzer: scan ./backend for secrets
You> run fuzzforge workflow static_analysis_scan on ./test_projects/demo
You> search project knowledge for "prefect status" using INSIGHTS
```
Artifacts created during the conversation are served from `.fuzzforge/artifacts/` and exposed through the A2A HTTP API.
## Memory & Knowledge
The module layers three storage systems:
- **Session persistence** (SQLite or in-memory) for chat transcripts.
- **Semantic recall** via the ADK memory service for fuzzy search.
- **Cognee graphs** for project-wide knowledge built from ingestion runs.
Re-run ingestion after major code changes to keep graph answers relevant. If Cognee variables are not set, graph-specific tools automatically respond with a polite "not configured" message.
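Conceptually, the three layers act as a fallback chain: the session transcript answers first, semantic recall next, and the Cognee graph last. A toy sketch with plain dicts standing in for the real stores (none of this is the actual ADK or Cognee API):

```python
def make_chain(*stores):
    """Query named stores in order; later layers answer only when earlier ones miss."""
    def lookup(key: str):
        for name, store in stores:
            if key in store:
                return name, store[key]
        return None, None
    return lookup

# Stand-in contents for each storage layer.
session = {"last command": "run static_analysis_scan"}
semantic = {"terraform secrets": "finding #12 in infra/main.tf"}
graph = {"prefect status": "healthy as of last ingest"}

lookup = make_chain(("session", session), ("semantic", semantic), ("graph", graph))
print(lookup("prefect status"))   # only the graph layer knows this
print(lookup("unknown topic"))    # miss across all layers
```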
## Sample Prompts
Use these to validate the setup once the agent shell is running:
- `list available fuzzforge workflows`
- `run fuzzforge workflow static_analysis_scan on ./backend with target_branch=main`
- `show findings for that run once it finishes`
- `refresh the project knowledge graph for ./backend`
- `search project knowledge for "prefect readiness" using INSIGHTS`
- `/recall terraform secrets`
- `/memory status`
- `ROUTE_TO SecurityAnalyzer: audit infrastructure_vulnerable`
## Need More Detail?
Dive into the dedicated guides under `ai/docs/advanced/`:
- [Architecture](https://docs.fuzzforge.ai/docs/ai/intro): high-level architecture with diagrams and component breakdowns.
- [Ingestion](https://docs.fuzzforge.ai/docs/ai/ingestion.md): command options, Cognee persistence, and prompt examples.
- [Configuration](https://docs.fuzzforge.ai/docs/ai/configuration.md): LLM provider matrix, local model setup, and tracing options.
- [Prompts](https://docs.fuzzforge.ai/docs/ai/prompts.md): slash commands, workflow prompts, and routing tips.
- [A2A Services](https://docs.fuzzforge.ai/docs/ai/a2a-services.md): HTTP endpoints, agent card, and collaboration flow.
- [Memory Persistence](https://docs.fuzzforge.ai/docs/ai/architecture.md#memory--persistence): deep dive on memory storage, datasets, and how `/memory status` inspects them.
## Development Notes
- Entry point for the CLI: `ai/src/fuzzforge_ai/cli.py`
- A2A HTTP server: `ai/src/fuzzforge_ai/a2a_server.py`
- Tool routing & workflow glue: `ai/src/fuzzforge_ai/agent_executor.py`
- Ingestion helpers: `ai/src/fuzzforge_ai/ingest_utils.py`
Install the module in editable mode (`pip install -e ai`) while iterating so CLI changes are picked up immediately.


@@ -1,93 +0,0 @@
FuzzForge AI LLM Configuration Guide
===================================
This note summarises the environment variables and libraries that drive LiteLLM (via the Google ADK runtime) inside the FuzzForge AI module. For complete matrices and advanced examples, read `docs/advanced/configuration.md`.
Core Libraries
--------------
- `google-adk`: hosts the agent runtime, memory services, and LiteLLM bridge.
- `litellm`: provider-agnostic LLM client used by ADK and the executor.
- Provider SDKs: install the SDK that matches your target backend (`openai`, `anthropic`, `google-cloud-aiplatform`, `groq`, etc.).
- Optional extras: `agentops` for tracing, `cognee[all]` for knowledge-graph ingestion, and the `ollama` CLI for running local models.
Quick install of the core libraries:
```
pip install google-adk litellm openai
```
Add any provider-specific SDKs (for example `pip install anthropic groq`) on top of that base.
Baseline Setup
--------------
Copy `.fuzzforge/.env.template` to `.fuzzforge/.env` and set the core fields:
```
LLM_PROVIDER=openai
LITELLM_MODEL=gpt-5-mini
OPENAI_API_KEY=sk-your-key
FUZZFORGE_MCP_URL=http://localhost:8010/mcp
SESSION_PERSISTENCE=sqlite
MEMORY_SERVICE=inmemory
```
LiteLLM Provider Examples
-------------------------
OpenAI-compatible (Azure, etc.):
```
LLM_PROVIDER=azure_openai
LITELLM_MODEL=gpt-4o-mini
LLM_API_KEY=sk-your-azure-key
LLM_ENDPOINT=https://your-resource.openai.azure.com
```
Anthropic:
```
LLM_PROVIDER=anthropic
LITELLM_MODEL=claude-3-haiku-20240307
ANTHROPIC_API_KEY=sk-your-key
```
Ollama (local):
```
LLM_PROVIDER=ollama_chat
LITELLM_MODEL=codellama:latest
OLLAMA_API_BASE=http://localhost:11434
```
Run `ollama pull codellama:latest` so the adapter can respond immediately.
Vertex AI:
```
LLM_PROVIDER=vertex_ai
LITELLM_MODEL=gemini-1.5-pro
GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
```
Provider Checklist
------------------
- **OpenAI / Azure OpenAI**: `LLM_PROVIDER`, `LITELLM_MODEL`, API key, optional endpoint + API version (Azure).
- **Anthropic**: `LLM_PROVIDER=anthropic`, `LITELLM_MODEL`, `ANTHROPIC_API_KEY`.
- **Google Vertex AI**: `LLM_PROVIDER=vertex_ai`, `LITELLM_MODEL`, `GOOGLE_APPLICATION_CREDENTIALS`, `GOOGLE_CLOUD_PROJECT`.
- **Groq**: `LLM_PROVIDER=groq`, `LITELLM_MODEL`, `GROQ_API_KEY`.
- **Ollama / Local**: `LLM_PROVIDER=ollama_chat`, `LITELLM_MODEL`, `OLLAMA_API_BASE`, and the model pulled locally (`ollama pull <model>`).
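The checklist above can be expressed as a small preflight check. The variable names are taken from the examples in this guide; the grouping logic is illustrative, not FuzzForge's actual validation:

```python
# Required variables per provider, as listed in the checklist above.
REQUIRED = {
    "openai": ["LLM_PROVIDER", "LITELLM_MODEL", "OPENAI_API_KEY"],
    "anthropic": ["LLM_PROVIDER", "LITELLM_MODEL", "ANTHROPIC_API_KEY"],
    "vertex_ai": ["LLM_PROVIDER", "LITELLM_MODEL",
                  "GOOGLE_APPLICATION_CREDENTIALS", "GOOGLE_CLOUD_PROJECT"],
    "groq": ["LLM_PROVIDER", "LITELLM_MODEL", "GROQ_API_KEY"],
    "ollama_chat": ["LLM_PROVIDER", "LITELLM_MODEL", "OLLAMA_API_BASE"],
}

def missing_vars(provider: str, environ: dict[str, str]) -> list[str]:
    """List required variables that are unset or empty for the chosen provider."""
    return [name for name in REQUIRED.get(provider, []) if not environ.get(name)]

env = {"LLM_PROVIDER": "anthropic", "LITELLM_MODEL": "claude-3-haiku-20240307"}
print(missing_vars("anthropic", env))  # ANTHROPIC_API_KEY is not set
```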
Knowledge Graph Add-ons
-----------------------
Set these only if you plan to use Cognee project graphs:
```
LLM_COGNEE_PROVIDER=openai
LLM_COGNEE_MODEL=gpt-5-mini
LLM_COGNEE_API_KEY=sk-your-key
```
Tracing & Debugging
-------------------
- Provide `AGENTOPS_API_KEY` to enable hosted traces for every conversation.
- Set `FUZZFORGE_DEBUG=1` (and optionally `LOG_LEVEL=DEBUG`) for verbose executor output.
- Restart the agent after changing environment variables; LiteLLM loads configuration on boot.
Further Reading
---------------
`docs/advanced/configuration.md` covers the provider comparison, debugging flags, and referenced modules.


@@ -1,44 +0,0 @@
[project]
name = "fuzzforge-ai"
version = "0.6.0"
description = "FuzzForge AI orchestration module"
readme = "README.md"
requires-python = ">=3.11"
dependencies = [
"google-adk",
"a2a-sdk",
"litellm",
"python-dotenv",
"httpx",
"uvicorn",
"rich",
"agentops",
"fastmcp",
"mcp",
"typing-extensions",
"cognee>=0.3.0",
]
[project.optional-dependencies]
dev = [
"pytest",
"pytest-asyncio",
"black",
"ruff",
]
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
[tool.hatch.build.targets.wheel]
packages = ["src/fuzzforge_ai"]
[tool.hatch.metadata]
allow-direct-references = true
[tool.uv]
dev-dependencies = [
"pytest",
"pytest-asyncio",
]


@@ -1,24 +0,0 @@
"""
FuzzForge AI Module - Agent-to-Agent orchestration system
This module integrates the fuzzforge_ai components into FuzzForge,
providing intelligent AI agent capabilities for security analysis.
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
__version__ = "0.6.0"
from .agent import FuzzForgeAgent
from .config_manager import ConfigManager
__all__ = ['FuzzForgeAgent', 'ConfigManager']


@@ -1,109 +0,0 @@
"""
FuzzForge A2A Server
Run this to expose FuzzForge as an A2A-compatible agent
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import os
import warnings
import logging
from dotenv import load_dotenv
from fuzzforge_ai.config_bridge import ProjectConfigManager
# Suppress warnings
warnings.filterwarnings("ignore")
logging.getLogger("google.adk").setLevel(logging.ERROR)
logging.getLogger("google.adk.tools.base_authenticated_tool").setLevel(logging.ERROR)
# Load .env from .fuzzforge directory first, then fallback
from pathlib import Path
# Ensure Cognee logs stay inside the project workspace
project_root = Path.cwd()
default_log_dir = project_root / ".fuzzforge" / "logs"
default_log_dir.mkdir(parents=True, exist_ok=True)
log_path = default_log_dir / "cognee.log"
os.environ.setdefault("COGNEE_LOG_PATH", str(log_path))
fuzzforge_env = Path.cwd() / ".fuzzforge" / ".env"
if fuzzforge_env.exists():
load_dotenv(fuzzforge_env, override=True)
else:
load_dotenv(override=True)
# Ensure Cognee uses the project-specific storage paths when available
try:
project_config = ProjectConfigManager()
project_config.setup_cognee_environment()
except Exception:
# Project may not be initialized; fall through with default settings
pass
# Check configuration
if not os.getenv('LITELLM_MODEL'):
print("[ERROR] LITELLM_MODEL not set in .env file")
print("Please set LITELLM_MODEL to your desired model (e.g., gpt-4o-mini)")
exit(1)
from .agent import get_fuzzforge_agent
from .a2a_server import create_a2a_app as create_custom_a2a_app
def create_a2a_app():
"""Create the A2A application"""
# Get configuration
port = int(os.getenv('FUZZFORGE_PORT', 10100))
# Get the FuzzForge agent
fuzzforge = get_fuzzforge_agent()
# Print ASCII banner
print("\033[95m") # Purple color
print(" ███████╗██╗ ██╗███████╗███████╗███████╗ ██████╗ ██████╗ ██████╗ ███████╗ █████╗ ██╗")
print(" ██╔════╝██║ ██║╚══███╔╝╚══███╔╝██╔════╝██╔═══██╗██╔══██╗██╔════╝ ██╔════╝ ██╔══██╗██║")
print(" █████╗ ██║ ██║ ███╔╝ ███╔╝ █████╗ ██║ ██║██████╔╝██║ ███╗█████╗ ███████║██║")
print(" ██╔══╝ ██║ ██║ ███╔╝ ███╔╝ ██╔══╝ ██║ ██║██╔══██╗██║ ██║██╔══╝ ██╔══██║██║")
print(" ██║ ╚██████╔╝███████╗███████╗██║ ╚██████╔╝██║ ██║╚██████╔╝███████╗ ██║ ██║██║")
print(" ╚═╝ ╚═════╝ ╚══════╝╚══════╝╚═╝ ╚═════╝ ╚═╝ ╚═╝ ╚═════╝ ╚══════╝ ╚═╝ ╚═╝╚═╝")
print("\033[0m") # Reset color
# Create A2A app
print(f"🚀 Starting FuzzForge A2A Server")
print(f" Model: {fuzzforge.model}")
if fuzzforge.cognee_url:
print(f" Memory: Cognee at {fuzzforge.cognee_url}")
print(f" Port: {port}")
app = create_custom_a2a_app(fuzzforge.adk_agent, port=port, executor=fuzzforge.executor)
print(f"\n✅ FuzzForge A2A Server ready!")
print(f" Agent card: http://localhost:{port}/.well-known/agent-card.json")
print(f" A2A endpoint: http://localhost:{port}/")
print(f"\n📡 Other agents can register FuzzForge at: http://localhost:{port}")
return app
def main():
"""Start the A2A server using uvicorn."""
import uvicorn
app = create_a2a_app()
port = int(os.getenv('FUZZFORGE_PORT', 10100))
print(f"\n🎯 Starting server with uvicorn...")
uvicorn.run(app, host="127.0.0.1", port=port)
if __name__ == "__main__":
main()


@@ -1,230 +0,0 @@
"""Custom A2A wiring so we can access task store and queue manager."""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
from __future__ import annotations
import logging
from typing import Optional, Union
from starlette.applications import Starlette
from starlette.responses import Response, FileResponse
from starlette.routing import Route
from google.adk.a2a.executor.a2a_agent_executor import A2aAgentExecutor
from google.adk.a2a.utils.agent_card_builder import AgentCardBuilder
from google.adk.a2a.experimental import a2a_experimental
from google.adk.agents.base_agent import BaseAgent
from google.adk.artifacts.in_memory_artifact_service import InMemoryArtifactService
from google.adk.auth.credential_service.in_memory_credential_service import InMemoryCredentialService
from google.adk.cli.utils.logs import setup_adk_logger
from google.adk.memory.in_memory_memory_service import InMemoryMemoryService
from google.adk.runners import Runner
from google.adk.sessions.in_memory_session_service import InMemorySessionService
from a2a.server.apps import A2AStarletteApplication
from a2a.server.request_handlers.default_request_handler import DefaultRequestHandler
from a2a.server.tasks.inmemory_task_store import InMemoryTaskStore
from a2a.server.events.in_memory_queue_manager import InMemoryQueueManager
from a2a.types import AgentCard
from .agent_executor import FuzzForgeExecutor
import json
async def serve_artifact(request):
"""Serve artifact files via HTTP for A2A agents"""
artifact_id = request.path_params["artifact_id"]
# Try to get the executor instance to access artifact cache
# We'll store a reference to it during app creation
executor = getattr(serve_artifact, '_executor', None)
if not executor:
return Response("Artifact service not available", status_code=503)
try:
# Look in the artifact cache directory
artifact_cache_dir = executor._artifact_cache_dir
artifact_dir = artifact_cache_dir / artifact_id
if not artifact_dir.exists():
return Response("Artifact not found", status_code=404)
# Find the artifact file (should be only one file in the directory)
artifact_files = list(artifact_dir.glob("*"))
if not artifact_files:
return Response("Artifact file not found", status_code=404)
artifact_file = artifact_files[0] # Take the first (and should be only) file
# Determine mime type from file extension or default to octet-stream
import mimetypes
mime_type, _ = mimetypes.guess_type(str(artifact_file))
if not mime_type:
mime_type = 'application/octet-stream'
return FileResponse(
path=str(artifact_file),
media_type=mime_type,
filename=artifact_file.name
)
except Exception as e:
return Response(f"Error serving artifact: {str(e)}", status_code=500)
async def knowledge_query(request):
"""Expose knowledge graph search over HTTP for external agents."""
executor = getattr(knowledge_query, '_executor', None)
if not executor:
return Response("Knowledge service not available", status_code=503)
try:
payload = await request.json()
except Exception:
return Response("Invalid JSON body", status_code=400)
query = payload.get("query")
if not query:
return Response("'query' is required", status_code=400)
search_type = payload.get("search_type", "INSIGHTS")
dataset = payload.get("dataset")
result = await executor.query_project_knowledge_api(
query=query,
search_type=search_type,
dataset=dataset,
)
status = 200 if not isinstance(result, dict) or "error" not in result else 400
return Response(
json.dumps(result, default=str),
status_code=status,
media_type="application/json",
)
async def create_file_artifact(request):
"""Create an artifact from a project file via HTTP."""
executor = getattr(create_file_artifact, '_executor', None)
if not executor:
return Response("File service not available", status_code=503)
try:
payload = await request.json()
except Exception:
return Response("Invalid JSON body", status_code=400)
path = payload.get("path")
if not path:
return Response("'path' is required", status_code=400)
result = await executor.create_project_file_artifact_api(path)
status = 200 if not isinstance(result, dict) or "error" not in result else 400
return Response(
json.dumps(result, default=str),
status_code=status,
media_type="application/json",
)
def _load_agent_card(agent_card: Optional[Union[AgentCard, str]]) -> Optional[AgentCard]:
if agent_card is None:
return None
if isinstance(agent_card, AgentCard):
return agent_card
import json
from pathlib import Path
path = Path(agent_card)
with path.open('r', encoding='utf-8') as handle:
data = json.load(handle)
return AgentCard(**data)
@a2a_experimental
def create_a2a_app(
agent: BaseAgent,
*,
host: str = "localhost",
port: int = 8000,
protocol: str = "http",
agent_card: Optional[Union[AgentCard, str]] = None,
executor=None, # Accept executor reference
) -> Starlette:
"""Variant of google.adk.a2a.utils.to_a2a that exposes task-store handles."""
setup_adk_logger(logging.INFO)
async def create_runner() -> Runner:
return Runner(
agent=agent,
app_name=agent.name or "fuzzforge",
artifact_service=InMemoryArtifactService(),
session_service=InMemorySessionService(),
memory_service=InMemoryMemoryService(),
credential_service=InMemoryCredentialService(),
)
task_store = InMemoryTaskStore()
queue_manager = InMemoryQueueManager()
agent_executor = A2aAgentExecutor(runner=create_runner)
request_handler = DefaultRequestHandler(
agent_executor=agent_executor,
task_store=task_store,
queue_manager=queue_manager,
)
rpc_url = f"{protocol}://{host}:{port}/"
provided_card = _load_agent_card(agent_card)
card_builder = AgentCardBuilder(agent=agent, rpc_url=rpc_url)
app = Starlette()
async def setup() -> None:
if provided_card is not None:
final_card = provided_card
else:
final_card = await card_builder.build()
a2a_app = A2AStarletteApplication(
agent_card=final_card,
http_handler=request_handler,
)
a2a_app.add_routes_to_app(app)
# Add artifact serving route
app.router.add_route("/artifacts/{artifact_id}", serve_artifact, methods=["GET"])
app.router.add_route("/graph/query", knowledge_query, methods=["POST"])
app.router.add_route("/project/files", create_file_artifact, methods=["POST"])
app.add_event_handler("startup", setup)
# Expose handles so the executor can emit task updates later
FuzzForgeExecutor.task_store = task_store
FuzzForgeExecutor.queue_manager = queue_manager
# Store reference to executor for artifact serving
serve_artifact._executor = executor
knowledge_query._executor = executor
create_file_artifact._executor = executor
return app
__all__ = ["create_a2a_app"]


@@ -1,133 +0,0 @@
"""
FuzzForge Agent Definition
The core agent that combines all components
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import os
from pathlib import Path
from typing import Dict, Any, List
from google.adk import Agent
from google.adk.models.lite_llm import LiteLlm
from .agent_card import get_fuzzforge_agent_card
from .agent_executor import FuzzForgeExecutor
from .memory_service import FuzzForgeMemoryService, HybridMemoryManager
# Load environment variables from the AI module's .env file
try:
from dotenv import load_dotenv
_ai_dir = Path(__file__).parent
_env_file = _ai_dir / ".env"
if _env_file.exists():
load_dotenv(_env_file, override=False) # Don't override existing env vars
except ImportError:
# dotenv not available, skip loading
pass
class FuzzForgeAgent:
"""The main FuzzForge agent that combines card, executor, and ADK agent"""
def __init__(
self,
model: str = None,
cognee_url: str = None,
port: int = 10100,
):
"""Initialize FuzzForge agent with configuration"""
self.model = model or os.getenv('LITELLM_MODEL', 'gpt-4o-mini')
self.cognee_url = cognee_url or os.getenv('COGNEE_MCP_URL')
self.port = port
# Initialize ADK Memory Service for conversational memory
memory_type = os.getenv('MEMORY_SERVICE', 'inmemory')
self.memory_service = FuzzForgeMemoryService(memory_type=memory_type)
# Create the executor (the brain) with memory and session services
self.executor = FuzzForgeExecutor(
model=self.model,
cognee_url=self.cognee_url,
debug=os.getenv('FUZZFORGE_DEBUG', '0') == '1',
memory_service=self.memory_service,
session_persistence=os.getenv('SESSION_PERSISTENCE', 'inmemory'),
fuzzforge_mcp_url=os.getenv('FUZZFORGE_MCP_URL'),
)
# Create Hybrid Memory Manager (ADK + Cognee direct integration)
# MCP tools removed - using direct Cognee integration only
self.memory_manager = HybridMemoryManager(
memory_service=self.memory_service,
cognee_tools=None # No MCP tools, direct integration used instead
)
# Get the agent card (the identity)
self.agent_card = get_fuzzforge_agent_card(f"http://localhost:{self.port}")
# Create the ADK agent (for A2A server mode)
self.adk_agent = self._create_adk_agent()
def _create_adk_agent(self) -> Agent:
"""Create the ADK agent for A2A server mode"""
# Build instruction
instruction = f"""You are {self.agent_card.name}, {self.agent_card.description}
Your capabilities include:
"""
for skill in self.agent_card.skills:
instruction += f"\n- {skill.name}: {skill.description}"
instruction += """
When responding to requests:
1. Use your registered agents when appropriate
2. Use Cognee memory tools when available
3. Provide helpful, concise responses
4. Maintain context across conversations
"""
# Create ADK agent
return Agent(
model=LiteLlm(model=self.model),
name=self.agent_card.name,
description=self.agent_card.description,
instruction=instruction,
tools=self.executor.agent.tools if hasattr(self.executor.agent, 'tools') else []
)
async def process_message(self, message: str, context_id: str = None) -> str:
"""Process a message using the executor"""
result = await self.executor.execute(message, context_id or "default")
return result.get("response", "No response generated")
async def register_agent(self, url: str) -> Dict[str, Any]:
"""Register a new agent"""
return await self.executor.register_agent(url)
def list_agents(self) -> List[Dict[str, Any]]:
"""List registered agents"""
return self.executor.list_agents()
async def cleanup(self):
"""Clean up resources"""
await self.executor.cleanup()
# Create a singleton instance for import
_instance = None
def get_fuzzforge_agent() -> FuzzForgeAgent:
"""Get the singleton FuzzForge agent instance"""
global _instance
if _instance is None:
_instance = FuzzForgeAgent()
return _instance
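`get_fuzzforge_agent` is a classic lazy module-level singleton: the instance is built on first access and reused afterwards. A generic, self-contained sketch of the same pattern (names here are stand-ins, not FuzzForge types):

```python
from typing import Optional

class _Service:
    """Stand-in for FuzzForgeAgent in this sketch."""
    def __init__(self) -> None:
        self.started = True

_instance: Optional[_Service] = None

def get_service() -> _Service:
    # Lazily create and reuse a single instance, as get_fuzzforge_agent() does
    global _instance
    if _instance is None:
        _instance = _Service()
    return _instance

svc = get_service()
```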


@@ -1,183 +0,0 @@
"""
FuzzForge Agent Card and Skills Definition
Defines what FuzzForge can do and how others can discover it
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
from dataclasses import dataclass
from typing import List, Optional, Dict, Any
@dataclass
class AgentSkill:
"""Represents a specific capability of the agent"""
id: str
name: str
description: str
tags: List[str]
examples: List[str]
input_modes: Optional[List[str]] = None
output_modes: Optional[List[str]] = None
def to_dict(self) -> Dict[str, Any]:
"""Convert to dictionary for JSON serialization"""
return {
"id": self.id,
"name": self.name,
"description": self.description,
"tags": self.tags,
"examples": self.examples,
"inputModes": self.input_modes or ["text/plain"],
"outputModes": self.output_modes or ["text/plain"]
}
@dataclass
class AgentCapabilities:
"""Defines agent capabilities for A2A protocol"""
streaming: bool = False
push_notifications: bool = False
multi_turn: bool = True
context_retention: bool = True
def to_dict(self) -> Dict[str, Any]:
return {
"streaming": self.streaming,
"pushNotifications": self.push_notifications,
"multiTurn": self.multi_turn,
"contextRetention": self.context_retention
}
@dataclass
class AgentCard:
"""The agent's business card - tells others what this agent can do"""
name: str
description: str
version: str
url: str
skills: List[AgentSkill]
capabilities: AgentCapabilities
default_input_modes: Optional[List[str]] = None
default_output_modes: Optional[List[str]] = None
preferred_transport: str = "JSONRPC"
protocol_version: str = "0.3.0"
def to_dict(self) -> Dict[str, Any]:
"""Convert to A2A-compliant agent card JSON"""
return {
"name": self.name,
"description": self.description,
"version": self.version,
"url": self.url,
"protocolVersion": self.protocol_version,
"preferredTransport": self.preferred_transport,
"defaultInputModes": self.default_input_modes or ["text/plain"],
"defaultOutputModes": self.default_output_modes or ["text/plain"],
"capabilities": self.capabilities.to_dict(),
"skills": [skill.to_dict() for skill in self.skills]
}
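The `to_dict` methods above map snake_case dataclass fields to the camelCase keys the A2A card JSON uses, filling in `text/plain` when no modes are given. A self-contained sketch of that mapping for the skill type (trimmed to three fields for brevity):

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional

@dataclass
class Skill:
    id: str
    name: str
    input_modes: Optional[List[str]] = None

    def to_dict(self) -> Dict[str, Any]:
        # snake_case attributes become camelCase JSON keys; modes default to text/plain
        return {
            "id": self.id,
            "name": self.name,
            "inputModes": self.input_modes or ["text/plain"],
        }

card_json = Skill(id="memory", name="Memory Management").to_dict()
```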
# Define FuzzForge's skills
orchestration_skill = AgentSkill(
id="orchestration",
name="Agent Orchestration",
description="Route requests to appropriate registered agents based on their capabilities",
tags=["orchestration", "routing", "coordination"],
examples=[
"Route this to the calculator",
"Send this to the appropriate agent",
"Which agent should handle this?"
]
)
memory_skill = AgentSkill(
id="memory",
name="Memory Management",
description="Store and retrieve information using Cognee knowledge graph",
tags=["memory", "knowledge", "storage", "cognee"],
examples=[
"Remember that my favorite color is blue",
"What do you remember about me?",
"Search your memory for project details"
]
)
conversation_skill = AgentSkill(
id="conversation",
name="General Conversation",
description="Engage in general conversation and answer questions using LLM",
tags=["chat", "conversation", "qa", "llm"],
examples=[
"What is the meaning of life?",
"Explain quantum computing",
"Help me understand this concept"
]
)
workflow_automation_skill = AgentSkill(
id="workflow_automation",
name="Workflow Automation",
description="Operate project workflows via MCP, monitor runs, and share results",
tags=["workflow", "automation", "mcp", "orchestration"],
examples=[
"Submit the security assessment workflow",
"Kick off the infrastructure scan and monitor it",
"Summarise findings for run abc123"
]
)
agent_management_skill = AgentSkill(
id="agent_management",
name="Agent Registry Management",
description="Register, list, and manage connections to other A2A agents",
tags=["registry", "management", "discovery"],
examples=[
"Register agent at http://localhost:10201",
"List all registered agents",
"Show agent capabilities"
]
)
# Define FuzzForge's capabilities
fuzzforge_capabilities = AgentCapabilities(
streaming=False,
push_notifications=True,
multi_turn=True, # We support multi-turn conversations
context_retention=True # We maintain context across turns
)
# Create the public agent card
def get_fuzzforge_agent_card(url: str = "http://localhost:10100") -> AgentCard:
"""Get FuzzForge's agent card with current configuration"""
return AgentCard(
name="ProjectOrchestrator",
description=(
"An A2A-capable project agent that can launch and monitor FuzzForge workflows, "
"consult the project knowledge graph, and coordinate with speciality agents."
),
version="project-agent",
url=url,
skills=[
orchestration_skill,
memory_skill,
conversation_skill,
workflow_automation_skill,
agent_management_skill
],
capabilities=fuzzforge_capabilities,
default_input_modes=["text/plain", "application/json"],
default_output_modes=["text/plain", "application/json"],
preferred_transport="JSONRPC",
protocol_version="0.3.0"
)

File diff suppressed because it is too large


@@ -1,977 +0,0 @@
#!/usr/bin/env python3
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
"""
FuzzForge CLI - Clean modular version
Uses the separated agent components
"""
import asyncio
import shlex
import os
import sys
import signal
import warnings
import logging
import random
from datetime import datetime
from contextlib import contextmanager
from pathlib import Path
from typing import Any
from dotenv import load_dotenv
# Ensure Cognee writes logs inside the project workspace
project_root = Path.cwd()
default_log_dir = project_root / ".fuzzforge" / "logs"
default_log_dir.mkdir(parents=True, exist_ok=True)
log_path = default_log_dir / "cognee.log"
os.environ.setdefault("COGNEE_LOG_PATH", str(log_path))
# Suppress warnings
warnings.filterwarnings("ignore")
logging.basicConfig(level=logging.ERROR)
# Load .env file with explicit path handling
# 1. First check current working directory for .fuzzforge/.env
fuzzforge_env = Path.cwd() / ".fuzzforge" / ".env"
if fuzzforge_env.exists():
load_dotenv(fuzzforge_env, override=True)
else:
# 2. Then check parent directories for .fuzzforge projects
current_path = Path.cwd()
for parent in [current_path] + list(current_path.parents):
fuzzforge_dir = parent / ".fuzzforge"
if fuzzforge_dir.exists():
project_env = fuzzforge_dir / ".env"
if project_env.exists():
load_dotenv(project_env, override=True)
break
else:
# 3. Fallback to generic load_dotenv
load_dotenv(override=True)
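The three-step lookup above (current directory's `.fuzzforge/.env`, then each parent directory, then a generic `load_dotenv` fallback) amounts to a walk up the directory tree. A pure path-search sketch of the first two steps (the helper name is hypothetical):

```python
from pathlib import Path
from typing import Optional

def find_project_env(start: Path) -> Optional[Path]:
    # Walk from start upward, returning the first .fuzzforge/.env found
    for candidate in [start, *start.parents]:
        env_file = candidate / ".fuzzforge" / ".env"
        if env_file.exists():
            return env_file
    return None
```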
# Enhanced readline configuration for Rich Console input compatibility
try:
import readline
# Enable Rich-compatible input features
readline.parse_and_bind("tab: complete")
readline.parse_and_bind("set editing-mode emacs")
readline.parse_and_bind("set show-all-if-ambiguous on")
readline.parse_and_bind("set completion-ignore-case on")
readline.parse_and_bind("set colored-completion-prefix on")
readline.parse_and_bind("set enable-bracketed-paste on") # Better paste support
# Navigation bindings for better editing
readline.parse_and_bind("Control-a: beginning-of-line")
readline.parse_and_bind("Control-e: end-of-line")
readline.parse_and_bind("Control-u: unix-line-discard")
readline.parse_and_bind("Control-k: kill-line")
readline.parse_and_bind("Control-w: unix-word-rubout")
readline.parse_and_bind("Meta-Backspace: backward-kill-word")
# History and completion
readline.set_history_length(2000)
readline.set_startup_hook(None)
# Enable multiline editing hints
readline.parse_and_bind("set horizontal-scroll-mode off")
readline.parse_and_bind("set mark-symlinked-directories on")
READLINE_AVAILABLE = True
except ImportError:
READLINE_AVAILABLE = False
from rich.console import Console
from rich.table import Table
from rich.panel import Panel
from rich.prompt import Prompt
from rich import box
from google.adk.events.event import Event
from google.adk.events.event_actions import EventActions
from google.genai import types as gen_types
from .agent import FuzzForgeAgent
from .agent_card import get_fuzzforge_agent_card
from .config_manager import ConfigManager
from .config_bridge import ProjectConfigManager
from .remote_agent import RemoteAgentConnection
console = Console()
# Global shutdown flag
shutdown_requested = False
# Dynamic status messages for better UX
THINKING_MESSAGES = [
"Thinking", "Processing", "Computing", "Analyzing", "Working",
"Pondering", "Deliberating", "Calculating", "Reasoning", "Evaluating"
]
WORKING_MESSAGES = [
"Working", "Processing", "Handling", "Executing", "Running",
"Operating", "Performing", "Conducting", "Managing", "Coordinating"
]
SEARCH_MESSAGES = [
"Searching", "Scanning", "Exploring", "Investigating", "Hunting",
"Seeking", "Probing", "Examining", "Inspecting", "Browsing"
]
# Cool prompt symbols
PROMPT_STYLES = [
"", "", "", "", "»", "", "", "", "", ""
]
def get_dynamic_status(action_type="thinking"):
"""Get a random status message based on action type"""
if action_type == "thinking":
return f"{random.choice(THINKING_MESSAGES)}..."
elif action_type == "working":
return f"{random.choice(WORKING_MESSAGES)}..."
elif action_type == "searching":
return f"{random.choice(SEARCH_MESSAGES)}..."
else:
return f"{random.choice(THINKING_MESSAGES)}..."
def get_prompt_symbol():
"""Get prompt symbol indicating where to write"""
return ">>"
def signal_handler(signum, frame):
"""Handle Ctrl+C gracefully"""
global shutdown_requested
shutdown_requested = True
console.print("\n\n[yellow]Shutting down gracefully...[/yellow]")
sys.exit(0)
signal.signal(signal.SIGINT, signal_handler)
@contextmanager
def safe_status(message: str):
"""Safe status context manager"""
status = console.status(message, spinner="dots")
try:
status.start()
yield
finally:
status.stop()
class FuzzForgeCLI:
"""Command-line interface for FuzzForge"""
def __init__(self):
"""Initialize the CLI"""
# Ensure .env is loaded from .fuzzforge directory
fuzzforge_env = Path.cwd() / ".fuzzforge" / ".env"
if fuzzforge_env.exists():
load_dotenv(fuzzforge_env, override=True)
# Load configuration for agent registry
self.config_manager = ConfigManager()
# Check environment configuration
if not os.getenv('LITELLM_MODEL'):
console.print("[red]ERROR: LITELLM_MODEL not set in .env file[/red]")
console.print("Please set LITELLM_MODEL to your desired model")
sys.exit(1)
# Create the agent (uses env vars directly)
self.agent = FuzzForgeAgent()
# Create a consistent context ID for this CLI session
self.context_id = f"cli_{datetime.now().strftime('%Y%m%d_%H%M%S')}"
# Track registered agents for config persistence
self.agents_modified = False
# Command handlers
self.commands = {
"/help": self.cmd_help,
"/register": self.cmd_register,
"/unregister": self.cmd_unregister,
"/list": self.cmd_list,
"/memory": self.cmd_memory,
"/recall": self.cmd_recall,
"/artifacts": self.cmd_artifacts,
"/tasks": self.cmd_tasks,
"/skills": self.cmd_skills,
"/sessions": self.cmd_sessions,
"/clear": self.cmd_clear,
"/sendfile": self.cmd_sendfile,
"/quit": self.cmd_quit,
"/exit": self.cmd_quit,
}
self.background_tasks: set[asyncio.Task] = set()
def print_banner(self):
"""Print welcome banner"""
card = self.agent.agent_card
# Print ASCII banner
console.print("[medium_purple3] ███████╗██╗ ██╗███████╗███████╗███████╗ ██████╗ ██████╗ ██████╗ ███████╗ █████╗ ██╗[/medium_purple3]")
console.print("[medium_purple3] ██╔════╝██║ ██║╚══███╔╝╚══███╔╝██╔════╝██╔═══██╗██╔══██╗██╔════╝ ██╔════╝ ██╔══██╗██║[/medium_purple3]")
console.print("[medium_purple3] █████╗ ██║ ██║ ███╔╝ ███╔╝ █████╗ ██║ ██║██████╔╝██║ ███╗█████╗ ███████║██║[/medium_purple3]")
console.print("[medium_purple3] ██╔══╝ ██║ ██║ ███╔╝ ███╔╝ ██╔══╝ ██║ ██║██╔══██╗██║ ██║██╔══╝ ██╔══██║██║[/medium_purple3]")
console.print("[medium_purple3] ██║ ╚██████╔╝███████╗███████╗██║ ╚██████╔╝██║ ██║╚██████╔╝███████╗ ██║ ██║██║[/medium_purple3]")
console.print("[medium_purple3] ╚═╝ ╚═════╝ ╚══════╝╚══════╝╚═╝ ╚═════╝ ╚═╝ ╚═╝ ╚═════╝ ╚══════╝ ╚═╝ ╚═╝╚═╝[/medium_purple3]")
console.print(f"\n[dim]{card.description}[/dim]\n")
provider = (
os.getenv("LLM_PROVIDER")
or os.getenv("LLM_COGNEE_PROVIDER")
or os.getenv("COGNEE_LLM_PROVIDER")
or "unknown"
)
console.print(
"LLM Provider: [medium_purple1]{provider}[/medium_purple1]".format(
provider=provider
)
)
console.print(
"LLM Model: [medium_purple1]{model}[/medium_purple1]".format(
model=self.agent.model
)
)
if self.agent.executor.agentops_trace:
console.print("Tracking: [medium_purple1]AgentOps active[/medium_purple1]")
# Show skills
console.print("\nSkills:")
for skill in card.skills:
console.print(
f" • [deep_sky_blue1]{skill.name}[/deep_sky_blue1] {skill.description}"
)
console.print("\nType /help for commands or just chat\n")
async def cmd_help(self, args: str = "") -> None:
"""Show help"""
help_text = """
[bold]Commands:[/bold]
/register <url> - Register an A2A agent (saves to config)
/unregister <name> - Remove agent from registry and config
/list - List registered agents
[bold]Memory Systems:[/bold]
/recall <query> - Search past conversations (ADK Memory)
/memory - Show knowledge graph (Cognee)
/memory save - Save to knowledge graph
/memory search - Search knowledge graph
[bold]Other:[/bold]
/artifacts - List created artifacts
/artifacts <id> - Show artifact content
/tasks [id] - Show task list or details
/skills - Show FuzzForge skills
/sessions - List active sessions
/sendfile <agent> <path> [message] - Attach file as artifact and route to agent
/clear - Clear screen
/help - Show this help
/quit - Exit
[bold]Sample prompts:[/bold]
run fuzzforge workflow security_assessment on /absolute/path --volume-mode ro
list fuzzforge runs limit=5
get fuzzforge summary <run_id>
query project knowledge about "unsafe Rust" using GRAPH_COMPLETION
export project file src/lib.rs as artifact
/memory search "recent findings"
[bold]Input Editing:[/bold]
Arrow keys - Move cursor
Ctrl+A/E - Start/end of line
Up/Down - Command history
"""
console.print(help_text)
async def cmd_register(self, args: str) -> None:
"""Register an agent"""
if not args:
console.print("Usage: /register <url>")
return
with safe_status(f"{get_dynamic_status('working')} Registering {args}"):
result = await self.agent.register_agent(args.strip())
if result["success"]:
console.print(f"✅ Registered: [bold]{result['name']}[/bold]")
console.print(f" Capabilities: {result['capabilities']} skills")
# Get description from the agent's card
agents = self.agent.list_agents()
description = ""
for agent in agents:
if agent['name'] == result['name']:
description = agent.get('description', '')
break
# Add to config for persistence
self.config_manager.add_registered_agent(
name=result['name'],
url=args.strip(),
description=description
)
console.print(f" [dim]Saved to config for auto-registration[/dim]")
else:
console.print(f"[red]Failed: {result['error']}[/red]")
async def cmd_unregister(self, args: str) -> None:
"""Unregister an agent and remove from config"""
if not args:
console.print("Usage: /unregister <name or url>")
return
# Try to find the agent
agents = self.agent.list_agents()
agent_to_remove = None
for agent in agents:
if agent['name'].lower() == args.lower() or agent['url'] == args:
agent_to_remove = agent
break
if not agent_to_remove:
console.print(f"[yellow]Agent '{args}' not found[/yellow]")
return
# Remove from config
if self.config_manager.remove_registered_agent(name=agent_to_remove['name'], url=agent_to_remove['url']):
console.print(f"✅ Unregistered: [bold]{agent_to_remove['name']}[/bold]")
console.print(f" [dim]Removed from config (won't auto-register next time)[/dim]")
else:
console.print(f"[yellow]Agent unregistered from session but not found in config[/yellow]")
async def cmd_list(self, args: str = "") -> None:
"""List registered agents"""
agents = self.agent.list_agents()
if not agents:
console.print("No agents registered. Use /register <url>")
return
table = Table(title="Registered Agents", box=box.ROUNDED)
table.add_column("Name", style="medium_purple3")
table.add_column("URL", style="deep_sky_blue3")
table.add_column("Skills", style="plum3")
table.add_column("Description", style="dim")
for agent in agents:
desc = agent['description']
if len(desc) > 40:
desc = desc[:37] + "..."
table.add_row(
agent['name'],
agent['url'],
str(agent['skills']),
desc
)
console.print(table)
async def cmd_recall(self, args: str = "") -> None:
"""Search conversational memory (past conversations)"""
if not args:
console.print("Usage: /recall <query>")
return
await self._sync_conversational_memory()
# First try MemoryService (for ingested memories)
with safe_status(get_dynamic_status('searching')):
results = await self.agent.memory_manager.search_conversational_memory(args)
if results and results.memories:
console.print(f"[bold]Found {len(results.memories)} memories:[/bold]\n")
for i, memory in enumerate(results.memories, 1):
# MemoryEntry has 'text' field, not 'content'
text = getattr(memory, 'text', str(memory))
if len(text) > 200:
text = text[:200] + "..."
console.print(f"{i}. {text}")
else:
# If MemoryService is empty, search SQLite directly
console.print("[yellow]No memories in MemoryService, searching SQLite sessions...[/yellow]")
# Check if using DatabaseSessionService
if hasattr(self.agent.executor, 'session_service'):
service_type = type(self.agent.executor.session_service).__name__
if service_type == 'DatabaseSessionService':
# Search SQLite database directly
import sqlite3
db_path = os.getenv('SESSION_DB_PATH', './fuzzforge_sessions.db')
if os.path.exists(db_path):
conn = sqlite3.connect(db_path)
cursor = conn.cursor()
# Search in events table
query = f"%{args}%"
cursor.execute(
"SELECT content FROM events WHERE content LIKE ? LIMIT 10",
(query,)
)
rows = cursor.fetchall()
conn.close()
if rows:
console.print(f"[green]Found {len(rows)} matches in SQLite sessions:[/green]\n")
for i, (content,) in enumerate(rows, 1):
# Parse JSON content
import json
try:
data = json.loads(content)
if 'parts' in data and data['parts']:
text = data['parts'][0].get('text', '')[:150]
role = data.get('role', 'unknown')
console.print(f"{i}. [{role}]: {text}...")
except Exception:
console.print(f"{i}. {content[:150]}...")
else:
console.print("[yellow]No matches found in SQLite either[/yellow]")
else:
console.print("[yellow]SQLite database not found[/yellow]")
else:
console.print(f"[dim]Using {service_type} (not searchable)[/dim]")
else:
console.print("[yellow]No session history available[/yellow]")
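The SQLite fallback in `cmd_recall` is a plain `LIKE` substring query against the `events` table. An in-memory sketch of the same search, with a minimal assumed schema (one `content` column holding event JSON):

```python
import sqlite3

# In-memory stand-in for the ADK session database (schema assumed)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (content TEXT)")
conn.executemany(
    "INSERT INTO events (content) VALUES (?)",
    [('{"role": "user", "parts": [{"text": "remember the fuzz target"}]}',),
     ('{"role": "model", "parts": [{"text": "noted"}]}',)],
)

def search_events(term: str, limit: int = 10):
    # LIKE-based substring search with parameter binding, as the fallback does
    cursor = conn.execute(
        "SELECT content FROM events WHERE content LIKE ? LIMIT ?",
        (f"%{term}%", limit),
    )
    return [row[0] for row in cursor.fetchall()]

matches = search_events("fuzz target")
```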
async def cmd_memory(self, args: str = "") -> None:
"""Inspect conversational memory and knowledge graph state."""
raw_args = (args or "").strip()
lower_args = raw_args.lower()
if not raw_args or lower_args in {"status", "info"}:
await self._show_memory_status()
return
if lower_args == "datasets":
await self._show_dataset_summary()
return
if lower_args.startswith("search ") or lower_args.startswith("recall "):
query = raw_args.split(" ", 1)[1].strip() if " " in raw_args else ""
if not query:
console.print("Usage: /memory search <query>")
return
await self.cmd_recall(query)
return
console.print("Usage: /memory [status|datasets|search <query>]")
console.print("[dim]/memory search <query> is an alias for /recall <query>[/dim]")
async def _sync_conversational_memory(self) -> None:
"""Ensure the ADK memory service ingests any completed sessions."""
memory_service = getattr(self.agent.memory_manager, "memory_service", None)
executor_sessions = getattr(self.agent.executor, "sessions", {})
metadata_map = getattr(self.agent.executor, "session_metadata", {})
if not memory_service or not executor_sessions:
return
for context_id, session in list(executor_sessions.items()):
meta = metadata_map.get(context_id, {})
if meta.get('memory_synced'):
continue
add_session = getattr(memory_service, "add_session_to_memory", None)
if not callable(add_session):
return
try:
await add_session(session)
meta['memory_synced'] = True
metadata_map[context_id] = meta
except Exception as exc: # pragma: no cover - defensive logging
if os.getenv('FUZZFORGE_DEBUG', '0') == '1':
console.print(f"[yellow]Memory sync failed:[/yellow] {exc}")
async def _show_memory_status(self) -> None:
"""Render conversational memory, session store, and knowledge graph status."""
await self._sync_conversational_memory()
status = self.agent.memory_manager.get_status()
conversational = status.get("conversational_memory", {})
conv_type = conversational.get("type", "unknown")
conv_active = "yes" if conversational.get("active") else "no"
conv_details = conversational.get("details", "")
session_service = getattr(self.agent.executor, "session_service", None)
session_service_name = type(session_service).__name__ if session_service else "Unavailable"
session_lines = [
f"[bold]Service:[/bold] {session_service_name}"
]
session_count = None
event_count = None
db_path_display = None
if session_service_name == "DatabaseSessionService":
import sqlite3
db_path = os.getenv('SESSION_DB_PATH', './fuzzforge_sessions.db')
session_path = Path(db_path).expanduser().resolve()
db_path_display = str(session_path)
if session_path.exists():
try:
with sqlite3.connect(session_path) as conn:
cursor = conn.cursor()
cursor.execute("SELECT COUNT(*) FROM sessions")
session_count = cursor.fetchone()[0]
cursor.execute("SELECT COUNT(*) FROM events")
event_count = cursor.fetchone()[0]
except Exception as exc:
session_lines.append(f"[yellow]Warning:[/yellow] Unable to read session database ({exc})")
else:
session_lines.append("[yellow]SQLite session database not found yet[/yellow]")
elif session_service_name == "InMemorySessionService":
session_lines.append("[dim]Session data persists for the current process only[/dim]")
if db_path_display:
session_lines.append(f"[bold]Database:[/bold] {db_path_display}")
if session_count is not None:
session_lines.append(f"[bold]Sessions Recorded:[/bold] {session_count}")
if event_count is not None:
session_lines.append(f"[bold]Events Logged:[/bold] {event_count}")
conv_lines = [
f"[bold]Type:[/bold] {conv_type}",
f"[bold]Active:[/bold] {conv_active}"
]
if conv_details:
conv_lines.append(f"[bold]Details:[/bold] {conv_details}")
console.print(Panel("\n".join(conv_lines), title="Conversation Memory", border_style="medium_purple3"))
console.print(Panel("\n".join(session_lines), title="Session Store", border_style="deep_sky_blue3"))
# Knowledge graph section
knowledge = status.get("knowledge_graph", {})
kg_active = knowledge.get("active", False)
kg_lines = [
f"[bold]Active:[/bold] {'yes' if kg_active else 'no'}",
f"[bold]Purpose:[/bold] {knowledge.get('purpose', 'N/A')}"
]
cognee_data = None
cognee_error = None
try:
project_config = ProjectConfigManager()
cognee_data = project_config.get_cognee_config()
except Exception as exc: # pragma: no cover - defensive
cognee_error = str(exc)
if cognee_data:
data_dir = cognee_data.get('data_directory')
system_dir = cognee_data.get('system_directory')
if data_dir:
kg_lines.append(f"[bold]Data dir:[/bold] {data_dir}")
if system_dir:
kg_lines.append(f"[bold]System dir:[/bold] {system_dir}")
elif cognee_error:
kg_lines.append(f"[yellow]Config unavailable:[/yellow] {cognee_error}")
dataset_summary = None
if kg_active:
try:
integration = await self.agent.executor._get_knowledge_integration()
if integration:
dataset_summary = await integration.list_datasets()
except Exception as exc: # pragma: no cover - defensive
kg_lines.append(f"[yellow]Dataset listing failed:[/yellow] {exc}")
if dataset_summary:
if dataset_summary.get("error"):
kg_lines.append(f"[yellow]Dataset listing failed:[/yellow] {dataset_summary['error']}")
else:
datasets = dataset_summary.get("datasets", [])
total = dataset_summary.get("total_datasets")
if total is not None:
kg_lines.append(f"[bold]Datasets:[/bold] {total}")
if datasets:
preview = ", ".join(sorted(datasets)[:5])
if len(datasets) > 5:
preview += ", …"
kg_lines.append(f"[bold]Samples:[/bold] {preview}")
else:
kg_lines.append("[dim]Run `fuzzforge ingest` to populate the knowledge graph[/dim]")
console.print(Panel("\n".join(kg_lines), title="Knowledge Graph", border_style="spring_green4"))
console.print("\n[dim]Subcommands: /memory datasets | /memory search <query>[/dim]")
async def _show_dataset_summary(self) -> None:
"""List datasets available in the Cognee knowledge graph."""
try:
integration = await self.agent.executor._get_knowledge_integration()
except Exception as exc:
console.print(f"[yellow]Knowledge graph unavailable:[/yellow] {exc}")
return
if not integration:
console.print("[yellow]Knowledge graph is not initialised yet.[/yellow]")
console.print("[dim]Run `fuzzforge ingest --path . --recursive` to create the project dataset.[/dim]")
return
with safe_status(get_dynamic_status('searching')):
dataset_info = await integration.list_datasets()
if dataset_info.get("error"):
console.print(f"[red]{dataset_info['error']}[/red]")
return
datasets = dataset_info.get("datasets", [])
if not datasets:
console.print("[yellow]No datasets found.[/yellow]")
console.print("[dim]Run `fuzzforge ingest` to populate the knowledge graph.[/dim]")
return
table = Table(title="Cognee Datasets", box=box.ROUNDED)
table.add_column("Dataset", style="medium_purple3")
table.add_column("Notes", style="dim")
for name in sorted(datasets):
note = ""
if name.endswith("_codebase"):
note = "primary project dataset"
table.add_row(name, note)
console.print(table)
console.print(
"[dim]Use knowledge graph prompts (e.g. `search project knowledge for \"topic\" using INSIGHTS`) to query these datasets.[/dim]"
)
async def cmd_artifacts(self, args: str = "") -> None:
"""List or show artifacts"""
if args:
# Show specific artifact
artifacts = await self.agent.executor.get_artifacts(self.context_id)
for artifact in artifacts:
if artifact['id'] == args or args in artifact['id']:
console.print(Panel(
f"[bold]{artifact['title']}[/bold]\n"
f"Type: {artifact['type']} | Created: {artifact['created_at'][:19]}\n\n"
f"[code]{artifact['content']}[/code]",
title=f"Artifact: {artifact['id']}",
border_style="medium_purple3"
))
return
console.print(f"[yellow]Artifact {args} not found[/yellow]")
return
# List all artifacts
artifacts = await self.agent.executor.get_artifacts(self.context_id)
if not artifacts:
console.print("No artifacts created yet")
console.print("[dim]Artifacts are created when generating code, configs, or documents[/dim]")
return
table = Table(title="Artifacts", box=box.ROUNDED)
table.add_column("ID", style="medium_purple3")
table.add_column("Type", style="deep_sky_blue3")
table.add_column("Title", style="plum3")
table.add_column("Size", style="dim")
table.add_column("Created", style="dim")
for artifact in artifacts:
size = f"{len(artifact['content'])} chars"
created = artifact['created_at'][:19] # Just date and time
table.add_row(
artifact['id'],
artifact['type'],
artifact['title'][:40] + "..." if len(artifact['title']) > 40 else artifact['title'],
size,
created
)
console.print(table)
console.print("\n[dim]Use /artifacts <id> to view artifact content[/dim]")
async def cmd_tasks(self, args: str = "") -> None:
"""List tasks or show details for a specific task."""
store = getattr(self.agent.executor, "task_store", None)
if not store or not hasattr(store, "tasks"):
console.print("Task store not available")
return
task_id = args.strip()
async with store.lock:
tasks = dict(store.tasks)
if not tasks:
console.print("No tasks recorded yet")
return
if task_id:
task = tasks.get(task_id)
if not task:
console.print(f"Task '{task_id}' not found")
return
state_str = task.status.state.value if hasattr(task.status.state, "value") else str(task.status.state)
console.print(f"\n[bold]Task {task.id}[/bold]")
console.print(f"Context: {task.context_id}")
console.print(f"State: {state_str}")
console.print(f"Timestamp: {task.status.timestamp}")
if task.metadata:
console.print("Metadata:")
for key, value in task.metadata.items():
console.print(f"{key}: {value}")
if task.history:
console.print("History:")
for entry in task.history[-5:]:
text = getattr(entry, "text", None)
if not text and hasattr(entry, "parts"):
text = " ".join(
getattr(part, "text", "") for part in getattr(entry, "parts", [])
)
console.print(f" - {text}")
return
table = Table(title="FuzzForge Tasks", box=box.ROUNDED)
table.add_column("ID", style="medium_purple3")
table.add_column("State", style="white")
table.add_column("Workflow", style="deep_sky_blue3")
table.add_column("Updated", style="green")
for task in tasks.values():
state_value = task.status.state.value if hasattr(task.status.state, "value") else str(task.status.state)
workflow = ""
if task.metadata:
workflow = task.metadata.get("workflow") or task.metadata.get("workflow_name") or ""
timestamp = task.status.timestamp if task.status else ""
table.add_row(task.id, state_value, workflow, timestamp)
console.print(table)
console.print("\n[dim]Use /tasks <id> to view task details[/dim]")
async def cmd_sessions(self, args: str = "") -> None:
"""List active sessions"""
sessions = self.agent.executor.sessions
if not sessions:
console.print("No active sessions")
return
table = Table(title="Active Sessions", box=box.ROUNDED)
table.add_column("Context ID", style="medium_purple3")
table.add_column("Session ID", style="deep_sky_blue3")
table.add_column("User ID", style="plum3")
table.add_column("State", style="dim")
for context_id, session in sessions.items():
# Get session info
session_id = getattr(session, 'id', 'N/A')
user_id = getattr(session, 'user_id', 'N/A')
state = getattr(session, 'state', {})
# Format state info
agents_count = len(state.get('registered_agents', []))
state_info = f"{agents_count} agents registered"
table.add_row(
context_id[:20] + "..." if len(context_id) > 20 else context_id,
session_id[:20] + "..." if len(str(session_id)) > 20 else str(session_id),
user_id,
state_info
)
console.print(table)
console.print(f"\n[dim]Current session: {self.context_id}[/dim]")
async def cmd_skills(self, args: str = "") -> None:
"""Show FuzzForge skills"""
card = self.agent.agent_card
table = Table(title=f"{card.name} Skills", box=box.ROUNDED)
table.add_column("Skill", style="medium_purple3")
table.add_column("Description", style="white")
table.add_column("Tags", style="deep_sky_blue3")
for skill in card.skills:
table.add_row(
skill.name,
skill.description,
", ".join(skill.tags[:3])
)
console.print(table)
async def cmd_clear(self, args: str = "") -> None:
"""Clear screen"""
console.clear()
self.print_banner()
async def cmd_sendfile(self, args: str) -> None:
"""Encode a local file as an artifact and route it to a registered agent."""
tokens = shlex.split(args)
if len(tokens) < 2:
console.print("Usage: /sendfile <agent_name> <path> [message]")
return
agent_name = tokens[0]
file_arg = tokens[1]
note = " ".join(tokens[2:]).strip()
file_path = Path(file_arg).expanduser()
if not file_path.exists():
console.print(f"[red]File not found:[/red] {file_path}")
return
session = self.agent.executor.sessions.get(self.context_id)
if not session:
console.print("[red]No active session available. Try sending a prompt first.[/red]")
return
console.print(f"[dim]Delegating {file_path.name} to {agent_name}...[/dim]")
async def _delegate() -> None:
try:
response = await self.agent.executor.delegate_file_to_agent(
agent_name,
str(file_path),
note,
session=session,
context_id=self.context_id,
)
console.print(f"[{agent_name}]: {response}")
except Exception as exc:
console.print(f"[red]Failed to delegate file:[/red] {exc}")
finally:
self.background_tasks.discard(asyncio.current_task())
task = asyncio.create_task(_delegate())
self.background_tasks.add(task)
console.print("[dim]Delegation in progress… you can continue working.[/dim]")
async def cmd_quit(self, args: str = "") -> None:
"""Exit the CLI"""
console.print("\n[green]Shutting down...[/green]")
await self.agent.cleanup()
if self.background_tasks:
for task in list(self.background_tasks):
task.cancel()
await asyncio.gather(*self.background_tasks, return_exceptions=True)
console.print("Goodbye!\n")
sys.exit(0)
async def process_command(self, text: str) -> bool:
"""Process slash commands"""
if not text.startswith('/'):
return False
parts = text.split(maxsplit=1)
cmd = parts[0].lower()
args = parts[1] if len(parts) > 1 else ""
if cmd in self.commands:
await self.commands[cmd](args)
return True
console.print(f"Unknown command: {cmd}")
return True
async def auto_register_agents(self):
"""Auto-register agents from config on startup"""
agents_to_register = self.config_manager.get_registered_agents()
if agents_to_register:
console.print(f"\n[dim]Auto-registering {len(agents_to_register)} agents from config...[/dim]")
for agent_config in agents_to_register:
url = agent_config.get('url')
name = agent_config.get('name', 'Unknown')
if url:
try:
with safe_status(f"Registering {name}..."):
result = await self.agent.register_agent(url)
if result["success"]:
console.print(f"{name}: [green]Connected[/green]")
else:
console.print(f" ⚠️ {name}: [yellow]Failed - {result.get('error', 'Unknown error')}[/yellow]")
except Exception as e:
console.print(f" ⚠️ {name}: [yellow]Failed - {e}[/yellow]")
console.print("") # Empty line for spacing
async def run(self):
"""Main CLI loop"""
self.print_banner()
# Auto-register agents from config
await self.auto_register_agents()
while not shutdown_requested:
try:
# Use standard input with non-deletable colored prompt
prompt_symbol = get_prompt_symbol()
try:
# Print colored prompt then use input() for non-deletable behavior
console.print(f"[medium_purple3]{prompt_symbol}[/medium_purple3] ", end="")
user_input = input().strip()
except (EOFError, KeyboardInterrupt):
raise
if not user_input:
continue
# Check for commands
if await self.process_command(user_input):
continue
# Process message
with safe_status(get_dynamic_status('thinking')):
response = await self.agent.process_message(user_input, self.context_id)
# Display response
console.print(f"\n{response}\n")
except KeyboardInterrupt:
await self.cmd_quit()
except EOFError:
await self.cmd_quit()
except Exception as e:
console.print(f"[red]Error: {e}[/red]")
if os.getenv('FUZZFORGE_DEBUG') == '1':
console.print_exception()
console.print("")
await self.agent.cleanup()
def main():
"""Main entry point"""
try:
cli = FuzzForgeCLI()
asyncio.run(cli.run())
except KeyboardInterrupt:
console.print("\n[yellow]Interrupted[/yellow]")
sys.exit(0)
except Exception as e:
console.print(f"[red]Fatal error: {e}[/red]")
if os.getenv('FUZZFORGE_DEBUG') == '1':
console.print_exception()
sys.exit(1)
if __name__ == "__main__":
main()
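The slash-command handling in `process_command()` above boils down to a table of `"/name"` strings mapped to async handlers, with non-slash input falling through to the normal prompt path. A minimal standalone sketch of that dispatch pattern (the handlers and `log` attribute here are illustrative, not the real `FuzzForgeCLI` members):

```python
import asyncio

class MiniCLI:
    """Toy stand-in demonstrating the slash-command dispatch pattern."""

    def __init__(self):
        # Command table: "/name" -> async handler, as in FuzzForgeCLI.commands
        self.commands = {
            "/clear": self.cmd_clear,
            "/quit": self.cmd_quit,
        }
        self.log = []

    async def cmd_clear(self, args: str = "") -> None:
        self.log.append("cleared")

    async def cmd_quit(self, args: str = "") -> None:
        self.log.append("quit")

    async def process_command(self, text: str) -> bool:
        """Return True if the input was consumed as a slash command."""
        if not text.startswith("/"):
            return False  # plain prompt, handled elsewhere
        parts = text.split(maxsplit=1)
        cmd = parts[0].lower()
        args = parts[1] if len(parts) > 1 else ""
        if cmd in self.commands:
            await self.commands[cmd](args)
            return True
        # Unknown slash commands are still consumed, mirroring the CLI above
        self.log.append(f"unknown: {cmd}")
        return True

cli = MiniCLI()
handled = asyncio.run(cli.process_command("/clear now"))
passthrough = asyncio.run(cli.process_command("hello"))
```

Keeping the table as a dict (rather than an `if/elif` chain) is what lets `auto_register_agents` and friends be added without touching the dispatch loop.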


@@ -1,435 +0,0 @@
"""
Cognee Integration Module for FuzzForge
Provides standardized access to project-specific knowledge graphs
Can be reused by external agents and other components
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import os
import asyncio
import json
from typing import Dict, List, Any, Optional, Union
from pathlib import Path
class CogneeProjectIntegration:
"""
Standardized Cognee integration that can be reused across agents
Automatically detects project context and provides knowledge graph access
"""
def __init__(self, project_dir: Optional[str] = None):
"""
Initialize with project directory (defaults to current working directory)
Args:
project_dir: Path to project directory (optional, defaults to cwd)
"""
self.project_dir = Path(project_dir) if project_dir else Path.cwd()
self.config_file = self.project_dir / ".fuzzforge" / "config.yaml"
self.project_context = None
self._cognee = None
self._initialized = False
async def initialize(self) -> bool:
"""
Initialize Cognee with project context
Returns:
bool: True if initialization successful
"""
try:
# Import Cognee
import cognee
self._cognee = cognee
# Load project context
if not self._load_project_context():
return False
# Configure Cognee for this project
await self._setup_cognee_config()
self._initialized = True
return True
except ImportError:
print("Cognee not installed. Install with: pip install cognee")
return False
except Exception as e:
print(f"Failed to initialize Cognee: {e}")
return False
def _load_project_context(self) -> bool:
"""Load project context from FuzzForge config"""
try:
if not self.config_file.exists():
print(f"No FuzzForge config found at {self.config_file}")
return False
import yaml
with open(self.config_file, 'r') as f:
config = yaml.safe_load(f)
self.project_context = {
"project_name": config.get("project", {}).get("name", "default"),
"project_id": config.get("project", {}).get("id", "default"),
"tenant_id": config.get("cognee", {}).get("tenant", "default")
}
return True
except Exception as e:
print(f"Error loading project context: {e}")
return False
async def _setup_cognee_config(self):
"""Configure Cognee for project-specific access"""
# Set API key and model
api_key = os.getenv('OPENAI_API_KEY')
model = os.getenv('LITELLM_MODEL', 'gpt-4o-mini')
if not api_key:
raise ValueError("OPENAI_API_KEY required for Cognee operations")
# Configure Cognee
self._cognee.config.set_llm_api_key(api_key)
self._cognee.config.set_llm_model(model)
self._cognee.config.set_llm_provider("openai")
# Set project-specific directories
project_cognee_dir = self.project_dir / ".fuzzforge" / "cognee" / f"project_{self.project_context['project_id']}"
self._cognee.config.data_root_directory(str(project_cognee_dir / "data"))
self._cognee.config.system_root_directory(str(project_cognee_dir / "system"))
# Ensure directories exist
project_cognee_dir.mkdir(parents=True, exist_ok=True)
(project_cognee_dir / "data").mkdir(exist_ok=True)
(project_cognee_dir / "system").mkdir(exist_ok=True)
async def search_knowledge_graph(self, query: str, search_type: str = "GRAPH_COMPLETION", dataset: str = None) -> Dict[str, Any]:
"""
Search the project's knowledge graph
Args:
query: Search query
search_type: Type of search ("GRAPH_COMPLETION", "INSIGHTS", "CHUNKS", etc.)
dataset: Specific dataset to search (optional)
Returns:
Dict containing search results
"""
if not self._initialized:
await self.initialize()
if not self._initialized:
return {"error": "Cognee not initialized"}
try:
from cognee.modules.search.types import SearchType
# Resolve search type dynamically; fallback to GRAPH_COMPLETION
try:
search_type_enum = getattr(SearchType, search_type.upper())
except AttributeError:
search_type_enum = SearchType.GRAPH_COMPLETION
search_type = "GRAPH_COMPLETION"
# Prepare search kwargs
search_kwargs = {
"query_type": search_type_enum,
"query_text": query
}
# Add dataset filter if specified
if dataset:
search_kwargs["datasets"] = [dataset]
results = await self._cognee.search(**search_kwargs)
return {
"query": query,
"search_type": search_type,
"dataset": dataset,
"results": results,
"project": self.project_context["project_name"]
}
except Exception as e:
return {"error": f"Search failed: {e}"}
async def list_knowledge_data(self) -> Dict[str, Any]:
"""
List available data in the knowledge graph
Returns:
Dict containing available data
"""
if not self._initialized:
await self.initialize()
if not self._initialized:
return {"error": "Cognee not initialized"}
try:
data = await self._cognee.list_data()
return {
"project": self.project_context["project_name"],
"available_data": data
}
except Exception as e:
return {"error": f"Failed to list data: {e}"}
async def ingest_text_to_dataset(self, text: str, dataset: str = None) -> Dict[str, Any]:
"""
Ingest text content into a specific dataset
Args:
text: Text to ingest
dataset: Dataset name (defaults to project_name_codebase)
Returns:
Dict containing ingest results
"""
if not self._initialized:
await self.initialize()
if not self._initialized:
return {"error": "Cognee not initialized"}
if not dataset:
dataset = f"{self.project_context['project_name']}_codebase"
try:
# Add text to dataset
await self._cognee.add([text], dataset_name=dataset)
# Process (cognify) the dataset
await self._cognee.cognify([dataset])
return {
"text_length": len(text),
"dataset": dataset,
"project": self.project_context["project_name"],
"status": "success"
}
except Exception as e:
return {"error": f"Ingest failed: {e}"}
async def ingest_files_to_dataset(self, file_paths: list, dataset: str = None) -> Dict[str, Any]:
"""
Ingest multiple files into a specific dataset
Args:
file_paths: List of file paths to ingest
dataset: Dataset name (defaults to project_name_codebase)
Returns:
Dict containing ingest results
"""
if not self._initialized:
await self.initialize()
if not self._initialized:
return {"error": "Cognee not initialized"}
if not dataset:
dataset = f"{self.project_context['project_name']}_codebase"
try:
# Validate and filter readable files
valid_files = []
for file_path in file_paths:
try:
path = Path(file_path)
if path.exists() and path.is_file():
# Test if file is readable
with open(path, 'r', encoding='utf-8') as f:
f.read(1)
valid_files.append(str(path))
except (UnicodeDecodeError, PermissionError, OSError):
continue
if not valid_files:
return {"error": "No valid files found to ingest"}
# Add files to dataset
await self._cognee.add(valid_files, dataset_name=dataset)
# Process (cognify) the dataset
await self._cognee.cognify([dataset])
return {
"files_processed": len(valid_files),
"total_files_requested": len(file_paths),
"dataset": dataset,
"project": self.project_context["project_name"],
"status": "success"
}
except Exception as e:
return {"error": f"Ingest failed: {e}"}
async def list_datasets(self) -> Dict[str, Any]:
"""
List all datasets available in the project
Returns:
Dict containing available datasets
"""
if not self._initialized:
await self.initialize()
if not self._initialized:
return {"error": "Cognee not initialized"}
try:
# Get available datasets by searching for data
data = await self._cognee.list_data()
# Extract unique dataset names from the data
datasets = set()
if isinstance(data, list):
for item in data:
if isinstance(item, dict) and 'dataset_name' in item:
datasets.add(item['dataset_name'])
return {
"project": self.project_context["project_name"],
"datasets": list(datasets),
"total_datasets": len(datasets)
}
except Exception as e:
return {"error": f"Failed to list datasets: {e}"}
async def create_dataset(self, dataset: str) -> Dict[str, Any]:
"""
Create a new dataset (dataset is created automatically when data is added)
Args:
dataset: Dataset name to create
Returns:
Dict containing creation result
"""
if not self._initialized:
await self.initialize()
if not self._initialized:
return {"error": "Cognee not initialized"}
try:
# In Cognee, datasets are created implicitly when data is added
# We'll add empty content to create the dataset
await self._cognee.add([f"Dataset {dataset} initialized for project {self.project_context['project_name']}"],
dataset_name=dataset)
return {
"dataset": dataset,
"project": self.project_context["project_name"],
"status": "created"
}
except Exception as e:
return {"error": f"Failed to create dataset: {e}"}
def get_project_context(self) -> Optional[Dict[str, str]]:
"""Get current project context"""
return self.project_context
def is_initialized(self) -> bool:
"""Check if Cognee is initialized"""
return self._initialized
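Every public method of `CogneeProjectIntegration` opens with the same lazy-initialisation guard: attempt `initialize()` on first use, then bail out with an error dict if it still failed. A minimal sketch of that guard in isolation (class and method names here are illustrative):

```python
import asyncio

class LazyService:
    """Demonstrates the initialize-on-first-use guard used above."""

    def __init__(self):
        self._initialized = False
        self.init_calls = 0

    async def initialize(self) -> bool:
        self.init_calls += 1
        self._initialized = True
        return True

    async def op(self) -> dict:
        # Guard: try to initialise once, then fail soft with an error dict
        if not self._initialized:
            await self.initialize()
        if not self._initialized:
            return {"error": "not initialized"}
        return {"status": "ok"}

svc = LazyService()
result = asyncio.run(svc.op())
```

The double check looks redundant but is what turns an initialisation failure into a returned `{"error": ...}` instead of an exception, which is the contract the CLI's `/memory` commands rely on.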
# Convenience functions for easy integration
async def search_project_codebase(query: str, project_dir: Optional[str] = None, dataset: str = None, search_type: str = "GRAPH_COMPLETION") -> str:
"""
Convenience function to search project codebase
Args:
query: Search query
project_dir: Project directory (optional, defaults to cwd)
dataset: Specific dataset to search (optional)
search_type: Type of search ("GRAPH_COMPLETION", "INSIGHTS", "CHUNKS")
Returns:
Formatted search results as string
"""
cognee_integration = CogneeProjectIntegration(project_dir)
result = await cognee_integration.search_knowledge_graph(query, search_type, dataset)
if "error" in result:
return f"Error searching codebase: {result['error']}"
project_name = result.get("project", "Unknown")
results = result.get("results", [])
if not results:
return f"No results found for '{query}' in project {project_name}"
output = f"Search results for '{query}' in project {project_name}:\n\n"
# Format results
if isinstance(results, list):
for i, item in enumerate(results, 1):
if isinstance(item, dict):
# Handle structured results
output += f"{i}. "
if "search_result" in item:
output += f"Dataset: {item.get('dataset_name', 'Unknown')}\n"
for result_item in item["search_result"]:
if isinstance(result_item, dict):
if "name" in result_item:
output += f" - {result_item['name']}: {result_item.get('description', '')}\n"
elif "text" in result_item:
text = result_item["text"][:200] + "..." if len(result_item["text"]) > 200 else result_item["text"]
output += f" - {text}\n"
else:
output += f" - {str(result_item)[:200]}...\n"
else:
output += f"{str(item)[:200]}...\n"
output += "\n"
else:
output += f"{i}. {str(item)[:200]}...\n\n"
else:
output += f"{str(results)[:500]}..."
return output
async def list_project_knowledge(project_dir: Optional[str] = None) -> str:
"""
Convenience function to list project knowledge
Args:
project_dir: Project directory (optional, defaults to cwd)
Returns:
Formatted list of available data
"""
cognee_integration = CogneeProjectIntegration(project_dir)
result = await cognee_integration.list_knowledge_data()
if "error" in result:
return f"Error listing knowledge: {result['error']}"
project_name = result.get("project", "Unknown")
data = result.get("available_data", [])
output = f"Available knowledge in project {project_name}:\n\n"
if not data:
output += "No data available in knowledge graph"
else:
for i, item in enumerate(data, 1):
output += f"{i}. {item}\n"
return output
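`search_project_codebase()` above repeatedly applies the same clip-to-200-characters convention when rendering results. A small self-contained sketch of that formatting logic, factored into a helper (`clip` and `format_results` are hypothetical names, not part of the module):

```python
def clip(text: str, limit: int = 200) -> str:
    """Truncate long strings with an ellipsis, as the formatter above does."""
    return text[:limit] + "..." if len(text) > limit else text

def format_results(query: str, project: str, results: list) -> str:
    """Render a flat result list in the same shape as search_project_codebase."""
    if not results:
        return f"No results found for '{query}' in project {project}"
    out = f"Search results for '{query}' in project {project}:\n\n"
    for i, item in enumerate(results, 1):
        out += f"{i}. {clip(str(item))}\n\n"
    return out
```

Centralising the truncation in one helper avoids the three slightly different `[:200] + "..."` expressions the original spreads across its branches.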


@@ -1,416 +0,0 @@
"""
Cognee Service for FuzzForge
Provides integrated Cognee functionality for codebase analysis and knowledge graphs
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import os
import asyncio
import logging
from pathlib import Path
from typing import Dict, List, Any, Optional
from datetime import datetime
logger = logging.getLogger(__name__)
class CogneeService:
"""
Service for managing Cognee integration with FuzzForge
Handles multi-tenant isolation and project-specific knowledge graphs
"""
def __init__(self, config):
"""Initialize with FuzzForge config"""
self.config = config
self.cognee_config = config.get_cognee_config()
self.project_context = config.get_project_context()
self._cognee = None
self._user = None
self._initialized = False
async def initialize(self):
"""Initialize Cognee with project-specific configuration"""
try:
# Ensure environment variables for Cognee are set before import
self.config.setup_cognee_environment()
logger.debug(
"Cognee environment configured",
extra={
"data": self.cognee_config.get("data_directory"),
"system": self.cognee_config.get("system_directory"),
},
)
import cognee
self._cognee = cognee
# Configure LLM with API key BEFORE any other cognee operations
provider = os.getenv("LLM_PROVIDER", "openai")
model = os.getenv("LLM_MODEL") or os.getenv("LITELLM_MODEL", "gpt-4o-mini")
api_key = os.getenv("LLM_API_KEY") or os.getenv("OPENAI_API_KEY")
endpoint = os.getenv("LLM_ENDPOINT")
api_version = os.getenv("LLM_API_VERSION")
max_tokens = os.getenv("LLM_MAX_TOKENS")
if provider.lower() in {"openai", "azure_openai", "custom"} and not api_key:
raise ValueError(
"OpenAI-compatible API key is required for Cognee LLM operations. "
"Set OPENAI_API_KEY, LLM_API_KEY, or COGNEE_LLM_API_KEY in your .env"
)
# Expose environment variables for downstream libraries
os.environ["LLM_PROVIDER"] = provider
os.environ["LITELLM_MODEL"] = model
os.environ["LLM_MODEL"] = model
if api_key:
os.environ["LLM_API_KEY"] = api_key
# Maintain compatibility with components still expecting OPENAI_API_KEY
if provider.lower() in {"openai", "azure_openai", "custom"}:
os.environ.setdefault("OPENAI_API_KEY", api_key)
if endpoint:
os.environ["LLM_ENDPOINT"] = endpoint
if api_version:
os.environ["LLM_API_VERSION"] = api_version
if max_tokens:
os.environ["LLM_MAX_TOKENS"] = str(max_tokens)
# Configure Cognee's runtime using its configuration helpers when available
if hasattr(cognee.config, "set_llm_provider"):
cognee.config.set_llm_provider(provider)
if hasattr(cognee.config, "set_llm_model"):
cognee.config.set_llm_model(model)
if api_key and hasattr(cognee.config, "set_llm_api_key"):
cognee.config.set_llm_api_key(api_key)
if endpoint and hasattr(cognee.config, "set_llm_endpoint"):
cognee.config.set_llm_endpoint(endpoint)
if api_version and hasattr(cognee.config, "set_llm_api_version"):
cognee.config.set_llm_api_version(api_version)
if max_tokens and hasattr(cognee.config, "set_llm_max_tokens"):
cognee.config.set_llm_max_tokens(int(max_tokens))
# Configure graph database
cognee.config.set_graph_db_config({
"graph_database_provider": self.cognee_config.get("graph_database_provider", "kuzu"),
})
# Set data directories
data_dir = self.cognee_config.get("data_directory")
system_dir = self.cognee_config.get("system_directory")
if data_dir:
logger.debug("Setting cognee data root", extra={"path": data_dir})
cognee.config.data_root_directory(data_dir)
if system_dir:
logger.debug("Setting cognee system root", extra={"path": system_dir})
cognee.config.system_root_directory(system_dir)
# Setup multi-tenant user context
await self._setup_user_context()
self._initialized = True
logger.info(f"Cognee initialized for project {self.project_context['project_name']} "
f"with Kuzu at {system_dir}")
except ImportError:
logger.error("Cognee not installed. Install with: pip install cognee")
raise
except Exception as e:
logger.error(f"Failed to initialize Cognee: {e}")
raise
async def create_dataset(self):
"""Create dataset for this project if it doesn't exist"""
if not self._initialized:
await self.initialize()
try:
# Dataset creation is handled automatically by Cognee when adding files
# We just ensure we have the right context set up
dataset_name = f"{self.project_context['project_name']}_codebase"
logger.info(f"Dataset {dataset_name} ready for project {self.project_context['project_name']}")
return dataset_name
except Exception as e:
logger.error(f"Failed to create dataset: {e}")
raise
async def _setup_user_context(self):
"""Setup user context for multi-tenant isolation"""
try:
from cognee.modules.users.methods import create_user, get_user
# Always try fallback email first to avoid validation issues
fallback_email = f"project_{self.project_context['project_id']}@fuzzforge.example"
user_tenant = self.project_context['tenant_id']
# Try to get existing fallback user first
try:
self._user = await get_user(fallback_email)
logger.info(f"Using existing user: {fallback_email}")
return
except Exception:
# User doesn't exist, try to create fallback
pass
# Create fallback user
try:
self._user = await create_user(fallback_email, user_tenant)
logger.info(f"Created fallback user: {fallback_email} for tenant: {user_tenant}")
return
except Exception as fallback_error:
logger.warning(f"Fallback user creation failed: {fallback_error}")
self._user = None
return
except Exception as e:
logger.warning(f"Could not setup multi-tenant user context: {e}")
logger.info("Proceeding with default context")
self._user = None
def get_project_dataset_name(self, dataset_suffix: str = "codebase") -> str:
"""Get project-specific dataset name"""
return f"{self.project_context['project_name']}_{dataset_suffix}"
async def ingest_text(self, content: str, dataset: str = "fuzzforge") -> bool:
"""Ingest text content into knowledge graph"""
if not self._initialized:
await self.initialize()
try:
await self._cognee.add([content], dataset_name=dataset)
await self._cognee.cognify([dataset])
return True
except Exception as e:
logger.error(f"Failed to ingest text: {e}")
return False
async def ingest_files(self, file_paths: List[Path], dataset: str = "fuzzforge") -> Dict[str, Any]:
"""Ingest multiple files into knowledge graph"""
if not self._initialized:
await self.initialize()
results = {
"success": 0,
"failed": 0,
"errors": []
}
try:
ingest_paths: List[str] = []
for file_path in file_paths:
try:
with open(file_path, 'r', encoding='utf-8') as f:
f.read(1)  # probe that the file actually decodes as UTF-8, matching the integration module
ingest_paths.append(str(file_path))
results["success"] += 1
except (UnicodeDecodeError, PermissionError, OSError) as exc:
results["failed"] += 1
results["errors"].append(f"{file_path}: {exc}")
logger.warning("Skipping %s: %s", file_path, exc)
if ingest_paths:
await self._cognee.add(ingest_paths, dataset_name=dataset)
await self._cognee.cognify([dataset])
except Exception as e:
logger.error(f"Failed to ingest files: {e}")
results["errors"].append(f"Cognify error: {str(e)}")
return results
async def search_insights(self, query: str, dataset: str = None) -> List[str]:
"""Search for insights in the knowledge graph"""
if not self._initialized:
await self.initialize()
try:
from cognee.modules.search.types import SearchType
kwargs = {
"query_type": SearchType.INSIGHTS,
"query_text": query
}
if dataset:
kwargs["datasets"] = [dataset]
results = await self._cognee.search(**kwargs)
return results if isinstance(results, list) else []
except Exception as e:
logger.error(f"Failed to search insights: {e}")
return []
async def search_chunks(self, query: str, dataset: str = None) -> List[str]:
"""Search for relevant text chunks"""
if not self._initialized:
await self.initialize()
try:
from cognee.modules.search.types import SearchType
kwargs = {
"query_type": SearchType.CHUNKS,
"query_text": query
}
if dataset:
kwargs["datasets"] = [dataset]
results = await self._cognee.search(**kwargs)
return results if isinstance(results, list) else []
except Exception as e:
logger.error(f"Failed to search chunks: {e}")
return []
async def search_graph_completion(self, query: str) -> List[str]:
"""Search for graph completion (relationships)"""
if not self._initialized:
await self.initialize()
try:
from cognee.modules.search.types import SearchType
results = await self._cognee.search(
query_type=SearchType.GRAPH_COMPLETION,
query_text=query
)
return results if isinstance(results, list) else []
except Exception as e:
logger.error(f"Failed to search graph completion: {e}")
return []
async def get_status(self) -> Dict[str, Any]:
"""Get service status and statistics"""
status = {
"initialized": self._initialized,
"enabled": self.cognee_config.get("enabled", True),
"provider": self.cognee_config.get("graph_database_provider", "kuzu"),
"data_directory": self.cognee_config.get("data_directory"),
"system_directory": self.cognee_config.get("system_directory"),
}
if self._initialized:
try:
# Check if directories exist and get sizes
data_dir = Path(status["data_directory"])
system_dir = Path(status["system_directory"])
status.update({
"data_dir_exists": data_dir.exists(),
"system_dir_exists": system_dir.exists(),
"kuzu_db_exists": (system_dir / "kuzu_db").exists(),
"lancedb_exists": (system_dir / "lancedb").exists(),
})
except Exception as e:
status["status_error"] = str(e)
return status
async def clear_data(self, confirm: bool = False):
"""Clear all ingested data (dangerous!)"""
if not confirm:
raise ValueError("Must confirm data clearing with confirm=True")
if not self._initialized:
await self.initialize()
try:
await self._cognee.prune.prune_data()
await self._cognee.prune.prune_system(metadata=True)
logger.info("Cognee data cleared")
except Exception as e:
logger.error(f"Failed to clear data: {e}")
raise
class FuzzForgeCogneeIntegration:
"""
Main integration class for FuzzForge + Cognee
Provides high-level operations for security analysis
"""
def __init__(self, config):
self.service = CogneeService(config)
async def analyze_codebase(self, path: Path, recursive: bool = True) -> Dict[str, Any]:
"""
Analyze a codebase and extract security-relevant insights
"""
# Collect code files
from fuzzforge_ai.ingest_utils import collect_ingest_files
files = collect_ingest_files(path, recursive, None, [])
if not files:
return {"error": "No files found to analyze"}
# Ingest files
results = await self.service.ingest_files(files, "security_analysis")
if results["success"] == 0:
return {"error": "Failed to ingest any files", "details": results}
# Extract security insights
security_queries = [
"vulnerabilities security risks",
"authentication authorization",
"input validation sanitization",
"encryption cryptography",
"error handling exceptions",
"logging sensitive data"
]
insights = {}
for query in security_queries:
insight_results = await self.service.search_insights(query, "security_analysis")
if insight_results:
insights[query.replace(" ", "_")] = insight_results
return {
"files_processed": results["success"],
"files_failed": results["failed"],
"errors": results["errors"],
"security_insights": insights
}
async def query_codebase(self, query: str, search_type: str = "insights") -> List[str]:
"""Query the ingested codebase"""
if search_type == "insights":
return await self.service.search_insights(query)
elif search_type == "chunks":
return await self.service.search_chunks(query)
elif search_type == "graph":
return await self.service.search_graph_completion(query)
else:
raise ValueError(f"Unknown search type: {search_type}")
async def get_project_summary(self) -> Dict[str, Any]:
"""Get a summary of the analyzed project"""
# Search for general project insights
summary_queries = [
"project structure components",
"main functionality features",
"programming languages frameworks",
"dependencies libraries"
]
summary = {}
for query in summary_queries:
results = await self.service.search_insights(query)
if results:
summary[query.replace(" ", "_")] = results[:3] # Top 3 results
return summary


@@ -1,9 +0,0 @@
# FuzzForge Registered Agents
# These agents will be automatically registered on startup
registered_agents:
# Example entries:
# - name: Calculator
# url: http://localhost:10201
# description: Mathematical calculations agent


@@ -1,31 +0,0 @@
"""Bridge module providing access to the host CLI configuration manager."""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
try:
from fuzzforge_cli.config import ProjectConfigManager as _ProjectConfigManager
except ImportError as exc: # pragma: no cover - used when CLI not available
class _ProjectConfigManager: # type: ignore[no-redef]
"""Fallback implementation that raises a helpful error."""
def __init__(self, *args, **kwargs):
raise ImportError(
"ProjectConfigManager is unavailable. Install the FuzzForge CLI "
"package or supply a compatible configuration object."
) from exc
def __getattr__(name): # pragma: no cover - defensive
raise ImportError("ProjectConfigManager unavailable") from exc
ProjectConfigManager = _ProjectConfigManager
__all__ = ["ProjectConfigManager"]


@@ -1,134 +0,0 @@
"""
Configuration manager for FuzzForge
Handles loading and saving registered agents
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import os
import yaml
from typing import Any, Dict, List, Optional
class ConfigManager:
"""Manages FuzzForge agent registry configuration"""
def __init__(self, config_path: Optional[str] = None):
"""Initialize config manager"""
if config_path:
self.config_path = config_path
else:
# Check for local .fuzzforge/agents.yaml first, then fall back to global
local_config = os.path.join(os.getcwd(), '.fuzzforge', 'agents.yaml')
global_config = os.path.join(os.path.dirname(__file__), 'config.yaml')
if os.path.exists(local_config):
self.config_path = local_config
if os.getenv("FUZZFORGE_DEBUG", "0") == "1":
print(f"[CONFIG] Using local config: {local_config}")
else:
self.config_path = global_config
if os.getenv("FUZZFORGE_DEBUG", "0") == "1":
print(f"[CONFIG] Using global config: {global_config}")
self.config = self.load_config()
def load_config(self) -> Dict[str, Any]:
"""Load configuration from YAML file"""
if not os.path.exists(self.config_path):
# Create default config if it doesn't exist
return {'registered_agents': []}
try:
with open(self.config_path, 'r') as f:
config = yaml.safe_load(f) or {}
# Ensure registered_agents is a list
if 'registered_agents' not in config or config['registered_agents'] is None:
config['registered_agents'] = []
return config
except Exception as e:
print(f"[WARNING] Failed to load config: {e}")
return {'registered_agents': []}
def save_config(self):
"""Save current configuration to file"""
try:
# Create a clean config with comments
config_content = """# FuzzForge Registered Agents
# These agents will be automatically registered on startup
"""
# Add the agents list
if self.config.get('registered_agents'):
config_content += yaml.dump({'registered_agents': self.config['registered_agents']},
default_flow_style=False, sort_keys=False)
else:
config_content += "registered_agents: []\n"
config_content += """
# Example entries:
# - name: Calculator
# url: http://localhost:10201
# description: Mathematical calculations agent
"""
with open(self.config_path, 'w') as f:
f.write(config_content)
return True
except Exception as e:
print(f"[ERROR] Failed to save config: {e}")
return False
def get_registered_agents(self) -> List[Dict[str, Any]]:
"""Get list of registered agents from config"""
return self.config.get('registered_agents', [])
def add_registered_agent(self, name: str, url: str, description: str = "") -> bool:
"""Add a new registered agent to config"""
if 'registered_agents' not in self.config:
self.config['registered_agents'] = []
# Check if agent already exists
for agent in self.config['registered_agents']:
if agent.get('url') == url:
# Update existing agent
agent['name'] = name
agent['description'] = description
return self.save_config()
# Add new agent
self.config['registered_agents'].append({
'name': name,
'url': url,
'description': description
})
return self.save_config()
def remove_registered_agent(self, name: str = None, url: str = None) -> bool:
"""Remove a registered agent from config"""
if 'registered_agents' not in self.config:
return False
original_count = len(self.config['registered_agents'])
# Filter out the agent
self.config['registered_agents'] = [
agent for agent in self.config['registered_agents']
if not ((name and agent.get('name') == name) or
(url and agent.get('url') == url))
]
if len(self.config['registered_agents']) < original_count:
return self.save_config()
return False
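The upsert-by-URL behaviour of `add_registered_agent` (update an existing entry when the URL matches, otherwise append) can be exercised independently of the YAML layer. This is a hypothetical helper mirroring that logic, not part of the module:

```python
from typing import Any, Dict, List

def upsert_agent(agents: List[Dict[str, Any]], name: str, url: str,
                 description: str = "") -> List[Dict[str, Any]]:
    """Update the agent with a matching URL in place, or append a new entry."""
    for agent in agents:
        if agent.get("url") == url:
            agent["name"] = name
            agent["description"] = description
            return agents
    agents.append({"name": name, "url": url, "description": description})
    return agents

agents: List[Dict[str, Any]] = []
upsert_agent(agents, "Calculator", "http://localhost:10201")
# Same URL again: updates in place instead of duplicating.
upsert_agent(agents, "Calc v2", "http://localhost:10201", "renamed")
```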


@@ -1,104 +0,0 @@
"""Utilities for collecting files to ingest into Cognee."""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
from __future__ import annotations
import fnmatch
from pathlib import Path
from typing import Iterable, List, Optional
_DEFAULT_FILE_TYPES = [
".py",
".js",
".ts",
".java",
".cpp",
".c",
".h",
".rs",
".go",
".rb",
".php",
".cs",
".swift",
".kt",
".scala",
".clj",
".hs",
".md",
".txt",
".yaml",
".yml",
".json",
".toml",
".cfg",
".ini",
]
_DEFAULT_EXCLUDE = [
"*.pyc",
"__pycache__",
".git",
".svn",
".hg",
"node_modules",
".venv",
"venv",
".env",
"dist",
"build",
".pytest_cache",
".mypy_cache",
".tox",
"coverage",
"*.log",
"*.tmp",
]
def collect_ingest_files(
path: Path,
recursive: bool = True,
file_types: Optional[Iterable[str]] = None,
exclude: Optional[Iterable[str]] = None,
) -> List[Path]:
"""Return a list of files eligible for ingestion."""
path = path.resolve()
files: List[Path] = []
extensions = list(file_types) if file_types else list(_DEFAULT_FILE_TYPES)
exclusions = list(exclude) if exclude else []
exclusions.extend(_DEFAULT_EXCLUDE)
def should_exclude(file_path: Path) -> bool:
file_str = str(file_path)
for pattern in exclusions:
if fnmatch.fnmatch(file_str, f"*{pattern}*") or fnmatch.fnmatch(file_path.name, pattern):
return True
return False
if path.is_file():
if not should_exclude(path) and any(str(path).endswith(ext) for ext in extensions):
files.append(path)
return files
pattern = "**/*" if recursive else "*"
for file_path in path.glob(pattern):
if file_path.is_file() and not should_exclude(file_path):
if any(str(file_path).endswith(ext) for ext in extensions):
files.append(file_path)
return files
__all__ = ["collect_ingest_files"]
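The matching rule in `should_exclude` (a pattern hits if it matches anywhere in the full path, or matches the basename exactly) can be reproduced in isolation. A minimal sketch mirroring the logic above:

```python
import fnmatch
from pathlib import Path

EXCLUDE = ["*.pyc", "__pycache__", "node_modules", "*.log"]

def should_exclude(file_path: Path, exclusions=EXCLUDE) -> bool:
    """A pattern matches either anywhere in the full path or against the basename."""
    file_str = str(file_path)
    return any(
        fnmatch.fnmatch(file_str, f"*{pattern}*")
        or fnmatch.fnmatch(file_path.name, pattern)
        for pattern in exclusions
    )

print(should_exclude(Path("src/app/__pycache__/mod.cpython-311.pyc")))  # True
print(should_exclude(Path("src/app/main.py")))  # False
```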


@@ -1,247 +0,0 @@
"""
FuzzForge Memory Service
Implements ADK MemoryService pattern for conversational memory
Separate from Cognee which will be used for RAG/codebase analysis
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import os
import json
from typing import Dict, List, Any, Optional
from datetime import datetime
import logging
# ADK Memory imports
from google.adk.memory import InMemoryMemoryService, BaseMemoryService
from google.adk.memory.base_memory_service import SearchMemoryResponse
from google.adk.memory.memory_entry import MemoryEntry
# Optional VertexAI Memory Bank
try:
from google.adk.memory import VertexAiMemoryBankService
VERTEX_AVAILABLE = True
except ImportError:
VERTEX_AVAILABLE = False
logger = logging.getLogger(__name__)
class FuzzForgeMemoryService:
"""
Manages conversational memory using ADK patterns
This is separate from Cognee which will handle RAG/codebase
"""
def __init__(self, memory_type: str = "inmemory", **kwargs):
"""
Initialize memory service
Args:
memory_type: "inmemory" or "vertexai"
**kwargs: Additional args for specific memory service
For vertexai: project, location, agent_engine_id
"""
self.memory_type = memory_type
self.service = self._create_service(memory_type, **kwargs)
def _create_service(self, memory_type: str, **kwargs) -> BaseMemoryService:
"""Create the appropriate memory service"""
if memory_type == "inmemory":
# Use ADK's InMemoryMemoryService for local development
logger.info("Using InMemory MemoryService for conversational memory")
return InMemoryMemoryService()
elif memory_type == "vertexai" and VERTEX_AVAILABLE:
# Use VertexAI Memory Bank for production
project = kwargs.get('project') or os.getenv('GOOGLE_CLOUD_PROJECT')
location = kwargs.get('location') or os.getenv('GOOGLE_CLOUD_LOCATION', 'us-central1')
agent_engine_id = kwargs.get('agent_engine_id') or os.getenv('AGENT_ENGINE_ID')
if not all([project, location, agent_engine_id]):
logger.warning("VertexAI config missing, falling back to InMemory")
return InMemoryMemoryService()
logger.info(f"Using VertexAI MemoryBank: {agent_engine_id}")
return VertexAiMemoryBankService(
project=project,
location=location,
agent_engine_id=agent_engine_id
)
else:
# Default to in-memory
logger.info("Defaulting to InMemory MemoryService")
return InMemoryMemoryService()
async def add_session_to_memory(self, session: Any) -> None:
"""
Add a completed session to long-term memory
This extracts meaningful information from the conversation
Args:
session: The session object to process
"""
try:
# Let the underlying service handle the ingestion
# It will extract relevant information based on the implementation
await self.service.add_session_to_memory(session)
logger.debug(f"Added session {session.id} to {self.memory_type} memory")
except Exception as e:
logger.error(f"Failed to add session to memory: {e}")
async def search_memory(self,
query: str,
app_name: str = "fuzzforge",
user_id: Optional[str] = None,
max_results: int = 10) -> SearchMemoryResponse:
"""
Search long-term memory for relevant information
Args:
query: The search query
app_name: Application name for filtering
user_id: User ID for filtering (optional)
max_results: Maximum number of results
Returns:
SearchMemoryResponse with relevant memories
"""
try:
# Search the memory service
results = await self.service.search_memory(
app_name=app_name,
user_id=user_id,
query=query
)
logger.debug(f"Memory search for '{query}' returned {len(results.memories)} results")
return results
except Exception as e:
logger.error(f"Memory search failed: {e}")
# Return empty results on error
return SearchMemoryResponse(memories=[])
async def ingest_completed_sessions(self, session_service) -> int:
"""
Batch ingest all completed sessions into memory
Useful for initial memory population
Args:
session_service: The session service containing sessions
Returns:
Number of sessions ingested
"""
ingested = 0
try:
# Get all sessions from the session service
sessions = await session_service.list_sessions(app_name="fuzzforge")
for session_info in sessions:
# Load full session
session = await session_service.load_session(
app_name="fuzzforge",
user_id=session_info.get('user_id'),
session_id=session_info.get('id')
)
if session and len(session.get_events()) > 0:
await self.add_session_to_memory(session)
ingested += 1
logger.info(f"Ingested {ingested} sessions into {self.memory_type} memory")
except Exception as e:
logger.error(f"Failed to batch ingest sessions: {e}")
return ingested
def get_status(self) -> Dict[str, Any]:
"""Get memory service status"""
return {
"type": self.memory_type,
"active": self.service is not None,
"vertex_available": VERTEX_AVAILABLE,
"details": {
"inmemory": "Non-persistent, keyword search",
"vertexai": "Persistent, semantic search with LLM extraction"
}.get(self.memory_type, "Unknown")
}
class HybridMemoryManager:
"""
Manages both ADK MemoryService (conversational) and Cognee (RAG/codebase)
Provides unified interface for both memory systems
"""
def __init__(self,
memory_service: Optional[FuzzForgeMemoryService] = None,
cognee_tools = None):
"""
Initialize with both memory systems
Args:
memory_service: ADK-pattern memory for conversations
cognee_tools: Cognee MCP tools for RAG/codebase
"""
# ADK memory for conversations
self.memory_service = memory_service or FuzzForgeMemoryService()
# Cognee for knowledge graphs and RAG (future)
self.cognee_tools = cognee_tools
async def search_conversational_memory(self, query: str) -> SearchMemoryResponse:
"""Search past conversations using ADK memory"""
return await self.memory_service.search_memory(query)
async def search_knowledge_graph(self, query: str, search_type: str = "GRAPH_COMPLETION"):
"""Search Cognee knowledge graph (for RAG/codebase in future)"""
if not self.cognee_tools:
return None
try:
# Use Cognee's graph search
return await self.cognee_tools.search(
query=query,
search_type=search_type
)
except Exception as e:
logger.debug(f"Cognee search failed: {e}")
return None
async def store_in_graph(self, content: str):
"""Store in Cognee knowledge graph (for codebase analysis later)"""
if not self.cognee_tools:
return None
try:
# Use cognify to create graph structures
return await self.cognee_tools.cognify(content)
except Exception as e:
logger.debug(f"Cognee store failed: {e}")
return None
def get_status(self) -> Dict[str, Any]:
"""Get status of both memory systems"""
return {
"conversational_memory": self.memory_service.get_status(),
"knowledge_graph": {
"active": self.cognee_tools is not None,
"purpose": "RAG/codebase analysis (future)"
}
}
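The fallback chain in `_create_service` (VertexAI only when the import succeeded *and* project, location, and engine ID are all configured; otherwise InMemory) can be sketched as a pure function. A hypothetical helper, not part of the service:

```python
from typing import Optional

def pick_backend(memory_type: str, project: Optional[str],
                 location: Optional[str], engine_id: Optional[str],
                 vertex_available: bool) -> str:
    """Mirror the fallback logic: VertexAI only when importable and fully configured."""
    if memory_type == "vertexai" and vertex_available:
        if all([project, location, engine_id]):
            return "vertexai"
        return "inmemory"  # incomplete config falls back silently
    return "inmemory"

print(pick_backend("vertexai", "my-proj", "us-central1", "engine-1", True))  # vertexai
print(pick_backend("vertexai", "my-proj", None, "engine-1", True))           # inmemory
```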


@@ -1,148 +0,0 @@
"""
Remote Agent Connection Handler
Handles A2A protocol communication with remote agents
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import httpx
import uuid
from typing import Dict, Any, Optional, List
class RemoteAgentConnection:
"""Handles A2A protocol communication with remote agents"""
def __init__(self, url: str):
"""Initialize connection to a remote agent"""
self.url = url.rstrip('/')
self.agent_card = None
self.client = httpx.AsyncClient(timeout=120.0)
self.context_id = None
async def get_agent_card(self) -> Optional[Dict[str, Any]]:
"""Get the agent card from the remote agent"""
try:
# Try new path first (A2A 0.3.0+)
response = await self.client.get(f"{self.url}/.well-known/agent-card.json")
response.raise_for_status()
self.agent_card = response.json()
return self.agent_card
except Exception:
# Try old path for compatibility
try:
response = await self.client.get(f"{self.url}/.well-known/agent.json")
response.raise_for_status()
self.agent_card = response.json()
return self.agent_card
except Exception as e:
print(f"Failed to get agent card from {self.url}: {e}")
return None
async def send_message(self, message: str | Dict[str, Any] | List[Dict[str, Any]]) -> str:
"""Send a message to the remote agent using A2A protocol"""
try:
parts: List[Dict[str, Any]]
metadata: Dict[str, Any] | None = None
if isinstance(message, dict):
metadata = message.get("metadata") if isinstance(message.get("metadata"), dict) else None
raw_parts = message.get("parts", [])
if not raw_parts:
text_value = message.get("text") or message.get("message")
if isinstance(text_value, str):
raw_parts = [{"type": "text", "text": text_value}]
parts = [raw_part for raw_part in raw_parts if isinstance(raw_part, dict)]
elif isinstance(message, list):
parts = [part for part in message if isinstance(part, dict)]
metadata = None
else:
parts = [{"type": "text", "text": message}]
metadata = None
if not parts:
parts = [{"type": "text", "text": ""}]
# Build JSON-RPC request per A2A spec
payload = {
"jsonrpc": "2.0",
"method": "message/send",
"params": {
"message": {
"messageId": str(uuid.uuid4()),
"role": "user",
"parts": parts,
}
},
"id": 1
}
if metadata:
payload["params"]["message"]["metadata"] = metadata
# Include context if we have one
if self.context_id:
payload["params"]["contextId"] = self.context_id
# Send to root endpoint per A2A protocol
response = await self.client.post(f"{self.url}/", json=payload)
response.raise_for_status()
result = response.json()
# Extract response based on A2A JSON-RPC format
if isinstance(result, dict):
# Update context for continuity
if "result" in result and isinstance(result["result"], dict):
if "contextId" in result["result"]:
self.context_id = result["result"]["contextId"]
# Extract text from artifacts
if "artifacts" in result["result"]:
texts = []
for artifact in result["result"]["artifacts"]:
if isinstance(artifact, dict) and "parts" in artifact:
for part in artifact["parts"]:
if isinstance(part, dict) and "text" in part:
texts.append(part["text"])
if texts:
return " ".join(texts)
# Extract from message format
if "message" in result["result"]:
msg = result["result"]["message"]
if isinstance(msg, dict) and "parts" in msg:
texts = []
for part in msg["parts"]:
if isinstance(part, dict) and "text" in part:
texts.append(part["text"])
return " ".join(texts) if texts else str(msg)
return str(msg)
return str(result["result"])
# Handle error response
elif "error" in result:
error = result["error"]
if isinstance(error, dict):
return f"Error: {error.get('message', str(error))}"
return f"Error: {error}"
# Fallback
return result.get("response", result.get("message", str(result)))
return str(result)
except Exception as e:
return f"Error communicating with agent: {e}"
async def close(self):
"""Close the connection properly"""
await self.client.aclose()
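The `message/send` request assembled in `send_message` has a fixed JSON-RPC 2.0 shape; building it standalone makes the structure easy to inspect without a live agent (the `messageId` value varies per call):

```python
import uuid
from typing import Any, Dict, List, Optional

def build_a2a_payload(parts: List[Dict[str, Any]],
                      context_id: Optional[str] = None) -> Dict[str, Any]:
    """Assemble a JSON-RPC 2.0 message/send request as send_message() does."""
    payload: Dict[str, Any] = {
        "jsonrpc": "2.0",
        "method": "message/send",
        "params": {
            "message": {
                "messageId": str(uuid.uuid4()),
                "role": "user",
                # Empty input still sends one empty text part.
                "parts": parts or [{"type": "text", "text": ""}],
            }
        },
        "id": 1,
    }
    if context_id:
        payload["params"]["contextId"] = context_id  # continuity across turns
    return payload

p = build_a2a_payload([{"type": "text", "text": "hello"}], context_id="ctx-42")
```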

BIN
assets/demopart1.gif Normal file (binary not shown; 360 KiB)

BIN
assets/demopart2.gif Normal file (binary not shown; 2.1 MiB)


@@ -1,41 +0,0 @@
FROM python:3.11-slim
WORKDIR /app
# Install system dependencies including Docker client and rsync
RUN apt-get update && apt-get install -y \
curl \
ca-certificates \
gnupg \
lsb-release \
rsync \
&& curl -fsSL https://download.docker.com/linux/debian/gpg | gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg \
&& echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/debian $(lsb_release -cs) stable" | tee /etc/apt/sources.list.d/docker.list > /dev/null \
&& apt-get update \
&& apt-get install -y docker-ce-cli \
&& rm -rf /var/lib/apt/lists/*
# Docker client configuration removed - localhost:5001 doesn't require insecure registry config
# Install uv for faster package management
RUN pip install uv
# Copy project files
COPY pyproject.toml ./
COPY uv.lock ./
# Install dependencies
RUN uv sync --no-dev
# Copy source code
COPY . .
# Expose port
EXPOSE 8000
# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
CMD curl -f http://localhost:8000/health || exit 1
# Start the application
CMD ["uv", "run", "uvicorn", "src.main:app", "--host", "0.0.0.0", "--port", "8000"]


@@ -1,257 +0,0 @@
# FuzzForge Backend
A stateless API server for security testing workflow orchestration using Prefect. This system dynamically discovers workflows, executes them in isolated Docker containers with volume mounting, and returns findings in SARIF format.
## Architecture Overview
### Core Components
1. **Workflow Discovery System**: Automatically discovers workflows at startup
2. **Module System**: Reusable components (scanner, analyzer, reporter) with a common interface
3. **Prefect Integration**: Handles container orchestration, workflow execution, and monitoring
4. **Volume Mounting**: Secure file access with configurable permissions (ro/rw)
5. **SARIF Output**: Standardized security findings format
### Key Features
- **Stateless**: No persistent data, fully scalable
- **Generic**: No hardcoded workflows, automatic discovery
- **Isolated**: Each workflow runs in its own Docker container
- **Extensible**: Easy to add new workflows and modules
- **Secure**: Read-only volume mounts by default, path validation
- **Observable**: Comprehensive logging and status tracking
## Quick Start
### Prerequisites
- Docker and Docker Compose
### Installation
From the project root, start all services:
```bash
docker-compose up -d
```
This will start:
- Prefect server (API at http://localhost:4200/api)
- PostgreSQL database
- Redis cache
- Docker registry (port 5001)
- Prefect worker (for running workflows)
- FuzzForge backend API (port 8000)
- FuzzForge MCP server (port 8010)
**Note**: The Prefect UI at http://localhost:4200 is currently not reachable from the host because the API is configured for inter-container communication. Use the REST API or MCP interface instead.
## API Endpoints
### Workflows
- `GET /workflows` - List all discovered workflows
- `GET /workflows/{name}/metadata` - Get workflow metadata and parameters
- `GET /workflows/{name}/parameters` - Get workflow parameter schema
- `GET /workflows/metadata/schema` - Get metadata.yaml schema
- `POST /workflows/{name}/submit` - Submit a workflow for execution
### Runs
- `GET /runs/{run_id}/status` - Get run status
- `GET /runs/{run_id}/findings` - Get SARIF findings from completed run
- `GET /runs/{workflow_name}/findings/{run_id}` - Alternative findings endpoint with workflow name
## Workflow Structure
Each workflow must have:
```
toolbox/workflows/{workflow_name}/
workflow.py # Prefect flow definition
metadata.yaml # Mandatory metadata (parameters, version, etc.)
Dockerfile # Optional custom container definition
requirements.txt # Optional Python dependencies
```
### Example metadata.yaml
```yaml
name: security_assessment
version: "1.0.0"
description: "Comprehensive security analysis workflow"
author: "FuzzForge Team"
category: "comprehensive"
tags:
- "security"
- "analysis"
- "comprehensive"
supported_volume_modes:
- "ro"
- "rw"
requirements:
tools:
- "file_scanner"
- "security_analyzer"
- "sarif_reporter"
resources:
memory: "512Mi"
cpu: "500m"
timeout: 1800
has_docker: true
parameters:
type: object
properties:
target_path:
type: string
default: "/workspace"
description: "Path to analyze"
volume_mode:
type: string
enum: ["ro", "rw"]
default: "ro"
description: "Volume mount mode"
scanner_config:
type: object
description: "Scanner configuration"
properties:
max_file_size:
type: integer
description: "Maximum file size to scan (bytes)"
output_schema:
type: object
properties:
sarif:
type: object
description: "SARIF-formatted security findings"
summary:
type: object
description: "Scan execution summary"
```
### Metadata Field Descriptions
- **name**: Workflow identifier (must match directory name)
- **version**: Semantic version (x.y.z format)
- **description**: Human-readable description of the workflow
- **author**: Workflow author/maintainer
- **category**: Workflow category (comprehensive, specialized, fuzzing, focused)
- **tags**: Array of descriptive tags for categorization
- **requirements.tools**: Required security tools that the workflow uses
- **requirements.resources**: Resource requirements enforced at runtime:
- `memory`: Memory limit (e.g., "512Mi", "1Gi")
- `cpu`: CPU limit (e.g., "500m" for 0.5 cores, "1" for 1 core)
- `timeout`: Maximum execution time in seconds
- **parameters**: JSON Schema object defining workflow parameters
- **output_schema**: Expected output format (typically SARIF)
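A quick sanity check for the rules above (name matches the directory, semantic x.y.z version, `parameters` is a JSON Schema object) might look like this. This is a sketch of a validator, not part of the backend:

```python
import re
from typing import Any, Dict, List

def validate_metadata(meta: Dict[str, Any], directory: str) -> List[str]:
    """Return a list of problems; an empty list means the metadata passes."""
    problems: List[str] = []
    if meta.get("name") != directory:
        problems.append("name must match the workflow directory name")
    if not re.fullmatch(r"\d+\.\d+\.\d+", str(meta.get("version", ""))):
        problems.append("version must be semantic x.y.z")
    params = meta.get("parameters")
    if not (isinstance(params, dict) and params.get("type") == "object"):
        problems.append("parameters must be a JSON Schema object")
    return problems

meta = {"name": "security_assessment", "version": "1.0.0",
        "parameters": {"type": "object", "properties": {}}}
print(validate_metadata(meta, "security_assessment"))  # []
```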
### Resource Requirements
Resource requirements defined in workflow metadata are automatically enforced. Users can override defaults when submitting workflows:
```bash
curl -X POST "http://localhost:8000/workflows/security_assessment/submit" \
-H "Content-Type: application/json" \
-d '{
"target_path": "/tmp/project",
"volume_mode": "ro",
"resource_limits": {
"memory_limit": "1Gi",
"cpu_limit": "1"
}
}'
```
Resource precedence: User limits > Workflow requirements > System defaults
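That precedence rule amounts to a layered dict merge. A hypothetical helper (key names follow the submit example above; the default values are illustrative):

```python
from typing import Dict, Optional

SYSTEM_DEFAULTS = {"memory_limit": "256Mi", "cpu_limit": "250m", "timeout": 600}

def effective_limits(workflow_reqs: Dict[str, str],
                     user_limits: Optional[Dict[str, str]] = None) -> Dict[str, object]:
    """User limits override workflow requirements, which override system defaults."""
    merged: Dict[str, object] = dict(SYSTEM_DEFAULTS)
    merged.update(workflow_reqs)
    merged.update(user_limits or {})
    return merged

limits = effective_limits({"memory_limit": "512Mi", "cpu_limit": "500m"},
                          {"memory_limit": "1Gi"})
print(limits["memory_limit"], limits["cpu_limit"], limits["timeout"])  # 1Gi 500m 600
```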
## Module Development
Modules implement the `BaseModule` interface:
```python
from src.toolbox.modules.base import BaseModule, ModuleMetadata, ModuleResult
class MyModule(BaseModule):
def get_metadata(self) -> ModuleMetadata:
return ModuleMetadata(
name="my_module",
version="1.0.0",
description="Module description",
category="scanner",
...
)
async def execute(self, config: Dict, workspace: Path) -> ModuleResult:
# Module logic here
findings = [...]
return self.create_result(findings=findings)
def validate_config(self, config: Dict) -> bool:
# Validate configuration
return True
```
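The `execute()` call pattern can be tried out with minimal stand-ins for `BaseModule`/`ModuleResult` (the real interfaces live in `src.toolbox.modules.base`; these stubs only illustrate the async contract):

```python
import asyncio
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Dict, List

@dataclass
class ModuleResult:
    """Stand-in for the real result type: just a findings list."""
    findings: List[Dict[str, Any]] = field(default_factory=list)

class EchoModule:
    async def execute(self, config: Dict[str, Any], workspace: Path) -> ModuleResult:
        # A real module would scan `workspace`; this stub emits a single finding.
        return ModuleResult(findings=[{"rule": "demo", "path": str(workspace)}])

result = asyncio.run(EchoModule().execute({}, Path("/tmp/project")))
print(len(result.findings))  # 1
```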
## Submitting a Workflow
```bash
curl -X POST "http://localhost:8000/workflows/security_assessment/submit" \
-H "Content-Type: application/json" \
-d '{
"target_path": "/home/user/project",
"volume_mode": "ro",
"parameters": {
"scanner_config": {"patterns": ["*.py"]},
"analyzer_config": {"check_secrets": true}
}
}'
```
## Getting Findings
```bash
curl "http://localhost:8000/runs/{run_id}/findings"
```
Returns SARIF-formatted findings:
```json
{
"workflow": "security_assessment",
"run_id": "abc-123",
"sarif": {
"version": "2.1.0",
"runs": [{
"tool": {...},
"results": [...]
}]
}
}
```
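For simple aggregates, consumers can walk that envelope without a SARIF library. A sketch against the shape shown above:

```python
from typing import Any, Dict

def count_results(findings: Dict[str, Any]) -> int:
    """Count results across all runs in the SARIF envelope."""
    sarif = findings.get("sarif", {})
    return sum(len(run.get("results", [])) for run in sarif.get("runs", []))

sample = {
    "workflow": "security_assessment",
    "run_id": "abc-123",
    "sarif": {"version": "2.1.0",
              "runs": [{"tool": {}, "results": [{"ruleId": "demo"}]}]},
}
print(count_results(sample))  # 1
```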
## Security Considerations
1. **Volume Mounting**: Only allowed directories can be mounted
2. **Read-Only Default**: Volumes mounted as read-only unless explicitly set
3. **Container Isolation**: Each workflow runs in an isolated container
4. **Resource Limits**: CPU and memory limits can be set via Prefect
5. **Network Isolation**: Containers use bridge networking
## Development
### Adding a New Workflow
1. Create directory: `toolbox/workflows/my_workflow/`
2. Add `workflow.py` with a Prefect flow
3. Add mandatory `metadata.yaml`
4. Restart backend: `docker-compose restart fuzzforge-backend`
### Adding a New Module
1. Create module in `toolbox/modules/{category}/`
2. Implement `BaseModule` interface
3. Use in workflows via import


@@ -1,122 +0,0 @@
{
"name": "FuzzForge Security Testing Platform",
"description": "MCP server for FuzzForge security testing workflows via Docker Compose",
"version": "0.6.0",
"connection": {
"type": "http",
"host": "localhost",
"port": 8010,
"base_url": "http://localhost:8010",
"mcp_endpoint": "/mcp"
},
"docker_compose": {
"service": "fuzzforge-backend",
"command": "docker compose up -d",
"health_check": "http://localhost:8000/health"
},
"capabilities": {
"tools": [
{
"name": "submit_security_scan_mcp",
"description": "Submit a security scanning workflow for execution",
"parameters": {
"workflow_name": "string",
"target_path": "string",
"volume_mode": "string (ro|rw)",
"parameters": "object"
}
},
{
"name": "get_comprehensive_scan_summary",
"description": "Get a comprehensive summary of scan results with analysis",
"parameters": {
"run_id": "string"
}
}
],
"fastapi_routes": [
{
"method": "GET",
"path": "/",
"description": "Get API status and loaded workflows count"
},
{
"method": "GET",
"path": "/workflows/",
"description": "List all available security testing workflows"
},
{
"method": "POST",
"path": "/workflows/{workflow_name}/submit",
"description": "Submit a security scanning workflow for execution"
},
{
"method": "GET",
"path": "/runs/{run_id}/status",
"description": "Get the current status of a security scan run"
},
{
"method": "GET",
"path": "/runs/{run_id}/findings",
"description": "Get security findings from a completed scan"
},
{
"method": "GET",
"path": "/fuzzing/{run_id}/stats",
"description": "Get fuzzing statistics for a run"
}
]
},
"examples": {
"start_infrastructure_scan": {
"description": "Run infrastructure security scan on a project",
"steps": [
"1. Start Docker Compose: docker compose up -d",
"2. Submit scan via MCP tool: submit_security_scan_mcp",
"3. Monitor status and get results"
],
"workflow_name": "infrastructure_scan",
"target_path": "/Users/tduhamel/Documents/FuzzingLabs/fuzzforge_alpha/test_projects/infrastructure_vulnerable",
"parameters": {
"checkov_config": {
"severity": ["HIGH", "MEDIUM", "LOW"]
},
"hadolint_config": {
"severity": ["error", "warning", "info", "style"]
}
}
},
"static_analysis_scan": {
"description": "Run static analysis security scan",
"workflow_name": "static_analysis_scan",
"target_path": "/Users/tduhamel/Documents/FuzzingLabs/fuzzforge_alpha/test_projects/static_analysis_vulnerable",
"parameters": {
"bandit_config": {
"severity": ["HIGH", "MEDIUM", "LOW"]
},
"opengrep_config": {
"severity": ["HIGH", "MEDIUM", "LOW"]
}
}
},
"secret_detection_scan": {
"description": "Run secret detection scan",
"workflow_name": "secret_detection_scan",
"target_path": "/Users/tduhamel/Documents/FuzzingLabs/fuzzforge_alpha/test_projects/secret_detection_vulnerable",
"parameters": {
"trufflehog_config": {
"verified_only": false
},
"gitleaks_config": {
"no_git": true
}
}
}
},
"usage": {
"via_mcp": "Connect MCP client to http://localhost:8010/mcp after starting Docker Compose",
"via_api": "Use FastAPI endpoints directly at http://localhost:8000",
"start_system": "docker compose up -d",
"stop_system": "docker compose down"
}
}


@@ -1,25 +0,0 @@
[project]
name = "backend"
version = "0.6.0"
description = "FuzzForge OSS backend"
authors = []
readme = "README.md"
requires-python = ">=3.11"
dependencies = [
"fastapi>=0.116.1",
"prefect>=3.4.18",
"pydantic>=2.0.0",
"pyyaml>=6.0",
"docker>=7.0.0",
"aiofiles>=23.0.0",
"uvicorn>=0.30.0",
"aiohttp>=3.12.15",
"fastmcp",
]
[project.optional-dependencies]
dev = [
"pytest>=8.0.0",
"pytest-asyncio>=0.23.0",
"httpx>=0.27.0",
]


@@ -1,11 +0,0 @@
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.


@@ -1,11 +0,0 @@
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.


@@ -1,325 +0,0 @@
"""
API endpoints for fuzzing workflow management and real-time monitoring
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import logging
from typing import List, Dict, Any
from fastapi import APIRouter, HTTPException, Depends, WebSocket, WebSocketDisconnect
from fastapi.responses import StreamingResponse
import asyncio
import json
from datetime import datetime
from src.models.findings import (
FuzzingStats,
CrashReport
)
from src.core.workflow_discovery import WorkflowDiscovery
logger = logging.getLogger(__name__)
router = APIRouter(prefix="/fuzzing", tags=["fuzzing"])
# In-memory storage for real-time stats (in production, use Redis or similar)
fuzzing_stats: Dict[str, FuzzingStats] = {}
crash_reports: Dict[str, List[CrashReport]] = {}
active_connections: Dict[str, List[WebSocket]] = {}
def initialize_fuzzing_tracking(run_id: str, workflow_name: str):
"""
Initialize fuzzing tracking for a new run.
This function should be called when a workflow is submitted to enable
real-time monitoring and stats collection.
Args:
run_id: The run identifier
workflow_name: Name of the workflow
"""
fuzzing_stats[run_id] = FuzzingStats(
run_id=run_id,
workflow=workflow_name
)
crash_reports[run_id] = []
active_connections[run_id] = []
@router.get("/{run_id}/stats", response_model=FuzzingStats)
async def get_fuzzing_stats(run_id: str) -> FuzzingStats:
"""
Get current fuzzing statistics for a run.
Args:
run_id: The fuzzing run ID
Returns:
Current fuzzing statistics
Raises:
HTTPException: 404 if run not found
"""
if run_id not in fuzzing_stats:
raise HTTPException(
status_code=404,
detail=f"Fuzzing run not found: {run_id}"
)
return fuzzing_stats[run_id]
@router.get("/{run_id}/crashes", response_model=List[CrashReport])
async def get_crash_reports(run_id: str) -> List[CrashReport]:
"""
Get crash reports for a fuzzing run.
Args:
run_id: The fuzzing run ID
Returns:
List of crash reports
Raises:
HTTPException: 404 if run not found
"""
if run_id not in crash_reports:
raise HTTPException(
status_code=404,
detail=f"Fuzzing run not found: {run_id}"
)
return crash_reports[run_id]
@router.post("/{run_id}/stats")
async def update_fuzzing_stats(run_id: str, stats: FuzzingStats):
"""
Update fuzzing statistics (called by fuzzing workflows).
Args:
run_id: The fuzzing run ID
stats: Updated statistics
Raises:
HTTPException: 404 if run not found
"""
if run_id not in fuzzing_stats:
raise HTTPException(
status_code=404,
detail=f"Fuzzing run not found: {run_id}"
)
# Update stats
fuzzing_stats[run_id] = stats
# Debug: log reception for live instrumentation
try:
logger.info(
"Received fuzzing stats update: run_id=%s exec=%s eps=%.2f crashes=%s corpus=%s elapsed=%ss",
run_id,
stats.executions,
stats.executions_per_sec,
stats.crashes,
stats.corpus_size,
stats.elapsed_time,
)
except Exception:
pass
# Notify connected WebSocket clients
if run_id in active_connections:
message = {
"type": "stats_update",
"data": stats.model_dump()
}
for websocket in active_connections[run_id][:]: # Copy to avoid modification during iteration
try:
await websocket.send_text(json.dumps(message))
except Exception:
# Remove disconnected clients
active_connections[run_id].remove(websocket)
@router.post("/{run_id}/crash")
async def report_crash(run_id: str, crash: CrashReport):
"""
Report a new crash (called by fuzzing workflows).
Args:
run_id: The fuzzing run ID
crash: Crash report details
"""
if run_id not in crash_reports:
crash_reports[run_id] = []
# Add crash report
crash_reports[run_id].append(crash)
# Update stats
if run_id in fuzzing_stats:
fuzzing_stats[run_id].crashes += 1
fuzzing_stats[run_id].last_crash_time = crash.timestamp
# Notify connected WebSocket clients
if run_id in active_connections:
message = {
"type": "crash_report",
"data": crash.model_dump()
}
for websocket in active_connections[run_id][:]:
try:
await websocket.send_text(json.dumps(message))
except Exception:
active_connections[run_id].remove(websocket)
@router.websocket("/{run_id}/live")
async def websocket_endpoint(websocket: WebSocket, run_id: str):
"""
WebSocket endpoint for real-time fuzzing updates.
Args:
websocket: WebSocket connection
run_id: The fuzzing run ID to monitor
"""
await websocket.accept()
# Initialize connection tracking
if run_id not in active_connections:
active_connections[run_id] = []
active_connections[run_id].append(websocket)
try:
# Send current stats on connection
if run_id in fuzzing_stats:
current = fuzzing_stats[run_id]
if isinstance(current, dict):
payload = current
elif hasattr(current, "model_dump"):
payload = current.model_dump()
elif hasattr(current, "dict"):
payload = current.dict()
else:
payload = getattr(current, "__dict__", {"run_id": run_id})
message = {"type": "stats_update", "data": payload}
await websocket.send_text(json.dumps(message))
# Keep connection alive
while True:
try:
# Wait for ping or handle disconnect
data = await asyncio.wait_for(websocket.receive_text(), timeout=30.0)
# Echo back for ping-pong
if data == "ping":
await websocket.send_text("pong")
except asyncio.TimeoutError:
# Send periodic heartbeat
await websocket.send_text(json.dumps({"type": "heartbeat"}))
except WebSocketDisconnect:
# Clean up connection
if run_id in active_connections and websocket in active_connections[run_id]:
active_connections[run_id].remove(websocket)
except Exception as e:
logger.error(f"WebSocket error for run {run_id}: {e}")
if run_id in active_connections and websocket in active_connections[run_id]:
active_connections[run_id].remove(websocket)
@router.get("/{run_id}/stream")
async def stream_fuzzing_updates(run_id: str):
"""
Server-Sent Events endpoint for real-time fuzzing updates.
Args:
run_id: The fuzzing run ID to monitor
Returns:
Streaming response with real-time updates
"""
if run_id not in fuzzing_stats:
raise HTTPException(
status_code=404,
detail=f"Fuzzing run not found: {run_id}"
)
async def event_stream():
"""Generate server-sent events for fuzzing updates"""
last_stats_time = datetime.utcnow()
while True:
try:
# Send current stats
if run_id in fuzzing_stats:
current_stats = fuzzing_stats[run_id]
if isinstance(current_stats, dict):
stats_payload = current_stats
elif hasattr(current_stats, "model_dump"):
stats_payload = current_stats.model_dump()
elif hasattr(current_stats, "dict"):
stats_payload = current_stats.dict()
else:
stats_payload = getattr(current_stats, "__dict__", {"run_id": run_id})
event_data = f"data: {json.dumps({'type': 'stats', 'data': stats_payload})}\n\n"
yield event_data
# Send recent crashes
if run_id in crash_reports:
recent_crashes = [
crash for crash in crash_reports[run_id]
if crash.timestamp > last_stats_time
]
for crash in recent_crashes:
event_data = f"data: {json.dumps({'type': 'crash', 'data': crash.model_dump()})}\n\n"
yield event_data
last_stats_time = datetime.utcnow()
await asyncio.sleep(5) # Update every 5 seconds
except Exception as e:
logger.error(f"Error in event stream for run {run_id}: {e}")
break
return StreamingResponse(
event_stream(),
media_type="text/event-stream",
headers={
"Cache-Control": "no-cache",
"Connection": "keep-alive",
}
)
@router.delete("/{run_id}")
async def cleanup_fuzzing_run(run_id: str):
"""
Clean up fuzzing run data.
Args:
run_id: The fuzzing run ID to clean up
"""
# Clean up tracking data
fuzzing_stats.pop(run_id, None)
crash_reports.pop(run_id, None)
# Close any active WebSocket connections
if run_id in active_connections:
for websocket in active_connections[run_id]:
try:
await websocket.close()
except Exception:
pass
del active_connections[run_id]
return {"message": f"Cleaned up fuzzing run {run_id}"}
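The stats-serialization fallback in this file appears twice, once in the WebSocket handler and once in the SSE stream. A minimal sketch of that logic factored into a shared helper (the helper name is hypothetical; it is not part of the original module):

```python
# Sketch: the dict / model_dump / .dict / __dict__ fallback chain used in
# both handlers above, extracted into one function (hypothetical name).
import json

def to_payload(current, run_id: str) -> dict:
    """Normalize a stats object to a JSON-safe dict, whatever its type."""
    if isinstance(current, dict):
        return current
    if hasattr(current, "model_dump"):  # Pydantic v2 models
        return current.model_dump()
    if hasattr(current, "dict"):        # Pydantic v1 models
        return current.dict()
    return getattr(current, "__dict__", {"run_id": run_id})

class LegacyStats:
    """Stand-in for a plain object with no Pydantic methods."""
    def __init__(self):
        self.run_id = "abc"
        self.executions = 42

msg = json.dumps({"type": "stats_update", "data": to_payload(LegacyStats(), "abc")})
```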


@@ -1,184 +0,0 @@
"""
API endpoints for workflow run management and findings retrieval
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import logging
from typing import Dict, Any
from fastapi import APIRouter, HTTPException, Depends
from src.models.findings import WorkflowFindings, WorkflowStatus
logger = logging.getLogger(__name__)
router = APIRouter(prefix="/runs", tags=["runs"])
def get_prefect_manager():
"""Dependency to get the Prefect manager instance"""
from src.main import prefect_mgr
return prefect_mgr
@router.get("/{run_id}/status", response_model=WorkflowStatus)
async def get_run_status(
run_id: str,
prefect_mgr=Depends(get_prefect_manager)
) -> WorkflowStatus:
"""
Get the current status of a workflow run.
Args:
run_id: The flow run ID
Returns:
Status information including state, timestamps, and completion flags
Raises:
HTTPException: 404 if run not found
"""
try:
status = await prefect_mgr.get_flow_run_status(run_id)
# Find workflow name from deployment
workflow_name = "unknown"
workflow_deployment_id = status.get("workflow", "")
for name, deployment_id in prefect_mgr.deployments.items():
if str(deployment_id) == str(workflow_deployment_id):
workflow_name = name
break
return WorkflowStatus(
run_id=status["run_id"],
workflow=workflow_name,
status=status["status"],
is_completed=status["is_completed"],
is_failed=status["is_failed"],
is_running=status["is_running"],
created_at=status["created_at"],
updated_at=status["updated_at"]
)
except Exception as e:
logger.error(f"Failed to get status for run {run_id}: {e}")
raise HTTPException(
status_code=404,
detail=f"Run not found: {run_id}"
)
@router.get("/{run_id}/findings", response_model=WorkflowFindings)
async def get_run_findings(
run_id: str,
prefect_mgr=Depends(get_prefect_manager)
) -> WorkflowFindings:
"""
Get the findings from a completed workflow run.
Args:
run_id: The flow run ID
Returns:
SARIF-formatted findings from the workflow execution
Raises:
HTTPException: 404 if run not found, 400 if run not completed
"""
try:
# Get run status first
status = await prefect_mgr.get_flow_run_status(run_id)
if not status["is_completed"]:
if status["is_running"]:
raise HTTPException(
status_code=400,
detail=f"Run {run_id} is still running. Current status: {status['status']}"
)
elif status["is_failed"]:
raise HTTPException(
status_code=400,
detail=f"Run {run_id} failed. Status: {status['status']}"
)
else:
raise HTTPException(
status_code=400,
detail=f"Run {run_id} not completed. Status: {status['status']}"
)
# Get the findings
findings = await prefect_mgr.get_flow_run_findings(run_id)
# Find workflow name
workflow_name = "unknown"
workflow_deployment_id = status.get("workflow", "")
for name, deployment_id in prefect_mgr.deployments.items():
if str(deployment_id) == str(workflow_deployment_id):
workflow_name = name
break
# Get workflow version if available
metadata = {
"completion_time": status["updated_at"],
"workflow_version": "unknown"
}
if workflow_name in prefect_mgr.workflows:
workflow_info = prefect_mgr.workflows[workflow_name]
metadata["workflow_version"] = workflow_info.metadata.get("version", "unknown")
return WorkflowFindings(
workflow=workflow_name,
run_id=run_id,
sarif=findings,
metadata=metadata
)
except HTTPException:
raise
except Exception as e:
logger.error(f"Failed to get findings for run {run_id}: {e}")
raise HTTPException(
status_code=500,
detail=f"Failed to retrieve findings: {str(e)}"
)
@router.get("/{workflow_name}/findings/{run_id}", response_model=WorkflowFindings)
async def get_workflow_findings(
workflow_name: str,
run_id: str,
prefect_mgr=Depends(get_prefect_manager)
) -> WorkflowFindings:
"""
Get findings for a specific workflow run.
Alternative endpoint that includes workflow name in the path for clarity.
Args:
workflow_name: Name of the workflow
run_id: The flow run ID
Returns:
SARIF-formatted findings from the workflow execution
Raises:
HTTPException: 404 if workflow or run not found, 400 if run not completed
"""
if workflow_name not in prefect_mgr.workflows:
raise HTTPException(
status_code=404,
detail=f"Workflow not found: {workflow_name}"
)
# Delegate to the main findings endpoint
return await get_run_findings(run_id, prefect_mgr)
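Both endpoints in this file repeat the same inline loop to map a deployment ID back to a workflow name. A standalone sketch of that reverse lookup (helper name and sample IDs are illustrative only):

```python
# Sketch: the deployment-id -> workflow-name reverse lookup performed
# inline by get_run_status and get_run_findings above.
def resolve_workflow_name(deployments: dict, deployment_id) -> str:
    """Return the workflow name owning deployment_id, or 'unknown'."""
    for name, dep_id in deployments.items():
        # Both sides are stringified, matching the original comparison
        if str(dep_id) == str(deployment_id):
            return name
    return "unknown"

deployments = {"secret_detection_scan": "dep-123"}
name = resolve_workflow_name(deployments, "dep-123")
```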


@@ -1,386 +0,0 @@
"""
API endpoints for workflow management with enhanced error handling
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import logging
import traceback
from typing import List, Dict, Any, Optional
from fastapi import APIRouter, HTTPException, Depends
from pathlib import Path
from src.models.findings import (
WorkflowSubmission,
WorkflowMetadata,
WorkflowListItem,
RunSubmissionResponse
)
from src.core.workflow_discovery import WorkflowDiscovery
logger = logging.getLogger(__name__)
router = APIRouter(prefix="/workflows", tags=["workflows"])
def create_structured_error_response(
error_type: str,
message: str,
workflow_name: Optional[str] = None,
run_id: Optional[str] = None,
container_info: Optional[Dict[str, Any]] = None,
deployment_info: Optional[Dict[str, Any]] = None,
suggestions: Optional[List[str]] = None
) -> Dict[str, Any]:
"""Create a structured error response with rich context."""
error_response = {
"error": {
"type": error_type,
"message": message,
"timestamp": __import__("datetime").datetime.utcnow().isoformat() + "Z"
}
}
if workflow_name:
error_response["error"]["workflow_name"] = workflow_name
if run_id:
error_response["error"]["run_id"] = run_id
if container_info:
error_response["error"]["container"] = container_info
if deployment_info:
error_response["error"]["deployment"] = deployment_info
if suggestions:
error_response["error"]["suggestions"] = suggestions
return error_response
def get_prefect_manager():
"""Dependency to get the Prefect manager instance"""
from src.main import prefect_mgr
return prefect_mgr
@router.get("/", response_model=List[WorkflowListItem])
async def list_workflows(
prefect_mgr=Depends(get_prefect_manager)
) -> List[WorkflowListItem]:
"""
List all discovered workflows with their metadata.
Returns a summary of each workflow including name, version, description,
author, and tags.
"""
workflows = []
for name, info in prefect_mgr.workflows.items():
workflows.append(WorkflowListItem(
name=name,
version=info.metadata.get("version", "0.6.0"),
description=info.metadata.get("description", ""),
author=info.metadata.get("author"),
tags=info.metadata.get("tags", [])
))
return workflows
@router.get("/metadata/schema")
async def get_metadata_schema() -> Dict[str, Any]:
"""
Get the JSON schema for workflow metadata files.
This schema defines the structure and requirements for metadata.yaml files
that must accompany each workflow.
"""
return WorkflowDiscovery.get_metadata_schema()
@router.get("/{workflow_name}/metadata", response_model=WorkflowMetadata)
async def get_workflow_metadata(
workflow_name: str,
prefect_mgr=Depends(get_prefect_manager)
) -> WorkflowMetadata:
"""
Get complete metadata for a specific workflow.
Args:
workflow_name: Name of the workflow
Returns:
Complete metadata including parameters schema, supported volume modes,
required modules, and more.
Raises:
HTTPException: 404 if workflow not found
"""
if workflow_name not in prefect_mgr.workflows:
available_workflows = list(prefect_mgr.workflows.keys())
error_response = create_structured_error_response(
error_type="WorkflowNotFound",
message=f"Workflow '{workflow_name}' not found",
workflow_name=workflow_name,
suggestions=[
f"Available workflows: {', '.join(available_workflows)}",
"Use GET /workflows/ to see all available workflows",
"Check workflow name spelling and case sensitivity"
]
)
raise HTTPException(
status_code=404,
detail=error_response
)
info = prefect_mgr.workflows[workflow_name]
metadata = info.metadata
return WorkflowMetadata(
name=workflow_name,
version=metadata.get("version", "0.6.0"),
description=metadata.get("description", ""),
author=metadata.get("author"),
tags=metadata.get("tags", []),
parameters=metadata.get("parameters", {}),
default_parameters=metadata.get("default_parameters", {}),
required_modules=metadata.get("required_modules", []),
supported_volume_modes=metadata.get("supported_volume_modes", ["ro", "rw"]),
has_custom_docker=info.has_docker
)
@router.post("/{workflow_name}/submit", response_model=RunSubmissionResponse)
async def submit_workflow(
workflow_name: str,
submission: WorkflowSubmission,
prefect_mgr=Depends(get_prefect_manager)
) -> RunSubmissionResponse:
"""
Submit a workflow for execution with volume mounting.
Args:
workflow_name: Name of the workflow to execute
submission: Submission parameters including target path and volume mode
Returns:
Run submission response with run_id and initial status
Raises:
HTTPException: 404 if workflow not found, 400 for invalid parameters
"""
if workflow_name not in prefect_mgr.workflows:
available_workflows = list(prefect_mgr.workflows.keys())
error_response = create_structured_error_response(
error_type="WorkflowNotFound",
message=f"Workflow '{workflow_name}' not found",
workflow_name=workflow_name,
suggestions=[
f"Available workflows: {', '.join(available_workflows)}",
"Use GET /workflows/ to see all available workflows",
"Check workflow name spelling and case sensitivity"
]
)
raise HTTPException(
status_code=404,
detail=error_response
)
try:
# Convert ResourceLimits to dict if provided
resource_limits_dict = None
if submission.resource_limits:
resource_limits_dict = {
"cpu_limit": submission.resource_limits.cpu_limit,
"memory_limit": submission.resource_limits.memory_limit,
"cpu_request": submission.resource_limits.cpu_request,
"memory_request": submission.resource_limits.memory_request
}
# Submit the workflow with enhanced parameters
flow_run = await prefect_mgr.submit_workflow(
workflow_name=workflow_name,
target_path=submission.target_path,
volume_mode=submission.volume_mode,
parameters=submission.parameters,
resource_limits=resource_limits_dict,
additional_volumes=submission.additional_volumes,
timeout=submission.timeout
)
run_id = str(flow_run.id)
# Initialize fuzzing tracking if this looks like a fuzzing workflow
workflow_info = prefect_mgr.workflows.get(workflow_name, {})
workflow_tags = workflow_info.metadata.get("tags", []) if hasattr(workflow_info, 'metadata') else []
if "fuzzing" in workflow_tags or "fuzz" in workflow_name.lower():
from src.api.fuzzing import initialize_fuzzing_tracking
initialize_fuzzing_tracking(run_id, workflow_name)
return RunSubmissionResponse(
run_id=run_id,
status=flow_run.state.name if flow_run.state else "PENDING",
workflow=workflow_name,
message=f"Workflow '{workflow_name}' submitted successfully"
)
except ValueError as e:
# Parameter validation errors
error_response = create_structured_error_response(
error_type="ValidationError",
message=str(e),
workflow_name=workflow_name,
suggestions=[
"Check parameter types and values",
"Use GET /workflows/{workflow_name}/parameters for schema",
"Ensure all required parameters are provided"
]
)
raise HTTPException(status_code=400, detail=error_response)
except Exception as e:
logger.error(f"Failed to submit workflow '{workflow_name}': {e}")
logger.error(f"Traceback: {traceback.format_exc()}")
# Try to get more context about the error
container_info = None
deployment_info = None
suggestions = []
error_message = str(e)
error_type = "WorkflowSubmissionError"
# Detect specific error patterns
if "deployment" in error_message.lower():
error_type = "DeploymentError"
deployment_info = {
"status": "failed",
"error": error_message
}
suggestions.extend([
"Check if Prefect server is running and accessible",
"Verify Docker is running and has sufficient resources",
"Check container image availability",
"Ensure volume paths exist and are accessible"
])
elif "volume" in error_message.lower() or "mount" in error_message.lower():
error_type = "VolumeError"
suggestions.extend([
"Check if the target path exists and is accessible",
"Verify file permissions (Docker needs read access)",
"Ensure the path is not in use by another process",
"Try using an absolute path instead of relative path"
])
elif "memory" in error_message.lower() or "resource" in error_message.lower():
error_type = "ResourceError"
suggestions.extend([
"Check system memory and CPU availability",
"Consider reducing resource limits or dataset size",
"Monitor Docker resource usage",
"Increase Docker memory limits if needed"
])
elif "image" in error_message.lower():
error_type = "ImageError"
suggestions.extend([
"Check if the workflow image exists",
"Verify Docker registry access",
"Try rebuilding the workflow image",
"Check network connectivity to registries"
])
else:
suggestions.extend([
"Check FuzzForge backend logs for details",
"Verify all services are running (docker-compose up -d)",
"Try restarting the workflow deployment",
"Contact support if the issue persists"
])
error_response = create_structured_error_response(
error_type=error_type,
message=f"Failed to submit workflow: {error_message}",
workflow_name=workflow_name,
container_info=container_info,
deployment_info=deployment_info,
suggestions=suggestions
)
raise HTTPException(
status_code=500,
detail=error_response
)
@router.get("/{workflow_name}/parameters")
async def get_workflow_parameters(
workflow_name: str,
prefect_mgr=Depends(get_prefect_manager)
) -> Dict[str, Any]:
"""
Get the parameters schema for a workflow.
Args:
workflow_name: Name of the workflow
Returns:
Parameters schema with types, descriptions, and defaults
Raises:
HTTPException: 404 if workflow not found
"""
if workflow_name not in prefect_mgr.workflows:
available_workflows = list(prefect_mgr.workflows.keys())
error_response = create_structured_error_response(
error_type="WorkflowNotFound",
message=f"Workflow '{workflow_name}' not found",
workflow_name=workflow_name,
suggestions=[
f"Available workflows: {', '.join(available_workflows)}",
"Use GET /workflows/ to see all available workflows"
]
)
raise HTTPException(
status_code=404,
detail=error_response
)
info = prefect_mgr.workflows[workflow_name]
metadata = info.metadata
# Return parameters with enhanced schema information
parameters_schema = metadata.get("parameters", {})
# Extract the actual parameter definitions from JSON schema structure
if "properties" in parameters_schema:
param_definitions = parameters_schema["properties"]
else:
param_definitions = parameters_schema
# Add default values to the schema
default_params = metadata.get("default_parameters", {})
for param_name, param_schema in param_definitions.items():
if isinstance(param_schema, dict) and param_name in default_params:
param_schema["default"] = default_params[param_name]
return {
"workflow": workflow_name,
"parameters": param_definitions,
"default_parameters": default_params,
"required_parameters": [
name for name, schema in param_definitions.items()
if isinstance(schema, dict) and schema.get("required", False)
]
}
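The tail of `get_workflow_parameters` above unwraps a JSON-schema `properties` map and attaches defaults to each parameter definition. A minimal sketch of that merge (function name is hypothetical):

```python
# Sketch: attach default_parameters to JSON-schema parameter definitions,
# mirroring the end of get_workflow_parameters above.
def merge_defaults(parameters_schema: dict, default_params: dict) -> dict:
    """Return parameter definitions with matching defaults attached."""
    # Unwrap a full JSON-schema object if one was provided
    definitions = parameters_schema.get("properties", parameters_schema)
    for name, schema in definitions.items():
        if isinstance(schema, dict) and name in default_params:
            schema["default"] = default_params[name]
    return definitions

schema = {"properties": {"timeout": {"type": "integer"}}}
defs = merge_defaults(schema, {"timeout": 300})
```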


@@ -1,11 +0,0 @@
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.


@@ -1,770 +0,0 @@
"""
Prefect Manager - Core orchestration for workflow deployment and execution
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import logging
import os
import platform
import re
from pathlib import Path
from typing import Dict, Optional, Any
from prefect import get_client
from prefect.docker import DockerImage
from prefect.client.schemas import FlowRun
from src.core.workflow_discovery import WorkflowDiscovery, WorkflowInfo
logger = logging.getLogger(__name__)
def get_registry_url(context: str = "default") -> str:
"""
Get the container registry URL to use for a given operation context.
Goals:
- Work reliably across Linux and macOS Docker Desktop
- Prefer in-network service discovery when running inside containers
- Allow full override via env vars from docker-compose
Env overrides:
- FUZZFORGE_REGISTRY_PUSH_URL: used for image builds/pushes
- FUZZFORGE_REGISTRY_PULL_URL: used for workers to pull images
"""
# Normalize context
ctx = (context or "default").lower()
# Always honor explicit overrides first
if ctx in ("push", "build"):
push_url = os.getenv("FUZZFORGE_REGISTRY_PUSH_URL")
if push_url:
logger.debug("Using FUZZFORGE_REGISTRY_PUSH_URL: %s", push_url)
return push_url
# Default to host-published registry for Docker daemon operations
return "localhost:5001"
if ctx == "pull":
pull_url = os.getenv("FUZZFORGE_REGISTRY_PULL_URL")
if pull_url:
logger.debug("Using FUZZFORGE_REGISTRY_PULL_URL: %s", pull_url)
return pull_url
# Prefect worker pulls via host Docker daemon as well
return "localhost:5001"
# Default/fallback
return os.getenv("FUZZFORGE_REGISTRY_PULL_URL", os.getenv("FUZZFORGE_REGISTRY_PUSH_URL", "localhost:5001"))
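The precedence described in the docstring (explicit env override first, then the host-published registry default) can be exercised with a standalone reimplementation; this is an illustration of the resolution order only, not the module's actual function:

```python
# Sketch: env-override precedence of get_registry_url-style resolution.
import os

def registry_url(context: str = "default") -> str:
    """Resolve the registry URL: env override first, then local default."""
    ctx = (context or "default").lower()
    if ctx in ("push", "build"):
        return os.getenv("FUZZFORGE_REGISTRY_PUSH_URL") or "localhost:5001"
    if ctx == "pull":
        return os.getenv("FUZZFORGE_REGISTRY_PULL_URL") or "localhost:5001"
    return os.getenv("FUZZFORGE_REGISTRY_PULL_URL",
                     os.getenv("FUZZFORGE_REGISTRY_PUSH_URL", "localhost:5001"))

os.environ.pop("FUZZFORGE_REGISTRY_PUSH_URL", None)
default_url = registry_url("push")  # no override -> host-published default
os.environ["FUZZFORGE_REGISTRY_PUSH_URL"] = "registry.internal:5000"
overridden = registry_url("push")   # override wins
```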
def _compose_project_name(default: str = "fuzzforge") -> str:
"""Return the docker-compose project name used for network/volume naming.
Always returns 'fuzzforge' regardless of environment variables.
"""
return "fuzzforge"
class PrefectManager:
"""
Manages Prefect deployments and flow runs for discovered workflows.
This class handles:
- Workflow discovery and registration
- Docker image building through Prefect
- Deployment creation and management
- Flow run submission with volume mounting
- Findings retrieval from completed runs
"""
def __init__(self, workflows_dir: Path = None):
"""
Initialize the Prefect manager.
Args:
workflows_dir: Path to the workflows directory (default: toolbox/workflows)
"""
if workflows_dir is None:
workflows_dir = Path("toolbox/workflows")
self.discovery = WorkflowDiscovery(workflows_dir)
self.workflows: Dict[str, WorkflowInfo] = {}
self.deployments: Dict[str, str] = {} # workflow_name -> deployment_id
# Security: Define allowed and forbidden paths for host mounting
self.allowed_base_paths = [
"/tmp",
"/home",
"/Users", # macOS users
"/opt",
"/var/tmp",
"/workspace", # Common container workspace
"/app" # Container application directory (for test projects)
]
self.forbidden_paths = [
"/etc",
"/root",
"/var/run",
"/sys",
"/proc",
"/dev",
"/boot",
"/var/lib/docker", # Critical Docker data
"/var/log", # System logs
"/usr/bin", # System binaries
"/usr/sbin",
"/sbin",
"/bin"
]
@staticmethod
def _parse_memory_to_bytes(memory_str: str) -> int:
"""
Parse memory string (like '512Mi', '1Gi') to bytes.
Args:
memory_str: Memory string with unit suffix
Returns:
Memory in bytes
Raises:
ValueError: If format is invalid
"""
if not memory_str:
return 0
match = re.match(r'^(\d+(?:\.\d+)?)\s*([GMK]i?)$', memory_str.strip())
if not match:
raise ValueError(f"Invalid memory format: {memory_str}. Expected format like '512Mi', '1Gi'")
value, unit = match.groups()
value = float(value)
# Convert to bytes based on unit (binary units: Ki, Mi, Gi)
if unit in ['K', 'Ki']:
multiplier = 1024
elif unit in ['M', 'Mi']:
multiplier = 1024 * 1024
elif unit in ['G', 'Gi']:
multiplier = 1024 * 1024 * 1024
else:
raise ValueError(f"Unsupported memory unit: {unit}")
return int(value * multiplier)
@staticmethod
def _parse_cpu_to_millicores(cpu_str: str) -> int:
"""
Parse CPU string (like '500m', '1', '2.5') to millicores.
Args:
cpu_str: CPU string
Returns:
CPU in millicores (1 core = 1000 millicores)
Raises:
ValueError: If format is invalid
"""
if not cpu_str:
return 0
cpu_str = cpu_str.strip()
# Handle millicores format (e.g., '500m')
if cpu_str.endswith('m'):
try:
return int(cpu_str[:-1])
except ValueError:
raise ValueError(f"Invalid CPU format: {cpu_str}")
# Handle core format (e.g., '1', '2.5')
try:
cores = float(cpu_str)
return int(cores * 1000) # Convert to millicores
except ValueError:
raise ValueError(f"Invalid CPU format: {cpu_str}")
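Taken together, the two static parsers above accept Kubernetes-style resource strings. A standalone sketch of the expected conversions, so the arithmetic is visible outside the class:

```python
# Sketch: conversions performed by _parse_memory_to_bytes and
# _parse_cpu_to_millicores above, reimplemented standalone.
import re

def mem_to_bytes(s: str) -> int:
    """'512Mi' -> bytes, using binary (1024-based) multipliers."""
    m = re.match(r'^(\d+(?:\.\d+)?)\s*([GMK]i?)$', s.strip())
    if not m:
        raise ValueError(f"Invalid memory format: {s}")
    value, unit = float(m.group(1)), m.group(2)
    factor = {"K": 1024, "M": 1024**2, "G": 1024**3}[unit[0]]
    return int(value * factor)

def cpu_to_millicores(s: str) -> int:
    """'500m' stays in millicores; '2.5' means 2.5 cores -> 2500m."""
    s = s.strip()
    if s.endswith("m"):
        return int(s[:-1])
    return int(float(s) * 1000)

half_gig = mem_to_bytes("512Mi")
```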
def _extract_resource_requirements(self, workflow_info: WorkflowInfo) -> Dict[str, str]:
"""
Extract resource requirements from workflow metadata.
Args:
workflow_info: Workflow information with metadata
Returns:
Dictionary with resource requirements in Docker format
"""
metadata = workflow_info.metadata
requirements = metadata.get("requirements", {})
resources = requirements.get("resources", {})
resource_config = {}
# Extract memory requirement
memory = resources.get("memory")
if memory:
try:
# Validate memory format and store original string for Docker
self._parse_memory_to_bytes(memory)
resource_config["memory"] = memory
except ValueError as e:
logger.warning(f"Invalid memory requirement in {workflow_info.name}: {e}")
# Extract CPU requirement
cpu = resources.get("cpu")
if cpu:
try:
# Validate CPU format and store original string for Docker
self._parse_cpu_to_millicores(cpu)
resource_config["cpus"] = cpu
except ValueError as e:
logger.warning(f"Invalid CPU requirement in {workflow_info.name}: {e}")
# Extract timeout
timeout = resources.get("timeout")
if timeout and isinstance(timeout, int):
resource_config["timeout"] = str(timeout)
return resource_config
async def initialize(self):
"""
Initialize the manager by discovering and deploying all workflows.
This method:
1. Discovers all valid workflows in the workflows directory
2. Validates their metadata
3. Deploys each workflow to Prefect with Docker images
"""
try:
# Discover workflows
self.workflows = await self.discovery.discover_workflows()
if not self.workflows:
logger.warning("No workflows discovered")
return
logger.info(f"Discovered {len(self.workflows)} workflows: {list(self.workflows.keys())}")
# Deploy each workflow
for name, info in self.workflows.items():
try:
await self._deploy_workflow(name, info)
except Exception as e:
logger.error(f"Failed to deploy workflow '{name}': {e}")
except Exception as e:
logger.error(f"Failed to initialize Prefect manager: {e}")
raise
async def _deploy_workflow(self, name: str, info: WorkflowInfo):
"""
Deploy a single workflow to Prefect with Docker image.
Args:
name: Workflow name
info: Workflow information including metadata and paths
"""
logger.info(f"Deploying workflow '{name}'...")
# Get the flow function from registry
flow_func = self.discovery.get_flow_function(name)
if not flow_func:
logger.error(
f"Failed to get flow function for '{name}' from registry. "
f"Ensure the workflow is properly registered in toolbox/workflows/registry.py"
)
return
# Use the mandatory Dockerfile with absolute paths for Docker Compose
# Get absolute paths for build context and dockerfile
toolbox_path = info.path.parent.parent.resolve()
dockerfile_abs_path = info.dockerfile.resolve()
# Calculate relative dockerfile path from toolbox context
try:
dockerfile_rel_path = dockerfile_abs_path.relative_to(toolbox_path)
except ValueError:
# If relative path fails, use the workflow-specific path
dockerfile_rel_path = Path("workflows") / name / "Dockerfile"
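The relative-path computation with its conventional fallback can be isolated into a small helper; this is a sketch mirroring the try/except above (the helper name is ours, not the module's):

```python
from pathlib import Path

def dockerfile_relative_path(dockerfile_abs: Path, toolbox: Path, name: str) -> Path:
    """Prefer a path relative to the build context; if the Dockerfile lies
    outside it, fall back to the conventional workflows/<name>/Dockerfile."""
    try:
        return dockerfile_abs.relative_to(toolbox)
    except ValueError:
        # relative_to() raises ValueError when dockerfile_abs is not
        # under toolbox; assume the standard layout instead.
        return Path("workflows") / name / "Dockerfile"
```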
# Determine deployment strategy based on Dockerfile presence
base_image = "prefecthq/prefect:3-python3.11"
has_custom_dockerfile = info.has_docker and info.dockerfile.exists()
logger.debug(f"=== DEPLOYMENT DEBUG for '{name}' ===")
logger.debug(f"info.has_docker: {info.has_docker}")
logger.debug(f"info.dockerfile: {info.dockerfile}")
logger.debug(f"info.dockerfile.exists(): {info.dockerfile.exists()}")
logger.debug(f"has_custom_dockerfile: {has_custom_dockerfile}")
logger.debug(f"toolbox_path: {toolbox_path}")
logger.debug(f"dockerfile_rel_path: {dockerfile_rel_path}")

if has_custom_dockerfile:
logger.info(f"Workflow '{name}' has custom Dockerfile - building custom image")
# Decide whether to use registry or keep images local to host engine
import os
# Default to using the local registry; set FUZZFORGE_USE_REGISTRY=false to bypass (not recommended)
use_registry = os.getenv("FUZZFORGE_USE_REGISTRY", "true").lower() == "true"
if use_registry:
registry_url = get_registry_url(context="push")
image_spec = DockerImage(
name=f"{registry_url}/fuzzforge/{name}",
tag="latest",
dockerfile=str(dockerfile_rel_path),
context=str(toolbox_path)
)
deploy_image = f"{registry_url}/fuzzforge/{name}:latest"
build_custom = True
push_custom = True
logger.info(f"Using registry: {registry_url} for '{name}'")
else:
# Single-host mode: build into host engine cache; no push required
image_spec = DockerImage(
name=f"fuzzforge/{name}",
tag="latest",
dockerfile=str(dockerfile_rel_path),
context=str(toolbox_path)
)
deploy_image = f"fuzzforge/{name}:latest"
build_custom = True
push_custom = False
logger.info("Using single-host image (no registry push): %s", deploy_image)
else:
logger.info(f"Workflow '{name}' using base image - no custom dependencies needed")
deploy_image = base_image
build_custom = False
push_custom = False
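The image-selection branch above reduces to a pure decision function: given a workflow name, whether it has a custom Dockerfile, and the `FUZZFORGE_USE_REGISTRY` flag, it yields the image reference plus build/push booleans. A condensed sketch (the function name and default registry URL are assumptions for illustration):

```python
import os

def resolve_image(name: str, has_dockerfile: bool,
                  registry_url: str = "localhost:5001",
                  base_image: str = "prefecthq/prefect:3-python3.11") -> tuple:
    """Return (image_reference, build, push) for a workflow deployment."""
    if not has_dockerfile:
        # No custom dependencies: run on the stock Prefect image.
        return base_image, False, False
    use_registry = os.getenv("FUZZFORGE_USE_REGISTRY", "true").lower() == "true"
    if use_registry:
        # Registry mode: build and push so remote workers can pull.
        return f"{registry_url}/fuzzforge/{name}:latest", True, True
    # Single-host mode: build into the local engine cache, no push.
    return f"fuzzforge/{name}:latest", True, False
```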
# Pre-validate registry connectivity when pushing
if push_custom:
try:
from .setup import validate_registry_connectivity
await validate_registry_connectivity(registry_url)
logger.info(f"Registry connectivity validated for {registry_url}")
except Exception as e:
logger.error(f"Registry connectivity validation failed for {registry_url}: {e}")
raise RuntimeError(f"Cannot deploy workflow '{name}': Registry {registry_url} is not accessible. {e}")
# Deploy the workflow
try:
# Ensure any previous deployment is removed so job variables are updated
try:
async with get_client() as client:
existing = await client.read_deployment_by_name(
f"{name}/{name}-deployment"
)
if existing:
logger.info(f"Removing existing deployment for '{name}' to refresh settings...")
await client.delete_deployment(existing.id)
except Exception:
# If not found or deletion fails, continue with deployment
pass
# Extract resource requirements from metadata
workflow_resource_requirements = self._extract_resource_requirements(info)
logger.info(f"Workflow '{name}' resource requirements: {workflow_resource_requirements}")
# Build job variables with resource requirements
job_variables = {
"image": deploy_image, # Use the worker-accessible registry name
"volumes": [], # Populated at run submission with toolbox mount
"env": {
"PYTHONPATH": "/opt/prefect/toolbox:/opt/prefect",
"WORKFLOW_NAME": name
}
}
# Add resource requirements to job variables if present
if workflow_resource_requirements:
job_variables["resources"] = workflow_resource_requirements
# Prepare deployment parameters
deploy_params = {
"name": f"{name}-deployment",
"work_pool_name": "docker-pool",
"image": image_spec if has_custom_dockerfile else deploy_image,
"push": push_custom,
"build": build_custom,
"job_variables": job_variables
}
deployment = await flow_func.deploy(**deploy_params)
self.deployments[name] = str(deployment.id) if hasattr(deployment, 'id') else name
logger.info(f"Successfully deployed workflow '{name}'")
except Exception as e:
# Enhanced error reporting with more context
import traceback
logger.error(f"Failed to deploy workflow '{name}': {e}")
logger.error(f"Deployment traceback: {traceback.format_exc()}")
# Try to capture Docker-specific context
error_context = {
"workflow_name": name,
"has_dockerfile": has_custom_dockerfile,
"image_name": deploy_image if 'deploy_image' in locals() else "unknown",
"registry_url": registry_url if 'registry_url' in locals() else "unknown",
"error_type": type(e).__name__,
"error_message": str(e)
}
# Check for specific error patterns with detailed categorization
error_msg_lower = str(e).lower()
if "registry" in error_msg_lower and ("no such host" in error_msg_lower or "connection" in error_msg_lower):
error_context["category"] = "registry_connectivity_error"
error_context["solution"] = f"Cannot reach registry at {error_context['registry_url']}. Check Docker network and registry service."
elif "docker" in error_msg_lower:
error_context["category"] = "docker_error"
if "build" in error_msg_lower:
error_context["subcategory"] = "image_build_failed"
error_context["solution"] = "Check Dockerfile syntax and dependencies."
elif "pull" in error_msg_lower:
error_context["subcategory"] = "image_pull_failed"
error_context["solution"] = "Check if image exists in registry and network connectivity."
elif "push" in error_msg_lower:
error_context["subcategory"] = "image_push_failed"
error_context["solution"] = f"Check registry connectivity and push permissions to {error_context['registry_url']}."
elif "registry" in error_msg_lower:
error_context["category"] = "registry_error"
error_context["solution"] = "Check registry configuration and accessibility."
elif "prefect" in error_msg_lower:
error_context["category"] = "prefect_error"
error_context["solution"] = "Check Prefect server connectivity and deployment configuration."
else:
error_context["category"] = "unknown_deployment_error"
error_context["solution"] = "Check logs for more specific error details."
logger.error(f"Deployment error context: {error_context}")
# Raise enhanced exception with context
enhanced_error = Exception(f"Deployment failed for workflow '{name}': {str(e)} | Context: {error_context}")
enhanced_error.original_error = e
enhanced_error.context = error_context
raise enhanced_error
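The error-categorization cascade above is easy to unit-test when lifted into a standalone function; a sketch of the top-level category matching (subcategories omitted, helper name ours):

```python
def categorize_deploy_error(message: str) -> str:
    """Map a deployment error message to a coarse category, mirroring
    the pattern checks above. Order matters: the registry-connectivity
    check must run before the generic docker/registry checks."""
    msg = message.lower()
    if "registry" in msg and ("no such host" in msg or "connection" in msg):
        return "registry_connectivity_error"
    if "docker" in msg:
        return "docker_error"
    if "registry" in msg:
        return "registry_error"
    if "prefect" in msg:
        return "prefect_error"
    return "unknown_deployment_error"
```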
async def submit_workflow(
self,
workflow_name: str,
target_path: str,
volume_mode: str = "ro",
parameters: Dict[str, Any] = None,
resource_limits: Dict[str, str] = None,
additional_volumes: list = None,
timeout: int = None
) -> FlowRun:
"""
Submit a workflow for execution with volume mounting.
Args:
workflow_name: Name of the workflow to execute
target_path: Host path to mount as volume
volume_mode: Volume mount mode ("ro" for read-only, "rw" for read-write)
parameters: Workflow-specific parameters
resource_limits: CPU/memory limits for container
additional_volumes: List of additional volume mounts
timeout: Timeout in seconds
Returns:
FlowRun object with run information
Raises:
ValueError: If workflow not found or volume mode not supported
"""
if workflow_name not in self.workflows:
raise ValueError(f"Unknown workflow: {workflow_name}")
# Validate volume mode
workflow_info = self.workflows[workflow_name]
supported_modes = workflow_info.metadata.get("supported_volume_modes", ["ro", "rw"])
if volume_mode not in supported_modes:
raise ValueError(
f"Workflow '{workflow_name}' doesn't support volume mode '{volume_mode}'. "
f"Supported modes: {supported_modes}"
)
# Validate target path with security checks
self._validate_target_path(target_path)
# Validate additional volumes if provided
if additional_volumes:
for volume in additional_volumes:
self._validate_target_path(volume.host_path)
async with get_client() as client:
# Get the deployment, auto-redeploy once if missing
try:
deployment = await client.read_deployment_by_name(
f"{workflow_name}/{workflow_name}-deployment"
)
except Exception as e:
import traceback
logger.error(f"Failed to find deployment for workflow '{workflow_name}': {e}")
logger.error(f"Deployment lookup traceback: {traceback.format_exc()}")
# Attempt a one-time auto-deploy to recover from startup races
try:
logger.info(f"Auto-deploying missing workflow '{workflow_name}' and retrying...")
await self._deploy_workflow(workflow_name, workflow_info)
deployment = await client.read_deployment_by_name(
f"{workflow_name}/{workflow_name}-deployment"
)
except Exception as redeploy_exc:
# Enhanced error with context
error_context = {
"workflow_name": workflow_name,
"error_type": type(e).__name__,
"error_message": str(e),
"redeploy_error": str(redeploy_exc),
"available_deployments": list(self.deployments.keys()),
}
enhanced_error = ValueError(
f"Deployment not found and redeploy failed for workflow '{workflow_name}': {e} | Context: {error_context}"
)
enhanced_error.context = error_context
raise enhanced_error
# Determine the Docker Compose network name
# Hardcoded to 'fuzzforge' to avoid directory name dependencies
docker_network = "fuzzforge_default"
# Build volume mounts
# Add toolbox volume mount for workflow code access
backend_toolbox_path = "/app/toolbox" # Path in backend container
# Hardcoded volume names
prefect_storage_volume = "fuzzforge_prefect_storage"
toolbox_code_volume = "fuzzforge_toolbox_code"
volumes = [
f"{target_path}:/workspace:{volume_mode}",
f"{prefect_storage_volume}:/prefect-storage", # Shared storage for results
f"{toolbox_code_volume}:/opt/prefect/toolbox:ro" # Mount workflow code
]
# Add additional volumes if provided
if additional_volumes:
for volume in additional_volumes:
volume_spec = f"{volume.host_path}:{volume.container_path}:{volume.mode}"
volumes.append(volume_spec)
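Each entry in the mount list follows Docker's `host:container:mode` volume syntax. A sketch of the assembly logic above as a pure function (here `additional` is simplified to plain tuples rather than the volume objects the manager receives):

```python
def build_volume_specs(target_path: str, volume_mode: str, additional=()) -> list:
    """Assemble Docker volume specs: the target workspace mount, the two
    fixed FuzzForge volumes, then any extra (host, container, mode) mounts."""
    volumes = [
        f"{target_path}:/workspace:{volume_mode}",
        "fuzzforge_prefect_storage:/prefect-storage",       # shared results
        "fuzzforge_toolbox_code:/opt/prefect/toolbox:ro",   # workflow code
    ]
    volumes += [f"{h}:{c}:{m}" for h, c, m in additional]
    return volumes
```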
# Build environment variables
env_vars = {
"PREFECT_API_URL": "http://prefect-server:4200/api", # Use internal network hostname
"PREFECT_LOGGING_LEVEL": "INFO",
"PREFECT_LOCAL_STORAGE_PATH": "/prefect-storage", # Use shared storage
"PREFECT_RESULTS_PERSIST_BY_DEFAULT": "true", # Enable result persistence
"PREFECT_DEFAULT_RESULT_STORAGE_BLOCK": "local-file-system/fuzzforge-results", # Use our storage block
"WORKSPACE_PATH": "/workspace",
"VOLUME_MODE": volume_mode,
"WORKFLOW_NAME": workflow_name
}
# Add additional volume paths to environment for easy access
if additional_volumes:
for i, volume in enumerate(additional_volumes):
env_vars[f"ADDITIONAL_VOLUME_{i}_PATH"] = volume.container_path
# Determine which image to use based on workflow configuration
workflow_info = self.workflows[workflow_name]
has_custom_dockerfile = workflow_info.has_docker and workflow_info.dockerfile.exists()
# Use pull context for worker to pull from registry
registry_url = get_registry_url(context="pull")
workflow_image = f"{registry_url}/fuzzforge/{workflow_name}:latest" if has_custom_dockerfile else "prefecthq/prefect:3-python3.11"
logger.debug(f"Worker will pull image: {workflow_image} (Registry: {registry_url})")
# Configure job variables with volume mounting and network access
job_variables = {
# Use custom image if available, otherwise base Prefect image
"image": workflow_image,
"volumes": volumes,
"networks": [docker_network], # Connect to Docker Compose network
"env": {
**env_vars,
"PYTHONPATH": "/opt/prefect/toolbox:/opt/prefect/toolbox/workflows",
"WORKFLOW_NAME": workflow_name
}
}
# Apply resource requirements from workflow metadata and user overrides
workflow_resource_requirements = self._extract_resource_requirements(workflow_info)
final_resource_config = {}
# Start with workflow requirements as base
if workflow_resource_requirements:
final_resource_config.update(workflow_resource_requirements)
# Apply user-provided resource limits (overrides workflow defaults)
if resource_limits:
user_resource_config = {}
if resource_limits.get("cpu_limit"):
user_resource_config["cpus"] = resource_limits["cpu_limit"]
if resource_limits.get("memory_limit"):
user_resource_config["memory"] = resource_limits["memory_limit"]
# Note: cpu_request and memory_request are not directly supported by Docker
# but could be used for Kubernetes in the future
# User overrides take precedence
final_resource_config.update(user_resource_config)
# Apply final resource configuration
if final_resource_config:
job_variables["resources"] = final_resource_config
logger.info(f"Applied resource limits: {final_resource_config}")
# Merge parameters with defaults from metadata
default_params = workflow_info.metadata.get("default_parameters", {})
final_params = {**default_params, **(parameters or {})}
# Set flow parameters that match the flow signature
final_params["target_path"] = "/workspace" # Container path where volume is mounted
final_params["volume_mode"] = volume_mode
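The merge order above encodes a precedence rule: metadata defaults first, user parameters override them, and the container mount contract (`target_path`, `volume_mode`) is applied last so it can never be overridden. A sketch (function name is ours):

```python
def merge_parameters(defaults: dict, user=None, volume_mode: str = "ro") -> dict:
    """Merge workflow parameters with later entries winning: defaults,
    then user overrides, then the fixed container-mount values."""
    final = {**defaults, **(user or {})}
    final["target_path"] = "/workspace"  # where the volume is mounted
    final["volume_mode"] = volume_mode
    return final
```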
# Create and submit the flow run
# Pass job_variables to ensure network, volumes, and environment are configured
logger.info(f"Submitting flow with job_variables: {job_variables}")
logger.info(f"Submitting flow with parameters: {final_params}")
# Prepare flow run creation parameters
flow_run_params = {
"deployment_id": deployment.id,
"parameters": final_params,
"job_variables": job_variables
}
# Note: Timeout is handled through workflow-level configuration
# Additional timeout configuration can be added to deployment metadata if needed
flow_run = await client.create_flow_run_from_deployment(**flow_run_params)
logger.info(
f"Submitted workflow '{workflow_name}' with run_id: {flow_run.id}, "
f"target: {target_path}, mode: {volume_mode}"
)
return flow_run
async def get_flow_run_findings(self, run_id: str) -> Dict[str, Any]:
"""
Retrieve findings from a completed flow run.
Args:
run_id: The flow run ID
Returns:
Dictionary containing SARIF-formatted findings
Raises:
ValueError: If run not completed or not found
"""
async with get_client() as client:
flow_run = await client.read_flow_run(run_id)
if not flow_run.state.is_completed():
raise ValueError(
f"Flow run {run_id} not completed. Current status: {flow_run.state.name}"
)
# Get the findings from the flow run result
try:
findings = await flow_run.state.result()
return findings
except Exception as e:
logger.error(f"Failed to retrieve findings for run {run_id}: {e}")
raise ValueError(f"Failed to retrieve findings: {e}")
async def get_flow_run_status(self, run_id: str) -> Dict[str, Any]:
"""
Get the current status of a flow run.
Args:
run_id: The flow run ID
Returns:
Dictionary with status information
"""
async with get_client() as client:
flow_run = await client.read_flow_run(run_id)
return {
"run_id": str(flow_run.id),
"workflow": flow_run.deployment_id,
"status": flow_run.state.name,
"is_completed": flow_run.state.is_completed(),
"is_failed": flow_run.state.is_failed(),
"is_running": flow_run.state.is_running(),
"created_at": flow_run.created,
"updated_at": flow_run.updated
}
def _validate_target_path(self, target_path: str) -> None:
"""
Validate target path for security before mounting as volume.
Args:
target_path: Host path to validate
Raises:
ValueError: If path is not allowed for security reasons
"""
target = Path(target_path)
# Path must be absolute
if not target.is_absolute():
raise ValueError(f"Target path must be absolute: {target_path}")
# Resolve path to handle symlinks and relative components
try:
resolved_path = target.resolve()
except (OSError, RuntimeError) as e:
raise ValueError(f"Cannot resolve target path: {target_path} - {e}")
resolved_str = str(resolved_path)
# Check against forbidden paths first (more restrictive)
for forbidden in self.forbidden_paths:
if resolved_str.startswith(forbidden):
raise ValueError(
f"Access denied: Path '{target_path}' resolves to forbidden directory '{forbidden}'. "
f"This path contains sensitive system files and cannot be mounted."
)
# Check if path starts with any allowed base path
path_allowed = False
for allowed in self.allowed_base_paths:
if resolved_str.startswith(allowed):
path_allowed = True
break
if not path_allowed:
allowed_list = ", ".join(self.allowed_base_paths)
raise ValueError(
f"Access denied: Path '{target_path}' is not in allowed directories. "
f"Allowed base paths: {allowed_list}"
)
# Additional security checks
if resolved_str == "/":
raise ValueError("Cannot mount root filesystem")
# Warn if path doesn't exist (but don't block - it might be created later)
if not resolved_path.exists():
logger.warning(f"Target path does not exist: {target_path}")
logger.info(f"Path validation passed for: {target_path} -> {resolved_str}")
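One caveat with the plain `startswith` prefix checks above: they match on string prefixes, not path components, so with `/tmp` allowed, `/tmp-evil/work` would also pass. A boundary-aware variant (an alternative sketch, not what the method currently does) can use `os.path.commonpath`:

```python
import os.path

def is_under(path: str, base: str) -> bool:
    """Component-aware containment check: unlike str.startswith, this
    rejects "/tmp-evil/work" for base "/tmp"."""
    try:
        return os.path.commonpath([path, base]) == base
    except ValueError:
        # Raised for mixed absolute/relative paths (or different drives
        # on Windows): treat as not contained.
        return False
```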


@@ -1,402 +0,0 @@
"""
Setup utilities for Prefect infrastructure
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import logging
from prefect import get_client
from prefect.client.schemas.actions import WorkPoolCreate
from prefect.client.schemas.objects import WorkPool
from .prefect_manager import get_registry_url
logger = logging.getLogger(__name__)
async def setup_docker_pool():
"""
Create or update the Docker work pool for container execution.
This work pool is configured to:
- Connect to the local Docker daemon
- Support volume mounting at runtime
- Clean up containers after execution
- Use bridge networking by default
"""
import os
async with get_client() as client:
pool_name = "docker-pool"
# Add force recreation flag for debugging fresh install issues
force_recreate = os.getenv('FORCE_RECREATE_WORK_POOL', 'false').lower() == 'true'
debug_setup = os.getenv('DEBUG_WORK_POOL_SETUP', 'false').lower() == 'true'
if force_recreate:
logger.warning("FORCE_RECREATE_WORK_POOL=true - Will recreate work pool regardless of existing configuration")
if debug_setup:
logger.warning("DEBUG_WORK_POOL_SETUP=true - Enhanced logging enabled")
# Temporarily set logging level to DEBUG for this function
original_level = logger.level
logger.setLevel(logging.DEBUG)
try:
# Check if pool already exists and supports custom images
existing_pools = await client.read_work_pools()
existing_pool = None
for pool in existing_pools:
if pool.name == pool_name:
existing_pool = pool
break
if existing_pool and not force_recreate:
logger.info(f"Found existing work pool '{pool_name}' - validating configuration...")
# Check if the existing pool has the correct configuration
base_template = existing_pool.base_job_template or {}
logger.debug(f"Base template keys: {list(base_template.keys())}")
job_config = base_template.get("job_configuration", {})
logger.debug(f"Job config keys: {list(job_config.keys())}")
image_config = job_config.get("image", "")
has_image_variable = "{{ image }}" in str(image_config)
logger.debug(f"Image config: '{image_config}' -> has_image_variable: {has_image_variable}")
# Check if volume defaults include toolbox mount
variables = base_template.get("variables", {})
properties = variables.get("properties", {})
volume_config = properties.get("volumes", {})
volume_defaults = volume_config.get("default", [])
has_toolbox_volume = any("toolbox_code" in str(vol) for vol in volume_defaults) if volume_defaults else False
logger.debug(f"Volume defaults: {volume_defaults}")
logger.debug(f"Has toolbox volume: {has_toolbox_volume}")
# Check if environment defaults include required settings
env_config = properties.get("env", {})
env_defaults = env_config.get("default", {})
has_api_url = "PREFECT_API_URL" in env_defaults
has_storage_path = "PREFECT_LOCAL_STORAGE_PATH" in env_defaults
has_results_persist = "PREFECT_RESULTS_PERSIST_BY_DEFAULT" in env_defaults
has_required_env = has_api_url and has_storage_path and has_results_persist
logger.debug(f"Environment defaults: {env_defaults}")
logger.debug(f"Has API URL: {has_api_url}, Has storage path: {has_storage_path}, Has results persist: {has_results_persist}")
logger.debug(f"Has required env: {has_required_env}")
# Log the full validation result
logger.info(f"Work pool validation - Image: {has_image_variable}, Toolbox: {has_toolbox_volume}, Environment: {has_required_env}")
if has_image_variable and has_toolbox_volume and has_required_env:
logger.info(f"Docker work pool '{pool_name}' already exists with correct configuration")
return
else:
reasons = []
if not has_image_variable:
reasons.append("missing image template")
if not has_toolbox_volume:
reasons.append("missing toolbox volume mount")
if not has_required_env:
if not has_api_url:
reasons.append("missing PREFECT_API_URL")
if not has_storage_path:
reasons.append("missing PREFECT_LOCAL_STORAGE_PATH")
if not has_results_persist:
reasons.append("missing PREFECT_RESULTS_PERSIST_BY_DEFAULT")
logger.warning(f"Docker work pool '{pool_name}' exists but lacks: {', '.join(reasons)}. Recreating...")
# Delete the old pool and recreate it
try:
await client.delete_work_pool(pool_name)
logger.info(f"Deleted old work pool '{pool_name}'")
except Exception as e:
logger.warning(f"Failed to delete old work pool: {e}")
elif force_recreate and existing_pool:
logger.warning(f"Force recreation enabled - deleting existing work pool '{pool_name}'")
try:
await client.delete_work_pool(pool_name)
logger.info("Deleted existing work pool for force recreation")
except Exception as e:
logger.warning(f"Failed to delete work pool for force recreation: {e}")
logger.info(f"Creating Docker work pool '{pool_name}' with custom image support...")
# Create the work pool with proper Docker configuration
work_pool = WorkPoolCreate(
name=pool_name,
type="docker",
description="Docker work pool for FuzzForge workflows with custom image support",
base_job_template={
"job_configuration": {
"image": "{{ image }}", # Template variable for custom images
"volumes": "{{ volumes }}", # List of volume mounts
"env": "{{ env }}", # Environment variables
"networks": "{{ networks }}", # Docker networks
"stream_output": True,
"auto_remove": True,
"privileged": False,
"network_mode": None, # Use networks instead
"labels": {},
"command": None # Let the image's CMD/ENTRYPOINT run
},
"variables": {
"type": "object",
"properties": {
"image": {
"type": "string",
"title": "Docker Image",
"default": "prefecthq/prefect:3-python3.11",
"description": "Docker image for the flow run"
},
"volumes": {
"type": "array",
"title": "Volume Mounts",
"default": [
"fuzzforge_prefect_storage:/prefect-storage",
"fuzzforge_toolbox_code:/opt/prefect/toolbox:ro"
],
"description": "Volume mounts in format 'host:container:mode'",
"items": {
"type": "string"
}
},
"networks": {
"type": "array",
"title": "Docker Networks",
"default": ["fuzzforge_default"],
"description": "Docker networks to connect container to",
"items": {
"type": "string"
}
},
"env": {
"type": "object",
"title": "Environment Variables",
"default": {
"PREFECT_API_URL": "http://prefect-server:4200/api",
"PREFECT_LOCAL_STORAGE_PATH": "/prefect-storage",
"PREFECT_RESULTS_PERSIST_BY_DEFAULT": "true"
},
"description": "Environment variables for the container",
"additionalProperties": {
"type": "string"
}
}
}
}
}
)
await client.create_work_pool(work_pool)
logger.info(f"Created Docker work pool '{pool_name}'")
except Exception as e:
logger.error(f"Failed to setup Docker work pool: {e}")
raise
finally:
# Restore original logging level if debug mode was enabled
if debug_setup and 'original_level' in locals():
logger.setLevel(original_level)
def get_actual_compose_project_name():
"""
Return the hardcoded compose project name for FuzzForge.
Always returns 'fuzzforge' as per system requirements.
"""
logger.info("Using hardcoded compose project name: fuzzforge")
return "fuzzforge"
async def setup_result_storage():
"""
Create or update Prefect result storage block for findings persistence.
This sets up a LocalFileSystem storage block pointing to the shared
/prefect-storage volume for result persistence.
"""
from prefect.filesystems import LocalFileSystem
storage_name = "fuzzforge-results"
try:
# Create the storage block, overwrite if it exists
logger.info(f"Setting up storage block '{storage_name}'...")
storage = LocalFileSystem(basepath="/prefect-storage")
block_doc_id = await storage.save(name=storage_name, overwrite=True)
logger.info(f"Storage block '{storage_name}' configured successfully")
return str(block_doc_id)
except Exception as e:
logger.error(f"Failed to setup result storage: {e}")
# Don't raise the exception - continue without storage block
logger.warning("Continuing without result storage block - findings may not persist")
return None
async def validate_docker_connection():
"""
Validate that Docker is accessible and running.
Note: In containerized deployments with Docker socket proxy,
the backend doesn't need direct Docker access.
Raises:
RuntimeError: If Docker is not accessible
"""
import os
# Skip Docker validation if running in container without socket access
if os.path.exists("/.dockerenv") and not os.path.exists("/var/run/docker.sock"):
logger.info("Running in container without Docker socket - skipping Docker validation")
return
try:
import docker
client = docker.from_env()
client.ping()
logger.info("Docker connection validated")
except Exception as e:
logger.error(f"Docker is not accessible: {e}")
raise RuntimeError(
"Docker is not running or not accessible. "
"Please ensure Docker is installed and running."
)
async def validate_registry_connectivity(registry_url: str = None):
"""
Validate that the Docker registry is accessible.
Args:
registry_url: URL of the Docker registry to validate (auto-detected if None)
Raises:
RuntimeError: If registry is not accessible
"""
# Resolve a reachable test URL from within this process
import os
if registry_url is None:
# If not specified, prefer the internal service name in containers, the host port on the host
registry_url = "registry:5000" if os.path.exists('/.dockerenv') else "localhost:5001"
# If we're running inside a container and asked to probe localhost:PORT,
# the probe would hit the container, not the host. Use host.docker.internal instead.
try:
host_part, port_part = registry_url.split(":", 1)
except ValueError:
host_part, port_part = registry_url, "80"
if os.path.exists('/.dockerenv') and host_part in ("localhost", "127.0.0.1"):
test_host = "host.docker.internal"
else:
test_host = host_part
test_url = f"http://{test_host}:{port_part}/v2/"
import aiohttp
import asyncio
logger.info(f"Validating registry connectivity to {registry_url}...")
try:
async with aiohttp.ClientSession(timeout=aiohttp.ClientTimeout(total=10)) as session:
async with session.get(test_url) as response:
if response.status == 200:
logger.info(f"Registry at {registry_url} is accessible (tested via {test_host})")
return
else:
raise RuntimeError(f"Registry returned status {response.status}")
except asyncio.TimeoutError:
raise RuntimeError(f"Registry at {registry_url} is not responding (timeout)")
except aiohttp.ClientError as e:
raise RuntimeError(f"Registry at {registry_url} is not accessible: {e}")
except Exception as e:
raise RuntimeError(f"Failed to validate registry connectivity: {e}")
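The host-rewrite logic above can be captured in a small pure function that builds the probe URL against the Docker Registry v2 endpoint; a sketch with the container check passed in as a parameter so it is testable (the function name is ours):

```python
def registry_probe_url(registry_url: str, in_container: bool) -> str:
    """Build the /v2/ probe URL, rewriting localhost to
    host.docker.internal when probing from inside a container
    (where localhost would resolve to the container itself)."""
    try:
        host, port = registry_url.split(":", 1)
    except ValueError:
        # No explicit port in the URL: assume plain HTTP.
        host, port = registry_url, "80"
    if in_container and host in ("localhost", "127.0.0.1"):
        host = "host.docker.internal"
    return f"http://{host}:{port}/v2/"
```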
async def validate_docker_network(network_name: str):
"""
Validate that the specified Docker network exists.
Args:
network_name: Name of the Docker network to validate
Raises:
RuntimeError: If network doesn't exist
"""
import os
# Skip network validation if running in container without Docker socket
if os.path.exists("/.dockerenv") and not os.path.exists("/var/run/docker.sock"):
logger.info("Running in container without Docker socket - skipping network validation")
return
try:
import docker
client = docker.from_env()
# List all networks
networks = client.networks.list(names=[network_name])
if not networks:
# Try to find networks with similar names
all_networks = client.networks.list()
similar_networks = [n.name for n in all_networks if "fuzzforge" in n.name.lower()]
error_msg = f"Docker network '{network_name}' not found."
if similar_networks:
error_msg += f" Available networks: {similar_networks}"
else:
error_msg += " Please ensure Docker Compose is running."
raise RuntimeError(error_msg)
logger.info(f"Docker network '{network_name}' validated")
except Exception as e:
if isinstance(e, RuntimeError):
raise
logger.error(f"Network validation failed: {e}")
raise RuntimeError(f"Failed to validate Docker network: {e}")
async def validate_infrastructure():
"""
Validate all required infrastructure components.
This should be called during startup to ensure everything is ready.
"""
logger.info("Validating infrastructure...")
# Validate Docker connection
await validate_docker_connection()
# Validate registry connectivity for custom image building
await validate_registry_connectivity()
# Validate network (hardcoded to avoid directory name dependencies)
docker_network = "fuzzforge_default"
try:
await validate_docker_network(docker_network)
except RuntimeError as e:
logger.warning(f"Network validation failed: {e}")
logger.warning("Workflows may not be able to connect to Prefect services")
logger.info("Infrastructure validation completed")


@@ -1,459 +0,0 @@
"""
Workflow Discovery - Registry-based discovery and loading of workflows
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import logging
import yaml
from pathlib import Path
from typing import Dict, Optional, Any, Callable
from pydantic import BaseModel, Field, ConfigDict
logger = logging.getLogger(__name__)
class WorkflowInfo(BaseModel):
"""Information about a discovered workflow"""
name: str = Field(..., description="Workflow name")
path: Path = Field(..., description="Path to workflow directory")
workflow_file: Path = Field(..., description="Path to workflow.py file")
dockerfile: Path = Field(..., description="Path to Dockerfile")
has_docker: bool = Field(..., description="Whether workflow has custom Dockerfile")
metadata: Dict[str, Any] = Field(..., description="Workflow metadata from YAML")
flow_function_name: str = Field(default="main_flow", description="Name of the flow function")
model_config = ConfigDict(arbitrary_types_allowed=True)
class WorkflowDiscovery:
"""
Discovers workflows from the filesystem and validates them against the registry.
This system:
1. Scans for workflows with metadata.yaml files
2. Cross-references them with the manual registry
3. Provides registry-based flow functions for deployment
Workflows must have:
- workflow.py: Contains the Prefect flow
- metadata.yaml: Mandatory metadata file
- Entry in toolbox/workflows/registry.py: Manual registration
- Dockerfile (optional): Custom container definition
- requirements.txt (optional): Python dependencies
"""
def __init__(self, workflows_dir: Path):
"""
Initialize workflow discovery.
Args:
workflows_dir: Path to the workflows directory
"""
self.workflows_dir = workflows_dir
if not self.workflows_dir.exists():
self.workflows_dir.mkdir(parents=True, exist_ok=True)
logger.info(f"Created workflows directory: {self.workflows_dir}")
# Import registry - this validates it on import
try:
from toolbox.workflows.registry import WORKFLOW_REGISTRY, list_registered_workflows
self.registry = WORKFLOW_REGISTRY
logger.info(f"Loaded workflow registry with {len(self.registry)} registered workflows")
except ImportError as e:
logger.error(f"Failed to import workflow registry: {e}")
self.registry = {}
except Exception as e:
logger.error(f"Registry validation failed: {e}")
self.registry = {}
# Cache for discovered workflows
self._workflow_cache: Optional[Dict[str, WorkflowInfo]] = None
self._cache_timestamp: Optional[float] = None
self._cache_ttl = 60.0 # Cache TTL in seconds
async def discover_workflows(self) -> Dict[str, WorkflowInfo]:
"""
Discover workflows by cross-referencing filesystem with registry.
Uses caching to avoid frequent filesystem scans.
Returns:
Dictionary mapping workflow names to their information
"""
# Check cache validity
import time
current_time = time.time()
if (self._workflow_cache is not None and
self._cache_timestamp is not None and
(current_time - self._cache_timestamp) < self._cache_ttl):
# Return cached results
logger.debug(f"Returning cached workflow discovery ({len(self._workflow_cache)} workflows)")
return self._workflow_cache
workflows = {}
discovered_dirs = set()
registry_names = set(self.registry.keys())
if not self.workflows_dir.exists():
logger.warning(f"Workflows directory does not exist: {self.workflows_dir}")
return workflows
# Recursively scan all directories and subdirectories
await self._scan_directory_recursive(self.workflows_dir, workflows, discovered_dirs)
# Check for registry entries without corresponding directories
missing_dirs = registry_names - discovered_dirs
if missing_dirs:
logger.warning(
f"Registry contains workflows without filesystem directories: {missing_dirs}. "
f"These workflows cannot be deployed."
)
logger.info(
f"Discovery complete: {len(workflows)} workflows ready for deployment, "
f"{len(missing_dirs)} registry entries missing directories, "
f"{len(discovered_dirs - registry_names)} filesystem workflows not registered"
)
# Update cache
self._workflow_cache = workflows
self._cache_timestamp = current_time
return workflows
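The TTL-based caching used by `discover_workflows` can be sketched in isolation (a minimal sketch; the `TTLCache` helper name is hypothetical and not part of this module):

```python
import time
from typing import Any, Callable, Optional

class TTLCache:
    """Cache a single computed value for a fixed number of seconds."""

    def __init__(self, ttl: float):
        self.ttl = ttl
        self._value: Optional[Any] = None
        self._timestamp: Optional[float] = None

    def get_or_compute(self, compute: Callable[[], Any]) -> Any:
        now = time.time()
        if (self._value is not None and self._timestamp is not None
                and (now - self._timestamp) < self.ttl):
            return self._value  # still fresh: skip the expensive recompute
        self._value = compute()
        self._timestamp = now
        return self._value

    def invalidate(self) -> None:
        """Force the next get_or_compute() to recompute."""
        self._value = None
        self._timestamp = None
```

`discover_workflows` follows the same shape, with the filesystem scan playing the role of `compute` and `invalidate_cache()` clearing both fields.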
async def _scan_directory_recursive(self, directory: Path, workflows: Dict[str, WorkflowInfo], discovered_dirs: set):
"""
Recursively scan directory for workflows.
Args:
directory: Directory to scan
workflows: Dictionary to populate with discovered workflows
discovered_dirs: Set to track discovered workflow names
"""
for item in directory.iterdir():
if not item.is_dir():
continue
if item.name.startswith('_') or item.name.startswith('.'):
continue # Skip hidden or private directories
# Check if this directory contains workflow files (workflow.py and metadata.yaml)
workflow_file = item / "workflow.py"
metadata_file = item / "metadata.yaml"
if workflow_file.exists() and metadata_file.exists():
# This is a workflow directory
workflow_name = item.name
discovered_dirs.add(workflow_name)
# Only process workflows that are in the registry
if workflow_name not in self.registry:
logger.warning(
f"Workflow '{workflow_name}' found in filesystem but not in registry. "
f"Add it to toolbox/workflows/registry.py to enable deployment."
)
continue
try:
workflow_info = await self._load_workflow(item)
if workflow_info:
workflows[workflow_info.name] = workflow_info
logger.info(f"Discovered and registered workflow: {workflow_info.name}")
except Exception as e:
logger.error(f"Failed to load workflow from {item}: {e}")
else:
# This is a category directory, recurse into it
await self._scan_directory_recursive(item, workflows, discovered_dirs)
async def _load_workflow(self, workflow_dir: Path) -> Optional[WorkflowInfo]:
"""
Load and validate a single workflow.
Args:
workflow_dir: Path to the workflow directory
Returns:
WorkflowInfo if valid, None otherwise
"""
workflow_name = workflow_dir.name
# Check for mandatory files
workflow_file = workflow_dir / "workflow.py"
metadata_file = workflow_dir / "metadata.yaml"
if not workflow_file.exists():
logger.warning(f"Workflow {workflow_name} missing workflow.py")
return None
if not metadata_file.exists():
logger.error(f"Workflow {workflow_name} missing mandatory metadata.yaml")
return None
# Load and validate metadata
try:
metadata = self._load_metadata(metadata_file)
if not self._validate_metadata(metadata, workflow_name):
return None
except Exception as e:
logger.error(f"Failed to load metadata for {workflow_name}: {e}")
return None
# Check for mandatory Dockerfile
dockerfile = workflow_dir / "Dockerfile"
if not dockerfile.exists():
logger.error(f"Workflow {workflow_name} missing mandatory Dockerfile")
return None
has_docker = True # Always True since Dockerfile is mandatory
# Get flow function name from metadata or use default
flow_function_name = metadata.get("flow_function", "main_flow")
return WorkflowInfo(
name=workflow_name,
path=workflow_dir,
workflow_file=workflow_file,
dockerfile=dockerfile,
has_docker=has_docker,
metadata=metadata,
flow_function_name=flow_function_name
)
def _load_metadata(self, metadata_file: Path) -> Dict[str, Any]:
"""
Load metadata from YAML file.
Args:
metadata_file: Path to metadata.yaml
Returns:
Dictionary containing metadata
"""
with open(metadata_file, 'r') as f:
metadata = yaml.safe_load(f)
if metadata is None:
raise ValueError("Empty metadata file")
return metadata
def _validate_metadata(self, metadata: Dict[str, Any], workflow_name: str) -> bool:
"""
Validate that metadata contains all required fields.
Args:
metadata: Metadata dictionary
workflow_name: Name of the workflow for logging
Returns:
True if valid, False otherwise
"""
required_fields = ["name", "version", "description", "author", "category", "parameters", "requirements"]
missing_fields = []
for field in required_fields:
if field not in metadata:
missing_fields.append(field)
if missing_fields:
logger.error(
f"Workflow {workflow_name} metadata missing required fields: {missing_fields}"
)
return False
# Validate version format (semantic versioning)
version = metadata.get("version", "")
if not self._is_valid_version(version):
logger.error(f"Workflow {workflow_name} has invalid version format: {version}")
return False
# Validate parameters structure
parameters = metadata.get("parameters", {})
if not isinstance(parameters, dict):
logger.error(f"Workflow {workflow_name} parameters must be a dictionary")
return False
return True
def _is_valid_version(self, version: str) -> bool:
"""
Check if version follows semantic versioning (x.y.z).
Args:
version: Version string
Returns:
True if valid semantic version
"""
try:
parts = version.split('.')
if len(parts) != 3:
return False
for part in parts:
if int(part) < 0:  # Each part must be a non-negative integer
return False
return True
except (ValueError, AttributeError):
return False
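An equivalent check can be written with a single regular expression (an illustrative alternative, not the implementation the class uses):

```python
import re

# Three dot-separated non-negative integers, e.g. "1.0.3"
SEMVER_RE = re.compile(r"^\d+\.\d+\.\d+$")

def is_valid_version(version: str) -> bool:
    """Return True if version matches the x.y.z form accepted above."""
    return bool(SEMVER_RE.match(version))
```

The `\d+` groups reject signs and non-digits, so strings such as `"1.0"`, `"1.0.x"`, or `"-1.0.0"` fail the match.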
def invalidate_cache(self) -> None:
"""
Invalidate the workflow discovery cache.
Useful when workflows are added or modified.
"""
self._workflow_cache = None
self._cache_timestamp = None
logger.debug("Workflow discovery cache invalidated")
def get_flow_function(self, workflow_name: str) -> Optional[Callable]:
"""
Get the flow function from the registry.
Args:
workflow_name: Name of the workflow
Returns:
The flow function if found in registry, None otherwise
"""
if workflow_name not in self.registry:
logger.error(
f"Workflow '{workflow_name}' not found in registry. "
f"Available workflows: {list(self.registry.keys())}"
)
return None
try:
from toolbox.workflows.registry import get_workflow_flow
flow_func = get_workflow_flow(workflow_name)
logger.debug(f"Retrieved flow function for '{workflow_name}' from registry")
return flow_func
except Exception as e:
logger.error(f"Failed to get flow function for '{workflow_name}': {e}")
return None
def get_registry_info(self, workflow_name: str) -> Optional[Dict[str, Any]]:
"""
Get registry information for a workflow.
Args:
workflow_name: Name of the workflow
Returns:
Registry information if found, None otherwise
"""
if workflow_name not in self.registry:
return None
try:
from toolbox.workflows.registry import get_workflow_info
return get_workflow_info(workflow_name)
except Exception as e:
logger.error(f"Failed to get registry info for '{workflow_name}': {e}")
return None
@staticmethod
def get_metadata_schema() -> Dict[str, Any]:
"""
Get the JSON schema for workflow metadata.
Returns:
JSON schema dictionary
"""
return {
"type": "object",
"required": ["name", "version", "description", "author", "category", "parameters", "requirements"],
"properties": {
"name": {
"type": "string",
"description": "Workflow name"
},
"version": {
"type": "string",
"pattern": "^\\d+\\.\\d+\\.\\d+$",
"description": "Semantic version (x.y.z)"
},
"description": {
"type": "string",
"description": "Workflow description"
},
"author": {
"type": "string",
"description": "Workflow author"
},
"category": {
"type": "string",
"enum": ["comprehensive", "specialized", "fuzzing", "focused"],
"description": "Workflow category"
},
"tags": {
"type": "array",
"items": {"type": "string"},
"description": "Workflow tags for categorization"
},
"requirements": {
"type": "object",
"required": ["tools", "resources"],
"properties": {
"tools": {
"type": "array",
"items": {"type": "string"},
"description": "Required security tools"
},
"resources": {
"type": "object",
"required": ["memory", "cpu", "timeout"],
"properties": {
"memory": {
"type": "string",
"pattern": "^\\d+[GMK]i$",
"description": "Memory limit (e.g., 1Gi, 512Mi)"
},
"cpu": {
"type": "string",
"pattern": "^\\d+m?$",
"description": "CPU limit (e.g., 1000m, 2)"
},
"timeout": {
"type": "integer",
"minimum": 60,
"maximum": 7200,
"description": "Workflow timeout in seconds"
}
}
}
}
},
"parameters": {
"type": "object",
"description": "Workflow parameters schema"
},
"default_parameters": {
"type": "object",
"description": "Default parameter values"
},
"required_modules": {
"type": "array",
"items": {"type": "string"},
"description": "Required module names"
},
"supported_volume_modes": {
"type": "array",
"items": {"enum": ["ro", "rw"]},
"default": ["ro", "rw"],
"description": "Supported volume mount modes"
},
"flow_function": {
"type": "string",
"default": "main_flow",
"description": "Name of the flow function in workflow.py"
}
}
}
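For reference, a minimal metadata document that satisfies this schema could look like the following (all field values are illustrative, including the hypothetical "secret_scan" workflow); the sketch checks the required fields and version pattern in plain Python rather than with a schema validator:

```python
import re

# Illustrative metadata for a hypothetical "secret_scan" workflow
metadata = {
    "name": "secret_scan",
    "version": "1.0.0",
    "description": "Scan a repository for hard-coded secrets",
    "author": "FuzzingLabs",
    "category": "specialized",
    "parameters": {"type": "object"},
    "requirements": {
        "tools": ["trufflehog"],
        "resources": {"memory": "1Gi", "cpu": "1000m", "timeout": 600},
    },
}

REQUIRED = ["name", "version", "description", "author",
            "category", "parameters", "requirements"]

# Mirror the checks performed by _validate_metadata()
missing = [field for field in REQUIRED if field not in metadata]
version_ok = bool(re.match(r"^\d+\.\d+\.\d+$", metadata["version"]))
```

In the real pipeline this dictionary would live in the workflow's `metadata.yaml` and be loaded via `yaml.safe_load`.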
@@ -1,864 +0,0 @@
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import asyncio
import logging
import os
from uuid import UUID
from contextlib import AsyncExitStack, asynccontextmanager, suppress
from typing import Any, Dict, Optional, List
import uvicorn
from fastapi import FastAPI
from starlette.applications import Starlette
from starlette.routing import Mount
from fastmcp.server.http import create_sse_app
from src.core.prefect_manager import PrefectManager
from src.core.setup import setup_docker_pool, setup_result_storage, validate_infrastructure
from src.core.workflow_discovery import WorkflowDiscovery
from src.api import workflows, runs, fuzzing
from src.services.prefect_stats_monitor import prefect_stats_monitor
from fastmcp import FastMCP
from prefect.client.orchestration import get_client
from prefect.client.schemas.filters import (
FlowRunFilter,
FlowRunFilterDeploymentId,
FlowRunFilterState,
FlowRunFilterStateType,
)
from prefect.client.schemas.sorting import FlowRunSort
from prefect.states import StateType
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
prefect_mgr = PrefectManager()
class PrefectBootstrapState:
"""Tracks Prefect initialization progress for API and MCP consumers."""
def __init__(self) -> None:
self.ready: bool = False
self.status: str = "not_started"
self.last_error: Optional[str] = None
self.task_running: bool = False
def as_dict(self) -> Dict[str, Any]:
return {
"ready": self.ready,
"status": self.status,
"last_error": self.last_error,
"task_running": self.task_running,
}
prefect_bootstrap_state = PrefectBootstrapState()
# Configure retry strategy for bootstrapping Prefect + infrastructure
STARTUP_RETRY_SECONDS = max(1, int(os.getenv("FUZZFORGE_STARTUP_RETRY_SECONDS", "5")))
STARTUP_RETRY_MAX_SECONDS = max(
STARTUP_RETRY_SECONDS,
int(os.getenv("FUZZFORGE_STARTUP_RETRY_MAX_SECONDS", "60")),
)
prefect_bootstrap_task: Optional[asyncio.Task] = None
# ---------------------------------------------------------------------------
# FastAPI application (REST API remains unchanged)
# ---------------------------------------------------------------------------
app = FastAPI(
title="FuzzForge API",
description="Security testing workflow orchestration API with fuzzing support",
version="0.6.0",
)
app.include_router(workflows.router)
app.include_router(runs.router)
app.include_router(fuzzing.router)
def get_prefect_status() -> Dict[str, Any]:
"""Return a snapshot of Prefect bootstrap state for diagnostics."""
status = prefect_bootstrap_state.as_dict()
status["workflows_loaded"] = len(prefect_mgr.workflows)
status["deployments_tracked"] = len(prefect_mgr.deployments)
status["bootstrap_task_running"] = (
prefect_bootstrap_task is not None and not prefect_bootstrap_task.done()
)
return status
def _prefect_not_ready_status() -> Optional[Dict[str, Any]]:
"""Return status details if Prefect is not ready yet."""
status = get_prefect_status()
if status.get("ready"):
return None
return status
@app.get("/")
async def root() -> Dict[str, Any]:
status = get_prefect_status()
return {
"name": "FuzzForge API",
"version": "0.6.0",
"status": "ready" if status.get("ready") else "initializing",
"workflows_loaded": status.get("workflows_loaded", 0),
"prefect": status,
}
@app.get("/health")
async def health() -> Dict[str, str]:
status = get_prefect_status()
health_status = "healthy" if status.get("ready") else "initializing"
return {"status": health_status}
# Map FastAPI OpenAPI operationIds to readable MCP tool names
FASTAPI_MCP_NAME_OVERRIDES: Dict[str, str] = {
"list_workflows_workflows__get": "api_list_workflows",
"get_metadata_schema_workflows_metadata_schema_get": "api_get_metadata_schema",
"get_workflow_metadata_workflows__workflow_name__metadata_get": "api_get_workflow_metadata",
"submit_workflow_workflows__workflow_name__submit_post": "api_submit_workflow",
"get_workflow_parameters_workflows__workflow_name__parameters_get": "api_get_workflow_parameters",
"get_run_status_runs__run_id__status_get": "api_get_run_status",
"get_run_findings_runs__run_id__findings_get": "api_get_run_findings",
"get_workflow_findings_runs__workflow_name__findings__run_id__get": "api_get_workflow_findings",
"get_fuzzing_stats_fuzzing__run_id__stats_get": "api_get_fuzzing_stats",
"update_fuzzing_stats_fuzzing__run_id__stats_post": "api_update_fuzzing_stats",
"get_crash_reports_fuzzing__run_id__crashes_get": "api_get_crash_reports",
"report_crash_fuzzing__run_id__crash_post": "api_report_crash",
"stream_fuzzing_updates_fuzzing__run_id__stream_get": "api_stream_fuzzing_updates",
"cleanup_fuzzing_run_fuzzing__run_id__delete": "api_cleanup_fuzzing_run",
"root__get": "api_root",
"health_health_get": "api_health",
}
# Create an MCP adapter exposing all FastAPI endpoints via OpenAPI parsing
FASTAPI_MCP_ADAPTER = FastMCP.from_fastapi(
app,
name="FuzzForge FastAPI",
mcp_names=FASTAPI_MCP_NAME_OVERRIDES,
)
_fastapi_mcp_imported = False
# ---------------------------------------------------------------------------
# FastMCP server (runs on dedicated port outside FastAPI)
# ---------------------------------------------------------------------------
mcp = FastMCP(name="FuzzForge MCP")
async def _bootstrap_prefect_with_retries() -> None:
"""Initialize Prefect infrastructure with exponential backoff retries."""
attempt = 0
while True:
attempt += 1
prefect_bootstrap_state.task_running = True
prefect_bootstrap_state.status = "starting"
prefect_bootstrap_state.ready = False
prefect_bootstrap_state.last_error = None
try:
logger.info("Bootstrapping Prefect infrastructure...")
await validate_infrastructure()
await setup_docker_pool()
await setup_result_storage()
await prefect_mgr.initialize()
await prefect_stats_monitor.start_monitoring()
prefect_bootstrap_state.ready = True
prefect_bootstrap_state.status = "ready"
prefect_bootstrap_state.task_running = False
logger.info("Prefect infrastructure ready")
return
except asyncio.CancelledError:
prefect_bootstrap_state.status = "cancelled"
prefect_bootstrap_state.task_running = False
logger.info("Prefect bootstrap task cancelled")
raise
except Exception as exc: # pragma: no cover - defensive logging on infra startup
logger.exception("Prefect bootstrap failed")
prefect_bootstrap_state.ready = False
prefect_bootstrap_state.status = "error"
prefect_bootstrap_state.last_error = str(exc)
# Ensure partial initialization does not leave stale state behind
prefect_mgr.workflows.clear()
prefect_mgr.deployments.clear()
await prefect_stats_monitor.stop_monitoring()
wait_time = min(
STARTUP_RETRY_SECONDS * (2 ** (attempt - 1)),
STARTUP_RETRY_MAX_SECONDS,
)
logger.info("Retrying Prefect bootstrap in %s second(s)", wait_time)
try:
await asyncio.sleep(wait_time)
except asyncio.CancelledError:
prefect_bootstrap_state.status = "cancelled"
prefect_bootstrap_state.task_running = False
raise
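The retry delays above grow exponentially from `STARTUP_RETRY_SECONDS` up to the `STARTUP_RETRY_MAX_SECONDS` cap; the schedule can be computed standalone (a sketch using the defaults of 5 s and 60 s):

```python
def backoff_schedule(attempts: int, base: int = 5, cap: int = 60) -> list[int]:
    """Delay before each retry: base * 2**(attempt-1), capped at `cap`."""
    return [min(base * (2 ** (attempt - 1)), cap)
            for attempt in range(1, attempts + 1)]
```

With the defaults, five failed attempts wait 5, 10, 20, 40, and then 60 seconds, after which every further retry waits the capped 60 seconds.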
def _lookup_workflow(workflow_name: str):
info = prefect_mgr.workflows.get(workflow_name)
if not info:
return None
metadata = info.metadata
defaults = metadata.get("default_parameters", {})
default_target_path = metadata.get("default_target_path") or defaults.get("target_path")
supported_modes = metadata.get("supported_volume_modes") or ["ro", "rw"]
if not isinstance(supported_modes, list) or not supported_modes:
supported_modes = ["ro", "rw"]
default_volume_mode = (
metadata.get("default_volume_mode")
or defaults.get("volume_mode")
or supported_modes[0]
)
return {
"name": workflow_name,
"version": metadata.get("version", "0.6.0"),
"description": metadata.get("description", ""),
"author": metadata.get("author"),
"tags": metadata.get("tags", []),
"parameters": metadata.get("parameters", {}),
"default_parameters": metadata.get("default_parameters", {}),
"required_modules": metadata.get("required_modules", []),
"supported_volume_modes": supported_modes,
"default_target_path": default_target_path,
"default_volume_mode": default_volume_mode,
"has_custom_docker": bool(info.has_docker),
}
@mcp.tool
async def list_workflows_mcp() -> Dict[str, Any]:
"""List all discovered workflows and their metadata summary."""
not_ready = _prefect_not_ready_status()
if not_ready:
return {
"workflows": [],
"prefect": not_ready,
"message": "Prefect infrastructure is still initializing",
}
workflows_summary = []
for name, info in prefect_mgr.workflows.items():
metadata = info.metadata
defaults = metadata.get("default_parameters", {})
workflows_summary.append({
"name": name,
"version": metadata.get("version", "0.6.0"),
"description": metadata.get("description", ""),
"author": metadata.get("author"),
"tags": metadata.get("tags", []),
"supported_volume_modes": metadata.get("supported_volume_modes", ["ro", "rw"]),
"default_volume_mode": metadata.get("default_volume_mode")
or defaults.get("volume_mode")
or "ro",
"default_target_path": metadata.get("default_target_path")
or defaults.get("target_path"),
"has_custom_docker": bool(info.has_docker),
})
return {"workflows": workflows_summary, "prefect": get_prefect_status()}
@mcp.tool
async def get_workflow_metadata_mcp(workflow_name: str) -> Dict[str, Any]:
"""Fetch detailed metadata for a workflow."""
not_ready = _prefect_not_ready_status()
if not_ready:
return {
"error": "Prefect infrastructure not ready",
"prefect": not_ready,
}
data = _lookup_workflow(workflow_name)
if not data:
return {"error": f"Workflow not found: {workflow_name}"}
return data
@mcp.tool
async def get_workflow_parameters_mcp(workflow_name: str) -> Dict[str, Any]:
"""Return the parameter schema and defaults for a workflow."""
not_ready = _prefect_not_ready_status()
if not_ready:
return {
"error": "Prefect infrastructure not ready",
"prefect": not_ready,
}
data = _lookup_workflow(workflow_name)
if not data:
return {"error": f"Workflow not found: {workflow_name}"}
return {
"parameters": data.get("parameters", {}),
"defaults": data.get("default_parameters", {}),
}
@mcp.tool
async def get_workflow_metadata_schema_mcp() -> Dict[str, Any]:
"""Return the JSON schema describing workflow metadata files."""
return WorkflowDiscovery.get_metadata_schema()
@mcp.tool
async def submit_security_scan_mcp(
workflow_name: str,
target_path: str | None = None,
volume_mode: str | None = None,
parameters: Dict[str, Any] | None = None,
) -> Dict[str, Any] | Dict[str, str]:
"""Submit a Prefect workflow via MCP."""
try:
not_ready = _prefect_not_ready_status()
if not_ready:
return {
"error": "Prefect infrastructure not ready",
"prefect": not_ready,
}
workflow_info = prefect_mgr.workflows.get(workflow_name)
if not workflow_info:
return {"error": f"Workflow '{workflow_name}' not found"}
metadata = workflow_info.metadata or {}
defaults = metadata.get("default_parameters", {})
resolved_target_path = target_path or metadata.get("default_target_path") or defaults.get("target_path")
if not resolved_target_path:
return {
"error": (
"target_path is required and no default_target_path is defined in metadata"
),
"metadata": {
"workflow": workflow_name,
"default_target_path": metadata.get("default_target_path"),
},
}
requested_volume_mode = volume_mode or metadata.get("default_volume_mode") or defaults.get("volume_mode")
if not requested_volume_mode:
requested_volume_mode = "ro"
normalised_volume_mode = (
str(requested_volume_mode).strip().lower().replace("-", "_")
)
if normalised_volume_mode in {"read_only", "readonly", "ro"}:
normalised_volume_mode = "ro"
elif normalised_volume_mode in {"read_write", "readwrite", "rw"}:
normalised_volume_mode = "rw"
else:
supported_modes = metadata.get("supported_volume_modes", ["ro", "rw"])
if not (isinstance(supported_modes, list) and normalised_volume_mode in supported_modes):
normalised_volume_mode = "ro"
parameters = parameters or {}
cleaned_parameters: Dict[str, Any] = {**defaults, **parameters}
# Ensure *_config structures default to dicts so Prefect validation passes.
for key, value in list(cleaned_parameters.items()):
if isinstance(key, str) and key.endswith("_config") and value is None:
cleaned_parameters[key] = {}
# Some workflows expect configuration dictionaries even when omitted.
parameter_definitions = (
metadata.get("parameters", {}).get("properties", {})
if isinstance(metadata.get("parameters"), dict)
else {}
)
for key, definition in parameter_definitions.items():
if not isinstance(key, str) or not key.endswith("_config"):
continue
if key not in cleaned_parameters:
default_value = definition.get("default") if isinstance(definition, dict) else None
cleaned_parameters[key] = default_value if default_value is not None else {}
elif cleaned_parameters[key] is None:
cleaned_parameters[key] = {}
flow_run = await prefect_mgr.submit_workflow(
workflow_name=workflow_name,
target_path=resolved_target_path,
volume_mode=normalised_volume_mode,
parameters=cleaned_parameters,
)
return {
"run_id": str(flow_run.id),
"status": flow_run.state.name if flow_run.state else "PENDING",
"workflow": workflow_name,
"message": f"Workflow '{workflow_name}' submitted successfully",
"target_path": resolved_target_path,
"volume_mode": normalised_volume_mode,
"parameters": cleaned_parameters,
"mcp_enabled": True,
}
except Exception as exc: # pragma: no cover - defensive logging
logger.exception("MCP submit failed")
return {"error": f"Failed to submit workflow: {exc}"}
@mcp.tool
async def get_comprehensive_scan_summary(run_id: str) -> Dict[str, Any] | Dict[str, str]:
"""Return a summary for the given flow run via MCP."""
try:
not_ready = _prefect_not_ready_status()
if not_ready:
return {
"error": "Prefect infrastructure not ready",
"prefect": not_ready,
}
status = await prefect_mgr.get_flow_run_status(run_id)
findings = await prefect_mgr.get_flow_run_findings(run_id)
workflow_name = "unknown"
deployment_id = status.get("workflow", "")
for name, deployment in prefect_mgr.deployments.items():
if str(deployment) == str(deployment_id):
workflow_name = name
break
total_findings = 0
# Severity buckets are returned but not yet derived from the SARIF payload
severity_summary = {"critical": 0, "high": 0, "medium": 0, "low": 0, "info": 0}
if findings and "sarif" in findings:
sarif = findings["sarif"]
if isinstance(sarif, dict):
total_findings = sarif.get("total_findings", 0)
return {
"run_id": run_id,
"workflow": workflow_name,
"status": status.get("status", "unknown"),
"is_completed": status.get("is_completed", False),
"total_findings": total_findings,
"severity_summary": severity_summary,
"scan_duration": status.get("updated_at", "")
if status.get("is_completed")
else "In progress",
"recommendations": (
[
"Review high and critical severity findings first",
"Implement security fixes based on finding recommendations",
"Re-run scan after applying fixes to verify remediation",
]
if total_findings > 0
else ["No security issues found"]
),
"mcp_analysis": True,
}
except Exception as exc: # pragma: no cover
logger.exception("MCP summary failed")
return {"error": f"Failed to summarize run: {exc}"}
@mcp.tool
async def get_run_status_mcp(run_id: str) -> Dict[str, Any]:
"""Return current status information for a Prefect run."""
try:
not_ready = _prefect_not_ready_status()
if not_ready:
return {
"error": "Prefect infrastructure not ready",
"prefect": not_ready,
}
status = await prefect_mgr.get_flow_run_status(run_id)
workflow_name = "unknown"
deployment_id = status.get("workflow", "")
for name, deployment in prefect_mgr.deployments.items():
if str(deployment) == str(deployment_id):
workflow_name = name
break
return {
"run_id": status["run_id"],
"workflow": workflow_name,
"status": status["status"],
"is_completed": status["is_completed"],
"is_failed": status["is_failed"],
"is_running": status["is_running"],
"created_at": status["created_at"],
"updated_at": status["updated_at"],
}
except Exception as exc:
logger.exception("MCP run status failed")
return {"error": f"Failed to get run status: {exc}"}
@mcp.tool
async def get_run_findings_mcp(run_id: str) -> Dict[str, Any]:
"""Return SARIF findings for a completed run."""
try:
not_ready = _prefect_not_ready_status()
if not_ready:
return {
"error": "Prefect infrastructure not ready",
"prefect": not_ready,
}
status = await prefect_mgr.get_flow_run_status(run_id)
if not status.get("is_completed"):
return {"error": f"Run {run_id} not completed. Status: {status.get('status')}"}
findings = await prefect_mgr.get_flow_run_findings(run_id)
workflow_name = "unknown"
deployment_id = status.get("workflow", "")
for name, deployment in prefect_mgr.deployments.items():
if str(deployment) == str(deployment_id):
workflow_name = name
break
metadata = {
"completion_time": status.get("updated_at"),
"workflow_version": "unknown",
}
info = prefect_mgr.workflows.get(workflow_name)
if info:
metadata["workflow_version"] = info.metadata.get("version", "unknown")
return {
"workflow": workflow_name,
"run_id": run_id,
"sarif": findings,
"metadata": metadata,
}
except Exception as exc:
logger.exception("MCP findings failed")
return {"error": f"Failed to retrieve findings: {exc}"}
@mcp.tool
async def list_recent_runs_mcp(
limit: int = 10,
workflow_name: str | None = None,
states: List[str] | None = None,
) -> Dict[str, Any]:
"""List recent Prefect runs with optional workflow/state filters."""
not_ready = _prefect_not_ready_status()
if not_ready:
return {
"runs": [],
"prefect": not_ready,
"message": "Prefect infrastructure is still initializing",
}
try:
limit_value = int(limit)
except (TypeError, ValueError):
limit_value = 10
limit_value = max(1, min(limit_value, 100))
deployment_map = {
str(deployment_id): workflow
for workflow, deployment_id in prefect_mgr.deployments.items()
}
deployment_filter_value = None
if workflow_name:
deployment_id = prefect_mgr.deployments.get(workflow_name)
if not deployment_id:
return {
"runs": [],
"prefect": get_prefect_status(),
"error": f"Workflow '{workflow_name}' has no registered deployment",
}
try:
deployment_filter_value = UUID(str(deployment_id))
except ValueError:
return {
"runs": [],
"prefect": get_prefect_status(),
"error": (
f"Deployment id '{deployment_id}' for workflow '{workflow_name}' is invalid"
),
}
desired_state_types: List[StateType] = []
if states:
for raw_state in states:
if not raw_state:
continue
normalised = raw_state.strip().upper()
if normalised == "ALL":
desired_state_types = []
break
try:
desired_state_types.append(StateType[normalised])
except KeyError:
continue
if not desired_state_types:
desired_state_types = [
StateType.RUNNING,
StateType.COMPLETED,
StateType.FAILED,
StateType.CANCELLED,
]
flow_filter = FlowRunFilter()
if desired_state_types:
flow_filter.state = FlowRunFilterState(
type=FlowRunFilterStateType(any_=desired_state_types)
)
if deployment_filter_value:
flow_filter.deployment_id = FlowRunFilterDeploymentId(
any_=[deployment_filter_value]
)
async with get_client() as client:
flow_runs = await client.read_flow_runs(
limit=limit_value,
flow_run_filter=flow_filter,
sort=FlowRunSort.START_TIME_DESC,
)
results: List[Dict[str, Any]] = []
for flow_run in flow_runs:
deployment_id = getattr(flow_run, "deployment_id", None)
workflow = deployment_map.get(str(deployment_id), "unknown")
state = getattr(flow_run, "state", None)
state_name = getattr(state, "name", None) if state else None
state_type = getattr(state, "type", None) if state else None
results.append(
{
"run_id": str(flow_run.id),
"workflow": workflow,
"deployment_id": str(deployment_id) if deployment_id else None,
"state": state_name or (state_type.name if state_type else None),
"state_type": state_type.name if state_type else None,
"is_completed": bool(getattr(state, "is_completed", lambda: False)()),
"is_running": bool(getattr(state, "is_running", lambda: False)()),
"is_failed": bool(getattr(state, "is_failed", lambda: False)()),
"created_at": getattr(flow_run, "created", None),
"updated_at": getattr(flow_run, "updated", None),
"expected_start_time": getattr(flow_run, "expected_start_time", None),
"start_time": getattr(flow_run, "start_time", None),
}
)
# Normalise datetimes to ISO 8601 strings for serialization
for entry in results:
for key in ("created_at", "updated_at", "expected_start_time", "start_time"):
value = entry.get(key)
if value is None:
continue
try:
entry[key] = value.isoformat()
except AttributeError:
entry[key] = str(value)
return {"runs": results, "prefect": get_prefect_status()}
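The datetime normalisation at the end of `list_recent_runs_mcp` can be expressed as a small utility (illustrative sketch; the `to_iso` name is not part of the module):

```python
from datetime import datetime, timezone
from typing import Any

def to_iso(value: Any) -> Any:
    """Render datetimes as ISO 8601 strings; pass other values through."""
    if value is None:
        return None
    try:
        return value.isoformat()
    except AttributeError:
        return str(value)
```

Falling back to `str(value)` keeps the tool resilient if Prefect ever returns pre-formatted strings or other non-datetime values for these fields.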
@mcp.tool
async def get_fuzzing_stats_mcp(run_id: str) -> Dict[str, Any]:
"""Return fuzzing statistics for a run if available."""
not_ready = _prefect_not_ready_status()
if not_ready:
return {
"error": "Prefect infrastructure not ready",
"prefect": not_ready,
}
stats = fuzzing.fuzzing_stats.get(run_id)
if not stats:
return {"error": f"Fuzzing run not found: {run_id}"}
# Be resilient if a plain dict slipped into the cache
if isinstance(stats, dict):
return stats
if hasattr(stats, "model_dump"):
return stats.model_dump()
if hasattr(stats, "dict"):
return stats.dict()
# Last resort
return getattr(stats, "__dict__", {"run_id": run_id})
@mcp.tool
async def get_fuzzing_crash_reports_mcp(run_id: str) -> Dict[str, Any]:
"""Return crash reports collected for a fuzzing run."""
not_ready = _prefect_not_ready_status()
if not_ready:
return {
"error": "Prefect infrastructure not ready",
"prefect": not_ready,
}
reports = fuzzing.crash_reports.get(run_id)
if reports is None:
return {"error": f"Fuzzing run not found: {run_id}"}
return {"run_id": run_id, "crashes": [report.model_dump() for report in reports]}
@mcp.tool
async def get_backend_status_mcp() -> Dict[str, Any]:
"""Expose backend readiness, workflows, and registered MCP tools."""
status = get_prefect_status()
response: Dict[str, Any] = {"prefect": status}
if status.get("ready"):
response["workflows"] = list(prefect_mgr.workflows.keys())
try:
tools = await mcp._tool_manager.list_tools()
response["mcp_tools"] = sorted(tool.name for tool in tools)
except Exception as exc: # pragma: no cover - defensive logging
logger.debug("Failed to enumerate MCP tools: %s", exc)
return response
def create_mcp_transport_app() -> Starlette:
"""Build a Starlette app serving HTTP + SSE transports on one port."""
http_app = mcp.http_app(path="/", transport="streamable-http")
sse_app = create_sse_app(
server=mcp,
message_path="/messages",
sse_path="/",
auth=mcp.auth,
)
routes = [
Mount("/mcp", app=http_app),
Mount("/mcp/sse", app=sse_app),
]
@asynccontextmanager
async def lifespan(app: Starlette): # pragma: no cover - integration wiring
async with AsyncExitStack() as stack:
await stack.enter_async_context(
http_app.router.lifespan_context(http_app)
)
await stack.enter_async_context(
sse_app.router.lifespan_context(sse_app)
)
yield
combined_app = Starlette(routes=routes, lifespan=lifespan)
combined_app.state.fastmcp_server = mcp
combined_app.state.http_app = http_app
combined_app.state.sse_app = sse_app
return combined_app
# ---------------------------------------------------------------------------
# Combined lifespan: Prefect init + dedicated MCP transports
# ---------------------------------------------------------------------------
@asynccontextmanager
async def combined_lifespan(app: FastAPI):
global prefect_bootstrap_task, _fastapi_mcp_imported
logger.info("Starting FuzzForge backend...")
# Ensure FastAPI endpoints are exposed via MCP once
if not _fastapi_mcp_imported:
try:
await mcp.import_server(FASTAPI_MCP_ADAPTER)
_fastapi_mcp_imported = True
logger.info("Mounted FastAPI endpoints as MCP tools")
except Exception as exc:
logger.exception("Failed to import FastAPI endpoints into MCP", exc_info=exc)
# Kick off Prefect bootstrap in the background if needed
if prefect_bootstrap_task is None or prefect_bootstrap_task.done():
prefect_bootstrap_task = asyncio.create_task(_bootstrap_prefect_with_retries())
logger.info("Prefect bootstrap task started")
else:
logger.info("Prefect bootstrap task already running")
# Start MCP transports on shared port (HTTP + SSE)
mcp_app = create_mcp_transport_app()
mcp_config = uvicorn.Config(
app=mcp_app,
host="0.0.0.0",
port=8010,
log_level="info",
lifespan="on",
)
mcp_server = uvicorn.Server(mcp_config)
mcp_server.install_signal_handlers = lambda: None # type: ignore[assignment]
mcp_task = asyncio.create_task(mcp_server.serve())
async def _wait_for_uvicorn_startup() -> None:
started_attr = getattr(mcp_server, "started", None)
if hasattr(started_attr, "wait"):
await asyncio.wait_for(started_attr.wait(), timeout=10)
return
# Fallback for uvicorn versions where "started" is a bool
poll_interval = 0.1
checks = int(10 / poll_interval)
for _ in range(checks):
if getattr(mcp_server, "started", False):
return
await asyncio.sleep(poll_interval)
raise asyncio.TimeoutError
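The startup wait above handles two uvicorn generations: `Server.started` as an `asyncio.Event` (await its `wait()`) or as a plain bool (poll it). A standalone sketch of that pattern, under the assumption that the attribute name and timeout are parameters (`wait_started` is a hypothetical helper, not uvicorn API):

```python
import asyncio
from typing import Any


async def wait_started(owner: Any, attr: str = "started", timeout: float = 10.0) -> None:
    """Wait for a startup flag that may be an asyncio.Event or a plain bool."""
    value = getattr(owner, attr, None)
    if hasattr(value, "wait"):
        # Event-style flag: block until set, bounded by the timeout
        await asyncio.wait_for(value.wait(), timeout=timeout)
        return
    # Bool-style flag: poll until it flips true or the timeout elapses
    interval = 0.1
    for _ in range(int(timeout / interval)):
        if getattr(owner, attr, False):
            return
        await asyncio.sleep(interval)
    raise asyncio.TimeoutError
```

Either shape resolves as soon as the server reports readiness; only the bool path pays the polling latency.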
try:
await _wait_for_uvicorn_startup()
except asyncio.TimeoutError: # pragma: no cover - defensive logging
if mcp_task.done():
raise RuntimeError("MCP server failed to start") from mcp_task.exception()
logger.warning("Timed out waiting for MCP server startup; continuing anyway")
logger.info("MCP HTTP available at http://0.0.0.0:8010/mcp")
logger.info("MCP SSE available at http://0.0.0.0:8010/mcp/sse")
try:
yield
finally:
logger.info("Shutting down MCP transports...")
mcp_server.should_exit = True
mcp_server.force_exit = True
await asyncio.gather(mcp_task, return_exceptions=True)
if prefect_bootstrap_task and not prefect_bootstrap_task.done():
prefect_bootstrap_task.cancel()
with suppress(asyncio.CancelledError):
await prefect_bootstrap_task
prefect_bootstrap_state.task_running = False
if not prefect_bootstrap_state.ready:
prefect_bootstrap_state.status = "stopped"
prefect_bootstrap_state.next_retry_seconds = None
prefect_bootstrap_task = None
logger.info("Shutting down Prefect statistics monitor...")
await prefect_stats_monitor.stop_monitoring()
logger.info("Shutting down FuzzForge backend...")
app.router.lifespan_context = combined_lifespan


@@ -1,11 +0,0 @@
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.


@@ -1,182 +0,0 @@
"""
Models for workflow findings and submissions
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
from pydantic import BaseModel, Field, field_validator
from typing import Dict, Any, Optional, Literal, List
from datetime import datetime
from pathlib import Path
class WorkflowFindings(BaseModel):
"""Findings from a workflow execution in SARIF format"""
workflow: str = Field(..., description="Workflow name")
run_id: str = Field(..., description="Unique run identifier")
sarif: Dict[str, Any] = Field(..., description="SARIF formatted findings")
metadata: Dict[str, Any] = Field(default_factory=dict, description="Additional metadata")
class ResourceLimits(BaseModel):
"""Resource limits for workflow execution"""
cpu_limit: Optional[str] = Field(None, description="CPU limit (e.g., '2' for 2 cores, '500m' for 0.5 cores)")
memory_limit: Optional[str] = Field(None, description="Memory limit (e.g., '1Gi', '512Mi')")
cpu_request: Optional[str] = Field(None, description="CPU request (guaranteed)")
memory_request: Optional[str] = Field(None, description="Memory request (guaranteed)")
class VolumeMount(BaseModel):
"""Volume mount specification"""
host_path: str = Field(..., description="Host path to mount")
container_path: str = Field(..., description="Container path for mount")
mode: Literal["ro", "rw"] = Field(default="ro", description="Mount mode")
@field_validator("host_path")
@classmethod
def validate_host_path(cls, v):
"""Validate that the host path is absolute (existence checked at runtime)"""
path = Path(v)
if not path.is_absolute():
raise ValueError(f"Host path must be absolute: {v}")
# Note: Path existence is validated at workflow runtime
# We can't validate existence here as this runs inside Docker container
return str(path)
@field_validator("container_path")
@classmethod
def validate_container_path(cls, v):
"""Validate that the container path is absolute"""
if not v.startswith('/'):
raise ValueError(f"Container path must be absolute: {v}")
return v
class WorkflowSubmission(BaseModel):
"""Submit a workflow with configurable settings"""
target_path: str = Field(..., description="Absolute path to analyze")
volume_mode: Literal["ro", "rw"] = Field(
default="ro",
description="Volume mount mode: read-only (ro) or read-write (rw)"
)
parameters: Dict[str, Any] = Field(
default_factory=dict,
description="Workflow-specific parameters"
)
timeout: Optional[int] = Field(
default=None, # Allow workflow-specific defaults
description="Timeout in seconds (None for workflow default)",
ge=1,
le=604800 # Max 7 days to support fuzzing campaigns
)
resource_limits: Optional[ResourceLimits] = Field(
None,
description="Resource limits for workflow container"
)
additional_volumes: List[VolumeMount] = Field(
default_factory=list,
description="Additional volume mounts (e.g., for corpus, output directories)"
)
@field_validator("target_path")
@classmethod
def validate_path(cls, v):
"""Validate that the target path is absolute (existence checked at runtime)"""
path = Path(v)
if not path.is_absolute():
raise ValueError(f"Path must be absolute: {v}")
# Note: Path existence is validated at workflow runtime when volumes are mounted
# We can't validate existence here as this runs inside Docker container
return str(path)
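Both path validators enforce the same rule: the path must be absolute, and existence is deliberately deferred to workflow runtime because validation runs inside the backend container where host paths are not visible. A stdlib-only restatement of that rule (a sketch of the check, not the Pydantic validator itself; assumes a POSIX filesystem):

```python
from pathlib import Path


def require_absolute(path_str: str) -> str:
    """Accept only absolute paths; existence is checked later, at workflow
    runtime, when the volume is actually mounted."""
    path = Path(path_str)
    if not path.is_absolute():
        raise ValueError(f"Path must be absolute: {path_str}")
    return str(path)
```

So a submission with `target_path="/targets/app"` passes validation, while `target_path="targets/app"` is rejected before the workflow is ever scheduled.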
class WorkflowStatus(BaseModel):
"""Status of a workflow run"""
run_id: str = Field(..., description="Unique run identifier")
workflow: str = Field(..., description="Workflow name")
status: str = Field(..., description="Current status")
is_completed: bool = Field(..., description="Whether the run is completed")
is_failed: bool = Field(..., description="Whether the run failed")
is_running: bool = Field(..., description="Whether the run is currently running")
created_at: datetime = Field(..., description="Run creation time")
updated_at: datetime = Field(..., description="Last update time")
class WorkflowMetadata(BaseModel):
"""Complete metadata for a workflow"""
name: str = Field(..., description="Workflow name")
version: str = Field(..., description="Semantic version")
description: str = Field(..., description="Workflow description")
author: Optional[str] = Field(None, description="Workflow author")
tags: List[str] = Field(default_factory=list, description="Workflow tags")
parameters: Dict[str, Any] = Field(..., description="Parameters schema")
default_parameters: Dict[str, Any] = Field(
default_factory=dict,
description="Default parameter values"
)
required_modules: List[str] = Field(
default_factory=list,
description="Required module names"
)
supported_volume_modes: List[Literal["ro", "rw"]] = Field(
default=["ro", "rw"],
description="Supported volume mount modes"
)
has_custom_docker: bool = Field(
default=False,
description="Whether workflow has custom Dockerfile"
)
class WorkflowListItem(BaseModel):
"""Summary information for a workflow in list views"""
name: str = Field(..., description="Workflow name")
version: str = Field(..., description="Semantic version")
description: str = Field(..., description="Workflow description")
author: Optional[str] = Field(None, description="Workflow author")
tags: List[str] = Field(default_factory=list, description="Workflow tags")
class RunSubmissionResponse(BaseModel):
"""Response after submitting a workflow"""
run_id: str = Field(..., description="Unique run identifier")
status: str = Field(..., description="Initial status")
workflow: str = Field(..., description="Workflow name")
message: str = Field(default="Workflow submitted successfully")
class FuzzingStats(BaseModel):
"""Real-time fuzzing statistics"""
run_id: str = Field(..., description="Unique run identifier")
workflow: str = Field(..., description="Workflow name")
executions: int = Field(default=0, description="Total executions")
executions_per_sec: float = Field(default=0.0, description="Current execution rate")
crashes: int = Field(default=0, description="Total crashes found")
unique_crashes: int = Field(default=0, description="Unique crashes")
coverage: Optional[float] = Field(None, description="Code coverage percentage")
corpus_size: int = Field(default=0, description="Current corpus size")
elapsed_time: int = Field(default=0, description="Elapsed time in seconds")
last_crash_time: Optional[datetime] = Field(None, description="Time of last crash")
class CrashReport(BaseModel):
"""Individual crash report from fuzzing"""
run_id: str = Field(..., description="Run identifier")
crash_id: str = Field(..., description="Unique crash identifier")
timestamp: datetime = Field(default_factory=datetime.utcnow)
signal: Optional[str] = Field(None, description="Crash signal (SIGSEGV, etc.)")
crash_type: Optional[str] = Field(None, description="Type of crash")
stack_trace: Optional[str] = Field(None, description="Stack trace")
input_file: Optional[str] = Field(None, description="Path to crashing input")
reproducer: Optional[str] = Field(None, description="Minimized reproducer")
severity: str = Field(default="medium", description="Crash severity")
exploitability: Optional[str] = Field(None, description="Exploitability assessment")


@@ -1,394 +0,0 @@
"""
Generic Prefect Statistics Monitor Service
This service monitors ALL workflows for structured live data logging and
updates the appropriate statistics APIs. Works with any workflow that follows
the standard LIVE_STATS logging pattern.
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import asyncio
import json
import logging
from datetime import datetime, timedelta, timezone
from typing import Dict, Any, Optional
from prefect.client.orchestration import get_client
from prefect.client.schemas.objects import FlowRun, TaskRun
from src.models.findings import FuzzingStats
from src.api.fuzzing import fuzzing_stats, initialize_fuzzing_tracking, active_connections
logger = logging.getLogger(__name__)
class PrefectStatsMonitor:
"""Monitors Prefect flows and tasks for live statistics from any workflow"""
def __init__(self):
self.monitoring = False
self.monitor_task = None
self.monitored_runs = set()
self.last_log_ts: Dict[str, datetime] = {}
self._client = None
self._client_refresh_time = None
self._client_refresh_interval = 300 # Refresh connection every 5 minutes
async def start_monitoring(self):
"""Start the Prefect statistics monitoring service"""
if self.monitoring:
logger.warning("Prefect stats monitor already running")
return
self.monitoring = True
self.monitor_task = asyncio.create_task(self._monitor_flows())
logger.info("Started Prefect statistics monitor")
async def stop_monitoring(self):
"""Stop the monitoring service"""
self.monitoring = False
if self.monitor_task:
self.monitor_task.cancel()
try:
await self.monitor_task
except asyncio.CancelledError:
pass
logger.info("Stopped Prefect statistics monitor")
async def _get_or_refresh_client(self):
"""Get or refresh Prefect client with connection pooling."""
now = datetime.now(timezone.utc)
if (self._client is None or
self._client_refresh_time is None or
(now - self._client_refresh_time).total_seconds() > self._client_refresh_interval):
if self._client:
try:
await self._client.aclose()
except Exception:
pass
self._client = get_client()
self._client_refresh_time = now
await self._client.__aenter__()
return self._client
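The refresh logic above is a generic time-bounded handle pool: reuse one client, rebuild it after a fixed interval. A synchronous sketch of the same idea (the `RefreshingHandle` class is hypothetical; it uses `time.monotonic`, which is immune to wall-clock jumps that a `datetime.now`-based comparison is exposed to):

```python
import time
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class RefreshingHandle(Generic[T]):
    """Reuse a handle built by `factory`, rebuilding it after `interval` seconds."""

    def __init__(self, factory: Callable[[], T], interval: float = 300.0):
        self._factory = factory
        self._interval = interval
        self._handle: Optional[T] = None
        self._created_at: float = 0.0

    def get(self) -> T:
        now = time.monotonic()
        if self._handle is None or now - self._created_at > self._interval:
            # Stale or missing: build a fresh handle and restart the clock
            self._handle = self._factory()
            self._created_at = now
        return self._handle
```

The async version in the service adds one more step this sketch omits: closing the old client (`aclose`) before replacing it.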
async def _monitor_flows(self):
"""Main monitoring loop that watches Prefect flows"""
try:
while self.monitoring:
try:
# Use connection pooling for better performance
client = await self._get_or_refresh_client()
# Get recent flow runs (limit to reduce load)
flow_runs = await client.read_flow_runs(
limit=50,
sort="START_TIME_DESC",
)
# Only consider runs from the last 15 minutes
recent_cutoff = datetime.now(timezone.utc) - timedelta(minutes=15)
for flow_run in flow_runs:
created = getattr(flow_run, "created", None)
if created is None:
continue
try:
# Ensure timezone-aware comparison
if created.tzinfo is None:
created = created.replace(tzinfo=timezone.utc)
if created >= recent_cutoff:
await self._monitor_flow_run(client, flow_run)
except Exception:
# If comparison fails, attempt monitoring anyway
await self._monitor_flow_run(client, flow_run)
await asyncio.sleep(5) # Check every 5 seconds
except Exception as e:
logger.error(f"Error in Prefect monitoring: {e}")
await asyncio.sleep(10)
except asyncio.CancelledError:
logger.info("Prefect monitoring cancelled")
except Exception as e:
logger.error(f"Fatal error in Prefect monitoring: {e}")
finally:
# Clean up client on exit
if self._client:
try:
await self._client.__aexit__(None, None, None)
except Exception:
pass
self._client = None
async def _monitor_flow_run(self, client, flow_run: FlowRun):
"""Monitor a specific flow run for statistics"""
run_id = str(flow_run.id)
workflow_name = flow_run.name or "unknown"
try:
# Initialize tracking if not exists - only for workflows that might have live stats
if run_id not in fuzzing_stats:
initialize_fuzzing_tracking(run_id, workflow_name)
self.monitored_runs.add(run_id)
# Repair corrupted entries (should not happen after startup cleanup, but defensive)
elif not isinstance(fuzzing_stats[run_id], FuzzingStats):
logger.warning(f"Skipping corrupted stats entry for {run_id}, reinitializing")
initialize_fuzzing_tracking(run_id, workflow_name)
self.monitored_runs.add(run_id)
# Get task runs for this flow
task_runs = await client.read_task_runs(
flow_run_filter={"id": {"any_": [flow_run.id]}},
limit=25,
)
# Check all tasks for live statistics logging
for task_run in task_runs:
await self._extract_stats_from_task(client, run_id, task_run, workflow_name)
# Also scan flow-level logs as a fallback
await self._extract_stats_from_flow_logs(client, run_id, flow_run, workflow_name)
except Exception as e:
logger.warning(f"Error monitoring flow run {run_id}: {e}")
async def _extract_stats_from_task(self, client, run_id: str, task_run: TaskRun, workflow_name: str):
"""Extract statistics from any task that logs live stats"""
try:
# Get task run logs
logs = await client.read_logs(
log_filter={
"task_run_id": {"any_": [task_run.id]}
},
limit=100,
sort="TIMESTAMP_ASC"
)
# Parse logs for LIVE_STATS entries (generic pattern for any workflow)
latest_stats = None
for log in logs:
# Prefer structured extra field if present
extra_data = getattr(log, "extra", None) or getattr(log, "extra_fields", None) or None
if isinstance(extra_data, dict):
stat_type = extra_data.get("stats_type")
if stat_type in ["fuzzing_live_update", "scan_progress", "analysis_update", "live_stats"]:
latest_stats = extra_data
continue
# Fallback to parsing from message text
if ("FUZZ_STATS" in log.message or "LIVE_STATS" in log.message):
stats = self._parse_stats_from_log(log.message)
if stats:
latest_stats = stats
# Update statistics if we found any
if latest_stats:
# Calculate elapsed time from task start
elapsed_time = 0
if task_run.start_time:
# Ensure timezone-aware arithmetic
now = datetime.now(timezone.utc)
try:
elapsed_time = int((now - task_run.start_time).total_seconds())
except Exception:
# Fallback to naive UTC if types mismatch
elapsed_time = int((datetime.utcnow() - task_run.start_time.replace(tzinfo=None)).total_seconds())
updated_stats = FuzzingStats(
run_id=run_id,
workflow=workflow_name,
executions=latest_stats.get("executions", 0),
executions_per_sec=latest_stats.get("executions_per_sec", 0.0),
crashes=latest_stats.get("crashes", 0),
unique_crashes=latest_stats.get("unique_crashes", 0),
corpus_size=latest_stats.get("corpus_size", 0),
elapsed_time=elapsed_time
)
# Update the global stats
previous = fuzzing_stats.get(run_id)
fuzzing_stats[run_id] = updated_stats
# Broadcast to any active WebSocket clients for this run
if active_connections.get(run_id):
# Handle both Pydantic objects and plain dicts
if isinstance(updated_stats, dict):
stats_data = updated_stats
elif hasattr(updated_stats, 'model_dump'):
stats_data = updated_stats.model_dump()
elif hasattr(updated_stats, 'dict'):
stats_data = updated_stats.dict()
else:
stats_data = updated_stats.__dict__
message = {
"type": "stats_update",
"data": stats_data,
}
disconnected = []
for ws in active_connections[run_id]:
try:
await ws.send_text(json.dumps(message))
except Exception:
disconnected.append(ws)
# Clean up disconnected sockets
for ws in disconnected:
try:
active_connections[run_id].remove(ws)
except ValueError:
pass
logger.debug(f"Updated Prefect stats for {run_id}: {updated_stats.executions} execs")
except Exception as e:
logger.warning(f"Error extracting stats from task {task_run.id}: {e}")
async def _extract_stats_from_flow_logs(self, client, run_id: str, flow_run: FlowRun, workflow_name: str):
"""Extract statistics by scanning flow-level logs for LIVE/FUZZ stats"""
try:
logs = await client.read_logs(
log_filter={
"flow_run_id": {"any_": [flow_run.id]}
},
limit=200,
sort="TIMESTAMP_ASC"
)
latest_stats = None
last_seen = self.last_log_ts.get(run_id)
max_ts = last_seen
for log in logs:
# Skip logs we've already processed
ts = getattr(log, "timestamp", None)
if last_seen and ts and ts <= last_seen:
continue
if ts and (max_ts is None or ts > max_ts):
max_ts = ts
# Prefer structured extra field if available
extra_data = getattr(log, "extra", None) or getattr(log, "extra_fields", None) or None
if isinstance(extra_data, dict):
stat_type = extra_data.get("stats_type")
if stat_type in ["fuzzing_live_update", "scan_progress", "analysis_update", "live_stats"]:
latest_stats = extra_data
continue
# Fallback to message parse
if ("FUZZ_STATS" in log.message or "LIVE_STATS" in log.message):
stats = self._parse_stats_from_log(log.message)
if stats:
latest_stats = stats
if max_ts:
self.last_log_ts[run_id] = max_ts
if latest_stats:
# Use flow_run timestamps for elapsed time if available
elapsed_time = 0
start_time = getattr(flow_run, "start_time", None)
if start_time:
now = datetime.now(timezone.utc)
try:
if start_time.tzinfo is None:
start_time = start_time.replace(tzinfo=timezone.utc)
elapsed_time = int((now - start_time).total_seconds())
except Exception:
elapsed_time = int((datetime.utcnow() - start_time.replace(tzinfo=None)).total_seconds())
updated_stats = FuzzingStats(
run_id=run_id,
workflow=workflow_name,
executions=latest_stats.get("executions", 0),
executions_per_sec=latest_stats.get("executions_per_sec", 0.0),
crashes=latest_stats.get("crashes", 0),
unique_crashes=latest_stats.get("unique_crashes", 0),
corpus_size=latest_stats.get("corpus_size", 0),
elapsed_time=elapsed_time
)
fuzzing_stats[run_id] = updated_stats
# Broadcast if listeners exist
if active_connections.get(run_id):
# Handle both Pydantic objects and plain dicts
if isinstance(updated_stats, dict):
stats_data = updated_stats
elif hasattr(updated_stats, 'model_dump'):
stats_data = updated_stats.model_dump()
elif hasattr(updated_stats, 'dict'):
stats_data = updated_stats.dict()
else:
stats_data = updated_stats.__dict__
message = {
"type": "stats_update",
"data": stats_data,
}
disconnected = []
for ws in active_connections[run_id]:
try:
await ws.send_text(json.dumps(message))
except Exception:
disconnected.append(ws)
for ws in disconnected:
try:
active_connections[run_id].remove(ws)
except ValueError:
pass
except Exception as e:
logger.warning(f"Error extracting stats from flow logs {run_id}: {e}")
def _parse_stats_from_log(self, log_message: str) -> Optional[Dict[str, Any]]:
"""Parse statistics from a log message"""
try:
import re
# Prefer explicit JSON after marker tokens
m = re.search(r'(?:FUZZ_STATS|LIVE_STATS)\s+(\{.*\})', log_message)
if m:
try:
return json.loads(m.group(1))
except Exception:
pass
# Fallback: Extract the extra= dict and coerce to JSON
stats_match = re.search(r'extra=({.*?})', log_message)
if not stats_match:
return None
extra_str = stats_match.group(1)
extra_str = extra_str.replace("'", '"')
extra_str = extra_str.replace('None', 'null')
extra_str = extra_str.replace('True', 'true')
extra_str = extra_str.replace('False', 'false')
stats_data = json.loads(extra_str)
# Support multiple stat types for different workflows
stat_type = stats_data.get("stats_type")
if stat_type in ["fuzzing_live_update", "scan_progress", "analysis_update", "live_stats"]:
return stats_data
except Exception as e:
logger.debug(f"Error parsing log stats: {e}")
return None
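The preferred branch of the parser expects a JSON object directly after a `FUZZ_STATS` or `LIVE_STATS` marker; the `extra=` fallback, which rewrites single quotes and Python literals into JSON, is fragile by comparison (an apostrophe inside a value breaks it). A minimal sketch of just the marker-then-JSON branch:

```python
import json
import re
from typing import Any, Dict, Optional


def parse_marker_json(line: str) -> Optional[Dict[str, Any]]:
    """Extract the JSON payload following a FUZZ_STATS or LIVE_STATS marker."""
    m = re.search(r'(?:FUZZ_STATS|LIVE_STATS)\s+(\{.*\})', line)
    if not m:
        return None
    try:
        return json.loads(m.group(1))
    except json.JSONDecodeError:
        # Marker present but payload is not valid JSON
        return None
```

Workflows that emit proper JSON after the marker (or use the structured `extra` log field) never hit the quote-rewriting fallback at all.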
# Global instance
prefect_stats_monitor = PrefectStatsMonitor()


@@ -1,19 +0,0 @@
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import sys
from pathlib import Path
# Ensure project root is on sys.path so `src` is importable
ROOT = Path(__file__).resolve().parents[1]
if str(ROOT) not in sys.path:
    sys.path.insert(0, str(ROOT))


@@ -1,82 +0,0 @@
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import asyncio
from datetime import datetime, timezone, timedelta
from src.services.prefect_stats_monitor import PrefectStatsMonitor
from src.api import fuzzing
class FakeLog:
    def __init__(self, message: str):
        self.message = message


class FakeClient:
    def __init__(self, logs):
        self._logs = logs

    async def read_logs(self, log_filter=None, limit=100, sort="TIMESTAMP_ASC"):
        return self._logs


class FakeTaskRun:
    def __init__(self):
        self.id = "task-1"
        self.start_time = datetime.now(timezone.utc) - timedelta(seconds=5)


def test_parse_stats_from_log_fuzzing():
    mon = PrefectStatsMonitor()
    msg = (
        "INFO LIVE_STATS extra={'stats_type': 'fuzzing_live_update', "
        "'executions': 42, 'executions_per_sec': 3.14, 'crashes': 1, 'unique_crashes': 1, 'corpus_size': 9}"
    )
    stats = mon._parse_stats_from_log(msg)
    assert stats is not None
    assert stats["stats_type"] == "fuzzing_live_update"
    assert stats["executions"] == 42


def test_extract_stats_updates_and_broadcasts():
    mon = PrefectStatsMonitor()
    run_id = "run-123"
    workflow = "wf"
    fuzzing.initialize_fuzzing_tracking(run_id, workflow)
    # Prepare a fake websocket to capture messages
    sent = []

    class FakeWS:
        async def send_text(self, text: str):
            sent.append(text)

    fuzzing.active_connections[run_id] = [FakeWS()]
    # Craft a log line the parser understands
    msg = (
        "INFO LIVE_STATS extra={'stats_type': 'fuzzing_live_update', "
        "'executions': 10, 'executions_per_sec': 1.5, 'crashes': 0, 'unique_crashes': 0, 'corpus_size': 2}"
    )
    fake_client = FakeClient([FakeLog(msg)])
    task_run = FakeTaskRun()
    asyncio.run(mon._extract_stats_from_task(fake_client, run_id, task_run, workflow))
    # Verify stats updated
    stats = fuzzing.fuzzing_stats[run_id]
    assert stats.executions == 10
    assert stats.executions_per_sec == 1.5
    # Verify a message was sent to WebSocket
    assert sent, "Expected a stats_update message to be sent"


@@ -1,11 +0,0 @@
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.


@@ -1,11 +0,0 @@
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.


@@ -1,14 +0,0 @@
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
from .security_analyzer import SecurityAnalyzer
__all__ = ["SecurityAnalyzer"]


@@ -1,368 +0,0 @@
"""
Security Analyzer Module - Analyzes code for security vulnerabilities
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import logging
import re
from pathlib import Path
from typing import Dict, Any, List, Optional
try:
from toolbox.modules.base import BaseModule, ModuleMetadata, ModuleResult, ModuleFinding
except ImportError:
try:
from modules.base import BaseModule, ModuleMetadata, ModuleResult, ModuleFinding
except ImportError:
from src.toolbox.modules.base import BaseModule, ModuleMetadata, ModuleResult, ModuleFinding
logger = logging.getLogger(__name__)
class SecurityAnalyzer(BaseModule):
"""
Analyzes source code for common security vulnerabilities.
This module:
- Detects hardcoded secrets and credentials
- Identifies dangerous function calls
- Finds SQL injection vulnerabilities
- Detects insecure configurations
"""
def get_metadata(self) -> ModuleMetadata:
"""Get module metadata"""
return ModuleMetadata(
name="security_analyzer",
version="1.0.0",
description="Analyzes code for security vulnerabilities",
author="FuzzForge Team",
category="analyzer",
tags=["security", "vulnerabilities", "static-analysis"],
input_schema={
"file_extensions": {
"type": "array",
"items": {"type": "string"},
"description": "File extensions to analyze",
"default": [".py", ".js", ".java", ".php", ".rb", ".go"]
},
"check_secrets": {
"type": "boolean",
"description": "Check for hardcoded secrets",
"default": True
},
"check_sql": {
"type": "boolean",
"description": "Check for SQL injection risks",
"default": True
},
"check_dangerous_functions": {
"type": "boolean",
"description": "Check for dangerous function calls",
"default": True
}
},
output_schema={
"findings": {
"type": "array",
"description": "List of security findings"
}
},
requires_workspace=True
)
def validate_config(self, config: Dict[str, Any]) -> bool:
"""Validate module configuration"""
extensions = config.get("file_extensions", [])
if not isinstance(extensions, list):
raise ValueError("file_extensions must be a list")
return True
async def execute(self, config: Dict[str, Any], workspace: Path) -> ModuleResult:
"""
Execute the security analysis module.
Args:
config: Module configuration
workspace: Path to the workspace directory
Returns:
ModuleResult with security findings
"""
self.start_timer()
self.validate_workspace(workspace)
self.validate_config(config)
findings = []
files_analyzed = 0
# Get configuration
file_extensions = config.get("file_extensions", [".py", ".js", ".java", ".php", ".rb", ".go"])
check_secrets = config.get("check_secrets", True)
check_sql = config.get("check_sql", True)
check_dangerous = config.get("check_dangerous_functions", True)
logger.info(f"Analyzing files with extensions: {file_extensions}")
try:
# Analyze each file
for ext in file_extensions:
for file_path in workspace.rglob(f"*{ext}"):
if not file_path.is_file():
continue
files_analyzed += 1
relative_path = file_path.relative_to(workspace)
try:
content = file_path.read_text(encoding='utf-8', errors='ignore')
lines = content.splitlines()
# Check for secrets
if check_secrets:
secret_findings = self._check_hardcoded_secrets(
content, lines, relative_path
)
findings.extend(secret_findings)
# Check for SQL injection
if check_sql and ext in [".py", ".php", ".java", ".js"]:
sql_findings = self._check_sql_injection(
content, lines, relative_path
)
findings.extend(sql_findings)
# Check for dangerous functions
if check_dangerous:
dangerous_findings = self._check_dangerous_functions(
content, lines, relative_path, ext
)
findings.extend(dangerous_findings)
except Exception as e:
logger.error(f"Error analyzing file {relative_path}: {e}")
# Create summary
summary = {
"files_analyzed": files_analyzed,
"total_findings": len(findings),
"extensions_scanned": file_extensions
}
return self.create_result(
findings=findings,
status="success" if files_analyzed > 0 else "partial",
summary=summary,
metadata={
"workspace": str(workspace),
"config": config
}
)
except Exception as e:
logger.error(f"Security analyzer failed: {e}")
return self.create_result(
findings=findings,
status="failed",
error=str(e)
)
def _check_hardcoded_secrets(
self, content: str, lines: List[str], file_path: Path
) -> List[ModuleFinding]:
"""
Check for hardcoded secrets in code.
Args:
content: File content
lines: File lines
file_path: Relative file path
Returns:
List of findings
"""
findings = []
# Patterns for secrets
secret_patterns = [
(r'api[_-]?key\s*=\s*["\']([^"\']{20,})["\']', 'API Key'),
(r'api[_-]?secret\s*=\s*["\']([^"\']{20,})["\']', 'API Secret'),
(r'password\s*=\s*["\']([^"\']+)["\']', 'Hardcoded Password'),
(r'token\s*=\s*["\']([^"\']{20,})["\']', 'Authentication Token'),
(r'aws[_-]?access[_-]?key\s*=\s*["\']([^"\']+)["\']', 'AWS Access Key'),
(r'aws[_-]?secret[_-]?key\s*=\s*["\']([^"\']+)["\']', 'AWS Secret Key'),
(r'private[_-]?key\s*=\s*["\']([^"\']+)["\']', 'Private Key'),
(r'["\']([A-Za-z0-9]{32,})["\']', 'Potential Secret Hash'),
(r'Bearer\s+([A-Za-z0-9\-_]+\.[A-Za-z0-9\-_]+\.[A-Za-z0-9\-_]+)', 'JWT Token'),
]
for pattern, secret_type in secret_patterns:
for match in re.finditer(pattern, content, re.IGNORECASE):
# Find line number
line_num = content[:match.start()].count('\n') + 1
line_content = lines[line_num - 1] if line_num <= len(lines) else ""
# Skip common false positives
if self._is_false_positive_secret(match.group(0)):
continue
findings.append(self.create_finding(
title=f"Hardcoded {secret_type} detected",
description=f"Found potential hardcoded {secret_type} in {file_path}",
severity="high" if "key" in secret_type.lower() else "medium",
category="hardcoded_secret",
file_path=str(file_path),
line_start=line_num,
code_snippet=line_content.strip()[:100],
recommendation=f"Remove hardcoded {secret_type} and use environment variables or secure vault",
metadata={"secret_type": secret_type}
))
return findings
def _check_sql_injection(
self, content: str, lines: List[str], file_path: Path
) -> List[ModuleFinding]:
"""
Check for potential SQL injection vulnerabilities.
Args:
content: File content
lines: File lines
file_path: Relative file path
Returns:
List of findings
"""
findings = []
# SQL injection patterns
sql_patterns = [
(r'(SELECT|INSERT|UPDATE|DELETE).*\+\s*[\'"]?\s*\+?\s*\w+', 'String concatenation in SQL'),
(r'(SELECT|INSERT|UPDATE|DELETE).*%\s*[\'"]?\s*%?\s*\w+', 'String formatting in SQL'),
(r'f[\'"].*?(SELECT|INSERT|UPDATE|DELETE).*?\{.*?\}', 'F-string in SQL query'),
(r'query\s*=.*?\+', 'Dynamic query building'),
(r'execute\s*\(.*?\+.*?\)', 'Dynamic execute statement'),
]
for pattern, vuln_type in sql_patterns:
for match in re.finditer(pattern, content, re.IGNORECASE):
line_num = content[:match.start()].count('\n') + 1
line_content = lines[line_num - 1] if line_num <= len(lines) else ""
findings.append(self.create_finding(
title=f"Potential SQL Injection: {vuln_type}",
description=f"Detected potential SQL injection vulnerability via {vuln_type}",
severity="high",
category="sql_injection",
file_path=str(file_path),
line_start=line_num,
code_snippet=line_content.strip()[:100],
recommendation="Use parameterized queries or prepared statements instead",
metadata={"vulnerability_type": vuln_type}
))
return findings
def _check_dangerous_functions(
self, content: str, lines: List[str], file_path: Path, ext: str
) -> List[ModuleFinding]:
"""
Check for dangerous function calls.
Args:
content: File content
lines: File lines
file_path: Relative file path
ext: File extension
Returns:
List of findings
"""
findings = []
# Language-specific dangerous functions
dangerous_functions = {
".py": [
(r'eval\s*\(', 'eval()', 'Arbitrary code execution'),
(r'exec\s*\(', 'exec()', 'Arbitrary code execution'),
(r'os\.system\s*\(', 'os.system()', 'Command injection risk'),
(r'subprocess\.call\s*\(.*shell=True', 'subprocess with shell=True', 'Command injection risk'),
(r'pickle\.loads?\s*\(', 'pickle.load()/loads()', 'Deserialization vulnerability'),
],
".js": [
(r'eval\s*\(', 'eval()', 'Arbitrary code execution'),
(r'new\s+Function\s*\(', 'new Function()', 'Arbitrary code execution'),
(r'innerHTML\s*=', 'innerHTML', 'XSS vulnerability'),
(r'document\.write\s*\(', 'document.write()', 'XSS vulnerability'),
],
".php": [
(r'eval\s*\(', 'eval()', 'Arbitrary code execution'),
(r'exec\s*\(', 'exec()', 'Command execution'),
(r'system\s*\(', 'system()', 'Command execution'),
(r'shell_exec\s*\(', 'shell_exec()', 'Command execution'),
(r'\$_GET\[', 'Direct $_GET usage', 'Input validation missing'),
(r'\$_POST\[', 'Direct $_POST usage', 'Input validation missing'),
]
}
if ext in dangerous_functions:
for pattern, func_name, risk_type in dangerous_functions[ext]:
for match in re.finditer(pattern, content):
line_num = content[:match.start()].count('\n') + 1
line_content = lines[line_num - 1] if line_num <= len(lines) else ""
findings.append(self.create_finding(
title=f"Dangerous function: {func_name}",
description=f"Use of potentially dangerous function {func_name}: {risk_type}",
severity="medium",
category="dangerous_function",
file_path=str(file_path),
line_start=line_num,
code_snippet=line_content.strip()[:100],
recommendation=f"Consider safer alternatives to {func_name}",
metadata={
"function": func_name,
"risk": risk_type
}
))
return findings
def _is_false_positive_secret(self, value: str) -> bool:
"""
Check if a potential secret is likely a false positive.
Args:
value: Potential secret value
Returns:
True if likely false positive
"""
false_positive_patterns = [
'example',
'test',
'demo',
'sample',
'dummy',
'placeholder',
'xxx',
'123',
'change',
'your',
'here'
]
value_lower = value.lower()
return any(pattern in value_lower for pattern in false_positive_patterns)
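Taken together, the detection loop above reduces to three steps: match regexes against the file content, derive the line number from the match offset, and drop matches that look like placeholders. A minimal standalone sketch of that flow (the pattern subset, hint list, and sample input are illustrative, not taken from the module):

```python
import re

# Illustrative subset of the module's secret patterns
SECRET_PATTERNS = [
    (r'api[_-]?key\s*=\s*["\']([^"\']{20,})["\']', "API Key"),
    (r'password\s*=\s*["\']([^"\']+)["\']', "Hardcoded Password"),
]
FALSE_POSITIVE_HINTS = ["example", "test", "dummy", "placeholder"]

def scan_for_secrets(content: str) -> list[dict]:
    findings = []
    for pattern, secret_type in SECRET_PATTERNS:
        for match in re.finditer(pattern, content, re.IGNORECASE):
            if any(hint in match.group(0).lower() for hint in FALSE_POSITIVE_HINTS):
                continue  # skip obvious placeholder values
            # Line number = newlines before the match start, plus one
            line_num = content[:match.start()].count("\n") + 1
            findings.append({"type": secret_type, "line": line_num})
    return findings

sample = 'password = "example123"\napi_key = "A1b2C3d4E5f6G7h8I9j0K1l2"\n'
print(scan_for_secrets(sample))
```

The placeholder filter runs before the finding is recorded, so a hardcoded `password = "example123"` never reaches the report while the 24-character key on line 2 does.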


@@ -1,272 +0,0 @@
"""
Base module interface for all FuzzForge modules
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Dict, Any, List, Optional
from pydantic import BaseModel, Field
from datetime import datetime
import logging
logger = logging.getLogger(__name__)
class ModuleMetadata(BaseModel):
"""Metadata describing a module's capabilities and requirements"""
name: str = Field(..., description="Module name")
version: str = Field(..., description="Module version")
description: str = Field(..., description="Module description")
author: Optional[str] = Field(None, description="Module author")
category: str = Field(..., description="Module category (scanner, analyzer, reporter, etc.)")
tags: List[str] = Field(default_factory=list, description="Module tags")
input_schema: Dict[str, Any] = Field(default_factory=dict, description="Expected input schema")
output_schema: Dict[str, Any] = Field(default_factory=dict, description="Output schema")
requires_workspace: bool = Field(True, description="Whether module requires workspace access")
class ModuleFinding(BaseModel):
"""Individual finding from a module"""
id: str = Field(..., description="Unique finding ID")
title: str = Field(..., description="Finding title")
description: str = Field(..., description="Detailed description")
severity: str = Field(..., description="Severity level (info, low, medium, high, critical)")
category: str = Field(..., description="Finding category")
file_path: Optional[str] = Field(None, description="Affected file path relative to workspace")
line_start: Optional[int] = Field(None, description="Starting line number")
line_end: Optional[int] = Field(None, description="Ending line number")
code_snippet: Optional[str] = Field(None, description="Relevant code snippet")
recommendation: Optional[str] = Field(None, description="Remediation recommendation")
metadata: Dict[str, Any] = Field(default_factory=dict, description="Additional metadata")
class ModuleResult(BaseModel):
"""Standard result format from module execution"""
module: str = Field(..., description="Module name")
version: str = Field(..., description="Module version")
status: str = Field(default="success", description="Execution status (success, partial, failed)")
execution_time: float = Field(..., description="Execution time in seconds")
findings: List[ModuleFinding] = Field(default_factory=list, description="List of findings")
summary: Dict[str, Any] = Field(default_factory=dict, description="Summary statistics")
metadata: Dict[str, Any] = Field(default_factory=dict, description="Additional metadata")
error: Optional[str] = Field(None, description="Error message if failed")
sarif: Optional[Dict[str, Any]] = Field(None, description="SARIF report if generated by reporter module")
class BaseModule(ABC):
"""
Base interface for all security testing modules.
All modules must inherit from this class and implement the required methods.
Modules are designed to be stateless and reusable across different workflows.
"""
def __init__(self):
"""Initialize the module"""
self._metadata = self.get_metadata()
self._start_time = None
logger.info(f"Initialized module: {self._metadata.name} v{self._metadata.version}")
@abstractmethod
def get_metadata(self) -> ModuleMetadata:
"""
Get module metadata.
Returns:
ModuleMetadata object describing the module
"""
pass
@abstractmethod
async def execute(self, config: Dict[str, Any], workspace: Path) -> ModuleResult:
"""
Execute the module with given configuration and workspace.
Args:
config: Module-specific configuration parameters
workspace: Path to the mounted workspace directory
Returns:
ModuleResult containing findings and metadata
"""
pass
@abstractmethod
def validate_config(self, config: Dict[str, Any]) -> bool:
"""
Validate the provided configuration against module requirements.
Args:
config: Configuration to validate
Returns:
True if configuration is valid, False otherwise
Raises:
ValueError: If configuration is invalid with details
"""
pass
def validate_workspace(self, workspace: Path) -> bool:
"""
Validate that the workspace exists and is accessible.
Args:
workspace: Path to the workspace
Returns:
True if workspace is valid
Raises:
ValueError: If workspace is invalid
"""
if not workspace.exists():
raise ValueError(f"Workspace does not exist: {workspace}")
if not workspace.is_dir():
raise ValueError(f"Workspace is not a directory: {workspace}")
return True
def create_finding(
self,
title: str,
description: str,
severity: str,
category: str,
**kwargs
) -> ModuleFinding:
"""
Helper method to create a standardized finding.
Args:
title: Finding title
description: Detailed description
severity: Severity level
category: Finding category
**kwargs: Additional finding fields
Returns:
ModuleFinding object
"""
import uuid
finding_id = str(uuid.uuid4())
return ModuleFinding(
id=finding_id,
title=title,
description=description,
severity=severity,
category=category,
**kwargs
)
def start_timer(self):
"""Start the execution timer"""
from time import time
self._start_time = time()
def get_execution_time(self) -> float:
"""Get the execution time in seconds"""
from time import time
if self._start_time is None:
return 0.0
return time() - self._start_time
def create_result(
self,
findings: List[ModuleFinding],
status: str = "success",
summary: Dict[str, Any] = None,
metadata: Dict[str, Any] = None,
error: str = None
) -> ModuleResult:
"""
Helper method to create a module result.
Args:
findings: List of findings
status: Execution status
summary: Summary statistics
metadata: Additional metadata
error: Error message if failed
Returns:
ModuleResult object
"""
return ModuleResult(
module=self._metadata.name,
version=self._metadata.version,
status=status,
execution_time=self.get_execution_time(),
findings=findings,
summary=summary or self._generate_summary(findings),
metadata=metadata or {},
error=error
)
def _generate_summary(self, findings: List[ModuleFinding]) -> Dict[str, Any]:
"""
Generate summary statistics from findings.
Args:
findings: List of findings
Returns:
Summary dictionary
"""
severity_counts = {
"info": 0,
"low": 0,
"medium": 0,
"high": 0,
"critical": 0
}
category_counts = {}
for finding in findings:
# Count by severity
if finding.severity in severity_counts:
severity_counts[finding.severity] += 1
# Count by category
if finding.category not in category_counts:
category_counts[finding.category] = 0
category_counts[finding.category] += 1
return {
"total_findings": len(findings),
"severity_counts": severity_counts,
"category_counts": category_counts,
"highest_severity": self._get_highest_severity(findings)
}
def _get_highest_severity(self, findings: List[ModuleFinding]) -> str:
"""
Get the highest severity from findings.
Args:
findings: List of findings
Returns:
Highest severity level
"""
severity_order = ["critical", "high", "medium", "low", "info"]
for severity in severity_order:
if any(f.severity == severity for f in findings):
return severity
return "none"


@@ -1,14 +0,0 @@
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
from .sarif_reporter import SARIFReporter
__all__ = ["SARIFReporter"]


@@ -1,401 +0,0 @@
"""
SARIF Reporter Module - Generates SARIF-formatted security reports
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import logging
from pathlib import Path
from typing import Dict, Any, List
from datetime import datetime
import json
try:
from toolbox.modules.base import BaseModule, ModuleMetadata, ModuleResult, ModuleFinding
except ImportError:
try:
from modules.base import BaseModule, ModuleMetadata, ModuleResult, ModuleFinding
except ImportError:
from src.toolbox.modules.base import BaseModule, ModuleMetadata, ModuleResult, ModuleFinding
logger = logging.getLogger(__name__)
class SARIFReporter(BaseModule):
"""
Generates SARIF (Static Analysis Results Interchange Format) reports.
This module:
- Converts findings to SARIF format
- Aggregates results from multiple modules
- Adds metadata and context
- Provides actionable recommendations
"""
def get_metadata(self) -> ModuleMetadata:
"""Get module metadata"""
return ModuleMetadata(
name="sarif_reporter",
version="1.0.0",
description="Generates SARIF-formatted security reports",
author="FuzzForge Team",
category="reporter",
tags=["reporting", "sarif", "output"],
input_schema={
"findings": {
"type": "array",
"description": "List of findings to report",
"required": True
},
"tool_name": {
"type": "string",
"description": "Name of the tool",
"default": "FuzzForge Security Assessment"
},
"tool_version": {
"type": "string",
"description": "Tool version",
"default": "1.0.0"
},
"include_code_flows": {
"type": "boolean",
"description": "Include code flow information",
"default": False
}
},
output_schema={
"sarif": {
"type": "object",
"description": "SARIF 2.1.0 formatted report"
}
},
requires_workspace=False # Reporter doesn't need direct workspace access
)
def validate_config(self, config: Dict[str, Any]) -> bool:
"""Validate module configuration"""
if "findings" not in config and "modules_results" not in config:
raise ValueError("Either 'findings' or 'modules_results' must be provided")
return True
async def execute(self, config: Dict[str, Any], workspace: Path = None) -> ModuleResult:
"""
Execute the SARIF reporter module.
Args:
config: Module configuration with findings
workspace: Optional workspace path for context
Returns:
ModuleResult with SARIF report
"""
self.start_timer()
self.validate_config(config)
# Get configuration
tool_name = config.get("tool_name", "FuzzForge Security Assessment")
tool_version = config.get("tool_version", "1.0.0")
include_code_flows = config.get("include_code_flows", False)
# Collect findings from either direct findings or module results
all_findings = []
if "findings" in config:
# Direct findings provided
all_findings = config["findings"]
if isinstance(all_findings, list) and all(isinstance(f, dict) for f in all_findings):
# Convert dict findings to ModuleFinding objects
all_findings = [ModuleFinding(**f) if isinstance(f, dict) else f for f in all_findings]
elif "modules_results" in config:
# Aggregate from module results
for module_result in config["modules_results"]:
if isinstance(module_result, dict):
findings = module_result.get("findings", [])
all_findings.extend(findings)
elif hasattr(module_result, "findings"):
all_findings.extend(module_result.findings)
logger.info(f"Generating SARIF report for {len(all_findings)} findings")
try:
# Generate SARIF report
sarif_report = self._generate_sarif(
findings=all_findings,
tool_name=tool_name,
tool_version=tool_version,
include_code_flows=include_code_flows,
workspace_path=str(workspace) if workspace else None
)
# Create summary
summary = self._generate_report_summary(all_findings)
return ModuleResult(
module=self.get_metadata().name,
version=self.get_metadata().version,
status="success",
execution_time=self.get_execution_time(),
findings=[], # Reporter doesn't generate new findings
summary=summary,
metadata={
"tool_name": tool_name,
"tool_version": tool_version,
"report_format": "SARIF 2.1.0",
"total_findings": len(all_findings)
},
error=None,
sarif=sarif_report # Add SARIF as custom field
)
except Exception as e:
logger.error(f"SARIF reporter failed: {e}")
return self.create_result(
findings=[],
status="failed",
error=str(e)
)
def _generate_sarif(
self,
findings: List[ModuleFinding],
tool_name: str,
tool_version: str,
include_code_flows: bool,
workspace_path: str = None
) -> Dict[str, Any]:
"""
Generate SARIF 2.1.0 formatted report.
Args:
findings: List of findings to report
tool_name: Name of the tool
tool_version: Tool version
include_code_flows: Whether to include code flow information
workspace_path: Optional workspace path
Returns:
SARIF formatted dictionary
"""
# Create rules from unique finding types
rules = self._create_rules(findings)
# Create results from findings
results = self._create_results(findings, include_code_flows)
# Build SARIF structure
sarif = {
"$schema": "https://raw.githubusercontent.com/oasis-tcs/sarif-spec/master/Schemata/sarif-schema-2.1.0.json",
"version": "2.1.0",
"runs": [
{
"tool": {
"driver": {
"name": tool_name,
"version": tool_version,
"informationUri": "https://fuzzforge.io",
"rules": rules
}
},
"results": results,
"invocations": [
{
"executionSuccessful": True,
"endTimeUtc": datetime.utcnow().isoformat() + "Z"
}
]
}
]
}
# Add workspace information if available
if workspace_path:
sarif["runs"][0]["originalUriBaseIds"] = {
"WORKSPACE": {
"uri": f"file://{workspace_path}/",
"description": "The workspace root directory"
}
}
return sarif
def _create_rules(self, findings: List[ModuleFinding]) -> List[Dict[str, Any]]:
"""
Create SARIF rules from findings.
Args:
findings: List of findings
Returns:
List of SARIF rule objects
"""
rules_dict = {}
for finding in findings:
rule_id = f"{finding.category}_{finding.severity}"
if rule_id not in rules_dict:
rules_dict[rule_id] = {
"id": rule_id,
"name": finding.category.replace("_", " ").title(),
"shortDescription": {
"text": f"{finding.category} vulnerability"
},
"fullDescription": {
"text": f"Detection rule for {finding.category} vulnerabilities with {finding.severity} severity"
},
"defaultConfiguration": {
"level": self._severity_to_sarif_level(finding.severity)
},
"properties": {
"category": finding.category,
"severity": finding.severity,
"tags": ["security", finding.category, finding.severity]
}
}
return list(rules_dict.values())
def _create_results(
self, findings: List[ModuleFinding], include_code_flows: bool
) -> List[Dict[str, Any]]:
"""
Create SARIF results from findings.
Args:
findings: List of findings
include_code_flows: Whether to include code flows
Returns:
List of SARIF result objects
"""
results = []
for finding in findings:
result = {
"ruleId": f"{finding.category}_{finding.severity}",
"level": self._severity_to_sarif_level(finding.severity),
"message": {
"text": finding.description
},
"locations": []
}
# Add location information if available
if finding.file_path:
location = {
"physicalLocation": {
"artifactLocation": {
"uri": finding.file_path,
"uriBaseId": "WORKSPACE"
}
}
}
# Build the region only when a start line exists; guarding here avoids
# a KeyError when only line_end or code_snippet is set without line_start
if finding.line_start:
region = {"startLine": finding.line_start}
if finding.line_end:
region["endLine"] = finding.line_end
if finding.code_snippet:
region["snippet"] = {"text": finding.code_snippet}
location["physicalLocation"]["region"] = region
result["locations"].append(location)
# Add fix suggestions if available
if finding.recommendation:
result["fixes"] = [
{
"description": {
"text": finding.recommendation
}
}
]
# Add properties
result["properties"] = {
"findingId": finding.id,
"title": finding.title,
"metadata": finding.metadata
}
results.append(result)
return results
def _severity_to_sarif_level(self, severity: str) -> str:
"""
Convert severity to SARIF level.
Args:
severity: Finding severity
Returns:
SARIF level string
"""
mapping = {
"critical": "error",
"high": "error",
"medium": "warning",
"low": "note",
"info": "none"
}
return mapping.get(severity.lower(), "warning")
def _generate_report_summary(self, findings: List[ModuleFinding]) -> Dict[str, Any]:
"""
Generate summary statistics for the report.
Args:
findings: List of findings
Returns:
Summary dictionary
"""
severity_counts = {
"critical": 0,
"high": 0,
"medium": 0,
"low": 0,
"info": 0
}
category_counts = {}
affected_files = set()
for finding in findings:
# Count by severity
if finding.severity in severity_counts:
severity_counts[finding.severity] += 1
# Count by category
if finding.category not in category_counts:
category_counts[finding.category] = 0
category_counts[finding.category] += 1
# Track affected files
if finding.file_path:
affected_files.add(finding.file_path)
return {
"total_findings": len(findings),
"severity_distribution": severity_counts,
"category_distribution": category_counts,
"affected_files": len(affected_files),
"report_format": "SARIF 2.1.0",
"generated_at": datetime.utcnow().isoformat()
}
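The reporter's core transformations are the severity-to-SARIF-level mapping and the fixed SARIF 2.1.0 envelope that findings are poured into. Both are small enough to show standalone (function names here are illustrative):

```python
# Standalone version of the severity mapping used by the reporter above,
# plus the minimal SARIF 2.1.0 envelope it feeds into.
def severity_to_sarif_level(severity: str) -> str:
    mapping = {"critical": "error", "high": "error",
               "medium": "warning", "low": "note", "info": "none"}
    return mapping.get(severity.lower(), "warning")

def empty_sarif(tool_name: str, tool_version: str) -> dict:
    return {
        "$schema": "https://raw.githubusercontent.com/oasis-tcs/sarif-spec/master/Schemata/sarif-schema-2.1.0.json",
        "version": "2.1.0",
        "runs": [{"tool": {"driver": {"name": tool_name,
                                      "version": tool_version,
                                      "rules": []}},
                  "results": []}],
    }

print(severity_to_sarif_level("HIGH"))        # lookup is case-insensitive
print(empty_sarif("demo", "0.0.1")["version"])
```

Note the defensive default: an unrecognised severity maps to `"warning"` rather than raising, so a malformed finding cannot abort report generation.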


@@ -1,14 +0,0 @@
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
from .file_scanner import FileScanner
__all__ = ["FileScanner"]


@@ -1,315 +0,0 @@
"""
File Scanner Module - Scans and enumerates files in the workspace
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import logging
import mimetypes
from pathlib import Path
from typing import Dict, Any, List
import hashlib
try:
from toolbox.modules.base import BaseModule, ModuleMetadata, ModuleResult, ModuleFinding
except ImportError:
try:
from modules.base import BaseModule, ModuleMetadata, ModuleResult, ModuleFinding
except ImportError:
from src.toolbox.modules.base import BaseModule, ModuleMetadata, ModuleResult, ModuleFinding
logger = logging.getLogger(__name__)
class FileScanner(BaseModule):
"""
Scans files in the mounted workspace and collects information.
This module:
- Enumerates files based on patterns
- Detects file types
- Calculates file hashes
- Identifies potentially sensitive files
"""
def get_metadata(self) -> ModuleMetadata:
"""Get module metadata"""
return ModuleMetadata(
name="file_scanner",
version="1.0.0",
description="Scans and enumerates files in the workspace",
author="FuzzForge Team",
category="scanner",
tags=["files", "enumeration", "discovery"],
input_schema={
"patterns": {
"type": "array",
"items": {"type": "string"},
"description": "File patterns to scan (e.g., ['*.py', '*.js'])",
"default": ["*"]
},
"max_file_size": {
"type": "integer",
"description": "Maximum file size to scan in bytes",
"default": 10485760 # 10MB
},
"check_sensitive": {
"type": "boolean",
"description": "Check for sensitive file patterns",
"default": True
},
"calculate_hashes": {
"type": "boolean",
"description": "Calculate SHA256 hashes for files",
"default": False
}
},
output_schema={
"findings": {
"type": "array",
"description": "List of discovered files with metadata"
}
},
requires_workspace=True
)
def validate_config(self, config: Dict[str, Any]) -> bool:
"""Validate module configuration"""
patterns = config.get("patterns", ["*"])
if not isinstance(patterns, list):
raise ValueError("patterns must be a list")
max_size = config.get("max_file_size", 10485760)
if not isinstance(max_size, int) or max_size <= 0:
raise ValueError("max_file_size must be a positive integer")
return True
async def execute(self, config: Dict[str, Any], workspace: Path) -> ModuleResult:
"""
Execute the file scanning module.
Args:
config: Module configuration
workspace: Path to the workspace directory
Returns:
ModuleResult with file findings
"""
self.start_timer()
self.validate_workspace(workspace)
self.validate_config(config)
findings = []
file_count = 0
total_size = 0
file_types = {}
# Get configuration
patterns = config.get("patterns", ["*"])
max_file_size = config.get("max_file_size", 10485760)
check_sensitive = config.get("check_sensitive", True)
calculate_hashes = config.get("calculate_hashes", False)
logger.info(f"Scanning workspace with patterns: {patterns}")
try:
# Scan for each pattern
for pattern in patterns:
for file_path in workspace.rglob(pattern):
if not file_path.is_file():
continue
file_count += 1
relative_path = file_path.relative_to(workspace)
# Get file stats
try:
stats = file_path.stat()
file_size = stats.st_size
total_size += file_size
# Skip large files
if file_size > max_file_size:
logger.warning(f"Skipping large file: {relative_path} ({file_size} bytes)")
continue
# Detect file type
file_type = self._detect_file_type(file_path)
if file_type not in file_types:
file_types[file_type] = 0
file_types[file_type] += 1
# Check for sensitive files
if check_sensitive and self._is_sensitive_file(file_path):
findings.append(self.create_finding(
title=f"Potentially sensitive file: {relative_path.name}",
description=f"Found potentially sensitive file at {relative_path}",
severity="medium",
category="sensitive_file",
file_path=str(relative_path),
metadata={
"file_size": file_size,
"file_type": file_type
}
))
# Calculate hash if requested
file_hash = None
if calculate_hashes and file_size < 1048576: # Only hash files < 1MB
file_hash = self._calculate_hash(file_path)
# Create informational finding for each file
findings.append(self.create_finding(
title=f"File discovered: {relative_path.name}",
description=f"File: {relative_path}",
severity="info",
category="file_enumeration",
file_path=str(relative_path),
metadata={
"file_size": file_size,
"file_type": file_type,
"file_hash": file_hash
}
))
except Exception as e:
logger.error(f"Error processing file {relative_path}: {e}")
# Create summary
summary = {
"total_files": file_count,
"total_size_bytes": total_size,
"file_types": file_types,
"patterns_scanned": patterns
}
return self.create_result(
findings=findings,
status="success",
summary=summary,
metadata={
"workspace": str(workspace),
"config": config
}
)
except Exception as e:
logger.error(f"File scanner failed: {e}")
return self.create_result(
findings=findings,
status="failed",
error=str(e)
)
def _detect_file_type(self, file_path: Path) -> str:
"""
Detect the type of a file.
Args:
file_path: Path to the file
Returns:
File type string
"""
# Try to determine from extension
mime_type, _ = mimetypes.guess_type(str(file_path))
if mime_type:
return mime_type
# Check by extension
ext = file_path.suffix.lower()
type_map = {
'.py': 'text/x-python',
'.js': 'application/javascript',
'.java': 'text/x-java',
'.cpp': 'text/x-c++',
'.c': 'text/x-c',
'.go': 'text/x-go',
'.rs': 'text/x-rust',
'.rb': 'text/x-ruby',
'.php': 'text/x-php',
'.yaml': 'text/yaml',
'.yml': 'text/yaml',
'.json': 'application/json',
'.xml': 'text/xml',
'.md': 'text/markdown',
'.txt': 'text/plain',
'.sh': 'text/x-shellscript',
'.bat': 'text/x-batch',
'.ps1': 'text/x-powershell'
}
return type_map.get(ext, 'application/octet-stream')
def _is_sensitive_file(self, file_path: Path) -> bool:
"""
Check if a file might contain sensitive information.
Args:
file_path: Path to the file
Returns:
True if potentially sensitive
"""
sensitive_patterns = [
'.env',
'.env.local',
'.env.production',
'credentials',
'password',
'secret',
'private_key',
'id_rsa',
'id_dsa',
'.pem',
'.key',
'.pfx',
'.p12',
'wallet',
'.ssh',
'token',
'api_key',
'config.json',
'settings.json',
'.git-credentials',
'.npmrc',
'.pypirc',
'.docker/config.json'
]
file_name_lower = file_path.name.lower()
for pattern in sensitive_patterns:
if pattern in file_name_lower:
return True
return False
def _calculate_hash(self, file_path: Path) -> str:
"""
Calculate SHA256 hash of a file.
Args:
file_path: Path to the file
Returns:
Hex string of the SHA256 hash, or None if hashing failed
"""
try:
sha256_hash = hashlib.sha256()
with open(file_path, "rb") as f:
for byte_block in iter(lambda: f.read(4096), b""):
sha256_hash.update(byte_block)
return sha256_hash.hexdigest()
except Exception as e:
logger.error(f"Failed to calculate hash for {file_path}: {e}")
return None
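The chunked-read pattern above keeps memory flat regardless of file size: the file is streamed through the hash in 4 KiB blocks instead of being read whole. A self-contained version (file name and helper name are illustrative):

```python
import hashlib
import tempfile
from pathlib import Path

def sha256_file(path: Path, chunk_size: int = 4096) -> str:
    # Stream the file in fixed-size chunks so large files never load whole
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            h.update(block)
    return h.hexdigest()

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello world")
digest = sha256_file(Path(tmp.name))
print(digest)
```

`iter(callable, sentinel)` calls `f.read(chunk_size)` repeatedly until it returns the empty-bytes sentinel at EOF, which is why the loop terminates without an explicit length check.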


@@ -1,36 +0,0 @@
"""
Secret Detection Modules
This package contains modules for detecting secrets, credentials, and sensitive information
in codebases and repositories.
Available modules:
- TruffleHog: Comprehensive secret detection with verification
- Gitleaks: Git-specific secret scanning and leak detection
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
from typing import List, Type
from ..base import BaseModule
# Module registry for automatic discovery
SECRET_DETECTION_MODULES: List[Type[BaseModule]] = []
def register_module(module_class: Type[BaseModule]):
"""Register a secret detection module"""
SECRET_DETECTION_MODULES.append(module_class)
return module_class
def get_available_modules() -> List[Type[BaseModule]]:
"""Get all available secret detection modules"""
return SECRET_DETECTION_MODULES.copy()
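The `register_module` decorator implements a simple import-time registry: decorating a class appends it to the package-level list and returns the class unchanged, so discovery requires no central manifest. A standalone sketch of the pattern (the names `REGISTRY`, `register`, and `DemoDetector` are illustrative):

```python
# Decorator-based registry, as used by the secret detection package above:
# classes register themselves the moment their module is imported.
REGISTRY: list[type] = []

def register(cls: type) -> type:
    REGISTRY.append(cls)
    return cls  # return the class unchanged so the decorator is transparent

@register
class DemoDetector:
    name = "demo"

print([c.name for c in REGISTRY])
```

Because registration happens as a side effect of class definition, importing the package's modules is all it takes to populate the list that `get_available_modules()` copies out.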


@@ -1,351 +0,0 @@
"""
Gitleaks Secret Detection Module
This module uses Gitleaks to detect secrets and sensitive information in Git repositories
and file systems.
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import asyncio
import json
from pathlib import Path
from typing import Dict, Any, List
import subprocess
import logging
from ..base import BaseModule, ModuleMetadata, ModuleFinding, ModuleResult
from . import register_module
logger = logging.getLogger(__name__)
@register_module
class GitleaksModule(BaseModule):
"""Gitleaks secret detection module"""
def get_metadata(self) -> ModuleMetadata:
"""Get module metadata"""
return ModuleMetadata(
name="gitleaks",
version="8.18.0",
description="Git-specific secret scanning and leak detection using Gitleaks",
author="FuzzForge Team",
category="secret_detection",
tags=["secrets", "git", "leak-detection", "credentials"],
input_schema={
"type": "object",
"properties": {
"scan_mode": {
"type": "string",
"enum": ["detect", "protect"],
"default": "detect",
"description": "Scan mode: detect (entire repo history) or protect (staged changes)"
},
"config_file": {
"type": "string",
"description": "Path to custom Gitleaks configuration file"
},
"baseline_file": {
"type": "string",
"description": "Path to baseline file to ignore known findings"
},
"max_target_megabytes": {
"type": "integer",
"default": 100,
"description": "Maximum size of files to scan (in MB)"
},
"redact": {
"type": "boolean",
"default": True,
"description": "Redact secrets in output"
},
"no_git": {
"type": "boolean",
"default": False,
"description": "Scan files without Git context"
}
}
},
output_schema={
"type": "object",
"properties": {
"findings": {
"type": "array",
"items": {
"type": "object",
"properties": {
"rule_id": {"type": "string"},
"category": {"type": "string"},
"file_path": {"type": "string"},
"line_number": {"type": "integer"},
"secret": {"type": "string"}
}
}
}
}
}
)
def validate_config(self, config: Dict[str, Any]) -> bool:
"""Validate configuration"""
scan_mode = config.get("scan_mode", "detect")
if scan_mode not in ["detect", "protect"]:
raise ValueError("scan_mode must be 'detect' or 'protect'")
max_size = config.get("max_target_megabytes", 100)
if not isinstance(max_size, int) or max_size < 1 or max_size > 1000:
raise ValueError("max_target_megabytes must be between 1 and 1000")
return True
async def execute(self, config: Dict[str, Any], workspace: Path) -> ModuleResult:
"""Execute Gitleaks secret detection"""
self.start_timer()
try:
# Validate inputs
self.validate_config(config)
self.validate_workspace(workspace)
logger.info(f"Running Gitleaks on {workspace}")
# Build Gitleaks command
scan_mode = config.get("scan_mode", "detect")
cmd = ["gitleaks", scan_mode]
# Add source path
cmd.extend(["--source", str(workspace)])
# Create temp file for JSON output
import tempfile
output_file = tempfile.NamedTemporaryFile(mode='w+', suffix='.json', delete=False)
output_path = output_file.name
output_file.close()
# Add report format and output file
cmd.extend(["--report-format", "json"])
cmd.extend(["--report-path", output_path])
# Add redact option
if config.get("redact", True):
cmd.append("--redact")
# Add max target size
max_size = config.get("max_target_megabytes", 100)
cmd.extend(["--max-target-megabytes", str(max_size)])
# Add config file if specified
if config.get("config_file"):
config_path = Path(config["config_file"])
if config_path.exists():
cmd.extend(["--config", str(config_path)])
# Add baseline file if specified
if config.get("baseline_file"):
baseline_path = Path(config["baseline_file"])
if baseline_path.exists():
cmd.extend(["--baseline-path", str(baseline_path)])
# Add no-git flag if specified
if config.get("no_git", False):
cmd.append("--no-git")
# Add verbose output
cmd.append("--verbose")
logger.debug(f"Running command: {' '.join(cmd)}")
# Run Gitleaks
process = await asyncio.create_subprocess_exec(
*cmd,
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE,
cwd=workspace
)
stdout, stderr = await process.communicate()
# Parse results
findings = []
try:
# Read the JSON output from file
with open(output_path, 'r') as f:
output_content = f.read()
if process.returncode == 0:
# No secrets found
logger.info("No secrets detected by Gitleaks")
elif process.returncode == 1:
# Secrets found - parse from file content
findings = self._parse_gitleaks_output(output_content, workspace)
else:
# Error occurred
error_msg = stderr.decode()
logger.error(f"Gitleaks failed: {error_msg}")
return self.create_result(
findings=[],
status="failed",
error=f"Gitleaks execution failed: {error_msg}"
)
finally:
# Clean up temp file
import os
try:
os.unlink(output_path)
                except OSError:
pass
# Create summary
summary = {
"total_leaks": len(findings),
"unique_rules": len(set(f.metadata.get("rule_id", "") for f in findings)),
"files_with_leaks": len(set(f.file_path for f in findings if f.file_path)),
"scan_mode": scan_mode
}
logger.info(f"Gitleaks found {len(findings)} potential leaks")
return self.create_result(
findings=findings,
status="success",
summary=summary
)
except Exception as e:
logger.error(f"Gitleaks module failed: {e}")
return self.create_result(
findings=[],
status="failed",
error=str(e)
)
def _parse_gitleaks_output(self, output: str, workspace: Path) -> List[ModuleFinding]:
"""Parse Gitleaks JSON output into findings"""
findings = []
if not output.strip():
return findings
try:
# Gitleaks outputs JSON array
results = json.loads(output)
if not isinstance(results, list):
logger.warning("Unexpected Gitleaks output format")
return findings
for result in results:
# Extract information
rule_id = result.get("RuleID", "unknown")
description = result.get("Description", "")
file_path = result.get("File", "")
line_number = result.get("LineNumber", 0)
secret = result.get("Secret", "")
match_text = result.get("Match", "")
# Commit info (if available)
commit = result.get("Commit", "")
author = result.get("Author", "")
email = result.get("Email", "")
date = result.get("Date", "")
# Make file path relative to workspace
if file_path:
try:
rel_path = Path(file_path).relative_to(workspace)
file_path = str(rel_path)
except ValueError:
# If file is outside workspace, keep absolute path
pass
# Determine severity based on rule type
severity = self._get_leak_severity(rule_id, description)
# Create finding
finding = self.create_finding(
title=f"Secret leak detected: {rule_id}",
description=self._get_leak_description(rule_id, description, commit),
severity=severity,
category="secret_leak",
file_path=file_path if file_path else None,
line_start=line_number if line_number > 0 else None,
code_snippet=match_text if match_text else secret,
recommendation=self._get_leak_recommendation(rule_id),
metadata={
"rule_id": rule_id,
"secret_type": description,
"commit": commit,
"author": author,
"email": email,
"date": date,
"entropy": result.get("Entropy", 0),
"fingerprint": result.get("Fingerprint", "")
}
)
findings.append(finding)
except json.JSONDecodeError as e:
logger.warning(f"Failed to parse Gitleaks output: {e}")
except Exception as e:
logger.warning(f"Error processing Gitleaks results: {e}")
return findings
def _get_leak_severity(self, rule_id: str, description: str) -> str:
"""Determine severity based on secret type"""
critical_patterns = [
"aws", "amazon", "gcp", "google", "azure", "microsoft",
"private_key", "rsa", "ssh", "certificate", "database",
"password", "auth", "token", "secret", "key"
]
rule_lower = rule_id.lower()
desc_lower = description.lower()
# Check for critical patterns
for pattern in critical_patterns:
if pattern in rule_lower or pattern in desc_lower:
if any(x in rule_lower for x in ["aws", "gcp", "azure"]):
return "critical"
elif any(x in rule_lower for x in ["private", "key", "password"]):
return "high"
else:
return "medium"
return "low"
def _get_leak_description(self, rule_id: str, description: str, commit: str) -> str:
"""Get description for the leak finding"""
base_desc = f"Gitleaks detected a potential secret leak matching rule '{rule_id}'"
if description:
base_desc += f" ({description})"
if commit:
base_desc += f" in commit {commit[:8]}"
base_desc += ". This may indicate sensitive information has been committed to version control."
return base_desc
def _get_leak_recommendation(self, rule_id: str) -> str:
"""Get remediation recommendation"""
base_rec = "Remove the secret from the codebase and Git history. "
if any(pattern in rule_id.lower() for pattern in ["aws", "gcp", "azure"]):
base_rec += "Revoke the cloud credentials immediately and rotate them. "
base_rec += "Consider using Git history rewriting tools (git-filter-branch, BFG) " \
"to remove sensitive data from commit history. Implement pre-commit hooks " \
"to prevent future secret commits."
return base_rec


@@ -1,294 +0,0 @@
"""
TruffleHog Secret Detection Module
This module uses TruffleHog to detect secrets, credentials, and sensitive information
with verification capabilities.
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import asyncio
import json
import tempfile
from pathlib import Path
from typing import Dict, Any, List
import subprocess
import logging
from ..base import BaseModule, ModuleMetadata, ModuleFinding, ModuleResult
from . import register_module
logger = logging.getLogger(__name__)
@register_module
class TruffleHogModule(BaseModule):
"""TruffleHog secret detection module"""
def get_metadata(self) -> ModuleMetadata:
"""Get module metadata"""
return ModuleMetadata(
name="trufflehog",
version="3.63.2",
description="Comprehensive secret detection with verification using TruffleHog",
author="FuzzForge Team",
category="secret_detection",
tags=["secrets", "credentials", "sensitive-data", "verification"],
input_schema={
"type": "object",
"properties": {
"verify": {
"type": "boolean",
"default": False,
"description": "Verify discovered secrets"
},
"include_detectors": {
"type": "array",
"items": {"type": "string"},
"description": "Specific detectors to include"
},
"exclude_detectors": {
"type": "array",
"items": {"type": "string"},
"description": "Specific detectors to exclude"
},
"max_depth": {
"type": "integer",
"default": 10,
"description": "Maximum directory depth to scan"
},
"concurrency": {
"type": "integer",
"default": 10,
"description": "Number of concurrent workers"
}
}
},
output_schema={
"type": "object",
"properties": {
"findings": {
"type": "array",
"items": {
"type": "object",
"properties": {
"detector": {"type": "string"},
"verified": {"type": "boolean"},
"file_path": {"type": "string"},
"line": {"type": "integer"},
"secret": {"type": "string"}
}
}
}
}
}
)
def validate_config(self, config: Dict[str, Any]) -> bool:
"""Validate configuration"""
# Check concurrency bounds
concurrency = config.get("concurrency", 10)
if not isinstance(concurrency, int) or concurrency < 1 or concurrency > 50:
raise ValueError("Concurrency must be between 1 and 50")
# Check max_depth bounds
max_depth = config.get("max_depth", 10)
if not isinstance(max_depth, int) or max_depth < 1 or max_depth > 20:
raise ValueError("Max depth must be between 1 and 20")
return True
async def execute(self, config: Dict[str, Any], workspace: Path) -> ModuleResult:
"""Execute TruffleHog secret detection"""
self.start_timer()
try:
# Validate inputs
self.validate_config(config)
self.validate_workspace(workspace)
logger.info(f"Running TruffleHog on {workspace}")
# Build TruffleHog command
cmd = ["trufflehog", "filesystem", str(workspace)]
            # Verification is on by default in TruffleHog v3;
            # pass --no-verification when verification is not requested
            if not config.get("verify", False):
                cmd.append("--no-verification")
# Add JSON output
cmd.extend(["--json", "--no-update"])
# Add concurrency
cmd.extend(["--concurrency", str(config.get("concurrency", 10))])
# Add max depth
cmd.extend(["--max-depth", str(config.get("max_depth", 10))])
# Add include/exclude detectors
if config.get("include_detectors"):
cmd.extend(["--include-detectors", ",".join(config["include_detectors"])])
if config.get("exclude_detectors"):
cmd.extend(["--exclude-detectors", ",".join(config["exclude_detectors"])])
logger.debug(f"Running command: {' '.join(cmd)}")
# Run TruffleHog
process = await asyncio.create_subprocess_exec(
*cmd,
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE,
cwd=workspace
)
stdout, stderr = await process.communicate()
# Parse results
findings = []
            if process.returncode in (0, 1):  # returncode 1 means secrets were found
                findings = self._parse_trufflehog_output(stdout.decode(), workspace)
else:
error_msg = stderr.decode()
logger.error(f"TruffleHog failed: {error_msg}")
return self.create_result(
findings=[],
status="failed",
error=f"TruffleHog execution failed: {error_msg}"
)
# Create summary
summary = {
"total_secrets": len(findings),
"verified_secrets": len([f for f in findings if f.metadata.get("verified", False)]),
"detectors_triggered": len(set(f.metadata.get("detector", "") for f in findings)),
"files_with_secrets": len(set(f.file_path for f in findings if f.file_path))
}
logger.info(f"TruffleHog found {len(findings)} secrets")
return self.create_result(
findings=findings,
status="success",
summary=summary
)
except Exception as e:
logger.error(f"TruffleHog module failed: {e}")
return self.create_result(
findings=[],
status="failed",
error=str(e)
)
def _parse_trufflehog_output(self, output: str, workspace: Path) -> List[ModuleFinding]:
"""Parse TruffleHog JSON output into findings"""
findings = []
for line in output.strip().split('\n'):
if not line.strip():
continue
try:
result = json.loads(line)
# Extract information
detector = result.get("DetectorName", "unknown")
verified = result.get("Verified", False)
raw_secret = result.get("Raw", "")
# Source info
source_metadata = result.get("SourceMetadata", {})
source_data = source_metadata.get("Data", {})
file_path = source_data.get("Filesystem", {}).get("file", "")
line_num = source_data.get("Filesystem", {}).get("line", 0)
# Make file path relative to workspace
if file_path:
try:
rel_path = Path(file_path).relative_to(workspace)
file_path = str(rel_path)
except ValueError:
# If file is outside workspace, keep absolute path
pass
# Determine severity based on verification and detector type
severity = self._get_secret_severity(detector, verified, raw_secret)
# Create finding
finding = self.create_finding(
title=f"{detector} secret detected",
description=self._get_secret_description(detector, verified),
severity=severity,
category="secret_detection",
file_path=file_path if file_path else None,
line_start=line_num if line_num > 0 else None,
code_snippet=self._truncate_secret(raw_secret),
recommendation=self._get_secret_recommendation(detector, verified),
metadata={
"detector": detector,
"verified": verified,
"detector_type": result.get("DetectorType", ""),
"decoder_type": result.get("DecoderType", ""),
"structured_data": result.get("StructuredData", {})
}
)
findings.append(finding)
except json.JSONDecodeError as e:
logger.warning(f"Failed to parse TruffleHog output line: {e}")
continue
except Exception as e:
logger.warning(f"Error processing TruffleHog result: {e}")
continue
return findings
def _get_secret_severity(self, detector: str, verified: bool, secret: str) -> str:
"""Determine severity based on secret type and verification status"""
if verified:
# Verified secrets are always high risk
critical_detectors = ["aws", "gcp", "azure", "github", "gitlab", "database"]
if any(crit in detector.lower() for crit in critical_detectors):
return "critical"
return "high"
# Unverified secrets
high_risk_detectors = ["private_key", "certificate", "password", "token"]
if any(high in detector.lower() for high in high_risk_detectors):
return "medium"
return "low"
def _get_secret_description(self, detector: str, verified: bool) -> str:
"""Get description for the secret finding"""
verification_status = "verified and active" if verified else "unverified"
return f"A {detector} secret was detected and is {verification_status}. " \
f"This may represent a security risk if the credential is valid."
def _get_secret_recommendation(self, detector: str, verified: bool) -> str:
"""Get remediation recommendation"""
if verified:
return f"IMMEDIATE ACTION REQUIRED: This {detector} secret is verified and active. " \
f"Revoke the credential immediately, remove it from the codebase, and " \
f"implement proper secret management practices."
else:
return f"Review this {detector} secret to determine if it's valid. " \
f"If real, revoke the credential and remove it from the codebase. " \
f"Consider implementing secret scanning in CI/CD pipelines."
def _truncate_secret(self, secret: str, max_length: int = 50) -> str:
"""Truncate secret for display purposes"""
if len(secret) <= max_length:
return secret
return secret[:max_length] + "..."


@@ -1,11 +0,0 @@
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.


@@ -1,12 +0,0 @@
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.


@@ -1,47 +0,0 @@
# Secret Detection Workflow Dockerfile
FROM prefecthq/prefect:3-python3.11
# Install system dependencies
RUN apt-get update && apt-get install -y \
curl \
wget \
git \
ca-certificates \
gnupg \
&& rm -rf /var/lib/apt/lists/*
# Install TruffleHog (use direct binary download to avoid install script issues)
RUN curl -sSfL "https://github.com/trufflesecurity/trufflehog/releases/download/v3.63.2/trufflehog_3.63.2_linux_amd64.tar.gz" -o trufflehog.tar.gz \
&& tar -xzf trufflehog.tar.gz \
&& mv trufflehog /usr/local/bin/ \
&& rm trufflehog.tar.gz
# Install Gitleaks (use specific version to avoid API rate limiting)
RUN wget https://github.com/gitleaks/gitleaks/releases/download/v8.18.2/gitleaks_8.18.2_linux_x64.tar.gz \
&& tar -xzf gitleaks_8.18.2_linux_x64.tar.gz \
&& mv gitleaks /usr/local/bin/ \
&& rm gitleaks_8.18.2_linux_x64.tar.gz
# Verify installations
RUN trufflehog --version && gitleaks version
# Set working directory
WORKDIR /opt/prefect
# Create toolbox directory structure
RUN mkdir -p /opt/prefect/toolbox
# Set environment variables
ENV PYTHONPATH=/opt/prefect/toolbox:/opt/prefect/toolbox/workflows
ENV WORKFLOW_NAME=secret_detection_scan
# The toolbox code will be mounted at runtime from the backend container
# This includes:
# - /opt/prefect/toolbox/modules/base.py
# - /opt/prefect/toolbox/modules/secret_detection/ (TruffleHog, Gitleaks modules)
# - /opt/prefect/toolbox/modules/reporter/ (SARIF reporter)
# - /opt/prefect/toolbox/workflows/comprehensive/secret_detection_scan/
VOLUME /opt/prefect/toolbox
# Set working directory for execution
WORKDIR /opt/prefect


@@ -1,58 +0,0 @@
# Secret Detection Workflow Dockerfile - Self-Contained Version
# This version copies all required modules into the image for complete isolation
FROM prefecthq/prefect:3-python3.11
# Install system dependencies
RUN apt-get update && apt-get install -y \
curl \
wget \
git \
ca-certificates \
gnupg \
&& rm -rf /var/lib/apt/lists/*
# Install TruffleHog
RUN curl -sSfL https://raw.githubusercontent.com/trufflesecurity/trufflehog/main/scripts/install.sh | sh -s -- -b /usr/local/bin
# Install Gitleaks (pinned version for reproducibility)
RUN wget https://github.com/gitleaks/gitleaks/releases/download/v8.18.2/gitleaks_8.18.2_linux_x64.tar.gz \
    && tar -xzf gitleaks_8.18.2_linux_x64.tar.gz \
    && mv gitleaks /usr/local/bin/ \
    && rm gitleaks_8.18.2_linux_x64.tar.gz
# Verify installations
RUN trufflehog --version && gitleaks version
# Set working directory
WORKDIR /opt/prefect
# Create directory structure
RUN mkdir -p /opt/prefect/toolbox/modules/secret_detection \
/opt/prefect/toolbox/modules/reporter \
/opt/prefect/toolbox/workflows/comprehensive/secret_detection_scan
# Copy the base module and required modules
COPY toolbox/modules/base.py /opt/prefect/toolbox/modules/base.py
COPY toolbox/modules/__init__.py /opt/prefect/toolbox/modules/__init__.py
COPY toolbox/modules/secret_detection/ /opt/prefect/toolbox/modules/secret_detection/
COPY toolbox/modules/reporter/ /opt/prefect/toolbox/modules/reporter/
# Copy the workflow code
COPY toolbox/workflows/comprehensive/secret_detection_scan/ /opt/prefect/toolbox/workflows/comprehensive/secret_detection_scan/
# Copy toolbox init files
COPY toolbox/__init__.py /opt/prefect/toolbox/__init__.py
COPY toolbox/workflows/__init__.py /opt/prefect/toolbox/workflows/__init__.py
COPY toolbox/workflows/comprehensive/__init__.py /opt/prefect/toolbox/workflows/comprehensive/__init__.py
# Install Python dependencies for the modules
# (asyncio is part of the Python standard library, so it is not installed from PyPI)
RUN pip install --no-cache-dir pydantic
# Set environment variables
ENV PYTHONPATH=/opt/prefect/toolbox:/opt/prefect/toolbox/workflows
ENV WORKFLOW_NAME=secret_detection_scan
# Set default command (can be overridden)
CMD ["python", "-m", "toolbox.workflows.comprehensive.secret_detection_scan.workflow"]


@@ -1,130 +0,0 @@
# Secret Detection Scan Workflow
This workflow performs comprehensive secret detection using multiple industry-standard tools:
- **TruffleHog**: Comprehensive secret detection with verification capabilities
- **Gitleaks**: Git-specific secret scanning and leak detection
## Features
- **Parallel Execution**: Runs TruffleHog and Gitleaks concurrently for faster results
- **Deduplication**: Automatically removes duplicate findings across tools
- **SARIF Output**: Generates standardized SARIF reports for integration with security tools
- **Configurable**: Supports extensive configuration for both tools
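Deduplication keys each finding on its file path, start line, and a 50-character title prefix (see `aggregate_findings_task` in the workflow). A standalone sketch of that signature scheme, with illustrative finding dicts:

```python
def dedupe(findings):
    """Keep the first finding for each (file, line, title-prefix) signature."""
    seen, unique = set(), []
    for f in findings:
        sig = (
            f.get("file_path", ""),
            f.get("line_start", 0),
            f.get("title", "").lower()[:50],
        )
        if sig not in seen:
            seen.add(sig)
            unique.append(f)
    return unique

findings = [
    {"file_path": "a.py", "line_start": 3, "title": "AWS secret detected"},    # from TruffleHog
    {"file_path": "a.py", "line_start": 3, "title": "AWS secret detected"},    # same leak via Gitleaks
    {"file_path": "b.py", "line_start": 7, "title": "Generic token detected"},
]
print(len(dedupe(findings)))  # → 2
```

The title prefix keeps near-identical titles from both tools collapsing only when file and line also agree, so distinct secrets on the same line are still reported separately.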
## Dependencies
### Required Modules
- `toolbox.modules.secret_detection.trufflehog`
- `toolbox.modules.secret_detection.gitleaks`
- `toolbox.modules.reporter` (SARIF reporter)
- `toolbox.modules.base` (Base module interface)
### External Tools
- TruffleHog v3.63.2+
- Gitleaks v8.18.0+
## Docker Deployment
This workflow provides two Docker deployment approaches:
### 1. Volume-Based Approach (Default: `Dockerfile`)
**Advantages:**
- Live code updates without rebuilding images
- Smaller image sizes
- Consistent module versions across workflows
- Faster development iteration
**How it works:**
- Docker image contains only external tools (TruffleHog, Gitleaks)
- Python modules are mounted at runtime from the backend container
- Backend manages code synchronization via shared volumes
### 2. Self-Contained Approach (`Dockerfile.self-contained`)
**Advantages:**
- Complete isolation and reproducibility
- No runtime dependencies on backend code
- Can run independently of FuzzForge platform
- Better for CI/CD integration
**How it works:**
- All required Python modules are copied into the Docker image
- Image is completely self-contained
- Larger image size but fully portable
## Configuration
### TruffleHog Configuration
```jsonc
{
"trufflehog_config": {
"verify": true, // Verify discovered secrets
"concurrency": 10, // Number of concurrent workers
"max_depth": 10, // Maximum directory depth
"include_detectors": [], // Specific detectors to include
"exclude_detectors": [] // Specific detectors to exclude
}
}
```
### Gitleaks Configuration
```jsonc
{
"gitleaks_config": {
"scan_mode": "detect", // "detect" or "protect"
"redact": true, // Redact secrets in output
"max_target_megabytes": 100, // Maximum file size (MB)
"no_git": false, // Scan without Git context
"config_file": "", // Custom Gitleaks config
"baseline_file": "" // Baseline file for known findings
}
}
```
## Usage Example
```bash
curl -X POST "http://localhost:8000/workflows/secret_detection_scan/submit" \
-H "Content-Type: application/json" \
-d '{
"target_path": "/path/to/scan",
"volume_mode": "ro",
"parameters": {
"trufflehog_config": {
"verify": true,
"concurrency": 15
},
"gitleaks_config": {
"scan_mode": "detect",
"max_target_megabytes": 200
}
}
}'
```
## Output Format
The workflow generates a SARIF report containing:
- All unique findings from both tools
- Severity levels mapped to standard scale
- File locations and line numbers
- Detailed descriptions and recommendations
- Tool-specific metadata
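The report follows the SARIF 2.1.0 layout; a minimal sketch of the envelope built in Python (values illustrative, real reports carry rule metadata and additional driver fields):

```python
import json

# Minimal SARIF 2.1.0 envelope with one result (illustrative values)
sarif = {
    "version": "2.1.0",
    "runs": [{
        "tool": {"driver": {"name": "FuzzForge Secret Detection", "rules": []}},
        "results": [{
            "ruleId": "aws-access-key",
            "level": "error",
            "message": {"text": "Secret leak detected: aws-access-key"},
            "locations": [{
                "physicalLocation": {
                    "artifactLocation": {"uri": "config/.env"},
                    "region": {"startLine": 12},
                }
            }],
        }],
    }],
}
print(json.dumps(sarif)[:25])
```

Security tools that ingest SARIF (e.g. code-scanning dashboards) read `ruleId`, `level`, and the `physicalLocation` block to place findings in the source view.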
## Performance Considerations
- **TruffleHog**: CPU-intensive with verification enabled
- **Gitleaks**: Memory-intensive for large repositories
- **Recommended Resources**: 512Mi memory, 500m CPU
- **Typical Runtime**: 1-5 minutes for small repos, 10-30 minutes for large ones
## Security Notes
- Secrets are redacted in output by default
- Verified secrets are marked with higher severity
- Both tools support custom rules and exclusions
- Consider using baseline files for known false positives
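The "verified secrets get higher severity" rule mirrors the TruffleHog module's `_get_secret_severity` logic; a simplified standalone sketch (detector names illustrative):

```python
def severity(detector: str, verified: bool) -> str:
    """Simplified mirror of the module's severity mapping."""
    name = detector.lower()
    if verified:
        critical = ("aws", "gcp", "azure", "github", "gitlab", "database")
        return "critical" if any(c in name for c in critical) else "high"
    high_risk = ("private_key", "certificate", "password", "token")
    return "medium" if any(h in name for h in high_risk) else "low"

print(severity("AWS Access Key", True), severity("SlackWebhook", False))  # → critical low
```

A verified cloud-provider credential is treated as critical because it is known-live and grants broad access, while an unverified match may be a false positive and is ranked lower.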


@@ -1,17 +0,0 @@
"""
Secret Detection Scan Workflow
This package contains the comprehensive secret detection workflow that combines
multiple secret detection tools for thorough analysis.
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.


@@ -1,113 +0,0 @@
name: secret_detection_scan
version: "2.0.0"
description: "Comprehensive secret detection using TruffleHog and Gitleaks"
author: "FuzzForge Team"
category: "comprehensive"
tags:
- "secrets"
- "credentials"
- "detection"
- "trufflehog"
- "gitleaks"
- "comprehensive"
supported_volume_modes:
- "ro"
- "rw"
default_volume_mode: "ro"
default_target_path: "/workspace"
requirements:
tools:
- "trufflehog"
- "gitleaks"
resources:
memory: "512Mi"
cpu: "500m"
timeout: 1800
has_docker: true
default_parameters:
target_path: "/workspace"
volume_mode: "ro"
trufflehog_config: {}
gitleaks_config: {}
reporter_config: {}
parameters:
type: object
properties:
target_path:
type: string
default: "/workspace"
description: "Path to analyze"
volume_mode:
type: string
enum: ["ro", "rw"]
default: "ro"
description: "Volume mount mode"
trufflehog_config:
type: object
description: "TruffleHog configuration"
properties:
verify:
type: boolean
description: "Verify discovered secrets"
concurrency:
type: integer
description: "Number of concurrent workers"
max_depth:
type: integer
description: "Maximum directory depth to scan"
include_detectors:
type: array
items:
type: string
description: "Specific detectors to include"
exclude_detectors:
type: array
items:
type: string
description: "Specific detectors to exclude"
gitleaks_config:
type: object
description: "Gitleaks configuration"
properties:
scan_mode:
type: string
enum: ["detect", "protect"]
description: "Scan mode"
redact:
type: boolean
description: "Redact secrets in output"
max_target_megabytes:
type: integer
description: "Maximum file size to scan (MB)"
no_git:
type: boolean
description: "Scan files without Git context"
config_file:
type: string
description: "Path to custom configuration file"
baseline_file:
type: string
description: "Path to baseline file"
reporter_config:
type: object
description: "SARIF reporter configuration"
properties:
output_file:
type: string
description: "Output SARIF file name"
include_code_flows:
type: boolean
description: "Include code flow information"
output_schema:
type: object
properties:
sarif:
type: object
description: "SARIF-formatted security findings"


@@ -1,290 +0,0 @@
"""
Secret Detection Scan Workflow
This workflow performs comprehensive secret detection using multiple tools:
- TruffleHog: Comprehensive secret detection with verification
- Gitleaks: Git-specific secret scanning
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import sys
import logging
from pathlib import Path
from typing import Dict, Any, List, Optional
from prefect import flow, task
from prefect.artifacts import create_markdown_artifact, create_table_artifact
import asyncio
import json
# Add modules to path
sys.path.insert(0, '/app')
# Import modules
from toolbox.modules.secret_detection.trufflehog import TruffleHogModule
from toolbox.modules.secret_detection.gitleaks import GitleaksModule
from toolbox.modules.reporter import SARIFReporter
# Configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
@task(name="trufflehog_scan")
async def run_trufflehog_task(workspace: Path, config: Dict[str, Any]) -> Dict[str, Any]:
    """
    Task to run TruffleHog secret detection.

    Args:
        workspace: Path to the workspace
        config: TruffleHog configuration

    Returns:
        TruffleHog results
    """
    logger.info("Running TruffleHog secret detection")
    module = TruffleHogModule()
    result = await module.execute(config, workspace)
    logger.info(f"TruffleHog completed: {result.summary.get('total_secrets', 0)} secrets found")
    return result.dict()


@task(name="gitleaks_scan")
async def run_gitleaks_task(workspace: Path, config: Dict[str, Any]) -> Dict[str, Any]:
    """
    Task to run Gitleaks secret detection.

    Args:
        workspace: Path to the workspace
        config: Gitleaks configuration

    Returns:
        Gitleaks results
    """
    logger.info("Running Gitleaks secret detection")
    module = GitleaksModule()
    result = await module.execute(config, workspace)
    logger.info(f"Gitleaks completed: {result.summary.get('total_leaks', 0)} leaks found")
    return result.dict()


@task(name="aggregate_findings")
async def aggregate_findings_task(
    trufflehog_results: Dict[str, Any],
    gitleaks_results: Dict[str, Any],
    config: Dict[str, Any],
    workspace: Path
) -> Dict[str, Any]:
    """
    Task to aggregate findings from all secret detection tools.

    Args:
        trufflehog_results: Results from TruffleHog
        gitleaks_results: Results from Gitleaks
        config: Reporter configuration
        workspace: Path to workspace

    Returns:
        Aggregated SARIF report
    """
    logger.info("Aggregating secret detection findings")

    # Combine all findings
    all_findings = []

    # Add TruffleHog findings
    trufflehog_findings = trufflehog_results.get("findings", [])
    all_findings.extend(trufflehog_findings)

    # Add Gitleaks findings
    gitleaks_findings = gitleaks_results.get("findings", [])
    all_findings.extend(gitleaks_findings)

    # Deduplicate findings based on file path and line number
    unique_findings = []
    seen_signatures = set()
    for finding in all_findings:
        # Create signature for deduplication
        signature = (
            finding.get("file_path", ""),
            finding.get("line_start", 0),
            finding.get("title", "").lower()[:50]  # First 50 chars of title
        )
        if signature not in seen_signatures:
            seen_signatures.add(signature)
            unique_findings.append(finding)
        else:
            logger.debug(f"Deduplicated finding: {signature}")

    logger.info(f"Aggregated {len(unique_findings)} unique findings from {len(all_findings)} total")

    # Generate SARIF report
    reporter = SARIFReporter()
    reporter_config = {
        **config,
        "findings": unique_findings,
        "tool_name": "FuzzForge Secret Detection",
        "tool_version": "1.0.0",
        "tool_description": "Comprehensive secret detection using TruffleHog and Gitleaks"
    }
    result = await reporter.execute(reporter_config, workspace)
    return result.dict().get("sarif", {})


@flow(name="secret_detection_scan", log_prints=True)
async def main_flow(
    target_path: str = "/workspace",
    volume_mode: str = "ro",
    trufflehog_config: Optional[Dict[str, Any]] = None,
    gitleaks_config: Optional[Dict[str, Any]] = None,
    reporter_config: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """
    Main secret detection workflow.

    This workflow:
    1. Runs TruffleHog for comprehensive secret detection
    2. Runs Gitleaks for Git-specific secret detection
    3. Aggregates and deduplicates findings
    4. Generates a unified SARIF report

    Args:
        target_path: Path to the mounted workspace (default: /workspace)
        volume_mode: Volume mount mode (ro/rw)
        trufflehog_config: Configuration for TruffleHog
        gitleaks_config: Configuration for Gitleaks
        reporter_config: Configuration for SARIF reporter

    Returns:
        SARIF-formatted findings report
    """
    logger.info("Starting comprehensive secret detection workflow")
    logger.info(f"Workspace: {target_path}, Mode: {volume_mode}")

    # Set workspace path
    workspace = Path(target_path)
    if not workspace.exists():
        logger.error(f"Workspace does not exist: {workspace}")
        return {
            "error": f"Workspace not found: {workspace}",
            "sarif": None
        }

    # Default configurations - merge with provided configs to ensure defaults are always applied
    default_trufflehog_config = {
        "verify": False,
        "concurrency": 10,
        "max_depth": 10,
        "no_git": True  # Add no_git for filesystem scanning
    }
    trufflehog_config = {**default_trufflehog_config, **(trufflehog_config or {})}

    default_gitleaks_config = {
        "scan_mode": "detect",
        "redact": True,
        "max_target_megabytes": 100,
        "no_git": True  # Critical for non-git directories
    }
    gitleaks_config = {**default_gitleaks_config, **(gitleaks_config or {})}

    default_reporter_config = {
        "include_code_flows": False
    }
    reporter_config = {**default_reporter_config, **(reporter_config or {})}

    try:
        # Run secret detection tools in parallel
        logger.info("Phase 1: Running secret detection tools")

        # Create coroutines for parallel execution
        trufflehog_task_result = run_trufflehog_task(workspace, trufflehog_config)
        gitleaks_task_result = run_gitleaks_task(workspace, gitleaks_config)

        # Wait for both to complete
        trufflehog_results, gitleaks_results = await asyncio.gather(
            trufflehog_task_result,
            gitleaks_task_result,
            return_exceptions=True
        )

        # Handle any exceptions
        if isinstance(trufflehog_results, Exception):
            logger.error(f"TruffleHog failed: {trufflehog_results}")
            trufflehog_results = {"findings": [], "status": "failed"}
        if isinstance(gitleaks_results, Exception):
            logger.error(f"Gitleaks failed: {gitleaks_results}")
            gitleaks_results = {"findings": [], "status": "failed"}

        # Aggregate findings
        logger.info("Phase 2: Aggregating findings")
        sarif_report = await aggregate_findings_task(
            trufflehog_results,
            gitleaks_results,
            reporter_config,
            workspace
        )

        # Log summary
        if sarif_report and "runs" in sarif_report:
            results_count = len(sarif_report["runs"][0].get("results", []))
            logger.info(f"Workflow completed successfully with {results_count} unique secret findings")

            # Log tool-specific stats
            trufflehog_count = len(trufflehog_results.get("findings", []))
            gitleaks_count = len(gitleaks_results.get("findings", []))
            logger.info(f"Tool results - TruffleHog: {trufflehog_count}, Gitleaks: {gitleaks_count}")
        else:
            logger.info("Workflow completed successfully with no findings")

        return sarif_report

    except Exception as e:
        logger.error(f"Secret detection workflow failed: {e}")
        # Return error in SARIF format
        return {
            "$schema": "https://raw.githubusercontent.com/oasis-tcs/sarif-spec/master/Schemata/sarif-schema-2.1.0.json",
            "version": "2.1.0",
            "runs": [
                {
                    "tool": {
                        "driver": {
                            "name": "FuzzForge Secret Detection",
                            "version": "1.0.0"
                        }
                    },
                    "results": [],
                    "invocations": [
                        {
                            "executionSuccessful": False,
                            "exitCode": 1,
                            "exitCodeDescription": str(e)
                        }
                    ]
                }
            ]
        }


if __name__ == "__main__":
    # For local testing
    import asyncio
    asyncio.run(main_flow(
        target_path="/tmp/test",
        trufflehog_config={"verify": True, "max_depth": 5},
        gitleaks_config={"scan_mode": "detect"}
    ))

@@ -1,187 +0,0 @@
"""
Manual Workflow Registry for Prefect Deployment

This file contains the manual registry of all workflows that can be deployed.
Developers MUST add their workflows here after creating them.

This approach is required because:
1. Prefect cannot deploy dynamically imported flows
2. Docker deployment needs static flow references
3. Explicit registration provides better control and visibility
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
from typing import Dict, Any, Callable
import logging

logger = logging.getLogger(__name__)

# Import only essential workflows.
# Import each workflow individually to handle failures gracefully.
security_assessment_flow = None
secret_detection_flow = None

try:
    from .security_assessment.workflow import main_flow as security_assessment_flow
except ImportError as e:
    logger.warning(f"Failed to import security_assessment workflow: {e}")

try:
    from .comprehensive.secret_detection_scan.workflow import main_flow as secret_detection_flow
except ImportError as e:
    logger.warning(f"Failed to import secret_detection_scan workflow: {e}")

# Manual registry - developers add workflows here after creation.
# Only include workflows that were successfully imported.
WORKFLOW_REGISTRY: Dict[str, Dict[str, Any]] = {}

if security_assessment_flow is not None:
    WORKFLOW_REGISTRY["security_assessment"] = {
        "flow": security_assessment_flow,
        "module_path": "toolbox.workflows.security_assessment.workflow",
        "function_name": "main_flow",
        "description": "Comprehensive security assessment workflow that scans files, analyzes code for vulnerabilities, and generates SARIF reports",
        "version": "1.0.0",
        "author": "FuzzForge Team",
        "tags": ["security", "scanner", "analyzer", "static-analysis", "sarif"]
    }

if secret_detection_flow is not None:
    WORKFLOW_REGISTRY["secret_detection_scan"] = {
        "flow": secret_detection_flow,
        "module_path": "toolbox.workflows.comprehensive.secret_detection_scan.workflow",
        "function_name": "main_flow",
        "description": "Comprehensive secret detection using TruffleHog and Gitleaks for thorough credential scanning",
        "version": "1.0.0",
        "author": "FuzzForge Team",
        "tags": ["secrets", "credentials", "detection", "trufflehog", "gitleaks", "comprehensive"]
    }

# To add a new workflow, follow this pattern:
#
# "my_new_workflow": {
#     "flow": my_new_flow_function,  # Import the flow function above
#     "module_path": "toolbox.workflows.my_new_workflow.workflow",
#     "function_name": "my_new_flow_function",
#     "description": "Description of what this workflow does",
#     "version": "1.0.0",
#     "author": "Developer Name",
#     "tags": ["tag1", "tag2"]
# }


def get_workflow_flow(workflow_name: str) -> Callable:
    """
    Get the flow function for a workflow.

    Args:
        workflow_name: Name of the workflow

    Returns:
        Flow function

    Raises:
        KeyError: If workflow not found in registry
    """
    if workflow_name not in WORKFLOW_REGISTRY:
        available = list(WORKFLOW_REGISTRY.keys())
        raise KeyError(
            f"Workflow '{workflow_name}' not found in registry. "
            f"Available workflows: {available}. "
            f"Please add the workflow to toolbox/workflows/registry.py"
        )
    return WORKFLOW_REGISTRY[workflow_name]["flow"]


def get_workflow_info(workflow_name: str) -> Dict[str, Any]:
    """
    Get registry information for a workflow.

    Args:
        workflow_name: Name of the workflow

    Returns:
        Registry information dictionary

    Raises:
        KeyError: If workflow not found in registry
    """
    if workflow_name not in WORKFLOW_REGISTRY:
        available = list(WORKFLOW_REGISTRY.keys())
        raise KeyError(
            f"Workflow '{workflow_name}' not found in registry. "
            f"Available workflows: {available}"
        )
    return WORKFLOW_REGISTRY[workflow_name]


def list_registered_workflows() -> Dict[str, Dict[str, Any]]:
    """
    Get all registered workflows.

    Returns:
        Dictionary of all workflow registry entries
    """
    return WORKFLOW_REGISTRY.copy()


def validate_registry() -> bool:
    """
    Validate the workflow registry for consistency.

    Returns:
        True if valid, raises exceptions if not

    Raises:
        ValueError: If registry is invalid
    """
    if not WORKFLOW_REGISTRY:
        raise ValueError("Workflow registry is empty")

    required_fields = ["flow", "module_path", "function_name", "description"]
    for name, entry in WORKFLOW_REGISTRY.items():
        # Check required fields
        missing_fields = [field for field in required_fields if field not in entry]
        if missing_fields:
            raise ValueError(
                f"Workflow '{name}' missing required fields: {missing_fields}"
            )
        # Check if flow is callable
        if not callable(entry["flow"]):
            raise ValueError(f"Workflow '{name}' flow is not callable")
        # Check if flow has the required Prefect attributes
        if not hasattr(entry["flow"], "deploy"):
            raise ValueError(
                f"Workflow '{name}' flow is not a Prefect flow (missing deploy method)"
            )

    logger.info(f"Registry validation passed. {len(WORKFLOW_REGISTRY)} workflows registered.")
    return True


# Validate registry on import
try:
    validate_registry()
    logger.info(f"Workflow registry loaded successfully with {len(WORKFLOW_REGISTRY)} workflows")
except Exception as e:
    logger.error(f"Workflow registry validation failed: {e}")
    raise

@@ -1,30 +0,0 @@
FROM prefecthq/prefect:3-python3.11
WORKDIR /app
# Create toolbox directory structure to match expected import paths
RUN mkdir -p /app/toolbox/workflows /app/toolbox/modules
# Copy base module infrastructure
COPY modules/__init__.py /app/toolbox/modules/
COPY modules/base.py /app/toolbox/modules/
# Copy only required modules (manual selection)
COPY modules/scanner /app/toolbox/modules/scanner
COPY modules/analyzer /app/toolbox/modules/analyzer
COPY modules/reporter /app/toolbox/modules/reporter
# Copy this workflow
COPY workflows/security_assessment /app/toolbox/workflows/security_assessment
# Install workflow-specific requirements if they exist
RUN if [ -f /app/toolbox/workflows/security_assessment/requirements.txt ]; then pip install --no-cache-dir -r /app/toolbox/workflows/security_assessment/requirements.txt; fi
# Install common requirements
RUN pip install --no-cache-dir pyyaml
# Set Python path
ENV PYTHONPATH=/app:$PYTHONPATH
# Create workspace directory
RUN mkdir -p /workspace

@@ -1,11 +0,0 @@
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.

@@ -1,111 +0,0 @@
name: security_assessment
version: "2.0.0"
description: "Comprehensive security assessment workflow that scans files, analyzes code for vulnerabilities, and generates SARIF reports"
author: "FuzzForge Team"
category: "comprehensive"
tags:
  - "security"
  - "scanner"
  - "analyzer"
  - "static-analysis"
  - "sarif"
  - "comprehensive"
supported_volume_modes:
  - "ro"
  - "rw"
default_volume_mode: "ro"
default_target_path: "/workspace"
requirements:
  tools:
    - "file_scanner"
    - "security_analyzer"
    - "sarif_reporter"
  resources:
    memory: "512Mi"
    cpu: "500m"
    timeout: 1800
has_docker: true
default_parameters:
  target_path: "/workspace"
  volume_mode: "ro"
  scanner_config: {}
  analyzer_config: {}
  reporter_config: {}
parameters:
  type: object
  properties:
    target_path:
      type: string
      default: "/workspace"
      description: "Path to analyze"
    volume_mode:
      type: string
      enum: ["ro", "rw"]
      default: "ro"
      description: "Volume mount mode"
    scanner_config:
      type: object
      description: "File scanner configuration"
      properties:
        patterns:
          type: array
          items:
            type: string
          description: "File patterns to scan"
        check_sensitive:
          type: boolean
          description: "Check for sensitive files"
        calculate_hashes:
          type: boolean
          description: "Calculate file hashes"
        max_file_size:
          type: integer
          description: "Maximum file size to scan (bytes)"
    analyzer_config:
      type: object
      description: "Security analyzer configuration"
      properties:
        file_extensions:
          type: array
          items:
            type: string
          description: "File extensions to analyze"
        check_secrets:
          type: boolean
          description: "Check for hardcoded secrets"
        check_sql:
          type: boolean
          description: "Check for SQL injection risks"
        check_dangerous_functions:
          type: boolean
          description: "Check for dangerous function calls"
    reporter_config:
      type: object
      description: "SARIF reporter configuration"
      properties:
        include_code_flows:
          type: boolean
          description: "Include code flow information"
output_schema:
  type: object
  properties:
    sarif:
      type: object
      description: "SARIF-formatted security findings"
    summary:
      type: object
      description: "Scan execution summary"
      properties:
        total_findings:
          type: integer
        severity_counts:
          type: object
        tool_counts:
          type: object

@@ -1,4 +0,0 @@
# Requirements for security assessment workflow
pydantic>=2.0.0
pyyaml>=6.0
aiofiles>=23.0.0

@@ -1,252 +0,0 @@
"""
Security Assessment Workflow - Comprehensive security analysis using multiple modules
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import sys
import logging
from pathlib import Path
from typing import Dict, Any, Optional

from prefect import flow, task
import json

# Add modules to path
sys.path.insert(0, '/app')

# Import modules
from toolbox.modules.scanner import FileScanner
from toolbox.modules.analyzer import SecurityAnalyzer
from toolbox.modules.reporter import SARIFReporter

# Configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


@task(name="file_scanning")
async def scan_files_task(workspace: Path, config: Dict[str, Any]) -> Dict[str, Any]:
    """
    Task to scan files in the workspace.

    Args:
        workspace: Path to the workspace
        config: Scanner configuration

    Returns:
        Scanner results
    """
    logger.info(f"Starting file scanning in {workspace}")
    scanner = FileScanner()
    result = await scanner.execute(config, workspace)
    logger.info(f"File scanning completed: {result.summary.get('total_files', 0)} files found")
    return result.dict()


@task(name="security_analysis")
async def analyze_security_task(workspace: Path, config: Dict[str, Any]) -> Dict[str, Any]:
    """
    Task to analyze security vulnerabilities.

    Args:
        workspace: Path to the workspace
        config: Analyzer configuration

    Returns:
        Analysis results
    """
    logger.info("Starting security analysis")
    analyzer = SecurityAnalyzer()
    result = await analyzer.execute(config, workspace)
    logger.info(
        f"Security analysis completed: {result.summary.get('total_findings', 0)} findings"
    )
    return result.dict()


@task(name="report_generation")
async def generate_report_task(
    scan_results: Dict[str, Any],
    analysis_results: Dict[str, Any],
    config: Dict[str, Any],
    workspace: Path
) -> Dict[str, Any]:
    """
    Task to generate SARIF report from all findings.

    Args:
        scan_results: Results from scanner
        analysis_results: Results from analyzer
        config: Reporter configuration
        workspace: Path to the workspace

    Returns:
        SARIF report
    """
    logger.info("Generating SARIF report")
    reporter = SARIFReporter()

    # Combine findings from all modules
    all_findings = []

    # Add scanner findings (only sensitive files, not all files)
    scanner_findings = scan_results.get("findings", [])
    sensitive_findings = [f for f in scanner_findings if f.get("severity") != "info"]
    all_findings.extend(sensitive_findings)

    # Add analyzer findings
    analyzer_findings = analysis_results.get("findings", [])
    all_findings.extend(analyzer_findings)

    # Prepare reporter config
    reporter_config = {
        **config,
        "findings": all_findings,
        "tool_name": "FuzzForge Security Assessment",
        "tool_version": "1.0.0"
    }
    result = await reporter.execute(reporter_config, workspace)

    # Extract SARIF from result
    sarif = result.dict().get("sarif", {})
    logger.info(f"Report generated with {len(all_findings)} total findings")
    return sarif


@flow(name="security_assessment", log_prints=True)
async def main_flow(
    target_path: str = "/workspace",
    volume_mode: str = "ro",
    scanner_config: Optional[Dict[str, Any]] = None,
    analyzer_config: Optional[Dict[str, Any]] = None,
    reporter_config: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """
    Main security assessment workflow.

    This workflow:
    1. Scans files in the workspace
    2. Analyzes code for security vulnerabilities
    3. Generates a SARIF report with all findings

    Args:
        target_path: Path to the mounted workspace (default: /workspace)
        volume_mode: Volume mount mode (ro/rw)
        scanner_config: Configuration for file scanner
        analyzer_config: Configuration for security analyzer
        reporter_config: Configuration for SARIF reporter

    Returns:
        SARIF-formatted findings report
    """
    logger.info("Starting security assessment workflow")
    logger.info(f"Workspace: {target_path}, Mode: {volume_mode}")

    # Set workspace path
    workspace = Path(target_path)
    if not workspace.exists():
        logger.error(f"Workspace does not exist: {workspace}")
        return {
            "error": f"Workspace not found: {workspace}",
            "sarif": None
        }

    # Default configurations
    if not scanner_config:
        scanner_config = {
            "patterns": ["*"],
            "check_sensitive": True,
            "calculate_hashes": False,
            "max_file_size": 10485760  # 10MB
        }
    if not analyzer_config:
        analyzer_config = {
            "file_extensions": [".py", ".js", ".java", ".php", ".rb", ".go"],
            "check_secrets": True,
            "check_sql": True,
            "check_dangerous_functions": True
        }
    if not reporter_config:
        reporter_config = {
            "include_code_flows": False
        }

    try:
        # Execute workflow tasks
        logger.info("Phase 1: File scanning")
        scan_results = await scan_files_task(workspace, scanner_config)

        logger.info("Phase 2: Security analysis")
        analysis_results = await analyze_security_task(workspace, analyzer_config)

        logger.info("Phase 3: Report generation")
        sarif_report = await generate_report_task(
            scan_results,
            analysis_results,
            reporter_config,
            workspace
        )

        # Log summary
        if sarif_report and "runs" in sarif_report:
            results_count = len(sarif_report["runs"][0].get("results", []))
            logger.info(f"Workflow completed successfully with {results_count} findings")
        else:
            logger.info("Workflow completed successfully")

        return sarif_report

    except Exception as e:
        logger.error(f"Workflow failed: {e}")
        # Return error in SARIF format
        return {
            "$schema": "https://raw.githubusercontent.com/oasis-tcs/sarif-spec/master/Schemata/sarif-schema-2.1.0.json",
            "version": "2.1.0",
            "runs": [
                {
                    "tool": {
                        "driver": {
                            "name": "FuzzForge Security Assessment",
                            "version": "1.0.0"
                        }
                    },
                    "results": [],
                    "invocations": [
                        {
                            "executionSuccessful": False,
                            "exitCode": 1,
                            "exitCodeDescription": str(e)
                        }
                    ]
                }
            ]
        }


if __name__ == "__main__":
    # For local testing
    import asyncio
    asyncio.run(main_flow(
        target_path="/tmp/test",
        scanner_config={"patterns": ["*.py"]},
        analyzer_config={"check_secrets": True}
    ))

backend/uv.lock (generated, 2635 lines)

File diff suppressed because it is too large

cli/.gitignore (vendored, 64 lines)

@@ -1,64 +0,0 @@
# FuzzForge CLI specific .gitignore
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
# Virtual environments
.venv/
venv/
ENV/
env/
# UV package manager - keep uv.lock for CLI
# uv.lock # Commented out - we want to keep this for reproducible CLI builds
# IDE
.vscode/
.idea/
*.swp
*.swo
# OS
.DS_Store
Thumbs.db
# Testing
.coverage
.pytest_cache/
.tox/
htmlcov/
# MyPy
.mypy_cache/
# Local development
local_config.yaml
.env.local
# Generated files
*.log
*.tmp
# CLI specific
# Don't ignore uv.lock in CLI as it's needed for reproducible builds
!uv.lock

@@ -1,583 +0,0 @@
# FuzzForge CLI
🛡️ **FuzzForge CLI** - Command-line interface for the FuzzForge security testing platform
A comprehensive CLI for managing security testing workflows, monitoring runs in real-time, and analyzing findings with beautiful terminal interfaces and persistent project management.
## ✨ Features
- 📁 **Project Management** - Initialize and manage FuzzForge projects with local databases
- 🔧 **Workflow Management** - Browse, configure, and run security testing workflows
- 🚀 **Workflow Execution** - Execute and manage security testing workflows
- 🔍 **Findings Analysis** - View, export, and analyze security findings in multiple formats
- 📊 **Real-time Monitoring** - Live dashboards for fuzzing statistics and crash reports
- ⚙️ **Configuration** - Flexible project and global configuration management
- 🎨 **Rich UI** - Beautiful tables, progress bars, and interactive prompts
- 💾 **Persistent Storage** - SQLite database for runs, findings, and crash data
- 🛡️ **Error Handling** - Comprehensive error handling with user-friendly messages
- 🔄 **Network Resilience** - Automatic retries and graceful degradation
## 🚀 Quick Start
### Installation
#### Prerequisites
- Python 3.11 or higher
- [uv](https://docs.astral.sh/uv/) package manager
#### Install FuzzForge CLI
```bash
# Clone the repository
git clone https://github.com/FuzzingLabs/fuzzforge_alpha.git
cd fuzzforge_alpha/cli
# Install globally with uv (recommended)
uv tool install .
# Alternative: Install in development mode
uv sync
uv add --editable ../sdk
uv tool install --editable .
# Verify installation
fuzzforge --help
```
#### Shell Completion (Optional)
```bash
# Install completion for your shell
fuzzforge --install-completion
```
### Initialize Your First Project
```bash
# Create a new project directory
mkdir my-security-project
cd my-security-project
# Initialize FuzzForge project
ff init
# Check status
fuzzforge status
```
This creates a `.fuzzforge/` directory with:
- SQLite database for persistent storage
- Configuration file (`config.yaml`)
- Project metadata
### Run Your First Analysis
```bash
# List available workflows
fuzzforge workflows list
# Get workflow details
fuzzforge workflows info security_assessment
# Submit a workflow for analysis
fuzzforge workflow security_assessment /path/to/your/code
# View findings when complete
fuzzforge finding <execution-id>
```
## 📚 Command Reference
### Project Management
#### `ff init`
Initialize a new FuzzForge project in the current directory.
```bash
ff init --name "My Security Project" --api-url "http://localhost:8000"
```
**Options:**
- `--name, -n` - Project name (defaults to directory name)
- `--api-url, -u` - FuzzForge API URL (defaults to http://localhost:8000)
- `--force, -f` - Force initialization even if project exists
#### `fuzzforge status`
Show comprehensive project and API status information.
```bash
fuzzforge status
```
Displays:
- Project information and configuration
- Database statistics (runs, findings, crashes)
- API connectivity and available workflows
### Workflow Management
#### `fuzzforge workflows list`
List all available security testing workflows.
```bash
fuzzforge workflows list
```
#### `fuzzforge workflows info <workflow-name>`
Show detailed information about a specific workflow.
```bash
fuzzforge workflows info security_assessment
```
Displays:
- Workflow metadata (version, author, description)
- Parameter schema and requirements
- Supported volume modes and features
#### `fuzzforge workflows parameters <workflow-name>`
Interactive parameter builder for workflows.
```bash
# Interactive mode
fuzzforge workflows parameters security_assessment
# Save parameters to file
fuzzforge workflows parameters security_assessment --output params.json
# Non-interactive mode (show schema only)
fuzzforge workflows parameters security_assessment --no-interactive
```
### Workflow Execution
#### `fuzzforge workflow <workflow> <target-path>`
Execute a security testing workflow.
```bash
# Basic execution
fuzzforge workflow security_assessment /path/to/code
# With parameters
fuzzforge workflow security_assessment /path/to/binary \
--param timeout=3600 \
--param iterations=10000
# With parameter file
fuzzforge workflow security_assessment /path/to/code \
--param-file my-params.json
# Wait for completion
fuzzforge workflow security_assessment /path/to/code --wait
```
**Options:**
- `--param, -p` - Parameter in key=value format (can be used multiple times)
- `--param-file, -f` - JSON file containing parameters
- `--volume-mode, -v` - Volume mount mode: `ro` (read-only) or `rw` (read-write)
- `--timeout, -t` - Execution timeout in seconds
- `--interactive/--no-interactive, -i/-n` - Interactive parameter input
- `--wait, -w` - Wait for execution to complete
#### `fuzzforge workflow status [execution-id]`
Check the status of a workflow execution.
```bash
# Check specific execution
fuzzforge workflow status abc123def456
# Check most recent execution
fuzzforge workflow status
```
#### `fuzzforge workflow history`
Show workflow execution history from local database.
```bash
# List all executions
fuzzforge workflow history
# Filter by workflow
fuzzforge workflow history --workflow security_assessment
# Filter by status
fuzzforge workflow history --status completed
# Limit results
fuzzforge workflow history --limit 10
```
#### `fuzzforge workflow retry <execution-id>`
Retry a workflow with the same or modified parameters.
```bash
# Retry with same parameters
fuzzforge workflow retry abc123def456
# Modify parameters interactively
fuzzforge workflow retry abc123def456 --modify-params
```
### Findings Management
#### `fuzzforge finding [execution-id]`
View security findings for a specific execution.
```bash
# Display latest findings
fuzzforge finding
# Display specific execution findings
fuzzforge finding abc123def456
```
#### `fuzzforge findings`
Browse all security findings from local database.
```bash
# List all findings
fuzzforge findings
# Show findings history
fuzzforge findings history --limit 20
```
#### `fuzzforge finding export [execution-id]`
Export security findings in various formats.
```bash
# Export latest findings
fuzzforge finding export --format json
# Export specific execution findings
fuzzforge finding export abc123def456 --format sarif
# Export as CSV with output file
fuzzforge finding export abc123def456 --format csv --output report.csv
# Export as HTML report
fuzzforge finding export --format html --output report.html
```
### Configuration Management
#### `fuzzforge config show`
Display current configuration settings.
```bash
# Show project configuration
fuzzforge config show
# Show global configuration
fuzzforge config show --global
```
#### `fuzzforge config set <key> <value>`
Set a configuration value.
```bash
# Project settings
fuzzforge config set project.api_url "http://api.fuzzforge.com"
fuzzforge config set project.default_timeout 7200
fuzzforge config set project.default_workflow "security_assessment"
# Retention settings
fuzzforge config set retention.max_runs 200
fuzzforge config set retention.keep_findings_days 120
# Preferences
fuzzforge config set preferences.auto_save_findings true
fuzzforge config set preferences.show_progress_bars false
# Global configuration
fuzzforge config set project.api_url "http://global.api.com" --global
```
#### `fuzzforge config get <key>`
Get a specific configuration value.
```bash
fuzzforge config get project.api_url
fuzzforge config get retention.max_runs --global
```
#### `fuzzforge config reset`
Reset configuration to defaults.
```bash
# Reset project configuration
fuzzforge config reset
# Reset global configuration
fuzzforge config reset --global
# Skip confirmation
fuzzforge config reset --force
```
#### `fuzzforge config edit`
Open configuration file in default editor.
```bash
# Edit project configuration
fuzzforge config edit
# Edit global configuration
fuzzforge config edit --global
```
## 🏗️ Project Structure
When you initialize a FuzzForge project, the following structure is created:
```
my-project/
├── .fuzzforge/
│ ├── config.yaml # Project configuration
│ └── findings.db # SQLite database
├── .gitignore # Updated with FuzzForge entries
└── README.md # Project README (if created)
```
### Database Schema
The SQLite database stores:
- **runs** - Workflow run history and metadata
- **findings** - Security findings and SARIF data
- **crashes** - Crash reports and fuzzing data
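As a rough illustration of that layout, here is a minimal schema sketch; the table and column names below are illustrative assumptions, not the CLI's actual DDL:

```python
import sqlite3

# Illustrative sketch of the findings.db layout - actual columns may differ
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE runs (
    run_id     TEXT PRIMARY KEY,
    workflow   TEXT NOT NULL,
    status     TEXT,
    started_at TEXT
);
CREATE TABLE findings (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    run_id     TEXT REFERENCES runs(run_id),
    severity   TEXT,
    sarif_json TEXT
);
CREATE TABLE crashes (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    run_id     TEXT REFERENCES runs(run_id),
    signal     TEXT,
    input_path TEXT
);
""")

# Record a run the way the CLI might after `fuzzforge workflow ...`
conn.execute(
    "INSERT INTO runs VALUES (?, ?, ?, ?)",
    ("abc123def456", "security_assessment", "completed", "2025-01-01T00:00:00"),
)
count = conn.execute("SELECT COUNT(*) FROM runs").fetchone()[0]
print(count)  # 1
```

Because everything lives in one SQLite file, commands like `fuzzforge workflow history` can filter and join locally without hitting the API.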
### Configuration Format
Project configuration (`.fuzzforge/config.yaml`):
```yaml
project:
name: "My Security Project"
api_url: "http://localhost:8000"
default_timeout: 3600
default_workflow: null
retention:
max_runs: 100
keep_findings_days: 90
preferences:
auto_save_findings: true
show_progress_bars: true
table_style: "rich"
color_output: true
```
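The dotted keys used by `fuzzforge config get` (for example `project.api_url`) map directly onto this nested structure. A minimal sketch of such a lookup, assuming the YAML has already been parsed into a dict:

```python
# Sketch of resolving a dotted key like "project.api_url" against the
# parsed config; the helper name is illustrative, not the CLI's code.
config = {
    "project": {"name": "My Security Project", "api_url": "http://localhost:8000"},
    "retention": {"max_runs": 100, "keep_findings_days": 90},
}

def config_get(cfg: dict, dotted_key: str):
    node = cfg
    for part in dotted_key.split("."):
        node = node[part]  # raises KeyError for unknown keys
    return node

print(config_get(config, "project.api_url"))   # http://localhost:8000
print(config_get(config, "retention.max_runs"))  # 100
```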
## 🔧 Advanced Usage
### Parameter Handling
FuzzForge CLI supports flexible parameter input:
1. **Command line parameters**:
```bash
ff workflow workflow-name /path --param key1=value1 --param key2=value2
```
2. **Parameter files**:
```bash
echo '{"timeout": 3600, "threads": 4}' > params.json
ff workflow workflow-name /path --param-file params.json
```
3. **Interactive prompts**:
```bash
ff workflow workflow-name /path --interactive
```
4. **Parameter builder**:
```bash
ff workflows parameters workflow-name --output my-params.json
ff workflow workflow-name /path --param-file my-params.json
```
### Environment Variables
Override configuration with environment variables:
```bash
export FUZZFORGE_API_URL="http://production.api.com"
export FUZZFORGE_TIMEOUT="7200"
```
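Conceptually, an environment variable takes precedence over the project config, which in turn takes precedence over the built-in default. A minimal sketch of that resolution order (the helper name is an assumption, not the CLI's actual code):

```python
import os

def resolve_api_url(config: dict) -> str:
    # Environment variable wins over project config, which wins over the default
    return (
        os.environ.get("FUZZFORGE_API_URL")
        or config.get("project", {}).get("api_url")
        or "http://localhost:8000"
    )

os.environ["FUZZFORGE_API_URL"] = "http://production.api.com"
url = resolve_api_url({"project": {"api_url": "http://localhost:8000"}})
print(url)  # http://production.api.com
```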
### Data Retention
Configure automatic cleanup of old data:
```bash
# Keep only 50 runs
fuzzforge config set retention.max_runs 50
# Keep findings for 30 days
fuzzforge config set retention.keep_findings_days 30
```
### Export Formats
Support for multiple export formats:
- **JSON** - Simplified findings structure
- **CSV** - Tabular data for spreadsheets
- **HTML** - Interactive web report
- **SARIF** - Standard security analysis format
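To give a sense of how SARIF results flatten into the tabular formats, here is a hypothetical sketch of a SARIF-to-CSV conversion; it is not the CLI's actual exporter, and the sample finding is invented:

```python
import csv
import io

# A tiny hand-built SARIF 2.1.0 document standing in for real scan output
sarif = {
    "version": "2.1.0",
    "runs": [{
        "tool": {"driver": {"name": "FuzzForge Secret Detection"}},
        "results": [{
            "ruleId": "hardcoded-secret",
            "level": "error",
            "message": {"text": "AWS key found"},
            "locations": [{"physicalLocation": {
                "artifactLocation": {"uri": "config.py"},
                "region": {"startLine": 42},
            }}],
        }],
    }],
}

# Flatten each SARIF result into one CSV row
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["rule", "severity", "file", "line", "message"])
for run in sarif["runs"]:
    for result in run.get("results", []):
        loc = result["locations"][0]["physicalLocation"]
        writer.writerow([
            result.get("ruleId"),
            result.get("level"),
            loc["artifactLocation"]["uri"],
            loc["region"]["startLine"],
            result["message"]["text"],
        ])
print(buf.getvalue().strip())
```

SARIF remains the lossless format; CSV and HTML are projections of it, so round-tripping back to SARIF from them is not possible.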
## 🛠️ Development
### Setup Development Environment
```bash
# Clone repository
git clone https://github.com/FuzzingLabs/fuzzforge_alpha.git
cd fuzzforge_alpha/cli
# Install in development mode
uv sync
uv add --editable ../sdk
# Install CLI in editable mode
uv tool install --editable .
```
### Project Structure
```
cli/
├── src/fuzzforge_cli/
│ ├── __init__.py
│ ├── main.py # Main CLI app
│ ├── config.py # Configuration management
│ ├── database.py # Database operations
│ ├── exceptions.py # Error handling
│ ├── api_validation.py # API response validation
│ └── commands/ # Command implementations
│ ├── init.py # Project initialization
│ ├── workflows.py # Workflow management
│ ├── runs.py # Run management
│ ├── findings.py # Findings management
│ ├── config.py # Configuration commands
│ └── status.py # Status information
├── pyproject.toml # Project configuration
└── README.md # This file
```
### Running Tests
```bash
# Run tests (when available)
uv run pytest
# Code formatting
uv run black src/
uv run isort src/
# Type checking
uv run mypy src/
```
## ⚠️ Troubleshooting
### Common Issues
#### "No FuzzForge project found"
```bash
# Initialize a project first
ff init
```
#### API Connection Failed
```bash
# Check API URL configuration
fuzzforge config get project.api_url
# Test API connectivity
fuzzforge status
# Update API URL if needed
fuzzforge config set project.api_url "http://correct-url:8000"
```
#### Permission Errors
```bash
# Ensure proper permissions for project directory
chmod -R 755 .fuzzforge/
# Check file ownership
ls -la .fuzzforge/
```
#### Database Issues
```bash
# Check database file exists
ls -la .fuzzforge/findings.db
# Reinitialize if corrupted (will lose data)
rm .fuzzforge/findings.db
ff init --force
```
### Environment Variables
Set these environment variables for debugging:
```bash
export FUZZFORGE_DEBUG=1 # Enable debug logging
export FUZZFORGE_API_URL="..." # Override API URL
export FUZZFORGE_TIMEOUT="30" # Override timeout
```
### Getting Help
```bash
# General help
fuzzforge --help
# Command-specific help
ff workflows --help
ff workflow run --help
# Show version
fuzzforge --version
```
## 🏆 Example Workflow
Here's a complete example of analyzing a project:
```bash
# 1. Initialize project
mkdir my-security-audit
cd my-security-audit
ff init --name "Security Audit 2024"
# 2. Check available workflows
fuzzforge workflows list
# 3. Submit comprehensive security assessment
ff workflow security_assessment /path/to/source/code --wait
# 4. View findings in table format
fuzzforge findings get <run-id>
# 5. Export detailed report
fuzzforge findings export <run-id> --format html --output security_report.html
# 6. Check project statistics
fuzzforge status
```
## 📜 License
This project is licensed under the Business Source License 1.1 (BSL 1.1); see the LICENSE file in the main FuzzForge repository for details.
## 🤝 Contributing
Contributions are welcome! Please see the main FuzzForge repository for contribution guidelines.
---
**FuzzForge CLI** - Making security testing workflows accessible and efficient from the command line.


@@ -1,323 +0,0 @@
#!/usr/bin/env python3
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
"""
Install shell completion for FuzzForge CLI.
This script installs completion using Typer's built-in --install-completion command.
"""
import os
import sys
import subprocess
from pathlib import Path
import typer
def run_fuzzforge_completion_install(shell: str) -> bool:
"""Install completion using the fuzzforge CLI itself."""
try:
# Use the CLI's built-in completion installation
result = subprocess.run([
sys.executable, "-m", "fuzzforge_cli.main",
"--install-completion", shell
], capture_output=True, text=True, cwd=Path(__file__).parent.parent)
if result.returncode == 0:
print(f"✅ {shell.capitalize()} completion installed successfully")
return True
else:
print(f"❌ Failed to install {shell} completion: {result.stderr}")
return False
except Exception as e:
print(f"❌ Error installing {shell} completion: {e}")
return False
def create_manual_completion_scripts():
"""Create manual completion scripts as fallback."""
scripts = {
"bash": '''
# FuzzForge CLI completion for bash
_fuzzforge_completion() {
local IFS=$'\\t'
local response
response=$(env COMP_WORDS="${COMP_WORDS[*]}" COMP_CWORD=$COMP_CWORD _FUZZFORGE_COMPLETE=bash_complete $1)
for completion in $response; do
IFS=',' read type value <<< "$completion"
if [[ $type == 'dir' ]]; then
COMPREPLY=()
compopt -o dirnames
elif [[ $type == 'file' ]]; then
COMPREPLY=()
compopt -o default
elif [[ $type == 'plain' ]]; then
COMPREPLY+=($value)
fi
done
return 0
}
complete -o nosort -F _fuzzforge_completion fuzzforge
''',
"zsh": '''
#compdef fuzzforge
_fuzzforge_completion() {
local -a completions
local -a completions_with_descriptions
local -a response
response=(${(f)"$(env COMP_WORDS="${words[*]}" COMP_CWORD=$((CURRENT-1)) _FUZZFORGE_COMPLETE=zsh_complete fuzzforge)"})
for type_and_line in $response; do
if [[ "$type_and_line" =~ ^([^,]*),(.*)$ ]]; then
local type="$match[1]"
local line="$match[2]"
if [[ "$type" == "dir" ]]; then
_path_files -/
elif [[ "$type" == "file" ]]; then
_path_files -f
elif [[ "$type" == "plain" ]]; then
if [[ "$line" =~ ^([^:]*):(.*)$ ]]; then
completions_with_descriptions+=("$match[1]":"$match[2]")
else
completions+=("$line")
fi
fi
fi
done
if [ -n "$completions_with_descriptions" ]; then
_describe "" completions_with_descriptions -V unsorted
fi
if [ -n "$completions" ]; then
compadd -U -V unsorted -a completions
fi
}
compdef _fuzzforge_completion fuzzforge;
''',
"fish": '''
# FuzzForge CLI completion for fish
function __fuzzforge_completion
set -l response
for value in (env _FUZZFORGE_COMPLETE=fish_complete COMP_WORDS=(commandline -cp) COMP_CWORD=(commandline -t) fuzzforge)
set response $response $value
end
for completion in $response
set -l metadata (string split "," $completion)
if test $metadata[1] = "dir"
__fish_complete_directories $metadata[2]
else if test $metadata[1] = "file"
__fish_complete_path $metadata[2]
else if test $metadata[1] = "plain"
echo $metadata[2]
end
end
end
complete --no-files --command fuzzforge --arguments "(__fuzzforge_completion)"
'''
}
return scripts
def install_bash_completion():
"""Install bash completion."""
print("📝 Installing bash completion...")
# Get the manual completion script
scripts = create_manual_completion_scripts()
completion_script = scripts["bash"]
# Try different locations for bash completion
completion_dirs = [
Path.home() / ".bash_completion.d",
Path("/usr/local/etc/bash_completion.d"),
Path("/etc/bash_completion.d")
]
for completion_dir in completion_dirs:
try:
completion_dir.mkdir(exist_ok=True)
completion_file = completion_dir / "fuzzforge"
completion_file.write_text(completion_script)
print(f"✅ Bash completion installed to: {completion_file}")
# Add source line to .bashrc if not present
bashrc = Path.home() / ".bashrc"
source_line = f"source {completion_file}"
if bashrc.exists():
bashrc_content = bashrc.read_text()
if source_line not in bashrc_content:
with bashrc.open("a") as f:
f.write(f"\n# FuzzForge CLI completion\n{source_line}\n")
print("✅ Added completion source to ~/.bashrc")
return True
except PermissionError:
continue
except Exception as e:
print(f"❌ Failed to install bash completion: {e}")
continue
print("❌ Could not install bash completion (permission denied)")
return False
def install_zsh_completion():
"""Install zsh completion."""
print("📝 Installing zsh completion...")
# Get the manual completion script
scripts = create_manual_completion_scripts()
completion_script = scripts["zsh"]
# Create completion directory
comp_dir = Path.home() / ".zsh" / "completions"
comp_dir.mkdir(parents=True, exist_ok=True)
try:
completion_file = comp_dir / "_fuzzforge"
completion_file.write_text(completion_script)
print(f"✅ Zsh completion installed to: {completion_file}")
# Add fpath to .zshrc if not present
zshrc = Path.home() / ".zshrc"
fpath_line = f'fpath=(~/.zsh/completions $fpath)'
autoload_line = 'autoload -U compinit && compinit'
if zshrc.exists():
zshrc_content = zshrc.read_text()
lines_to_add = []
if fpath_line not in zshrc_content:
lines_to_add.append(fpath_line)
if autoload_line not in zshrc_content:
lines_to_add.append(autoload_line)
if lines_to_add:
with zshrc.open("a") as f:
f.write(f"\n# FuzzForge CLI completion\n")
for line in lines_to_add:
f.write(f"{line}\n")
print("✅ Added completion setup to ~/.zshrc")
return True
except Exception as e:
print(f"❌ Failed to install zsh completion: {e}")
return False
def install_fish_completion():
"""Install fish completion."""
print("📝 Installing fish completion...")
# Get the manual completion script
scripts = create_manual_completion_scripts()
completion_script = scripts["fish"]
# Fish completion directory
comp_dir = Path.home() / ".config" / "fish" / "completions"
comp_dir.mkdir(parents=True, exist_ok=True)
try:
completion_file = comp_dir / "fuzzforge.fish"
completion_file.write_text(completion_script)
print(f"✅ Fish completion installed to: {completion_file}")
return True
except Exception as e:
print(f"❌ Failed to install fish completion: {e}")
return False
def detect_shell():
"""Detect the current shell."""
shell_path = os.environ.get('SHELL', '')
if 'bash' in shell_path:
return 'bash'
elif 'zsh' in shell_path:
return 'zsh'
elif 'fish' in shell_path:
return 'fish'
else:
return None
def main():
"""Install completion for the current shell or all shells."""
print("🚀 FuzzForge CLI Completion Installer")
print("=" * 50)
current_shell = detect_shell()
if current_shell:
print(f"🐚 Detected shell: {current_shell}")
# Check for command line arguments
if len(sys.argv) > 1 and sys.argv[1] == "--all":
install_all = True
print("Installing completion for all shells...")
else:
# Ask user which shells to install (with default to current shell only)
if current_shell:
install_all = typer.confirm("Install completion for all supported shells (bash, zsh, fish)?", default=False)
if not install_all:
print(f"Installing completion for {current_shell} only...")
else:
install_all = typer.confirm("Install completion for all supported shells (bash, zsh, fish)?", default=True)
success_count = 0
if install_all or current_shell == 'bash':
if install_bash_completion():
success_count += 1
if install_all or current_shell == 'zsh':
if install_zsh_completion():
success_count += 1
if install_all or current_shell == 'fish':
if install_fish_completion():
success_count += 1
print("\n" + "=" * 50)
if success_count > 0:
print(f"✅ Successfully installed completion for {success_count} shell(s)!")
print("\n📋 To activate completion:")
print(" • Bash: Restart your terminal or run 'source ~/.bashrc'")
print(" • Zsh: Restart your terminal or run 'source ~/.zshrc'")
print(" • Fish: Completion is active immediately")
print("\n💡 Try typing 'fuzzforge <TAB>' to test completion!")
else:
print("❌ No completions were installed successfully.")
return 1
return 0
if __name__ == "__main__":
sys.exit(main())


@@ -1,22 +0,0 @@
"""
FuzzForge CLI - Command-line interface for FuzzForge security testing platform.
This module provides the main entry point for the FuzzForge CLI application.
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import typer
from src.fuzzforge_cli.main import app
if __name__ == "__main__":
app()


@@ -1,41 +0,0 @@
[project]
name = "fuzzforge-cli"
version = "0.6.0"
description = "FuzzForge CLI - Command-line interface for FuzzForge security testing platform"
readme = "README.md"
authors = [
{ name = "Tanguy Duhamel", email = "tduhamel@fuzzinglabs.com" }
]
requires-python = ">=3.11"
dependencies = [
"typer>=0.12.0",
"rich>=13.0.0",
"pyyaml>=6.0.0",
"pydantic>=2.0.0",
"httpx>=0.27.0",
"websockets>=13.0",
"sseclient-py>=1.8.0",
"fuzzforge-sdk",
"fuzzforge-ai",
]
[project.optional-dependencies]
dev = [
"pytest>=8.0.0",
"pytest-asyncio>=0.23.0",
"black>=24.0.0",
"isort>=5.13.0",
"mypy>=1.11.0",
]
[project.scripts]
fuzzforge = "fuzzforge_cli.main:main"
ff = "fuzzforge_cli.main:main"
[build-system]
requires = ["uv_build>=0.8.17,<0.9.0"]
build-backend = "uv_build"
[tool.uv.sources]
fuzzforge-sdk = { path = "../sdk", editable = true }
fuzzforge-ai = { path = "../ai", editable = true }


@@ -1,19 +0,0 @@
"""
FuzzForge CLI - Command-line interface for FuzzForge security testing platform.
A comprehensive CLI for managing workflows, runs, findings, and real-time monitoring
with local project management and persistent storage.
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
__version__ = "0.6.0"


@@ -1,311 +0,0 @@
"""
API response validation and graceful degradation utilities.
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import logging
from typing import Any, Dict, List, Optional, Union
from pydantic import BaseModel, ValidationError as PydanticValidationError
from .exceptions import ValidationError, APIConnectionError
logger = logging.getLogger(__name__)
class WorkflowMetadata(BaseModel):
"""Expected workflow metadata structure"""
name: str
version: str
author: Optional[str] = None
description: Optional[str] = None
parameters: Dict[str, Any] = {}
supported_volume_modes: List[str] = ["ro", "rw"]
class RunStatus(BaseModel):
"""Expected run status structure"""
run_id: str
workflow: str
status: str
created_at: str
updated_at: str
@property
def is_completed(self) -> bool:
"""Check if run is in a completed state"""
return self.status.lower() in ["completed", "success", "finished"]
@property
def is_running(self) -> bool:
"""Check if run is currently running"""
return self.status.lower() in ["running", "in_progress", "active"]
@property
def is_failed(self) -> bool:
"""Check if run has failed"""
return self.status.lower() in ["failed", "error", "cancelled"]
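For reference, the three `RunStatus` properties reduce to case-insensitive set membership. A plain-Python sketch of the same classification, without the pydantic model:

```python
# Same status buckets as the RunStatus properties above,
# expressed as a single classification function (sketch).
COMPLETED = {"completed", "success", "finished"}
RUNNING = {"running", "in_progress", "active"}
FAILED = {"failed", "error", "cancelled"}

def classify(status: str) -> str:
    s = status.lower()
    if s in COMPLETED:
        return "completed"
    if s in RUNNING:
        return "running"
    if s in FAILED:
        return "failed"
    return "unknown"

print(classify("SUCCESS"))  # completed
```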
class FindingsResponse(BaseModel):
"""Expected findings response structure"""
run_id: str
sarif: Dict[str, Any]
total_issues: Optional[int] = None
def model_post_init(self, __context: Any) -> None:
"""Validate SARIF structure after initialization"""
if not self.sarif.get("runs"):
logger.warning(f"SARIF data for run {self.run_id} missing 'runs' section")
elif not isinstance(self.sarif["runs"], list):
logger.warning(f"SARIF 'runs' section is not a list for run {self.run_id}")
def validate_api_response(response_data: Any, expected_model: type[BaseModel],
operation: str = "API operation") -> BaseModel:
"""
Validate API response against expected Pydantic model.
Args:
response_data: Raw response data from API
expected_model: Pydantic model class to validate against
operation: Description of the operation for error messages
Returns:
Validated model instance
Raises:
ValidationError: If validation fails
"""
try:
return expected_model.model_validate(response_data)
except PydanticValidationError as e:
logger.error(f"API response validation failed for {operation}: {e}")
raise ValidationError(
f"API response for {operation}",
str(response_data)[:200] + "..." if len(str(response_data)) > 200 else str(response_data),
f"valid {expected_model.__name__} format"
) from e
except Exception as e:
logger.error(f"Unexpected error validating API response for {operation}: {e}")
raise ValidationError(
f"API response for {operation}",
"invalid data",
f"valid {expected_model.__name__} format"
) from e
def validate_sarif_structure(sarif_data: Dict[str, Any]) -> Dict[str, str]:
"""
Validate basic SARIF structure and return validation issues.
Args:
sarif_data: SARIF data dictionary
Returns:
Dictionary of validation issues found
"""
issues = {}
# Check basic SARIF structure
if not isinstance(sarif_data, dict):
issues["structure"] = "SARIF data is not a dictionary"
return issues
if "runs" not in sarif_data:
issues["runs"] = "Missing 'runs' section in SARIF data"
elif not isinstance(sarif_data["runs"], list):
issues["runs_type"] = "'runs' section is not a list"
elif len(sarif_data["runs"]) == 0:
issues["runs_empty"] = "'runs' section is empty"
else:
# Check first run structure
run = sarif_data["runs"][0]
if not isinstance(run, dict):
issues["run_structure"] = "First run is not a dictionary"
else:
if "results" not in run:
issues["results"] = "Missing 'results' section in run"
elif not isinstance(run["results"], list):
issues["results_type"] = "'results' section is not a list"
if "tool" not in run:
issues["tool"] = "Missing 'tool' section in run"
elif not isinstance(run["tool"], dict):
issues["tool_type"] = "'tool' section is not a dictionary"
return issues
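As a concrete reference point, here is a minimal SARIF 2.1.0 document that satisfies every structural check performed above — a non-empty `runs` list whose first run carries a `results` list and a `tool` dict (the tool name is a placeholder):

```python
# Minimal SARIF 2.1.0 skeleton that passes the structural checks
# in validate_sarif_structure(). "example-scanner" is a placeholder.
minimal_sarif = {
    "version": "2.1.0",
    "runs": [{
        "tool": {"driver": {"name": "example-scanner"}},
        "results": [],
    }],
}

# The same checks, re-run inline:
run = minimal_sarif["runs"][0]
ok = (isinstance(minimal_sarif.get("runs"), list)
      and len(minimal_sarif["runs"]) > 0
      and isinstance(run.get("results"), list)
      and isinstance(run.get("tool"), dict))
print(ok)  # True
```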
def safe_extract_sarif_summary(sarif_data: Dict[str, Any]) -> Dict[str, Any]:
"""
Safely extract summary information from SARIF data with fallbacks.
Args:
sarif_data: SARIF data dictionary
Returns:
Summary dictionary with safe defaults
"""
summary = {
"total_issues": 0,
"by_severity": {},
"by_rule": {},
"tools": [],
"validation_issues": []
}
# Validate structure first
validation_issues = validate_sarif_structure(sarif_data)
if validation_issues:
summary["validation_issues"] = list(validation_issues.values())
logger.warning(f"SARIF validation issues: {validation_issues}")
try:
runs = sarif_data.get("runs", [])
if not runs:
return summary
run = runs[0]
results = run.get("results", [])
summary["total_issues"] = len(results)
# Count by severity/level
for result in results:
try:
level = result.get("level", "note")
rule_id = result.get("ruleId", "unknown")
summary["by_severity"][level] = summary["by_severity"].get(level, 0) + 1
summary["by_rule"][rule_id] = summary["by_rule"].get(rule_id, 0) + 1
except Exception as e:
logger.warning(f"Failed to process result: {e}")
continue
# Extract tool information safely
try:
tool = run.get("tool", {})
driver = tool.get("driver", {})
if driver.get("name"):
summary["tools"].append({
"name": driver.get("name", "unknown"),
"version": driver.get("version", "unknown"),
"rules": len(driver.get("rules", []))
})
except Exception as e:
logger.warning(f"Failed to extract tool information: {e}")
except Exception as e:
logger.error(f"Failed to extract SARIF summary: {e}")
summary["validation_issues"].append(f"Summary extraction failed: {e}")
return summary
def validate_workflow_parameters(parameters: Dict[str, Any],
workflow_schema: Dict[str, Any]) -> List[str]:
"""
Validate workflow parameters against schema with detailed error messages.
Args:
parameters: Parameters to validate
workflow_schema: JSON schema for the workflow
Returns:
List of validation error messages
"""
errors = []
try:
properties = workflow_schema.get("properties", {})
required = set(workflow_schema.get("required", []))
# Check required parameters
missing_required = required - set(parameters.keys())
if missing_required:
errors.append(f"Missing required parameters: {', '.join(missing_required)}")
# Validate individual parameters
for param_name, param_value in parameters.items():
if param_name not in properties:
errors.append(f"Unknown parameter: {param_name}")
continue
param_schema = properties[param_name]
param_type = param_schema.get("type", "string")
# Type validation
if param_type == "integer" and not isinstance(param_value, int):
errors.append(f"Parameter '{param_name}' must be an integer")
elif param_type == "number" and not isinstance(param_value, (int, float)):
errors.append(f"Parameter '{param_name}' must be a number")
elif param_type == "boolean" and not isinstance(param_value, bool):
errors.append(f"Parameter '{param_name}' must be a boolean")
elif param_type == "array" and not isinstance(param_value, list):
errors.append(f"Parameter '{param_name}' must be an array")
# Range validation for numbers
if param_type in ["integer", "number"] and isinstance(param_value, (int, float)):
minimum = param_schema.get("minimum")
maximum = param_schema.get("maximum")
if minimum is not None and param_value < minimum:
errors.append(f"Parameter '{param_name}' must be >= {minimum}")
if maximum is not None and param_value > maximum:
errors.append(f"Parameter '{param_name}' must be <= {maximum}")
except Exception as e:
logger.error(f"Parameter validation failed: {e}")
errors.append(f"Parameter validation error: {e}")
return errors
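A condensed usage sketch of the same checks — required keys, basic type matching, and numeric range bounds — against a hypothetical JSON-schema fragment (the schema and `check()` helper below are illustrative, not FuzzForge code):

```python
# Condensed sketch of validate_workflow_parameters(): required keys,
# integer type checks, and min/max bounds. Schema values are made up.
schema = {
    "required": ["timeout"],
    "properties": {
        "timeout": {"type": "integer", "minimum": 60, "maximum": 86400},
        "threads": {"type": "integer", "minimum": 1},
    },
}

def check(params):
    errors = []
    for name in set(schema["required"]) - set(params):
        errors.append(f"Missing required parameter: {name}")
    for name, value in params.items():
        spec = schema["properties"].get(name)
        if spec is None:
            errors.append(f"Unknown parameter: {name}")
            continue
        if spec["type"] == "integer" and not isinstance(value, int):
            errors.append(f"'{name}' must be an integer")
        elif isinstance(value, int):
            if "minimum" in spec and value < spec["minimum"]:
                errors.append(f"'{name}' must be >= {spec['minimum']}")
            if "maximum" in spec and value > spec["maximum"]:
                errors.append(f"'{name}' must be <= {spec['maximum']}")
    return errors

print(check({"timeout": 30}))  # ["'timeout' must be >= 60"]
```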
def create_fallback_response(response_type: str, **kwargs) -> Dict[str, Any]:
"""
Create fallback responses when API calls fail.
Args:
response_type: Type of response to create
**kwargs: Additional data for the fallback
Returns:
Fallback response dictionary
"""
fallbacks = {
"workflow_list": {
"workflows": [],
"message": "Unable to fetch workflows from API"
},
"run_status": {
"run_id": kwargs.get("run_id", "unknown"),
"workflow": kwargs.get("workflow", "unknown"),
"status": "unknown",
"created_at": kwargs.get("created_at", "unknown"),
"updated_at": kwargs.get("updated_at", "unknown"),
"message": "Unable to fetch run status from API"
},
"findings": {
"run_id": kwargs.get("run_id", "unknown"),
"sarif": {
"version": "2.1.0",
"runs": []
},
"message": "Unable to fetch findings from API"
}
}
fallback = fallbacks.get(response_type, {"message": f"No fallback available for {response_type}"})
logger.info(f"Using fallback response for {response_type}: {fallback.get('message', 'Unknown fallback')}")
return fallback


@@ -1,14 +0,0 @@
"""
Command modules for FuzzForge CLI.
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.


@@ -1,133 +0,0 @@
"""AI integration commands for the FuzzForge CLI."""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
from __future__ import annotations
import asyncio
import os
from datetime import datetime
from typing import Optional
import typer
from rich.console import Console
from rich.panel import Panel
from rich.table import Table
from ..config import ProjectConfigManager
console = Console()
app = typer.Typer(name="ai", help="Interact with the FuzzForge AI system")
@app.command("agent")
def ai_agent() -> None:
"""Launch the full AI agent CLI with A2A orchestration."""
console.print("[cyan]🤖 Opening Project FuzzForge AI Agent session[/cyan]\n")
try:
from fuzzforge_ai.cli import FuzzForgeCLI
cli = FuzzForgeCLI()
asyncio.run(cli.run())
except ImportError as exc:
console.print(f"[red]Failed to import AI CLI:[/red] {exc}")
console.print("[dim]Ensure AI dependencies are installed (pip install -e .)[/dim]")
raise typer.Exit(1) from exc
except Exception as exc: # pragma: no cover - runtime safety
console.print(f"[red]Failed to launch AI agent:[/red] {exc}")
console.print("[dim]Check that .env contains LITELLM_MODEL and API keys[/dim]")
raise typer.Exit(1) from exc
# Memory + health commands
@app.command("status")
def ai_status() -> None:
"""Show AI system health and configuration."""
try:
status = asyncio.run(get_ai_status_async())
except Exception as exc: # pragma: no cover
console.print(f"[red]Failed to get AI status:[/red] {exc}")
raise typer.Exit(1) from exc
console.print("[bold cyan]🤖 FuzzForge AI System Status[/bold cyan]\n")
config_table = Table(title="Configuration", show_header=True, header_style="bold magenta")
config_table.add_column("Setting", style="bold")
config_table.add_column("Value", style="cyan")
config_table.add_column("Status", style="green")
for key, info in status["config"].items():
status_icon = "✅" if info["configured"] else "❌"
display_value = info["value"] if info["value"] else "-"
config_table.add_row(key, display_value, f"{status_icon}")
console.print(config_table)
console.print()
components_table = Table(title="AI Components", show_header=True, header_style="bold magenta")
components_table.add_column("Component", style="bold")
components_table.add_column("Status", style="green")
components_table.add_column("Details", style="dim")
for component, info in status["components"].items():
status_icon = "🟢" if info["available"] else "🔴"
components_table.add_row(component, status_icon, info["details"])
console.print(components_table)
if status["agents"]:
console.print()
console.print(f"[bold green]✓[/bold green] {len(status['agents'])} agents registered")
@app.command("server")
def ai_server(
port: int = typer.Option(10100, "--port", "-p", help="Server port (default: 10100)"),
) -> None:
"""Start AI system as an A2A server."""
console.print(f"[cyan]🚀 Starting FuzzForge AI Server on port {port}[/cyan]")
console.print("[dim]Other agents can register this instance at the A2A endpoint[/dim]\n")
try:
os.environ["FUZZFORGE_PORT"] = str(port)
from fuzzforge_ai.__main__ import main as start_server
start_server()
except Exception as exc: # pragma: no cover
console.print(f"[red]Failed to start AI server:[/red] {exc}")
raise typer.Exit(1) from exc
# ---------------------------------------------------------------------------
# Helper functions (largely adapted from the OSS implementation)
# ---------------------------------------------------------------------------
@app.callback(invoke_without_command=True)
def ai_callback(ctx: typer.Context):
"""
🤖 AI integration features
"""
# Check if a subcommand is being invoked
if ctx.invoked_subcommand is not None:
# Let the subcommand handle it
return
# Show not implemented message for default command
console.print("🚧 [yellow]AI command is not fully implemented yet.[/yellow]")
console.print("Please use specific subcommands:")
console.print(" • [cyan]ff ai agent[/cyan] - Launch the full AI agent CLI")
console.print(" • [cyan]ff ai status[/cyan] - Show AI system health and configuration")
console.print(" • [cyan]ff ai server[/cyan] - Start AI system as an A2A server")


@@ -1,384 +0,0 @@
"""
Configuration management commands.
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import typer
from pathlib import Path
from rich.console import Console
from rich.table import Table
from rich.panel import Panel
from rich.prompt import Prompt, Confirm
from rich import box
from typing import Optional
from ..config import (
get_project_config,
ensure_project_config,
get_global_config,
save_global_config,
FuzzForgeConfig
)
from ..exceptions import require_project, ValidationError, handle_error
console = Console()
app = typer.Typer()
@app.command("show")
def show_config(
global_config: bool = typer.Option(
False, "--global", "-g",
help="Show global configuration instead of project config"
)
):
"""
📋 Display current configuration settings
"""
if global_config:
config = get_global_config()
config_type = "Global"
config_path = Path.home() / ".config" / "fuzzforge" / "config.yaml"
else:
try:
require_project()
config = get_project_config()
if not config:
raise ValidationError("project configuration", "missing", "initialized project")
except Exception as e:
handle_error(e, "loading project configuration")
return # Unreachable, but makes static analysis happy
config_type = "Project"
config_path = Path.cwd() / ".fuzzforge" / "config.yaml"
console.print(f"\n⚙️ [bold]{config_type} Configuration[/bold]\n")
# Project settings
project_table = Table(show_header=False, box=box.SIMPLE)
project_table.add_column("Setting", style="bold cyan")
project_table.add_column("Value")
project_table.add_row("Project Name", config.project.name)
project_table.add_row("API URL", config.project.api_url)
project_table.add_row("Default Timeout", f"{config.project.default_timeout}s")
if config.project.default_workflow:
project_table.add_row("Default Workflow", config.project.default_workflow)
console.print(
Panel.fit(
project_table,
title="📁 Project Settings",
box=box.ROUNDED
)
)
# Retention settings
retention_table = Table(show_header=False, box=box.SIMPLE)
retention_table.add_column("Setting", style="bold cyan")
retention_table.add_column("Value")
retention_table.add_row("Max Runs", str(config.retention.max_runs))
retention_table.add_row("Keep Findings (days)", str(config.retention.keep_findings_days))
console.print(
Panel.fit(
retention_table,
title="🗄️ Data Retention",
box=box.ROUNDED
)
)
# Preferences
prefs_table = Table(show_header=False, box=box.SIMPLE)
prefs_table.add_column("Setting", style="bold cyan")
prefs_table.add_column("Value")
prefs_table.add_row("Auto Save Findings", "✅ Yes" if config.preferences.auto_save_findings else "❌ No")
prefs_table.add_row("Show Progress Bars", "✅ Yes" if config.preferences.show_progress_bars else "❌ No")
prefs_table.add_row("Table Style", config.preferences.table_style)
prefs_table.add_row("Color Output", "✅ Yes" if config.preferences.color_output else "❌ No")
console.print(
Panel.fit(
prefs_table,
title="🎨 Preferences",
box=box.ROUNDED
)
)
console.print(f"\n📍 Config file: [dim]{config_path}[/dim]")
@app.command("set")
def set_config(
key: str = typer.Argument(..., help="Configuration key to set (e.g., 'project.name', 'project.api_url')"),
value: str = typer.Argument(..., help="Value to set"),
global_config: bool = typer.Option(
False, "--global", "-g",
help="Set in global configuration instead of project config"
)
):
"""
⚙️ Set a configuration value
"""
if global_config:
config = get_global_config()
config_type = "global"
else:
config = get_project_config()
if not config:
console.print("❌ No project configuration found. Run 'ff init' first.", style="red")
raise typer.Exit(1)
config_type = "project"
# Parse the key path
key_parts = key.split('.')
if len(key_parts) != 2:
console.print("❌ Key must be in format 'section.setting' (e.g., 'project.name')", style="red")
raise typer.Exit(1)
section, setting = key_parts
try:
# Update configuration
if section == "project":
if setting == "name":
config.project.name = value
elif setting == "api_url":
config.project.api_url = value
elif setting == "default_timeout":
config.project.default_timeout = int(value)
elif setting == "default_workflow":
config.project.default_workflow = value if value.lower() != "none" else None
else:
console.print(f"❌ Unknown project setting: {setting}", style="red")
raise typer.Exit(1)
elif section == "retention":
if setting == "max_runs":
config.retention.max_runs = int(value)
elif setting == "keep_findings_days":
config.retention.keep_findings_days = int(value)
else:
console.print(f"❌ Unknown retention setting: {setting}", style="red")
raise typer.Exit(1)
elif section == "preferences":
if setting == "auto_save_findings":
config.preferences.auto_save_findings = value.lower() in ("true", "yes", "1", "on")
elif setting == "show_progress_bars":
config.preferences.show_progress_bars = value.lower() in ("true", "yes", "1", "on")
elif setting == "table_style":
config.preferences.table_style = value
elif setting == "color_output":
config.preferences.color_output = value.lower() in ("true", "yes", "1", "on")
else:
console.print(f"❌ Unknown preferences setting: {setting}", style="red")
raise typer.Exit(1)
else:
console.print(f"❌ Unknown configuration section: {section}", style="red")
console.print("Valid sections: project, retention, preferences", style="dim")
raise typer.Exit(1)
# Save configuration
if global_config:
save_global_config(config)
else:
config_path = Path.cwd() / ".fuzzforge" / "config.yaml"
config.save_to_file(config_path)
console.print(f"✅ Set {config_type} configuration: [bold cyan]{key}[/bold cyan] = [bold]{value}[/bold]", style="green")
except ValueError as e:
console.print(f"❌ Invalid value for {key}: {e}", style="red")
raise typer.Exit(1)
except Exception as e:
console.print(f"❌ Failed to set configuration: {e}", style="red")
raise typer.Exit(1)
@app.command("get")
def get_config(
key: str = typer.Argument(..., help="Configuration key to get (e.g., 'project.name')"),
global_config: bool = typer.Option(
False, "--global", "-g",
help="Get from global configuration instead of project config"
)
):
"""
📖 Get a specific configuration value
"""
if global_config:
config = get_global_config()
else:
config = get_project_config()
if not config:
console.print("❌ No project configuration found. Run 'ff init' first.", style="red")
raise typer.Exit(1)
# Parse the key path
key_parts = key.split('.')
if len(key_parts) != 2:
console.print("❌ Key must be in format 'section.setting' (e.g., 'project.name')", style="red")
raise typer.Exit(1)
section, setting = key_parts
try:
# Get configuration value
if section == "project":
if setting == "name":
value = config.project.name
elif setting == "api_url":
value = config.project.api_url
elif setting == "default_timeout":
value = config.project.default_timeout
elif setting == "default_workflow":
value = config.project.default_workflow or "none"
else:
console.print(f"❌ Unknown project setting: {setting}", style="red")
raise typer.Exit(1)
elif section == "retention":
if setting == "max_runs":
value = config.retention.max_runs
elif setting == "keep_findings_days":
value = config.retention.keep_findings_days
else:
console.print(f"❌ Unknown retention setting: {setting}", style="red")
raise typer.Exit(1)
elif section == "preferences":
if setting == "auto_save_findings":
value = config.preferences.auto_save_findings
elif setting == "show_progress_bars":
value = config.preferences.show_progress_bars
elif setting == "table_style":
value = config.preferences.table_style
elif setting == "color_output":
value = config.preferences.color_output
else:
console.print(f"❌ Unknown preferences setting: {setting}", style="red")
raise typer.Exit(1)
else:
console.print(f"❌ Unknown configuration section: {section}", style="red")
raise typer.Exit(1)
console.print(f"{key}: [bold cyan]{value}[/bold cyan]")
except Exception as e:
console.print(f"❌ Failed to get configuration: {e}", style="red")
raise typer.Exit(1)
@app.command("reset")
def reset_config(
global_config: bool = typer.Option(
False, "--global", "-g",
help="Reset global configuration instead of project config"
),
force: bool = typer.Option(
False, "--force", "-f",
help="Skip confirmation prompt"
)
):
"""
🔄 Reset configuration to defaults
"""
config_type = "global" if global_config else "project"
if not force:
if not Confirm.ask(f"Reset {config_type} configuration to defaults?", default=False, console=console):
console.print("❌ Reset cancelled", style="yellow")
raise typer.Exit(0)
try:
# Create new default configuration
new_config = FuzzForgeConfig()
if global_config:
save_global_config(new_config)
else:
if not Path.cwd().joinpath(".fuzzforge").exists():
console.print("❌ No project configuration found. Run 'ff init' first.", style="red")
raise typer.Exit(1)
config_path = Path.cwd() / ".fuzzforge" / "config.yaml"
new_config.save_to_file(config_path)
console.print(f"{config_type.title()} configuration reset to defaults", style="green")
except Exception as e:
console.print(f"❌ Failed to reset configuration: {e}", style="red")
raise typer.Exit(1)
@app.command("edit")
def edit_config(
global_config: bool = typer.Option(
False, "--global", "-g",
help="Edit global configuration instead of project config"
)
):
"""
📝 Open configuration file in default editor
"""
import os
import subprocess
if global_config:
config_path = Path.home() / ".config" / "fuzzforge" / "config.yaml"
config_type = "global"
else:
config_path = Path.cwd() / ".fuzzforge" / "config.yaml"
config_type = "project"
if not config_path.exists():
console.print(f"❌ No {config_type} configuration found. Run 'ff init' first.", style="red")
raise typer.Exit(1)
# Try to find a suitable editor
editors = ["code", "vim", "nano", "notepad"]
editor = None
for e in editors:
try:
subprocess.run([e, "--version"], capture_output=True, check=True)
editor = e
break
except (subprocess.CalledProcessError, FileNotFoundError):
continue
if not editor:
console.print(f"📍 Configuration file: [bold cyan]{config_path}[/bold cyan]")
console.print("❌ No suitable editor found. Please edit the file manually.", style="red")
raise typer.Exit(1)
try:
console.print(f"📝 Opening {config_type} configuration in {editor}...")
subprocess.run([editor, str(config_path)], check=True)
console.print("✅ Configuration file edited", style="green")
except subprocess.CalledProcessError as e:
console.print(f"❌ Failed to open editor: {e}", style="red")
raise typer.Exit(1)
@app.callback()
def config_callback():
"""
⚙️ Manage configuration settings
"""
pass
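The boolean preferences above all share the same truthy-token parsing. A minimal standalone sketch of that rule (the helper name is illustrative, not part of the CLI):

```python
def parse_bool(value: str) -> bool:
    """Interpret a config value string the way `set` does for boolean settings."""
    return value.lower() in ("true", "yes", "1", "on")

# Anything outside the accepted tokens is treated as false.
print(parse_bool("YES"), parse_bool("on"), parse_bool("off"))  # True True False
```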


@@ -1,940 +0,0 @@
"""
Findings and security results management commands.
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import json
import csv
from datetime import datetime
from pathlib import Path
from typing import Optional, Dict, Any, List
import typer
from rich.console import Console
from rich.table import Table, Column
from rich.panel import Panel
from rich.syntax import Syntax
from rich.tree import Tree
from rich.text import Text
from rich import box
from ..config import get_project_config, FuzzForgeConfig
from ..database import get_project_db, ensure_project_db, FindingRecord
from ..exceptions import (
handle_error, retry_on_network_error, validate_run_id,
require_project, ValidationError, DatabaseError
)
from fuzzforge_sdk import FuzzForgeClient
console = Console()
app = typer.Typer()
@retry_on_network_error(max_retries=3, delay=1.0)
def get_client() -> FuzzForgeClient:
"""Get configured FuzzForge client with retry on network errors"""
config = get_project_config() or FuzzForgeConfig()
return FuzzForgeClient(base_url=config.get_api_url(), timeout=config.get_timeout())
def severity_style(severity: str) -> str:
"""Get rich style for severity level"""
return {
"error": "bold red",
"warning": "bold yellow",
"note": "bold blue",
"info": "bold cyan"
}.get(severity.lower(), "white")
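The lookup is case-insensitive with a plain-white fallback; the same mapping can be exercised on its own:

```python
def severity_style(severity: str) -> str:
    """Map a SARIF level to a Rich style string; unknown levels fall back to white."""
    return {
        "error": "bold red",
        "warning": "bold yellow",
        "note": "bold blue",
        "info": "bold cyan",
    }.get(severity.lower(), "white")

print(severity_style("ERROR"))     # bold red
print(severity_style("whatever"))  # white
```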
@app.command("get")
def get_findings(
run_id: str = typer.Argument(..., help="Run ID to get findings for"),
save: bool = typer.Option(
True, "--save/--no-save",
help="Save findings to local database"
),
format: str = typer.Option(
"table", "--format", "-f",
help="Output format: table, json, sarif"
)
):
"""
🔍 Retrieve and display security findings for a run
"""
try:
require_project()
validate_run_id(run_id)
if format not in ["table", "json", "sarif"]:
raise ValidationError("format", format, "one of: table, json, sarif")
with get_client() as client:
console.print(f"🔍 Fetching findings for run: {run_id}")
findings = client.get_run_findings(run_id)
# Save to database if requested
if save:
try:
db = ensure_project_db()
# Extract summary from SARIF
sarif_data = findings.sarif
runs_data = sarif_data.get("runs", [])
summary = {}
if runs_data:
results = runs_data[0].get("results", [])
summary = {
"total_issues": len(results),
"by_severity": {},
"by_rule": {},
"tools": []
}
for result in results:
level = result.get("level", "note")
rule_id = result.get("ruleId", "unknown")
summary["by_severity"][level] = summary["by_severity"].get(level, 0) + 1
summary["by_rule"][rule_id] = summary["by_rule"].get(rule_id, 0) + 1
# Extract tool info
tool = runs_data[0].get("tool", {})
driver = tool.get("driver", {})
if driver.get("name"):
summary["tools"].append({
"name": driver.get("name"),
"version": driver.get("version"),
"rules": len(driver.get("rules", []))
})
finding_record = FindingRecord(
run_id=run_id,
sarif_data=sarif_data,
summary=summary,
created_at=datetime.now()
)
db.save_findings(finding_record)
console.print("✅ Findings saved to local database", style="green")
except Exception as e:
console.print(f"⚠️ Failed to save findings to database: {e}", style="yellow")
# Display findings
if format == "json":
findings_json = json.dumps(findings.sarif, indent=2)
console.print(Syntax(findings_json, "json", theme="monokai"))
elif format == "sarif":
sarif_json = json.dumps(findings.sarif, indent=2)
console.print(sarif_json)
else: # table format
display_findings_table(findings.sarif)
except Exception as e:
console.print(f"❌ Failed to get findings: {e}", style="red")
raise typer.Exit(1)
def display_findings_table(sarif_data: Dict[str, Any]):
"""Display SARIF findings in a rich table format"""
runs = sarif_data.get("runs", [])
if not runs:
console.print(" No findings data available", style="dim")
return
run_data = runs[0]
results = run_data.get("results", [])
tool = run_data.get("tool", {})
driver = tool.get("driver", {})
# Tool information
console.print(f"\n🔍 [bold]Security Analysis Results[/bold]")
if driver.get("name"):
console.print(f"Tool: {driver.get('name')} v{driver.get('version', 'unknown')}")
if not results:
console.print("✅ No security issues found!", style="green")
return
# Summary statistics
summary_by_level = {}
for result in results:
level = result.get("level", "note")
summary_by_level[level] = summary_by_level.get(level, 0) + 1
summary_table = Table(show_header=False, box=box.SIMPLE)
summary_table.add_column("Severity", width=15, justify="left", style="bold")
summary_table.add_column("Count", width=8, justify="right", style="bold")
for level, count in sorted(summary_by_level.items()):
# Create Rich Text object with color styling
level_text = level.upper()
severity_text = Text(level_text, style=severity_style(level))
count_text = Text(str(count))
summary_table.add_row(severity_text, count_text)
console.print(
Panel.fit(
summary_table,
title=f"📊 Summary ({len(results)} total issues)",
box=box.ROUNDED
)
)
# Detailed results - Rich Text-based table with proper emoji alignment
results_table = Table(box=box.ROUNDED)
results_table.add_column("Severity", width=12, justify="left", no_wrap=True)
results_table.add_column("Rule", width=25, justify="left", style="bold cyan", no_wrap=True)
results_table.add_column("Message", width=55, justify="left", no_wrap=True)
results_table.add_column("Location", width=20, justify="left", style="dim", no_wrap=True)
for result in results[:50]: # Limit to first 50 results
level = result.get("level", "note")
rule_id = result.get("ruleId", "unknown")
message = result.get("message", {}).get("text", "No message")
# Extract location information
locations = result.get("locations", [])
location_str = ""
if locations:
physical_location = locations[0].get("physicalLocation", {})
artifact_location = physical_location.get("artifactLocation", {})
region = physical_location.get("region", {})
file_path = artifact_location.get("uri", "")
if file_path:
location_str = Path(file_path).name
if region.get("startLine"):
location_str += f":{region['startLine']}"
if region.get("startColumn"):
location_str += f":{region['startColumn']}"
# Create Rich Text objects with color styling
severity_text = Text(level.upper(), style=severity_style(level))
severity_text.truncate(12, overflow="ellipsis")
rule_text = Text(rule_id)
rule_text.truncate(25, overflow="ellipsis")
message_text = Text(message)
message_text.truncate(55, overflow="ellipsis")
location_text = Text(location_str)
location_text.truncate(20, overflow="ellipsis")
results_table.add_row(
severity_text,
rule_text,
message_text,
location_text
)
console.print(f"\n📋 [bold]Detailed Results[/bold]")
if len(results) > 50:
console.print(f"Showing first 50 of {len(results)} results")
console.print()
console.print(results_table)
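The `file:line:column` location string shown in the table is built from a SARIF `physicalLocation`; a self-contained sketch with illustrative sample data:

```python
from pathlib import Path

# Hypothetical SARIF result fragment, shaped like the ones the table iterates over.
result = {
    "locations": [{
        "physicalLocation": {
            "artifactLocation": {"uri": "src/app/main.py"},
            "region": {"startLine": 42, "startColumn": 7},
        }
    }]
}

location_str = ""
locations = result.get("locations", [])
if locations:
    physical = locations[0].get("physicalLocation", {})
    file_path = physical.get("artifactLocation", {}).get("uri", "")
    region = physical.get("region", {})
    if file_path:
        location_str = Path(file_path).name  # the table shows only the basename
        if region.get("startLine"):
            location_str += f":{region['startLine']}"
        if region.get("startColumn"):
            location_str += f":{region['startColumn']}"

print(location_str)  # main.py:42:7
```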
@app.command("history")
def findings_history(
limit: int = typer.Option(20, "--limit", "-l", help="Maximum number of findings to show")
):
"""
📚 Show findings history from local database
"""
db = get_project_db()
if not db:
console.print("❌ No FuzzForge project found. Run 'ff init' first.", style="red")
raise typer.Exit(1)
try:
findings = db.list_findings(limit=limit)
if not findings:
console.print("❌ No findings found in database", style="red")
return
table = Table(box=box.ROUNDED)
table.add_column("Run ID", style="bold cyan", width=36) # Full UUID width
table.add_column("Date", justify="center")
table.add_column("Total Issues", justify="center", style="bold")
table.add_column("Errors", justify="center", style="red")
table.add_column("Warnings", justify="center", style="yellow")
table.add_column("Notes", justify="center", style="blue")
table.add_column("Tools", style="dim")
for finding in findings:
summary = finding.summary
total_issues = summary.get("total_issues", 0)
by_severity = summary.get("by_severity", {})
tools = summary.get("tools", [])
tool_names = ", ".join([tool.get("name", "Unknown") for tool in tools])
table.add_row(
finding.run_id, # Show full Run ID
finding.created_at.strftime("%m-%d %H:%M"),
str(total_issues),
str(by_severity.get("error", 0)),
str(by_severity.get("warning", 0)),
str(by_severity.get("note", 0)),
tool_names[:30] + "..." if len(tool_names) > 30 else tool_names
)
console.print(f"\n📚 [bold]Findings History ({len(findings)})[/bold]\n")
console.print(table)
console.print(f"\n💡 Use [bold cyan]fuzzforge findings get <run-id>[/bold cyan] to view detailed findings")
except Exception as e:
console.print(f"❌ Failed to get findings history: {e}", style="red")
raise typer.Exit(1)
@app.command("export")
def export_findings(
run_id: str = typer.Argument(..., help="Run ID to export findings for"),
format: str = typer.Option(
"json", "--format", "-f",
help="Export format: json, csv, html, sarif"
),
output: Optional[str] = typer.Option(
None, "--output", "-o",
help="Output file path (defaults to findings-<run-id>.<format>)"
)
):
"""
📤 Export security findings in various formats
"""
db = get_project_db()
if not db:
console.print("❌ No FuzzForge project found. Run 'ff init' first.", style="red")
raise typer.Exit(1)
try:
# Get findings from database first, fallback to API
findings_data = db.get_findings(run_id)
if not findings_data:
console.print(f"📡 Fetching findings from API for run: {run_id}")
with get_client() as client:
findings = client.get_run_findings(run_id)
sarif_data = findings.sarif
else:
sarif_data = findings_data.sarif_data
# Generate output filename
if not output:
output = f"findings-{run_id[:8]}.{format}"
output_path = Path(output)
# Export based on format
if format == "sarif":
with open(output_path, 'w') as f:
json.dump(sarif_data, f, indent=2)
elif format == "json":
# Simplified JSON format
simplified_data = extract_simplified_findings(sarif_data)
with open(output_path, 'w') as f:
json.dump(simplified_data, f, indent=2)
elif format == "csv":
export_to_csv(sarif_data, output_path)
elif format == "html":
export_to_html(sarif_data, output_path, run_id)
else:
console.print(f"❌ Unsupported format: {format}", style="red")
raise typer.Exit(1)
console.print(f"✅ Findings exported to: [bold cyan]{output_path}[/bold cyan]")
except Exception as e:
console.print(f"❌ Failed to export findings: {e}", style="red")
raise typer.Exit(1)
def extract_simplified_findings(sarif_data: Dict[str, Any]) -> Dict[str, Any]:
"""Extract simplified findings structure from SARIF"""
runs = sarif_data.get("runs", [])
if not runs:
return {"findings": [], "summary": {}}
run_data = runs[0]
results = run_data.get("results", [])
tool = run_data.get("tool", {}).get("driver", {})
simplified = {
"tool": {
"name": tool.get("name", "Unknown"),
"version": tool.get("version", "Unknown")
},
"summary": {
"total_issues": len(results),
"by_severity": {}
},
"findings": []
}
for result in results:
level = result.get("level", "note")
simplified["summary"]["by_severity"][level] = simplified["summary"]["by_severity"].get(level, 0) + 1
# Extract location
location_info = {}
locations = result.get("locations", [])
if locations:
physical_location = locations[0].get("physicalLocation", {})
artifact_location = physical_location.get("artifactLocation", {})
region = physical_location.get("region", {})
location_info = {
"file": artifact_location.get("uri", ""),
"line": region.get("startLine"),
"column": region.get("startColumn")
}
simplified["findings"].append({
"rule_id": result.get("ruleId", "unknown"),
"severity": level,
"message": result.get("message", {}).get("text", ""),
"location": location_info
})
return simplified
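Fed a minimal SARIF document, the function above reduces each result to a severity tally plus a flat findings list. The core tally logic in isolation, on illustrative data:

```python
# Hypothetical SARIF results; the third omits "level", which defaults to "note".
results = [
    {"ruleId": "py/sql-injection", "level": "error"},
    {"ruleId": "py/weak-hash", "level": "warning"},
    {"ruleId": "py/todo-comment"},
]

by_severity = {}
for result in results:
    level = result.get("level", "note")
    by_severity[level] = by_severity.get(level, 0) + 1

print(by_severity)  # {'error': 1, 'warning': 1, 'note': 1}
```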
def export_to_csv(sarif_data: Dict[str, Any], output_path: Path):
"""Export findings to CSV format"""
runs = sarif_data.get("runs", [])
if not runs:
return
results = runs[0].get("results", [])
with open(output_path, 'w', newline='', encoding='utf-8') as csvfile:
fieldnames = ['rule_id', 'severity', 'message', 'file', 'line', 'column']
writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
writer.writeheader()
for result in results:
location_info = {"file": "", "line": "", "column": ""}
locations = result.get("locations", [])
if locations:
physical_location = locations[0].get("physicalLocation", {})
artifact_location = physical_location.get("artifactLocation", {})
region = physical_location.get("region", {})
location_info = {
"file": artifact_location.get("uri", ""),
"line": region.get("startLine", ""),
"column": region.get("startColumn", "")
}
writer.writerow({
"rule_id": result.get("ruleId", ""),
"severity": result.get("level", "note"),
"message": result.get("message", {}).get("text", ""),
**location_info
})
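A quick check of the row shape the CSV exporter produces, using `csv.DictWriter` with the same field names but an in-memory buffer instead of a file (the row values are illustrative):

```python
import csv
import io

fieldnames = ["rule_id", "severity", "message", "file", "line", "column"]
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=fieldnames)
writer.writeheader()
writer.writerow({
    "rule_id": "py/weak-hash",
    "severity": "warning",
    "message": "MD5 is not collision resistant",
    "file": "src/hashing.py",
    "line": 10,
    "column": 5,
})

print(buf.getvalue().splitlines()[0])  # rule_id,severity,message,file,line,column
```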
def export_to_html(sarif_data: Dict[str, Any], output_path: Path, run_id: str):
"""Export findings to HTML format"""
runs = sarif_data.get("runs", [])
if not runs:
return
run_data = runs[0]
results = run_data.get("results", [])
tool = run_data.get("tool", {}).get("driver", {})
# Simple HTML template
html_content = f"""<!DOCTYPE html>
<html>
<head>
<title>Security Findings - {run_id}</title>
<style>
body {{ font-family: Arial, sans-serif; margin: 40px; }}
.header {{ background: #f4f4f4; padding: 20px; border-radius: 5px; }}
.summary {{ margin: 20px 0; }}
.findings {{ margin: 20px 0; }}
table {{ width: 100%; border-collapse: collapse; }}
th, td {{ padding: 10px; text-align: left; border-bottom: 1px solid #ddd; }}
th {{ background-color: #f2f2f2; }}
.error {{ color: #d32f2f; }}
.warning {{ color: #f57c00; }}
.note {{ color: #1976d2; }}
.info {{ color: #388e3c; }}
</style>
</head>
<body>
<div class="header">
<h1>Security Findings Report</h1>
<p><strong>Run ID:</strong> {run_id}</p>
<p><strong>Tool:</strong> {tool.get('name', 'Unknown')} v{tool.get('version', 'Unknown')}</p>
<p><strong>Generated:</strong> {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}</p>
</div>
<div class="summary">
<h2>Summary</h2>
<p><strong>Total Issues:</strong> {len(results)}</p>
</div>
<div class="findings">
<h2>Detailed Findings</h2>
<table>
<thead>
<tr>
<th>Rule ID</th>
<th>Severity</th>
<th>Message</th>
<th>Location</th>
</tr>
</thead>
<tbody>
"""
import html as _html  # escape untrusted SARIF text before embedding it in HTML
for result in results:
level = result.get("level", "note")
rule_id = _html.escape(result.get("ruleId", "unknown"))
message = _html.escape(result.get("message", {}).get("text", ""))
# Extract location
location_str = ""
locations = result.get("locations", [])
if locations:
physical_location = locations[0].get("physicalLocation", {})
artifact_location = physical_location.get("artifactLocation", {})
region = physical_location.get("region", {})
file_path = artifact_location.get("uri", "")
if file_path:
location_str = file_path
if region.get("startLine"):
location_str += f":{region['startLine']}"
html_content += f"""
<tr>
<td>{rule_id}</td>
<td class="{level}">{level}</td>
<td>{message}</td>
<td>{location_str}</td>
</tr>
"""
html_content += """
</tbody>
</table>
</div>
</body>
</html>
"""
with open(output_path, 'w', encoding='utf-8') as f:
f.write(html_content)
@app.command("all")
def all_findings(
workflow: Optional[str] = typer.Option(
None, "--workflow", "-w",
help="Filter by workflow name"
),
severity: Optional[str] = typer.Option(
None, "--severity", "-s",
help="Filter by severity levels (comma-separated: error,warning,note,info)"
),
since: Optional[str] = typer.Option(
None, "--since",
help="Show findings since date (YYYY-MM-DD)"
),
limit: Optional[int] = typer.Option(
None, "--limit", "-l",
help="Maximum number of findings to show"
),
export_format: Optional[str] = typer.Option(
None, "--export", "-e",
help="Export format: json, csv, html"
),
output: Optional[str] = typer.Option(
None, "--output", "-o",
help="Output file for export"
),
stats_only: bool = typer.Option(
False, "--stats",
help="Show statistics only"
),
show_findings: bool = typer.Option(
False, "--show-findings", "-f",
help="Show actual findings content, not just summary"
),
max_findings: int = typer.Option(
50, "--max-findings",
help="Maximum number of individual findings to display"
)
):
"""
📊 Show all findings for the entire project
"""
db = get_project_db()
if not db:
console.print("❌ No FuzzForge project found. Run 'ff init' first.", style="red")
raise typer.Exit(1)
try:
# Parse filters
severity_list = None
if severity:
severity_list = [s.strip().lower() for s in severity.split(",")]
since_date = None
if since:
try:
since_date = datetime.strptime(since, "%Y-%m-%d")
except ValueError:
console.print(f"❌ Invalid date format: {since}. Use YYYY-MM-DD", style="red")
raise typer.Exit(1)
# Get aggregated stats
stats = db.get_aggregated_stats()
# Show statistics
if stats_only or not export_format:
# Create summary panel
summary_text = f"""[bold]📊 Project Security Summary[/bold]
[cyan]Total Findings Records:[/cyan] {stats['total_findings_records']}
[cyan]Total Runs Analyzed:[/cyan] {stats['total_runs']}
[cyan]Total Security Issues:[/cyan] {stats['total_issues']}
[cyan]Recent Findings (7 days):[/cyan] {stats['recent_findings']}
[bold]Severity Distribution:[/bold]
🔴 Errors: {stats['severity_distribution'].get('error', 0)}
🟡 Warnings: {stats['severity_distribution'].get('warning', 0)}
🔵 Notes: {stats['severity_distribution'].get('note', 0)}
Info: {stats['severity_distribution'].get('info', 0)}
[bold]By Workflow:[/bold]"""
for wf_name, count in stats['workflows'].items():
summary_text += f"\n{wf_name}: {count} findings"
console.print(Panel(summary_text, box=box.ROUNDED, title="FuzzForge Project Analysis", border_style="cyan"))
if stats_only:
return
# Get all findings with filters
findings = db.get_all_findings(
workflow=workflow,
severity=severity_list,
since_date=since_date,
limit=limit
)
if not findings:
console.print(" No findings match the specified filters", style="dim")
return
# Export if requested
if export_format:
if not output:
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
output = f"all_findings_{timestamp}.{export_format}"
export_all_findings(findings, export_format, output)
console.print(f"✅ Exported {len(findings)} findings to: {output}", style="green")
return
# Display findings table
table = Table(box=box.ROUNDED, title=f"All Project Findings ({len(findings)} records)")
table.add_column("Run ID", style="bold cyan", width=36) # Full UUID width
table.add_column("Workflow", style="dim", width=20)
table.add_column("Date", justify="center")
table.add_column("Issues", justify="center", style="bold")
table.add_column("Errors", justify="center", style="red")
table.add_column("Warnings", justify="center", style="yellow")
table.add_column("Notes", justify="center", style="blue")
# Get run info for each finding
runs_info = {}
for finding in findings:
run_id = finding.run_id
if run_id not in runs_info:
run_info = db.get_run(run_id)
runs_info[run_id] = run_info
for finding in findings:
run_id = finding.run_id
run_info = runs_info.get(run_id)
workflow_name = run_info.workflow if run_info else "unknown"
summary = finding.summary
total_issues = summary.get("total_issues", 0)
by_severity = summary.get("by_severity", {})
# Count issues from SARIF data if summary is incomplete
if total_issues == 0 and "runs" in finding.sarif_data:
for run in finding.sarif_data["runs"]:
total_issues += len(run.get("results", []))
table.add_row(
run_id, # Show full Run ID
workflow_name[:17] + "..." if len(workflow_name) > 20 else workflow_name,
finding.created_at.strftime("%Y-%m-%d %H:%M"),
str(total_issues),
str(by_severity.get("error", 0)),
str(by_severity.get("warning", 0)),
str(by_severity.get("note", 0))
)
console.print(table)
# Show actual findings if requested
if show_findings:
display_detailed_findings(findings, max_findings)
console.print(f"\n💡 Use filters to refine results: --workflow, --severity, --since")
console.print(f"💡 Show findings content: --show-findings")
console.print(f"💡 Export findings: --export json --output report.json")
console.print(f"💡 View specific findings: [bold cyan]fuzzforge findings get <run-id>[/bold cyan]")
except Exception as e:
console.print(f"❌ Failed to get all findings: {e}", style="red")
raise typer.Exit(1)
def display_detailed_findings(findings: List[FindingRecord], max_findings: int):
"""Display detailed findings content"""
console.print(f"\n📋 [bold]Detailed Findings Content[/bold] (showing up to {max_findings} findings)\n")
findings_count = 0
for finding_record in findings:
if findings_count >= max_findings:
remaining = sum(len(run.get("results", []))
for f in findings[findings.index(finding_record):]
for run in f.sarif_data.get("runs", []))
if remaining > 0:
console.print(f"\n... and {remaining} more findings (use --max-findings to show more)")
break
# Get run info for this finding
sarif_data = finding_record.sarif_data
if not sarif_data or "runs" not in sarif_data:
continue
for run in sarif_data["runs"]:
tool = run.get("tool", {})
driver = tool.get("driver", {})
tool_name = driver.get("name", "Unknown Tool")
results = run.get("results", [])
if not results:
continue
# Group results by severity
for result in results:
if findings_count >= max_findings:
break
findings_count += 1
# Extract key information
rule_id = result.get("ruleId", "unknown")
level = result.get("level", "note").upper()
message_text = result.get("message", {}).get("text", "No description")
# Get location information
locations = result.get("locations", [])
location_str = "Unknown location"
if locations:
physical = locations[0].get("physicalLocation", {})
artifact = physical.get("artifactLocation", {})
region = physical.get("region", {})
file_path = artifact.get("uri", "")
line_number = region.get("startLine", "")
if file_path:
location_str = f"{file_path}"
if line_number:
location_str += f":{line_number}"
# Get severity style
severity_style = {
"ERROR": "bold red",
"WARNING": "bold yellow",
"NOTE": "bold blue",
"INFO": "bold cyan"
}.get(level, "white")
# Create finding panel
finding_content = f"""[bold]Rule:[/bold] {rule_id}
[bold]Location:[/bold] {location_str}
[bold]Tool:[/bold] {tool_name}
[bold]Run:[/bold] {finding_record.run_id[:12]}...
[bold]Description:[/bold]
{message_text}"""
# Add code context if available
region = locations[0].get("physicalLocation", {}).get("region", {}) if locations else {}
if region.get("snippet", {}).get("text"):
code_snippet = region["snippet"]["text"].strip()
finding_content += f"\n\n[bold]Code:[/bold]\n[dim]{code_snippet}[/dim]"
console.print(Panel(
finding_content,
title=f"[{severity_style}]{level}[/{severity_style}] Finding #{findings_count}",
border_style=severity_style.split()[-1] if " " in severity_style else severity_style,
box=box.ROUNDED
))
console.print() # Add spacing between findings
def export_all_findings(findings: List[FindingRecord], format: str, output_path: str):
"""Export all findings to specified format"""
output_file = Path(output_path)
if format == "json":
# Combine all SARIF data
all_results = []
for finding in findings:
if "runs" in finding.sarif_data:
for run in finding.sarif_data["runs"]:
for result in run.get("results", []):
result_entry = {
"run_id": finding.run_id,
"created_at": finding.created_at.isoformat(),
**result
}
all_results.append(result_entry)
with open(output_file, 'w') as f:
json.dump({
"total_findings": len(findings),
"export_date": datetime.now().isoformat(),
"results": all_results
}, f, indent=2)
elif format == "csv":
# Export to CSV
with open(output_file, 'w', newline='') as f:
writer = csv.writer(f)
writer.writerow(["Run ID", "Date", "Severity", "Rule ID", "Message", "File", "Line"])
for finding in findings:
if "runs" in finding.sarif_data:
for run in finding.sarif_data["runs"]:
for result in run.get("results", []):
locations = result.get("locations", [])
location_info = locations[0] if locations else {}
physical = location_info.get("physicalLocation", {})
artifact = physical.get("artifactLocation", {})
region = physical.get("region", {})
writer.writerow([
finding.run_id[:12],
finding.created_at.strftime("%Y-%m-%d %H:%M"),
result.get("level", "note"),
result.get("ruleId", ""),
result.get("message", {}).get("text", ""),
artifact.get("uri", ""),
region.get("startLine", "")
])
elif format == "html":
# Generate HTML report
html_content = f"""<!DOCTYPE html>
<html>
<head>
<title>FuzzForge Security Findings Report</title>
<style>
body {{ font-family: Arial, sans-serif; margin: 20px; }}
h1 {{ color: #333; }}
.stats {{ background: #f5f5f5; padding: 15px; border-radius: 5px; margin: 20px 0; }}
table {{ width: 100%; border-collapse: collapse; }}
th, td {{ padding: 10px; text-align: left; border-bottom: 1px solid #ddd; }}
th {{ background: #4CAF50; color: white; }}
.error {{ color: red; font-weight: bold; }}
.warning {{ color: orange; font-weight: bold; }}
.note {{ color: blue; }}
.info {{ color: gray; }}
</style>
</head>
<body>
<h1>FuzzForge Security Findings Report</h1>
<div class="stats">
<p><strong>Generated:</strong> {datetime.now().strftime("%Y-%m-%d %H:%M:%S")}</p>
<p><strong>Total Findings:</strong> {len(findings)}</p>
</div>
<table>
<tr>
<th>Run ID</th>
<th>Date</th>
<th>Severity</th>
<th>Rule</th>
<th>Message</th>
<th>Location</th>
</tr>"""
import html as _html  # escape untrusted SARIF text before embedding it in HTML
for finding in findings:
if "runs" in finding.sarif_data:
for run in finding.sarif_data["runs"]:
for result in run.get("results", []):
level = result.get("level", "note")
locations = result.get("locations", [])
location_info = locations[0] if locations else {}
physical = location_info.get("physicalLocation", {})
artifact = physical.get("artifactLocation", {})
region = physical.get("region", {})
html_content += f"""
<tr>
<td>{finding.run_id[:12]}</td>
<td>{finding.created_at.strftime("%Y-%m-%d %H:%M")}</td>
<td class="{level}">{level.upper()}</td>
<td>{_html.escape(result.get("ruleId", ""))}</td>
<td>{_html.escape(result.get("message", {}).get("text", ""))}</td>
<td>{artifact.get("uri", "")}:{region.get("startLine", "")}</td>
</tr>"""
html_content += """
</table>
</body>
</html>"""
with open(output_file, 'w') as f:
f.write(html_content)
@app.callback(invoke_without_command=True)
def findings_callback(ctx: typer.Context):
"""
🔍 View and export security findings
"""
# Check if a subcommand is being invoked
if ctx.invoked_subcommand is not None:
# Let the subcommand handle it
return
# Default to history when no subcommand provided
findings_history(limit=20)


@@ -1,251 +0,0 @@
"""Cognee ingestion commands for FuzzForge CLI."""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
from __future__ import annotations
import asyncio
import os
from pathlib import Path
from typing import List, Optional
import typer
from rich.console import Console
from rich.prompt import Confirm
from ..config import ProjectConfigManager
from ..ingest_utils import collect_ingest_files
console = Console()
app = typer.Typer(
name="ingest",
help="Ingest files or directories into the Cognee knowledge graph for the current project",
invoke_without_command=True,
)
@app.callback()
def ingest_callback(
ctx: typer.Context,
path: Optional[Path] = typer.Argument(
None,
exists=True,
file_okay=True,
dir_okay=True,
readable=True,
resolve_path=True,
help="File or directory to ingest (defaults to current directory)",
),
recursive: bool = typer.Option(
False,
"--recursive",
"-r",
help="Recursively ingest directories",
),
file_types: Optional[List[str]] = typer.Option(
None,
"--file-types",
"-t",
help="File extensions to include (e.g. --file-types .py --file-types .js)",
),
exclude: Optional[List[str]] = typer.Option(
None,
"--exclude",
"-e",
help="Glob patterns to exclude",
),
dataset: Optional[str] = typer.Option(
None,
"--dataset",
"-d",
help="Dataset name to ingest into",
),
force: bool = typer.Option(
False,
"--force",
"-f",
help="Force re-ingestion and skip confirmation",
),
):
"""Entry point for `fuzzforge ingest` when no subcommand is provided."""
if ctx.invoked_subcommand:
return
try:
config = ProjectConfigManager()
except FileNotFoundError as exc:
console.print(f"[red]Error:[/red] {exc}")
raise typer.Exit(1) from exc
if not config.is_initialized():
console.print("[red]Error: FuzzForge project not initialized. Run 'ff init' first.[/red]")
raise typer.Exit(1)
config.setup_cognee_environment()
if os.getenv("FUZZFORGE_DEBUG", "0") == "1":
console.print(
"[dim]Cognee directories:\n"
f" DATA: {os.getenv('COGNEE_DATA_ROOT', 'unset')}\n"
f" SYSTEM: {os.getenv('COGNEE_SYSTEM_ROOT', 'unset')}\n"
f" USER: {os.getenv('COGNEE_USER_ID', 'unset')}\n",
)
project_context = config.get_project_context()
target_path = path or Path.cwd()
dataset_name = dataset or f"{project_context['project_name']}_codebase"
try:
import cognee # noqa: F401 # Just to validate installation
except ImportError as exc:
console.print("[red]Cognee is not installed.[/red]")
console.print("Install with: pip install 'cognee[all]' litellm")
raise typer.Exit(1) from exc
console.print(f"[bold]🔍 Ingesting {target_path} into Cognee knowledge graph[/bold]")
console.print(
f"Project: [cyan]{project_context['project_name']}[/cyan] "
f"(ID: [dim]{project_context['project_id']}[/dim])"
)
console.print(f"Dataset: [cyan]{dataset_name}[/cyan]")
console.print(f"Tenant: [dim]{project_context['tenant_id']}[/dim]")
if not force:
confirm_message = f"Ingest {target_path} into knowledge graph for this project?"
if not Confirm.ask(confirm_message, console=console):
console.print("[yellow]Ingestion cancelled[/yellow]")
raise typer.Exit(0)
try:
asyncio.run(
_run_ingestion(
config=config,
path=target_path.resolve(),
recursive=recursive,
file_types=file_types,
exclude=exclude,
dataset=dataset_name,
force=force,
)
)
except KeyboardInterrupt:
console.print("\n[yellow]Ingestion cancelled by user[/yellow]")
raise typer.Exit(1)
except Exception as exc: # pragma: no cover - rich reporting
console.print(f"[red]Failed to ingest:[/red] {exc}")
raise typer.Exit(1) from exc
async def _run_ingestion(
*,
config: ProjectConfigManager,
path: Path,
recursive: bool,
file_types: Optional[List[str]],
exclude: Optional[List[str]],
dataset: str,
force: bool,
) -> None:
"""Perform the actual ingestion work."""
from fuzzforge_ai.cognee_service import CogneeService
cognee_service = CogneeService(config)
await cognee_service.initialize()
# Always skip internal bookkeeping directories
exclude_patterns = list(exclude or [])
default_excludes = {
".fuzzforge/**",
".git/**",
}
added_defaults = []
for pattern in default_excludes:
if pattern not in exclude_patterns:
exclude_patterns.append(pattern)
added_defaults.append(pattern)
if added_defaults and os.getenv("FUZZFORGE_DEBUG", "0") == "1":
console.print(
"[dim]Auto-excluding paths: {patterns}[/dim]".format(
patterns=", ".join(added_defaults)
)
)
try:
files_to_ingest = collect_ingest_files(path, recursive, file_types, exclude_patterns)
except Exception as exc:
console.print(f"[red]Failed to collect files:[/red] {exc}")
return
if not files_to_ingest:
console.print("[yellow]No files found to ingest[/yellow]")
return
console.print(f"Found [green]{len(files_to_ingest)}[/green] files to ingest")
if force:
console.print("Cleaning existing data for this project...")
try:
await cognee_service.clear_data(confirm=True)
except Exception as exc:
console.print(f"[yellow]Warning:[/yellow] Could not clean existing data: {exc}")
console.print("Adding files to Cognee...")
valid_file_paths = []
for file_path in files_to_ingest:
try:
with open(file_path, "r", encoding="utf-8") as fh:
fh.read(1)
valid_file_paths.append(file_path)
console.print(f"{file_path}")
except (UnicodeDecodeError, PermissionError) as exc:
console.print(f"[yellow]Skipping {file_path}: {exc}[/yellow]")
if not valid_file_paths:
console.print("[yellow]No readable files found to ingest[/yellow]")
return
results = await cognee_service.ingest_files(valid_file_paths, dataset)
console.print(
f"[green]✅ Successfully ingested {results['success']} files into knowledge graph[/green]"
)
if results["failed"]:
console.print(
f"[yellow]⚠️ Skipped {results['failed']} files due to errors[/yellow]"
)
try:
insights = await cognee_service.search_insights(
query=f"What insights can you provide about the {dataset} dataset?",
dataset=dataset,
)
if insights:
console.print(f"\n[bold]📊 Generated {len(insights)} insights:[/bold]")
for index, insight in enumerate(insights[:3], 1):
console.print(f" {index}. {insight}")
if len(insights) > 3:
console.print(f" ... and {len(insights) - 3} more")
chunks = await cognee_service.search_chunks(
query=f"functions classes methods in {dataset}",
dataset=dataset,
)
if chunks:
console.print(
f"\n[bold]🔍 Sample searchable content ({len(chunks)} chunks found):[/bold]"
)
for index, chunk in enumerate(chunks[:2], 1):
preview = chunk[:100] + "..." if len(chunk) > 100 else chunk
console.print(f" {index}. {preview}")
except Exception:
# Best-effort stats — ignore failures here
pass
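`collect_ingest_files` is imported from `ingest_utils` and not shown in this diff. One plausible shape for its exclude handling, matching glob patterns such as `.fuzzforge/**` against POSIX-style relative paths, is sketched below; this is an assumption about the implementation, not the real code:

```python
from fnmatch import fnmatch

def is_excluded(rel_path: str, patterns: list[str]) -> bool:
    """Return True if a POSIX-style relative path matches any exclude glob.

    Note: fnmatch's '*' also crosses '/', so a pattern like 'dir/**'
    matches anything nested under dir/.
    """
    return any(fnmatch(rel_path, pattern) for pattern in patterns)

print(is_excluded(".fuzzforge/findings.db", [".fuzzforge/**", ".git/**"]))  # True
print(is_excluded("src/main.py", [".fuzzforge/**", ".git/**"]))  # False
```

This matches the behavior the default excludes above rely on: everything under `.fuzzforge/` and `.git/` is skipped regardless of nesting depth.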


@@ -1,277 +0,0 @@
"""Project initialization commands."""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
from __future__ import annotations
import os
from pathlib import Path
from textwrap import dedent
from typing import Optional
import typer
from rich.console import Console
from rich.prompt import Confirm, Prompt
from ..config import ensure_project_config
from ..database import ensure_project_db
console = Console()
app = typer.Typer()
@app.command()
def project(
name: Optional[str] = typer.Option(
None, "--name", "-n", help="Project name (defaults to current directory name)"
),
api_url: Optional[str] = typer.Option(
None,
"--api-url",
"-u",
help="FuzzForge API URL (defaults to http://localhost:8000)",
),
force: bool = typer.Option(
False,
"--force",
"-f",
help="Force initialization even if project already exists",
),
):
"""
📁 Initialize a new FuzzForge project in the current directory.
This creates a .fuzzforge directory with:
• SQLite database for storing runs, findings, and crashes
• Configuration file with project settings
• Default ignore patterns and preferences
"""
current_dir = Path.cwd()
fuzzforge_dir = current_dir / ".fuzzforge"
# Check if project already exists
if fuzzforge_dir.exists() and not force:
if fuzzforge_dir.is_dir() and any(fuzzforge_dir.iterdir()):
console.print(
"❌ FuzzForge project already exists in this directory", style="red"
)
console.print("Use --force to reinitialize", style="dim")
raise typer.Exit(1)
# Get project name
if not name:
name = Prompt.ask("Project name", default=current_dir.name, console=console)
# Get API URL
if not api_url:
api_url = Prompt.ask(
"FuzzForge API URL", default="http://localhost:8000", console=console
)
# Confirm initialization
console.print(f"\n📁 Initializing FuzzForge project: [bold cyan]{name}[/bold cyan]")
console.print(f"📍 Location: [dim]{current_dir}[/dim]")
console.print(f"🔗 API URL: [dim]{api_url}[/dim]")
if not Confirm.ask("\nProceed with initialization?", default=True, console=console):
console.print("❌ Initialization cancelled", style="yellow")
raise typer.Exit(0)
try:
# Create .fuzzforge directory
console.print("\n🔨 Creating project structure...")
fuzzforge_dir.mkdir(exist_ok=True)
# Initialize configuration
console.print("⚙️ Setting up configuration...")
ensure_project_config(
project_dir=current_dir,
project_name=name,
api_url=api_url,
)
# Initialize database
console.print("🗄️ Initializing database...")
ensure_project_db(current_dir)
_ensure_env_file(fuzzforge_dir, force)
_ensure_agents_registry(fuzzforge_dir, force)
# Create .gitignore if needed
gitignore_path = current_dir / ".gitignore"
gitignore_entries = [
"# FuzzForge CLI",
".fuzzforge/findings.db-*", # SQLite temp files
".fuzzforge/cache/",
".fuzzforge/temp/",
]
if gitignore_path.exists():
with open(gitignore_path, "r") as f:
existing_content = f.read()
if "# FuzzForge CLI" not in existing_content:
with open(gitignore_path, "a") as f:
f.write(f"\n{chr(10).join(gitignore_entries)}\n")
console.print("📝 Updated .gitignore with FuzzForge entries")
else:
with open(gitignore_path, "w") as f:
f.write(f"{chr(10).join(gitignore_entries)}\n")
console.print("📝 Created .gitignore")
# Create README if it doesn't exist
readme_path = current_dir / "README.md"
if not readme_path.exists():
readme_content = f"""# {name}
FuzzForge security testing project.
## Quick Start
```bash
# List available workflows
fuzzforge workflows
# Submit a workflow for analysis
fuzzforge workflow <workflow-name> /path/to/target
# View findings
fuzzforge finding <run-id>
```
## Project Structure
- `.fuzzforge/` - Project data and configuration
- `.fuzzforge/config.yaml` - Project configuration
- `.fuzzforge/findings.db` - Local database for runs and findings
"""
with open(readme_path, "w") as f:
f.write(readme_content)
console.print("📚 Created README.md")
console.print("\n✅ FuzzForge project initialized successfully!", style="green")
console.print("\n🎯 Next steps:")
console.print(" • ff workflows - See available workflows")
console.print(" • ff status - Check API connectivity")
console.print(" • ff workflow <workflow> <path> - Start your first analysis")
console.print(" • edit .fuzzforge/.env with API keys & provider settings")
except Exception as e:
console.print(f"\n❌ Initialization failed: {e}", style="red")
raise typer.Exit(1)
@app.callback()
def init_callback():
"""
📁 Initialize FuzzForge projects and components
"""
def _ensure_env_file(fuzzforge_dir: Path, force: bool) -> None:
"""Create or update the .fuzzforge/.env file with AI defaults."""
env_path = fuzzforge_dir / ".env"
if env_path.exists() and not force:
console.print("🧪 Using existing .fuzzforge/.env (use --force to regenerate)")
return
console.print("🧠 Configuring AI environment...")
console.print(" • Default LLM provider: openai")
console.print(" • Default LLM model: gpt-5-mini")
console.print(" • To customise provider/model later, edit .fuzzforge/.env")
llm_provider = "openai"
llm_model = "gpt-5-mini"
api_key = Prompt.ask(
"OpenAI API key (leave blank to fill manually)",
default="",
show_default=False,
console=console,
)
enable_cognee = False
cognee_url = ""
session_db_path = fuzzforge_dir / "fuzzforge_sessions.db"
session_db_rel = session_db_path.relative_to(fuzzforge_dir.parent)
env_lines = [
"# FuzzForge AI configuration",
"# Populate the API key(s) that match your LLM provider",
"",
f"LLM_PROVIDER={llm_provider}",
f"LLM_MODEL={llm_model}",
f"LITELLM_MODEL={llm_model}",
f"OPENAI_API_KEY={api_key}",
f"FUZZFORGE_MCP_URL={os.getenv('FUZZFORGE_MCP_URL', 'http://localhost:8010/mcp')}",
"",
"# Cognee configuration mirrors the primary LLM by default",
f"LLM_COGNEE_PROVIDER={llm_provider}",
f"LLM_COGNEE_MODEL={llm_model}",
f"LLM_COGNEE_API_KEY={api_key}",
"LLM_COGNEE_ENDPOINT=",
"COGNEE_MCP_URL=",
"",
"# Session persistence options: inmemory | sqlite",
"SESSION_PERSISTENCE=sqlite",
f"SESSION_DB_PATH={session_db_rel}",
"",
"# Optional integrations",
"AGENTOPS_API_KEY=",
"FUZZFORGE_DEBUG=0",
"",
]
env_path.write_text("\n".join(env_lines), encoding="utf-8")
console.print(f"📝 Created {env_path.relative_to(fuzzforge_dir.parent)}")
template_path = fuzzforge_dir / ".env.template"
if not template_path.exists() or force:
template_lines = []
for line in env_lines:
if line.startswith("OPENAI_API_KEY="):
template_lines.append("OPENAI_API_KEY=")
elif line.startswith("LLM_COGNEE_API_KEY="):
template_lines.append("LLM_COGNEE_API_KEY=")
else:
template_lines.append(line)
template_path.write_text("\n".join(template_lines), encoding="utf-8")
console.print(f"📝 Created {template_path.relative_to(fuzzforge_dir.parent)}")
# SQLite session DB will be created automatically when first used by the AI agent
def _ensure_agents_registry(fuzzforge_dir: Path, force: bool) -> None:
"""Create a starter agents.yaml registry if needed."""
agents_path = fuzzforge_dir / "agents.yaml"
if agents_path.exists() and not force:
return
template = dedent(
"""\
# FuzzForge Registered Agents
# Populate this list to auto-register remote agents when the AI CLI starts
registered_agents: []
# Example:
# registered_agents:
# - name: Calculator
# url: http://localhost:10201
# description: Sample math agent
""".strip()
)
agents_path.write_text(template + "\n", encoding="utf-8")
console.print(f"📝 Created {agents_path.relative_to(fuzzforge_dir.parent)}")
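The `.gitignore` handling in `project()` above (create if missing, append once, skip if the marker is already present) can be factored into a reusable helper. A minimal sketch of the same append-once logic; the `ensure_gitignore` name is illustrative:

```python
from pathlib import Path

MARKER = "# FuzzForge CLI"
ENTRIES = [MARKER, ".fuzzforge/findings.db-*", ".fuzzforge/cache/", ".fuzzforge/temp/"]

def ensure_gitignore(project_dir: Path) -> str:
    """Create or append FuzzForge entries to .gitignore exactly once."""
    gitignore = project_dir / ".gitignore"
    block = "\n".join(ENTRIES) + "\n"
    if not gitignore.exists():
        gitignore.write_text(block)
        return "created"
    content = gitignore.read_text()
    if MARKER in content:
        # Marker found: entries were added on a previous run, do nothing
        return "unchanged"
    gitignore.write_text(content.rstrip("\n") + "\n\n" + block)
    return "updated"
```

Keying idempotence on the marker comment, as the original does, means re-running `ff init` never duplicates the block even if the user reorders the file.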


@@ -1,165 +0,0 @@
"""
Status command for showing project and API information.
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
from pathlib import Path
from rich.console import Console
from rich.table import Table
from rich.panel import Panel
from rich import box
from ..config import get_project_config, FuzzForgeConfig
from ..database import get_project_db
from fuzzforge_sdk import FuzzForgeClient
console = Console()
def show_status():
"""Show comprehensive project and API status"""
current_dir = Path.cwd()
fuzzforge_dir = current_dir / ".fuzzforge"
# Project status
console.print("\n📊 [bold]FuzzForge Project Status[/bold]\n")
if not fuzzforge_dir.exists():
console.print(
Panel.fit(
"❌ No FuzzForge project found in current directory\n\n"
"Run [bold cyan]ff init[/bold cyan] to initialize a project",
title="Project Status",
box=box.ROUNDED
)
)
return
# Load project configuration
config = get_project_config()
if not config:
config = FuzzForgeConfig()
# Project info table
project_table = Table(show_header=False, box=box.SIMPLE)
project_table.add_column("Property", style="bold cyan")
project_table.add_column("Value")
project_table.add_row("Project Name", config.project.name)
project_table.add_row("Location", str(current_dir))
project_table.add_row("API URL", config.project.api_url)
project_table.add_row("Default Timeout", f"{config.project.default_timeout}s")
console.print(
Panel.fit(
project_table,
title="✅ Project Information",
box=box.ROUNDED
)
)
# Database status
db = get_project_db()
if db:
try:
stats = db.get_stats()
db_table = Table(show_header=False, box=box.SIMPLE)
db_table.add_column("Metric", style="bold cyan")
db_table.add_column("Count", justify="right")
db_table.add_row("Total Runs", str(stats["total_runs"]))
db_table.add_row("Total Findings", str(stats["total_findings"]))
db_table.add_row("Total Crashes", str(stats["total_crashes"]))
db_table.add_row("Runs (Last 7 days)", str(stats["runs_last_7_days"]))
if stats["runs_by_status"]:
db_table.add_row("", "") # Spacer
for status, count in stats["runs_by_status"].items():
status_emoji = {
"completed": "✅",
"running": "🔄",
"failed": "❌",
"queued": "⏳",
"cancelled": "⏹️"
}.get(status, "📋")
db_table.add_row(f"{status_emoji} {status.title()}", str(count))
console.print(
Panel.fit(
db_table,
title="🗄️ Database Statistics",
box=box.ROUNDED
)
)
except Exception as e:
console.print(f"⚠️ Database error: {e}", style="yellow")
# API status
console.print("\n🔗 [bold]API Connectivity[/bold]")
try:
with FuzzForgeClient(base_url=config.get_api_url(), timeout=10.0) as client:
api_status = client.get_api_status()
workflows = client.list_workflows()
api_table = Table(show_header=False, box=box.SIMPLE)
api_table.add_column("Property", style="bold cyan")
api_table.add_column("Value")
api_table.add_row("Status", "✅ Connected")
api_table.add_row("Service", f"{api_status.name} v{api_status.version}")
api_table.add_row("Workflows", str(len(workflows)))
console.print(
Panel.fit(
api_table,
title="✅ API Status",
box=box.ROUNDED
)
)
# Show available workflows
if workflows:
workflow_table = Table(box=box.SIMPLE_HEAD)
workflow_table.add_column("Name", style="bold")
workflow_table.add_column("Version", justify="center")
workflow_table.add_column("Description")
for workflow in workflows[:10]: # Limit to first 10
workflow_table.add_row(
workflow.name,
workflow.version,
workflow.description[:60] + "..." if len(workflow.description) > 60 else workflow.description
)
if len(workflows) > 10:
workflow_table.add_row("...", "...", f"and {len(workflows) - 10} more workflows")
console.print(
Panel.fit(
workflow_table,
title=f"🔧 Available Workflows ({len(workflows)})",
box=box.ROUNDED
)
)
except Exception as e:
console.print(
Panel.fit(
f"❌ Failed to connect to API\n\n"
f"Error: {str(e)}\n\n"
f"API URL: {config.get_api_url()}\n\n"
"Check that the FuzzForge API is running and accessible.",
title="❌ API Connection Failed",
box=box.ROUNDED
)
)


@@ -1,641 +0,0 @@
"""
Workflow execution and management commands.
Replaces the old 'runs' terminology with cleaner workflow-centric commands.
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import time
from datetime import datetime
from pathlib import Path
from typing import Any, Dict, List, Optional
import typer
from fuzzforge_sdk import FuzzForgeClient, WorkflowSubmission
from rich import box
from rich.console import Console
from rich.panel import Panel
from rich.prompt import Confirm, Prompt
from rich.table import Table
from ..config import FuzzForgeConfig, get_project_config
from ..constants import (
DEFAULT_VOLUME_MODE,
MAX_RETRIES,
MAX_RUN_ID_DISPLAY_LENGTH,
POLL_INTERVAL,
PROGRESS_STEP_DELAYS,
RETRY_DELAY,
STATUS_EMOJIS,
)
from ..database import RunRecord, ensure_project_db, get_project_db
from ..exceptions import (
DatabaseError,
ValidationError,
handle_error,
require_project,
retry_on_network_error,
safe_json_load,
)
from ..progress import step_progress
from ..validation import (
validate_parameters,
validate_run_id,
validate_target_path,
validate_timeout,
validate_volume_mode,
validate_workflow_name,
)
console = Console()
app = typer.Typer()
@retry_on_network_error(max_retries=MAX_RETRIES, delay=RETRY_DELAY)
def get_client() -> FuzzForgeClient:
"""Get configured FuzzForge client with retry on network errors"""
config = get_project_config() or FuzzForgeConfig()
return FuzzForgeClient(base_url=config.get_api_url(), timeout=config.get_timeout())
def status_emoji(status: str) -> str:
"""Get emoji for execution status"""
return STATUS_EMOJIS.get(status.lower(), STATUS_EMOJIS["unknown"])
def parse_inline_parameters(params: List[str]) -> Dict[str, Any]:
"""Parse inline key=value parameters using improved validation"""
return validate_parameters(params)
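`validate_parameters` lives in the validation module and is not part of this diff. A minimal sketch of key=value parsing with naive type inference via JSON decoding; an illustrative stand-in, not the real validator:

```python
import json

def parse_kv(params: list[str]) -> dict:
    """Parse ["key=value", ...] pairs, JSON-decoding values when possible."""
    parsed = {}
    for item in params:
        if "=" not in item:
            raise ValueError(f"Expected key=value, got: {item!r}")
        key, _, raw = item.partition("=")
        try:
            # "100" -> 100, "true" -> True, '["a"]' -> list; plain text stays str
            parsed[key] = json.loads(raw)
        except json.JSONDecodeError:
            parsed[key] = raw
    return parsed

print(parse_kv(["iterations=100", "fast=true", "target=fuzz_me"]))
# {'iterations': 100, 'fast': True, 'target': 'fuzz_me'}
```

Using `str.partition` rather than `split("=")` keeps values containing `=` (such as base64 blobs) intact after the first separator.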
def execute_workflow_submission(
client: FuzzForgeClient,
workflow: str,
target_path: str,
parameters: Dict[str, Any],
volume_mode: str,
timeout: Optional[int],
interactive: bool,
) -> Any:
"""Handle the workflow submission process"""
# Get workflow metadata for parameter validation
console.print(f"🔧 Getting workflow information for: {workflow}")
workflow_meta = client.get_workflow_metadata(workflow)
param_response = client.get_workflow_parameters(workflow)
# Interactive parameter input
if interactive and workflow_meta.parameters.get("properties"):
properties = workflow_meta.parameters.get("properties", {})
required_params = set(workflow_meta.parameters.get("required", []))
defaults = param_response.defaults
missing_required = required_params - set(parameters.keys())
if missing_required:
console.print(
f"\n📝 [bold]Missing required parameters:[/bold] {', '.join(missing_required)}"
)
console.print("Please provide values:\n")
for param_name in missing_required:
param_schema = properties.get(param_name, {})
description = param_schema.get("description", "")
param_type = param_schema.get("type", "string")
prompt_text = f"{param_name}"
if description:
prompt_text += f" ({description})"
prompt_text += f" [{param_type}]"
while True:
user_input = Prompt.ask(prompt_text, console=console)
try:
if param_type == "integer":
parameters[param_name] = int(user_input)
elif param_type == "number":
parameters[param_name] = float(user_input)
elif param_type == "boolean":
parameters[param_name] = user_input.lower() in (
"true",
"yes",
"1",
"on",
)
elif param_type == "array":
parameters[param_name] = [
item.strip()
for item in user_input.split(",")
if item.strip()
]
else:
parameters[param_name] = user_input
break
except ValueError as e:
console.print(f"❌ Invalid {param_type}: {e}", style="red")
# Validate volume mode
validate_volume_mode(volume_mode)
if volume_mode not in workflow_meta.supported_volume_modes:
raise ValidationError(
"volume mode",
volume_mode,
f"one of: {', '.join(workflow_meta.supported_volume_modes)}",
)
# Create submission
submission = WorkflowSubmission(
target_path=target_path,
volume_mode=volume_mode,
parameters=parameters,
timeout=timeout,
)
# Show submission summary
console.print("\n🎯 [bold]Executing workflow:[/bold]")
console.print(f" Workflow: {workflow}")
console.print(f" Target: {target_path}")
console.print(f" Volume Mode: {volume_mode}")
if parameters:
console.print(f" Parameters: {len(parameters)} provided")
if timeout:
console.print(f" Timeout: {timeout}s")
# Only ask for confirmation in interactive mode
if interactive:
if not Confirm.ask("\nExecute workflow?", default=True, console=console):
console.print("❌ Execution cancelled", style="yellow")
raise typer.Exit(0)
else:
console.print("\n🚀 Executing workflow...")
# Submit the workflow with enhanced progress
console.print(f"\n🚀 Executing workflow: [bold yellow]{workflow}[/bold yellow]")
steps = [
"Validating workflow configuration",
"Connecting to FuzzForge API",
"Uploading parameters and settings",
"Creating workflow deployment",
"Initializing execution environment",
]
with step_progress(steps, f"Executing {workflow}") as progress:
progress.next_step() # Validating
time.sleep(PROGRESS_STEP_DELAYS["validating"])
progress.next_step() # Connecting
time.sleep(PROGRESS_STEP_DELAYS["connecting"])
progress.next_step() # Uploading
response = client.submit_workflow(workflow, submission)
time.sleep(PROGRESS_STEP_DELAYS["uploading"])
progress.next_step() # Creating deployment
time.sleep(PROGRESS_STEP_DELAYS["creating"])
progress.next_step() # Initializing
time.sleep(PROGRESS_STEP_DELAYS["initializing"])
progress.complete("Workflow started successfully!")
return response
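The per-type coercion inside the interactive prompt loop above can be expressed as a single helper covering the same rules (integer, number, boolean, array, default string). A sketch; the `coerce` name is illustrative:

```python
def coerce(value: str, param_type: str):
    """Coerce a raw prompt string to the JSON-schema type declared by the workflow."""
    if param_type == "integer":
        return int(value)
    if param_type == "number":
        return float(value)
    if param_type == "boolean":
        # Same truthy set as the prompt loop: anything else is False
        return value.lower() in ("true", "yes", "1", "on")
    if param_type == "array":
        return [item.strip() for item in value.split(",") if item.strip()]
    return value

print(coerce("8", "integer"), coerce("yes", "boolean"), coerce("a, b", "array"))
# 8 True ['a', 'b']
```

Centralizing this would also let the retry command's type-preserving re-prompt (further below in this file) share one code path instead of duplicating the conversion rules.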
# Main workflow execution command (replaces 'runs submit')
@app.command(
name="exec", hidden=True
) # Hidden because it will be called from main workflow command
def execute_workflow(
workflow: str = typer.Argument(..., help="Workflow name to execute"),
target_path: str = typer.Argument(..., help="Path to analyze"),
params: List[str] = typer.Argument(
default=None, help="Parameters as key=value pairs"
),
param_file: Optional[str] = typer.Option(
None, "--param-file", "-f", help="JSON file containing workflow parameters"
),
volume_mode: str = typer.Option(
DEFAULT_VOLUME_MODE,
"--volume-mode",
"-v",
help="Volume mount mode: ro (read-only) or rw (read-write)",
),
timeout: Optional[int] = typer.Option(
None, "--timeout", "-t", help="Execution timeout in seconds"
),
interactive: bool = typer.Option(
True,
"--interactive/--no-interactive",
"-i/-n",
help="Interactive parameter input for missing required parameters",
),
wait: bool = typer.Option(
False, "--wait", "-w", help="Wait for execution to complete"
),
):
"""
🚀 Execute a workflow on a target
Pass --wait to block until the execution completes (no live dashboard).
"""
try:
# Validate inputs
validate_workflow_name(workflow)
target_path_obj = validate_target_path(target_path, must_exist=True)
target_path = str(target_path_obj.absolute())
validate_timeout(timeout)
# Ensure we're in a project directory
require_project()
except Exception as e:
handle_error(e, "validating inputs")
# Parse parameters
parameters = {}
# Load from param file
if param_file:
try:
file_params = safe_json_load(param_file)
if isinstance(file_params, dict):
parameters.update(file_params)
else:
raise ValidationError("parameter file", param_file, "a JSON object")
except Exception as e:
handle_error(e, "loading parameter file")
# Parse inline parameters
if params:
try:
inline_params = parse_inline_parameters(params)
parameters.update(inline_params)
except Exception as e:
handle_error(e, "parsing parameters")
try:
with get_client() as client:
response = execute_workflow_submission(
client,
workflow,
target_path,
parameters,
volume_mode,
timeout,
interactive,
)
console.print("✅ Workflow execution started!", style="green")
console.print(f" Execution ID: [bold cyan]{response.run_id}[/bold cyan]")
console.print(
f" Status: {status_emoji(response.status)} {response.status}"
)
# Save to database
try:
db = ensure_project_db()
run_record = RunRecord(
run_id=response.run_id,
workflow=workflow,
status=response.status,
target_path=target_path,
parameters=parameters,
created_at=datetime.now(),
)
db.save_run(run_record)
except Exception as e:
# Don't fail the whole operation if database save fails
console.print(
f"⚠️ Failed to save execution to database: {e}", style="yellow"
)
console.print(
f"💡 Check status: [bold cyan]fuzzforge workflow status {response.run_id}[/bold cyan]"
)
# Wait for completion if requested
if wait:
console.print("\n⏳ Waiting for execution to complete...")
try:
final_status = client.wait_for_completion(
response.run_id, poll_interval=POLL_INTERVAL
)
# Update database
try:
db.update_run_status(
response.run_id,
final_status.status,
completed_at=datetime.now()
if final_status.is_completed
else None,
)
except Exception as e:
console.print(
f"⚠️ Failed to update database: {e}", style="yellow"
)
console.print(
f"🏁 Execution completed with status: {status_emoji(final_status.status)} {final_status.status}"
)
if final_status.is_completed:
console.print(
f"💡 View findings: [bold cyan]fuzzforge findings {response.run_id}[/bold cyan]"
)
except KeyboardInterrupt:
console.print(
"\n⏹️ Monitoring cancelled (execution continues in background)",
style="yellow",
)
except Exception as e:
handle_error(e, "waiting for completion")
except Exception as e:
handle_error(e, "executing workflow")
@app.command("status")
def workflow_status(
execution_id: Optional[str] = typer.Argument(
None, help="Execution ID to check (defaults to most recent)"
),
):
"""
📊 Check the status of a workflow execution
"""
try:
require_project()
if execution_id:
validate_run_id(execution_id)
db = get_project_db()
if not db:
raise DatabaseError("get project database", Exception("No database found"))
# Get execution ID
if not execution_id:
recent_runs = db.list_runs(limit=1)
if not recent_runs:
console.print(
"⚠️ No executions found in project database", style="yellow"
)
raise typer.Exit(0)
execution_id = recent_runs[0].run_id
console.print(f"🔍 Using most recent execution: {execution_id}")
else:
validate_run_id(execution_id)
# Get status from API
with get_client() as client:
status = client.get_run_status(execution_id)
# Update local database
try:
db.update_run_status(
execution_id,
status.status,
completed_at=status.updated_at if status.is_completed else None,
)
except Exception as e:
console.print(f"⚠️ Failed to update database: {e}", style="yellow")
# Display status
console.print(f"\n📊 [bold]Execution Status: {execution_id}[/bold]\n")
status_table = Table(show_header=False, box=box.SIMPLE)
status_table.add_column("Property", style="bold cyan")
status_table.add_column("Value")
status_table.add_row("Execution ID", execution_id)
status_table.add_row("Workflow", status.workflow)
status_table.add_row("Status", f"{status_emoji(status.status)} {status.status}")
status_table.add_row("Created", status.created_at.strftime("%Y-%m-%d %H:%M:%S"))
status_table.add_row("Updated", status.updated_at.strftime("%Y-%m-%d %H:%M:%S"))
if status.is_completed:
duration = status.updated_at - status.created_at
status_table.add_row(
"Duration", str(duration).split(".")[0]
) # Remove microseconds
console.print(
Panel.fit(status_table, title="📊 Status Information", box=box.ROUNDED)
)
# Show next steps
if status.is_completed:
console.print(
f"💡 View findings: [bold cyan]fuzzforge finding {execution_id}[/bold cyan]"
)
elif status.is_failed:
console.print(
f"💡 Check logs: [bold cyan]fuzzforge workflow logs {execution_id}[/bold cyan]"
)
except Exception as e:
handle_error(e, "getting execution status")
@app.command("history")
def workflow_history(
workflow: Optional[str] = typer.Option(
None, "--workflow", "-w", help="Filter by workflow name"
),
status: Optional[str] = typer.Option(
None, "--status", "-s", help="Filter by status"
),
limit: int = typer.Option(
20, "--limit", "-l", help="Maximum number of executions to show"
),
):
"""
📋 Show workflow execution history
"""
try:
require_project()
if limit <= 0:
raise ValidationError("limit", limit, "a positive integer")
db = get_project_db()
if not db:
raise DatabaseError("get project database", Exception("No database found"))
runs = db.list_runs(workflow=workflow, status=status, limit=limit)
if not runs:
console.print("⚠️ No executions found matching criteria", style="yellow")
return
table = Table(box=box.ROUNDED)
table.add_column("Execution ID", style="bold cyan")
table.add_column("Workflow", style="bold")
table.add_column("Status", justify="center")
table.add_column("Target", style="dim")
table.add_column("Created", justify="center")
table.add_column("Parameters", justify="center", style="dim")
for run in runs:
param_count = len(run.parameters) if run.parameters else 0
param_str = f"{param_count} params" if param_count > 0 else "-"
table.add_row(
run.run_id[:12] + "..."
if len(run.run_id) > MAX_RUN_ID_DISPLAY_LENGTH
else run.run_id,
run.workflow,
f"{status_emoji(run.status)} {run.status}",
Path(run.target_path).name,
run.created_at.strftime("%m-%d %H:%M"),
param_str,
)
console.print(f"\n📋 [bold]Workflow Execution History ({len(runs)})[/bold]")
if workflow:
console.print(f" Filtered by workflow: {workflow}")
if status:
console.print(f" Filtered by status: {status}")
console.print()
console.print(table)
console.print(
"\n💡 Use [bold cyan]fuzzforge workflow status <execution-id>[/bold cyan] for detailed status"
)
except Exception as e:
handle_error(e, "listing execution history")
@app.command("retry")
def retry_workflow(
execution_id: Optional[str] = typer.Argument(
None, help="Execution ID to retry (defaults to most recent)"
),
modify_params: bool = typer.Option(
False,
"--modify-params",
"-m",
help="Interactively modify parameters before retrying",
),
):
"""
🔄 Retry a workflow execution with the same or modified parameters
"""
try:
require_project()
db = get_project_db()
if not db:
raise DatabaseError("get project database", Exception("No database found"))
# Get execution ID if not provided
if not execution_id:
recent_runs = db.list_runs(limit=1)
if not recent_runs:
console.print("⚠️ No executions found to retry", style="yellow")
raise typer.Exit(0)
execution_id = recent_runs[0].run_id
console.print(f"🔄 Retrying most recent execution: {execution_id}")
else:
validate_run_id(execution_id)
# Get original execution
original_run = db.get_run(execution_id)
if not original_run:
raise ValidationError(
"execution_id", execution_id, "an existing execution ID in the database"
)
console.print(f"🔄 [bold]Retrying workflow:[/bold] {original_run.workflow}")
console.print(f" Original Execution ID: {execution_id}")
console.print(f" Target: {original_run.target_path}")
parameters = original_run.parameters.copy()
# Modify parameters if requested
if modify_params and parameters:
console.print("\n📝 [bold]Current parameters:[/bold]")
for key, value in parameters.items():
new_value = Prompt.ask(f"{key}", default=str(value), console=console)
if new_value != str(value):
# Try to maintain type
try:
if isinstance(value, bool):
parameters[key] = new_value.lower() in (
"true",
"yes",
"1",
"on",
)
elif isinstance(value, int):
parameters[key] = int(new_value)
elif isinstance(value, float):
parameters[key] = float(new_value)
elif isinstance(value, list):
parameters[key] = [
item.strip()
for item in new_value.split(",")
if item.strip()
]
else:
parameters[key] = new_value
except ValueError:
parameters[key] = new_value
# Submit new execution
with get_client() as client:
submission = WorkflowSubmission(
target_path=original_run.target_path, parameters=parameters
)
response = client.submit_workflow(original_run.workflow, submission)
console.print("\n✅ Retry submitted successfully!", style="green")
console.print(
f" New Execution ID: [bold cyan]{response.run_id}[/bold cyan]"
)
console.print(
f" Status: {status_emoji(response.status)} {response.status}"
)
# Save to database
try:
run_record = RunRecord(
run_id=response.run_id,
workflow=original_run.workflow,
status=response.status,
target_path=original_run.target_path,
parameters=parameters,
created_at=datetime.now(),
metadata={"retry_of": execution_id},
)
db.save_run(run_record)
except Exception as e:
console.print(
f"⚠️ Failed to save execution to database: {e}", style="yellow"
)
console.print(
f"\n💡 Monitor progress: [bold cyan]fuzzforge monitor {response.run_id}[/bold cyan]"
)
except Exception as e:
handle_error(e, "retrying workflow")
@app.callback()
def workflow_exec_callback():
"""
🚀 Workflow execution management
"""


@@ -1,305 +0,0 @@
"""
Workflow management commands.
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import json
import typer
from rich.console import Console
from rich.table import Table
from rich.panel import Panel
from rich.prompt import Prompt, Confirm
from rich.syntax import Syntax
from rich import box
from typing import Optional, Dict, Any
from ..config import get_project_config, FuzzForgeConfig
from ..fuzzy import enhanced_workflow_not_found_handler
from fuzzforge_sdk import FuzzForgeClient
console = Console()
app = typer.Typer()
def get_client() -> FuzzForgeClient:
"""Get configured FuzzForge client"""
config = get_project_config() or FuzzForgeConfig()
return FuzzForgeClient(base_url=config.get_api_url(), timeout=config.get_timeout())
@app.command("list")
def list_workflows():
"""
📋 List all available security testing workflows
"""
try:
with get_client() as client:
workflows = client.list_workflows()
if not workflows:
console.print("❌ No workflows available", style="red")
return
table = Table(box=box.ROUNDED)
table.add_column("Name", style="bold cyan")
table.add_column("Version", justify="center")
table.add_column("Description")
table.add_column("Tags", style="dim")
for workflow in workflows:
tags_str = ", ".join(workflow.tags) if workflow.tags else ""
table.add_row(
workflow.name,
workflow.version,
workflow.description,
tags_str
)
console.print(f"\n🔧 [bold]Available Workflows ({len(workflows)})[/bold]\n")
console.print(table)
console.print(f"\n💡 Use [bold cyan]fuzzforge workflows info <name>[/bold cyan] for detailed information")
except Exception as e:
console.print(f"❌ Failed to fetch workflows: {e}", style="red")
raise typer.Exit(1)
@app.command("info")
def workflow_info(
name: str = typer.Argument(..., help="Workflow name to get information about")
):
"""
📋 Show detailed information about a specific workflow
"""
try:
with get_client() as client:
workflow = client.get_workflow_metadata(name)
console.print(f"\n🔧 [bold]Workflow: {workflow.name}[/bold]\n")
# Basic information
info_table = Table(show_header=False, box=box.SIMPLE)
info_table.add_column("Property", style="bold cyan")
info_table.add_column("Value")
info_table.add_row("Name", workflow.name)
info_table.add_row("Version", workflow.version)
info_table.add_row("Description", workflow.description)
if workflow.author:
info_table.add_row("Author", workflow.author)
if workflow.tags:
info_table.add_row("Tags", ", ".join(workflow.tags))
info_table.add_row("Volume Modes", ", ".join(workflow.supported_volume_modes))
info_table.add_row("Custom Docker", "✅ Yes" if workflow.has_custom_docker else "❌ No")
console.print(
Panel.fit(
info_table,
title=" Basic Information",
box=box.ROUNDED
)
)
# Parameters
if workflow.parameters:
console.print("\n📝 [bold]Parameters Schema[/bold]")
param_table = Table(box=box.ROUNDED)
param_table.add_column("Parameter", style="bold")
param_table.add_column("Type", style="cyan")
param_table.add_column("Required", justify="center")
param_table.add_column("Default")
param_table.add_column("Description", style="dim")
# Extract parameter information from JSON schema
properties = workflow.parameters.get("properties", {})
required_params = set(workflow.parameters.get("required", []))
defaults = workflow.default_parameters
for param_name, param_schema in properties.items():
param_type = param_schema.get("type", "unknown")
is_required = "✅" if param_name in required_params else "❌"
default_val = str(defaults.get(param_name, "")) if param_name in defaults else ""
description = param_schema.get("description", "")
# Handle array types
if param_type == "array":
items_type = param_schema.get("items", {}).get("type", "unknown")
param_type = f"array[{items_type}]"
param_table.add_row(
param_name,
param_type,
is_required,
default_val[:30] + "..." if len(default_val) > 30 else default_val,
description[:50] + "..." if len(description) > 50 else description
)
console.print(param_table)
# Required modules
if workflow.required_modules:
console.print(f"\n🔧 [bold]Required Modules:[/bold] {', '.join(workflow.required_modules)}")
console.print(f"\n💡 Use [bold cyan]fuzzforge workflows parameters {name}[/bold cyan] for interactive parameter builder")
except Exception as e:
error_message = str(e)
if "not found" in error_message.lower() or "404" in error_message:
# Try fuzzy matching for workflow name
enhanced_workflow_not_found_handler(name)
else:
console.print(f"❌ Failed to get workflow info: {e}", style="red")
raise typer.Exit(1)
@app.command("parameters")
def workflow_parameters(
name: str = typer.Argument(..., help="Workflow name"),
output_file: Optional[str] = typer.Option(
None, "--output", "-o",
help="Save parameters to JSON file"
),
interactive: bool = typer.Option(
True, "--interactive/--no-interactive", "-i/-n",
help="Interactive parameter builder"
)
):
"""
📝 Interactive parameter builder for workflows
"""
try:
with get_client() as client:
workflow = client.get_workflow_metadata(name)
param_response = client.get_workflow_parameters(name)
console.print(f"\n📝 [bold]Parameter Builder: {name}[/bold]\n")
if not workflow.parameters.get("properties"):
console.print(" This workflow has no configurable parameters")
return
parameters = {}
properties = workflow.parameters.get("properties", {})
required_params = set(workflow.parameters.get("required", []))
defaults = param_response.defaults
if interactive:
console.print("🔧 Enter parameter values (press Enter for default):\n")
for param_name, param_schema in properties.items():
param_type = param_schema.get("type", "string")
description = param_schema.get("description", "")
is_required = param_name in required_params
default_value = defaults.get(param_name)
# Build prompt
prompt_text = f"{param_name}"
if description:
prompt_text += f" ({description})"
if param_type:
prompt_text += f" [{param_type}]"
if is_required:
prompt_text += " [bold red]*required*[/bold red]"
# Get user input
while True:
if default_value is not None:
user_input = Prompt.ask(
prompt_text,
default=str(default_value),
console=console
)
else:
user_input = Prompt.ask(
prompt_text,
console=console
)
# Validate and convert input
if user_input.strip() == "" and not is_required:
break
if user_input.strip() == "" and is_required:
console.print("❌ This parameter is required", style="red")
continue
try:
# Type conversion
if param_type == "integer":
parameters[param_name] = int(user_input)
elif param_type == "number":
parameters[param_name] = float(user_input)
elif param_type == "boolean":
parameters[param_name] = user_input.lower() in ("true", "yes", "1", "on")
elif param_type == "array":
# Simple comma-separated array
parameters[param_name] = [item.strip() for item in user_input.split(",") if item.strip()]
else:
parameters[param_name] = user_input
break
except ValueError as e:
console.print(f"❌ Invalid {param_type}: {e}", style="red")
# Show summary
console.print("\n📋 [bold]Parameter Summary:[/bold]")
summary_table = Table(show_header=False, box=box.SIMPLE)
summary_table.add_column("Parameter", style="cyan")
summary_table.add_column("Value", style="white")
for key, value in parameters.items():
summary_table.add_row(key, str(value))
console.print(summary_table)
else:
# Non-interactive mode - show schema
console.print("📋 Parameter Schema:")
schema_json = json.dumps(workflow.parameters, indent=2)
console.print(Syntax(schema_json, "json", theme="monokai"))
if defaults:
console.print("\n📋 Default Values:")
defaults_json = json.dumps(defaults, indent=2)
console.print(Syntax(defaults_json, "json", theme="monokai"))
# Save to file if requested
if output_file:
if parameters or not interactive:
data_to_save = parameters if interactive else {"schema": workflow.parameters, "defaults": defaults}
with open(output_file, 'w') as f:
json.dump(data_to_save, f, indent=2)
console.print(f"\n💾 Parameters saved to: {output_file}")
else:
console.print("\n❌ No parameters to save", style="red")
except Exception as e:
console.print(f"❌ Failed to build parameters: {e}", style="red")
raise typer.Exit(1)
@app.callback(invoke_without_command=True)
def workflows_callback(ctx: typer.Context):
"""
🔧 Manage security testing workflows
"""
# Check if a subcommand is being invoked
if ctx.invoked_subcommand is not None:
# Let the subcommand handle it
return
# Default to list when no subcommand provided
list_workflows()
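Both `info` and `parameters` above drive their tables from the same JSON-schema introspection: walk `properties`, cross-check the `required` list, and expand `array` item types. A minimal sketch of that extraction (`schema_rows` is an illustrative name, not a function in the CLI):

```python
def schema_rows(schema: dict, defaults: dict):
    """Flatten a JSON-schema 'properties' block into (name, type, required, default) rows."""
    required = set(schema.get("required", []))
    rows = []
    for name, spec in schema.get("properties", {}).items():
        ptype = spec.get("type", "unknown")
        if ptype == "array":
            # Surface the element type, e.g. array[string]
            ptype = f"array[{spec.get('items', {}).get('type', 'unknown')}]"
        rows.append((name, ptype, name in required, defaults.get(name, "")))
    return rows

schema = {
    "properties": {
        "timeout": {"type": "integer", "description": "Seconds per target"},
        "targets": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["targets"],
}
print(schema_rows(schema, {"timeout": 60}))
# [('timeout', 'integer', False, 60), ('targets', 'array[string]', True, '')]
```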


@@ -1,190 +0,0 @@
"""
Shell auto-completion support for FuzzForge CLI.
Provides intelligent tab completion for commands, workflows, run IDs, and parameters.
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import typer
from typing import List, Optional
from pathlib import Path
from .config import get_project_config, FuzzForgeConfig
from .database import get_project_db
from fuzzforge_sdk import FuzzForgeClient
def complete_workflow_names(incomplete: str) -> List[str]:
"""Auto-complete workflow names from the API."""
try:
config = get_project_config() or FuzzForgeConfig()
with FuzzForgeClient(base_url=config.get_api_url(), timeout=5.0) as client:
workflows = client.list_workflows()
workflow_names = [w.name for w in workflows]
return [name for name in workflow_names if name.startswith(incomplete)]
except Exception:
# Fallback to common workflow names if API is unavailable
common_workflows = [
"security_assessment",
"language_fuzzing",
"infrastructure_scan",
"static_analysis_scan",
"penetration_testing_scan",
"secret_detection_scan"
]
return [name for name in common_workflows if name.startswith(incomplete)]
def complete_run_ids(incomplete: str) -> List[str]:
"""Auto-complete run IDs from local database."""
try:
db = get_project_db()
if db:
runs = db.get_recent_runs(limit=50) # Get recent runs for completion
run_ids = [run.run_id for run in runs]
return [run_id for run_id in run_ids if run_id.startswith(incomplete)]
except Exception:
pass
return []
def complete_target_paths(incomplete: str) -> List[str]:
"""Auto-complete file/directory paths."""
try:
# Convert incomplete path to Path object
path = Path(incomplete) if incomplete else Path.cwd()
if path.is_dir():
# Complete directory contents
try:
entries = []
for entry in path.iterdir():
entry_str = str(entry)
if entry.is_dir():
entry_str += "/"
entries.append(entry_str)
return entries
except PermissionError:
return []
else:
# Complete parent directory contents that match the incomplete name
parent = path.parent
name = path.name
try:
entries = []
for entry in parent.iterdir():
if entry.name.startswith(name):
entry_str = str(entry)
if entry.is_dir():
entry_str += "/"
entries.append(entry_str)
return entries
except (PermissionError, FileNotFoundError):
return []
except Exception:
return []
def complete_volume_modes(incomplete: str) -> List[str]:
"""Auto-complete volume mount modes."""
modes = ["ro", "rw"]
return [mode for mode in modes if mode.startswith(incomplete)]
def complete_export_formats(incomplete: str) -> List[str]:
"""Auto-complete export formats."""
formats = ["json", "csv", "html", "sarif"]
return [fmt for fmt in formats if fmt.startswith(incomplete)]
def complete_severity_levels(incomplete: str) -> List[str]:
"""Auto-complete severity levels."""
severities = ["critical", "high", "medium", "low", "info"]
return [sev for sev in severities if sev.startswith(incomplete)]
def complete_workflow_tags(incomplete: str) -> List[str]:
"""Auto-complete workflow tags."""
try:
config = get_project_config() or FuzzForgeConfig()
with FuzzForgeClient(base_url=config.get_api_url(), timeout=5.0) as client:
workflows = client.list_workflows()
all_tags = set()
for w in workflows:
if w.tags:
all_tags.update(w.tags)
return [tag for tag in sorted(all_tags) if tag.startswith(incomplete)]
except Exception:
# Fallback tags
common_tags = [
"security", "fuzzing", "static-analysis", "infrastructure",
"secrets", "containers", "vulnerabilities", "pentest"
]
return [tag for tag in common_tags if tag.startswith(incomplete)]
def complete_config_keys(incomplete: str) -> List[str]:
"""Auto-complete configuration keys."""
config_keys = [
"api_url",
"api_timeout",
"default_workflow",
"default_volume_mode",
"project_name",
"data_retention_days",
"auto_save_findings",
"notification_webhook"
]
return [key for key in config_keys if key.startswith(incomplete)]
# Completion callbacks for Typer
WorkflowNameComplete = typer.Option(
autocompletion=complete_workflow_names,
help="Workflow name (tab completion available)"
)
RunIdComplete = typer.Option(
autocompletion=complete_run_ids,
help="Run ID (tab completion available)"
)
TargetPathComplete = typer.Argument(
autocompletion=complete_target_paths,
help="Target path (tab completion available)"
)
VolumeModeComplete = typer.Option(
autocompletion=complete_volume_modes,
help="Volume mode: ro or rw (tab completion available)"
)
ExportFormatComplete = typer.Option(
autocompletion=complete_export_formats,
help="Export format (tab completion available)"
)
SeverityComplete = typer.Option(
autocompletion=complete_severity_levels,
help="Severity level (tab completion available)"
)
WorkflowTagComplete = typer.Option(
autocompletion=complete_workflow_tags,
help="Workflow tag (tab completion available)"
)
ConfigKeyComplete = typer.Option(
autocompletion=complete_config_keys,
help="Configuration key (tab completion available)"
)
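Every completer above follows the same contract Typer expects from an `autocompletion` callback: take the partial token typed so far and return the matching candidates. The severity completer, reproduced standalone, shows the prefix-filter pattern:

```python
from typing import List

def complete_severity_levels(incomplete: str) -> List[str]:
    """Auto-complete severity levels by prefix match."""
    severities = ["critical", "high", "medium", "low", "info"]
    return [sev for sev in severities if sev.startswith(incomplete)]

print(complete_severity_levels(""))    # all five levels
print(complete_severity_levels("hi"))  # ['high']
print(complete_severity_levels("x"))   # []
```

An empty prefix returns the full candidate list, which is what makes a bare `<Tab>` show all options in the shell.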


@@ -1,420 +0,0 @@
"""
Configuration management for FuzzForge CLI.
Extends project configuration with Cognee integration metadata
and provides helpers for AI modules.
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
from __future__ import annotations
import hashlib
import os
from pathlib import Path
from typing import Any, Dict, Optional
try: # Optional dependency; fall back if not installed
from dotenv import load_dotenv
except ImportError: # pragma: no cover - optional dependency
load_dotenv = None
import yaml
from pydantic import BaseModel, Field
def _generate_project_id(project_dir: Path, project_name: str) -> str:
"""Generate a deterministic project identifier based on path and name."""
resolved_path = str(project_dir.resolve())
hash_input = f"{resolved_path}:{project_name}".encode()
return hashlib.sha256(hash_input).hexdigest()[:16]
class ProjectConfig(BaseModel):
"""Project configuration model."""
name: str = "fuzzforge-project"
api_url: str = "http://localhost:8000"
default_timeout: int = 3600
default_workflow: Optional[str] = None
id: Optional[str] = None
tenant_id: Optional[str] = None
class RetentionConfig(BaseModel):
"""Data retention configuration."""
max_runs: int = 100
keep_findings_days: int = 90
class PreferencesConfig(BaseModel):
"""User preferences."""
auto_save_findings: bool = True
show_progress_bars: bool = True
table_style: str = "rich"
color_output: bool = True
class CogneeConfig(BaseModel):
"""Cognee integration metadata."""
enabled: bool = True
graph_database_provider: str = "kuzu"
data_directory: Optional[str] = None
system_directory: Optional[str] = None
backend_access_control: bool = True
project_id: Optional[str] = None
tenant_id: Optional[str] = None
class FuzzForgeConfig(BaseModel):
"""Complete FuzzForge CLI configuration."""
project: ProjectConfig = Field(default_factory=ProjectConfig)
retention: RetentionConfig = Field(default_factory=RetentionConfig)
preferences: PreferencesConfig = Field(default_factory=PreferencesConfig)
cognee: CogneeConfig = Field(default_factory=CogneeConfig)
@classmethod
def from_file(cls, config_path: Path) -> "FuzzForgeConfig":
"""Load configuration from YAML file."""
if not config_path.exists():
return cls()
try:
with open(config_path, "r", encoding="utf-8") as fh:
data = yaml.safe_load(fh) or {}
return cls(**data)
except Exception as exc: # pragma: no cover - defensive fallback
print(f"Warning: Failed to load config from {config_path}: {exc}")
return cls()
def save_to_file(self, config_path: Path) -> None:
"""Save configuration to YAML file."""
config_path.parent.mkdir(parents=True, exist_ok=True)
with open(config_path, "w", encoding="utf-8") as fh:
yaml.dump(
self.model_dump(),
fh,
default_flow_style=False,
sort_keys=False,
)
# ------------------------------------------------------------------
# Convenience helpers used by CLI and AI modules
# ------------------------------------------------------------------
def ensure_project_metadata(self, project_dir: Path) -> bool:
"""Ensure project id/tenant metadata is populated."""
changed = False
project = self.project
if not project.id:
project.id = _generate_project_id(project_dir, project.name)
changed = True
if not project.tenant_id:
project.tenant_id = f"fuzzforge_project_{project.id}"
changed = True
return changed
def ensure_cognee_defaults(self, project_dir: Path) -> bool:
"""Ensure Cognee configuration and directories exist."""
self.ensure_project_metadata(project_dir)
changed = False
cognee = self.cognee
if not cognee.project_id:
cognee.project_id = self.project.id
changed = True
if not cognee.tenant_id:
cognee.tenant_id = self.project.tenant_id
changed = True
base_dir = project_dir / ".fuzzforge" / "cognee" / f"project_{self.project.id}"
data_dir = base_dir / "data"
system_dir = base_dir / "system"
for path in (
base_dir,
data_dir,
system_dir,
system_dir / "kuzu_db",
system_dir / "lancedb",
):
if not path.exists():
path.mkdir(parents=True, exist_ok=True)
if cognee.data_directory != str(data_dir):
cognee.data_directory = str(data_dir)
changed = True
if cognee.system_directory != str(system_dir):
cognee.system_directory = str(system_dir)
changed = True
return changed
def get_api_url(self) -> str:
"""Get API URL with environment variable override."""
return os.getenv("FUZZFORGE_API_URL", self.project.api_url)
def get_timeout(self) -> int:
"""Get timeout with environment variable override."""
env_timeout = os.getenv("FUZZFORGE_TIMEOUT")
if env_timeout and env_timeout.isdigit():
return int(env_timeout)
return self.project.default_timeout
def get_project_context(self, project_dir: Path) -> Dict[str, str]:
"""Return project metadata for AI integrations."""
self.ensure_cognee_defaults(project_dir)
return {
"project_id": self.project.id or "unknown_project",
"project_name": self.project.name,
"tenant_id": self.project.tenant_id or "fuzzforge_tenant",
"data_directory": self.cognee.data_directory,
"system_directory": self.cognee.system_directory,
}
def get_cognee_config(self, project_dir: Path) -> Dict[str, Any]:
"""Expose Cognee configuration as a plain dictionary."""
self.ensure_cognee_defaults(project_dir)
return self.cognee.model_dump()
# ----------------------------------------------------------------------
# Project-level helpers used across the CLI
# ----------------------------------------------------------------------
def _get_project_paths(project_dir: Path) -> Dict[str, Path]:
config_dir = project_dir / ".fuzzforge"
return {
"config_dir": config_dir,
"config_path": config_dir / "config.yaml",
}
def get_project_config(project_dir: Optional[Path] = None) -> Optional[FuzzForgeConfig]:
"""Get configuration for the current project."""
project_dir = Path(project_dir or Path.cwd())
paths = _get_project_paths(project_dir)
config_path = paths["config_path"]
if not config_path.exists():
return None
config = FuzzForgeConfig.from_file(config_path)
if config.ensure_cognee_defaults(project_dir):
config.save_to_file(config_path)
return config
def ensure_project_config(
project_dir: Optional[Path] = None,
project_name: Optional[str] = None,
api_url: Optional[str] = None,
) -> FuzzForgeConfig:
"""Ensure project configuration exists, creating defaults if needed."""
project_dir = Path(project_dir or Path.cwd())
paths = _get_project_paths(project_dir)
config_dir = paths["config_dir"]
config_path = paths["config_path"]
config_dir.mkdir(parents=True, exist_ok=True)
if config_path.exists():
config = FuzzForgeConfig.from_file(config_path)
else:
config = FuzzForgeConfig()
if project_name:
config.project.name = project_name
if api_url:
config.project.api_url = api_url
if config.ensure_cognee_defaults(project_dir):
config.save_to_file(config_path)
else:
# Still ensure latest values persisted (e.g., updated name/url)
config.save_to_file(config_path)
return config
def get_global_config() -> FuzzForgeConfig:
"""Get global user configuration."""
home = Path.home()
global_config_dir = home / ".config" / "fuzzforge"
global_config_path = global_config_dir / "config.yaml"
if global_config_path.exists():
return FuzzForgeConfig.from_file(global_config_path)
return FuzzForgeConfig()
def save_global_config(config: FuzzForgeConfig) -> None:
"""Save global user configuration."""
home = Path.home()
global_config_dir = home / ".config" / "fuzzforge"
global_config_path = global_config_dir / "config.yaml"
config.save_to_file(global_config_path)
# ----------------------------------------------------------------------
# Compatibility layer for AI modules
# ----------------------------------------------------------------------
class ProjectConfigManager:
"""Lightweight wrapper mimicking the legacy Config class used by the AI module."""
def __init__(self, project_dir: Optional[Path] = None):
self.project_dir = Path(project_dir or Path.cwd())
paths = _get_project_paths(self.project_dir)
self.config_path = paths["config_dir"]
self.file_path = paths["config_path"]
self._config = get_project_config(self.project_dir)
if self._config is None:
raise FileNotFoundError(
f"FuzzForge project not initialized in {self.project_dir}. Run 'ff init'."
)
# Legacy API ------------------------------------------------------
def is_initialized(self) -> bool:
return self.file_path.exists()
def get_project_context(self) -> Dict[str, str]:
return self._config.get_project_context(self.project_dir)
def get_cognee_config(self) -> Dict[str, Any]:
return self._config.get_cognee_config(self.project_dir)
def setup_cognee_environment(self) -> None:
cognee = self.get_cognee_config()
if not cognee.get("enabled", True):
return
# Load project-specific environment overrides from .fuzzforge/.env if available
env_file = self.project_dir / ".fuzzforge" / ".env"
if env_file.exists():
if load_dotenv:
load_dotenv(env_file, override=False)
else:
try:
for line in env_file.read_text(encoding="utf-8").splitlines():
stripped = line.strip()
if not stripped or stripped.startswith("#"):
continue
if "=" not in stripped:
continue
key, value = stripped.split("=", 1)
os.environ.setdefault(key.strip(), value.strip())
except Exception: # pragma: no cover - best effort fallback
pass
backend_access = "true" if cognee.get("backend_access_control", True) else "false"
os.environ["ENABLE_BACKEND_ACCESS_CONTROL"] = backend_access
os.environ["GRAPH_DATABASE_PROVIDER"] = cognee.get("graph_database_provider", "kuzu")
data_dir = cognee.get("data_directory")
system_dir = cognee.get("system_directory")
tenant_id = cognee.get("tenant_id", "fuzzforge_tenant")
if data_dir:
os.environ["COGNEE_DATA_ROOT"] = data_dir
if system_dir:
os.environ["COGNEE_SYSTEM_ROOT"] = system_dir
os.environ["COGNEE_USER_ID"] = tenant_id
os.environ["COGNEE_TENANT_ID"] = tenant_id
# Configure LLM provider defaults for Cognee. Values prefixed with COGNEE_
# take precedence so users can segregate credentials.
def _env(*names: str, default: str | None = None) -> str | None:
for name in names:
value = os.getenv(name)
if value:
return value
return default
provider = _env(
"LLM_COGNEE_PROVIDER",
"COGNEE_LLM_PROVIDER",
"LLM_PROVIDER",
default="openai",
)
model = _env(
"LLM_COGNEE_MODEL",
"COGNEE_LLM_MODEL",
"LLM_MODEL",
"LITELLM_MODEL",
default="gpt-4o-mini",
)
api_key = _env(
"LLM_COGNEE_API_KEY",
"COGNEE_LLM_API_KEY",
"LLM_API_KEY",
"OPENAI_API_KEY",
)
endpoint = _env("LLM_COGNEE_ENDPOINT", "COGNEE_LLM_ENDPOINT", "LLM_ENDPOINT")
api_version = _env(
"LLM_COGNEE_API_VERSION",
"COGNEE_LLM_API_VERSION",
"LLM_API_VERSION",
)
max_tokens = _env(
"LLM_COGNEE_MAX_TOKENS",
"COGNEE_LLM_MAX_TOKENS",
"LLM_MAX_TOKENS",
)
if provider:
os.environ["LLM_PROVIDER"] = provider
if model:
os.environ["LLM_MODEL"] = model
# Maintain backwards compatibility with components expecting LITELLM_MODEL
os.environ.setdefault("LITELLM_MODEL", model)
if api_key:
os.environ["LLM_API_KEY"] = api_key
# Provide OPENAI_API_KEY fallback when using OpenAI-compatible providers
if provider and provider.lower() in {"openai", "azure_openai", "custom"}:
os.environ.setdefault("OPENAI_API_KEY", api_key)
if endpoint:
os.environ["LLM_ENDPOINT"] = endpoint
if api_version:
os.environ["LLM_API_VERSION"] = api_version
if max_tokens:
os.environ["LLM_MAX_TOKENS"] = str(max_tokens)
# Provide a default MCP endpoint for local FuzzForge backend access when unset
if not os.getenv("FUZZFORGE_MCP_URL"):
os.environ["FUZZFORGE_MCP_URL"] = os.getenv(
"FUZZFORGE_DEFAULT_MCP_URL",
"http://localhost:8010/mcp",
)
def refresh(self) -> None:
"""Reload configuration from disk."""
self._config = get_project_config(self.project_dir)
if self._config is None:
raise FileNotFoundError(
f"FuzzForge project not initialized in {self.project_dir}. Run 'ff init'."
)
# Convenience accessors ------------------------------------------
@property
def fuzzforge_dir(self) -> Path:
return self.config_path
def get_api_url(self) -> str:
return self._config.get_api_url()
def get_timeout(self) -> int:
return self._config.get_timeout()
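The project identifier scheme above is deterministic by construction: hashing the resolved path plus the project name means the same project directory always yields the same 16-hex-character ID, so re-running `ff init` cannot orphan Cognee data directories. A standalone sketch of `_generate_project_id`:

```python
import hashlib
from pathlib import Path

def generate_project_id(project_dir: Path, project_name: str) -> str:
    """Deterministic 16-hex-char ID from resolved path and project name."""
    resolved = str(project_dir.resolve())
    return hashlib.sha256(f"{resolved}:{project_name}".encode()).hexdigest()[:16]

a = generate_project_id(Path("/tmp/proj"), "demo")
b = generate_project_id(Path("/tmp/proj"), "demo")
print(a == b, len(a))  # True 16
```

Because the path is resolved first, symlinked and relative spellings of the same directory collapse to one ID, while renaming the project produces a fresh one.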


@@ -1,73 +0,0 @@
"""
Constants for FuzzForge CLI.
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
# Database constants
DEFAULT_DB_TIMEOUT = 30.0
DEFAULT_CLEANUP_DAYS = 90
STATS_SAMPLE_SIZE = 100
# Network constants
DEFAULT_API_TIMEOUT = 30.0
MAX_RETRIES = 3
RETRY_DELAY = 1.0
POLL_INTERVAL = 5.0
# Display constants
MAX_RUN_ID_DISPLAY_LENGTH = 15
MAX_DESCRIPTION_LENGTH = 50
MAX_DEFAULT_VALUE_LENGTH = 30
# Progress constants
PROGRESS_STEP_DELAYS = {
"validating": 0.3,
"connecting": 0.2,
"uploading": 0.2,
"creating": 0.3,
"initializing": 0.2
}
# Status emojis
STATUS_EMOJIS = {
"completed": "✅",
"running": "🔄",
"failed": "❌",
"queued": "⏳",
"cancelled": "⏹️",
"pending": "📋",
"unknown": "❓"
}
# Severity styles for Rich
SEVERITY_STYLES = {
"error": "bold red",
"warning": "bold yellow",
"note": "bold blue",
"info": "bold cyan"
}
# Default volume modes
DEFAULT_VOLUME_MODE = "ro"
SUPPORTED_VOLUME_MODES = ["ro", "rw"]
# Default export formats
DEFAULT_EXPORT_FORMAT = "sarif"
SUPPORTED_EXPORT_FORMATS = ["sarif", "json", "csv"]
# Default configuration
DEFAULT_CONFIG = {
"api_url": "http://localhost:8000",
"timeout": DEFAULT_API_TIMEOUT,
"max_retries": MAX_RETRIES,
}
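Commands elsewhere in the CLI render statuses via `status_emoji(run.status)`. Its definition is not shown in this diff, but given the `STATUS_EMOJIS` table above it is plausibly a plain lookup with the `"unknown"` entry as fallback; a hedged sketch under that assumption:

```python
STATUS_EMOJIS = {
    "completed": "✅",
    "running": "🔄",
    "failed": "❌",
    "queued": "⏳",
    "cancelled": "⏹️",
    "pending": "📋",
    "unknown": "❓",
}

def status_emoji(status: str) -> str:
    """Map a run status to its display emoji, defaulting to 'unknown'."""
    # Normalizing case is an assumption; backend statuses may already be lowercase
    return STATUS_EMOJIS.get(status.lower(), STATUS_EMOJIS["unknown"])

print(status_emoji("running"))   # 🔄
print(status_emoji("exploded"))  # ❓
```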


@@ -1,661 +0,0 @@
"""
Database module for FuzzForge CLI.
Handles SQLite database operations for local project management,
including runs, findings, and crash storage.
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import sqlite3
import json
import logging
from datetime import datetime
from pathlib import Path
from typing import Dict, Any, List, Optional, Union
from contextlib import contextmanager
from pydantic import BaseModel
from .constants import DEFAULT_DB_TIMEOUT, DEFAULT_CLEANUP_DAYS, STATS_SAMPLE_SIZE
logger = logging.getLogger(__name__)
class RunRecord(BaseModel):
    """Database record for workflow runs"""
    run_id: str
    workflow: str
    status: str
    target_path: str
    parameters: Dict[str, Any] = {}
    created_at: datetime
    completed_at: Optional[datetime] = None
    metadata: Dict[str, Any] = {}


class FindingRecord(BaseModel):
    """Database record for findings"""
    id: Optional[int] = None
    run_id: str
    sarif_data: Dict[str, Any]
    summary: Dict[str, Any] = {}
    created_at: datetime


class CrashRecord(BaseModel):
    """Database record for crash reports"""
    id: Optional[int] = None
    run_id: str
    crash_id: str
    signal: Optional[str] = None
    stack_trace: Optional[str] = None
    input_file: Optional[str] = None
    severity: str = "medium"
    timestamp: datetime
class FuzzForgeDatabase:
    """SQLite database manager for FuzzForge CLI projects"""

    SCHEMA = """
    CREATE TABLE IF NOT EXISTS runs (
        run_id TEXT PRIMARY KEY,
        workflow TEXT NOT NULL,
        status TEXT NOT NULL,
        target_path TEXT NOT NULL,
        parameters TEXT DEFAULT '{}',
        created_at TIMESTAMP NOT NULL,
        completed_at TIMESTAMP,
        metadata TEXT DEFAULT '{}'
    );
    CREATE TABLE IF NOT EXISTS findings (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        run_id TEXT NOT NULL,
        sarif_data TEXT NOT NULL,
        summary TEXT DEFAULT '{}',
        created_at TIMESTAMP NOT NULL,
        FOREIGN KEY (run_id) REFERENCES runs (run_id)
    );
    CREATE TABLE IF NOT EXISTS crashes (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        run_id TEXT NOT NULL,
        crash_id TEXT NOT NULL,
        signal TEXT,
        stack_trace TEXT,
        input_file TEXT,
        severity TEXT DEFAULT 'medium',
        timestamp TIMESTAMP NOT NULL,
        FOREIGN KEY (run_id) REFERENCES runs (run_id)
    );
    CREATE INDEX IF NOT EXISTS idx_runs_status ON runs (status);
    CREATE INDEX IF NOT EXISTS idx_runs_workflow ON runs (workflow);
    CREATE INDEX IF NOT EXISTS idx_runs_created_at ON runs (created_at);
    CREATE INDEX IF NOT EXISTS idx_findings_run_id ON findings (run_id);
    CREATE INDEX IF NOT EXISTS idx_crashes_run_id ON crashes (run_id);
    """

    def __init__(self, db_path: Union[str, Path]):
        self.db_path = Path(db_path)
        self.db_path.parent.mkdir(parents=True, exist_ok=True)
        self._initialize_db()
    def _initialize_db(self):
        """Initialize database with schema, handling corruption"""
        try:
            with self.connection() as conn:
                # Test database integrity first
                conn.execute("PRAGMA integrity_check").fetchone()
                conn.executescript(self.SCHEMA)
        except sqlite3.DatabaseError as e:
            logger.warning(f"Database corruption detected: {e}")
            # Back up the corrupted database
            backup_path = self.db_path.with_suffix('.db.corrupted')
            if self.db_path.exists():
                self.db_path.rename(backup_path)
                logger.info(f"Corrupted database backed up to: {backup_path}")
            # Create a fresh database
            with self.connection() as conn:
                conn.executescript(self.SCHEMA)
            logger.info("Created fresh database after corruption")
    @contextmanager
    def connection(self):
        """Context manager for database connections with proper resource management"""
        conn = None
        try:
            conn = sqlite3.connect(
                self.db_path,
                detect_types=sqlite3.PARSE_DECLTYPES | sqlite3.PARSE_COLNAMES,
                timeout=DEFAULT_DB_TIMEOUT
            )
            conn.row_factory = sqlite3.Row
            # Enable WAL mode for better concurrency
            conn.execute("PRAGMA journal_mode=WAL")
            # Enable query optimization
            conn.execute("PRAGMA optimize")
            yield conn
            conn.commit()
        except sqlite3.OperationalError as e:
            if conn:
                try:
                    conn.rollback()
                except Exception:
                    pass  # Connection might be broken
            if "database is locked" in str(e).lower():
                raise sqlite3.OperationalError(
                    "Database is locked. Another FuzzForge process may be running."
                ) from e
            elif "database disk image is malformed" in str(e).lower():
                raise sqlite3.DatabaseError(
                    "Database is corrupted. Use 'ff init --force' to reset."
                ) from e
            raise
        except Exception:
            if conn:
                try:
                    conn.rollback()
                except Exception:
                    pass  # Connection might be broken
            raise
        finally:
            if conn:
                try:
                    conn.close()
                except Exception:
                    pass  # Ensure cleanup even if close fails
    # Run management methods
    def save_run(self, run: RunRecord) -> None:
        """Save or update a run record with validation"""
        try:
            # Validate JSON serialization before the database write
            parameters_json = json.dumps(run.parameters)
            metadata_json = json.dumps(run.metadata)
            with self.connection() as conn:
                conn.execute("""
                    INSERT OR REPLACE INTO runs
                    (run_id, workflow, status, target_path, parameters, created_at, completed_at, metadata)
                    VALUES (?, ?, ?, ?, ?, ?, ?, ?)
                """, (
                    run.run_id,
                    run.workflow,
                    run.status,
                    run.target_path,
                    parameters_json,
                    run.created_at,
                    run.completed_at,
                    metadata_json
                ))
        except (TypeError, ValueError) as e:
            raise ValueError(f"Failed to serialize run data: {e}") from e
    def get_run(self, run_id: str) -> Optional[RunRecord]:
        """Get a run record by ID with error handling"""
        with self.connection() as conn:
            row = conn.execute(
                "SELECT * FROM runs WHERE run_id = ?",
                (run_id,)
            ).fetchone()
            if row:
                try:
                    return RunRecord(
                        run_id=row["run_id"],
                        workflow=row["workflow"],
                        status=row["status"],
                        target_path=row["target_path"],
                        parameters=json.loads(row["parameters"] or "{}"),
                        created_at=row["created_at"],
                        completed_at=row["completed_at"],
                        metadata=json.loads(row["metadata"] or "{}")
                    )
                except (json.JSONDecodeError, TypeError) as e:
                    logger.warning(f"Failed to deserialize run {run_id}: {e}")
                    # Fall back to empty dicts for corrupted JSON
                    return RunRecord(
                        run_id=row["run_id"],
                        workflow=row["workflow"],
                        status=row["status"],
                        target_path=row["target_path"],
                        parameters={},
                        created_at=row["created_at"],
                        completed_at=row["completed_at"],
                        metadata={}
                    )
            return None
    def list_runs(
        self,
        workflow: Optional[str] = None,
        status: Optional[str] = None,
        limit: int = 50
    ) -> List[RunRecord]:
        """List runs with optional filters"""
        query = "SELECT * FROM runs WHERE 1=1"
        params = []
        if workflow:
            query += " AND workflow = ?"
            params.append(workflow)
        if status:
            query += " AND status = ?"
            params.append(status)
        query += " ORDER BY created_at DESC LIMIT ?"
        params.append(limit)
        with self.connection() as conn:
            rows = conn.execute(query, params).fetchall()
            runs = []
            for row in rows:
                try:
                    runs.append(RunRecord(
                        run_id=row["run_id"],
                        workflow=row["workflow"],
                        status=row["status"],
                        target_path=row["target_path"],
                        parameters=json.loads(row["parameters"] or "{}"),
                        created_at=row["created_at"],
                        completed_at=row["completed_at"],
                        metadata=json.loads(row["metadata"] or "{}")
                    ))
                except (json.JSONDecodeError, TypeError) as e:
                    # Skip corrupted records instead of failing
                    logger.warning(f"Skipping corrupted run {row['run_id']}: {e}")
                    continue
            return runs

    def update_run_status(self, run_id: str, status: str, completed_at: Optional[datetime] = None):
        """Update run status"""
        with self.connection() as conn:
            conn.execute(
                "UPDATE runs SET status = ?, completed_at = ? WHERE run_id = ?",
                (status, completed_at, run_id)
            )
    # Findings management methods
    def save_findings(self, finding: FindingRecord) -> int:
        """Save findings and return the row ID"""
        with self.connection() as conn:
            cursor = conn.execute("""
                INSERT INTO findings (run_id, sarif_data, summary, created_at)
                VALUES (?, ?, ?, ?)
            """, (
                finding.run_id,
                json.dumps(finding.sarif_data),
                json.dumps(finding.summary),
                finding.created_at
            ))
            return cursor.lastrowid

    def get_findings(self, run_id: str) -> Optional[FindingRecord]:
        """Get the most recent findings for a run"""
        with self.connection() as conn:
            row = conn.execute(
                "SELECT * FROM findings WHERE run_id = ? ORDER BY created_at DESC LIMIT 1",
                (run_id,)
            ).fetchone()
            if row:
                return FindingRecord(
                    id=row["id"],
                    run_id=row["run_id"],
                    sarif_data=json.loads(row["sarif_data"]),
                    summary=json.loads(row["summary"]),
                    created_at=row["created_at"]
                )
            return None

    def list_findings(self, limit: int = 50) -> List[FindingRecord]:
        """List recent findings"""
        with self.connection() as conn:
            rows = conn.execute("""
                SELECT * FROM findings
                ORDER BY created_at DESC
                LIMIT ?
            """, (limit,)).fetchall()
            return [
                FindingRecord(
                    id=row["id"],
                    run_id=row["run_id"],
                    sarif_data=json.loads(row["sarif_data"]),
                    summary=json.loads(row["summary"]),
                    created_at=row["created_at"]
                )
                for row in rows
            ]
    def get_all_findings(self,
                         workflow: Optional[str] = None,
                         severity: Optional[List[str]] = None,
                         since_date: Optional[datetime] = None,
                         limit: Optional[int] = None) -> List[FindingRecord]:
        """Get all findings with optional filters"""
        with self.connection() as conn:
            query = """
                SELECT f.*, r.workflow
                FROM findings f
                JOIN runs r ON f.run_id = r.run_id
                WHERE 1=1
            """
            params = []
            if workflow:
                query += " AND r.workflow = ?"
                params.append(workflow)
            if since_date:
                query += " AND f.created_at >= ?"
                params.append(since_date)
            query += " ORDER BY f.created_at DESC"
            if limit:
                query += " LIMIT ?"
                params.append(limit)
            rows = conn.execute(query, params).fetchall()
            findings = []
            for row in rows:
                try:
                    finding = FindingRecord(
                        id=row["id"],
                        run_id=row["run_id"],
                        sarif_data=json.loads(row["sarif_data"]),
                        summary=json.loads(row["summary"]),
                        created_at=row["created_at"]
                    )
                    # Filter by severity if specified
                    if severity:
                        finding_severities = set()
                        if "runs" in finding.sarif_data:
                            for run in finding.sarif_data["runs"]:
                                for result in run.get("results", []):
                                    finding_severities.add(result.get("level", "note").lower())
                        if not any(sev.lower() in finding_severities for sev in severity):
                            continue
                    findings.append(finding)
                except (json.JSONDecodeError, KeyError) as e:
                    logger.warning(f"Skipping malformed finding {row['id']}: {e}")
                    continue
            return findings

    def get_findings_by_workflow(self, workflow: str) -> List[FindingRecord]:
        """Get all findings for a specific workflow"""
        return self.get_all_findings(workflow=workflow)
    def get_aggregated_stats(self) -> Dict[str, Any]:
        """Get aggregated statistics for all findings using SQL aggregation"""
        with self.connection() as conn:
            # Total findings and runs
            total_findings = conn.execute("SELECT COUNT(*) FROM findings").fetchone()[0]
            total_runs = conn.execute("SELECT COUNT(DISTINCT run_id) FROM findings").fetchone()[0]
            # Findings by workflow
            workflow_stats = conn.execute("""
                SELECT r.workflow, COUNT(f.id) as count
                FROM findings f
                JOIN runs r ON f.run_id = r.run_id
                GROUP BY r.workflow
                ORDER BY count DESC
            """).fetchall()
            # Recent activity
            recent_findings = conn.execute("""
                SELECT COUNT(*) FROM findings
                WHERE created_at > datetime('now', '-7 days')
            """).fetchone()[0]
            # Use SQL JSON functions to aggregate severity stats efficiently;
            # this avoids loading all findings into memory
            severity_stats = conn.execute("""
                SELECT
                    SUM(json_array_length(json_extract(sarif_data, '$.runs[0].results'))) as total_issues,
                    COUNT(*) as finding_count
                FROM findings
                WHERE json_extract(sarif_data, '$.runs[0].results') IS NOT NULL
            """).fetchone()
            total_issues = severity_stats["total_issues"] or 0
            # Severity distribution: sample the first N findings for a good
            # approximation instead of parsing every SARIF blob in Python
            severity_counts = {"error": 0, "warning": 0, "note": 0, "info": 0}
            sample_findings = conn.execute("""
                SELECT sarif_data
                FROM findings
                LIMIT ?
            """, (STATS_SAMPLE_SIZE,)).fetchall()
            for row in sample_findings:
                try:
                    data = json.loads(row["sarif_data"])
                    if "runs" in data:
                        for run in data["runs"]:
                            for result in run.get("results", []):
                                level = result.get("level", "note").lower()
                                severity_counts[level] = severity_counts.get(level, 0) + 1
                except (json.JSONDecodeError, KeyError):
                    continue
            # Extrapolate severity counts when there are more findings than the sample
            if total_findings > STATS_SAMPLE_SIZE:
                multiplier = total_findings / STATS_SAMPLE_SIZE
                for key in severity_counts:
                    severity_counts[key] = int(severity_counts[key] * multiplier)
            return {
                "total_findings_records": total_findings,
                "total_runs": total_runs,
                "total_issues": total_issues,
                "severity_distribution": severity_counts,
                "workflows": {row["workflow"]: row["count"] for row in workflow_stats},
                "recent_findings": recent_findings,
                "last_updated": datetime.now()
            }
    # Crash management methods
    def save_crash(self, crash: CrashRecord) -> int:
        """Save a crash report and return the row ID"""
        with self.connection() as conn:
            cursor = conn.execute("""
                INSERT INTO crashes
                (run_id, crash_id, signal, stack_trace, input_file, severity, timestamp)
                VALUES (?, ?, ?, ?, ?, ?, ?)
            """, (
                crash.run_id,
                crash.crash_id,
                crash.signal,
                crash.stack_trace,
                crash.input_file,
                crash.severity,
                crash.timestamp
            ))
            return cursor.lastrowid

    def get_crashes(self, run_id: str) -> List[CrashRecord]:
        """Get all crashes for a run"""
        with self.connection() as conn:
            rows = conn.execute(
                "SELECT * FROM crashes WHERE run_id = ? ORDER BY timestamp DESC",
                (run_id,)
            ).fetchall()
            return [
                CrashRecord(
                    id=row["id"],
                    run_id=row["run_id"],
                    crash_id=row["crash_id"],
                    signal=row["signal"],
                    stack_trace=row["stack_trace"],
                    input_file=row["input_file"],
                    severity=row["severity"],
                    timestamp=row["timestamp"]
                )
                for row in rows
            ]
    # Utility methods
    def cleanup_old_runs(self, keep_days: int = DEFAULT_CLEANUP_DAYS) -> int:
        """Remove old runs and associated data"""
        # datetime here is the class, not the module, so import timedelta directly
        from datetime import timedelta
        cutoff_date = datetime.now().replace(
            hour=0, minute=0, second=0, microsecond=0
        ) - timedelta(days=keep_days)
        with self.connection() as conn:
            # Get run IDs to delete
            old_runs = conn.execute(
                "SELECT run_id FROM runs WHERE created_at < ?",
                (cutoff_date,)
            ).fetchall()
            if not old_runs:
                return 0
            run_ids = [row["run_id"] for row in old_runs]
            placeholders = ",".join("?" * len(run_ids))
            # Delete associated findings and crashes
            conn.execute(f"DELETE FROM findings WHERE run_id IN ({placeholders})", run_ids)
            conn.execute(f"DELETE FROM crashes WHERE run_id IN ({placeholders})", run_ids)
            # Delete runs
            conn.execute(f"DELETE FROM runs WHERE run_id IN ({placeholders})", run_ids)
            return len(run_ids)
    def get_stats(self) -> Dict[str, Any]:
        """Get database statistics"""
        with self.connection() as conn:
            stats = {}
            # Run counts by status
            run_stats = conn.execute("""
                SELECT status, COUNT(*) as count
                FROM runs
                GROUP BY status
            """).fetchall()
            stats["runs_by_status"] = {row["status"]: row["count"] for row in run_stats}
            # Total counts
            stats["total_runs"] = conn.execute("SELECT COUNT(*) FROM runs").fetchone()[0]
            stats["total_findings"] = conn.execute("SELECT COUNT(*) FROM findings").fetchone()[0]
            stats["total_crashes"] = conn.execute("SELECT COUNT(*) FROM crashes").fetchone()[0]
            # Recent activity
            stats["runs_last_7_days"] = conn.execute("""
                SELECT COUNT(*) FROM runs
                WHERE created_at > datetime('now', '-7 days')
            """).fetchone()[0]
            return stats
    def health_check(self) -> Dict[str, Any]:
        """Perform a database health check"""
        health = {
            "healthy": True,
            "issues": [],
            "recommendations": []
        }
        try:
            with self.connection() as conn:
                # Check database integrity
                integrity_result = conn.execute("PRAGMA integrity_check").fetchone()
                if integrity_result[0] != "ok":
                    health["healthy"] = False
                    health["issues"].append(f"Database integrity check failed: {integrity_result[0]}")
                # Check for orphaned records
                orphaned_findings = conn.execute("""
                    SELECT COUNT(*) FROM findings
                    WHERE run_id NOT IN (SELECT run_id FROM runs)
                """).fetchone()[0]
                if orphaned_findings > 0:
                    health["issues"].append(f"Found {orphaned_findings} orphaned findings")
                    health["recommendations"].append("Run database cleanup to remove orphaned records")
                orphaned_crashes = conn.execute("""
                    SELECT COUNT(*) FROM crashes
                    WHERE run_id NOT IN (SELECT run_id FROM runs)
                """).fetchone()[0]
                if orphaned_crashes > 0:
                    health["issues"].append(f"Found {orphaned_crashes} orphaned crashes")
                # Check database size
                db_size = self.db_path.stat().st_size if self.db_path.exists() else 0
                if db_size > 100 * 1024 * 1024:  # 100 MB
                    health["recommendations"].append("Database is large (>100MB). Consider cleanup.")
        except Exception as e:
            health["healthy"] = False
            health["issues"].append(f"Health check failed: {e}")
        return health
def get_project_db(project_dir: Optional[Path] = None) -> Optional[FuzzForgeDatabase]:
    """Get the database for the current project with error handling"""
    if project_dir is None:
        project_dir = Path.cwd()
    fuzzforge_dir = project_dir / ".fuzzforge"
    if not fuzzforge_dir.exists():
        return None
    db_path = fuzzforge_dir / "findings.db"
    try:
        return FuzzForgeDatabase(db_path)
    except Exception as e:
        logger.error(f"Failed to open project database: {e}")
        raise sqlite3.DatabaseError(f"Failed to open project database: {e}") from e


def ensure_project_db(project_dir: Optional[Path] = None) -> FuzzForgeDatabase:
    """Ensure the project database exists, creating it if needed"""
    if project_dir is None:
        project_dir = Path.cwd()
    fuzzforge_dir = project_dir / ".fuzzforge"
    try:
        fuzzforge_dir.mkdir(exist_ok=True)
    except PermissionError as e:
        raise PermissionError(f"Cannot create .fuzzforge directory: {e}") from e
    db_path = fuzzforge_dir / "findings.db"
    try:
        return FuzzForgeDatabase(db_path)
    except Exception as e:
        logger.error(f"Failed to create/open project database: {e}")
        raise sqlite3.DatabaseError(f"Failed to create project database: {e}") from e
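The module stores dict fields (`parameters`, `metadata`, `sarif_data`) as JSON text columns and parses them back on read. A self-contained sketch of that round-trip pattern, using an in-memory database and a simplified table (not the real schema):

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # enables row["column"] access, as in the module
conn.execute("CREATE TABLE runs (run_id TEXT PRIMARY KEY, parameters TEXT DEFAULT '{}')")

# Write: serialize the dict before the INSERT, as save_run() does
params = {"timeout": 30, "workers": 4}
conn.execute(
    "INSERT OR REPLACE INTO runs (run_id, parameters) VALUES (?, ?)",
    ("run-1", json.dumps(params)),
)

# Read: parse the JSON column back, falling back to {} on NULL
row = conn.execute("SELECT * FROM runs WHERE run_id = ?", ("run-1",)).fetchone()
restored = json.loads(row["parameters"] or "{}")
print(restored)  # prints {'timeout': 30, 'workers': 4}
```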
