Compare commits


41 Commits

Author SHA1 Message Date
fztee
a271a6bef7 chore: update Gitlab CI and add Makefile (backend package) with code quality commands to run. 2025-11-12 12:00:33 +01:00
fztee
40bbb18795 chore: improve code quality (backend package).
- add configuration file for 'ruff'.
    - fix most of 'ruff' lints.
    - format 'backend' package using 'ruff'.
2025-11-10 17:01:42 +01:00
fztee
a810e29f76 chore: update file 'pyproject.toml' (backend package).
- remove unused dependency 'httpx'.
    - rename optional dependency 'dev' to 'tests'.
2025-11-07 16:29:05 +01:00
fztee
1dc0d967b3 chore: bump and fix versions (backend package). 2025-11-07 16:19:26 +01:00
tduhamel42
511a89c8c2 Update GitHub link to fuzzforge_ai 2025-11-04 17:42:52 +01:00
tduhamel42
321b9d5eed chore: bump all package versions to 0.7.3 for consistency 2025-11-04 14:04:33 +01:00
tduhamel42
7782e3917a docs: update CHANGELOG with missing versions and recent changes
- Add Unreleased section for post-v0.7.3 documentation updates
- Add v0.7.2 entry with bug fixes and worker improvements
- Document that v0.7.1 was re-tagged as v0.7.2
- Fix v0.6.0 date to "Undocumented" (no tag exists)
- Add version comparison links for easier navigation
2025-11-04 14:04:33 +01:00
tduhamel42
e33c611711 chore: add worker startup documentation and cleanup .gitignore
- Add workflow-to-worker mapping tables across documentation
- Update troubleshooting guide with worker requirements section
- Enhance getting started guide with worker examples
- Add quick reference to docker setup guide
- Add WEEK_SUMMARY*.md pattern to .gitignore
2025-11-04 14:04:33 +01:00
tduhamel42
bdcedec091 docs: fix broken documentation links in cli-reference 2025-11-04 14:04:33 +01:00
tduhamel42
1a835b95ee chore: bump version to 0.7.3 2025-11-04 14:04:33 +01:00
tduhamel42
d005521c78 fix: MobSF scanner now properly parses files dict structure
MobSF returns 'files' as a dict (not list):
{"filename": "line_numbers"}

The parser was treating it as a list, causing zero findings
to be extracted. Now properly iterates over the dict and
creates one finding per affected file with correct line numbers
and metadata (CWE, OWASP, MASVS, CVSS).

Fixed in both code_analysis and behaviour sections.
2025-11-04 14:04:33 +01:00
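The shape of this fix can be sketched in a few lines (a simplified sketch — field names beyond `files` are illustrative of the metadata MobSF attaches, not the parser's actual code):

```python
def extract_findings(issue: dict) -> list[dict]:
    """Build one finding per affected file from a MobSF issue.

    MobSF reports affected files as a dict mapping filename to a
    comma-separated string of line numbers, not as a list, so the
    parser must iterate over items(), not over list elements.
    """
    findings = []
    files = issue.get("files", {}) or {}
    for filename, line_numbers in files.items():  # dict, not list
        findings.append({
            "file": filename,
            "lines": line_numbers,
            # Carry over per-issue metadata (keys are illustrative).
            "cwe": issue.get("cwe"),
            "owasp": issue.get("owasp-mobile"),
            "masvs": issue.get("masvs"),
            "cvss": issue.get("cvss"),
        })
    return findings
```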
tduhamel42
9a7138fdb6 feat(cli): add worker management commands with improved progress feedback
Add comprehensive CLI commands for managing Temporal workers:
- ff worker list - List workers with status and uptime
- ff worker start <name> - Start specific worker with optional rebuild
- ff worker stop - Safely stop all workers without affecting core services

Improvements:
- Live progress display during worker startup with Rich Status spinner
- Real-time elapsed time counter and container state updates
- Health check status tracking (starting → unhealthy → healthy)
- Helpful contextual hints at 10s, 30s, 60s intervals
- Better timeout messages showing last known state

Worker management enhancements:
- Use 'docker compose' (space) instead of 'docker-compose' (hyphen)
- Stop workers individually with 'docker stop' to avoid stopping core services
- Platform detection and Dockerfile selection (ARM64/AMD64)

Documentation:
- Updated docker-setup.md with CLI commands as primary method
- Created comprehensive cli-reference.md with all commands and examples
- Added worker management best practices
2025-11-04 14:04:33 +01:00
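The platform detection and Dockerfile selection mentioned above can be sketched in Python (function name and directory layout are illustrative, not the CLI's actual code):

```python
import platform


def select_dockerfile(worker_dir: str = "workers/android") -> str:
    """Pick a platform-specific Dockerfile based on the host CPU.

    Maps the host machine string to the amd64/arm64 naming
    convention used by the multi-platform worker Dockerfiles.
    """
    machine = platform.machine().lower()
    arch = "arm64" if machine in ("arm64", "aarch64") else "amd64"
    return f"{worker_dir}/Dockerfile.{arch}"
```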
Songbird
8bf5e1bb77 refactor: replace .env.example with .env.template in documentation
- Remove volumes/env/.env.example file
- Update all documentation references to use .env.template instead
- Update bootstrap script error message
- Update .gitignore comment
2025-11-04 14:04:33 +01:00
Songbird
97d8af4c52 fix: add default values to llm_analysis workflow parameters
Resolves validation error where agent_url was None when not explicitly provided. The TemporalManager applies defaults from metadata.yaml, not from module input schemas, so all parameters need defaults in the workflow metadata.

Changes:
- Add default agent_url, llm_model (gpt-5-mini), llm_provider (openai)
- Expand file_patterns to 45 comprehensive patterns covering code, configs, secrets, and Docker files
- Increase default limits: max_files (10), max_file_size (100KB), timeout (90s)
2025-11-04 14:04:33 +01:00
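A parameter block carrying such defaults in metadata.yaml might look roughly like this (a hedged sketch — field names and values are illustrative, not the project's actual schema):

```yaml
parameters:
  agent_url:
    type: string
    default: "http://localhost:10100"   # hypothetical default
  llm_model:
    type: string
    default: "gpt-5-mini"
  llm_provider:
    type: string
    default: "openai"
  max_files:
    type: integer
    default: 10
```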
Songbird99
f77c3ff1e9 Feature/litellm proxy (#27)
* feat: seed governance config and responses routing

* Add env-configurable timeout for proxy providers

* Integrate LiteLLM OTEL collector and update docs

* Make .env.litellm optional for LiteLLM proxy

* Add LiteLLM proxy integration with model-agnostic virtual keys

Changes:
- Bootstrap generates 3 virtual keys with individual budgets (CLI: $100, Task-Agent: $25, Cognee: $50)
- Task-agent loads config at runtime via entrypoint script to wait for bootstrap completion
- All keys are model-agnostic by default (no LITELLM_DEFAULT_MODELS restrictions)
- Bootstrap handles database/env mismatch after docker prune by deleting stale aliases
- CLI and Cognee configured to use LiteLLM proxy with virtual keys
- Added comprehensive documentation in volumes/env/README.md

Technical details:
- task-agent entrypoint waits for keys in .env file before starting uvicorn
- Bootstrap creates/updates TASK_AGENT_API_KEY, COGNEE_API_KEY, and OPENAI_API_KEY
- Removed hardcoded API keys from docker-compose.yml
- All services route through http://localhost:10999 proxy

* Fix CLI not loading virtual keys from global .env

Project .env files with empty OPENAI_API_KEY values were overriding
the global virtual keys. Updated _load_env_file_if_exists to only
override with non-empty values.

* Fix agent executor not passing API key to LiteLLM

The agent was initializing LiteLlm without api_key or api_base,
causing authentication errors when using the LiteLLM proxy. Now
reads from OPENAI_API_KEY/LLM_API_KEY and LLM_ENDPOINT environment
variables and passes them to LiteLlm constructor.

* Auto-populate project .env with virtual key from global config

When running 'ff init', the command now checks for a global
volumes/env/.env file and automatically uses the OPENAI_API_KEY
virtual key if found. This ensures projects work with LiteLLM
proxy out of the box without manual key configuration.

* docs: Update README with LiteLLM configuration instructions

Add note about LITELLM_GEMINI_API_KEY configuration and clarify that OPENAI_API_KEY default value should not be changed as it's used for the LLM proxy.

* Refactor workflow parameters to use JSON Schema defaults

Consolidates parameter defaults into JSON Schema format, removing the separate default_parameters field. Adds extract_defaults_from_json_schema() helper to extract defaults from the standard schema structure. Updates LiteLLM proxy config to use LITELLM_OPENAI_API_KEY environment variable.

* Remove .env.example from task_agent

* Fix MDX syntax error in llm-proxy.md

* fix: apply default parameters from metadata.yaml automatically

Fixed TemporalManager.run_workflow() to correctly apply default parameter
values from workflow metadata.yaml files when parameters are not provided
by the caller.

Previous behavior:
- When workflow_params was empty {}, the condition
  `if workflow_params and 'parameters' in metadata` would fail
- Parameters would not be extracted from schema, resulting in workflows
  receiving only target_id with no other parameters

New behavior:
- Removed the `workflow_params and` requirement from the condition
- Now explicitly checks for defaults in parameter spec
- Applies defaults from metadata.yaml automatically when param not provided
- Workflows receive all parameters with proper fallback:
  provided value > metadata default > None

This makes metadata.yaml the single source of truth for parameter defaults,
removing the need for workflows to implement defensive default handling.

Affected workflows:
- llm_secret_detection (was failing with KeyError)
- All other workflows now benefit from automatic default application

Co-authored-by: tduhamel42 <tduhamel@fuzzinglabs.com>
2025-11-04 14:04:10 +01:00
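The core of the defaults fix described in the last commit above can be illustrated with a small sketch (simplified; the actual TemporalManager method does more than this):

```python
def apply_defaults(workflow_params: dict, metadata: dict) -> dict:
    """Merge caller-provided params with defaults from metadata.yaml.

    The old guard `if workflow_params and 'parameters' in metadata:`
    skipped default application entirely when the caller passed {}.
    Checking only the metadata fixes that: each parameter resolves as
    provided value > metadata default > None.
    """
    resolved = dict(workflow_params)
    for name, spec in metadata.get("parameters", {}).items():
        if name not in resolved:
            resolved[name] = spec.get("default")  # may be None
    return resolved
```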
tduhamel42
bd94d19d34 Merge pull request #28 from FuzzingLabs/feature/android-workflow-conversion
feat: Android Static Analysis Workflow with ARM64 Support
2025-10-24 17:22:49 +02:00
tduhamel42
b0a0d591e4 ci: support multi-platform Dockerfiles in worker validation
Updated worker validation script to accept both:
- Single Dockerfile pattern (existing workers)
- Multi-platform Dockerfile pattern (Dockerfile.amd64, Dockerfile.arm64, etc.)

This enables platform-aware worker architectures like the Android worker
which uses different Dockerfiles for x86_64 and ARM64 platforms.
2025-10-24 17:06:00 +02:00
tduhamel42
1fd525f904 fix: resolve linter errors in Android modules
- Remove unused imports from mobsf_scanner.py (asyncio, hashlib, json, Optional)
- Remove unused variables from opengrep_android.py (start_col, end_col)
- Remove duplicate Path import from workflow.py
2025-10-24 17:05:04 +02:00
tduhamel42
73dc26493d docs: update CHANGELOG with Android workflow and ARM64 support
Added [Unreleased] section documenting:
- Android Static Analysis Workflow (Jadx, OpenGrep, MobSF)
- Platform-Aware Worker Architecture with ARM64 support
- Python SAST Workflow
- CI/CD improvements and worker validation
- CLI enhancements
- Bug fixes and technical changes

Fixed date typo: 2025-01-16 → 2025-10-16
2025-10-24 16:52:48 +02:00
tduhamel42
b1a98dbf73 fix: make MobSFScanner import conditional for ARM64 compatibility
- Add try-except block to conditionally import MobSFScanner in modules/android/__init__.py
- Allows Android worker to start on ARM64 without MobSF dependencies (aiohttp)
- MobSF activity gracefully skips on ARM64 with clear warning message
- Remove workflow path detection logic (not needed - workflows receive directories)

Platform-aware architecture fully functional on ARM64:
- CLI detects ARM64 and selects Dockerfile.arm64 automatically
- Worker builds and runs without MobSF on ARM64
- Jadx successfully decompiles APKs (4145 files from BeetleBug.apk)
- OpenGrep finds security vulnerabilities (8 issues found)
- MobSF gracefully skips with warning on ARM64
- Graceful degradation working as designed

Tested with:
  ff workflow run android_static_analysis test_projects/android_test/ \
    --wait --no-interactive apk_path=BeetleBug.apk decompile_apk=true

Results: 8 security findings (1 ERROR, 7 WARNINGS)
2025-10-24 15:14:06 +02:00
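The conditional-import pattern described above looks roughly like this (a sketch of the try-except approach; the actual module paths and flag name may differ):

```python
# modules/android/__init__.py (sketch)
try:
    # aiohttp (a MobSF dependency) may be missing on ARM64 builds.
    from .mobsf_scanner import MobSFScanner
    MOBSF_AVAILABLE = True
except ImportError as exc:
    import logging
    logging.getLogger(__name__).warning(
        "MobSFScanner unavailable on this platform: %s", exc
    )
    MobSFScanner = None
    MOBSF_AVAILABLE = False
```

Callers then check the flag (or `MobSFScanner is None`) and skip the MobSF activity with a warning instead of crashing at import time.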
tduhamel42
0801ca3d78 feat: add platform-aware worker architecture with ARM64 support
Implement platform-specific Dockerfile selection and graceful tool degradation to support both x86_64 and ARM64 (Apple Silicon) platforms.

**Backend Changes:**
- Add system info API endpoint (/system/info) exposing host filesystem paths
- Add FUZZFORGE_HOST_ROOT environment variable to backend service
- Add graceful degradation in MobSF activity for ARM64 platforms

**CLI Changes:**
- Implement multi-strategy path resolution (backend API, .fuzzforge marker, env var)
- Add platform detection (linux/amd64 vs linux/arm64)
- Add worker metadata.yaml reading for platform capabilities
- Auto-select appropriate Dockerfile based on detected platform
- Pass platform-specific env vars to docker-compose

**Worker Changes:**
- Create workers/android/metadata.yaml defining platform capabilities
- Rename Dockerfile -> Dockerfile.amd64 (full toolchain with MobSF)
- Create Dockerfile.arm64 (excludes MobSF due to Rosetta 2 incompatibility)
- Update docker-compose.yml to use ${ANDROID_DOCKERFILE} variable

**Workflow Changes:**
- Handle MobSF "skipped" status gracefully in workflow
- Log clear warnings when tools are unavailable on platform

**Key Features:**
- Automatic platform detection and Dockerfile selection
- Graceful degradation when tools unavailable (MobSF on ARM64)
- Works from any directory (backend API provides paths)
- Manual override via environment variables
- Clear user feedback about platform and selected Dockerfile

**Benefits:**
- Android workflow now works on Apple Silicon Macs
- No code changes needed for other workflows
- Convention established for future platform-specific workers

Closes: MobSF Rosetta 2 incompatibility issue
Implements: Platform-aware worker architecture (Option B)
2025-10-23 16:43:17 +02:00
tduhamel42
1d3e033bcc fix(android): correct activity names and MobSF API key generation
- Fix activity names in workflow.py (get_target, upload_results, cleanup_cache)
- Fix MobSF API key generation in Dockerfile startup script (cut delimiter)
- Update activity parameter signatures to match actual implementations
- Workflow now executes successfully with Jadx and OpenGrep
2025-10-23 16:36:39 +02:00
tduhamel42
cfcbe91610 feat: Add Android static analysis workflow with Jadx, OpenGrep, and MobSF
Comprehensive Android security testing workflow converted from Prefect to Temporal architecture:

Modules (3):
- JadxDecompiler: APK to Java source code decompilation
- OpenGrepAndroid: Static analysis with Android-specific security rules
- MobSFScanner: Comprehensive mobile security framework integration

Custom Rules (13):
- clipboard-sensitive-data, hardcoded-secrets, insecure-data-storage
- insecure-deeplink, insecure-logging, intent-redirection
- sensitive_data_sharedPreferences, sqlite-injection
- vulnerable-activity, vulnerable-content-provider, vulnerable-service
- webview-javascript-enabled, webview-load-arbitrary-url

Workflow:
- 6-phase Temporal workflow: download → Jadx → OpenGrep → MobSF → SARIF → upload
- 4 activities: decompile_with_jadx, scan_with_opengrep, scan_with_mobsf, generate_android_sarif
- SARIF output combining findings from all security tools

Docker Worker:
- ARM64 Mac compatibility via amd64 platform emulation
- Pre-installed: Android SDK, Jadx 1.4.7, OpenGrep 1.45.0, MobSF 3.9.7
- MobSF runs as background service with API key auto-generation
- Added aiohttp for async HTTP communication

Test APKs:
- BeetleBug.apk and shopnest.apk for workflow validation
2025-10-23 10:25:52 +02:00
tduhamel42
e180431b1e Merge pull request #24 from FuzzingLabs/fix/cleanup-and-bugs
fix: resolve live monitoring bug, remove deprecated parameters, and auto-start Python worker
2025-10-22 17:12:08 +02:00
tduhamel42
6ca5cf36c0 fix: resolve linter errors and optimize CI worker builds
- Remove unused Literal import from backend findings model
- Remove unnecessary f-string prefixes in CLI findings command
- Optimize GitHub Actions to build only modified workers
  - Detect specific worker changes (python, secrets, rust, android, ossfuzz)
  - Build only changed workers instead of all 5
  - Build all workers if docker-compose.yml changes
  - Significantly reduces CI build time
2025-10-22 16:56:51 +02:00
tduhamel42
09951d68d7 fix: resolve live monitoring bug, remove deprecated parameters, and auto-start Python worker
- Fix live monitoring style error by calling _live_monitor() helper directly
- Remove default_parameters duplication from 10 workflow metadata files
- Remove deprecated volume_mode parameter from 26 files across CLI, SDK, backend, and docs
- Configure Python worker to start automatically with docker compose up
- Clean up constants, validation, completion, and example files

Fixes #
- Live monitoring now works correctly with --live flag
- Workflow metadata follows JSON Schema standard
- Cleaner codebase without deprecated volume_mode
- Python worker (most commonly used) starts by default
2025-10-22 16:26:58 +02:00
tduhamel42
1c3c7a801e Merge pull request #23 from FuzzingLabs/feature/python-sast-workflow
feat: Add Python SAST workflow (Issue #5)
2025-10-22 15:55:26 +02:00
tduhamel42
66e797a0e7 fix: Remove unused imports to pass linter 2025-10-22 15:36:35 +02:00
tduhamel42
9468a8b023 feat: Add Python SAST workflow with three security analysis tools
Implements Issue #5 - Python SAST workflow that combines:
- Dependency scanning (pip-audit) for CVE detection
- Security linting (Bandit) for vulnerability patterns
- Type checking (Mypy) for type safety issues

## Changes

**New Modules:**
- `DependencyScanner`: Scans Python dependencies for known CVEs using pip-audit
- `BanditAnalyzer`: Analyzes Python code for security issues using Bandit
- `MypyAnalyzer`: Checks Python code for type safety issues using Mypy

**New Workflow:**
- `python_sast`: Temporal workflow that orchestrates all three SAST tools
  - Runs tools in parallel for fast feedback (3-5 min vs hours for fuzzing)
  - Generates unified SARIF report with findings from all tools
  - Supports configurable severity/confidence thresholds

**Updates:**
- Added SAST dependencies to Python worker (bandit, pip-audit, mypy)
- Updated module __init__.py files to export new analyzers
- Added type_errors.py test file to vulnerable_app for Mypy validation

## Testing

Workflow tested successfully on vulnerable_app:
- Bandit: Detected 9 security issues (command injection, unsafe functions)
- Mypy: Detected 5 type errors
- DependencyScanner: Ran successfully (no CVEs in test dependencies)
- SARIF export: Generated valid SARIF with 14 total findings
2025-10-22 15:28:19 +02:00
tduhamel42
6e4241a15f fix: properly detect worker file changes in CI
The previous condition used invalid GitHub context field.
Now uses git diff to properly detect changes to workers/ or docker-compose.yml.

Behavior:
- Job always runs the check step
- Detects if workers/ or docker-compose.yml modified
- Only builds Docker images if workers actually changed
- Shows clear skip message when no worker changes detected
2025-10-22 11:51:32 +02:00
tduhamel42
d68344867b fix: add dev branch to test workflow triggers
The test workflow was configured for 'develop' but the actual branch is named 'dev'.
This caused tests not to run on PRs to dev branch.

Now tests will run on:
- PRs to: main, master, dev, develop
- Pushes to: main, master, dev, develop, feature/**
2025-10-22 11:49:06 +02:00
tduhamel42
f5554d0836 Merge pull request #22 from FuzzingLabs/ci/worker-validation-and-docker-builds
ci: add worker validation and Docker build checks
2025-10-22 11:46:58 +02:00
tduhamel42
3e949b2ae8 ci: add worker validation and Docker build checks
Add automated validation to prevent worker-related issues:

**Worker Validation Script:**
- New script: .github/scripts/validate-workers.sh
- Validates all workers in docker-compose.yml exist
- Checks required files: Dockerfile, requirements.txt, worker.py
- Verifies files are tracked by git (not gitignored)
- Detects gitignore issues that could hide workers

**CI Workflow Updates:**
- Added validate-workers job (runs on every PR)
- Added build-workers job (runs if workers/ modified)
- Uses Docker Buildx for caching
- Validates Docker images build successfully
- Updated test-summary to check validation results

**PR Template:**
- New pull request template with comprehensive checklist
- Specific section for worker-related changes
- Reminds contributors to validate worker files
- Includes documentation and changelog reminders

These checks would have caught the secrets worker gitignore issue.

Implements Phase 1 improvements from CI/CD quality assessment.
2025-10-22 11:45:04 +02:00
tduhamel42
731927667d fix: Change default llm_secret_detection to gpt-5-mini 2025-10-22 10:17:41 +02:00
tduhamel42
75df59ddef fix: add missing secrets worker to repository
The secrets worker was being ignored due to broad gitignore pattern.
Added exception to allow workers/secrets/ directory while still ignoring actual secrets.

Files added:
- workers/secrets/Dockerfile
- workers/secrets/requirements.txt
- workers/secrets/worker.py
2025-10-22 08:39:20 +02:00
tduhamel42
4e14b4207d Merge pull request #20 from FuzzingLabs/dev
Release: v0.7.1 - Worker fixes, monitor consolidation, and findings improvements
2025-10-21 16:59:44 +02:00
tduhamel42
4cf4a1e5e8 Merge pull request #19 from FuzzingLabs/fix/worker-naming-and-compose-version
fix: worker naming, monitor commands, and findings CLI improvements
2025-10-21 16:54:51 +02:00
tduhamel42
076ec71482 fix: worker naming, monitor commands, and findings CLI improvements
This PR addresses multiple issues and improvements across the CLI and backend:

**Worker Naming Fixes:**
- Fix worker container naming mismatch between CLI and docker-compose
- Update worker_manager.py to use docker compose commands with service names
- Remove worker_container field from workflows API, keep only worker_service
- Backend now correctly uses service names (worker-python, worker-secrets, etc.)

**Backend API Fixes:**
- Fix workflow name extraction from run_id in runs.py (was showing "unknown")
- Update monitor command suggestions from 'monitor stats' to 'monitor live'

**Monitor Command Consolidation:**
- Merge 'monitor stats' and 'monitor live' into single 'monitor live' command
- Add --once and --style flags for flexibility
- Remove all references to deprecated 'monitor stats' command

**Findings CLI Structure Improvements (Closes #18):**
- Move 'show' command from 'findings' (plural) to 'finding' (singular)
- Keep 'export' command in 'findings' (plural) as it exports all findings
- Remove broken 'analyze' command (imported non-existent function)
- Update all command suggestions to use correct paths
- Fix smart routing logic in main.py to handle new command structure
- Add export suggestions after viewing findings with unique timestamps
- Change default export format to SARIF (industry standard)

**Docker Compose:**
- Remove obsolete version field to fix deprecation warning

All commands tested and working:
- ff finding show <run-id> --rule <rule-id> ✓
- ff findings export <run-id> ✓
- ff finding <run-id> (direct viewing) ✓
- ff monitor live <run-id> ✓
2025-10-21 16:53:08 +02:00
tduhamel42
f200cb6fb7 docs: add worker startup instructions to quickstart and tutorial 2025-10-17 11:46:40 +02:00
tduhamel42
a72a0072df Merge pull request #17 from FuzzingLabs/docs/update-temporal-architecture
docs: Update documentation for v0.7.0 Temporal architecture
2025-10-17 11:02:02 +02:00
tduhamel42
fe58b39abf fix: Add benchmark results files to git
- Added exception in .gitignore for benchmark results directory
- Force-added comparison_report.md and comparison_results.json
- These files contain benchmark metrics, not actual secrets
- Fixes broken link in README to benchmark results
2025-10-17 09:56:09 +02:00
138 changed files with 10332 additions and 1707 deletions

.github/pull_request_template.md (new file)

@@ -0,0 +1,79 @@
## Description
<!-- Provide a brief description of the changes in this PR -->
## Type of Change
<!-- Mark the appropriate option with an 'x' -->
- [ ] 🐛 Bug fix (non-breaking change which fixes an issue)
- [ ] ✨ New feature (non-breaking change which adds functionality)
- [ ] 💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
- [ ] 📝 Documentation update
- [ ] 🔧 Configuration change
- [ ] ♻️ Refactoring (no functional changes)
- [ ] 🎨 Style/formatting changes
- [ ] ✅ Test additions or updates
## Related Issues
<!-- Link to related issues using #issue_number -->
<!-- Example: Closes #123, Relates to #456 -->
## Changes Made
<!-- List the specific changes made in this PR -->
-
-
-
## Testing
<!-- Describe the tests you ran to verify your changes -->
### Tested Locally
- [ ] All tests pass (`pytest`, `uv build`, etc.)
- [ ] Linting passes (`ruff check`)
- [ ] Code builds successfully
### Worker Changes (if applicable)
- [ ] Docker images build successfully (`docker compose build`)
- [ ] Worker containers start correctly
- [ ] Tested with actual workflow execution
### Documentation
- [ ] Documentation updated (if needed)
- [ ] README updated (if needed)
- [ ] CHANGELOG.md updated (if user-facing changes)
## Pre-Merge Checklist
<!-- Ensure all items are completed before requesting review -->
- [ ] My code follows the project's coding standards
- [ ] I have performed a self-review of my code
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [ ] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] New and existing unit tests pass locally with my changes
- [ ] Any dependent changes have been merged and published
### Worker-Specific Checks (if workers/ modified)
- [ ] All worker files properly tracked by git (not gitignored)
- [ ] Worker validation script passes (`.github/scripts/validate-workers.sh`)
- [ ] Docker images build without errors
- [ ] Worker configuration updated in `docker-compose.yml` (if needed)
## Screenshots (if applicable)
<!-- Add screenshots to help explain your changes -->
## Additional Notes
<!-- Any additional information that reviewers should know -->

.github/scripts/validate-workers.sh (new executable file)

@@ -0,0 +1,127 @@
#!/bin/bash
# Worker Validation Script
# Ensures all workers defined in docker-compose.yml exist in the repository
# and are properly tracked by git.
set -e
echo "🔍 Validating worker completeness..."
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color
ERRORS=0
WARNINGS=0
# Extract worker service names from docker-compose.yml
echo ""
echo "📋 Checking workers defined in docker-compose.yml..."
WORKERS=$(grep -E "^\s+worker-" docker-compose.yml | grep -v "#" | cut -d: -f1 | tr -d ' ' | sort -u)
if [ -z "$WORKERS" ]; then
echo -e "${RED}❌ No workers found in docker-compose.yml${NC}"
exit 1
fi
echo "Found workers:"
for worker in $WORKERS; do
echo " - $worker"
done
# Check each worker
echo ""
echo "🔎 Validating worker files..."
for worker in $WORKERS; do
WORKER_DIR="workers/${worker#worker-}"
echo ""
echo "Checking $worker ($WORKER_DIR)..."
# Check if directory exists
if [ ! -d "$WORKER_DIR" ]; then
echo -e "${RED} ❌ Directory not found: $WORKER_DIR${NC}"
ERRORS=$((ERRORS + 1))
continue
fi
# Check Dockerfile (single file or multi-platform pattern)
if [ -f "$WORKER_DIR/Dockerfile" ]; then
# Single Dockerfile
if ! git ls-files --error-unmatch "$WORKER_DIR/Dockerfile" &> /dev/null; then
echo -e "${RED} ❌ File not tracked by git: $WORKER_DIR/Dockerfile${NC}"
echo -e "${YELLOW} Check .gitignore patterns!${NC}"
ERRORS=$((ERRORS + 1))
else
echo -e "${GREEN} ✓ Dockerfile (tracked)${NC}"
fi
elif compgen -G "$WORKER_DIR/Dockerfile.*" > /dev/null; then
# Multi-platform Dockerfiles (e.g., Dockerfile.amd64, Dockerfile.arm64)
PLATFORM_DOCKERFILES=$(ls "$WORKER_DIR"/Dockerfile.* 2>/dev/null)
DOCKERFILE_FOUND=false
for dockerfile in $PLATFORM_DOCKERFILES; do
if git ls-files --error-unmatch "$dockerfile" &> /dev/null; then
echo -e "${GREEN}  ✓ $(basename "$dockerfile") (tracked)${NC}"
DOCKERFILE_FOUND=true
else
echo -e "${RED} ❌ File not tracked by git: $dockerfile${NC}"
ERRORS=$((ERRORS + 1))
fi
done
if [ "$DOCKERFILE_FOUND" = false ]; then
echo -e "${RED} ❌ No platform-specific Dockerfiles found${NC}"
ERRORS=$((ERRORS + 1))
fi
else
echo -e "${RED} ❌ Missing Dockerfile or Dockerfile.* files${NC}"
ERRORS=$((ERRORS + 1))
fi
# Check other required files
REQUIRED_FILES=("requirements.txt" "worker.py")
for file in "${REQUIRED_FILES[@]}"; do
FILE_PATH="$WORKER_DIR/$file"
if [ ! -f "$FILE_PATH" ]; then
echo -e "${RED} ❌ Missing file: $FILE_PATH${NC}"
ERRORS=$((ERRORS + 1))
else
# Check if file is tracked by git
if ! git ls-files --error-unmatch "$FILE_PATH" &> /dev/null; then
echo -e "${RED} ❌ File not tracked by git: $FILE_PATH${NC}"
echo -e "${YELLOW} Check .gitignore patterns!${NC}"
ERRORS=$((ERRORS + 1))
else
echo -e "${GREEN}  ✓ $file (tracked)${NC}"
fi
fi
done
done
# Check for any ignored worker files
echo ""
echo "🚫 Checking for gitignored worker files..."
IGNORED_FILES=$(git check-ignore workers/*/* 2>/dev/null || true)
if [ -n "$IGNORED_FILES" ]; then
echo -e "${YELLOW}⚠️ Warning: Some worker files are being ignored:${NC}"
echo "$IGNORED_FILES" | while read -r file; do
echo -e "${YELLOW} - $file${NC}"
done
WARNINGS=$((WARNINGS + 1))
fi
# Summary
echo ""
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
if [ $ERRORS -eq 0 ] && [ $WARNINGS -eq 0 ]; then
echo -e "${GREEN}✅ All workers validated successfully!${NC}"
exit 0
elif [ $ERRORS -eq 0 ]; then
echo -e "${YELLOW}⚠️ Validation passed with $WARNINGS warning(s)${NC}"
exit 0
else
echo -e "${RED}❌ Validation failed with $ERRORS error(s) and $WARNINGS warning(s)${NC}"
exit 1
fi


@@ -2,11 +2,100 @@ name: Tests
on:
push:
-branches: [ main, master, develop, feature/** ]
+branches: [ main, master, dev, develop, feature/** ]
pull_request:
-branches: [ main, master, develop ]
+branches: [ main, master, dev, develop ]
jobs:
validate-workers:
name: Validate Workers
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Run worker validation
run: |
chmod +x .github/scripts/validate-workers.sh
.github/scripts/validate-workers.sh
build-workers:
name: Build Worker Docker Images
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0 # Fetch all history for proper diff
- name: Check which workers were modified
id: check-workers
run: |
if [ "${{ github.event_name }}" == "pull_request" ]; then
# For PRs, check changed files
CHANGED_FILES=$(git diff --name-only origin/${{ github.base_ref }}...HEAD)
echo "Changed files:"
echo "$CHANGED_FILES"
else
# For direct pushes, check last commit
CHANGED_FILES=$(git diff --name-only HEAD~1 HEAD)
fi
# Check if docker-compose.yml changed (build all workers)
if echo "$CHANGED_FILES" | grep -q "^docker-compose.yml"; then
echo "workers_to_build=worker-python worker-secrets worker-rust worker-android worker-ossfuzz" >> $GITHUB_OUTPUT
echo "workers_modified=true" >> $GITHUB_OUTPUT
echo "✅ docker-compose.yml modified - building all workers"
exit 0
fi
# Detect which specific workers changed
WORKERS_TO_BUILD=""
if echo "$CHANGED_FILES" | grep -q "^workers/python/"; then
WORKERS_TO_BUILD="$WORKERS_TO_BUILD worker-python"
echo "✅ Python worker modified"
fi
if echo "$CHANGED_FILES" | grep -q "^workers/secrets/"; then
WORKERS_TO_BUILD="$WORKERS_TO_BUILD worker-secrets"
echo "✅ Secrets worker modified"
fi
if echo "$CHANGED_FILES" | grep -q "^workers/rust/"; then
WORKERS_TO_BUILD="$WORKERS_TO_BUILD worker-rust"
echo "✅ Rust worker modified"
fi
if echo "$CHANGED_FILES" | grep -q "^workers/android/"; then
WORKERS_TO_BUILD="$WORKERS_TO_BUILD worker-android"
echo "✅ Android worker modified"
fi
if echo "$CHANGED_FILES" | grep -q "^workers/ossfuzz/"; then
WORKERS_TO_BUILD="$WORKERS_TO_BUILD worker-ossfuzz"
echo "✅ OSS-Fuzz worker modified"
fi
if [ -z "$WORKERS_TO_BUILD" ]; then
echo "workers_modified=false" >> $GITHUB_OUTPUT
echo "⏭️ No worker changes detected - skipping build"
else
echo "workers_to_build=$WORKERS_TO_BUILD" >> $GITHUB_OUTPUT
echo "workers_modified=true" >> $GITHUB_OUTPUT
echo "Building workers:$WORKERS_TO_BUILD"
fi
- name: Set up Docker Buildx
if: steps.check-workers.outputs.workers_modified == 'true'
uses: docker/setup-buildx-action@v3
- name: Build worker images
if: steps.check-workers.outputs.workers_modified == 'true'
run: |
WORKERS="${{ steps.check-workers.outputs.workers_to_build }}"
echo "Building worker Docker images: $WORKERS"
docker compose build $WORKERS --no-cache
continue-on-error: false
lint:
name: Lint
runs-on: ubuntu-latest
@@ -21,7 +110,7 @@ jobs:
- name: Install dependencies
run: |
python -m pip install --upgrade pip
-pip install ruff mypy
+pip install ruff mypy bandit
- name: Run ruff
run: ruff check backend/src backend/toolbox backend/tests backend/benchmarks --output-format=github
@@ -30,6 +119,10 @@ jobs:
run: mypy backend/src backend/toolbox || true
continue-on-error: true
- name: Run bandit (continue on error)
run: bandit --recursive backend/src || true
continue-on-error: true
unit-tests:
name: Unit Tests
runs-on: ubuntu-latest
@@ -143,11 +236,15 @@ jobs:
test-summary:
name: Test Summary
runs-on: ubuntu-latest
-needs: [lint, unit-tests]
+needs: [validate-workers, lint, unit-tests]
if: always()
steps:
- name: Check test results
run: |
if [ "${{ needs.validate-workers.result }}" != "success" ]; then
echo "Worker validation failed"
exit 1
fi
if [ "${{ needs.unit-tests.result }}" != "success" ]; then
echo "Unit tests failed"
exit 1

.gitignore vendored

@@ -188,6 +188,10 @@ logs/
# Docker volume configs (keep .env.example but ignore actual .env)
volumes/env/.env
# Vendored proxy sources (kept locally for reference)
ai/proxy/bifrost/
ai/proxy/litellm/
# Test project databases and configurations
test_projects/*/.fuzzforge/
test_projects/*/findings.db*
@@ -240,6 +244,10 @@ yarn-error.log*
!**/secret_detection_benchmark_GROUND_TRUTH.json
!**/secret_detection/results/
# Exception: Allow workers/secrets/ directory (secrets detection worker)
!workers/secrets/
!workers/secrets/**
secret*
secrets/
credentials*
@@ -300,4 +308,8 @@ test_projects/*/.npmrc
test_projects/*/.git-credentials
test_projects/*/credentials.*
test_projects/*/api_keys.*
test_projects/*/ci-*.sh
test_projects/*/ci-*.sh
# -------------------- Internal Documentation --------------------
# Weekly summaries and temporary project documentation
WEEK_SUMMARY*.md

CHANGELOG.md

@@ -5,7 +5,118 @@ All notable changes to FuzzForge will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [0.7.0] - 2025-01-16
## [Unreleased]
### 📝 Documentation
- Added comprehensive worker startup documentation across all guides
- Added workflow-to-worker mapping tables in README, troubleshooting guide, getting started guide, and docker setup guide
- Fixed broken documentation links in CLI reference
- Added WEEK_SUMMARY*.md pattern to .gitignore
---
## [0.7.3] - 2025-10-30
### 🎯 Major Features
#### Android Static Analysis Workflow
- **Added comprehensive Android security testing workflow** (`android_static_analysis`):
- Jadx decompiler for APK → Java source code decompilation
- OpenGrep/Semgrep static analysis with custom Android security rules
- MobSF integration for comprehensive mobile security scanning
- SARIF report generation with unified findings format
- Test results: Successfully decompiled 4,145 Java files, found 8 security vulnerabilities
- Full workflow completes in ~1.5 minutes
#### Platform-Aware Worker Architecture
- **ARM64 (Apple Silicon) support**:
- Automatic platform detection (ARM64 vs x86_64) in CLI using `platform.machine()`
- Worker metadata convention (`metadata.yaml`) for platform-specific capabilities
- Multi-Dockerfile support: `Dockerfile.amd64` (full toolchain) and `Dockerfile.arm64` (optimized)
- Conditional module imports for graceful degradation (MobSF skips on ARM64)
- Backend path resolution via `FUZZFORGE_HOST_ROOT` for CLI worker management
- **Worker selection logic**:
- CLI automatically selects appropriate Dockerfile based on detected platform
- Multi-strategy path resolution (API → .fuzzforge marker → environment variable)
- Platform-specific tool availability documented in metadata
#### Python SAST Workflow
- **Added Python Static Application Security Testing workflow** (`python_sast`):
- Bandit for Python security linting (SAST)
- MyPy for static type checking
- Safety for dependency vulnerability scanning
- Integrated SARIF reporter for unified findings format
- Auto-start Python worker on-demand
### ✨ Enhancements
#### CI/CD Improvements
- Added automated worker validation in CI pipeline
- Docker build checks for all workers before merge
- Worker file change detection for selective builds
- Optimized Docker layer caching for faster builds
- Dev branch testing workflow triggers
#### CLI Improvements
- Fixed live monitoring bug in `ff monitor live` command
- Enhanced `ff findings` command with better table formatting
- Improved `ff monitor` with clearer status displays
- Auto-start workers on-demand when workflows require them
- Better error messages with actionable manual start commands
#### Worker Management
- Standardized worker service names (`worker-python`, `worker-android`, etc.)
- Added missing `worker-secrets` to repository
- Improved worker naming consistency across codebase
#### LiteLLM Integration
- Centralized LLM provider management with proxy
- Governance and request/response routing
- OTEL collector integration for observability
- Environment-based configurable timeouts
- Optional `.env.litellm` configuration
### 🐛 Bug Fixes
- Fixed MobSF API key generation from secret file (SHA256 hash)
- Corrected Temporal activity names (decompile_with_jadx, scan_with_opengrep, scan_with_mobsf)
- Resolved linter errors across codebase
- Fixed unused import issues to pass CI checks
- Removed deprecated workflow parameters
- Docker Compose version compatibility fixes
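The MobSF API key fix above amounts to hashing the contents of a secret file. A minimal sketch, assuming the key is the SHA-256 hex digest of the stripped secret (the function name and exact derivation are assumptions, not the project's API):

```python
import hashlib
from pathlib import Path

def derive_mobsf_api_key(secret_file: Path) -> str:
    """Derive an API key as the SHA-256 hex digest of a secret file's contents."""
    secret = secret_file.read_bytes().strip()  # drop a trailing newline, if any
    return hashlib.sha256(secret).hexdigest()
```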
### 🔧 Technical Changes
- Conditional import pattern for optional dependencies (MobSF on ARM64)
- Multi-platform Dockerfile architecture
- Worker metadata convention for capability declaration
- Improved CI worker build optimization
- Enhanced storage activity error handling
### 📝 Test Projects
- Added `test_projects/android_test/` with BeetleBug.apk and shopnest.apk
- Android workflow validation with real APK samples
- ARM64 platform testing and validation
---
## [0.7.2] - 2025-10-22
### 🐛 Bug Fixes
- Fixed worker naming inconsistencies across codebase
- Improved monitor command consolidation and usability
- Enhanced findings CLI with better formatting and display
- Added missing secrets worker to repository
### 📝 Documentation
- Added benchmark results files to git for secret detection workflows
**Note:** v0.7.1 was re-tagged as v0.7.2 (both point to the same commit)
---
## [0.7.0] - 2025-10-16
### 🎯 Major Features
@@ -40,7 +151,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
#### Documentation
- Updated README for Temporal + MinIO architecture
- Removed obsolete `volume_mode` references across all documentation
- Added `.env` configuration guide for AI agent API keys
- Fixed worker startup instructions with correct service names
- Updated docker compose commands to modern syntax
@@ -52,6 +162,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### 🐛 Bug Fixes
- Fixed default parameters from metadata.yaml not being applied to workflows when no parameters provided
- Fixed gitleaks workflow failing on uploaded directories without Git history
- Fixed worker startup command suggestions (now uses `docker compose up -d` with service names)
- Fixed missing `cognify_text` method in CogneeProjectIntegration
@@ -71,7 +182,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
---
## [0.6.0] - 2024-12-XX
## [0.6.0] - Undocumented
### Features
- Initial Temporal migration
@@ -79,7 +190,11 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Security assessment workflow
- Basic CLI commands
**Note:** No git tag exists for v0.6.0. Release date undocumented.
---
[0.7.0]: https://github.com/FuzzingLabs/fuzzforge_ai/compare/v0.6.0...v0.7.0
[0.6.0]: https://github.com/FuzzingLabs/fuzzforge_ai/releases/tag/v0.6.0
[0.7.3]: https://github.com/FuzzingLabs/fuzzforge_ai/compare/v0.7.2...v0.7.3
[0.7.2]: https://github.com/FuzzingLabs/fuzzforge_ai/compare/v0.7.0...v0.7.2
[0.7.0]: https://github.com/FuzzingLabs/fuzzforge_ai/releases/tag/v0.7.0
[0.6.0]: https://github.com/FuzzingLabs/fuzzforge_ai/tree/v0.6.0

README.md

@@ -10,7 +10,7 @@
<a href="LICENSE"><img src="https://img.shields.io/badge/license-BSL%20%2B%20Apache-orange" alt="License: BSL + Apache"></a>
<a href="https://www.python.org/downloads/"><img src="https://img.shields.io/badge/python-3.11%2B-blue" alt="Python 3.11+"/></a>
<a href="https://fuzzforge.ai"><img src="https://img.shields.io/badge/Website-fuzzforge.ai-blue" alt="Website"/></a>
<img src="https://img.shields.io/badge/version-0.7.0-green" alt="Version">
<img src="https://img.shields.io/badge/version-0.7.3-green" alt="Version">
<a href="https://github.com/FuzzingLabs/fuzzforge_ai/stargazers"><img src="https://img.shields.io/github/stars/FuzzingLabs/fuzzforge_ai?style=social" alt="GitHub Stars"></a>
</p>
@@ -115,9 +115,11 @@ For containerized workflows, see the [Docker Installation Guide](https://docs.do
For AI-powered workflows, configure your LLM API keys:
```bash
cp volumes/env/.env.example volumes/env/.env
cp volumes/env/.env.template volumes/env/.env
# Edit volumes/env/.env and add your API keys (OpenAI, Anthropic, Google, etc.)
# Add your key to LITELLM_GEMINI_API_KEY
```
> Don't change the `OPENAI_API_KEY` default value, as it is used for the LLM proxy.
This is required for:
- `llm_secret_detection` workflow
@@ -150,16 +152,31 @@ git clone https://github.com/fuzzinglabs/fuzzforge_ai.git
cd fuzzforge_ai
# 2. Copy the default LLM env config
cp volumes/env/.env.example volumes/env/.env
cp volumes/env/.env.template volumes/env/.env
# 3. Start FuzzForge with Temporal
docker compose up -d
# 4. Start the Python worker (needed for security_assessment workflow)
docker compose up -d worker-python
```
> The first launch can take 2-3 minutes for services to initialize ☕
>
> Workers don't auto-start by default (saves RAM). Start the worker you need before running workflows.
**Workflow-to-Worker Quick Reference:**
| Workflow | Worker Required | Startup Command |
|----------|----------------|-----------------|
| `security_assessment`, `python_sast`, `llm_analysis`, `atheris_fuzzing` | worker-python | `docker compose up -d worker-python` |
| `android_static_analysis` | worker-android | `docker compose up -d worker-android` |
| `cargo_fuzzing` | worker-rust | `docker compose up -d worker-rust` |
| `ossfuzz_campaign` | worker-ossfuzz | `docker compose up -d worker-ossfuzz` |
| `llm_secret_detection`, `trufflehog_detection`, `gitleaks_detection` | worker-secrets | `docker compose up -d worker-secrets` |
```bash
# 3. Run your first workflow (files are automatically uploaded)
# 5. Run your first workflow (files are automatically uploaded)
cd test_projects/vulnerable_app/
fuzzforge init # Initialize FuzzForge project
ff workflow run security_assessment . # Start workflow - CLI uploads files automatically!
@@ -172,7 +189,7 @@ ff workflow run security_assessment . # Start workflow - CLI uploads files au
```
**What's running:**
- **Temporal**: Workflow orchestration (UI at http://localhost:8233)
- **Temporal**: Workflow orchestration (UI at http://localhost:8080)
- **MinIO**: File storage for targets (Console at http://localhost:9001)
- **Vertical Workers**: Pre-built workers with security toolchains
- **Backend API**: FuzzForge REST API (http://localhost:8000)


@@ -1,10 +0,0 @@
# Default LiteLLM configuration
LITELLM_MODEL=gemini/gemini-2.0-flash-001
# LITELLM_PROVIDER=gemini
# API keys (uncomment and fill as needed)
# GOOGLE_API_KEY=
# OPENAI_API_KEY=
# ANTHROPIC_API_KEY=
# OPENROUTER_API_KEY=
# MISTRAL_API_KEY=


@@ -16,4 +16,9 @@ COPY . /app/agent_with_adk_format
WORKDIR /app/agent_with_adk_format
ENV PYTHONPATH=/app
# Copy and set up entrypoint
COPY docker-entrypoint.sh /docker-entrypoint.sh
RUN chmod +x /docker-entrypoint.sh
ENTRYPOINT ["/docker-entrypoint.sh"]
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]


@@ -43,18 +43,34 @@ cd task_agent
# cp .env.example .env
```
Edit `.env` (or `.env.example`) and add your API keys. The agent must be restarted after changes so the values are picked up:
Edit `.env` (or `.env.example`) and add your proxy + API keys. The agent must be restarted after changes so the values are picked up:
```bash
# Set default model
LITELLM_MODEL=gemini/gemini-2.0-flash-001
# Route every request through the proxy container (use http://localhost:10999 from the host)
FF_LLM_PROXY_BASE_URL=http://llm-proxy:4000
# Add API keys for providers you want to use
GOOGLE_API_KEY=your_google_api_key
OPENAI_API_KEY=your_openai_api_key
ANTHROPIC_API_KEY=your_anthropic_api_key
OPENROUTER_API_KEY=your_openrouter_api_key
# Default model + provider the agent boots with
LITELLM_MODEL=openai/gpt-4o-mini
LITELLM_PROVIDER=openai
# Virtual key issued by the proxy to the task agent (bootstrap replaces the placeholder)
OPENAI_API_KEY=sk-proxy-default
# Upstream keys stay inside the proxy. Store real secrets under the LiteLLM
# aliases and the bootstrapper mirrors them into .env.litellm for the proxy container.
LITELLM_OPENAI_API_KEY=your_real_openai_api_key
LITELLM_ANTHROPIC_API_KEY=your_real_anthropic_key
LITELLM_GEMINI_API_KEY=your_real_gemini_key
LITELLM_MISTRAL_API_KEY=your_real_mistral_key
LITELLM_OPENROUTER_API_KEY=your_real_openrouter_key
```
> When running the agent outside of Docker, swap `FF_LLM_PROXY_BASE_URL` to the host port (default `http://localhost:10999`).
The bootstrap container provisions LiteLLM, copies provider secrets into
`volumes/env/.env.litellm`, and rewrites `volumes/env/.env` with the virtual key.
Populate the `LITELLM_*_API_KEY` values before the first launch so the proxy can
reach your upstream providers as soon as the bootstrap script runs.
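The mirroring step can be pictured roughly like this (the function name and filtering rule are assumptions; the real bootstrapper may behave differently):

```python
from pathlib import Path

def mirror_provider_keys(env: dict[str, str], litellm_env: Path) -> None:
    """Copy populated LITELLM_*_API_KEY values into the proxy's .env.litellm."""
    lines = [
        f"{name}={value}"
        for name, value in sorted(env.items())
        if name.startswith("LITELLM_") and name.endswith("_API_KEY") and value
    ]
    litellm_env.write_text("\n".join(lines) + "\n")
```

Only the `LITELLM_*` aliases cross into `.env.litellm`; the virtual key in `OPENAI_API_KEY` stays on the agent side.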
### 2. Install Dependencies
```bash


@@ -0,0 +1,31 @@
#!/bin/bash
set -e
# Wait for .env file to have keys (max 30 seconds)
echo "[task-agent] Waiting for virtual keys to be provisioned..."
for i in $(seq 1 30); do
if [ -f /app/config/.env ]; then
# Check if TASK_AGENT_API_KEY has a value (not empty)
KEY=$(grep -E '^TASK_AGENT_API_KEY=' /app/config/.env | cut -d'=' -f2)
if [ -n "$KEY" ] && [ "$KEY" != "" ]; then
echo "[task-agent] Virtual keys found, loading environment..."
# Export keys from .env file
export TASK_AGENT_API_KEY="$KEY"
export OPENAI_API_KEY=$(grep -E '^OPENAI_API_KEY=' /app/config/.env | cut -d'=' -f2)
export FF_LLM_PROXY_BASE_URL=$(grep -E '^FF_LLM_PROXY_BASE_URL=' /app/config/.env | cut -d'=' -f2)
echo "[task-agent] Loaded TASK_AGENT_API_KEY: ${TASK_AGENT_API_KEY:0:15}..."
echo "[task-agent] Loaded FF_LLM_PROXY_BASE_URL: $FF_LLM_PROXY_BASE_URL"
break
fi
fi
echo "[task-agent] Keys not ready yet, waiting... ($i/30)"
sleep 1
done
if [ -z "$TASK_AGENT_API_KEY" ]; then
echo "[task-agent] ERROR: Virtual keys were not provisioned within 30 seconds!"
exit 1
fi
echo "[task-agent] Starting uvicorn..."
exec "$@"


@@ -4,13 +4,28 @@ from __future__ import annotations
import os
def _normalize_proxy_base_url(raw_value: str | None) -> str | None:
if not raw_value:
return None
cleaned = raw_value.strip()
if not cleaned:
return None
# Avoid double slashes in downstream requests
return cleaned.rstrip("/")
AGENT_NAME = "litellm_agent"
AGENT_DESCRIPTION = (
"A LiteLLM-backed shell that exposes hot-swappable model and prompt controls."
)
DEFAULT_MODEL = os.getenv("LITELLM_MODEL", "gemini-2.0-flash-001")
DEFAULT_PROVIDER = os.getenv("LITELLM_PROVIDER")
DEFAULT_MODEL = os.getenv("LITELLM_MODEL", "openai/gpt-4o-mini")
DEFAULT_PROVIDER = os.getenv("LITELLM_PROVIDER") or None
PROXY_BASE_URL = _normalize_proxy_base_url(
os.getenv("FF_LLM_PROXY_BASE_URL")
or os.getenv("LITELLM_API_BASE")
or os.getenv("LITELLM_BASE_URL")
)
STATE_PREFIX = "app:litellm_agent/"
STATE_MODEL_KEY = f"{STATE_PREFIX}model"


@@ -3,11 +3,15 @@
from __future__ import annotations
from dataclasses import dataclass
import os
from typing import Any, Mapping, MutableMapping, Optional
import httpx
from .config import (
DEFAULT_MODEL,
DEFAULT_PROVIDER,
PROXY_BASE_URL,
STATE_MODEL_KEY,
STATE_PROMPT_KEY,
STATE_PROVIDER_KEY,
@@ -66,11 +70,109 @@ class HotSwapState:
"""Create a LiteLlm instance for the current state."""
from google.adk.models.lite_llm import LiteLlm # Lazy import to avoid cycle
from google.adk.models.lite_llm import LiteLLMClient
from litellm.types.utils import Choices, Message, ModelResponse, Usage
kwargs = {"model": self.model}
if self.provider:
kwargs["custom_llm_provider"] = self.provider
return LiteLlm(**kwargs)
if PROXY_BASE_URL:
provider = (self.provider or DEFAULT_PROVIDER or "").lower()
if provider and provider != "openai":
kwargs["api_base"] = f"{PROXY_BASE_URL.rstrip('/')}/{provider}"
else:
kwargs["api_base"] = PROXY_BASE_URL
kwargs.setdefault("api_key", os.environ.get("TASK_AGENT_API_KEY") or os.environ.get("OPENAI_API_KEY"))
provider = (self.provider or DEFAULT_PROVIDER or "").lower()
model_suffix = self.model.split("/", 1)[-1]
use_responses = provider == "openai" and (
model_suffix.startswith("gpt-5") or model_suffix.startswith("o1")
)
if use_responses:
kwargs.setdefault("use_responses_api", True)
llm = LiteLlm(**kwargs)
if use_responses and PROXY_BASE_URL:
class _ResponsesAwareClient(LiteLLMClient):
def __init__(self, base_client: LiteLLMClient, api_base: str, api_key: str):
self._base_client = base_client
self._api_base = api_base.rstrip("/")
self._api_key = api_key
async def acompletion(self, model, messages, tools, **kwargs): # type: ignore[override]
use_responses_api = kwargs.pop("use_responses_api", False)
if not use_responses_api:
return await self._base_client.acompletion(
model=model,
messages=messages,
tools=tools,
**kwargs,
)
resolved_model = model
if "/" not in resolved_model:
resolved_model = f"openai/{resolved_model}"
payload = {
"model": resolved_model,
"input": _messages_to_responses_input(messages),
}
timeout = kwargs.get("timeout", 60)
headers = {
"Authorization": f"Bearer {self._api_key}",
"Content-Type": "application/json",
}
async with httpx.AsyncClient(timeout=timeout) as client:
response = await client.post(
f"{self._api_base}/v1/responses",
json=payload,
headers=headers,
)
try:
response.raise_for_status()
except httpx.HTTPStatusError as exc:
text = exc.response.text
raise RuntimeError(
f"LiteLLM responses request failed: {text}"
) from exc
data = response.json()
text_output = _extract_output_text(data)
usage = data.get("usage", {})
return ModelResponse(
id=data.get("id"),
model=model,
choices=[
Choices(
finish_reason="stop",
index=0,
message=Message(role="assistant", content=text_output),
provider_specific_fields={"bifrost_response": data},
)
],
usage=Usage(
prompt_tokens=usage.get("input_tokens"),
completion_tokens=usage.get("output_tokens"),
reasoning_tokens=usage.get("output_tokens_details", {}).get(
"reasoning_tokens"
),
total_tokens=usage.get("total_tokens"),
),
)
llm.llm_client = _ResponsesAwareClient(
llm.llm_client,
PROXY_BASE_URL,
os.environ.get("TASK_AGENT_API_KEY") or os.environ.get("OPENAI_API_KEY", ""),
)
return llm
@property
def display_model(self) -> str:
@@ -84,3 +186,69 @@ def apply_state_to_agent(invocation_context, state: HotSwapState) -> None:
agent = invocation_context.agent
agent.model = state.instantiate_llm()
def _messages_to_responses_input(messages: list[dict[str, Any]]) -> list[dict[str, Any]]:
inputs: list[dict[str, Any]] = []
for message in messages:
role = message.get("role", "user")
content = message.get("content", "")
text_segments: list[str] = []
if isinstance(content, list):
for item in content:
if isinstance(item, dict):
text = item.get("text") or item.get("content")
if text:
text_segments.append(str(text))
elif isinstance(item, str):
text_segments.append(item)
elif isinstance(content, str):
text_segments.append(content)
text = "\n".join(segment.strip() for segment in text_segments if segment)
if not text:
continue
entry_type = "input_text"
if role == "assistant":
entry_type = "output_text"
inputs.append(
{
"role": role,
"content": [
{
"type": entry_type,
"text": text,
}
],
}
)
if not inputs:
inputs.append(
{
"role": "user",
"content": [
{
"type": "input_text",
"text": "",
}
],
}
)
return inputs
def _extract_output_text(response_json: dict[str, Any]) -> str:
outputs = response_json.get("output", [])
collected: list[str] = []
for item in outputs:
if isinstance(item, dict) and item.get("type") == "message":
for part in item.get("content", []):
if isinstance(part, dict) and part.get("type") == "output_text":
text = part.get("text", "")
if text:
collected.append(str(text))
return "\n\n".join(collected).strip()

ai/proxy/README.md Normal file

@@ -0,0 +1,5 @@
# LLM Proxy Integrations
This directory contains vendored source trees kept only for reference when integrating LLM gateways. The actual FuzzForge deployment uses the official Docker images for each project.
See `docs/docs/how-to/llm-proxy.md` for up-to-date instructions on running the proxy services and issuing keys for the agents.


@@ -1,6 +1,6 @@
[project]
name = "fuzzforge-ai"
version = "0.7.0"
version = "0.7.3"
description = "FuzzForge AI orchestration module"
readme = "README.md"
requires-python = ">=3.11"


@@ -21,4 +21,4 @@ Usage:
# Additional attribution and requirements are provided in the NOTICE file.
__version__ = "0.6.0"
__version__ = "0.7.3"


@@ -831,20 +831,9 @@ class FuzzForgeExecutor:
async def submit_security_scan_mcp(
workflow_name: str,
target_path: str = "",
volume_mode: str = "",
parameters: Dict[str, Any] | None = None,
tool_context: ToolContext | None = None,
) -> Any:
# Normalise volume mode to supported values
normalised_mode = (volume_mode or "ro").strip().lower().replace("-", "_")
if normalised_mode in {"read_only", "readonly", "ro"}:
normalised_mode = "ro"
elif normalised_mode in {"read_write", "readwrite", "rw"}:
normalised_mode = "rw"
else:
# Fall back to read-only if we can't recognise the input
normalised_mode = "ro"
# Resolve the target path to an absolute path for validation
resolved_path = target_path or "."
try:
@@ -883,7 +872,6 @@ class FuzzForgeExecutor:
payload = {
"workflow_name": workflow_name,
"target_path": resolved_path,
"volume_mode": normalised_mode,
"parameters": cleaned_parameters,
}
result = await _call_fuzzforge_mcp("submit_security_scan_mcp", payload)
@@ -1061,10 +1049,19 @@ class FuzzForgeExecutor:
FunctionTool(get_task_list)
])
# Create the agent
# Create the agent with LiteLLM configuration
llm_kwargs = {}
api_key = os.getenv('OPENAI_API_KEY') or os.getenv('LLM_API_KEY')
api_base = os.getenv('LLM_ENDPOINT') or os.getenv('LLM_API_BASE') or os.getenv('OPENAI_API_BASE')
if api_key:
llm_kwargs['api_key'] = api_key
if api_base:
llm_kwargs['api_base'] = api_base
self.agent = LlmAgent(
model=LiteLlm(model=self.model),
model=LiteLlm(model=self.model, **llm_kwargs),
name="fuzzforge_executor",
description="Intelligent A2A orchestrator with memory",
instruction=self._build_instruction(),


@@ -56,7 +56,7 @@ class CogneeService:
# Configure LLM with API key BEFORE any other cognee operations
provider = os.getenv("LLM_PROVIDER", "openai")
model = os.getenv("LLM_MODEL") or os.getenv("LITELLM_MODEL", "gpt-4o-mini")
api_key = os.getenv("LLM_API_KEY") or os.getenv("OPENAI_API_KEY")
api_key = os.getenv("COGNEE_API_KEY") or os.getenv("LLM_API_KEY") or os.getenv("OPENAI_API_KEY")
endpoint = os.getenv("LLM_ENDPOINT")
api_version = os.getenv("LLM_API_VERSION")
max_tokens = os.getenv("LLM_MAX_TOKENS")
@@ -78,48 +78,62 @@ class CogneeService:
os.environ.setdefault("OPENAI_API_KEY", api_key)
if endpoint:
os.environ["LLM_ENDPOINT"] = endpoint
os.environ.setdefault("LLM_API_BASE", endpoint)
os.environ.setdefault("OPENAI_API_BASE", endpoint)
os.environ.setdefault("LITELLM_PROXY_API_BASE", endpoint)
if api_key:
os.environ.setdefault("LITELLM_PROXY_API_KEY", api_key)
if api_version:
os.environ["LLM_API_VERSION"] = api_version
if max_tokens:
os.environ["LLM_MAX_TOKENS"] = str(max_tokens)
# Configure Cognee's runtime using its configuration helpers when available
embedding_model = os.getenv("LLM_EMBEDDING_MODEL")
embedding_endpoint = os.getenv("LLM_EMBEDDING_ENDPOINT")
if embedding_endpoint:
os.environ.setdefault("LLM_EMBEDDING_API_BASE", embedding_endpoint)
if hasattr(cognee.config, "set_llm_provider"):
cognee.config.set_llm_provider(provider)
if hasattr(cognee.config, "set_llm_model"):
cognee.config.set_llm_model(model)
if api_key and hasattr(cognee.config, "set_llm_api_key"):
cognee.config.set_llm_api_key(api_key)
if endpoint and hasattr(cognee.config, "set_llm_endpoint"):
cognee.config.set_llm_endpoint(endpoint)
if hasattr(cognee.config, "set_llm_model"):
cognee.config.set_llm_model(model)
if api_key and hasattr(cognee.config, "set_llm_api_key"):
cognee.config.set_llm_api_key(api_key)
if endpoint and hasattr(cognee.config, "set_llm_endpoint"):
cognee.config.set_llm_endpoint(endpoint)
if embedding_model and hasattr(cognee.config, "set_llm_embedding_model"):
cognee.config.set_llm_embedding_model(embedding_model)
if embedding_endpoint and hasattr(cognee.config, "set_llm_embedding_endpoint"):
cognee.config.set_llm_embedding_endpoint(embedding_endpoint)
if api_version and hasattr(cognee.config, "set_llm_api_version"):
cognee.config.set_llm_api_version(api_version)
if max_tokens and hasattr(cognee.config, "set_llm_max_tokens"):
cognee.config.set_llm_max_tokens(int(max_tokens))
# Configure graph database
cognee.config.set_graph_db_config({
"graph_database_provider": self.cognee_config.get("graph_database_provider", "kuzu"),
})
# Set data directories
data_dir = self.cognee_config.get("data_directory")
system_dir = self.cognee_config.get("system_directory")
if data_dir:
logger.debug("Setting cognee data root", extra={"path": data_dir})
cognee.config.data_root_directory(data_dir)
if system_dir:
logger.debug("Setting cognee system root", extra={"path": system_dir})
cognee.config.system_root_directory(system_dir)
# Setup multi-tenant user context
await self._setup_user_context()
self._initialized = True
logger.info(f"Cognee initialized for project {self.project_context['project_name']} "
f"with Kuzu at {system_dir}")
except ImportError:
logger.error("Cognee not installed. Install with: pip install cognee")
raise

backend/Makefile Normal file

@@ -0,0 +1,19 @@
SOURCES=./src
TESTS=./tests
.PHONY: bandit format mypy pytest ruff
bandit:
uv run bandit --recursive $(SOURCES)
format:
uv run ruff format $(SOURCES) $(TESTS)
mypy:
uv run mypy $(SOURCES) $(TESTS)
pytest:
PYTHONPATH=./toolbox uv run pytest $(TESTS)
ruff:
uv run ruff check --fix $(SOURCES) $(TESTS)


@@ -22,7 +22,6 @@
"parameters": {
"workflow_name": "string",
"target_path": "string",
"volume_mode": "string (ro|rw)",
"parameters": "object"
}
},

backend/pyproject.toml

@@ -1,33 +1,36 @@
[project]
name = "backend"
version = "0.7.0"
version = "0.7.3"
description = "FuzzForge OSS backend"
authors = []
readme = "README.md"
requires-python = ">=3.11"
dependencies = [
"fastapi>=0.116.1",
"temporalio>=1.6.0",
"boto3>=1.34.0",
"pydantic>=2.0.0",
"pyyaml>=6.0",
"docker>=7.0.0",
"aiofiles>=23.0.0",
"uvicorn>=0.30.0",
"aiohttp>=3.12.15",
"fastmcp",
"aiofiles==25.1.0",
"aiohttp==3.13.2",
"boto3==1.40.68",
"docker==7.1.0",
"fastapi==0.121.0",
"fastmcp==2.13.0.2",
"pydantic==2.12.4",
"pyyaml==6.0.3",
"temporalio==1.18.2",
"uvicorn==0.38.0",
]
[project.optional-dependencies]
dev = [
"pytest>=8.0.0",
"pytest-asyncio>=0.23.0",
"pytest-benchmark>=4.0.0",
"pytest-cov>=5.0.0",
"pytest-xdist>=3.5.0",
"pytest-mock>=3.12.0",
"httpx>=0.27.0",
"ruff>=0.1.0",
lints = [
"bandit==1.8.6",
"mypy==1.18.2",
"ruff==0.14.4",
]
tests = [
"pytest==8.4.2",
"pytest-asyncio==1.2.0",
"pytest-benchmark==5.2.1",
"pytest-cov==7.0.0",
"pytest-mock==3.15.1",
"pytest-xdist==3.8.0",
]
[tool.pytest.ini_options]

backend/ruff.toml Normal file

@@ -0,0 +1,11 @@
line-length = 120
[lint]
select = [ "ALL" ]
ignore = []
[lint.per-file-ignores]
"tests/*" = [
"PLR2004", # allowing comparisons using unamed numerical constants in tests
"S101", # allowing 'assert' statements in tests
]


@@ -1,6 +1,4 @@
"""
API endpoints for fuzzing workflow management and real-time monitoring
"""
"""API endpoints for fuzzing workflow management and real-time monitoring."""
# Copyright (c) 2025 FuzzingLabs
#
@@ -13,32 +11,29 @@ API endpoints for fuzzing workflow management and real-time monitoring
#
# Additional attribution and requirements are provided in the NOTICE file.
import logging
from typing import List, Dict
from fastapi import APIRouter, HTTPException, WebSocket, WebSocketDisconnect
from fastapi.responses import StreamingResponse
import asyncio
import contextlib
import json
import logging
from datetime import datetime
from src.models.findings import (
FuzzingStats,
CrashReport
)
from fastapi import APIRouter, HTTPException, WebSocket, WebSocketDisconnect
from fastapi.responses import StreamingResponse
from src.models.findings import CrashReport, FuzzingStats
logger = logging.getLogger(__name__)
router = APIRouter(prefix="/fuzzing", tags=["fuzzing"])
# In-memory storage for real-time stats (in production, use Redis or similar)
fuzzing_stats: Dict[str, FuzzingStats] = {}
crash_reports: Dict[str, List[CrashReport]] = {}
active_connections: Dict[str, List[WebSocket]] = {}
fuzzing_stats: dict[str, FuzzingStats] = {}
crash_reports: dict[str, list[CrashReport]] = {}
active_connections: dict[str, list[WebSocket]] = {}
def initialize_fuzzing_tracking(run_id: str, workflow_name: str):
"""
Initialize fuzzing tracking for a new run.
def initialize_fuzzing_tracking(run_id: str, workflow_name: str) -> None:
"""Initialize fuzzing tracking for a new run.
This function should be called when a workflow is submitted to enable
real-time monitoring and stats collection.
@@ -46,19 +41,19 @@ def initialize_fuzzing_tracking(run_id: str, workflow_name: str):
Args:
run_id: The run identifier
workflow_name: Name of the workflow
"""
fuzzing_stats[run_id] = FuzzingStats(
run_id=run_id,
workflow=workflow_name
workflow=workflow_name,
)
crash_reports[run_id] = []
active_connections[run_id] = []
@router.get("/{run_id}/stats", response_model=FuzzingStats)
@router.get("/{run_id}/stats")
async def get_fuzzing_stats(run_id: str) -> FuzzingStats:
"""
Get current fuzzing statistics for a run.
"""Get current fuzzing statistics for a run.
Args:
run_id: The fuzzing run ID
@@ -68,20 +63,20 @@ async def get_fuzzing_stats(run_id: str) -> FuzzingStats:
Raises:
HTTPException: 404 if run not found
"""
if run_id not in fuzzing_stats:
raise HTTPException(
status_code=404,
detail=f"Fuzzing run not found: {run_id}"
detail=f"Fuzzing run not found: {run_id}",
)
return fuzzing_stats[run_id]
@router.get("/{run_id}/crashes", response_model=List[CrashReport])
async def get_crash_reports(run_id: str) -> List[CrashReport]:
"""
Get crash reports for a fuzzing run.
@router.get("/{run_id}/crashes")
async def get_crash_reports(run_id: str) -> list[CrashReport]:
"""Get crash reports for a fuzzing run.
Args:
run_id: The fuzzing run ID
@@ -91,11 +86,12 @@ async def get_crash_reports(run_id: str) -> List[CrashReport]:
Raises:
HTTPException: 404 if run not found
"""
if run_id not in crash_reports:
raise HTTPException(
status_code=404,
detail=f"Fuzzing run not found: {run_id}"
detail=f"Fuzzing run not found: {run_id}",
)
return crash_reports[run_id]
@@ -103,8 +99,7 @@ async def get_crash_reports(run_id: str) -> List[CrashReport]:
@router.post("/{run_id}/stats")
async def update_fuzzing_stats(run_id: str, stats: FuzzingStats):
"""
Update fuzzing statistics (called by fuzzing workflows).
"""Update fuzzing statistics (called by fuzzing workflows).
Args:
run_id: The fuzzing run ID
@@ -112,18 +107,19 @@ async def update_fuzzing_stats(run_id: str, stats: FuzzingStats):
Raises:
HTTPException: 404 if run not found
"""
if run_id not in fuzzing_stats:
raise HTTPException(
status_code=404,
detail=f"Fuzzing run not found: {run_id}"
detail=f"Fuzzing run not found: {run_id}",
)
# Update stats
fuzzing_stats[run_id] = stats
# Debug: log reception for live instrumentation
try:
with contextlib.suppress(Exception):
logger.info(
"Received fuzzing stats update: run_id=%s exec=%s eps=%.2f crashes=%s corpus=%s coverage=%s elapsed=%ss",
run_id,
@@ -134,14 +130,12 @@ async def update_fuzzing_stats(run_id: str, stats: FuzzingStats):
stats.coverage,
stats.elapsed_time,
)
except Exception:
pass
# Notify connected WebSocket clients
if run_id in active_connections:
message = {
"type": "stats_update",
"data": stats.model_dump()
"data": stats.model_dump(),
}
for websocket in active_connections[run_id][:]: # Copy to avoid modification during iteration
try:
@@ -153,12 +147,12 @@ async def update_fuzzing_stats(run_id: str, stats: FuzzingStats):
@router.post("/{run_id}/crash")
async def report_crash(run_id: str, crash: CrashReport):
"""
Report a new crash (called by fuzzing workflows).
"""Report a new crash (called by fuzzing workflows).
Args:
run_id: The fuzzing run ID
crash: Crash report details
"""
if run_id not in crash_reports:
crash_reports[run_id] = []
@@ -175,7 +169,7 @@ async def report_crash(run_id: str, crash: CrashReport):
if run_id in active_connections:
message = {
"type": "crash_report",
"data": crash.model_dump()
"data": crash.model_dump(),
}
for websocket in active_connections[run_id][:]:
try:
@@ -186,12 +180,12 @@ async def report_crash(run_id: str, crash: CrashReport):
@router.websocket("/{run_id}/live")
async def websocket_endpoint(websocket: WebSocket, run_id: str):
"""
WebSocket endpoint for real-time fuzzing updates.
"""WebSocket endpoint for real-time fuzzing updates.
Args:
websocket: WebSocket connection
run_id: The fuzzing run ID to monitor
"""
await websocket.accept()
@@ -223,7 +217,7 @@ async def websocket_endpoint(websocket: WebSocket, run_id: str):
# Echo back for ping-pong
if data == "ping":
await websocket.send_text("pong")
except asyncio.TimeoutError:
except TimeoutError:
# Send periodic heartbeat
await websocket.send_text(json.dumps({"type": "heartbeat"}))
@@ -231,31 +225,31 @@ async def websocket_endpoint(websocket: WebSocket, run_id: str):
# Clean up connection
if run_id in active_connections and websocket in active_connections[run_id]:
active_connections[run_id].remove(websocket)
except Exception as e:
logger.error(f"WebSocket error for run {run_id}: {e}")
except Exception:
logger.exception("WebSocket error for run %s", run_id)
if run_id in active_connections and websocket in active_connections[run_id]:
active_connections[run_id].remove(websocket)
@router.get("/{run_id}/stream")
async def stream_fuzzing_updates(run_id: str):
"""
Server-Sent Events endpoint for real-time fuzzing updates.
"""Server-Sent Events endpoint for real-time fuzzing updates.
Args:
run_id: The fuzzing run ID to monitor
Returns:
Streaming response with real-time updates
"""
if run_id not in fuzzing_stats:
raise HTTPException(
status_code=404,
detail=f"Fuzzing run not found: {run_id}"
detail=f"Fuzzing run not found: {run_id}",
)
async def event_stream():
"""Generate server-sent events for fuzzing updates"""
"""Generate server-sent events for fuzzing updates."""
last_stats_time = datetime.utcnow()
while True:
@@ -276,10 +270,7 @@ async def stream_fuzzing_updates(run_id: str):
# Send recent crashes
if run_id in crash_reports:
recent_crashes = [
crash for crash in crash_reports[run_id]
if crash.timestamp > last_stats_time
]
recent_crashes = [crash for crash in crash_reports[run_id] if crash.timestamp > last_stats_time]
for crash in recent_crashes:
event_data = f"data: {json.dumps({'type': 'crash', 'data': crash.model_dump()})}\n\n"
yield event_data
@@ -287,8 +278,8 @@ async def stream_fuzzing_updates(run_id: str):
last_stats_time = datetime.utcnow()
await asyncio.sleep(5) # Update every 5 seconds
except Exception as e:
logger.error(f"Error in event stream for run {run_id}: {e}")
except Exception:
logger.exception("Error in event stream for run %s", run_id)
break
return StreamingResponse(
@@ -297,17 +288,17 @@ async def stream_fuzzing_updates(run_id: str):
headers={
"Cache-Control": "no-cache",
"Connection": "keep-alive",
}
},
)
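Each update yielded by event_stream() is framed as a Server-Sent Events message: a single `data:` line carrying JSON, terminated by a blank line — which is why the response disables caching and keeps the connection alive. A sketch of that framing, with a hypothetical `sse_event` helper:

```python
import json
from typing import Any


def sse_event(payload: dict[str, Any]) -> str:
    """Frame a JSON payload as one SSE message: a "data:" line
    followed by the blank line that terminates the event."""
    return f"data: {json.dumps(payload)}\n\n"
```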
@router.delete("/{run_id}")
async def cleanup_fuzzing_run(run_id: str):
"""
Clean up fuzzing run data.
async def cleanup_fuzzing_run(run_id: str) -> dict[str, str]:
"""Clean up fuzzing run data.
Args:
run_id: The fuzzing run ID to clean up
"""
# Clean up tracking data
fuzzing_stats.pop(run_id, None)


@@ -1,6 +1,4 @@
"""
API endpoints for workflow run management and findings retrieval
"""
"""API endpoints for workflow run management and findings retrieval."""
# Copyright (c) 2025 FuzzingLabs
#
@@ -14,37 +12,36 @@ API endpoints for workflow run management and findings retrieval
# Additional attribution and requirements are provided in the NOTICE file.
import logging
from fastapi import APIRouter, HTTPException, Depends
from typing import Annotated
from fastapi import APIRouter, Depends, HTTPException
from src.main import temporal_mgr
from src.models.findings import WorkflowFindings, WorkflowStatus
from src.temporal import TemporalManager
logger = logging.getLogger(__name__)
router = APIRouter(prefix="/runs", tags=["runs"])
def get_temporal_manager():
"""Dependency to get the Temporal manager instance"""
from src.main import temporal_mgr
def get_temporal_manager() -> TemporalManager:
"""Dependency to get the Temporal manager instance."""
return temporal_mgr
@router.get("/{run_id}/status", response_model=WorkflowStatus)
@router.get("/{run_id}/status")
async def get_run_status(
run_id: str,
temporal_mgr=Depends(get_temporal_manager)
temporal_mgr: Annotated[TemporalManager, Depends(get_temporal_manager)],
) -> WorkflowStatus:
"""
Get the current status of a workflow run.
"""Get the current status of a workflow run.
Args:
run_id: The workflow run ID
:param run_id: The workflow run ID
:param temporal_mgr: The temporal manager instance.
:return: Status information including state, timestamps, and completion flags
:raises HTTPException: 404 if run not found
Returns:
Status information including state, timestamps, and completion flags
Raises:
HTTPException: 404 if run not found
"""
try:
status = await temporal_mgr.get_workflow_status(run_id)
@@ -55,41 +52,40 @@ async def get_run_status(
is_failed = workflow_status == "FAILED"
is_running = workflow_status == "RUNNING"
# Extract workflow name from run_id (format: workflow_name-unique_id)
workflow_name = run_id.rsplit("-", 1)[0] if "-" in run_id else "unknown"
return WorkflowStatus(
run_id=run_id,
workflow="unknown", # Temporal doesn't track workflow name in status
workflow=workflow_name,
status=workflow_status,
is_completed=is_completed,
is_failed=is_failed,
is_running=is_running,
created_at=status.get("start_time"),
updated_at=status.get("close_time") or status.get("execution_time")
updated_at=status.get("close_time") or status.get("execution_time"),
)
except Exception as e:
logger.error(f"Failed to get status for run {run_id}: {e}")
logger.exception("Failed to get status for run %s", run_id)
raise HTTPException(
status_code=404,
detail=f"Run not found: {run_id}"
)
detail=f"Run not found: {run_id}",
) from e
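The status and findings endpoints both recover the workflow name from the run ID by splitting off the trailing unique suffix. A standalone sketch of that convention (the function name is illustrative):

```python
def workflow_name_from_run_id(run_id: str) -> str:
    """Extract the workflow name from a run ID shaped like
    "<workflow_name>-<unique_id>". Only the last hyphen is split off,
    so hyphenated workflow names survive intact."""
    return run_id.rsplit("-", 1)[0] if "-" in run_id else "unknown"
```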
@router.get("/{run_id}/findings", response_model=WorkflowFindings)
@router.get("/{run_id}/findings")
async def get_run_findings(
run_id: str,
temporal_mgr=Depends(get_temporal_manager)
temporal_mgr: Annotated[TemporalManager, Depends(get_temporal_manager)],
) -> WorkflowFindings:
"""
Get the findings from a completed workflow run.
"""Get the findings from a completed workflow run.
Args:
run_id: The workflow run ID
:param run_id: The workflow run ID
:param temporal_mgr: The temporal manager instance.
:return: SARIF-formatted findings from the workflow execution
:raises HTTPException: 404 if run not found, 400 if run not completed
Returns:
SARIF-formatted findings from the workflow execution
Raises:
HTTPException: 404 if run not found, 400 if run not completed
"""
try:
# Get run status first
@@ -100,77 +96,72 @@ async def get_run_findings(
if workflow_status == "RUNNING":
raise HTTPException(
status_code=400,
detail=f"Run {run_id} is still running. Current status: {workflow_status}"
)
else:
raise HTTPException(
status_code=400,
detail=f"Run {run_id} not completed. Status: {workflow_status}"
detail=f"Run {run_id} is still running. Current status: {workflow_status}",
)
raise HTTPException(
status_code=400,
detail=f"Run {run_id} not completed. Status: {workflow_status}",
)
if workflow_status == "FAILED":
raise HTTPException(
status_code=400,
detail=f"Run {run_id} failed. Status: {workflow_status}"
detail=f"Run {run_id} failed. Status: {workflow_status}",
)
# Get the workflow result
result = await temporal_mgr.get_workflow_result(run_id)
# Extract SARIF from result (handle None for backwards compatibility)
if isinstance(result, dict):
sarif = result.get("sarif") or {}
else:
sarif = {}
sarif = result.get("sarif", {}) if isinstance(result, dict) else {}
# Extract workflow name from run_id (format: workflow_name-unique_id)
workflow_name = run_id.rsplit("-", 1)[0] if "-" in run_id else "unknown"
# Metadata
metadata = {
"completion_time": status.get("close_time"),
"workflow_version": "unknown"
"workflow_version": "unknown",
}
return WorkflowFindings(
workflow="unknown",
workflow=workflow_name,
run_id=run_id,
sarif=sarif,
metadata=metadata
metadata=metadata,
)
except HTTPException:
raise
except Exception as e:
logger.error(f"Failed to get findings for run {run_id}: {e}")
logger.exception("Failed to get findings for run %s", run_id)
raise HTTPException(
status_code=500,
detail=f"Failed to retrieve findings: {str(e)}"
)
detail=f"Failed to retrieve findings: {e!s}",
) from e
@router.get("/{workflow_name}/findings/{run_id}", response_model=WorkflowFindings)
@router.get("/{workflow_name}/findings/{run_id}")
async def get_workflow_findings(
workflow_name: str,
run_id: str,
temporal_mgr=Depends(get_temporal_manager)
temporal_mgr: Annotated[TemporalManager, Depends(get_temporal_manager)],
) -> WorkflowFindings:
"""
Get findings for a specific workflow run.
"""Get findings for a specific workflow run.
Alternative endpoint that includes workflow name in the path for clarity.
Args:
workflow_name: Name of the workflow
run_id: The workflow run ID
:param workflow_name: Name of the workflow
:param run_id: The workflow run ID
:param temporal_mgr: The temporal manager instance.
:return: SARIF-formatted findings from the workflow execution
:raises HTTPException: 404 if workflow or run not found, 400 if run not completed
Returns:
SARIF-formatted findings from the workflow execution
Raises:
HTTPException: 404 if workflow or run not found, 400 if run not completed
"""
if workflow_name not in temporal_mgr.workflows:
raise HTTPException(
status_code=404,
detail=f"Workflow not found: {workflow_name}"
detail=f"Workflow not found: {workflow_name}",
)
# Delegate to the main findings endpoint

backend/src/api/system.py (new file)

@@ -0,0 +1,45 @@
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
"""System information endpoints for FuzzForge API.
Provides system configuration and filesystem paths to CLI for worker management.
"""
import os
from fastapi import APIRouter
router = APIRouter(prefix="/system", tags=["system"])
@router.get("/info")
async def get_system_info() -> dict[str, str]:
"""Get system information including host filesystem paths.
This endpoint exposes paths needed by the CLI to manage workers via docker-compose.
The FUZZFORGE_HOST_ROOT environment variable is set by docker-compose and points
to the FuzzForge installation directory on the host machine.
Returns:
Dictionary containing:
- host_root: Absolute path to FuzzForge root on host
- docker_compose_path: Path to docker-compose.yml on host
- workers_dir: Path to workers directory on host
"""
host_root = os.getenv("FUZZFORGE_HOST_ROOT", "")
return {
"host_root": host_root,
"docker_compose_path": f"{host_root}/docker-compose.yml" if host_root else "",
"workers_dir": f"{host_root}/workers" if host_root else "",
}
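The endpoint above derives every path from a single FUZZFORGE_HOST_ROOT value, so the derivation is easy to test in isolation. A sketch (the `/opt/fuzzforge` path below is an assumed example, not a documented default):

```python
def system_info(host_root: str) -> dict[str, str]:
    """Mirror the GET /system/info response: derive the compose file
    and workers directory from the host root, falling back to empty
    strings when FUZZFORGE_HOST_ROOT is unset."""
    return {
        "host_root": host_root,
        "docker_compose_path": f"{host_root}/docker-compose.yml" if host_root else "",
        "workers_dir": f"{host_root}/workers" if host_root else "",
    }
```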

View File
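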

@@ -1,6 +1,4 @@
"""
API endpoints for workflow management with enhanced error handling
"""
"""API endpoints for workflow management with enhanced error handling."""
# Copyright (c) 2025 FuzzingLabs
#
@@ -13,20 +11,24 @@ API endpoints for workflow management with enhanced error handling
#
# Additional attribution and requirements are provided in the NOTICE file.
import json
import logging
import traceback
import tempfile
from typing import List, Dict, Any, Optional
from fastapi import APIRouter, HTTPException, Depends, UploadFile, File, Form
from pathlib import Path
from typing import Annotated, Any
from fastapi import APIRouter, Depends, File, Form, HTTPException, UploadFile
from src.api.fuzzing import initialize_fuzzing_tracking
from src.main import temporal_mgr
from src.models.findings import (
WorkflowSubmission,
WorkflowMetadata,
RunSubmissionResponse,
WorkflowListItem,
RunSubmissionResponse
WorkflowMetadata,
WorkflowSubmission,
)
from src.temporal.discovery import WorkflowDiscovery
from src.temporal.manager import TemporalManager
logger = logging.getLogger(__name__)
@@ -43,22 +45,58 @@ ALLOWED_CONTENT_TYPES = [
router = APIRouter(prefix="/workflows", tags=["workflows"])
def extract_defaults_from_json_schema(metadata: dict[str, Any]) -> dict[str, Any]:
"""Extract default parameter values from JSON Schema format.
Converts from:
parameters:
properties:
param_name:
default: value
To:
{param_name: value}
Args:
metadata: Workflow metadata dictionary
Returns:
Dictionary of parameter defaults
"""
defaults = {}
# Check if there's a legacy default_parameters field
if "default_parameters" in metadata:
defaults.update(metadata["default_parameters"])
# Extract defaults from JSON Schema parameters
parameters = metadata.get("parameters", {})
properties = parameters.get("properties", {})
for param_name, param_spec in properties.items():
if "default" in param_spec:
defaults[param_name] = param_spec["default"]
return defaults
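The precedence rule matters here: legacy `default_parameters` are applied first, JSON Schema defaults overwrite them on collision, and user-supplied values win over both when merged in submit_workflow(). A self-contained sketch (the sample metadata values are illustrative):

```python
from typing import Any


def extract_defaults(metadata: dict[str, Any]) -> dict[str, Any]:
    """Collect parameter defaults: legacy "default_parameters" first,
    then JSON Schema property defaults, which win on key collisions."""
    defaults = dict(metadata.get("default_parameters", {}))
    for name, spec in metadata.get("parameters", {}).get("properties", {}).items():
        if "default" in spec:
            defaults[name] = spec["default"]
    return defaults


metadata = {
    "default_parameters": {"timeout": 60},
    "parameters": {"properties": {"timeout": {"default": 120}, "depth": {"default": 3}}},
}
defaults = extract_defaults(metadata)  # schema default for "timeout" wins over legacy
merged = {**defaults, **{"depth": 5}}  # user value for "depth" wins over both
```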
def create_structured_error_response(
error_type: str,
message: str,
workflow_name: Optional[str] = None,
run_id: Optional[str] = None,
container_info: Optional[Dict[str, Any]] = None,
deployment_info: Optional[Dict[str, Any]] = None,
suggestions: Optional[List[str]] = None
) -> Dict[str, Any]:
workflow_name: str | None = None,
run_id: str | None = None,
container_info: dict[str, Any] | None = None,
deployment_info: dict[str, Any] | None = None,
suggestions: list[str] | None = None,
) -> dict[str, Any]:
"""Create a structured error response with rich context."""
error_response = {
"error": {
"type": error_type,
"message": message,
"timestamp": __import__("datetime").datetime.utcnow().isoformat() + "Z"
}
"timestamp": __import__("datetime").datetime.utcnow().isoformat() + "Z",
},
}
if workflow_name:
@@ -79,39 +117,38 @@ def create_structured_error_response(
return error_response
def get_temporal_manager():
"""Dependency to get the Temporal manager instance"""
from src.main import temporal_mgr
def get_temporal_manager() -> TemporalManager:
"""Dependency to get the Temporal manager instance."""
return temporal_mgr
@router.get("/", response_model=List[WorkflowListItem])
@router.get("/")
async def list_workflows(
temporal_mgr=Depends(get_temporal_manager)
) -> List[WorkflowListItem]:
"""
List all discovered workflows with their metadata.
temporal_mgr: Annotated[TemporalManager, Depends(get_temporal_manager)],
) -> list[WorkflowListItem]:
"""List all discovered workflows with their metadata.
Returns a summary of each workflow including name, version, description,
author, and tags.
"""
workflows = []
for name, info in temporal_mgr.workflows.items():
workflows.append(WorkflowListItem(
name=name,
version=info.metadata.get("version", "0.6.0"),
description=info.metadata.get("description", ""),
author=info.metadata.get("author"),
tags=info.metadata.get("tags", [])
))
workflows.append(
WorkflowListItem(
name=name,
version=info.metadata.get("version", "0.6.0"),
description=info.metadata.get("description", ""),
author=info.metadata.get("author"),
tags=info.metadata.get("tags", []),
),
)
return workflows
@router.get("/metadata/schema")
async def get_metadata_schema() -> Dict[str, Any]:
"""
Get the JSON schema for workflow metadata files.
async def get_metadata_schema() -> dict[str, Any]:
"""Get the JSON schema for workflow metadata files.
This schema defines the structure and requirements for metadata.yaml files
that must accompany each workflow.
@@ -119,23 +156,19 @@ async def get_metadata_schema() -> Dict[str, Any]:
return WorkflowDiscovery.get_metadata_schema()
@router.get("/{workflow_name}/metadata", response_model=WorkflowMetadata)
@router.get("/{workflow_name}/metadata")
async def get_workflow_metadata(
workflow_name: str,
temporal_mgr=Depends(get_temporal_manager)
temporal_mgr: Annotated[TemporalManager, Depends(get_temporal_manager)],
) -> WorkflowMetadata:
"""
Get complete metadata for a specific workflow.
"""Get complete metadata for a specific workflow.
Args:
workflow_name: Name of the workflow
Returns:
Complete metadata including parameters schema, supported volume modes,
:param workflow_name: Name of the workflow
:param temporal_mgr: The temporal manager instance.
:return: Complete metadata including parameters schema, supported volume modes,
required modules, and more.
:raises HTTPException: 404 if workflow not found
Raises:
HTTPException: 404 if workflow not found
"""
if workflow_name not in temporal_mgr.workflows:
available_workflows = list(temporal_mgr.workflows.keys())
@@ -146,12 +179,12 @@ async def get_workflow_metadata(
suggestions=[
f"Available workflows: {', '.join(available_workflows)}",
"Use GET /workflows/ to see all available workflows",
"Check workflow name spelling and case sensitivity"
]
"Check workflow name spelling and case sensitivity",
],
)
raise HTTPException(
status_code=404,
detail=error_response
detail=error_response,
)
info = temporal_mgr.workflows[workflow_name]
@@ -164,29 +197,25 @@ async def get_workflow_metadata(
author=metadata.get("author"),
tags=metadata.get("tags", []),
parameters=metadata.get("parameters", {}),
default_parameters=metadata.get("default_parameters", {}),
required_modules=metadata.get("required_modules", [])
default_parameters=extract_defaults_from_json_schema(metadata),
required_modules=metadata.get("required_modules", []),
)
@router.post("/{workflow_name}/submit", response_model=RunSubmissionResponse)
@router.post("/{workflow_name}/submit")
async def submit_workflow(
workflow_name: str,
submission: WorkflowSubmission,
temporal_mgr=Depends(get_temporal_manager)
temporal_mgr: Annotated[TemporalManager, Depends(get_temporal_manager)],
) -> RunSubmissionResponse:
"""
Submit a workflow for execution.
"""Submit a workflow for execution.
Args:
workflow_name: Name of the workflow to execute
submission: Submission parameters including target path and parameters
:param workflow_name: Name of the workflow to execute
:param submission: Submission parameters including target path and parameters
:param temporal_mgr: The temporal manager instance.
:return: Run submission response with run_id and initial status
:raises HTTPException: 404 if workflow not found, 400 for invalid parameters
Returns:
Run submission response with run_id and initial status
Raises:
HTTPException: 404 if workflow not found, 400 for invalid parameters
"""
if workflow_name not in temporal_mgr.workflows:
available_workflows = list(temporal_mgr.workflows.keys())
@@ -197,31 +226,32 @@ async def submit_workflow(
suggestions=[
f"Available workflows: {', '.join(available_workflows)}",
"Use GET /workflows/ to see all available workflows",
"Check workflow name spelling and case sensitivity"
]
"Check workflow name spelling and case sensitivity",
],
)
raise HTTPException(
status_code=404,
detail=error_response
detail=error_response,
)
try:
# Upload target file to MinIO and get target_id
target_path = Path(submission.target_path)
if not target_path.exists():
raise ValueError(f"Target path does not exist: {submission.target_path}")
msg = f"Target path does not exist: {submission.target_path}"
raise ValueError(msg)
# Upload target (using anonymous user for now)
target_id = await temporal_mgr.upload_target(
file_path=target_path,
user_id="api-user",
metadata={"workflow": workflow_name}
metadata={"workflow": workflow_name},
)
# Merge default parameters with user parameters
workflow_info = temporal_mgr.workflows[workflow_name]
metadata = workflow_info.metadata or {}
defaults = metadata.get("default_parameters", {})
defaults = extract_defaults_from_json_schema(metadata)
user_params = submission.parameters or {}
workflow_params = {**defaults, **user_params}
@@ -229,23 +259,22 @@ async def submit_workflow(
handle = await temporal_mgr.run_workflow(
workflow_name=workflow_name,
target_id=target_id,
workflow_params=workflow_params
workflow_params=workflow_params,
)
run_id = handle.id
# Initialize fuzzing tracking if this looks like a fuzzing workflow
workflow_info = temporal_mgr.workflows.get(workflow_name, {})
workflow_tags = workflow_info.metadata.get("tags", []) if hasattr(workflow_info, 'metadata') else []
workflow_tags = workflow_info.metadata.get("tags", []) if hasattr(workflow_info, "metadata") else []
if "fuzzing" in workflow_tags or "fuzz" in workflow_name.lower():
from src.api.fuzzing import initialize_fuzzing_tracking
initialize_fuzzing_tracking(run_id, workflow_name)
return RunSubmissionResponse(
run_id=run_id,
status="RUNNING",
workflow=workflow_name,
message=f"Workflow '{workflow_name}' submitted successfully"
message=f"Workflow '{workflow_name}' submitted successfully",
)
except ValueError as e:
@@ -257,14 +286,13 @@ async def submit_workflow(
suggestions=[
"Check parameter types and values",
"Use GET /workflows/{workflow_name}/parameters for schema",
"Ensure all required parameters are provided"
]
"Ensure all required parameters are provided",
],
)
raise HTTPException(status_code=400, detail=error_response)
raise HTTPException(status_code=400, detail=error_response) from e
except Exception as e:
logger.error(f"Failed to submit workflow '{workflow_name}': {e}")
logger.error(f"Traceback: {traceback.format_exc()}")
logger.exception("Failed to submit workflow '%s'", workflow_name)
# Try to get more context about the error
container_info = None
@@ -277,47 +305,57 @@ async def submit_workflow(
# Detect specific error patterns
if "workflow" in error_message.lower() and "not found" in error_message.lower():
error_type = "WorkflowError"
suggestions.extend([
"Check if Temporal server is running and accessible",
"Verify workflow workers are running",
"Check if workflow is registered with correct vertical",
"Ensure Docker is running and has sufficient resources"
])
suggestions.extend(
[
"Check if Temporal server is running and accessible",
"Verify workflow workers are running",
"Check if workflow is registered with correct vertical",
"Ensure Docker is running and has sufficient resources",
],
)
elif "volume" in error_message.lower() or "mount" in error_message.lower():
error_type = "VolumeError"
suggestions.extend([
"Check if the target path exists and is accessible",
"Verify file permissions (Docker needs read access)",
"Ensure the path is not in use by another process",
"Try using an absolute path instead of relative path"
])
suggestions.extend(
[
"Check if the target path exists and is accessible",
"Verify file permissions (Docker needs read access)",
"Ensure the path is not in use by another process",
"Try using an absolute path instead of relative path",
],
)
elif "memory" in error_message.lower() or "resource" in error_message.lower():
error_type = "ResourceError"
suggestions.extend([
"Check system memory and CPU availability",
"Consider reducing resource limits or dataset size",
"Monitor Docker resource usage",
"Increase Docker memory limits if needed"
])
suggestions.extend(
[
"Check system memory and CPU availability",
"Consider reducing resource limits or dataset size",
"Monitor Docker resource usage",
"Increase Docker memory limits if needed",
],
)
elif "image" in error_message.lower():
error_type = "ImageError"
suggestions.extend([
"Check if the workflow image exists",
"Verify Docker registry access",
"Try rebuilding the workflow image",
"Check network connectivity to registries"
])
suggestions.extend(
[
"Check if the workflow image exists",
"Verify Docker registry access",
"Try rebuilding the workflow image",
"Check network connectivity to registries",
],
)
else:
suggestions.extend([
"Check FuzzForge backend logs for details",
"Verify all services are running (docker-compose up -d)",
"Try restarting the workflow deployment",
"Contact support if the issue persists"
])
suggestions.extend(
[
"Check FuzzForge backend logs for details",
"Verify all services are running (docker-compose up -d)",
"Try restarting the workflow deployment",
"Contact support if the issue persists",
],
)
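The keyword checks above amount to a small classifier from exception text to structured error type. A sketch of that mapping (the fallback value is an assumption — the handler's initial error_type is set in lines elided from this diff):

```python
def classify_submit_error(message: str) -> str:
    """Map an exception message to the error_type used by
    create_structured_error_response(), following the keyword
    checks in submit_workflow()'s exception handler."""
    m = message.lower()
    if "workflow" in m and "not found" in m:
        return "WorkflowError"
    if "volume" in m or "mount" in m:
        return "VolumeError"
    if "memory" in m or "resource" in m:
        return "ResourceError"
    if "image" in m:
        return "ImageError"
    return "WorkflowSubmissionError"  # assumed default; not shown in the diff
```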
error_response = create_structured_error_response(
error_type=error_type,
@@ -325,41 +363,35 @@ async def submit_workflow(
workflow_name=workflow_name,
container_info=container_info,
deployment_info=deployment_info,
suggestions=suggestions
suggestions=suggestions,
)
raise HTTPException(
status_code=500,
detail=error_response
)
detail=error_response,
) from e
@router.post("/{workflow_name}/upload-and-submit", response_model=RunSubmissionResponse)
@router.post("/{workflow_name}/upload-and-submit")
async def upload_and_submit_workflow(
workflow_name: str,
file: UploadFile = File(..., description="Target file or tarball to analyze"),
parameters: Optional[str] = Form(None, description="JSON-encoded workflow parameters"),
timeout: Optional[int] = Form(None, description="Timeout in seconds"),
temporal_mgr=Depends(get_temporal_manager)
temporal_mgr: Annotated[TemporalManager, Depends(get_temporal_manager)],
file: Annotated[UploadFile, File(..., description="Target file or tarball to analyze")],
parameters: Annotated[str, Form(None, description="JSON-encoded workflow parameters")],
) -> RunSubmissionResponse:
"""
Upload a target file/tarball and submit workflow for execution.
"""Upload a target file/tarball and submit workflow for execution.
This endpoint accepts multipart/form-data uploads and is the recommended
way to submit workflows from remote CLI clients.
Args:
workflow_name: Name of the workflow to execute
file: Target file or tarball (compressed directory)
parameters: JSON string of workflow parameters (optional)
timeout: Execution timeout in seconds (optional)
:param workflow_name: Name of the workflow to execute
:param temporal_mgr: The temporal manager instance.
:param file: Target file or tarball (compressed directory)
:param parameters: JSON string of workflow parameters (optional)
:return: Run submission response with run_id and initial status
:raises HTTPException: 404 if workflow not found, 400 for invalid parameters,
413 if file too large
Returns:
Run submission response with run_id and initial status
Raises:
HTTPException: 404 if workflow not found, 400 for invalid parameters,
413 if file too large
"""
if workflow_name not in temporal_mgr.workflows:
available_workflows = list(temporal_mgr.workflows.keys())
@@ -369,8 +401,8 @@ async def upload_and_submit_workflow(
workflow_name=workflow_name,
suggestions=[
f"Available workflows: {', '.join(available_workflows)}",
"Use GET /workflows/ to see all available workflows"
]
"Use GET /workflows/ to see all available workflows",
],
)
raise HTTPException(status_code=404, detail=error_response)
@@ -384,10 +416,10 @@ async def upload_and_submit_workflow(
# Create temporary file
temp_fd, temp_file_path = tempfile.mkstemp(suffix=".tar.gz")
logger.info(f"Receiving file upload for workflow '{workflow_name}': {file.filename}")
logger.info("Receiving file upload for workflow '%s': %s", workflow_name, file.filename)
# Stream file to disk
with open(temp_fd, 'wb') as temp_file:
with open(temp_fd, "wb") as temp_file:
while True:
chunk = await file.read(chunk_size)
if not chunk:
@@ -406,33 +438,33 @@ async def upload_and_submit_workflow(
suggestions=[
"Reduce the size of your target directory",
"Exclude unnecessary files (build artifacts, dependencies, etc.)",
"Consider splitting into smaller analysis targets"
]
)
"Consider splitting into smaller analysis targets",
],
),
)
temp_file.write(chunk)
logger.info(f"Received file: {file_size / (1024**2):.2f} MB")
logger.info("Received file: %s MB", f"{file_size / (1024**2):.2f}")
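The streaming loop enforces the size limit incrementally, so an oversized upload fails as soon as the running total crosses the cap rather than after the whole body is buffered. A pure sketch of that guard (names are hypothetical):

```python
from collections.abc import Iterable


def guard_upload_size(chunks: Iterable[bytes], max_bytes: int) -> int:
    """Accumulate chunk sizes and fail fast once the limit is
    exceeded, mirroring the 413 check in the upload loop."""
    total = 0
    for chunk in chunks:
        total += len(chunk)
        if total > max_bytes:
            msg = f"Upload exceeds limit of {max_bytes} bytes"
            raise ValueError(msg)
    return total
```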
# Parse parameters
workflow_params = {}
if parameters:
try:
import json
workflow_params = json.loads(parameters)
if not isinstance(workflow_params, dict):
raise ValueError("Parameters must be a JSON object")
except (json.JSONDecodeError, ValueError) as e:
msg = "Parameters must be a JSON object"
raise TypeError(msg)
except (json.JSONDecodeError, TypeError) as e:
raise HTTPException(
status_code=400,
detail=create_structured_error_response(
error_type="InvalidParameters",
message=f"Invalid parameters JSON: {e}",
workflow_name=workflow_name,
suggestions=["Ensure parameters is valid JSON object"]
)
)
suggestions=["Ensure 'parameters' is a valid JSON object"],
),
) from e
# Upload to MinIO
target_id = await temporal_mgr.upload_target(
@@ -441,90 +473,84 @@ async def upload_and_submit_workflow(
metadata={
"workflow": workflow_name,
"original_filename": file.filename,
"upload_method": "multipart"
}
"upload_method": "multipart",
},
)
logger.info(f"Uploaded to MinIO with target_id: {target_id}")
logger.info("Uploaded to MinIO with target_id: %s", target_id)
# Merge default parameters with user parameters
workflow_info = temporal_mgr.workflows.get(workflow_name)
metadata = workflow_info.metadata or {}
defaults = metadata.get("default_parameters", {})
defaults = extract_defaults_from_json_schema(metadata)
workflow_params = {**defaults, **workflow_params}
# Start workflow execution
handle = await temporal_mgr.run_workflow(
workflow_name=workflow_name,
target_id=target_id,
workflow_params=workflow_params
workflow_params=workflow_params,
)
run_id = handle.id
# Initialize fuzzing tracking if needed
workflow_info = temporal_mgr.workflows.get(workflow_name, {})
workflow_tags = workflow_info.metadata.get("tags", []) if hasattr(workflow_info, 'metadata') else []
workflow_tags = workflow_info.metadata.get("tags", []) if hasattr(workflow_info, "metadata") else []
if "fuzzing" in workflow_tags or "fuzz" in workflow_name.lower():
from src.api.fuzzing import initialize_fuzzing_tracking
initialize_fuzzing_tracking(run_id, workflow_name)
return RunSubmissionResponse(
run_id=run_id,
status="RUNNING",
workflow=workflow_name,
message=f"Workflow '{workflow_name}' submitted successfully with uploaded target"
message=f"Workflow '{workflow_name}' submitted successfully with uploaded target",
)
except HTTPException:
raise
except Exception as e:
logger.error(f"Failed to upload and submit workflow '{workflow_name}': {e}")
logger.error(f"Traceback: {traceback.format_exc()}")
logger.exception("Failed to upload and submit workflow '%s'", workflow_name)
error_response = create_structured_error_response(
error_type="WorkflowSubmissionError",
message=f"Failed to process upload and submit workflow: {str(e)}",
message=f"Failed to process upload and submit workflow: {e!s}",
workflow_name=workflow_name,
suggestions=[
"Check if the uploaded file is a valid tarball",
"Verify MinIO storage is accessible",
"Check backend logs for detailed error information",
"Ensure Temporal workers are running"
]
"Ensure Temporal workers are running",
],
)
raise HTTPException(status_code=500, detail=error_response)
raise HTTPException(status_code=500, detail=error_response) from e
finally:
# Cleanup temporary file
if temp_file_path and Path(temp_file_path).exists():
try:
Path(temp_file_path).unlink()
logger.debug(f"Cleaned up temp file: {temp_file_path}")
logger.debug("Cleaned up temp file: %s", temp_file_path)
except Exception as e:
logger.warning(f"Failed to cleanup temp file {temp_file_path}: {e}")
logger.warning("Failed to cleanup temp file %s: %s", temp_file_path, e)
@router.get("/{workflow_name}/worker-info")
async def get_workflow_worker_info(
workflow_name: str,
temporal_mgr=Depends(get_temporal_manager)
) -> Dict[str, Any]:
"""
Get worker information for a workflow.
temporal_mgr: Annotated[TemporalManager, Depends(get_temporal_manager)],
) -> dict[str, Any]:
"""Get worker information for a workflow.
Returns details about which worker is required to execute this workflow,
including container name, task queue, and vertical.
Args:
workflow_name: Name of the workflow
:param workflow_name: Name of the workflow
:param temporal_mgr: The temporal manager instance.
:return: Worker information including vertical, container name, and task queue
:raises HTTPException: 404 if workflow not found
"""
if workflow_name not in temporal_mgr.workflows:
available_workflows = list(temporal_mgr.workflows.keys())
@@ -534,12 +560,12 @@ async def get_workflow_worker_info(
workflow_name=workflow_name,
suggestions=[
f"Available workflows: {', '.join(available_workflows)}",
"Use GET /workflows/ to see all available workflows",
],
)
raise HTTPException(
status_code=404,
detail=error_response,
)
info = temporal_mgr.workflows[workflow_name]
@@ -555,40 +581,35 @@ async def get_workflow_worker_info(
workflow_name=workflow_name,
suggestions=[
"Check workflow metadata.yaml for 'vertical' field",
"Contact workflow author for support",
],
)
raise HTTPException(
status_code=500,
detail=error_response,
)
return {
"workflow": workflow_name,
"vertical": vertical,
"worker_container": f"fuzzforge-worker-{vertical}",
"worker_service": f"worker-{vertical}",
"task_queue": f"{vertical}-queue",
"required": True,
}
@router.get("/{workflow_name}/parameters")
async def get_workflow_parameters(
workflow_name: str,
temporal_mgr: Annotated[TemporalManager, Depends(get_temporal_manager)],
) -> dict[str, Any]:
"""Get the parameters schema for a workflow.
:param workflow_name: Name of the workflow
:param temporal_mgr: The temporal manager instance.
:return: Parameters schema with types, descriptions, and defaults
:raises HTTPException: 404 if workflow not found
"""
if workflow_name not in temporal_mgr.workflows:
available_workflows = list(temporal_mgr.workflows.keys())
@@ -598,12 +619,12 @@ async def get_workflow_parameters(
workflow_name=workflow_name,
suggestions=[
f"Available workflows: {', '.join(available_workflows)}",
"Use GET /workflows/ to see all available workflows",
],
)
raise HTTPException(
status_code=404,
detail=error_response,
)
info = temporal_mgr.workflows[workflow_name]
@@ -613,23 +634,18 @@ async def get_workflow_parameters(
parameters_schema = metadata.get("parameters", {})
# Extract the actual parameter definitions from JSON schema structure
param_definitions = parameters_schema.get("properties", parameters_schema)
# Extract default values from JSON Schema
default_params = extract_defaults_from_json_schema(metadata)
return {
"workflow": workflow_name,
"parameters": param_definitions,
"default_parameters": default_params,
"required_parameters": [
name
for name, schema in param_definitions.items()
if isinstance(schema, dict) and schema.get("required", False)
],
}
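The refactor above swaps the manual default-merging loop for an `extract_defaults_from_json_schema` helper whose body is not shown in this diff. A minimal sketch of what such a helper could do — the implementation and the precedence of `default_parameters` over per-property defaults are assumptions, only the name and metadata layout come from the diff:

```python
from typing import Any


def extract_defaults_from_json_schema(metadata: dict[str, Any]) -> dict[str, Any]:
    """Collect per-parameter defaults from a workflow metadata dict.

    Hypothetical sketch: reads 'default' keys from the JSON Schema
    'properties' block, then lets an explicit 'default_parameters'
    mapping override them.
    """
    schema = metadata.get("parameters", {})
    properties = schema.get("properties", schema)
    defaults: dict[str, Any] = {}
    for name, prop in properties.items():
        if isinstance(prop, dict) and "default" in prop:
            defaults[name] = prop["default"]
    defaults.update(metadata.get("default_parameters", {}))
    return defaults
```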


@@ -1,6 +1,4 @@
"""Setup utilities for FuzzForge infrastructure."""
# Copyright (c) 2025 FuzzingLabs
#
@@ -18,9 +16,8 @@ import logging
logger = logging.getLogger(__name__)
async def setup_result_storage() -> bool:
"""Set up result storage (MinIO).
MinIO is used for both target upload and result storage.
This is a placeholder for any MinIO-specific setup if needed.
@@ -31,9 +28,8 @@ async def setup_result_storage():
return True
async def validate_infrastructure() -> None:
"""Validate all required infrastructure components.
This should be called during startup to ensure everything is ready.
"""


@@ -13,20 +13,19 @@ import asyncio
import logging
import os
from contextlib import AsyncExitStack, asynccontextmanager, suppress
from typing import Any
import uvicorn
from fastapi import FastAPI
from fastmcp import FastMCP
from fastmcp.server.http import create_sse_app
from starlette.applications import Starlette
from starlette.routing import Mount
from src.api import fuzzing, runs, system, workflows
from src.core.setup import setup_result_storage, validate_infrastructure
from src.temporal.discovery import WorkflowDiscovery
from src.temporal.manager import TemporalManager
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
@@ -38,12 +37,14 @@ class TemporalBootstrapState:
"""Tracks Temporal initialization progress for API and MCP consumers."""
def __init__(self) -> None:
"""Initialize an instance of the class."""
self.ready: bool = False
self.status: str = "not_started"
self.last_error: str | None = None
self.task_running: bool = False
def as_dict(self) -> dict[str, Any]:
"""Return the current state as a Python dictionary."""
return {
"ready": self.ready,
"status": self.status,
@@ -61,7 +62,7 @@ STARTUP_RETRY_MAX_SECONDS = max(
int(os.getenv("FUZZFORGE_STARTUP_RETRY_MAX_SECONDS", "60")),
)
temporal_bootstrap_task: asyncio.Task | None = None
# ---------------------------------------------------------------------------
# FastAPI application (REST API)
@@ -76,19 +77,18 @@ app = FastAPI(
app.include_router(workflows.router)
app.include_router(runs.router)
app.include_router(fuzzing.router)
app.include_router(system.router)
def get_temporal_status() -> dict[str, Any]:
"""Return a snapshot of Temporal bootstrap state for diagnostics."""
status = temporal_bootstrap_state.as_dict()
status["workflows_loaded"] = len(temporal_mgr.workflows)
status["bootstrap_task_running"] = temporal_bootstrap_task is not None and not temporal_bootstrap_task.done()
return status
def _temporal_not_ready_status() -> dict[str, Any] | None:
"""Return status details if Temporal is not ready yet."""
status = get_temporal_status()
if status.get("ready"):
@@ -97,7 +97,7 @@ def _temporal_not_ready_status() -> Optional[Dict[str, Any]]:
@app.get("/")
async def root() -> dict[str, Any]:
status = get_temporal_status()
return {
"name": "FuzzForge API",
@@ -109,14 +109,14 @@ async def root() -> Dict[str, Any]:
@app.get("/health")
async def health() -> dict[str, str]:
status = get_temporal_status()
health_status = "healthy" if status.get("ready") else "initializing"
return {"status": health_status}
# Map FastAPI OpenAPI operationIds to readable MCP tool names
FASTAPI_MCP_NAME_OVERRIDES: dict[str, str] = {
"list_workflows_workflows__get": "api_list_workflows",
"get_metadata_schema_workflows_metadata_schema_get": "api_get_metadata_schema",
"get_workflow_metadata_workflows__workflow_name__metadata_get": "api_get_workflow_metadata",
@@ -154,7 +154,6 @@ mcp = FastMCP(name="FuzzForge MCP")
async def _bootstrap_temporal_with_retries() -> None:
"""Initialize Temporal infrastructure with exponential backoff retries."""
attempt = 0
while True:
@@ -174,7 +173,6 @@ async def _bootstrap_temporal_with_retries() -> None:
temporal_bootstrap_state.status = "ready"
temporal_bootstrap_state.task_running = False
logger.info("Temporal infrastructure ready")
except asyncio.CancelledError:
temporal_bootstrap_state.status = "cancelled"
@@ -203,23 +201,17 @@ async def _bootstrap_temporal_with_retries() -> None:
temporal_bootstrap_state.status = "cancelled"
temporal_bootstrap_state.task_running = False
raise
else:
return
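The retry loop above is mostly elided by the hunk headers; the docstring promises exponential backoff bounded by `FUZZFORGE_STARTUP_RETRY_MAX_SECONDS`. The delay schedule that implies can be sketched in isolation — the function name and exact base/doubling formula are assumptions, not code from the diff:

```python
def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Exponential backoff: base * 2**attempt seconds, capped at `cap`.

    `cap` would mirror FUZZFORGE_STARTUP_RETRY_MAX_SECONDS (default 60).
    """
    return min(base * (2**attempt), cap)
```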
def _lookup_workflow(workflow_name: str) -> dict[str, Any] | None:
info = temporal_mgr.workflows.get(workflow_name)
if not info:
return None
metadata = info.metadata
defaults = metadata.get("default_parameters", {})
default_target_path = metadata.get("default_target_path") or defaults.get("target_path")
return {
"name": workflow_name,
"version": metadata.get("version", "0.6.0"),
@@ -229,14 +221,12 @@ def _lookup_workflow(workflow_name: str):
"parameters": metadata.get("parameters", {}),
"default_parameters": metadata.get("default_parameters", {}),
"required_modules": metadata.get("required_modules", []),
"default_target_path": default_target_path,
}
@mcp.tool
async def list_workflows_mcp() -> dict[str, Any]:
"""List all discovered workflows and their metadata summary."""
not_ready = _temporal_not_ready_status()
if not_ready:
@@ -250,24 +240,21 @@ async def list_workflows_mcp() -> Dict[str, Any]:
for name, info in temporal_mgr.workflows.items():
metadata = info.metadata
defaults = metadata.get("default_parameters", {})
workflows_summary.append(
{
"name": name,
"version": metadata.get("version", "0.6.0"),
"description": metadata.get("description", ""),
"author": metadata.get("author"),
"tags": metadata.get("tags", []),
"default_target_path": metadata.get("default_target_path") or defaults.get("target_path"),
},
)
return {"workflows": workflows_summary, "temporal": get_temporal_status()}
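The `metadata.get(...) or defaults.get(...)` chains in the summary above fall through on *any* falsy value, not only `None` — worth keeping in mind if an empty string could ever be a legitimate target path. In isolation:

```python
metadata = {"default_target_path": ""}  # empty string is falsy
defaults = {"target_path": "/workspace/src"}

# `or` skips the empty string and falls through to the defaults mapping
resolved = metadata.get("default_target_path") or defaults.get("target_path")
assert resolved == "/workspace/src"
```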
@mcp.tool
async def get_workflow_metadata_mcp(workflow_name: str) -> dict[str, Any]:
"""Fetch detailed metadata for a workflow."""
not_ready = _temporal_not_ready_status()
if not_ready:
@@ -283,7 +270,7 @@ async def get_workflow_metadata_mcp(workflow_name: str) -> Dict[str, Any]:
@mcp.tool
async def get_workflow_parameters_mcp(workflow_name: str) -> dict[str, Any]:
"""Return the parameter schema and defaults for a workflow."""
not_ready = _temporal_not_ready_status()
if not_ready:
@@ -302,9 +289,8 @@ async def get_workflow_parameters_mcp(workflow_name: str) -> Dict[str, Any]:
@mcp.tool
async def get_workflow_metadata_schema_mcp() -> dict[str, Any]:
"""Return the JSON schema describing workflow metadata files."""
return WorkflowDiscovery.get_metadata_schema()
@@ -312,8 +298,8 @@ async def get_workflow_metadata_schema_mcp() -> Dict[str, Any]:
async def submit_security_scan_mcp(
workflow_name: str,
target_id: str,
parameters: dict[str, Any] | None = None,
) -> dict[str, Any] | dict[str, str]:
"""Submit a Temporal workflow via MCP."""
try:
not_ready = _temporal_not_ready_status()
@@ -331,7 +317,7 @@ async def submit_security_scan_mcp(
defaults = metadata.get("default_parameters", {})
parameters = parameters or {}
cleaned_parameters: dict[str, Any] = {**defaults, **parameters}
# Ensure *_config structures default to dicts
for key, value in list(cleaned_parameters.items()):
@@ -340,9 +326,7 @@ async def submit_security_scan_mcp(
# Some workflows expect configuration dictionaries even when omitted
parameter_definitions = (
metadata.get("parameters", {}).get("properties", {}) if isinstance(metadata.get("parameters"), dict) else {}
)
for key, definition in parameter_definitions.items():
if not isinstance(key, str) or not key.endswith("_config"):
@@ -360,6 +344,10 @@ async def submit_security_scan_mcp(
workflow_params=cleaned_parameters,
)
except Exception as exc: # pragma: no cover - defensive logging
logger.exception("MCP submit failed")
return {"error": f"Failed to submit workflow: {exc}"}
else:
return {
"run_id": handle.id,
"status": "RUNNING",
@@ -369,13 +357,10 @@ async def submit_security_scan_mcp(
"parameters": cleaned_parameters,
"mcp_enabled": True,
}
@mcp.tool
async def get_comprehensive_scan_summary(run_id: str) -> dict[str, Any] | dict[str, str]:
"""Return a summary for the given workflow run via MCP."""
try:
not_ready = _temporal_not_ready_status()
@@ -398,7 +383,7 @@ async def get_comprehensive_scan_summary(run_id: str) -> Dict[str, Any] | Dict[s
summary = result.get("summary", {})
total_findings = summary.get("total_findings", 0)
except Exception as e:
logger.debug("Could not retrieve result for %s: %s", run_id, e)
return {
"run_id": run_id,
@@ -425,7 +410,7 @@ async def get_comprehensive_scan_summary(run_id: str) -> Dict[str, Any] | Dict[s
@mcp.tool
async def get_run_status_mcp(run_id: str) -> dict[str, Any]:
"""Return current status information for a Temporal run."""
try:
not_ready = _temporal_not_ready_status()
@@ -453,7 +438,7 @@ async def get_run_status_mcp(run_id: str) -> Dict[str, Any]:
@mcp.tool
async def get_run_findings_mcp(run_id: str) -> dict[str, Any]:
"""Return SARIF findings for a completed run."""
try:
not_ready = _temporal_not_ready_status()
@@ -476,24 +461,24 @@ async def get_run_findings_mcp(run_id: str) -> Dict[str, Any]:
sarif = result.get("sarif", {}) if isinstance(result, dict) else {}
except Exception as exc:
logger.exception("MCP findings failed")
return {"error": f"Failed to retrieve findings: {exc}"}
else:
return {
"workflow": "unknown",
"run_id": run_id,
"sarif": sarif,
"metadata": metadata,
}
@mcp.tool
async def list_recent_runs_mcp(
limit: int = 10,
workflow_name: str | None = None,
) -> dict[str, Any]:
"""List recent Temporal runs with optional workflow filter."""
not_ready = _temporal_not_ready_status()
if not_ready:
return {
@@ -518,19 +503,21 @@ async def list_recent_runs_mcp(
workflows = await temporal_mgr.list_workflows(filter_query, limit_value)
results: list[dict[str, Any]] = []
for wf in workflows:
results.append(
{
"run_id": wf["workflow_id"],
"workflow": workflow_name or "unknown",
"state": wf["status"],
"state_type": wf["status"],
"is_completed": wf["status"] in ["COMPLETED", "FAILED", "CANCELLED"],
"is_running": wf["status"] == "RUNNING",
"is_failed": wf["status"] == "FAILED",
"created_at": wf.get("start_time"),
"updated_at": wf.get("close_time"),
},
)
return {"runs": results, "temporal": get_temporal_status()}
@@ -539,12 +526,12 @@ async def list_recent_runs_mcp(
return {
"runs": [],
"temporal": get_temporal_status(),
"error": str(exc),
}
@mcp.tool
async def get_fuzzing_stats_mcp(run_id: str) -> dict[str, Any]:
"""Return fuzzing statistics for a run if available."""
not_ready = _temporal_not_ready_status()
if not_ready:
@@ -568,7 +555,7 @@ async def get_fuzzing_stats_mcp(run_id: str) -> Dict[str, Any]:
@mcp.tool
async def get_fuzzing_crash_reports_mcp(run_id: str) -> dict[str, Any]:
"""Return crash reports collected for a fuzzing run."""
not_ready = _temporal_not_ready_status()
if not_ready:
@@ -584,11 +571,10 @@ async def get_fuzzing_crash_reports_mcp(run_id: str) -> Dict[str, Any]:
@mcp.tool
async def get_backend_status_mcp() -> dict[str, Any]:
"""Expose backend readiness, workflows, and registered MCP tools."""
status = get_temporal_status()
response: dict[str, Any] = {"temporal": status}
if status.get("ready"):
response["workflows"] = list(temporal_mgr.workflows.keys())
@@ -604,7 +590,6 @@ async def get_backend_status_mcp() -> Dict[str, Any]:
def create_mcp_transport_app() -> Starlette:
"""Build a Starlette app serving HTTP + SSE transports on one port."""
http_app = mcp.http_app(path="/", transport="streamable-http")
sse_app = create_sse_app(
server=mcp,
@@ -622,10 +607,10 @@ def create_mcp_transport_app() -> Starlette:
async def lifespan(app: Starlette): # pragma: no cover - integration wiring
async with AsyncExitStack() as stack:
await stack.enter_async_context(
http_app.router.lifespan_context(http_app),
)
await stack.enter_async_context(
sse_app.router.lifespan_context(sse_app),
)
yield
@@ -640,6 +625,7 @@ def create_mcp_transport_app() -> Starlette:
# Combined lifespan: Temporal init + dedicated MCP transports
# ---------------------------------------------------------------------------
@asynccontextmanager
async def combined_lifespan(app: FastAPI):
global temporal_bootstrap_task, _fastapi_mcp_imported
@@ -688,13 +674,14 @@ async def combined_lifespan(app: FastAPI):
if getattr(mcp_server, "started", False):
return
await asyncio.sleep(poll_interval)
raise TimeoutError
try:
await _wait_for_uvicorn_startup()
except TimeoutError: # pragma: no cover - defensive logging
if mcp_task.done():
msg = "MCP server failed to start"
raise RuntimeError(msg) from mcp_task.exception()
logger.warning("Timed out waiting for MCP server startup; continuing anyway")
logger.info("MCP HTTP available at http://0.0.0.0:8010/mcp")
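The startup wait above polls a `started` flag with a poll interval and raises `TimeoutError` on expiry. The same pattern in isolation — the helper name and defaults are hypothetical, not from the diff:

```python
import asyncio
from collections.abc import Callable


async def wait_until(predicate: Callable[[], bool], timeout: float = 5.0, poll_interval: float = 0.05) -> None:
    """Poll `predicate()` until truthy; raise TimeoutError once `timeout` elapses."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while not predicate():
        if loop.time() >= deadline:
            raise TimeoutError
        await asyncio.sleep(poll_interval)
```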


@@ -1,6 +1,4 @@
"""Models for workflow findings and submissions."""
# Copyright (c) 2025 FuzzingLabs
#
@@ -13,40 +11,43 @@ Models for workflow findings and submissions
#
# Additional attribution and requirements are provided in the NOTICE file.
from datetime import datetime
from typing import Any
from pydantic import BaseModel, Field
class WorkflowFindings(BaseModel):
"""Findings from a workflow execution in SARIF format."""
workflow: str = Field(..., description="Workflow name")
run_id: str = Field(..., description="Unique run identifier")
sarif: dict[str, Any] = Field(..., description="SARIF formatted findings")
metadata: dict[str, Any] = Field(default_factory=dict, description="Additional metadata")
class WorkflowSubmission(BaseModel):
"""Submit a workflow with configurable settings.
Note: This model is deprecated in favor of the /upload-and-submit endpoint
which handles file uploads directly.
"""
parameters: dict[str, Any] = Field(
default_factory=dict,
description="Workflow-specific parameters",
)
timeout: int | None = Field(
default=None, # Allow workflow-specific defaults
description="Timeout in seconds (None for workflow default)",
ge=1,
le=604800, # Max 7 days to support fuzzing campaigns
)
class WorkflowStatus(BaseModel):
"""Status of a workflow run."""
run_id: str = Field(..., description="Unique run identifier")
workflow: str = Field(..., description="Workflow name")
status: str = Field(..., description="Current status")
@@ -58,38 +59,37 @@ class WorkflowStatus(BaseModel):
class WorkflowMetadata(BaseModel):
"""Complete metadata for a workflow."""
name: str = Field(..., description="Workflow name")
version: str = Field(..., description="Semantic version")
description: str = Field(..., description="Workflow description")
author: str | None = Field(None, description="Workflow author")
tags: list[str] = Field(default_factory=list, description="Workflow tags")
parameters: dict[str, Any] = Field(..., description="Parameters schema")
default_parameters: dict[str, Any] = Field(
default_factory=dict,
description="Default parameter values",
)
required_modules: list[str] = Field(
default_factory=list,
description="Required module names",
)
class WorkflowListItem(BaseModel):
"""Summary information for a workflow in list views."""
name: str = Field(..., description="Workflow name")
version: str = Field(..., description="Semantic version")
description: str = Field(..., description="Workflow description")
author: str | None = Field(None, description="Workflow author")
tags: list[str] = Field(default_factory=list, description="Workflow tags")
class RunSubmissionResponse(BaseModel):
"""Response after submitting a workflow."""
run_id: str = Field(..., description="Unique run identifier")
status: str = Field(..., description="Initial status")
workflow: str = Field(..., description="Workflow name")
@@ -97,28 +97,30 @@ class RunSubmissionResponse(BaseModel):
class FuzzingStats(BaseModel):
"""Real-time fuzzing statistics."""
run_id: str = Field(..., description="Unique run identifier")
workflow: str = Field(..., description="Workflow name")
executions: int = Field(default=0, description="Total executions")
executions_per_sec: float = Field(default=0.0, description="Current execution rate")
crashes: int = Field(default=0, description="Total crashes found")
unique_crashes: int = Field(default=0, description="Unique crashes")
coverage: float | None = Field(None, description="Code coverage percentage")
corpus_size: int = Field(default=0, description="Current corpus size")
elapsed_time: int = Field(default=0, description="Elapsed time in seconds")
last_crash_time: datetime | None = Field(None, description="Time of last crash")
class CrashReport(BaseModel):
"""Individual crash report from fuzzing."""
run_id: str = Field(..., description="Run identifier")
crash_id: str = Field(..., description="Unique crash identifier")
timestamp: datetime = Field(default_factory=datetime.utcnow)
signal: str | None = Field(None, description="Crash signal (SIGSEGV, etc.)")
crash_type: str | None = Field(None, description="Type of crash")
stack_trace: str | None = Field(None, description="Stack trace")
input_file: str | None = Field(None, description="Path to crashing input")
reproducer: str | None = Field(None, description="Minimized reproducer")
severity: str = Field(default="medium", description="Crash severity")
exploitability: str | None = Field(None, description="Exploitability assessment")


@@ -1,5 +1,4 @@
"""Storage abstraction layer for FuzzForge.
Provides unified interface for storing and retrieving targets and results.
"""
@@ -7,4 +6,4 @@ Provides unified interface for storing and retrieving targets and results.
from .base import StorageBackend
from .s3_cached import S3CachedStorage
__all__ = ["S3CachedStorage", "StorageBackend"]


@@ -1,17 +1,15 @@
"""Base storage backend interface.
All storage implementations must implement this interface.
"""
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Any
class StorageBackend(ABC):
"""Abstract base class for storage backends.
Implementations handle storage and retrieval of:
- Uploaded targets (code, binaries, etc.)
@@ -24,10 +22,9 @@ class StorageBackend(ABC):
self,
file_path: Path,
user_id: str,
metadata: dict[str, Any] | None = None,
) -> str:
"""Upload a target file to storage.
Args:
file_path: Local path to file to upload
@@ -40,13 +37,12 @@ class StorageBackend(ABC):
Raises:
FileNotFoundError: If file_path doesn't exist
StorageError: If upload fails
"""
@abstractmethod
async def get_target(self, target_id: str) -> Path:
"""Get target file from storage.
Args:
target_id: Unique identifier from upload_target()
@@ -57,31 +53,29 @@ class StorageBackend(ABC):
Raises:
FileNotFoundError: If target doesn't exist
StorageError: If download fails
"""
@abstractmethod
async def delete_target(self, target_id: str) -> None:
"""Delete target from storage.
Args:
target_id: Unique identifier to delete
Raises:
StorageError: If deletion fails (doesn't raise if not found)
"""
@abstractmethod
async def upload_results(
self,
workflow_id: str,
results: dict[str, Any],
results_format: str = "json",
) -> str:
"""Upload workflow results to storage.
Args:
workflow_id: Workflow execution ID
@@ -93,13 +87,12 @@ class StorageBackend(ABC):
Raises:
StorageError: If upload fails
"""
@abstractmethod
async def get_results(self, workflow_id: str) -> dict[str, Any]:
"""Get workflow results from storage.
Args:
workflow_id: Workflow execution ID
@@ -110,17 +103,16 @@ class StorageBackend(ABC):
Raises:
FileNotFoundError: If results don't exist
StorageError: If download fails
"""
@abstractmethod
async def list_targets(
self,
user_id: str | None = None,
limit: int = 100,
) -> list[dict[str, Any]]:
"""List uploaded targets.
Args:
user_id: Filter by user ID (None = all users)
@@ -131,23 +123,21 @@ class StorageBackend(ABC):
Raises:
StorageError: If listing fails
"""
@abstractmethod
async def cleanup_cache(self) -> int:
"""Clean up local cache (LRU eviction).
Returns:
Number of files removed
Raises:
StorageError: If cleanup fails
"""
class StorageError(Exception):
"""Base exception for storage operations."""
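The removed `pass` statements were redundant: a docstring already forms a valid method body. The interface pattern above, trimmed to two methods with a hypothetical in-memory implementation (names `MiniStorageBackend`/`MemoryBackend` are illustrative, not from the codebase), could look like:

```python
from abc import ABC, abstractmethod
from typing import Any


class MiniStorageBackend(ABC):
    """Trimmed illustration of the StorageBackend ABC above."""

    @abstractmethod
    async def upload_results(self, workflow_id: str, results: dict[str, Any]) -> str:
        """Store results; return a storage key."""

    @abstractmethod
    async def get_results(self, workflow_id: str) -> dict[str, Any]:
        """Fetch previously stored results."""


class MemoryBackend(MiniStorageBackend):
    """Hypothetical in-memory backend, handy for unit tests."""

    def __init__(self) -> None:
        self._store: dict[str, dict[str, Any]] = {}

    async def upload_results(self, workflow_id: str, results: dict[str, Any]) -> str:
        self._store[workflow_id] = results
        return f"mem://{workflow_id}"

    async def get_results(self, workflow_id: str) -> dict[str, Any]:
        return self._store[workflow_id]
```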


@@ -1,5 +1,4 @@
"""S3-compatible storage backend with local caching.
Works with MinIO (dev/prod) or AWS S3 (cloud).
"""
@@ -10,7 +9,7 @@ import os
import shutil
from datetime import datetime
from pathlib import Path
from typing import Any
from uuid import uuid4
import boto3
@@ -22,8 +21,7 @@ logger = logging.getLogger(__name__)
class S3CachedStorage(StorageBackend):
"""S3-compatible storage with local caching.
Features:
- Upload targets to S3/MinIO
@@ -34,17 +32,16 @@ class S3CachedStorage(StorageBackend):
def __init__(
self,
endpoint_url: str | None = None,
access_key: str | None = None,
secret_key: str | None = None,
bucket: str = "targets",
region: str = "us-east-1",
use_ssl: bool = False,
cache_dir: Path | None = None,
cache_max_size_gb: int = 10,
) -> None:
"""Initialize S3 storage backend.
Args:
endpoint_url: S3 endpoint (None = AWS S3, or MinIO URL)
@@ -55,18 +52,19 @@ class S3CachedStorage(StorageBackend):
use_ssl: Use HTTPS
cache_dir: Local cache directory
cache_max_size_gb: Maximum cache size in GB
"""
# Use environment variables as defaults
self.endpoint_url = endpoint_url or os.getenv("S3_ENDPOINT", "http://minio:9000")
self.access_key = access_key or os.getenv("S3_ACCESS_KEY", "fuzzforge")
self.secret_key = secret_key or os.getenv("S3_SECRET_KEY", "fuzzforge123")
self.bucket = bucket or os.getenv("S3_BUCKET", "targets")
self.region = region or os.getenv("S3_REGION", "us-east-1")
self.use_ssl = use_ssl or os.getenv("S3_USE_SSL", "false").lower() == "true"
# Cache configuration
self.cache_dir = cache_dir or Path(os.getenv('CACHE_DIR', '/tmp/fuzzforge-cache'))
self.cache_max_size = cache_max_size_gb * (1024 ** 3) # Convert to bytes
self.cache_dir = cache_dir or Path(os.getenv("CACHE_DIR", "/tmp/fuzzforge-cache"))
self.cache_max_size = cache_max_size_gb * (1024**3) # Convert to bytes
# Ensure cache directory exists
self.cache_dir.mkdir(parents=True, exist_ok=True)
@@ -74,69 +72,75 @@ class S3CachedStorage(StorageBackend):
# Initialize S3 client
try:
self.s3_client = boto3.client(
's3',
"s3",
endpoint_url=self.endpoint_url,
aws_access_key_id=self.access_key,
aws_secret_access_key=self.secret_key,
region_name=self.region,
use_ssl=self.use_ssl
use_ssl=self.use_ssl,
)
logger.info(f"Initialized S3 storage: {self.endpoint_url}/{self.bucket}")
logger.info("Initialized S3 storage: %s/%s", self.endpoint_url, self.bucket)
except Exception as e:
logger.error(f"Failed to initialize S3 client: {e}")
raise StorageError(f"S3 initialization failed: {e}")
logger.exception("Failed to initialize S3 client")
msg = f"S3 initialization failed: {e}"
raise StorageError(msg) from e
async def upload_target(
self,
file_path: Path,
user_id: str,
metadata: Optional[Dict[str, Any]] = None
metadata: dict[str, Any] | None = None,
) -> str:
"""Upload target file to S3/MinIO."""
if not file_path.exists():
raise FileNotFoundError(f"File not found: {file_path}")
msg = f"File not found: {file_path}"
raise FileNotFoundError(msg)
# Generate unique target ID
target_id = str(uuid4())
# Prepare metadata
upload_metadata = {
'user_id': user_id,
'uploaded_at': datetime.now().isoformat(),
'filename': file_path.name,
'size': str(file_path.stat().st_size)
"user_id": user_id,
"uploaded_at": datetime.now().isoformat(),
"filename": file_path.name,
"size": str(file_path.stat().st_size),
}
if metadata:
upload_metadata.update(metadata)
# Upload to S3
s3_key = f'{target_id}/target'
s3_key = f"{target_id}/target"
try:
logger.info(f"Uploading target to s3://{self.bucket}/{s3_key}")
logger.info("Uploading target to s3://%s/%s", self.bucket, s3_key)
self.s3_client.upload_file(
str(file_path),
self.bucket,
s3_key,
ExtraArgs={
'Metadata': upload_metadata
}
"Metadata": upload_metadata,
},
)
file_size_mb = file_path.stat().st_size / (1024 * 1024)
logger.info(
f"✓ Uploaded target {target_id} "
f"({file_path.name}, {file_size_mb:.2f} MB)"
"✓ Uploaded target %s (%s, %s MB)",
target_id,
file_path.name,
f"{file_size_mb:.2f}",
)
return target_id
except ClientError as e:
logger.error(f"S3 upload failed: {e}", exc_info=True)
raise StorageError(f"Failed to upload target: {e}")
logger.exception("S3 upload failed")
msg = f"Failed to upload target: {e}"
raise StorageError(msg) from e
except Exception as e:
logger.error(f"Upload failed: {e}", exc_info=True)
raise StorageError(f"Upload error: {e}")
logger.exception("Upload failed")
msg = f"Upload error: {e}"
raise StorageError(msg) from e
else:
return target_id
async def get_target(self, target_id: str) -> Path:
"""Get target from cache or download from S3/MinIO."""
@@ -147,105 +151,110 @@ class S3CachedStorage(StorageBackend):
if cached_file.exists():
# Update access time for LRU
cached_file.touch()
logger.info(f"Cache HIT: {target_id}")
logger.info("Cache HIT: %s", target_id)
return cached_file
# Cache miss - download from S3
logger.info(f"Cache MISS: {target_id}, downloading from S3...")
logger.info("Cache MISS: %s, downloading from S3...", target_id)
try:
# Create cache directory
cache_path.mkdir(parents=True, exist_ok=True)
# Download from S3
s3_key = f'{target_id}/target'
logger.info(f"Downloading s3://{self.bucket}/{s3_key}")
s3_key = f"{target_id}/target"
logger.info("Downloading s3://%s/%s", self.bucket, s3_key)
self.s3_client.download_file(
self.bucket,
s3_key,
str(cached_file)
str(cached_file),
)
# Verify download
if not cached_file.exists():
raise StorageError(f"Downloaded file not found: {cached_file}")
msg = f"Downloaded file not found: {cached_file}"
raise StorageError(msg)
file_size_mb = cached_file.stat().st_size / (1024 * 1024)
logger.info(f"✓ Downloaded target {target_id} ({file_size_mb:.2f} MB)")
return cached_file
logger.info("✓ Downloaded target %s (%s MB)", target_id, f"{file_size_mb:.2f}")
except ClientError as e:
error_code = e.response.get('Error', {}).get('Code')
if error_code in ['404', 'NoSuchKey']:
logger.error(f"Target not found: {target_id}")
raise FileNotFoundError(f"Target {target_id} not found in storage")
else:
logger.error(f"S3 download failed: {e}", exc_info=True)
raise StorageError(f"Download failed: {e}")
error_code = e.response.get("Error", {}).get("Code")
if error_code in ["404", "NoSuchKey"]:
logger.exception("Target not found: %s", target_id)
msg = f"Target {target_id} not found in storage"
raise FileNotFoundError(msg) from e
logger.exception("S3 download failed")
msg = f"Download failed: {e}"
raise StorageError(msg) from e
except Exception as e:
logger.error(f"Download error: {e}", exc_info=True)
logger.exception("Download error")
# Cleanup partial download
if cache_path.exists():
shutil.rmtree(cache_path, ignore_errors=True)
raise StorageError(f"Download error: {e}")
msg = f"Download error: {e}"
raise StorageError(msg) from e
else:
return cached_file
async def delete_target(self, target_id: str) -> None:
"""Delete target from S3/MinIO."""
try:
s3_key = f'{target_id}/target'
logger.info(f"Deleting s3://{self.bucket}/{s3_key}")
s3_key = f"{target_id}/target"
logger.info("Deleting s3://%s/%s", self.bucket, s3_key)
self.s3_client.delete_object(
Bucket=self.bucket,
Key=s3_key
Key=s3_key,
)
# Also delete from cache if present
cache_path = self.cache_dir / target_id
if cache_path.exists():
shutil.rmtree(cache_path, ignore_errors=True)
logger.info(f"✓ Deleted target {target_id} from S3 and cache")
logger.info("✓ Deleted target %s from S3 and cache", target_id)
else:
logger.info(f"✓ Deleted target {target_id} from S3")
logger.info("✓ Deleted target %s from S3", target_id)
except ClientError as e:
logger.error(f"S3 delete failed: {e}", exc_info=True)
logger.exception("S3 delete failed")
# Don't raise error if object doesn't exist
if e.response.get('Error', {}).get('Code') not in ['404', 'NoSuchKey']:
raise StorageError(f"Delete failed: {e}")
if e.response.get("Error", {}).get("Code") not in ["404", "NoSuchKey"]:
msg = f"Delete failed: {e}"
raise StorageError(msg) from e
except Exception as e:
logger.error(f"Delete error: {e}", exc_info=True)
raise StorageError(f"Delete error: {e}")
logger.exception("Delete error")
msg = f"Delete error: {e}"
raise StorageError(msg) from e
async def upload_results(
self,
workflow_id: str,
results: Dict[str, Any],
results_format: str = "json"
results: dict[str, Any],
results_format: str = "json",
) -> str:
"""Upload workflow results to S3/MinIO."""
try:
# Prepare results content
if results_format == "json":
content = json.dumps(results, indent=2).encode('utf-8')
content_type = 'application/json'
file_ext = 'json'
content = json.dumps(results, indent=2).encode("utf-8")
content_type = "application/json"
file_ext = "json"
elif results_format == "sarif":
content = json.dumps(results, indent=2).encode('utf-8')
content_type = 'application/sarif+json'
file_ext = 'sarif'
content = json.dumps(results, indent=2).encode("utf-8")
content_type = "application/sarif+json"
file_ext = "sarif"
else:
content = json.dumps(results, indent=2).encode('utf-8')
content_type = 'application/json'
file_ext = 'json'
content = json.dumps(results, indent=2).encode("utf-8")
content_type = "application/json"
file_ext = "json"
# Upload to results bucket
results_bucket = 'results'
s3_key = f'{workflow_id}/results.{file_ext}'
results_bucket = "results"
s3_key = f"{workflow_id}/results.{file_ext}"
logger.info(f"Uploading results to s3://{results_bucket}/{s3_key}")
logger.info("Uploading results to s3://%s/%s", results_bucket, s3_key)
self.s3_client.put_object(
Bucket=results_bucket,
@@ -253,95 +262,103 @@ class S3CachedStorage(StorageBackend):
Body=content,
ContentType=content_type,
Metadata={
'workflow_id': workflow_id,
'format': results_format,
'uploaded_at': datetime.now().isoformat()
}
"workflow_id": workflow_id,
"format": results_format,
"uploaded_at": datetime.now().isoformat(),
},
)
# Construct URL
results_url = f"{self.endpoint_url}/{results_bucket}/{s3_key}"
logger.info(f"✓ Uploaded results: {results_url}")
return results_url
logger.info("✓ Uploaded results: %s", results_url)
except Exception as e:
logger.error(f"Results upload failed: {e}", exc_info=True)
raise StorageError(f"Results upload failed: {e}")
logger.exception("Results upload failed")
msg = f"Results upload failed: {e}"
raise StorageError(msg) from e
else:
return results_url
async def get_results(self, workflow_id: str) -> Dict[str, Any]:
async def get_results(self, workflow_id: str) -> dict[str, Any]:
"""Get workflow results from S3/MinIO."""
try:
results_bucket = 'results'
s3_key = f'{workflow_id}/results.json'
results_bucket = "results"
s3_key = f"{workflow_id}/results.json"
logger.info(f"Downloading results from s3://{results_bucket}/{s3_key}")
logger.info("Downloading results from s3://%s/%s", results_bucket, s3_key)
response = self.s3_client.get_object(
Bucket=results_bucket,
Key=s3_key
Key=s3_key,
)
content = response['Body'].read().decode('utf-8')
content = response["Body"].read().decode("utf-8")
results = json.loads(content)
logger.info(f"✓ Downloaded results for workflow {workflow_id}")
return results
logger.info("✓ Downloaded results for workflow %s", workflow_id)
except ClientError as e:
error_code = e.response.get('Error', {}).get('Code')
if error_code in ['404', 'NoSuchKey']:
logger.error(f"Results not found: {workflow_id}")
raise FileNotFoundError(f"Results for workflow {workflow_id} not found")
else:
logger.error(f"Results download failed: {e}", exc_info=True)
raise StorageError(f"Results download failed: {e}")
error_code = e.response.get("Error", {}).get("Code")
if error_code in ["404", "NoSuchKey"]:
logger.exception("Results not found: %s", workflow_id)
msg = f"Results for workflow {workflow_id} not found"
raise FileNotFoundError(msg) from e
logger.exception("Results download failed")
msg = f"Results download failed: {e}"
raise StorageError(msg) from e
except Exception as e:
logger.error(f"Results download error: {e}", exc_info=True)
raise StorageError(f"Results download error: {e}")
logger.exception("Results download error")
msg = f"Results download error: {e}"
raise StorageError(msg) from e
else:
return results
async def list_targets(
self,
user_id: Optional[str] = None,
limit: int = 100
) -> list[Dict[str, Any]]:
user_id: str | None = None,
limit: int = 100,
) -> list[dict[str, Any]]:
"""List uploaded targets."""
try:
targets = []
paginator = self.s3_client.get_paginator('list_objects_v2')
paginator = self.s3_client.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=self.bucket, PaginationConfig={'MaxItems': limit}):
for obj in page.get('Contents', []):
for page in paginator.paginate(Bucket=self.bucket, PaginationConfig={"MaxItems": limit}):
for obj in page.get("Contents", []):
# Get object metadata
try:
metadata_response = self.s3_client.head_object(
Bucket=self.bucket,
Key=obj['Key']
Key=obj["Key"],
)
metadata = metadata_response.get('Metadata', {})
metadata = metadata_response.get("Metadata", {})
# Filter by user_id if specified
if user_id and metadata.get('user_id') != user_id:
if user_id and metadata.get("user_id") != user_id:
continue
targets.append({
'target_id': obj['Key'].split('/')[0],
'key': obj['Key'],
'size': obj['Size'],
'last_modified': obj['LastModified'].isoformat(),
'metadata': metadata
})
targets.append(
{
"target_id": obj["Key"].split("/")[0],
"key": obj["Key"],
"size": obj["Size"],
"last_modified": obj["LastModified"].isoformat(),
"metadata": metadata,
},
)
except Exception as e:
logger.warning(f"Failed to get metadata for {obj['Key']}: {e}")
logger.warning("Failed to get metadata for %s: %s", obj["Key"], e)
continue
logger.info(f"Listed {len(targets)} targets (user_id={user_id})")
return targets
logger.info("Listed %s targets (user_id=%s)", len(targets), user_id)
except Exception as e:
logger.error(f"List targets failed: {e}", exc_info=True)
raise StorageError(f"List targets failed: {e}")
logger.exception("List targets failed")
msg = f"List targets failed: {e}"
raise StorageError(msg) from e
else:
return targets
async def cleanup_cache(self) -> int:
"""Clean up local cache using LRU eviction."""
@@ -350,30 +367,33 @@ class S3CachedStorage(StorageBackend):
total_size = 0
# Gather all cached files with metadata
for cache_file in self.cache_dir.rglob('*'):
for cache_file in self.cache_dir.rglob("*"):
if cache_file.is_file():
try:
stat = cache_file.stat()
cache_files.append({
'path': cache_file,
'size': stat.st_size,
'atime': stat.st_atime # Last access time
})
cache_files.append(
{
"path": cache_file,
"size": stat.st_size,
"atime": stat.st_atime, # Last access time
},
)
total_size += stat.st_size
except Exception as e:
logger.warning(f"Failed to stat {cache_file}: {e}")
logger.warning("Failed to stat %s: %s", cache_file, e)
continue
# Check if cleanup is needed
if total_size <= self.cache_max_size:
logger.info(
f"Cache size OK: {total_size / (1024**3):.2f} GB / "
f"{self.cache_max_size / (1024**3):.2f} GB"
"Cache size OK: %s GB / %s GB",
f"{total_size / (1024**3):.2f}",
f"{self.cache_max_size / (1024**3):.2f}",
)
return 0
# Sort by access time (oldest first)
cache_files.sort(key=lambda x: x['atime'])
cache_files.sort(key=lambda x: x["atime"])
# Remove files until under limit
removed_count = 0
@@ -382,42 +402,46 @@ class S3CachedStorage(StorageBackend):
break
try:
file_info['path'].unlink()
total_size -= file_info['size']
file_info["path"].unlink()
total_size -= file_info["size"]
removed_count += 1
logger.debug(f"Evicted from cache: {file_info['path']}")
logger.debug("Evicted from cache: %s", file_info["path"])
except Exception as e:
logger.warning(f"Failed to delete {file_info['path']}: {e}")
logger.warning("Failed to delete %s: %s", file_info["path"], e)
continue
logger.info(
f"✓ Cache cleanup: removed {removed_count} files, "
f"new size: {total_size / (1024**3):.2f} GB"
"✓ Cache cleanup: removed %s files, new size: %s GB",
removed_count,
f"{total_size / (1024**3):.2f}",
)
return removed_count
except Exception as e:
logger.error(f"Cache cleanup failed: {e}", exc_info=True)
raise StorageError(f"Cache cleanup failed: {e}")
logger.exception("Cache cleanup failed")
msg = f"Cache cleanup failed: {e}"
raise StorageError(msg) from e
def get_cache_stats(self) -> Dict[str, Any]:
else:
return removed_count
def get_cache_stats(self) -> dict[str, Any]:
"""Get cache statistics."""
try:
total_size = 0
file_count = 0
for cache_file in self.cache_dir.rglob('*'):
for cache_file in self.cache_dir.rglob("*"):
if cache_file.is_file():
total_size += cache_file.stat().st_size
file_count += 1
return {
'total_size_bytes': total_size,
'total_size_gb': total_size / (1024 ** 3),
'file_count': file_count,
'max_size_gb': self.cache_max_size / (1024 ** 3),
'usage_percent': (total_size / self.cache_max_size) * 100
"total_size_bytes": total_size,
"total_size_gb": total_size / (1024**3),
"file_count": file_count,
"max_size_gb": self.cache_max_size / (1024**3),
"usage_percent": (total_size / self.cache_max_size) * 100,
}
except Exception as e:
logger.error(f"Failed to get cache stats: {e}")
return {'error': str(e)}
logger.exception("Failed to get cache stats")
return {"error": str(e)}

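The atime-sorted eviction that `cleanup_cache` performs can be illustrated standalone. The helper below is a hypothetical reduction of the same strategy (collect files, stop if under budget, otherwise delete in order of oldest access time), not FuzzForge code:

```python
from pathlib import Path


def evict_lru(cache_dir: Path, max_bytes: int) -> int:
    """Delete least-recently-accessed files until the cache fits max_bytes.

    Returns the number of files removed.
    """
    files = []
    total = 0
    for f in cache_dir.rglob("*"):
        if f.is_file():
            st = f.stat()
            files.append((st.st_atime, st.st_size, f))
            total += st.st_size

    if total <= max_bytes:
        return 0  # cache already within budget, nothing to evict

    removed = 0
    for _, size, f in sorted(files):  # oldest access time first
        if total <= max_bytes:
            break
        f.unlink()
        total -= size
        removed += 1
    return removed
```

Because eviction keys on `st_atime`, the `cached_file.touch()` in the cache-HIT path above is what keeps recently used targets alive.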
View File

@@ -1,10 +1,9 @@
"""
Temporal integration for FuzzForge.
"""Temporal integration for FuzzForge.
Handles workflow execution, monitoring, and management.
"""
from .manager import TemporalManager
from .discovery import WorkflowDiscovery
from .manager import TemporalManager
__all__ = ["TemporalManager", "WorkflowDiscovery"]

View File

@@ -1,25 +1,26 @@
"""
Workflow Discovery for Temporal
"""Workflow Discovery for Temporal.
Discovers workflows from the toolbox/workflows directory
and provides metadata about available workflows.
"""
import logging
import yaml
from pathlib import Path
from typing import Dict, Any
from pydantic import BaseModel, Field, ConfigDict
from typing import Any
import yaml
from pydantic import BaseModel, ConfigDict, Field
logger = logging.getLogger(__name__)
class WorkflowInfo(BaseModel):
"""Information about a discovered workflow"""
"""Information about a discovered workflow."""
name: str = Field(..., description="Workflow name")
path: Path = Field(..., description="Path to workflow directory")
workflow_file: Path = Field(..., description="Path to workflow.py file")
metadata: Dict[str, Any] = Field(..., description="Workflow metadata from YAML")
metadata: dict[str, Any] = Field(..., description="Workflow metadata from YAML")
workflow_type: str = Field(..., description="Workflow class name")
vertical: str = Field(..., description="Vertical (worker type) for this workflow")
@@ -27,8 +28,7 @@ class WorkflowInfo(BaseModel):
class WorkflowDiscovery:
"""
Discovers workflows from the filesystem.
"""Discovers workflows from the filesystem.
Scans toolbox/workflows/ for directories containing:
- metadata.yaml (required)
@@ -38,106 +38,109 @@ class WorkflowDiscovery:
which determines which worker pool will execute it.
"""
def __init__(self, workflows_dir: Path):
"""
Initialize workflow discovery.
def __init__(self, workflows_dir: Path) -> None:
"""Initialize workflow discovery.
Args:
workflows_dir: Path to the workflows directory
"""
self.workflows_dir = workflows_dir
if not self.workflows_dir.exists():
self.workflows_dir.mkdir(parents=True, exist_ok=True)
logger.info(f"Created workflows directory: {self.workflows_dir}")
logger.info("Created workflows directory: %s", self.workflows_dir)
async def discover_workflows(self) -> Dict[str, WorkflowInfo]:
"""
Discover workflows by scanning the workflows directory.
async def discover_workflows(self) -> dict[str, WorkflowInfo]:
"""Discover workflows by scanning the workflows directory.
Returns:
Dictionary mapping workflow names to their information
"""
workflows = {}
logger.info(f"Scanning for workflows in: {self.workflows_dir}")
logger.info("Scanning for workflows in: %s", self.workflows_dir)
for workflow_dir in self.workflows_dir.iterdir():
if not workflow_dir.is_dir():
continue
# Skip special directories
if workflow_dir.name.startswith('.') or workflow_dir.name == '__pycache__':
if workflow_dir.name.startswith(".") or workflow_dir.name == "__pycache__":
continue
metadata_file = workflow_dir / "metadata.yaml"
if not metadata_file.exists():
logger.debug(f"No metadata.yaml in {workflow_dir.name}, skipping")
logger.debug("No metadata.yaml in %s, skipping", workflow_dir.name)
continue
workflow_file = workflow_dir / "workflow.py"
if not workflow_file.exists():
logger.warning(
f"Workflow {workflow_dir.name} has metadata but no workflow.py, skipping"
"Workflow %s has metadata but no workflow.py, skipping",
workflow_dir.name,
)
continue
try:
# Parse metadata
with open(metadata_file) as f:
with metadata_file.open() as f:
metadata = yaml.safe_load(f)
# Validate required fields
if 'name' not in metadata:
logger.warning(f"Workflow {workflow_dir.name} metadata missing 'name' field")
metadata['name'] = workflow_dir.name
if "name" not in metadata:
logger.warning("Workflow %s metadata missing 'name' field", workflow_dir.name)
metadata["name"] = workflow_dir.name
if 'vertical' not in metadata:
if "vertical" not in metadata:
logger.warning(
f"Workflow {workflow_dir.name} metadata missing 'vertical' field"
"Workflow %s metadata missing 'vertical' field",
workflow_dir.name,
)
continue
# Infer workflow class name from metadata or use convention
workflow_type = metadata.get('workflow_class')
workflow_type = metadata.get("workflow_class")
if not workflow_type:
# Convention: convert snake_case to PascalCase + Workflow
# e.g., rust_test -> RustTestWorkflow
parts = workflow_dir.name.split('_')
workflow_type = ''.join(part.capitalize() for part in parts) + 'Workflow'
parts = workflow_dir.name.split("_")
workflow_type = "".join(part.capitalize() for part in parts) + "Workflow"
# Create workflow info
info = WorkflowInfo(
name=metadata['name'],
name=metadata["name"],
path=workflow_dir,
workflow_file=workflow_file,
metadata=metadata,
workflow_type=workflow_type,
vertical=metadata['vertical']
vertical=metadata["vertical"],
)
workflows[info.name] = info
logger.info(
f"✓ Discovered workflow: {info.name} "
f"(vertical: {info.vertical}, class: {info.workflow_type})"
"✓ Discovered workflow: %s (vertical: %s, class: %s)",
info.name,
info.vertical,
info.workflow_type,
)
except Exception as e:
logger.error(
f"Error discovering workflow {workflow_dir.name}: {e}",
exc_info=True
except Exception:
logger.exception(
"Error discovering workflow %s",
workflow_dir.name,
)
continue
logger.info(f"Discovered {len(workflows)} workflows")
logger.info("Discovered %s workflows", len(workflows))
return workflows
def get_workflows_by_vertical(
self,
workflows: Dict[str, WorkflowInfo],
vertical: str
) -> Dict[str, WorkflowInfo]:
"""
Filter workflows by vertical.
workflows: dict[str, WorkflowInfo],
vertical: str,
) -> dict[str, WorkflowInfo]:
"""Filter workflows by vertical.
Args:
workflows: All discovered workflows
@@ -145,32 +148,29 @@ class WorkflowDiscovery:
Returns:
Filtered workflows dictionary
"""
return {
name: info
for name, info in workflows.items()
if info.vertical == vertical
}
def get_available_verticals(self, workflows: Dict[str, WorkflowInfo]) -> list[str]:
"""
Get list of all verticals from discovered workflows.
return {name: info for name, info in workflows.items() if info.vertical == vertical}
def get_available_verticals(self, workflows: dict[str, WorkflowInfo]) -> list[str]:
"""Get list of all verticals from discovered workflows.
Args:
workflows: All discovered workflows
Returns:
List of unique vertical names
"""
return list(set(info.vertical for info in workflows.values()))
        return list({info.vertical for info in workflows.values()})
@staticmethod
def get_metadata_schema() -> Dict[str, Any]:
"""
Get the JSON schema for workflow metadata.
def get_metadata_schema() -> dict[str, Any]:
"""Get the JSON schema for workflow metadata.
Returns:
JSON schema dictionary
"""
return {
"type": "object",
@@ -178,34 +178,34 @@ class WorkflowDiscovery:
"properties": {
"name": {
"type": "string",
"description": "Workflow name"
"description": "Workflow name",
},
"version": {
"type": "string",
"pattern": "^\\d+\\.\\d+\\.\\d+$",
"description": "Semantic version (x.y.z)"
"description": "Semantic version (x.y.z)",
},
"vertical": {
"type": "string",
"description": "Vertical worker type (rust, android, web, etc.)"
"description": "Vertical worker type (rust, android, web, etc.)",
},
"description": {
"type": "string",
"description": "Workflow description"
"description": "Workflow description",
},
"author": {
"type": "string",
"description": "Workflow author"
"description": "Workflow author",
},
"category": {
"type": "string",
"enum": ["comprehensive", "specialized", "fuzzing", "focused"],
"description": "Workflow category"
"description": "Workflow category",
},
"tags": {
"type": "array",
"items": {"type": "string"},
"description": "Workflow tags for categorization"
"description": "Workflow tags for categorization",
},
"requirements": {
"type": "object",
@@ -214,7 +214,7 @@ class WorkflowDiscovery:
"tools": {
"type": "array",
"items": {"type": "string"},
"description": "Required security tools"
"description": "Required security tools",
},
"resources": {
"type": "object",
@@ -223,35 +223,35 @@ class WorkflowDiscovery:
"memory": {
"type": "string",
"pattern": "^\\d+[GMK]i$",
"description": "Memory limit (e.g., 1Gi, 512Mi)"
"description": "Memory limit (e.g., 1Gi, 512Mi)",
},
"cpu": {
"type": "string",
"pattern": "^\\d+m?$",
"description": "CPU limit (e.g., 1000m, 2)"
"description": "CPU limit (e.g., 1000m, 2)",
},
"timeout": {
"type": "integer",
"minimum": 60,
"maximum": 7200,
"description": "Workflow timeout in seconds"
}
}
}
}
"description": "Workflow timeout in seconds",
},
},
},
},
},
"parameters": {
"type": "object",
"description": "Workflow parameters schema"
"description": "Workflow parameters schema",
},
"default_parameters": {
"type": "object",
"description": "Default parameter values"
"description": "Default parameter values",
},
"required_modules": {
"type": "array",
"items": {"type": "string"},
"description": "Required module names"
}
}
"description": "Required module names",
},
},
}

View File

@@ -1,5 +1,4 @@
"""
Temporal Manager - Workflow execution and management
"""Temporal Manager - Workflow execution and management.
Handles:
- Workflow discovery from toolbox
@@ -8,25 +7,26 @@ Handles:
- Results retrieval
"""
import asyncio
import logging
import os
from datetime import timedelta
from pathlib import Path
from typing import Dict, Optional, Any
from typing import Any
from uuid import uuid4
from temporalio.client import Client, WorkflowHandle
from temporalio.common import RetryPolicy
from datetime import timedelta
from src.storage import S3CachedStorage
from .discovery import WorkflowDiscovery, WorkflowInfo
from src.storage import S3CachedStorage
logger = logging.getLogger(__name__)
class TemporalManager:
"""
Manages Temporal workflow execution for FuzzForge.
"""Manages Temporal workflow execution for FuzzForge.
This class:
- Discovers available workflows from toolbox
@@ -37,41 +37,42 @@ class TemporalManager:
def __init__(
self,
workflows_dir: Optional[Path] = None,
temporal_address: Optional[str] = None,
workflows_dir: Path | None = None,
temporal_address: str | None = None,
temporal_namespace: str = "default",
storage: Optional[S3CachedStorage] = None
):
"""
Initialize Temporal manager.
storage: S3CachedStorage | None = None,
) -> None:
"""Initialize Temporal manager.
Args:
workflows_dir: Path to workflows directory (default: toolbox/workflows)
temporal_address: Temporal server address (default: from env or localhost:7233)
temporal_namespace: Temporal namespace
storage: Storage backend for file uploads (default: S3CachedStorage)
"""
if workflows_dir is None:
workflows_dir = Path("toolbox/workflows")
self.temporal_address = temporal_address or os.getenv(
'TEMPORAL_ADDRESS',
'localhost:7233'
"TEMPORAL_ADDRESS",
"localhost:7233",
)
self.temporal_namespace = temporal_namespace
self.discovery = WorkflowDiscovery(workflows_dir)
self.workflows: Dict[str, WorkflowInfo] = {}
self.client: Optional[Client] = None
self.workflows: dict[str, WorkflowInfo] = {}
self.client: Client | None = None
# Initialize storage backend
self.storage = storage or S3CachedStorage()
logger.info(
f"TemporalManager initialized: {self.temporal_address} "
f"(namespace: {self.temporal_namespace})"
"TemporalManager initialized: %s (namespace: %s)",
self.temporal_address,
self.temporal_namespace,
)
async def initialize(self):
async def initialize(self) -> None:
"""Initialize the manager by discovering workflows and connecting to Temporal."""
try:
# Discover workflows
@@ -81,45 +82,46 @@ class TemporalManager:
logger.warning("No workflows discovered")
else:
logger.info(
f"Discovered {len(self.workflows)} workflows: "
f"{list(self.workflows.keys())}"
"Discovered %s workflows: %s",
len(self.workflows),
list(self.workflows.keys()),
)
# Connect to Temporal
self.client = await Client.connect(
self.temporal_address,
namespace=self.temporal_namespace
namespace=self.temporal_namespace,
)
logger.info(f"✓ Connected to Temporal: {self.temporal_address}")
logger.info("✓ Connected to Temporal: %s", self.temporal_address)
except Exception as e:
logger.error(f"Failed to initialize Temporal manager: {e}", exc_info=True)
except Exception:
logger.exception("Failed to initialize Temporal manager")
raise
async def close(self):
async def close(self) -> None:
"""Close Temporal client connection."""
if self.client:
# Temporal client doesn't need explicit close in Python SDK
pass
async def get_workflows(self) -> Dict[str, WorkflowInfo]:
"""
Get all discovered workflows.
async def get_workflows(self) -> dict[str, WorkflowInfo]:
"""Get all discovered workflows.
Returns:
Dictionary mapping workflow names to their info
"""
return self.workflows
async def get_workflow(self, name: str) -> Optional[WorkflowInfo]:
"""
Get workflow info by name.
async def get_workflow(self, name: str) -> WorkflowInfo | None:
"""Get workflow info by name.
Args:
name: Workflow name
Returns:
WorkflowInfo or None if not found
"""
return self.workflows.get(name)
@@ -127,10 +129,9 @@ class TemporalManager:
self,
file_path: Path,
user_id: str,
metadata: Optional[Dict[str, Any]] = None
metadata: dict[str, Any] | None = None,
) -> str:
"""
Upload target file to storage.
"""Upload target file to storage.
Args:
file_path: Local path to file
@@ -139,20 +140,20 @@ class TemporalManager:
Returns:
Target ID for use in workflow execution
"""
target_id = await self.storage.upload_target(file_path, user_id, metadata)
logger.info(f"Uploaded target: {target_id}")
logger.info("Uploaded target: %s", target_id)
return target_id
async def run_workflow(
self,
workflow_name: str,
target_id: str,
workflow_params: Optional[Dict[str, Any]] = None,
workflow_id: Optional[str] = None
workflow_params: dict[str, Any] | None = None,
workflow_id: str | None = None,
) -> WorkflowHandle:
"""
Execute a workflow.
"""Execute a workflow.
Args:
workflow_name: Name of workflow to execute
@@ -165,14 +166,17 @@ class TemporalManager:
Raises:
ValueError: If workflow not found or client not initialized
"""
if not self.client:
raise ValueError("Temporal client not initialized. Call initialize() first.")
msg = "Temporal client not initialized. Call initialize() first."
raise ValueError(msg)
# Get workflow info
workflow_info = self.workflows.get(workflow_name)
if not workflow_info:
raise ValueError(f"Workflow not found: {workflow_name}")
msg = f"Workflow not found: {workflow_name}"
raise ValueError(msg)
# Generate workflow ID if not provided
if not workflow_id:
@@ -187,23 +191,42 @@ class TemporalManager:
# Add parameters in order based on metadata schema
# This ensures parameters match the workflow signature order
if workflow_params and 'parameters' in workflow_info.metadata:
param_schema = workflow_info.metadata['parameters'].get('properties', {})
# Apply defaults from metadata.yaml if parameter not provided
if "parameters" in workflow_info.metadata:
param_schema = workflow_info.metadata["parameters"].get("properties", {})
logger.debug("Found %s parameters in schema", len(param_schema))
# Iterate parameters in schema order and add values
for param_name in param_schema.keys():
param_value = workflow_params.get(param_name)
for param_name in param_schema:
param_spec = param_schema[param_name]
# Use provided param, or fall back to default from metadata
if workflow_params and param_name in workflow_params:
param_value = workflow_params[param_name]
logger.debug("Using provided value for %s: %s", param_name, param_value)
elif "default" in param_spec:
param_value = param_spec["default"]
logger.debug("Using default for %s: %s", param_name, param_value)
else:
param_value = None
                    logger.debug("No value or default for %s, using None", param_name)
workflow_args.append(param_value)
else:
logger.debug("No 'parameters' section found in workflow metadata")
# Determine task queue from workflow vertical
vertical = workflow_info.metadata.get("vertical", "default")
task_queue = f"{vertical}-queue"
logger.info(
f"Starting workflow: {workflow_name} "
f"(id={workflow_id}, queue={task_queue}, target={target_id})"
"Starting workflow: %s (id=%s, queue=%s, target=%s)",
workflow_name,
workflow_id,
task_queue,
target_id,
)
logger.info(f"DEBUG: workflow_args = {workflow_args}")
logger.info(f"DEBUG: workflow_params received = {workflow_params}")
logger.info("DEBUG: workflow_args = %s", workflow_args)
        logger.info("DEBUG: workflow_params received = %s", workflow_params)
try:
# Start workflow execution with positional arguments
@@ -215,20 +238,20 @@ class TemporalManager:
retry_policy=RetryPolicy(
initial_interval=timedelta(seconds=1),
maximum_interval=timedelta(minutes=1),
maximum_attempts=3
)
maximum_attempts=3,
),
)
logger.info(f"✓ Workflow started: {workflow_id}")
logger.info("✓ Workflow started: %s", workflow_id)
except Exception:
logger.exception("Failed to start workflow %s", workflow_name)
raise
else:
return handle
except Exception as e:
logger.error(f"Failed to start workflow {workflow_name}: {e}", exc_info=True)
raise
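The logging changes in this hunk swap f-strings for %-style lazy formatting (interpolation happens inside the logging machinery, only when the record is emitted) and replace `logger.error(..., exc_info=True)` with `logger.exception(...)`. A minimal illustration, with a list-backed handler so the effect is observable:

```python
import logging

records: list[logging.LogRecord] = []

class ListHandler(logging.Handler):
    """Collects emitted records so they can be inspected afterwards."""
    def emit(self, record: logging.LogRecord) -> None:
        records.append(record)

logger = logging.getLogger("temporal-demo")
logger.setLevel(logging.INFO)
logger.addHandler(ListHandler())

workflow_id = "wf-123"
# Lazy %s formatting: the argument is interpolated only if the record is emitted.
logger.info("Workflow started: %s", workflow_id)

try:
    raise RuntimeError("boom")
except RuntimeError:
    # logger.exception logs at ERROR and attaches the active traceback,
    # which is exactly what logger.error(..., exc_info=True) did before.
    logger.exception("Failed to start workflow %s", workflow_id)
```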
async def get_workflow_status(self, workflow_id: str) -> Dict[str, Any]:
"""
Get workflow execution status.
async def get_workflow_status(self, workflow_id: str) -> dict[str, Any]:
"""Get workflow execution status.
Args:
workflow_id: Workflow execution ID
@@ -238,9 +261,11 @@ class TemporalManager:
Raises:
ValueError: If client not initialized or workflow not found
"""
if not self.client:
raise ValueError("Temporal client not initialized")
msg = "Temporal client not initialized"
raise ValueError(msg)
try:
# Get workflow handle
@@ -258,20 +283,20 @@ class TemporalManager:
"task_queue": description.task_queue,
}
logger.info(f"Workflow {workflow_id} status: {status['status']}")
return status
logger.info("Workflow %s status: %s", workflow_id, status["status"])
except Exception as e:
logger.error(f"Failed to get workflow status: {e}", exc_info=True)
except Exception:
logger.exception("Failed to get workflow status")
raise
else:
return status
async def get_workflow_result(
self,
workflow_id: str,
timeout: Optional[timedelta] = None
timeout: timedelta | None = None,
) -> Any:
"""
Get workflow execution result (blocking).
"""Get workflow execution result (blocking).
Args:
workflow_id: Workflow execution ID
@@ -283,60 +308,62 @@ class TemporalManager:
Raises:
ValueError: If client not initialized
TimeoutError: If timeout exceeded
"""
if not self.client:
raise ValueError("Temporal client not initialized")
msg = "Temporal client not initialized"
raise ValueError(msg)
try:
handle = self.client.get_workflow_handle(workflow_id)
logger.info(f"Waiting for workflow result: {workflow_id}")
logger.info("Waiting for workflow result: %s", workflow_id)
# Wait for workflow to complete and get result
if timeout:
# Use asyncio timeout if provided
import asyncio
result = await asyncio.wait_for(handle.result(), timeout=timeout.total_seconds())
else:
result = await handle.result()
logger.info(f"✓ Workflow {workflow_id} completed")
logger.info("✓ Workflow %s completed", workflow_id)
except Exception:
logger.exception("Failed to get workflow result")
raise
else:
return result
except Exception as e:
logger.error(f"Failed to get workflow result: {e}", exc_info=True)
raise
async def cancel_workflow(self, workflow_id: str) -> None:
"""
Cancel a running workflow.
"""Cancel a running workflow.
Args:
workflow_id: Workflow execution ID
Raises:
ValueError: If client not initialized
"""
if not self.client:
raise ValueError("Temporal client not initialized")
msg = "Temporal client not initialized"
raise ValueError(msg)
try:
handle = self.client.get_workflow_handle(workflow_id)
await handle.cancel()
logger.info(f"✓ Workflow cancelled: {workflow_id}")
logger.info("✓ Workflow cancelled: %s", workflow_id)
except Exception as e:
logger.error(f"Failed to cancel workflow: {e}", exc_info=True)
except Exception:
logger.exception("Failed to cancel workflow: %s", workflow_id)
raise
async def list_workflows(
self,
filter_query: Optional[str] = None,
limit: int = 100
) -> list[Dict[str, Any]]:
"""
List workflow executions.
filter_query: str | None = None,
limit: int = 100,
) -> list[dict[str, Any]]:
"""List workflow executions.
Args:
filter_query: Optional Temporal list filter query
@@ -347,30 +374,36 @@ class TemporalManager:
Raises:
ValueError: If client not initialized
"""
if not self.client:
raise ValueError("Temporal client not initialized")
msg = "Temporal client not initialized"
raise ValueError(msg)
try:
workflows = []
# Use Temporal's list API
async for workflow in self.client.list_workflows(filter_query):
workflows.append({
"workflow_id": workflow.id,
"workflow_type": workflow.workflow_type,
"status": workflow.status.name,
"start_time": workflow.start_time.isoformat() if workflow.start_time else None,
"close_time": workflow.close_time.isoformat() if workflow.close_time else None,
"task_queue": workflow.task_queue,
})
workflows.append(
{
"workflow_id": workflow.id,
"workflow_type": workflow.workflow_type,
"status": workflow.status.name,
"start_time": workflow.start_time.isoformat() if workflow.start_time else None,
"close_time": workflow.close_time.isoformat() if workflow.close_time else None,
"task_queue": workflow.task_queue,
},
)
if len(workflows) >= limit:
break
logger.info(f"Listed {len(workflows)} workflows")
logger.info("Listed %s workflows", len(workflows))
return workflows
except Exception as e:
logger.error(f"Failed to list workflows: {e}", exc_info=True)
except Exception:
logger.exception("Failed to list workflows")
raise
else:
return workflows
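Several hunks in this file move the `return` out of the `try` body into an `else` clause (ruff's TRY300): `else` runs only when no exception was raised, keeping the happy path visually separate from error handling. A minimal sketch:

```python
import logging

logger = logging.getLogger(__name__)

def parse_port(raw: str) -> int:
    try:
        port = int(raw)
    except ValueError:
        # Only failures inside the try body land here; log-and-reraise
        # mirrors the logger.exception(...) / raise pattern above.
        logger.exception("Failed to parse port %s", raw)
        raise
    else:
        return port
```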

View File

@@ -8,11 +8,19 @@
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
"""Fixtures used across tests."""
import sys
from collections.abc import Callable
from pathlib import Path
from typing import Dict, Any
from types import CoroutineType
from typing import Any
import pytest
from modules.analyzer.security_analyzer import SecurityAnalyzer
from modules.fuzzer.atheris_fuzzer import AtherisFuzzer
from modules.fuzzer.cargo_fuzzer import CargoFuzzer
from modules.scanner.file_scanner import FileScanner
# Ensure project root is on sys.path so `src` is importable
ROOT = Path(__file__).resolve().parents[1]
@@ -29,17 +37,18 @@ if str(TOOLBOX) not in sys.path:
# Workspace Fixtures
# ============================================================================
@pytest.fixture
def temp_workspace(tmp_path):
"""Create a temporary workspace directory for testing"""
def temp_workspace(tmp_path: Path) -> Path:
"""Create a temporary workspace directory for testing."""
workspace = tmp_path / "workspace"
workspace.mkdir()
return workspace
@pytest.fixture
def python_test_workspace(temp_workspace):
"""Create a Python test workspace with sample files"""
def python_test_workspace(temp_workspace: Path) -> Path:
"""Create a Python test workspace with sample files."""
# Create a simple Python project structure
(temp_workspace / "main.py").write_text("""
def process_data(data):
@@ -62,8 +71,8 @@ AWS_SECRET = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
@pytest.fixture
def rust_test_workspace(temp_workspace):
"""Create a Rust test workspace with fuzz targets"""
def rust_test_workspace(temp_workspace: Path) -> Path:
"""Create a Rust test workspace with fuzz targets."""
# Create Cargo.toml
(temp_workspace / "Cargo.toml").write_text("""[package]
name = "test_project"
@@ -131,44 +140,45 @@ fuzz_target!(|data: &[u8]| {
# Module Configuration Fixtures
# ============================================================================
@pytest.fixture
def atheris_config():
"""Default Atheris fuzzer configuration"""
def atheris_config() -> dict[str, Any]:
"""Return default Atheris fuzzer configuration."""
return {
"target_file": "auto-discover",
"max_iterations": 1000,
"timeout_seconds": 10,
"corpus_dir": None
"corpus_dir": None,
}
@pytest.fixture
def cargo_fuzz_config():
"""Default cargo-fuzz configuration"""
def cargo_fuzz_config() -> dict[str, Any]:
"""Return default cargo-fuzz configuration."""
return {
"target_name": None,
"max_iterations": 1000,
"timeout_seconds": 10,
"sanitizer": "address"
"sanitizer": "address",
}
@pytest.fixture
def gitleaks_config():
"""Default Gitleaks configuration"""
def gitleaks_config() -> dict[str, Any]:
"""Return default Gitleaks configuration."""
return {
"config_path": None,
"scan_uncommitted": True
"scan_uncommitted": True,
}
@pytest.fixture
def file_scanner_config():
"""Default file scanner configuration"""
def file_scanner_config() -> dict[str, Any]:
"""Return default file scanner configuration."""
return {
"scan_patterns": ["*.py", "*.rs", "*.js"],
"exclude_patterns": ["*.test.*", "*.spec.*"],
"max_file_size": 1048576 # 1MB
"max_file_size": 1048576, # 1MB
}
@@ -176,55 +186,67 @@ def file_scanner_config():
# Module Instance Fixtures
# ============================================================================
@pytest.fixture
def atheris_fuzzer():
"""Create an AtherisFuzzer instance"""
from modules.fuzzer.atheris_fuzzer import AtherisFuzzer
def atheris_fuzzer() -> AtherisFuzzer:
"""Create an AtherisFuzzer instance."""
return AtherisFuzzer()
@pytest.fixture
def cargo_fuzzer():
"""Create a CargoFuzzer instance"""
from modules.fuzzer.cargo_fuzzer import CargoFuzzer
def cargo_fuzzer() -> CargoFuzzer:
"""Create a CargoFuzzer instance."""
return CargoFuzzer()
@pytest.fixture
def file_scanner():
"""Create a FileScanner instance"""
from modules.scanner.file_scanner import FileScanner
def file_scanner() -> FileScanner:
"""Create a FileScanner instance."""
return FileScanner()
@pytest.fixture
def security_analyzer() -> SecurityAnalyzer:
"""Create SecurityAnalyzer instance."""
return SecurityAnalyzer()
# ============================================================================
# Mock Fixtures
# ============================================================================
@pytest.fixture
def mock_stats_callback():
"""Mock stats callback for fuzzing"""
def mock_stats_callback() -> Callable[[dict[str, Any]], CoroutineType]:
"""Mock stats callback for fuzzing."""
stats_received = []
async def callback(stats: Dict[str, Any]):
async def callback(stats: dict[str, Any]) -> None:
stats_received.append(stats)
callback.stats_received = stats_received
return callback
class MockActivityInfo:
"""Mock activity info."""
def __init__(self) -> None:
"""Initialize an instance of the class."""
self.workflow_id = "test-workflow-123"
self.activity_id = "test-activity-1"
self.attempt = 1
class MockContext:
"""Mock context."""
def __init__(self) -> None:
"""Initialize an instance of the class."""
self.info = MockActivityInfo()
@pytest.fixture
def mock_temporal_context():
"""Mock Temporal activity context"""
class MockActivityInfo:
def __init__(self):
self.workflow_id = "test-workflow-123"
self.activity_id = "test-activity-1"
self.attempt = 1
class MockContext:
def __init__(self):
self.info = MockActivityInfo()
def mock_temporal_context() -> MockContext:
"""Mock Temporal activity context."""
return MockContext()
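The `mock_stats_callback` fixture above records every stats payload by attaching the list to the callback itself, so tests can later assert on `callback.stats_received`. The same pattern without pytest:

```python
import asyncio
from typing import Any

def make_stats_callback():
    """Build an async callback that remembers everything it was called with."""
    received: list[dict[str, Any]] = []

    async def callback(stats: dict[str, Any]) -> None:
        received.append(stats)

    # Function attributes are a lightweight way to expose captured state.
    callback.stats_received = received
    return callback

cb = make_stats_callback()
asyncio.run(cb({"total_execs": 100, "crashes": 0}))
```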

View File

View File

@@ -0,0 +1 @@
"""Unit tests."""

View File

@@ -0,0 +1 @@
"""Unit tests for modules."""

View File

@@ -1,17 +1,26 @@
"""
Unit tests for AtherisFuzzer module
"""
"""Unit tests for AtherisFuzzer module."""
from __future__ import annotations
from typing import TYPE_CHECKING
from unittest.mock import AsyncMock, patch
import pytest
from unittest.mock import AsyncMock, patch
if TYPE_CHECKING:
from collections.abc import Callable
from pathlib import Path
from typing import Any
from modules.fuzzer.atheris_fuzzer import AtherisFuzzer
@pytest.mark.asyncio
class TestAtherisFuzzerMetadata:
"""Test AtherisFuzzer metadata"""
"""Test AtherisFuzzer metadata."""
async def test_metadata_structure(self, atheris_fuzzer):
"""Test that module metadata is properly defined"""
async def test_metadata_structure(self, atheris_fuzzer: AtherisFuzzer) -> None:
"""Test that module metadata is properly defined."""
metadata = atheris_fuzzer.get_metadata()
assert metadata.name == "atheris_fuzzer"
@@ -22,28 +31,28 @@ class TestAtherisFuzzerMetadata:
@pytest.mark.asyncio
class TestAtherisFuzzerConfigValidation:
"""Test configuration validation"""
"""Test configuration validation."""
async def test_valid_config(self, atheris_fuzzer, atheris_config):
"""Test validation of valid configuration"""
async def test_valid_config(self, atheris_fuzzer: AtherisFuzzer, atheris_config: dict[str, Any]) -> None:
"""Test validation of valid configuration."""
assert atheris_fuzzer.validate_config(atheris_config) is True
async def test_invalid_max_iterations(self, atheris_fuzzer):
"""Test validation fails with invalid max_iterations"""
async def test_invalid_max_iterations(self, atheris_fuzzer: AtherisFuzzer) -> None:
"""Test validation fails with invalid max_iterations."""
config = {
"target_file": "fuzz_target.py",
"max_iterations": -1,
"timeout_seconds": 10
"timeout_seconds": 10,
}
with pytest.raises(ValueError, match="max_iterations"):
atheris_fuzzer.validate_config(config)
async def test_invalid_timeout(self, atheris_fuzzer):
"""Test validation fails with invalid timeout"""
async def test_invalid_timeout(self, atheris_fuzzer: AtherisFuzzer) -> None:
"""Test validation fails with invalid timeout."""
config = {
"target_file": "fuzz_target.py",
"max_iterations": 1000,
"timeout_seconds": 0
"timeout_seconds": 0,
}
with pytest.raises(ValueError, match="timeout_seconds"):
atheris_fuzzer.validate_config(config)
@@ -51,10 +60,10 @@ class TestAtherisFuzzerConfigValidation:
@pytest.mark.asyncio
class TestAtherisFuzzerDiscovery:
"""Test fuzz target discovery"""
"""Test fuzz target discovery."""
async def test_auto_discover(self, atheris_fuzzer, python_test_workspace):
"""Test auto-discovery of Python fuzz targets"""
async def test_auto_discover(self, atheris_fuzzer: AtherisFuzzer, python_test_workspace: Path) -> None:
"""Test auto-discovery of Python fuzz targets."""
# Create a fuzz target file
(python_test_workspace / "fuzz_target.py").write_text("""
import atheris
@@ -69,7 +78,7 @@ if __name__ == "__main__":
""")
# Pass None for auto-discovery
target = atheris_fuzzer._discover_target(python_test_workspace, None)
target = atheris_fuzzer._discover_target(python_test_workspace, None) # noqa: SLF001
assert target is not None
assert "fuzz_target.py" in str(target)
@@ -77,10 +86,14 @@ if __name__ == "__main__":
@pytest.mark.asyncio
class TestAtherisFuzzerExecution:
"""Test fuzzer execution logic"""
"""Test fuzzer execution logic."""
async def test_execution_creates_result(self, atheris_fuzzer, python_test_workspace, atheris_config):
"""Test that execution returns a ModuleResult"""
async def test_execution_creates_result(
self,
atheris_fuzzer: AtherisFuzzer,
python_test_workspace: Path,
) -> None:
"""Test that execution returns a ModuleResult."""
# Create a simple fuzz target
(python_test_workspace / "fuzz_target.py").write_text("""
import atheris
@@ -99,11 +112,16 @@ if __name__ == "__main__":
test_config = {
"target_file": "fuzz_target.py",
"max_iterations": 10,
"timeout_seconds": 1
"timeout_seconds": 1,
}
# Mock the fuzzing subprocess to avoid actual execution
with patch.object(atheris_fuzzer, '_run_fuzzing', new_callable=AsyncMock, return_value=([], {"total_executions": 10})):
with patch.object(
atheris_fuzzer,
"_run_fuzzing",
new_callable=AsyncMock,
return_value=([], {"total_executions": 10}),
):
result = await atheris_fuzzer.execute(test_config, python_test_workspace)
assert result.module == "atheris_fuzzer"
@@ -113,10 +131,16 @@ if __name__ == "__main__":
@pytest.mark.asyncio
class TestAtherisFuzzerStatsCallback:
"""Test stats callback functionality"""
"""Test stats callback functionality."""
async def test_stats_callback_invoked(self, atheris_fuzzer, python_test_workspace, atheris_config, mock_stats_callback):
"""Test that stats callback is invoked during fuzzing"""
async def test_stats_callback_invoked(
self,
atheris_fuzzer: AtherisFuzzer,
python_test_workspace: Path,
atheris_config: dict[str, Any],
mock_stats_callback: Callable | None,
) -> None:
"""Test that stats callback is invoked during fuzzing."""
(python_test_workspace / "fuzz_target.py").write_text("""
import atheris
import sys
@@ -130,35 +154,45 @@ if __name__ == "__main__":
""")
# Mock fuzzing to simulate stats
async def mock_run_fuzzing(test_one_input, target_path, workspace, max_iterations, timeout_seconds, stats_callback):
async def mock_run_fuzzing(
test_one_input: Callable, # noqa: ARG001
target_path: Path, # noqa: ARG001
workspace: Path, # noqa: ARG001
max_iterations: int, # noqa: ARG001
timeout_seconds: int, # noqa: ARG001
stats_callback: Callable | None,
) -> None:
if stats_callback:
await stats_callback({
"total_execs": 100,
"execs_per_sec": 10.0,
"crashes": 0,
"coverage": 5,
"corpus_size": 2,
"elapsed_time": 10
})
return
await stats_callback(
{
"total_execs": 100,
"execs_per_sec": 10.0,
"crashes": 0,
"coverage": 5,
"corpus_size": 2,
"elapsed_time": 10,
},
)
with patch.object(atheris_fuzzer, '_run_fuzzing', side_effect=mock_run_fuzzing):
with patch.object(atheris_fuzzer, '_load_target_module', return_value=lambda x: None):
# Put stats_callback in config dict, not as kwarg
atheris_config["target_file"] = "fuzz_target.py"
atheris_config["stats_callback"] = mock_stats_callback
await atheris_fuzzer.execute(atheris_config, python_test_workspace)
with (
patch.object(atheris_fuzzer, "_run_fuzzing", side_effect=mock_run_fuzzing),
patch.object(atheris_fuzzer, "_load_target_module", return_value=lambda _x: None),
):
# Put stats_callback in config dict, not as kwarg
atheris_config["target_file"] = "fuzz_target.py"
atheris_config["stats_callback"] = mock_stats_callback
await atheris_fuzzer.execute(atheris_config, python_test_workspace)
# Verify callback was invoked
assert len(mock_stats_callback.stats_received) > 0
# Verify callback was invoked
assert len(mock_stats_callback.stats_received) > 0
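Collapsing the nested `patch.object` blocks into one parenthesized `with` uses the multi-context syntax available since Python 3.10; all patches are entered together and undone together on exit. A small sketch with a hypothetical class standing in for the fuzzer:

```python
from unittest.mock import patch

class Fuzzer:
    """Hypothetical stand-in for AtherisFuzzer."""
    def build(self) -> bool:
        return False

    def run(self) -> str:
        return "real"

fuzzer = Fuzzer()
with (
    patch.object(fuzzer, "build", return_value=True),
    patch.object(fuzzer, "run", return_value="mocked"),
):
    built = fuzzer.build()    # patched: returns True
    output = fuzzer.run()     # patched: returns "mocked"
```

On exit both attributes are restored, so `fuzzer.run()` returns `"real"` again.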
@pytest.mark.asyncio
class TestAtherisFuzzerFindingGeneration:
"""Test finding generation from crashes"""
"""Test finding generation from crashes."""
async def test_create_crash_finding(self, atheris_fuzzer):
"""Test crash finding creation"""
async def test_create_crash_finding(self, atheris_fuzzer: AtherisFuzzer) -> None:
"""Test crash finding creation."""
finding = atheris_fuzzer.create_finding(
title="Crash: Exception in TestOneInput",
description="IndexError: list index out of range",
@@ -167,8 +201,8 @@ class TestAtherisFuzzerFindingGeneration:
file_path="fuzz_target.py",
metadata={
"crash_type": "IndexError",
"stack_trace": "Traceback..."
}
"stack_trace": "Traceback...",
},
)
assert finding.title == "Crash: Exception in TestOneInput"

View File

@@ -1,17 +1,26 @@
"""
Unit tests for CargoFuzzer module
"""
"""Unit tests for CargoFuzzer module."""
from __future__ import annotations
from typing import TYPE_CHECKING
from unittest.mock import AsyncMock, patch
import pytest
from unittest.mock import AsyncMock, patch
if TYPE_CHECKING:
from collections.abc import Callable
from pathlib import Path
from typing import Any
from modules.fuzzer.cargo_fuzzer import CargoFuzzer
@pytest.mark.asyncio
class TestCargoFuzzerMetadata:
"""Test CargoFuzzer metadata"""
"""Test CargoFuzzer metadata."""
async def test_metadata_structure(self, cargo_fuzzer):
"""Test that module metadata is properly defined"""
async def test_metadata_structure(self, cargo_fuzzer: CargoFuzzer) -> None:
"""Test that module metadata is properly defined."""
metadata = cargo_fuzzer.get_metadata()
assert metadata.name == "cargo_fuzz"
@@ -23,38 +32,38 @@ class TestCargoFuzzerMetadata:
@pytest.mark.asyncio
class TestCargoFuzzerConfigValidation:
"""Test configuration validation"""
"""Test configuration validation."""
async def test_valid_config(self, cargo_fuzzer, cargo_fuzz_config):
"""Test validation of valid configuration"""
async def test_valid_config(self, cargo_fuzzer: CargoFuzzer, cargo_fuzz_config: dict[str, Any]) -> None:
"""Test validation of valid configuration."""
assert cargo_fuzzer.validate_config(cargo_fuzz_config) is True
async def test_invalid_max_iterations(self, cargo_fuzzer):
"""Test validation fails with invalid max_iterations"""
async def test_invalid_max_iterations(self, cargo_fuzzer: CargoFuzzer) -> None:
"""Test validation fails with invalid max_iterations."""
config = {
"max_iterations": -1,
"timeout_seconds": 10,
"sanitizer": "address"
"sanitizer": "address",
}
with pytest.raises(ValueError, match="max_iterations"):
cargo_fuzzer.validate_config(config)
async def test_invalid_timeout(self, cargo_fuzzer):
"""Test validation fails with invalid timeout"""
async def test_invalid_timeout(self, cargo_fuzzer: CargoFuzzer) -> None:
"""Test validation fails with invalid timeout."""
config = {
"max_iterations": 1000,
"timeout_seconds": 0,
"sanitizer": "address"
"sanitizer": "address",
}
with pytest.raises(ValueError, match="timeout_seconds"):
cargo_fuzzer.validate_config(config)
async def test_invalid_sanitizer(self, cargo_fuzzer):
"""Test validation fails with invalid sanitizer"""
async def test_invalid_sanitizer(self, cargo_fuzzer: CargoFuzzer) -> None:
"""Test validation fails with invalid sanitizer."""
config = {
"max_iterations": 1000,
"timeout_seconds": 10,
"sanitizer": "invalid_sanitizer"
"sanitizer": "invalid_sanitizer",
}
with pytest.raises(ValueError, match="sanitizer"):
cargo_fuzzer.validate_config(config)
@@ -62,20 +71,20 @@ class TestCargoFuzzerConfigValidation:
@pytest.mark.asyncio
class TestCargoFuzzerWorkspaceValidation:
"""Test workspace validation"""
"""Test workspace validation."""
async def test_valid_workspace(self, cargo_fuzzer, rust_test_workspace):
"""Test validation of valid workspace"""
async def test_valid_workspace(self, cargo_fuzzer: CargoFuzzer, rust_test_workspace: Path) -> None:
"""Test validation of valid workspace."""
assert cargo_fuzzer.validate_workspace(rust_test_workspace) is True
async def test_nonexistent_workspace(self, cargo_fuzzer, tmp_path):
"""Test validation fails with nonexistent workspace"""
async def test_nonexistent_workspace(self, cargo_fuzzer: CargoFuzzer, tmp_path: Path) -> None:
"""Test validation fails with nonexistent workspace."""
nonexistent = tmp_path / "does_not_exist"
with pytest.raises(ValueError, match="does not exist"):
cargo_fuzzer.validate_workspace(nonexistent)
async def test_workspace_is_file(self, cargo_fuzzer, tmp_path):
"""Test validation fails when workspace is a file"""
async def test_workspace_is_file(self, cargo_fuzzer: CargoFuzzer, tmp_path: Path) -> None:
"""Test validation fails when workspace is a file."""
file_path = tmp_path / "file.txt"
file_path.write_text("test")
with pytest.raises(ValueError, match="not a directory"):
@@ -84,41 +93,58 @@ class TestCargoFuzzerWorkspaceValidation:
@pytest.mark.asyncio
class TestCargoFuzzerDiscovery:
"""Test fuzz target discovery"""
"""Test fuzz target discovery."""
async def test_discover_targets(self, cargo_fuzzer, rust_test_workspace):
"""Test discovery of fuzz targets"""
targets = await cargo_fuzzer._discover_fuzz_targets(rust_test_workspace)
async def test_discover_targets(self, cargo_fuzzer: CargoFuzzer, rust_test_workspace: Path) -> None:
"""Test discovery of fuzz targets."""
targets = await cargo_fuzzer._discover_fuzz_targets(rust_test_workspace) # noqa: SLF001
assert len(targets) == 1
assert "fuzz_target_1" in targets
async def test_no_fuzz_directory(self, cargo_fuzzer, temp_workspace):
"""Test discovery with no fuzz directory"""
targets = await cargo_fuzzer._discover_fuzz_targets(temp_workspace)
async def test_no_fuzz_directory(self, cargo_fuzzer: CargoFuzzer, temp_workspace: Path) -> None:
"""Test discovery with no fuzz directory."""
targets = await cargo_fuzzer._discover_fuzz_targets(temp_workspace) # noqa: SLF001
assert targets == []
@pytest.mark.asyncio
class TestCargoFuzzerExecution:
"""Test fuzzer execution logic"""
"""Test fuzzer execution logic."""
async def test_execution_creates_result(self, cargo_fuzzer, rust_test_workspace, cargo_fuzz_config):
"""Test that execution returns a ModuleResult"""
async def test_execution_creates_result(
self,
cargo_fuzzer: CargoFuzzer,
rust_test_workspace: Path,
cargo_fuzz_config: dict[str, Any],
) -> None:
"""Test that execution returns a ModuleResult."""
# Mock the build and run methods to avoid actual fuzzing
with patch.object(cargo_fuzzer, '_build_fuzz_target', new_callable=AsyncMock, return_value=True):
with patch.object(cargo_fuzzer, '_run_fuzzing', new_callable=AsyncMock, return_value=([], {"total_executions": 0, "crashes_found": 0})):
with patch.object(cargo_fuzzer, '_parse_crash_artifacts', new_callable=AsyncMock, return_value=[]):
result = await cargo_fuzzer.execute(cargo_fuzz_config, rust_test_workspace)
with (
patch.object(cargo_fuzzer, "_build_fuzz_target", new_callable=AsyncMock, return_value=True),
patch.object(
cargo_fuzzer,
"_run_fuzzing",
new_callable=AsyncMock,
return_value=([], {"total_executions": 0, "crashes_found": 0}),
),
patch.object(cargo_fuzzer, "_parse_crash_artifacts", new_callable=AsyncMock, return_value=[]),
):
result = await cargo_fuzzer.execute(cargo_fuzz_config, rust_test_workspace)
assert result.module == "cargo_fuzz"
assert result.status == "success"
assert isinstance(result.execution_time, float)
assert result.execution_time >= 0
assert result.module == "cargo_fuzz"
assert result.status == "success"
assert isinstance(result.execution_time, float)
assert result.execution_time >= 0
async def test_execution_with_no_targets(self, cargo_fuzzer, temp_workspace, cargo_fuzz_config):
"""Test execution fails gracefully with no fuzz targets"""
async def test_execution_with_no_targets(
self,
cargo_fuzzer: CargoFuzzer,
temp_workspace: Path,
cargo_fuzz_config: dict[str, Any],
) -> None:
"""Test execution fails gracefully with no fuzz targets."""
result = await cargo_fuzzer.execute(cargo_fuzz_config, temp_workspace)
assert result.status == "failed"
@@ -127,47 +153,67 @@ class TestCargoFuzzerExecution:
@pytest.mark.asyncio
class TestCargoFuzzerStatsCallback:
"""Test stats callback functionality"""
"""Test stats callback functionality."""
async def test_stats_callback_invoked(
self,
cargo_fuzzer: CargoFuzzer,
rust_test_workspace: Path,
cargo_fuzz_config: dict[str, Any],
mock_stats_callback: Callable | None,
) -> None:
"""Test that stats callback is invoked during fuzzing."""
async def test_stats_callback_invoked(self, cargo_fuzzer, rust_test_workspace, cargo_fuzz_config, mock_stats_callback):
"""Test that stats callback is invoked during fuzzing"""
# Mock build/run to simulate stats generation
async def mock_run_fuzzing(workspace, target, config, callback):
async def mock_run_fuzzing(
_workspace: Path,
_target: str,
_config: dict[str, Any],
callback: Callable | None,
) -> tuple[list, dict[str, int]]:
# Simulate stats callback
if callback:
await callback({
"total_execs": 1000,
"execs_per_sec": 100.0,
"crashes": 0,
"coverage": 10,
"corpus_size": 5,
"elapsed_time": 10
})
await callback(
{
"total_execs": 1000,
"execs_per_sec": 100.0,
"crashes": 0,
"coverage": 10,
"corpus_size": 5,
"elapsed_time": 10,
},
)
return [], {"total_executions": 1000}
with patch.object(cargo_fuzzer, '_build_fuzz_target', new_callable=AsyncMock, return_value=True):
with patch.object(cargo_fuzzer, '_run_fuzzing', side_effect=mock_run_fuzzing):
with patch.object(cargo_fuzzer, '_parse_crash_artifacts', new_callable=AsyncMock, return_value=[]):
await cargo_fuzzer.execute(cargo_fuzz_config, rust_test_workspace, stats_callback=mock_stats_callback)
with (
patch.object(cargo_fuzzer, "_build_fuzz_target", new_callable=AsyncMock, return_value=True),
patch.object(cargo_fuzzer, "_run_fuzzing", side_effect=mock_run_fuzzing),
patch.object(cargo_fuzzer, "_parse_crash_artifacts", new_callable=AsyncMock, return_value=[]),
):
await cargo_fuzzer.execute(
cargo_fuzz_config,
rust_test_workspace,
stats_callback=mock_stats_callback,
)
# Verify callback was invoked
assert len(mock_stats_callback.stats_received) > 0
assert mock_stats_callback.stats_received[0]["total_execs"] == 1000
# Verify callback was invoked
assert len(mock_stats_callback.stats_received) > 0
assert mock_stats_callback.stats_received[0]["total_execs"] == 1000
@pytest.mark.asyncio
class TestCargoFuzzerFindingGeneration:
"""Test finding generation from crashes"""
"""Test finding generation from crashes."""
async def test_create_finding_from_crash(self, cargo_fuzzer):
"""Test finding creation"""
async def test_create_finding_from_crash(self, cargo_fuzzer: CargoFuzzer) -> None:
"""Test finding creation."""
finding = cargo_fuzzer.create_finding(
title="Crash: Segmentation Fault",
description="Test crash",
severity="critical",
category="crash",
file_path="fuzz/fuzz_targets/fuzz_target_1.rs",
metadata={"crash_type": "SIGSEGV"}
metadata={"crash_type": "SIGSEGV"},
)
assert finding.title == "Crash: Segmentation Fault"

View File

@@ -1,22 +1,25 @@
"""
Unit tests for FileScanner module
"""
"""Unit tests for FileScanner module."""
from __future__ import annotations
import sys
from pathlib import Path
from typing import TYPE_CHECKING
import pytest
if TYPE_CHECKING:
from modules.scanner.file_scanner import FileScanner
sys.path.insert(0, str(Path(__file__).resolve().parents[3] / "toolbox"))
@pytest.mark.asyncio
class TestFileScannerMetadata:
"""Test FileScanner metadata"""
"""Test FileScanner metadata."""
async def test_metadata_structure(self, file_scanner):
"""Test that metadata has correct structure"""
async def test_metadata_structure(self, file_scanner: FileScanner) -> None:
"""Test that metadata has correct structure."""
metadata = file_scanner.get_metadata()
assert metadata.name == "file_scanner"
@@ -29,37 +32,37 @@ class TestFileScannerMetadata:
@pytest.mark.asyncio
class TestFileScannerConfigValidation:
"""Test configuration validation"""
"""Test configuration validation."""
async def test_valid_config(self, file_scanner):
"""Test that valid config passes validation"""
async def test_valid_config(self, file_scanner: FileScanner) -> None:
"""Test that valid config passes validation."""
config = {
"patterns": ["*.py", "*.js"],
"max_file_size": 1048576,
"check_sensitive": True,
"calculate_hashes": False
"calculate_hashes": False,
}
assert file_scanner.validate_config(config) is True
async def test_default_config(self, file_scanner):
"""Test that empty config uses defaults"""
async def test_default_config(self, file_scanner: FileScanner) -> None:
"""Test that empty config uses defaults."""
config = {}
assert file_scanner.validate_config(config) is True
async def test_invalid_patterns_type(self, file_scanner):
"""Test that non-list patterns raises error"""
async def test_invalid_patterns_type(self, file_scanner: FileScanner) -> None:
"""Test that non-list patterns raises error."""
config = {"patterns": "*.py"}
with pytest.raises(ValueError, match="patterns must be a list"):
file_scanner.validate_config(config)
async def test_invalid_max_file_size(self, file_scanner):
"""Test that invalid max_file_size raises error"""
async def test_invalid_max_file_size(self, file_scanner: FileScanner) -> None:
"""Test that invalid max_file_size raises error."""
config = {"max_file_size": -1}
with pytest.raises(ValueError, match="max_file_size must be a positive integer"):
file_scanner.validate_config(config)
async def test_invalid_max_file_size_type(self, file_scanner):
"""Test that non-integer max_file_size raises error"""
async def test_invalid_max_file_size_type(self, file_scanner: FileScanner) -> None:
"""Test that non-integer max_file_size raises error."""
config = {"max_file_size": "large"}
with pytest.raises(ValueError, match="max_file_size must be a positive integer"):
file_scanner.validate_config(config)
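`pytest.raises(..., match=...)` applies `re.search` to the string form of the raised exception, so the messages in `validate_config` are effectively part of the test contract. A sketch of the validation shape these tests assume (not the actual FileScanner implementation):

```python
def validate_config(config: dict) -> bool:
    """Validate scanner config; raise ValueError with a matchable message."""
    patterns = config.get("patterns", ["*"])
    if not isinstance(patterns, list):
        msg = "patterns must be a list"
        raise ValueError(msg)
    size = config.get("max_file_size", 1048576)
    if not isinstance(size, int) or isinstance(size, bool) or size <= 0:
        msg = "max_file_size must be a positive integer"
        raise ValueError(msg)
    return True
```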
@@ -67,14 +70,14 @@ class TestFileScannerConfigValidation:
@pytest.mark.asyncio
class TestFileScannerExecution:
"""Test scanner execution"""
"""Test scanner execution."""
async def test_scan_python_files(self, file_scanner, python_test_workspace):
"""Test scanning Python files"""
async def test_scan_python_files(self, file_scanner: FileScanner, python_test_workspace: Path) -> None:
"""Test scanning Python files."""
config = {
"patterns": ["*.py"],
"check_sensitive": False,
"calculate_hashes": False
"calculate_hashes": False,
}
result = await file_scanner.execute(config, python_test_workspace)
@@ -84,15 +87,15 @@ class TestFileScannerExecution:
assert len(result.findings) > 0
# Check that Python files were found
python_files = [f for f in result.findings if f.file_path.endswith('.py')]
python_files = [f for f in result.findings if f.file_path.endswith(".py")]
assert len(python_files) > 0
async def test_scan_all_files(self, file_scanner, python_test_workspace):
"""Test scanning all files with wildcard"""
async def test_scan_all_files(self, file_scanner: FileScanner, python_test_workspace: Path) -> None:
"""Test scanning all files with wildcard."""
config = {
"patterns": ["*"],
"check_sensitive": False,
"calculate_hashes": False
"calculate_hashes": False,
}
result = await file_scanner.execute(config, python_test_workspace)
@@ -101,12 +104,12 @@ class TestFileScannerExecution:
assert len(result.findings) > 0
assert result.summary["total_files"] > 0
async def test_scan_with_multiple_patterns(self, file_scanner, python_test_workspace):
"""Test scanning with multiple patterns"""
async def test_scan_with_multiple_patterns(self, file_scanner: FileScanner, python_test_workspace: Path) -> None:
"""Test scanning with multiple patterns."""
config = {
"patterns": ["*.py", "*.txt"],
"check_sensitive": False,
"calculate_hashes": False
"calculate_hashes": False,
}
result = await file_scanner.execute(config, python_test_workspace)
@@ -114,11 +117,11 @@ class TestFileScannerExecution:
assert result.status == "success"
assert len(result.findings) > 0
async def test_empty_workspace(self, file_scanner, temp_workspace):
"""Test scanning empty workspace"""
async def test_empty_workspace(self, file_scanner: FileScanner, temp_workspace: Path) -> None:
"""Test scanning empty workspace."""
config = {
"patterns": ["*.py"],
"check_sensitive": False
"check_sensitive": False,
}
result = await file_scanner.execute(config, temp_workspace)
@@ -130,17 +133,17 @@ class TestFileScannerExecution:
@pytest.mark.asyncio
class TestFileScannerSensitiveDetection:
"""Test sensitive file detection"""
"""Test sensitive file detection."""
async def test_detect_env_file(self, file_scanner, temp_workspace):
"""Test detection of .env file"""
async def test_detect_env_file(self, file_scanner: FileScanner, temp_workspace: Path) -> None:
"""Test detection of .env file."""
# Create .env file
(temp_workspace / ".env").write_text("API_KEY=secret123")
config = {
"patterns": ["*"],
"check_sensitive": True,
"calculate_hashes": False
"calculate_hashes": False,
}
result = await file_scanner.execute(config, temp_workspace)
@@ -152,14 +155,14 @@ class TestFileScannerSensitiveDetection:
assert len(sensitive_findings) > 0
assert any(".env" in f.title for f in sensitive_findings)
async def test_detect_private_key(self, file_scanner, temp_workspace):
"""Test detection of private key file"""
async def test_detect_private_key(self, file_scanner: FileScanner, temp_workspace: Path) -> None:
"""Test detection of private key file."""
# Create private key file
(temp_workspace / "id_rsa").write_text("-----BEGIN RSA PRIVATE KEY-----")
config = {
"patterns": ["*"],
"check_sensitive": True
"check_sensitive": True,
}
result = await file_scanner.execute(config, temp_workspace)
@@ -168,13 +171,13 @@ class TestFileScannerSensitiveDetection:
sensitive_findings = [f for f in result.findings if f.category == "sensitive_file"]
assert len(sensitive_findings) > 0
async def test_no_sensitive_detection_when_disabled(self, file_scanner, temp_workspace):
"""Test that sensitive detection can be disabled"""
async def test_no_sensitive_detection_when_disabled(self, file_scanner: FileScanner, temp_workspace: Path) -> None:
"""Test that sensitive detection can be disabled."""
(temp_workspace / ".env").write_text("API_KEY=secret123")
config = {
"patterns": ["*"],
"check_sensitive": False
"check_sensitive": False,
}
result = await file_scanner.execute(config, temp_workspace)
@@ -186,17 +189,17 @@ class TestFileScannerSensitiveDetection:
@pytest.mark.asyncio
class TestFileScannerHashing:
"""Test file hashing functionality"""
"""Test file hashing functionality."""
async def test_hash_calculation(self, file_scanner, temp_workspace):
"""Test SHA256 hash calculation"""
async def test_hash_calculation(self, file_scanner: FileScanner, temp_workspace: Path) -> None:
"""Test SHA256 hash calculation."""
# Create test file
test_file = temp_workspace / "test.txt"
test_file.write_text("Hello World")
config = {
"patterns": ["*.txt"],
"calculate_hashes": True
"calculate_hashes": True,
}
result = await file_scanner.execute(config, temp_workspace)
@@ -212,14 +215,14 @@ class TestFileScannerHashing:
assert finding.metadata.get("file_hash") is not None
assert len(finding.metadata["file_hash"]) == 64 # SHA256 hex length
async def test_no_hash_when_disabled(self, file_scanner, temp_workspace):
"""Test that hashing can be disabled"""
async def test_no_hash_when_disabled(self, file_scanner: FileScanner, temp_workspace: Path) -> None:
"""Test that hashing can be disabled."""
test_file = temp_workspace / "test.txt"
test_file.write_text("Hello World")
config = {
"patterns": ["*.txt"],
"calculate_hashes": False
"calculate_hashes": False,
}
result = await file_scanner.execute(config, temp_workspace)
@@ -234,10 +237,10 @@ class TestFileScannerHashing:
@pytest.mark.asyncio
class TestFileScannerFileTypes:
"""Test file type detection"""
"""Test file type detection."""
async def test_detect_python_type(self, file_scanner, temp_workspace):
"""Test detection of Python file type"""
async def test_detect_python_type(self, file_scanner: FileScanner, temp_workspace: Path) -> None:
"""Test detection of Python file type."""
(temp_workspace / "script.py").write_text("print('hello')")
config = {"patterns": ["*.py"]}
@@ -248,8 +251,8 @@ class TestFileScannerFileTypes:
assert len(py_findings) > 0
assert "python" in py_findings[0].metadata["file_type"]
async def test_detect_javascript_type(self, file_scanner, temp_workspace):
"""Test detection of JavaScript file type"""
async def test_detect_javascript_type(self, file_scanner: FileScanner, temp_workspace: Path) -> None:
"""Test detection of JavaScript file type."""
(temp_workspace / "app.js").write_text("console.log('hello')")
config = {"patterns": ["*.js"]}
@@ -260,8 +263,8 @@ class TestFileScannerFileTypes:
assert len(js_findings) > 0
assert "javascript" in js_findings[0].metadata["file_type"]
async def test_file_type_summary(self, file_scanner, temp_workspace):
"""Test that file type summary is generated"""
async def test_file_type_summary(self, file_scanner: FileScanner, temp_workspace: Path) -> None:
"""Test that file type summary is generated."""
(temp_workspace / "script.py").write_text("print('hello')")
(temp_workspace / "app.js").write_text("console.log('hello')")
(temp_workspace / "readme.txt").write_text("Documentation")
@@ -276,17 +279,17 @@ class TestFileScannerFileTypes:
@pytest.mark.asyncio
class TestFileScannerSizeLimits:
"""Test file size handling"""
"""Test file size handling."""
async def test_skip_large_files(self, file_scanner, temp_workspace):
"""Test that large files are skipped"""
async def test_skip_large_files(self, file_scanner: FileScanner, temp_workspace: Path) -> None:
"""Test that large files are skipped."""
# Create a "large" file
large_file = temp_workspace / "large.txt"
large_file.write_text("x" * 1000)
config = {
"patterns": ["*.txt"],
"max_file_size": 500 # Set limit smaller than file
"max_file_size": 500, # Set limit smaller than file
}
result = await file_scanner.execute(config, temp_workspace)
@@ -297,14 +300,14 @@ class TestFileScannerSizeLimits:
# The file should still be counted but not have a detailed finding
assert result.summary["total_files"] > 0
async def test_process_small_files(self, file_scanner, temp_workspace):
"""Test that small files are processed"""
async def test_process_small_files(self, file_scanner: FileScanner, temp_workspace: Path) -> None:
"""Test that small files are processed."""
small_file = temp_workspace / "small.txt"
small_file.write_text("small content")
config = {
"patterns": ["*.txt"],
"max_file_size": 1048576 # 1MB
"max_file_size": 1048576, # 1MB
}
result = await file_scanner.execute(config, temp_workspace)
@@ -316,10 +319,10 @@ class TestFileScannerSizeLimits:
@pytest.mark.asyncio
class TestFileScannerSummary:
"""Test result summary generation"""
"""Test result summary generation."""
async def test_summary_structure(self, file_scanner, python_test_workspace):
"""Test that summary has correct structure"""
async def test_summary_structure(self, file_scanner: FileScanner, python_test_workspace: Path) -> None:
"""Test that summary has correct structure."""
config = {"patterns": ["*"]}
result = await file_scanner.execute(config, python_test_workspace)
@@ -334,8 +337,8 @@ class TestFileScannerSummary:
assert isinstance(result.summary["file_types"], dict)
assert isinstance(result.summary["patterns_scanned"], list)
async def test_summary_counts(self, file_scanner, temp_workspace):
"""Test that summary counts are accurate"""
async def test_summary_counts(self, file_scanner: FileScanner, temp_workspace: Path) -> None:
"""Test that summary counts are accurate."""
# Create known files
(temp_workspace / "file1.py").write_text("content1")
(temp_workspace / "file2.py").write_text("content2")


@@ -1,28 +1,25 @@
"""
Unit tests for SecurityAnalyzer module
"""
"""Unit tests for SecurityAnalyzer module."""
from __future__ import annotations
import pytest
import sys
from pathlib import Path
from typing import TYPE_CHECKING
import pytest
sys.path.insert(0, str(Path(__file__).resolve().parents[3] / "toolbox"))
from modules.analyzer.security_analyzer import SecurityAnalyzer
@pytest.fixture
def security_analyzer():
"""Create SecurityAnalyzer instance"""
return SecurityAnalyzer()
if TYPE_CHECKING:
from modules.analyzer.security_analyzer import SecurityAnalyzer
@pytest.mark.asyncio
class TestSecurityAnalyzerMetadata:
"""Test SecurityAnalyzer metadata"""
"""Test SecurityAnalyzer metadata."""
async def test_metadata_structure(self, security_analyzer):
"""Test that metadata has correct structure"""
async def test_metadata_structure(self, security_analyzer: SecurityAnalyzer) -> None:
"""Test that metadata has correct structure."""
metadata = security_analyzer.get_metadata()
assert metadata.name == "security_analyzer"
@@ -35,25 +32,25 @@ class TestSecurityAnalyzerMetadata:
@pytest.mark.asyncio
class TestSecurityAnalyzerConfigValidation:
"""Test configuration validation"""
"""Test configuration validation."""
async def test_valid_config(self, security_analyzer):
"""Test that valid config passes validation"""
async def test_valid_config(self, security_analyzer: SecurityAnalyzer) -> None:
"""Test that valid config passes validation."""
config = {
"file_extensions": [".py", ".js"],
"check_secrets": True,
"check_sql": True,
"check_dangerous_functions": True
"check_dangerous_functions": True,
}
assert security_analyzer.validate_config(config) is True
async def test_default_config(self, security_analyzer):
"""Test that empty config uses defaults"""
async def test_default_config(self, security_analyzer: SecurityAnalyzer) -> None:
"""Test that empty config uses defaults."""
config = {}
assert security_analyzer.validate_config(config) is True
async def test_invalid_extensions_type(self, security_analyzer):
"""Test that non-list extensions raises error"""
async def test_invalid_extensions_type(self, security_analyzer: SecurityAnalyzer) -> None:
"""Test that non-list extensions raises error."""
config = {"file_extensions": ".py"}
with pytest.raises(ValueError, match="file_extensions must be a list"):
security_analyzer.validate_config(config)
@@ -61,10 +58,10 @@ class TestSecurityAnalyzerConfigValidation:
@pytest.mark.asyncio
class TestSecurityAnalyzerSecretDetection:
"""Test hardcoded secret detection"""
"""Test hardcoded secret detection."""
async def test_detect_api_key(self, security_analyzer, temp_workspace):
"""Test detection of hardcoded API key"""
async def test_detect_api_key(self, security_analyzer: SecurityAnalyzer, temp_workspace: Path) -> None:
"""Test detection of hardcoded API key."""
code_file = temp_workspace / "config.py"
code_file.write_text("""
# Configuration file
@@ -76,7 +73,7 @@ database_url = "postgresql://localhost/db"
"file_extensions": [".py"],
"check_secrets": True,
"check_sql": False,
"check_dangerous_functions": False
"check_dangerous_functions": False,
}
result = await security_analyzer.execute(config, temp_workspace)
@@ -86,8 +83,8 @@ database_url = "postgresql://localhost/db"
assert len(secret_findings) > 0
assert any("API Key" in f.title for f in secret_findings)
async def test_detect_password(self, security_analyzer, temp_workspace):
"""Test detection of hardcoded password"""
async def test_detect_password(self, security_analyzer: SecurityAnalyzer, temp_workspace: Path) -> None:
"""Test detection of hardcoded password."""
code_file = temp_workspace / "auth.py"
code_file.write_text("""
def connect():
@@ -99,7 +96,7 @@ def connect():
"file_extensions": [".py"],
"check_secrets": True,
"check_sql": False,
"check_dangerous_functions": False
"check_dangerous_functions": False,
}
result = await security_analyzer.execute(config, temp_workspace)
@@ -108,8 +105,8 @@ def connect():
secret_findings = [f for f in result.findings if f.category == "hardcoded_secret"]
assert len(secret_findings) > 0
async def test_detect_aws_credentials(self, security_analyzer, temp_workspace):
"""Test detection of AWS credentials"""
async def test_detect_aws_credentials(self, security_analyzer: SecurityAnalyzer, temp_workspace: Path) -> None:
"""Test detection of AWS credentials."""
code_file = temp_workspace / "aws_config.py"
code_file.write_text("""
aws_access_key = "AKIAIOSFODNN7REALKEY"
@@ -118,7 +115,7 @@ aws_secret_key = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYREALKEY"
config = {
"file_extensions": [".py"],
"check_secrets": True
"check_secrets": True,
}
result = await security_analyzer.execute(config, temp_workspace)
@@ -127,14 +124,18 @@ aws_secret_key = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYREALKEY"
aws_findings = [f for f in result.findings if "AWS" in f.title]
assert len(aws_findings) >= 2 # Both access key and secret key
async def test_no_secret_detection_when_disabled(self, security_analyzer, temp_workspace):
"""Test that secret detection can be disabled"""
async def test_no_secret_detection_when_disabled(
self,
security_analyzer: SecurityAnalyzer,
temp_workspace: Path,
) -> None:
"""Test that secret detection can be disabled."""
code_file = temp_workspace / "config.py"
code_file.write_text('api_key = "sk_live_1234567890abcdef"')
config = {
"file_extensions": [".py"],
"check_secrets": False
"check_secrets": False,
}
result = await security_analyzer.execute(config, temp_workspace)
@@ -146,10 +147,10 @@ aws_secret_key = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYREALKEY"
@pytest.mark.asyncio
class TestSecurityAnalyzerSQLInjection:
"""Test SQL injection detection"""
"""Test SQL injection detection."""
async def test_detect_string_concatenation(self, security_analyzer, temp_workspace):
"""Test detection of SQL string concatenation"""
async def test_detect_string_concatenation(self, security_analyzer: SecurityAnalyzer, temp_workspace: Path) -> None:
"""Test detection of SQL string concatenation."""
code_file = temp_workspace / "db.py"
code_file.write_text("""
def get_user(user_id):
@@ -161,7 +162,7 @@ def get_user(user_id):
"file_extensions": [".py"],
"check_secrets": False,
"check_sql": True,
"check_dangerous_functions": False
"check_dangerous_functions": False,
}
result = await security_analyzer.execute(config, temp_workspace)
@@ -170,8 +171,8 @@ def get_user(user_id):
sql_findings = [f for f in result.findings if f.category == "sql_injection"]
assert len(sql_findings) > 0
async def test_detect_f_string_sql(self, security_analyzer, temp_workspace):
"""Test detection of f-string in SQL"""
async def test_detect_f_string_sql(self, security_analyzer: SecurityAnalyzer, temp_workspace: Path) -> None:
"""Test detection of f-string in SQL."""
code_file = temp_workspace / "db.py"
code_file.write_text("""
def get_user(name):
@@ -181,7 +182,7 @@ def get_user(name):
config = {
"file_extensions": [".py"],
"check_sql": True
"check_sql": True,
}
result = await security_analyzer.execute(config, temp_workspace)
@@ -190,8 +191,12 @@ def get_user(name):
sql_findings = [f for f in result.findings if f.category == "sql_injection"]
assert len(sql_findings) > 0
async def test_detect_dynamic_query_building(self, security_analyzer, temp_workspace):
"""Test detection of dynamic query building"""
async def test_detect_dynamic_query_building(
self,
security_analyzer: SecurityAnalyzer,
temp_workspace: Path,
) -> None:
"""Test detection of dynamic query building."""
code_file = temp_workspace / "queries.py"
code_file.write_text("""
def search(keyword):
@@ -201,7 +206,7 @@ def search(keyword):
config = {
"file_extensions": [".py"],
"check_sql": True
"check_sql": True,
}
result = await security_analyzer.execute(config, temp_workspace)
@@ -210,14 +215,18 @@ def search(keyword):
sql_findings = [f for f in result.findings if f.category == "sql_injection"]
assert len(sql_findings) > 0
async def test_no_sql_detection_when_disabled(self, security_analyzer, temp_workspace):
"""Test that SQL detection can be disabled"""
async def test_no_sql_detection_when_disabled(
self,
security_analyzer: SecurityAnalyzer,
temp_workspace: Path,
) -> None:
"""Test that SQL detection can be disabled."""
code_file = temp_workspace / "db.py"
code_file.write_text('query = "SELECT * FROM users WHERE id = " + user_id')
config = {
"file_extensions": [".py"],
"check_sql": False
"check_sql": False,
}
result = await security_analyzer.execute(config, temp_workspace)
@@ -229,10 +238,10 @@ def search(keyword):
@pytest.mark.asyncio
class TestSecurityAnalyzerDangerousFunctions:
"""Test dangerous function detection"""
"""Test dangerous function detection."""
async def test_detect_eval(self, security_analyzer, temp_workspace):
"""Test detection of eval() usage"""
async def test_detect_eval(self, security_analyzer: SecurityAnalyzer, temp_workspace: Path) -> None:
"""Test detection of eval() usage."""
code_file = temp_workspace / "dangerous.py"
code_file.write_text("""
def process_input(user_input):
@@ -244,7 +253,7 @@ def process_input(user_input):
"file_extensions": [".py"],
"check_secrets": False,
"check_sql": False,
"check_dangerous_functions": True
"check_dangerous_functions": True,
}
result = await security_analyzer.execute(config, temp_workspace)
@@ -254,8 +263,8 @@ def process_input(user_input):
assert len(dangerous_findings) > 0
assert any("eval" in f.title.lower() for f in dangerous_findings)
async def test_detect_exec(self, security_analyzer, temp_workspace):
"""Test detection of exec() usage"""
async def test_detect_exec(self, security_analyzer: SecurityAnalyzer, temp_workspace: Path) -> None:
"""Test detection of exec() usage."""
code_file = temp_workspace / "runner.py"
code_file.write_text("""
def run_code(code):
@@ -264,7 +273,7 @@ def run_code(code):
config = {
"file_extensions": [".py"],
"check_dangerous_functions": True
"check_dangerous_functions": True,
}
result = await security_analyzer.execute(config, temp_workspace)
@@ -273,8 +282,8 @@ def run_code(code):
dangerous_findings = [f for f in result.findings if f.category == "dangerous_function"]
assert len(dangerous_findings) > 0
async def test_detect_os_system(self, security_analyzer, temp_workspace):
"""Test detection of os.system() usage"""
async def test_detect_os_system(self, security_analyzer: SecurityAnalyzer, temp_workspace: Path) -> None:
"""Test detection of os.system() usage."""
code_file = temp_workspace / "commands.py"
code_file.write_text("""
import os
@@ -285,7 +294,7 @@ def run_command(cmd):
config = {
"file_extensions": [".py"],
"check_dangerous_functions": True
"check_dangerous_functions": True,
}
result = await security_analyzer.execute(config, temp_workspace)
@@ -295,8 +304,8 @@ def run_command(cmd):
assert len(dangerous_findings) > 0
assert any("os.system" in f.title for f in dangerous_findings)
async def test_detect_pickle_loads(self, security_analyzer, temp_workspace):
"""Test detection of pickle.loads() usage"""
async def test_detect_pickle_loads(self, security_analyzer: SecurityAnalyzer, temp_workspace: Path) -> None:
"""Test detection of pickle.loads() usage."""
code_file = temp_workspace / "serializer.py"
code_file.write_text("""
import pickle
@@ -307,7 +316,7 @@ def deserialize(data):
config = {
"file_extensions": [".py"],
"check_dangerous_functions": True
"check_dangerous_functions": True,
}
result = await security_analyzer.execute(config, temp_workspace)
@@ -316,8 +325,8 @@ def deserialize(data):
dangerous_findings = [f for f in result.findings if f.category == "dangerous_function"]
assert len(dangerous_findings) > 0
async def test_detect_javascript_eval(self, security_analyzer, temp_workspace):
"""Test detection of eval() in JavaScript"""
async def test_detect_javascript_eval(self, security_analyzer: SecurityAnalyzer, temp_workspace: Path) -> None:
"""Test detection of eval() in JavaScript."""
code_file = temp_workspace / "app.js"
code_file.write_text("""
function processInput(userInput) {
@@ -327,7 +336,7 @@ function processInput(userInput) {
config = {
"file_extensions": [".js"],
"check_dangerous_functions": True
"check_dangerous_functions": True,
}
result = await security_analyzer.execute(config, temp_workspace)
@@ -336,8 +345,8 @@ function processInput(userInput) {
dangerous_findings = [f for f in result.findings if f.category == "dangerous_function"]
assert len(dangerous_findings) > 0
async def test_detect_innerHTML(self, security_analyzer, temp_workspace):
"""Test detection of innerHTML (XSS risk)"""
async def test_detect_inner_html(self, security_analyzer: SecurityAnalyzer, temp_workspace: Path) -> None:
"""Test detection of innerHTML (XSS risk)."""
code_file = temp_workspace / "dom.js"
code_file.write_text("""
function updateContent(html) {
@@ -347,7 +356,7 @@ function updateContent(html) {
config = {
"file_extensions": [".js"],
"check_dangerous_functions": True
"check_dangerous_functions": True,
}
result = await security_analyzer.execute(config, temp_workspace)
@@ -356,14 +365,18 @@ function updateContent(html) {
dangerous_findings = [f for f in result.findings if f.category == "dangerous_function"]
assert len(dangerous_findings) > 0
async def test_no_dangerous_detection_when_disabled(self, security_analyzer, temp_workspace):
"""Test that dangerous function detection can be disabled"""
async def test_no_dangerous_detection_when_disabled(
self,
security_analyzer: SecurityAnalyzer,
temp_workspace: Path,
) -> None:
"""Test that dangerous function detection can be disabled."""
code_file = temp_workspace / "code.py"
code_file.write_text('result = eval(user_input)')
code_file.write_text("result = eval(user_input)")
config = {
"file_extensions": [".py"],
"check_dangerous_functions": False
"check_dangerous_functions": False,
}
result = await security_analyzer.execute(config, temp_workspace)
@@ -375,10 +388,14 @@ function updateContent(html) {
@pytest.mark.asyncio
class TestSecurityAnalyzerMultipleIssues:
"""Test detection of multiple issues in same file"""
"""Test detection of multiple issues in same file."""
async def test_detect_multiple_vulnerabilities(self, security_analyzer, temp_workspace):
"""Test detection of multiple vulnerability types"""
async def test_detect_multiple_vulnerabilities(
self,
security_analyzer: SecurityAnalyzer,
temp_workspace: Path,
) -> None:
"""Test detection of multiple vulnerability types."""
code_file = temp_workspace / "vulnerable.py"
code_file.write_text("""
import os
@@ -404,7 +421,7 @@ def process_query(user_input):
"file_extensions": [".py"],
"check_secrets": True,
"check_sql": True,
"check_dangerous_functions": True
"check_dangerous_functions": True,
}
result = await security_analyzer.execute(config, temp_workspace)
@@ -423,10 +440,10 @@ def process_query(user_input):
@pytest.mark.asyncio
class TestSecurityAnalyzerSummary:
"""Test result summary generation"""
"""Test result summary generation."""
async def test_summary_structure(self, security_analyzer, temp_workspace):
"""Test that summary has correct structure"""
async def test_summary_structure(self, security_analyzer: SecurityAnalyzer, temp_workspace: Path) -> None:
"""Test that summary has correct structure."""
(temp_workspace / "test.py").write_text("print('hello')")
config = {"file_extensions": [".py"]}
@@ -441,16 +458,16 @@ class TestSecurityAnalyzerSummary:
assert isinstance(result.summary["total_findings"], int)
assert isinstance(result.summary["extensions_scanned"], list)
async def test_empty_workspace(self, security_analyzer, temp_workspace):
"""Test analyzing empty workspace"""
async def test_empty_workspace(self, security_analyzer: SecurityAnalyzer, temp_workspace: Path) -> None:
"""Test analyzing empty workspace."""
config = {"file_extensions": [".py"]}
result = await security_analyzer.execute(config, temp_workspace)
assert result.status == "partial" # No files found
assert result.summary["files_analyzed"] == 0
async def test_analyze_multiple_file_types(self, security_analyzer, temp_workspace):
"""Test analyzing multiple file types"""
async def test_analyze_multiple_file_types(self, security_analyzer: SecurityAnalyzer, temp_workspace: Path) -> None:
"""Test analyzing multiple file types."""
(temp_workspace / "app.py").write_text("print('hello')")
(temp_workspace / "script.js").write_text("console.log('hello')")
(temp_workspace / "index.php").write_text("<?php echo 'hello'; ?>")
@@ -464,10 +481,10 @@ class TestSecurityAnalyzerSummary:
@pytest.mark.asyncio
class TestSecurityAnalyzerFalsePositives:
"""Test false positive filtering"""
"""Test false positive filtering."""
async def test_skip_test_secrets(self, security_analyzer, temp_workspace):
"""Test that test/example secrets are filtered"""
async def test_skip_test_secrets(self, security_analyzer: SecurityAnalyzer, temp_workspace: Path) -> None:
"""Test that test/example secrets are filtered."""
code_file = temp_workspace / "test_config.py"
code_file.write_text("""
# Test configuration - should be filtered
@@ -478,7 +495,7 @@ token = "sample_token_placeholder"
config = {
"file_extensions": [".py"],
"check_secrets": True
"check_secrets": True,
}
result = await security_analyzer.execute(config, temp_workspace)
@@ -488,6 +505,6 @@ token = "sample_token_placeholder"
secret_findings = [f for f in result.findings if f.category == "hardcoded_secret"]
# Should have fewer or no findings due to false positive filtering
assert len(secret_findings) == 0 or all(
not any(fp in f.description.lower() for fp in ['test', 'example', 'dummy', 'sample'])
not any(fp in f.description.lower() for fp in ["test", "example", "dummy", "sample"])
for f in secret_findings
)


@@ -10,5 +10,7 @@
# Additional attribution and requirements are provided in the NOTICE file.
from .security_analyzer import SecurityAnalyzer
from .bandit_analyzer import BanditAnalyzer
from .mypy_analyzer import MypyAnalyzer
__all__ = ["SecurityAnalyzer"]
__all__ = ["SecurityAnalyzer", "BanditAnalyzer", "MypyAnalyzer"]


@@ -0,0 +1,328 @@
"""
Bandit Analyzer Module - Analyzes Python code for security issues using Bandit
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import asyncio
import json
import logging
import time
from pathlib import Path
from typing import Dict, Any, List
try:
from toolbox.modules.base import BaseModule, ModuleMetadata, ModuleResult, ModuleFinding
except ImportError:
try:
from modules.base import BaseModule, ModuleMetadata, ModuleResult, ModuleFinding
except ImportError:
from src.toolbox.modules.base import BaseModule, ModuleMetadata, ModuleResult, ModuleFinding
logger = logging.getLogger(__name__)
class BanditAnalyzer(BaseModule):
"""
Analyzes Python code for security issues using Bandit.
This module:
- Runs Bandit security linter on Python files
- Detects common security issues (SQL injection, hardcoded secrets, etc.)
- Reports findings with severity levels
"""
# Severity mapping from Bandit levels to our standard
SEVERITY_MAP = {
"LOW": "low",
"MEDIUM": "medium",
"HIGH": "high"
}
def get_metadata(self) -> ModuleMetadata:
"""Get module metadata"""
return ModuleMetadata(
name="bandit_analyzer",
version="1.0.0",
description="Analyzes Python code for security issues using Bandit",
author="FuzzForge Team",
category="analyzer",
tags=["python", "security", "bandit", "sast"],
input_schema={
"severity_level": {
"type": "string",
"enum": ["low", "medium", "high"],
"description": "Minimum severity level to report",
"default": "low"
},
"confidence_level": {
"type": "string",
"enum": ["low", "medium", "high"],
"description": "Minimum confidence level to report",
"default": "medium"
},
"exclude_tests": {
"type": "boolean",
"description": "Exclude test files from analysis",
"default": True
},
"skip_ids": {
"type": "array",
"items": {"type": "string"},
"description": "List of Bandit test IDs to skip",
"default": []
}
},
output_schema={
"findings": {
"type": "array",
"description": "List of security issues found by Bandit"
}
},
requires_workspace=True
)
def validate_config(self, config: Dict[str, Any]) -> bool:
"""Validate module configuration"""
severity = config.get("severity_level", "low")
if severity not in ["low", "medium", "high"]:
raise ValueError("severity_level must be one of: low, medium, high")
confidence = config.get("confidence_level", "medium")
if confidence not in ["low", "medium", "high"]:
raise ValueError("confidence_level must be one of: low, medium, high")
skip_ids = config.get("skip_ids", [])
if not isinstance(skip_ids, list):
raise ValueError("skip_ids must be a list")
return True
async def _run_bandit(
self,
workspace: Path,
severity_level: str,
confidence_level: str,
exclude_tests: bool,
skip_ids: List[str]
) -> Dict[str, Any]:
"""
Run Bandit on the workspace.
Args:
workspace: Path to workspace
severity_level: Minimum severity to report
confidence_level: Minimum confidence to report
exclude_tests: Whether to exclude test files
skip_ids: List of test IDs to skip
Returns:
Bandit JSON output as dict
"""
try:
# Build bandit command
cmd = [
"bandit",
"-r", str(workspace),
"-f", "json",
"-ll", # Report all findings (we'll filter later)
]
# Add exclude patterns for test files
if exclude_tests:
cmd.extend(["-x", "*/test_*.py,*/tests/*,*_test.py"])
# Add skip IDs if specified
if skip_ids:
cmd.extend(["-s", ",".join(skip_ids)])
logger.info(f"Running Bandit on: {workspace}")
process = await asyncio.create_subprocess_exec(
*cmd,
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE
)
stdout, stderr = await process.communicate()
# Bandit returns non-zero if issues found, which is expected
if process.returncode not in [0, 1]:
logger.error(f"Bandit failed: {stderr.decode()}")
return {"results": []}
# Parse JSON output
result = json.loads(stdout.decode())
return result
except Exception as e:
logger.error(f"Error running Bandit: {e}")
return {"results": []}
def _should_include_finding(
self,
issue: Dict[str, Any],
min_severity: str,
min_confidence: str
) -> bool:
"""
Determine if a Bandit issue should be included based on severity/confidence.
Args:
issue: Bandit issue dict
min_severity: Minimum severity threshold
min_confidence: Minimum confidence threshold
Returns:
True if issue should be included
"""
severity_order = ["low", "medium", "high"]
issue_severity = issue.get("issue_severity", "LOW").lower()
issue_confidence = issue.get("issue_confidence", "LOW").lower()
severity_meets_threshold = severity_order.index(issue_severity) >= severity_order.index(min_severity)
confidence_meets_threshold = severity_order.index(issue_confidence) >= severity_order.index(min_confidence)
return severity_meets_threshold and confidence_meets_threshold
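The gating logic above can be sketched in isolation. The sample issue dict below is hypothetical; its field names mirror Bandit's JSON output (`issue_severity`, `issue_confidence`):

```python
severity_order = ["low", "medium", "high"]

def should_include(issue, min_severity, min_confidence):
    # Bandit reports severity/confidence in upper case; normalize before ranking.
    sev = issue.get("issue_severity", "LOW").lower()
    conf = issue.get("issue_confidence", "LOW").lower()
    return (
        severity_order.index(sev) >= severity_order.index(min_severity)
        and severity_order.index(conf) >= severity_order.index(min_confidence)
    )

issue = {"issue_severity": "MEDIUM", "issue_confidence": "HIGH"}
should_include(issue, "low", "medium")   # passes both thresholds
should_include(issue, "high", "medium")  # fails the severity threshold
```

An issue must clear both thresholds; either axis alone is not enough.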
def _convert_to_findings(
self,
bandit_result: Dict[str, Any],
workspace: Path,
min_severity: str,
min_confidence: str
) -> List[ModuleFinding]:
"""
Convert Bandit results to ModuleFindings.
Args:
bandit_result: Bandit JSON output
workspace: Workspace path for relative paths
min_severity: Minimum severity to include
min_confidence: Minimum confidence to include
Returns:
List of ModuleFindings
"""
findings = []
for issue in bandit_result.get("results", []):
# Filter by severity and confidence
if not self._should_include_finding(issue, min_severity, min_confidence):
continue
# Extract issue details
test_id = issue.get("test_id", "B000")
test_name = issue.get("test_name", "unknown")
issue_text = issue.get("issue_text", "No description")
severity = self.SEVERITY_MAP.get(issue.get("issue_severity", "LOW"), "low")
# File location
filename = issue.get("filename", "")
line_number = issue.get("line_number", 0)
code = issue.get("code", "")
# Try to get relative path
try:
file_path = Path(filename)
rel_path = file_path.relative_to(workspace)
except (ValueError, TypeError):
rel_path = Path(filename).name
# Create finding
finding = self.create_finding(
title=f"{test_name} ({test_id})",
description=issue_text,
severity=severity,
category="security-issue",
file_path=str(rel_path),
line_start=line_number,
line_end=line_number,
code_snippet=code.strip() if code else None,
recommendation=f"Review and fix the security issue identified by Bandit test {test_id}",
metadata={
"test_id": test_id,
"test_name": test_name,
"confidence": issue.get("issue_confidence", "LOW").lower(),
"cwe": issue.get("issue_cwe", {}).get("id") if issue.get("issue_cwe") else None,
"more_info": issue.get("more_info", "")
}
)
findings.append(finding)
return findings
async def execute(self, config: Dict[str, Any], workspace: Path) -> ModuleResult:
"""
Execute the Bandit analyzer module.
Args:
config: Module configuration
workspace: Path to workspace
Returns:
ModuleResult with security findings
"""
start_time = time.time()
metadata = self.get_metadata()
# Validate inputs
self.validate_config(config)
self.validate_workspace(workspace)
# Get configuration
severity_level = config.get("severity_level", "low")
confidence_level = config.get("confidence_level", "medium")
exclude_tests = config.get("exclude_tests", True)
skip_ids = config.get("skip_ids", [])
# Run Bandit
logger.info("Starting Bandit analysis...")
bandit_result = await self._run_bandit(
workspace,
severity_level,
confidence_level,
exclude_tests,
skip_ids
)
# Convert to findings
findings = self._convert_to_findings(
bandit_result,
workspace,
severity_level,
confidence_level
)
# Calculate summary
severity_counts = {}
for finding in findings:
sev = finding.severity
severity_counts[sev] = severity_counts.get(sev, 0) + 1
execution_time = time.time() - start_time
return ModuleResult(
module=metadata.name,
version=metadata.version,
status="success",
execution_time=execution_time,
findings=findings,
summary={
"total_issues": len(findings),
"by_severity": severity_counts,
"files_analyzed": len(set(f.file_path for f in findings if f.file_path))
},
metadata={
"bandit_version": bandit_result.get("generated_at", "unknown"),
"metrics": bandit_result.get("metrics", {})
}
)
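
The severity/confidence gate in `_should_include_finding` ranks both values on the same three-step ladder and keeps a finding only when each meets its threshold. A minimal standalone sketch of that ordinal comparison (names are illustrative, not part of the module):

```python
# Sketch of the ordinal threshold check used by _should_include_finding.
SEVERITY_ORDER = ["low", "medium", "high"]

def meets_threshold(value: str, minimum: str) -> bool:
    """Return True when `value` ranks at or above `minimum` on the scale."""
    return SEVERITY_ORDER.index(value.lower()) >= SEVERITY_ORDER.index(minimum.lower())

print(meets_threshold("HIGH", "medium"))  # a high finding passes a medium bar: True
print(meets_threshold("low", "medium"))   # a low finding is filtered out: False
```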


@@ -0,0 +1,269 @@
"""
Mypy Analyzer Module - Analyzes Python code for type safety issues using Mypy
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import asyncio
import logging
import re
import time
from pathlib import Path
from typing import Dict, Any, List
try:
from toolbox.modules.base import BaseModule, ModuleMetadata, ModuleResult, ModuleFinding
except ImportError:
try:
from modules.base import BaseModule, ModuleMetadata, ModuleResult, ModuleFinding
except ImportError:
from src.toolbox.modules.base import BaseModule, ModuleMetadata, ModuleResult, ModuleFinding
logger = logging.getLogger(__name__)
class MypyAnalyzer(BaseModule):
"""
Analyzes Python code for type safety issues using Mypy.
This module:
- Runs Mypy type checker on Python files
- Detects type errors and inconsistencies
- Reports findings with configurable strictness
"""
# Map Mypy error codes to severity
ERROR_SEVERITY_MAP = {
"error": "medium",
"note": "info"
}
def get_metadata(self) -> ModuleMetadata:
"""Get module metadata"""
return ModuleMetadata(
name="mypy_analyzer",
version="1.0.0",
description="Analyzes Python code for type safety issues using Mypy",
author="FuzzForge Team",
category="analyzer",
tags=["python", "type-checking", "mypy", "sast"],
input_schema={
"strict_mode": {
"type": "boolean",
"description": "Enable strict type checking",
"default": False
},
"ignore_missing_imports": {
"type": "boolean",
"description": "Ignore errors about missing imports",
"default": True
},
"follow_imports": {
"type": "string",
"enum": ["normal", "silent", "skip", "error"],
"description": "How to handle imports",
"default": "silent"
}
},
output_schema={
"findings": {
"type": "array",
"description": "List of type errors found by Mypy"
}
},
requires_workspace=True
)
def validate_config(self, config: Dict[str, Any]) -> bool:
"""Validate module configuration"""
follow_imports = config.get("follow_imports", "silent")
if follow_imports not in ["normal", "silent", "skip", "error"]:
raise ValueError("follow_imports must be one of: normal, silent, skip, error")
return True
async def _run_mypy(
self,
workspace: Path,
strict_mode: bool,
ignore_missing_imports: bool,
follow_imports: str
) -> str:
"""
Run Mypy on the workspace.
Args:
workspace: Path to workspace
strict_mode: Enable strict checking
ignore_missing_imports: Ignore missing import errors
follow_imports: How to handle imports
Returns:
Mypy output as string
"""
try:
# Build mypy command
cmd = [
"mypy",
str(workspace),
"--show-column-numbers",
"--no-error-summary",
f"--follow-imports={follow_imports}"
]
if strict_mode:
cmd.append("--strict")
if ignore_missing_imports:
cmd.append("--ignore-missing-imports")
logger.info(f"Running Mypy on: {workspace}")
process = await asyncio.create_subprocess_exec(
*cmd,
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE
)
stdout, stderr = await process.communicate()
# Mypy exits 1 when type errors are found; exit codes of 2 or more indicate a crash or usage error
if process.returncode not in (0, 1):
    logger.error(f"Mypy failed: {stderr.decode()}")
    return ""
return stdout.decode()
except Exception as e:
logger.error(f"Error running Mypy: {e}")
return ""
def _parse_mypy_output(self, output: str, workspace: Path) -> List[ModuleFinding]:
"""
Parse Mypy output and convert to findings.
Mypy output format:
file.py:10:5: error: Incompatible return value type [return-value]
file.py:15: note: See https://...
Args:
output: Mypy stdout
workspace: Workspace path for relative paths
Returns:
List of ModuleFindings
"""
findings = []
# Regex to parse mypy output lines
# Format: filename:line:column: level: message [error-code]
pattern = r'^(.+?):(\d+)(?::(\d+))?: (error|note): (.+?)(?:\s+\[([^\]]+)\])?$'
for line in output.splitlines():
match = re.match(pattern, line.strip())
if not match:
continue
filename, line_num, column, level, message, error_code = match.groups()
# Convert to relative path
try:
file_path = Path(filename)
rel_path = file_path.relative_to(workspace)
except (ValueError, TypeError):
rel_path = Path(filename).name
# Skip notes that carry no error code (informational follow-ups to a prior error)
if level == "note" and not error_code:
continue
# Map severity
severity = self.ERROR_SEVERITY_MAP.get(level, "medium")
# Create finding
title = f"Type error: {error_code or 'type-issue'}"
description = message
finding = self.create_finding(
title=title,
description=description,
severity=severity,
category="type-error",
file_path=str(rel_path),
line_start=int(line_num),
line_end=int(line_num),
recommendation="Review and fix the type inconsistency or add appropriate type annotations",
metadata={
"error_code": error_code or "unknown",
"column": int(column) if column else None,
"level": level
}
)
findings.append(finding)
return findings
async def execute(self, config: Dict[str, Any], workspace: Path) -> ModuleResult:
"""
Execute the Mypy analyzer module.
Args:
config: Module configuration
workspace: Path to workspace
Returns:
ModuleResult with type checking findings
"""
start_time = time.time()
metadata = self.get_metadata()
# Validate inputs
self.validate_config(config)
self.validate_workspace(workspace)
# Get configuration
strict_mode = config.get("strict_mode", False)
ignore_missing_imports = config.get("ignore_missing_imports", True)
follow_imports = config.get("follow_imports", "silent")
# Run Mypy
logger.info("Starting Mypy analysis...")
mypy_output = await self._run_mypy(
workspace,
strict_mode,
ignore_missing_imports,
follow_imports
)
# Parse output to findings
findings = self._parse_mypy_output(mypy_output, workspace)
# Calculate summary
error_code_counts = {}
for finding in findings:
code = finding.metadata.get("error_code", "unknown")
error_code_counts[code] = error_code_counts.get(code, 0) + 1
execution_time = time.time() - start_time
return ModuleResult(
module=metadata.name,
version=metadata.version,
status="success",
execution_time=execution_time,
findings=findings,
summary={
"total_errors": len(findings),
"by_error_code": error_code_counts,
"files_with_errors": len(set(f.file_path for f in findings if f.file_path))
},
metadata={
"strict_mode": strict_mode,
"ignore_missing_imports": ignore_missing_imports
}
)
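
The line-oriented regex in `_parse_mypy_output` can be exercised on a typical Mypy diagnostic. This sketch applies the same pattern to one sample line (the filename and message are illustrative):

```python
import re

# Same pattern as _parse_mypy_output: filename:line[:column]: level: message [error-code]
PATTERN = r'^(.+?):(\d+)(?::(\d+))?: (error|note): (.+?)(?:\s+\[([^\]]+)\])?$'

line = 'app.py:10:5: error: Incompatible return value type [return-value]'
match = re.match(PATTERN, line)
filename, line_num, column, level, message, error_code = match.groups()
print(filename, line_num, column, level, error_code)
```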


@@ -0,0 +1,31 @@
"""
Android Security Analysis Modules
Modules for Android application security testing:
- JadxDecompiler: APK decompilation using Jadx
- MobSFScanner: Mobile security analysis using MobSF
- OpenGrepAndroid: Static analysis using OpenGrep/Semgrep with Android-specific rules
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
from .jadx_decompiler import JadxDecompiler
from .opengrep_android import OpenGrepAndroid
# MobSF is optional (not available on ARM64 platform)
try:
from .mobsf_scanner import MobSFScanner
__all__ = ["JadxDecompiler", "MobSFScanner", "OpenGrepAndroid"]
except ImportError:
# MobSF dependencies not available (e.g., ARM64 platform)
MobSFScanner = None
__all__ = ["JadxDecompiler", "OpenGrepAndroid"]


@@ -0,0 +1,15 @@
rules:
- id: clipboard-sensitive-data
severity: WARNING
languages: [java]
message: "Sensitive data may be copied to the clipboard."
metadata:
authors:
- Guerric ELOI (FuzzingLabs)
category: security
area: clipboard
verification-level: [L1]
paths:
include:
- "**/*.java"
pattern: "$CLIPBOARD.setPrimaryClip($CLIP)"


@@ -0,0 +1,23 @@
rules:
- id: hardcoded-secrets
severity: WARNING
languages: [java]
message: "Possible hardcoded secret found in variable '$NAME'."
metadata:
authors:
- Guerric ELOI (FuzzingLabs)
owasp-mobile: M2
category: secrets
verification-level: [L1]
paths:
include:
- "**/*.java"
patterns:
- pattern-either:
- pattern: 'String $NAME = "$VAL";'
- pattern: 'final String $NAME = "$VAL";'
- pattern: 'private String $NAME = "$VAL";'
- pattern: 'public static String $NAME = "$VAL";'
- pattern: 'static final String $NAME = "$VAL";'
- metavariable-regex:
    metavariable: $NAME
    regex: '(?i).*(api|key|token|secret|pass|auth|session|bearer|access|private).*'
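
The keyword heuristic this rule applies to the variable name can be checked in isolation. A small Python sketch running the same case-insensitive regex against candidate names (the sample names are illustrative):

```python
import re

# Same keyword heuristic the rule applies to the matched variable name
NAME_RE = re.compile(r"(?i).*(api|key|token|secret|pass|auth|session|bearer|access|private).*")

for name in ["apiKey", "userPassword", "displayName"]:
    print(name, bool(NAME_RE.fullmatch(name)))
```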


@@ -0,0 +1,18 @@
rules:
- id: insecure-data-storage
severity: WARNING
languages: [java]
message: "Potential insecure data storage (external storage)."
metadata:
authors:
- Guerric ELOI (FuzzingLabs)
owasp-mobile: M2
category: security
area: storage
verification-level: [L1]
paths:
include:
- "**/*.java"
pattern-either:
- pattern: "$CTX.openFileOutput($NAME, $MODE)"
- pattern: "Environment.getExternalStorageDirectory()"


@@ -0,0 +1,16 @@
rules:
- id: insecure-deeplink
severity: WARNING
languages: [xml]
message: "Potential insecure deeplink found in intent-filter."
metadata:
authors:
- Guerric ELOI (FuzzingLabs)
category: component
area: manifest
verification-level: [L1]
paths:
include:
- "**/AndroidManifest.xml"
pattern: |
<intent-filter>


@@ -0,0 +1,21 @@
rules:
- id: insecure-logging
severity: WARNING
languages: [java]
message: "Sensitive data logged via Android Log API."
metadata:
authors:
- Guerric ELOI (FuzzingLabs)
owasp-mobile: M2
category: logging
verification-level: [L1]
paths:
include:
- "**/*.java"
patterns:
- pattern-either:
- pattern: "Log.d($TAG, $MSG)"
- pattern: "Log.e($TAG, $MSG)"
- pattern: "System.out.println($MSG)"
- metavariable-regex:
    metavariable: $MSG
    regex: '(?i).*(password|token|secret|api|auth|session).*'


@@ -0,0 +1,15 @@
rules:
- id: intent-redirection
severity: WARNING
languages: [java]
message: "Potential intent redirection: using getIntent().getExtras() without validation."
metadata:
authors:
- Guerric ELOI (FuzzingLabs)
category: intent
area: intercomponent
verification-level: [L1]
paths:
include:
- "**/*.java"
pattern: "$ACT.getIntent().getExtras()"

View File

@@ -0,0 +1,18 @@
rules:
- id: sensitive-data-in-shared-preferences
severity: WARNING
languages: [java]
message: "Sensitive data may be stored in SharedPreferences. Please review the key '$KEY'."
metadata:
authors:
- Guerric ELOI (FuzzingLabs)
owasp-mobile: M2
category: security
area: storage
verification-level: [L1]
paths:
include:
- "**/*.java"
patterns:
- pattern: "$EDITOR.putString($KEY, $VAL);"
- metavariable-regex:
    metavariable: $KEY
    regex: '(?i).*(username|password|pass|token|auth_token|api_key|secret|sessionid|email).*'


@@ -0,0 +1,21 @@
rules:
- id: sqlite-injection
severity: ERROR
languages: [java]
message: "Possible SQL injection: concatenated input in rawQuery or execSQL."
metadata:
authors:
- Guerric ELOI (FuzzingLabs)
owasp-mobile: M7
category: injection
area: database
verification-level: [L1]
paths:
include:
- "**/*.java"
patterns:
- pattern-either:
- pattern: "$DB.rawQuery($QUERY, ...)"
- pattern: "$DB.execSQL($QUERY)"
- metavariable-regex:
    metavariable: $QUERY
    regex: '.*".*".*\+.*'
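
The concatenation heuristic in this rule flags a query containing quoted string fragments joined to other input with `+`. Checked standalone in Python (the sample Java expressions are illustrative):

```python
import re

# Same shape the rule looks for in the query argument: quoted fragments plus a '+'
CONCAT_RE = re.compile(r'.*".*".*\+.*')

vulnerable = '"SELECT * FROM users WHERE name=\'" + userInput + "\'"'
safe = "db.rawQuery(query, selectionArgs)"
print(bool(CONCAT_RE.fullmatch(vulnerable)))  # string literals concatenated with +
print(bool(CONCAT_RE.fullmatch(safe)))        # no string literal being concatenated
```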


@@ -0,0 +1,16 @@
rules:
- id: vulnerable-activity
severity: WARNING
languages: [xml]
message: "Activity exported without permission."
metadata:
authors:
- Guerric ELOI (FuzzingLabs)
category: component
area: manifest
verification-level: [L1]
paths:
include:
- "**/AndroidManifest.xml"
pattern: |
<activity android:exported="true"


@@ -0,0 +1,16 @@
rules:
- id: vulnerable-content-provider
severity: WARNING
languages: [xml]
message: "ContentProvider exported without permission."
metadata:
authors:
- Guerric ELOI (FuzzingLabs)
category: component
area: manifest
verification-level: [L1]
paths:
include:
- "**/AndroidManifest.xml"
pattern: |
<provider android:exported="true"


@@ -0,0 +1,16 @@
rules:
- id: vulnerable-service
severity: WARNING
languages: [xml]
message: "Service exported without permission."
metadata:
authors:
- Guerric ELOI (FuzzingLabs)
category: component
area: manifest
verification-level: [L1]
paths:
include:
- "**/AndroidManifest.xml"
pattern: |
<service android:exported="true"


@@ -0,0 +1,16 @@
rules:
- id: webview-javascript-enabled
severity: ERROR
languages: [java]
message: "WebView with JavaScript enabled can be dangerous if loading untrusted content."
metadata:
authors:
- Guerric ELOI (FuzzingLabs)
owasp-mobile: M7
category: webview
area: ui
verification-level: [L1]
paths:
include:
- "**/*.java"
pattern: "$W.getSettings().setJavaScriptEnabled(true)"


@@ -0,0 +1,16 @@
rules:
- id: webview-load-arbitrary-url
severity: WARNING
languages: [java]
message: "Loading unvalidated URL in WebView may cause open redirect or XSS."
metadata:
authors:
- Guerric ELOI (FuzzingLabs)
owasp-mobile: M7
category: webview
area: ui
verification-level: [L1]
paths:
include:
- "**/*.java"
pattern: "$W.loadUrl($URL)"


@@ -0,0 +1,270 @@
"""
Jadx APK Decompilation Module
Decompiles Android APK files to Java source code using Jadx.
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import asyncio
import shutil
import logging
from pathlib import Path
from typing import Dict, Any
try:
from toolbox.modules.base import BaseModule, ModuleMetadata, ModuleResult
except ImportError:
try:
from modules.base import BaseModule, ModuleMetadata, ModuleResult
except ImportError:
from src.toolbox.modules.base import BaseModule, ModuleMetadata, ModuleResult
logger = logging.getLogger(__name__)
class JadxDecompiler(BaseModule):
"""Module for decompiling APK files to Java source code using Jadx"""
def get_metadata(self) -> ModuleMetadata:
return ModuleMetadata(
name="jadx_decompiler",
version="1.5.0",
description="Android APK decompilation using Jadx - converts DEX bytecode to Java source",
author="FuzzForge Team",
category="android",
tags=["android", "jadx", "decompilation", "reverse", "apk"],
input_schema={
"type": "object",
"properties": {
"apk_path": {
"type": "string",
"description": "Path to the APK to decompile (absolute or relative to workspace)",
},
"output_dir": {
"type": "string",
"description": "Directory (relative to workspace) where Jadx output should be written",
"default": "jadx_output",
},
"overwrite": {
"type": "boolean",
"description": "Overwrite existing output directory if present",
"default": True,
},
"threads": {
"type": "integer",
"description": "Number of Jadx decompilation threads",
"default": 4,
"minimum": 1,
"maximum": 32,
},
"decompiler_args": {
"type": "array",
"items": {"type": "string"},
"description": "Additional arguments passed directly to Jadx",
"default": [],
},
},
"required": ["apk_path"],
},
output_schema={
"type": "object",
"properties": {
"output_dir": {
"type": "string",
"description": "Path to decompiled output directory",
},
"source_dir": {
"type": "string",
"description": "Path to decompiled Java sources",
},
"resource_dir": {
"type": "string",
"description": "Path to extracted resources",
},
"java_files": {
"type": "integer",
"description": "Number of Java files decompiled",
},
},
},
requires_workspace=True,
)
def validate_config(self, config: Dict[str, Any]) -> bool:
"""Validate module configuration"""
apk_path = config.get("apk_path")
if not apk_path:
raise ValueError("'apk_path' must be provided for Jadx decompilation")
threads = config.get("threads", 4)
if not isinstance(threads, int) or threads < 1 or threads > 32:
raise ValueError("threads must be between 1 and 32")
return True
async def execute(self, config: Dict[str, Any], workspace: Path) -> ModuleResult:
"""
Execute Jadx decompilation on an APK file.
Args:
config: Configuration dict with apk_path, output_dir, etc.
workspace: Workspace directory path
Returns:
ModuleResult with decompilation summary and metadata
"""
self.start_timer()
try:
self.validate_config(config)
self.validate_workspace(workspace)
workspace = workspace.resolve()
# Resolve APK path
apk_path = Path(config["apk_path"])
if not apk_path.is_absolute():
apk_path = (workspace / apk_path).resolve()
if not apk_path.exists():
raise ValueError(f"APK not found: {apk_path}")
if apk_path.is_dir():
raise ValueError(f"APK path must be a file, not a directory: {apk_path}")
logger.info(f"Decompiling APK: {apk_path}")
# Resolve output directory
output_dir = Path(config.get("output_dir", "jadx_output"))
if not output_dir.is_absolute():
output_dir = (workspace / output_dir).resolve()
# Handle existing output directory
if output_dir.exists():
if config.get("overwrite", True):
logger.info(f"Removing existing output directory: {output_dir}")
shutil.rmtree(output_dir)
else:
raise ValueError(
f"Output directory already exists: {output_dir}. Set overwrite=true to replace it."
)
output_dir.mkdir(parents=True, exist_ok=True)
# Build Jadx command
threads = str(config.get("threads", 4))
extra_args = config.get("decompiler_args", []) or []
cmd = [
"jadx",
"--threads-count",
threads,
"--deobf", # Deobfuscate code
"--output-dir",
str(output_dir),
]
cmd.extend(extra_args)
cmd.append(str(apk_path))
logger.info(f"Running Jadx: {' '.join(cmd)}")
# Execute Jadx
process = await asyncio.create_subprocess_exec(
*cmd,
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE,
cwd=str(workspace),
)
stdout, stderr = await process.communicate()
stdout_str = stdout.decode(errors="ignore") if stdout else ""
stderr_str = stderr.decode(errors="ignore") if stderr else ""
if stdout_str:
logger.debug(f"Jadx stdout: {stdout_str[:200]}...")
if stderr_str:
logger.debug(f"Jadx stderr: {stderr_str[:200]}...")
if process.returncode != 0:
error_output = stderr_str or stdout_str or "No error output"
raise RuntimeError(
f"Jadx failed with exit code {process.returncode}: {error_output[:500]}"
)
# Verify output structure
source_dir = output_dir / "sources"
resource_dir = output_dir / "resources"
if not source_dir.exists():
logger.warning(
f"Jadx sources directory not found at expected path: {source_dir}"
)
# Use output_dir as fallback
source_dir = output_dir
# Count decompiled Java files
java_files = 0
if source_dir.exists():
java_files = sum(1 for _ in source_dir.rglob("*.java"))
logger.info(f"Decompiled {java_files} Java files")
# Log sample files for debugging
sample_files = []
for idx, file_path in enumerate(source_dir.rglob("*.java")):
sample_files.append(str(file_path.relative_to(workspace)))
if idx >= 4:
break
if sample_files:
logger.debug(f"Sample Java files: {sample_files}")
# Create summary
summary = {
"output_dir": str(output_dir),
"source_dir": str(source_dir if source_dir.exists() else output_dir),
"resource_dir": str(
resource_dir if resource_dir.exists() else output_dir
),
"java_files": java_files,
"apk_name": apk_path.name,
"apk_size_bytes": apk_path.stat().st_size,
}
metadata = {
"apk_path": str(apk_path),
"output_dir": str(output_dir),
"source_dir": summary["source_dir"],
"resource_dir": summary["resource_dir"],
"threads": threads,
"decompiler": "jadx",
"decompiler_version": "1.5.0",
}
logger.info(
f"✓ Jadx decompilation completed: {java_files} Java files generated"
)
return self.create_result(
findings=[], # Jadx doesn't generate findings, only decompiles
status="success",
summary=summary,
metadata=metadata,
)
except Exception as exc:
logger.error(f"Jadx decompilation failed: {exc}", exc_info=True)
return self.create_result(
findings=[],
status="failed",
error=str(exc),
metadata={"decompiler": "jadx", "apk_path": config.get("apk_path")},
)
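
JadxDecompiler, like the Bandit and Mypy modules above, drives an external tool through `asyncio.create_subprocess_exec` and collects its output with `communicate()`. The pattern in miniature, using the Python interpreter itself as a stand-in for the real binary:

```python
import asyncio
import sys

async def run_tool(*cmd: str) -> tuple[int, str, str]:
    """Run a command without blocking the event loop and capture its output."""
    process = await asyncio.create_subprocess_exec(
        *cmd,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    stdout, stderr = await process.communicate()
    return process.returncode, stdout.decode(errors="ignore"), stderr.decode(errors="ignore")

# sys.executable stands in for the jadx/bandit/mypy binary here
code, out, err = asyncio.run(run_tool(sys.executable, "-c", "print('decompiled')"))
print(code, out.strip())
```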


@@ -0,0 +1,437 @@
"""
MobSF Scanner Module
Mobile Security Framework (MobSF) integration for comprehensive Android app security analysis.
Performs static analysis on APK files including permissions, manifest analysis, code analysis, and behavior checks.
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import logging
import os
from collections import Counter
from pathlib import Path
from typing import Dict, Any, List
import aiohttp
try:
from toolbox.modules.base import BaseModule, ModuleMetadata, ModuleFinding, ModuleResult
except ImportError:
try:
from modules.base import BaseModule, ModuleMetadata, ModuleFinding, ModuleResult
except ImportError:
from src.toolbox.modules.base import BaseModule, ModuleMetadata, ModuleFinding, ModuleResult
logger = logging.getLogger(__name__)
class MobSFScanner(BaseModule):
"""Mobile Security Framework (MobSF) scanner module for Android applications"""
SEVERITY_MAP = {
"dangerous": "critical",
"high": "high",
"warning": "medium",
"medium": "medium",
"low": "low",
"info": "low",
"secure": "low",
}
def get_metadata(self) -> ModuleMetadata:
return ModuleMetadata(
name="mobsf_scanner",
version="3.9.7",
description="Comprehensive Android security analysis using Mobile Security Framework (MobSF)",
author="FuzzForge Team",
category="android",
tags=["mobile", "android", "mobsf", "sast", "scanner", "security"],
input_schema={
"type": "object",
"properties": {
"mobsf_url": {
"type": "string",
"description": "MobSF server URL",
"default": "http://localhost:8877",
},
"file_path": {
"type": "string",
"description": "Path to the APK file to scan (absolute or relative to workspace)",
},
"api_key": {
"type": "string",
"description": "MobSF API key (if not provided, will try MOBSF_API_KEY env var)",
"default": None,
},
"rescan": {
"type": "boolean",
"description": "Force rescan even if file was previously analyzed",
"default": False,
},
},
"required": ["file_path"],
},
output_schema={
"type": "object",
"properties": {
"findings": {
"type": "array",
"description": "Security findings from MobSF analysis"
},
"scan_hash": {"type": "string"},
"total_findings": {"type": "integer"},
"severity_counts": {"type": "object"},
}
},
requires_workspace=True,
)
def validate_config(self, config: Dict[str, Any]) -> bool:
"""Validate module configuration"""
if "mobsf_url" in config and not isinstance(config["mobsf_url"], str):
raise ValueError("mobsf_url must be a string")
file_path = config.get("file_path")
if not file_path:
raise ValueError("file_path is required for MobSF scanning")
return True
async def execute(self, config: Dict[str, Any], workspace: Path) -> ModuleResult:
"""
Execute MobSF security analysis on an APK file.
Args:
config: Configuration dict with file_path, mobsf_url, api_key
workspace: Workspace directory path
Returns:
ModuleResult with security findings from MobSF
"""
self.start_timer()
try:
self.validate_config(config)
self.validate_workspace(workspace)
# Get configuration
mobsf_url = config.get("mobsf_url", "http://localhost:8877")
file_path_str = config["file_path"]
rescan = config.get("rescan", False)
# Get API key from config or environment
api_key = config.get("api_key") or os.environ.get("MOBSF_API_KEY", "")
if not api_key:
logger.warning("No MobSF API key provided. Some functionality may be limited.")
# Resolve APK file path
file_path = Path(file_path_str)
if not file_path.is_absolute():
file_path = (workspace / file_path).resolve()
if not file_path.exists():
raise FileNotFoundError(f"APK file not found: {file_path}")
if not file_path.is_file():
raise ValueError(f"APK path must be a file: {file_path}")
logger.info(f"Starting MobSF scan of APK: {file_path}")
# Upload and scan APK
scan_hash = await self._upload_file(mobsf_url, file_path, api_key)
logger.info(f"APK uploaded to MobSF with hash: {scan_hash}")
# Start scan
await self._start_scan(mobsf_url, scan_hash, api_key, rescan=rescan)
logger.info(f"MobSF scan completed for hash: {scan_hash}")
# Get JSON results
scan_results = await self._get_json_results(mobsf_url, scan_hash, api_key)
# Parse results into findings
findings = self._parse_scan_results(scan_results, file_path)
# Create summary
summary = self._create_summary(findings, scan_hash)
logger.info(f"✓ MobSF scan completed: {len(findings)} findings")
return self.create_result(
findings=findings,
status="success",
summary=summary,
metadata={
"tool": "mobsf",
"tool_version": "3.9.7",
"scan_hash": scan_hash,
"apk_file": str(file_path),
"mobsf_url": mobsf_url,
}
)
except Exception as exc:
logger.error(f"MobSF scanner failed: {exc}", exc_info=True)
return self.create_result(
findings=[],
status="failed",
error=str(exc),
metadata={"tool": "mobsf", "file_path": config.get("file_path")}
)
async def _upload_file(self, mobsf_url: str, file_path: Path, api_key: str) -> str:
"""
Upload APK file to MobSF server.
Returns:
Scan hash for the uploaded file
"""
headers = {'X-Mobsf-Api-Key': api_key} if api_key else {}
# Create multipart form data
filename = file_path.name
async with aiohttp.ClientSession() as session:
with open(file_path, 'rb') as f:
data = aiohttp.FormData()
data.add_field('file',
f,
filename=filename,
content_type='application/vnd.android.package-archive')
async with session.post(
f"{mobsf_url}/api/v1/upload",
headers=headers,
data=data,
timeout=aiohttp.ClientTimeout(total=300)
) as response:
if response.status != 200:
error_text = await response.text()
raise Exception(f"Failed to upload file to MobSF: {error_text}")
result = await response.json()
scan_hash = result.get('hash')
if not scan_hash:
raise Exception(f"MobSF upload failed: {result}")
return scan_hash
async def _start_scan(self, mobsf_url: str, scan_hash: str, api_key: str, rescan: bool = False) -> Dict[str, Any]:
"""
Start MobSF scan for uploaded file.
Returns:
Scan result dictionary
"""
headers = {'X-Mobsf-Api-Key': api_key} if api_key else {}
data = {
'hash': scan_hash,
're_scan': '1' if rescan else '0'
}
async with aiohttp.ClientSession() as session:
async with session.post(
f"{mobsf_url}/api/v1/scan",
headers=headers,
data=data,
timeout=aiohttp.ClientTimeout(total=600) # 10 minutes for scan
) as response:
if response.status != 200:
error_text = await response.text()
raise Exception(f"MobSF scan failed: {error_text}")
result = await response.json()
return result
async def _get_json_results(self, mobsf_url: str, scan_hash: str, api_key: str) -> Dict[str, Any]:
"""
Retrieve JSON scan results from MobSF.
Returns:
Scan results dictionary
"""
headers = {'X-Mobsf-Api-Key': api_key} if api_key else {}
data = {'hash': scan_hash}
async with aiohttp.ClientSession() as session:
async with session.post(
f"{mobsf_url}/api/v1/report_json",
headers=headers,
data=data,
timeout=aiohttp.ClientTimeout(total=60)
) as response:
if response.status != 200:
error_text = await response.text()
raise Exception(f"Failed to retrieve MobSF results: {error_text}")
return await response.json()
def _parse_scan_results(self, scan_data: Dict[str, Any], apk_path: Path) -> List[ModuleFinding]:
"""Parse MobSF JSON results into standardized findings"""
findings = []
# Parse permissions
if 'permissions' in scan_data:
for perm_name, perm_attrs in scan_data['permissions'].items():
if isinstance(perm_attrs, dict):
severity = self.SEVERITY_MAP.get(
perm_attrs.get('status', '').lower(), 'low'
)
finding = self.create_finding(
title=f"Android Permission: {perm_name}",
description=perm_attrs.get('description', 'No description'),
severity=severity,
category="android-permission",
metadata={
'permission': perm_name,
'status': perm_attrs.get('status'),
'info': perm_attrs.get('info'),
'tool': 'mobsf',
}
)
findings.append(finding)
# Parse manifest analysis
if 'manifest_analysis' in scan_data:
manifest_findings = scan_data['manifest_analysis'].get('manifest_findings', [])
for item in manifest_findings:
if isinstance(item, dict):
severity = self.SEVERITY_MAP.get(item.get('severity', '').lower(), 'medium')
finding = self.create_finding(
title=item.get('title') or item.get('name') or "Manifest Issue",
description=item.get('description', 'No description'),
severity=severity,
category="android-manifest",
metadata={
'rule': item.get('rule'),
'tool': 'mobsf',
}
)
findings.append(finding)
# Parse code analysis
if 'code_analysis' in scan_data:
code_findings = scan_data['code_analysis'].get('findings', {})
for finding_name, finding_data in code_findings.items():
if isinstance(finding_data, dict):
metadata_dict = finding_data.get('metadata', {})
severity = self.SEVERITY_MAP.get(
metadata_dict.get('severity', '').lower(), 'medium'
)
# MobSF returns 'files' as a dict: {filename: line_numbers}
files_dict = finding_data.get('files', {})
# Create a finding for each affected file
if isinstance(files_dict, dict) and files_dict:
for file_path, line_numbers in files_dict.items():
finding = self.create_finding(
title=finding_name,
description=metadata_dict.get('description', 'No description'),
severity=severity,
category="android-code-analysis",
file_path=file_path,
line_number=line_numbers, # Can be string like "28" or "65,81"
metadata={
'cwe': metadata_dict.get('cwe'),
'owasp': metadata_dict.get('owasp'),
'masvs': metadata_dict.get('masvs'),
'cvss': metadata_dict.get('cvss'),
'ref': metadata_dict.get('ref'),
'line_numbers': line_numbers,
'tool': 'mobsf',
}
)
findings.append(finding)
else:
# Fallback: create one finding without file info
finding = self.create_finding(
title=finding_name,
description=metadata_dict.get('description', 'No description'),
severity=severity,
category="android-code-analysis",
metadata={
'cwe': metadata_dict.get('cwe'),
'owasp': metadata_dict.get('owasp'),
'masvs': metadata_dict.get('masvs'),
'cvss': metadata_dict.get('cvss'),
'ref': metadata_dict.get('ref'),
'tool': 'mobsf',
}
)
findings.append(finding)
# Parse behavior analysis
if 'behaviour' in scan_data:
for key, value in scan_data['behaviour'].items():
if isinstance(value, dict):
metadata_dict = value.get('metadata', {})
labels = metadata_dict.get('label', [])
label = labels[0] if labels else 'Unknown Behavior'
severity = self.SEVERITY_MAP.get(
metadata_dict.get('severity', '').lower(), 'medium'
)
# MobSF returns 'files' as a dict: {filename: line_numbers}
files_dict = value.get('files', {})
# Create a finding for each affected file
if isinstance(files_dict, dict) and files_dict:
for file_path, line_numbers in files_dict.items():
finding = self.create_finding(
title=f"Behavior: {label}",
description=metadata_dict.get('description', 'No description'),
severity=severity,
category="android-behavior",
file_path=file_path,
line_number=line_numbers,
metadata={
'line_numbers': line_numbers,
'behavior_key': key,
'tool': 'mobsf',
}
)
findings.append(finding)
else:
# Fallback: create one finding without file info
finding = self.create_finding(
title=f"Behavior: {label}",
description=metadata_dict.get('description', 'No description'),
severity=severity,
category="android-behavior",
metadata={
'behavior_key': key,
'tool': 'mobsf',
}
)
findings.append(finding)
logger.debug(f"Parsed {len(findings)} findings from MobSF results")
return findings
def _create_summary(self, findings: List[ModuleFinding], scan_hash: str) -> Dict[str, Any]:
"""Create analysis summary"""
severity_counter = Counter()
category_counter = Counter()
for finding in findings:
severity_counter[finding.severity] += 1
category_counter[finding.category] += 1
return {
"scan_hash": scan_hash,
"total_findings": len(findings),
"severity_counts": dict(severity_counter),
"category_counts": dict(category_counter),
}
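The commit this file belongs to fixes exactly the pitfall noted in the parser comments: MobSF reports `files` as a dict mapping filename to a line-number string (e.g. `"65,81"`), not a list, so iterating it like a list yields zero usable findings. A minimal stand-alone sketch of the per-file expansion, using hypothetical sample data (the rule name and paths are illustrative, not real MobSF output):

```python
# Hypothetical sample mirroring the shape MobSF returns for code_analysis.
code_findings = {
    "android_insecure_random": {
        "metadata": {"severity": "warning", "cwe": "CWE-330"},
        # 'files' is a dict of {filename: line_numbers}, NOT a list --
        # line_numbers is a string like "28" or "65,81".
        "files": {
            "com/example/App.java": "28",
            "com/example/Util.java": "65,81",
        },
    },
}

def expand_findings(code_findings):
    """Yield one (rule, file_path, line_numbers) tuple per affected file."""
    for rule, data in code_findings.items():
        files = data.get("files", {})
        if isinstance(files, dict) and files:
            for path, lines in files.items():
                yield rule, path, lines
        else:
            # Fallback: one finding with no file info
            yield rule, None, None

rows = list(expand_findings(code_findings))
```

This mirrors the fan-out in `_parse_scan_results`: one finding per affected file, with the raw line-number string preserved for the metadata.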


@@ -0,0 +1,440 @@
"""
OpenGrep Android Static Analysis Module
Pattern-based static analysis for Android applications using OpenGrep/Semgrep
with Android-specific security rules.
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import asyncio
import json
import logging
from pathlib import Path
from typing import Dict, Any, List
try:
from toolbox.modules.base import BaseModule, ModuleMetadata, ModuleFinding, ModuleResult
except ImportError:
try:
from modules.base import BaseModule, ModuleMetadata, ModuleFinding, ModuleResult
except ImportError:
from src.toolbox.modules.base import BaseModule, ModuleMetadata, ModuleFinding, ModuleResult
logger = logging.getLogger(__name__)
class OpenGrepAndroid(BaseModule):
"""OpenGrep static analysis module specialized for Android security"""
def get_metadata(self) -> ModuleMetadata:
"""Get module metadata"""
return ModuleMetadata(
name="opengrep_android",
version="1.45.0",
description="Android-focused static analysis using OpenGrep/Semgrep with custom security rules for Java/Kotlin",
author="FuzzForge Team",
category="android",
tags=["sast", "android", "opengrep", "semgrep", "java", "kotlin", "security"],
input_schema={
"type": "object",
"properties": {
"config": {
"type": "string",
"enum": ["auto", "p/security-audit", "p/owasp-top-ten", "p/cwe-top-25"],
"default": "auto",
"description": "Rule configuration to use"
},
"custom_rules_path": {
"type": "string",
"description": "Path to a directory containing custom OpenGrep rules (Android-specific rules recommended)",
"default": None,
},
"languages": {
"type": "array",
"items": {"type": "string"},
"description": "Specific languages to analyze (defaults to java, kotlin for Android)",
"default": ["java", "kotlin"],
},
"include_patterns": {
"type": "array",
"items": {"type": "string"},
"description": "File patterns to include",
"default": [],
},
"exclude_patterns": {
"type": "array",
"items": {"type": "string"},
"description": "File patterns to exclude",
"default": [],
},
"max_target_bytes": {
"type": "integer",
"default": 1000000,
"description": "Maximum file size to analyze (bytes)"
},
"timeout": {
"type": "integer",
"default": 300,
"description": "Analysis timeout in seconds"
},
"severity": {
"type": "array",
"items": {"type": "string", "enum": ["ERROR", "WARNING", "INFO"]},
"default": ["ERROR", "WARNING", "INFO"],
"description": "Minimum severity levels to report"
},
"confidence": {
"type": "array",
"items": {"type": "string", "enum": ["HIGH", "MEDIUM", "LOW"]},
"default": ["HIGH", "MEDIUM", "LOW"],
"description": "Minimum confidence levels to report"
}
}
},
output_schema={
"type": "object",
"properties": {
"findings": {
"type": "array",
"description": "Security findings from OpenGrep analysis"
},
"total_findings": {"type": "integer"},
"severity_counts": {"type": "object"},
"files_analyzed": {"type": "integer"},
}
},
requires_workspace=True,
)
def validate_config(self, config: Dict[str, Any]) -> bool:
"""Validate configuration"""
timeout = config.get("timeout", 300)
if not isinstance(timeout, int) or timeout < 30 or timeout > 3600:
raise ValueError("Timeout must be between 30 and 3600 seconds")
max_bytes = config.get("max_target_bytes", 1000000)
if not isinstance(max_bytes, int) or max_bytes < 1000 or max_bytes > 10000000:
raise ValueError("max_target_bytes must be between 1000 and 10000000")
custom_rules_path = config.get("custom_rules_path")
if custom_rules_path:
rules_path = Path(custom_rules_path)
if not rules_path.exists():
logger.warning(f"Custom rules path does not exist: {custom_rules_path}")
return True
async def execute(self, config: Dict[str, Any], workspace: Path) -> ModuleResult:
"""Execute OpenGrep static analysis on Android code"""
self.start_timer()
try:
# Validate inputs
self.validate_config(config)
self.validate_workspace(workspace)
logger.info(f"Running OpenGrep Android analysis on {workspace}")
# Build opengrep command
cmd = ["opengrep", "scan", "--json"]
# Add configuration
custom_rules_path = config.get("custom_rules_path")
use_custom_rules = False
if custom_rules_path and Path(custom_rules_path).exists():
cmd.extend(["--config", custom_rules_path])
use_custom_rules = True
logger.info(f"Using custom Android rules from: {custom_rules_path}")
else:
config_type = config.get("config", "auto")
if config_type == "auto":
cmd.extend(["--config", "auto"])
else:
cmd.extend(["--config", config_type])
# Add timeout
cmd.extend(["--timeout", str(config.get("timeout", 300))])
# Add max target bytes
cmd.extend(["--max-target-bytes", str(config.get("max_target_bytes", 1000000))])
# Add languages if specified (but NOT when using custom rules)
languages = config.get("languages", ["java", "kotlin"])
if languages and not use_custom_rules:
langs = ",".join(languages)
cmd.extend(["--lang", langs])
logger.debug(f"Analyzing languages: {langs}")
# Add include patterns
include_patterns = config.get("include_patterns", [])
for pattern in include_patterns:
cmd.extend(["--include", pattern])
# Add exclude patterns
exclude_patterns = config.get("exclude_patterns", [])
for pattern in exclude_patterns:
cmd.extend(["--exclude", pattern])
# Add severity filter if single level requested
severity_levels = config.get("severity", ["ERROR", "WARNING", "INFO"])
if severity_levels and len(severity_levels) == 1:
cmd.extend(["--severity", severity_levels[0]])
# Disable the version check and include files ignored by git
cmd.append("--disable-version-check")
cmd.append("--no-git-ignore")
# Add target directory
cmd.append(str(workspace))
logger.debug(f"Running command: {' '.join(cmd)}")
# Run OpenGrep
process = await asyncio.create_subprocess_exec(
*cmd,
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE,
cwd=workspace
)
stdout, stderr = await process.communicate()
# Parse results
findings = []
if process.returncode in [0, 1]: # 0 = no findings, 1 = findings found
findings = self._parse_opengrep_output(stdout.decode(), workspace, config)
logger.info(f"OpenGrep found {len(findings)} potential security issues")
else:
error_msg = stderr.decode()
logger.error(f"OpenGrep failed: {error_msg}")
return self.create_result(
findings=[],
status="failed",
error=f"OpenGrep execution failed (exit code {process.returncode}): {error_msg[:500]}"
)
# Create summary
summary = self._create_summary(findings)
return self.create_result(
findings=findings,
status="success",
summary=summary,
metadata={
"tool": "opengrep",
"tool_version": "1.45.0",
"languages": languages,
"custom_rules": bool(custom_rules_path),
}
)
except Exception as e:
logger.error(f"OpenGrep Android module failed: {e}", exc_info=True)
return self.create_result(
findings=[],
status="failed",
error=str(e)
)
def _parse_opengrep_output(self, output: str, workspace: Path, config: Dict[str, Any]) -> List[ModuleFinding]:
"""Parse OpenGrep JSON output into findings"""
findings = []
if not output.strip():
return findings
try:
data = json.loads(output)
results = data.get("results", [])
logger.debug(f"OpenGrep returned {len(results)} raw results")
# Get filtering criteria
allowed_severities = set(config.get("severity", ["ERROR", "WARNING", "INFO"]))
allowed_confidences = set(config.get("confidence", ["HIGH", "MEDIUM", "LOW"]))
for result in results:
# Extract basic info
rule_id = result.get("check_id", "unknown")
message = result.get("message", "")
extra = result.get("extra", {})
severity = extra.get("severity", "INFO").upper()
# File location info
path_info = result.get("path", "")
start_line = result.get("start", {}).get("line", 0)
end_line = result.get("end", {}).get("line", 0)
# Code snippet
lines = extra.get("lines", "")
# Metadata
rule_metadata = extra.get("metadata", {})
cwe = rule_metadata.get("cwe", [])
owasp = rule_metadata.get("owasp", [])
confidence = extra.get("confidence", rule_metadata.get("confidence", "MEDIUM")).upper()
# Apply severity filter
if severity not in allowed_severities:
continue
# Apply confidence filter
if confidence not in allowed_confidences:
continue
# Make file path relative to workspace
if path_info:
try:
rel_path = Path(path_info).relative_to(workspace)
path_info = str(rel_path)
except ValueError:
pass
# Map severity to our standard levels
finding_severity = self._map_severity(severity)
# Create finding
finding = self.create_finding(
title=f"Android Security: {rule_id}",
description=message or f"OpenGrep rule {rule_id} triggered",
severity=finding_severity,
category=self._get_category(rule_id, extra),
file_path=path_info if path_info else None,
line_start=start_line if start_line > 0 else None,
line_end=end_line if end_line > 0 and end_line != start_line else None,
code_snippet=lines.strip() if lines else None,
recommendation=self._get_recommendation(rule_id, extra),
metadata={
"rule_id": rule_id,
"opengrep_severity": severity,
"confidence": confidence,
"cwe": cwe,
"owasp": owasp,
"fix": extra.get("fix", ""),
"impact": extra.get("impact", ""),
"likelihood": extra.get("likelihood", ""),
"references": extra.get("references", []),
"tool": "opengrep",
}
)
findings.append(finding)
except json.JSONDecodeError as e:
logger.warning(f"Failed to parse OpenGrep output: {e}. Output snippet: {output[:200]}...")
except Exception as e:
logger.warning(f"Error processing OpenGrep results: {e}", exc_info=True)
return findings
def _map_severity(self, opengrep_severity: str) -> str:
"""Map OpenGrep severity to our standard severity levels"""
severity_map = {
"ERROR": "high",
"WARNING": "medium",
"INFO": "low"
}
return severity_map.get(opengrep_severity.upper(), "medium")
def _get_category(self, rule_id: str, extra: Dict[str, Any]) -> str:
"""Determine finding category based on rule and metadata"""
rule_metadata = extra.get("metadata", {})
cwe_list = rule_metadata.get("cwe", [])
owasp_list = rule_metadata.get("owasp", [])
rule_lower = rule_id.lower()
# Android-specific categories
if "injection" in rule_lower or "sql" in rule_lower:
return "injection"
elif "intent" in rule_lower:
return "android-intent"
elif "webview" in rule_lower:
return "android-webview"
elif "deeplink" in rule_lower:
return "android-deeplink"
elif "storage" in rule_lower or "sharedpreferences" in rule_lower:
return "android-storage"
elif "logging" in rule_lower or "log" in rule_lower:
return "android-logging"
elif "clipboard" in rule_lower:
return "android-clipboard"
elif "activity" in rule_lower or "service" in rule_lower or "provider" in rule_lower:
return "android-component"
elif "crypto" in rule_lower or "encrypt" in rule_lower:
return "cryptography"
elif "hardcode" in rule_lower or "secret" in rule_lower:
return "secrets"
elif "auth" in rule_lower:
return "authentication"
elif cwe_list:
return f"cwe-{cwe_list[0]}"
elif owasp_list:
return f"owasp-{owasp_list[0].replace(' ', '-').lower()}"
else:
return "android-security"
def _get_recommendation(self, rule_id: str, extra: Dict[str, Any]) -> str:
"""Generate recommendation based on rule and metadata"""
fix_suggestion = extra.get("fix", "")
if fix_suggestion:
return fix_suggestion
rule_lower = rule_id.lower()
# Android-specific recommendations
if "injection" in rule_lower or "sql" in rule_lower:
return "Use parameterized queries or Room database with type-safe queries to prevent SQL injection."
elif "intent" in rule_lower:
return "Validate all incoming Intent data and use explicit Intents when possible to prevent Intent manipulation attacks."
elif "webview" in rule_lower and "javascript" in rule_lower:
return "Disable JavaScript in WebView if not needed, or implement proper JavaScript interfaces with @JavascriptInterface annotation."
elif "deeplink" in rule_lower:
return "Validate all deeplink URLs and sanitize user input to prevent deeplink hijacking attacks."
elif "storage" in rule_lower or "sharedpreferences" in rule_lower:
return "Encrypt sensitive data before storing in SharedPreferences or use EncryptedSharedPreferences for Android API 23+."
elif "logging" in rule_lower:
return "Remove sensitive data from logs in production builds. Use ProGuard/R8 to strip logging statements."
elif "clipboard" in rule_lower:
return "Avoid placing sensitive data on the clipboard. If necessary, clear clipboard data when no longer needed."
elif "crypto" in rule_lower:
return "Use modern cryptographic algorithms (AES-GCM, RSA-OAEP) and Android Keystore for key management."
elif "hardcode" in rule_lower or "secret" in rule_lower:
return "Remove hardcoded secrets. Use Android Keystore, environment variables, or secure configuration management."
else:
return "Review this Android security issue and apply appropriate fixes based on Android security best practices."
def _create_summary(self, findings: List[ModuleFinding]) -> Dict[str, Any]:
"""Create analysis summary"""
severity_counts = {"critical": 0, "high": 0, "medium": 0, "low": 0}
category_counts = {}
rule_counts = {}
for finding in findings:
# Count by severity
severity_counts[finding.severity] += 1
# Count by category
category = finding.category
category_counts[category] = category_counts.get(category, 0) + 1
# Count by rule
rule_id = finding.metadata.get("rule_id", "unknown")
rule_counts[rule_id] = rule_counts.get(rule_id, 0) + 1
return {
"total_findings": len(findings),
"severity_counts": severity_counts,
"category_counts": category_counts,
"top_rules": dict(sorted(rule_counts.items(), key=lambda x: x[1], reverse=True)[:10]),
"files_analyzed": len(set(f.file_path for f in findings if f.file_path))
}
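The severity mapping and filtering in `_parse_opengrep_output` can be sketched in isolation. The result shape below is a trimmed, hypothetical Semgrep-style record (real `opengrep scan --json` output carries many more fields); only `check_id` and `extra` are taken from the parser above:

```python
SEVERITY_MAP = {"ERROR": "high", "WARNING": "medium", "INFO": "low"}

def map_severity(opengrep_severity):
    """Unknown levels fall back to 'medium', matching _map_severity above."""
    return SEVERITY_MAP.get(opengrep_severity.upper(), "medium")

# Trimmed, hypothetical results in the style of `opengrep scan --json`
results = [
    {"check_id": "android.webview-js-enabled",
     "extra": {"severity": "ERROR", "confidence": "HIGH"}},
    {"check_id": "android.verbose-logging",
     "extra": {"severity": "INFO", "confidence": "LOW"}},
]

allowed_severities = {"ERROR", "WARNING"}
allowed_confidences = {"HIGH", "MEDIUM"}

kept = [
    r for r in results
    if r["extra"]["severity"] in allowed_severities
    and r["extra"]["confidence"] in allowed_confidences
]
severities = [map_severity(r["extra"]["severity"]) for r in kept]
```

As in the module, filtering happens on OpenGrep's own levels (ERROR/WARNING/INFO) before mapping to the standardized low/medium/high scale used in findings.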


@@ -10,5 +10,6 @@
 # Additional attribution and requirements are provided in the NOTICE file.
 from .file_scanner import FileScanner
+from .dependency_scanner import DependencyScanner
-__all__ = ["FileScanner"]
+__all__ = ["FileScanner", "DependencyScanner"]


@@ -0,0 +1,302 @@
"""
Dependency Scanner Module - Scans Python dependencies for known vulnerabilities using pip-audit
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import asyncio
import json
import logging
import time
from pathlib import Path
from typing import Dict, Any, List
try:
from toolbox.modules.base import BaseModule, ModuleMetadata, ModuleResult, ModuleFinding
except ImportError:
try:
from modules.base import BaseModule, ModuleMetadata, ModuleResult, ModuleFinding
except ImportError:
from src.toolbox.modules.base import BaseModule, ModuleMetadata, ModuleResult, ModuleFinding
logger = logging.getLogger(__name__)
class DependencyScanner(BaseModule):
"""
Scans Python dependencies for known vulnerabilities using pip-audit.
This module:
- Discovers dependency files (requirements.txt, pyproject.toml, setup.py, Pipfile)
- Runs pip-audit to check for vulnerable dependencies
- Reports CVEs with severity and affected versions
"""
def get_metadata(self) -> ModuleMetadata:
"""Get module metadata"""
return ModuleMetadata(
name="dependency_scanner",
version="1.0.0",
description="Scans Python dependencies for known vulnerabilities",
author="FuzzForge Team",
category="scanner",
tags=["dependencies", "cve", "vulnerabilities", "pip-audit"],
input_schema={
"dependency_files": {
"type": "array",
"items": {"type": "string"},
"description": "List of dependency files to scan (auto-discovered if empty)",
"default": []
},
"ignore_vulns": {
"type": "array",
"items": {"type": "string"},
"description": "List of vulnerability IDs to ignore",
"default": []
}
},
output_schema={
"findings": {
"type": "array",
"description": "List of vulnerable dependencies with CVE information"
}
},
requires_workspace=True
)
def validate_config(self, config: Dict[str, Any]) -> bool:
"""Validate module configuration"""
dep_files = config.get("dependency_files", [])
if not isinstance(dep_files, list):
raise ValueError("dependency_files must be a list")
ignore_vulns = config.get("ignore_vulns", [])
if not isinstance(ignore_vulns, list):
raise ValueError("ignore_vulns must be a list")
return True
def _discover_dependency_files(self, workspace: Path) -> List[Path]:
"""
Discover Python dependency files in workspace.
Returns:
List of discovered dependency file paths
"""
dependency_patterns = [
"requirements.txt",
"*requirements*.txt",
"pyproject.toml",
"setup.py",
"Pipfile",
"poetry.lock"
]
found_files = []
for pattern in dependency_patterns:
found_files.extend(workspace.rglob(pattern))
# Deduplicate and return
unique_files = list(set(found_files))
logger.info(f"Discovered {len(unique_files)} dependency files")
return unique_files
async def _run_pip_audit(self, file_path: Path) -> Dict[str, Any]:
"""
Run pip-audit on a specific dependency file.
Args:
file_path: Path to dependency file
Returns:
pip-audit JSON output as dict
"""
try:
# Run pip-audit with JSON output
cmd = [
"pip-audit",
"--requirement", str(file_path),
"--format", "json",
"--progress-spinner", "off"
]
logger.info(f"Running pip-audit on: {file_path.name}")
process = await asyncio.create_subprocess_exec(
*cmd,
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE
)
stdout, stderr = await process.communicate()
# pip-audit returns 0 if no vulns, 1 if vulns found
if process.returncode not in [0, 1]:
logger.error(f"pip-audit failed: {stderr.decode()}")
return {"dependencies": []}
# Parse JSON output
result = json.loads(stdout.decode())
return result
except Exception as e:
logger.error(f"Error running pip-audit on {file_path}: {e}")
return {"dependencies": []}
def _convert_to_findings(
self,
audit_result: Dict[str, Any],
file_path: Path,
workspace: Path,
ignore_vulns: List[str]
) -> List[ModuleFinding]:
"""
Convert pip-audit results to ModuleFindings.
Args:
audit_result: pip-audit JSON output
file_path: Path to scanned file
workspace: Workspace path for relative path calculation
ignore_vulns: List of vulnerability IDs to ignore
Returns:
List of ModuleFindings
"""
findings = []
# pip-audit format: {"dependencies": [{package, version, vulns: []}]}
for dep in audit_result.get("dependencies", []):
package_name = dep.get("name", "unknown")
package_version = dep.get("version", "unknown")
vulnerabilities = dep.get("vulns", [])
for vuln in vulnerabilities:
vuln_id = vuln.get("id", "UNKNOWN")
# Skip if in ignore list
if vuln_id in ignore_vulns:
logger.debug(f"Ignoring vulnerability: {vuln_id}")
continue
description = vuln.get("description", "No description available")
fix_versions = vuln.get("fix_versions", [])
# pip-audit does not reliably provide CVSS scores, so default to medium
severity = "medium"
# Try to get relative path
try:
rel_path = file_path.relative_to(workspace)
except ValueError:
rel_path = file_path
recommendation = f"Upgrade {package_name} to a fixed version: {', '.join(fix_versions)}" if fix_versions else f"Check for updates to {package_name}"
finding = self.create_finding(
title=f"Vulnerable dependency: {package_name} ({vuln_id})",
description=f"{description}\n\nAffected package: {package_name} {package_version}",
severity=severity,
category="vulnerable-dependency",
file_path=str(rel_path),
recommendation=recommendation,
metadata={
"cve_id": vuln_id,
"package": package_name,
"installed_version": package_version,
"fix_versions": fix_versions,
"aliases": vuln.get("aliases", []),
"link": vuln.get("link", "")
}
)
findings.append(finding)
return findings
async def execute(self, config: Dict[str, Any], workspace: Path) -> ModuleResult:
"""
Execute the dependency scanning module.
Args:
config: Module configuration
workspace: Path to workspace
Returns:
ModuleResult with vulnerability findings
"""
start_time = time.time()
metadata = self.get_metadata()
# Validate inputs
self.validate_config(config)
self.validate_workspace(workspace)
# Get configuration
specified_files = config.get("dependency_files", [])
ignore_vulns = config.get("ignore_vulns", [])
# Discover or use specified dependency files
if specified_files:
dep_files = [workspace / f for f in specified_files]
else:
dep_files = self._discover_dependency_files(workspace)
if not dep_files:
logger.warning("No dependency files found in workspace")
return ModuleResult(
module=metadata.name,
version=metadata.version,
status="success",
execution_time=time.time() - start_time,
findings=[],
summary={
"total_files": 0,
"total_vulnerabilities": 0,
"vulnerable_packages": 0
}
)
# Scan each dependency file
all_findings = []
files_scanned = 0
for dep_file in dep_files:
if not dep_file.exists():
logger.warning(f"Dependency file not found: {dep_file}")
continue
logger.info(f"Scanning dependencies in: {dep_file.name}")
audit_result = await self._run_pip_audit(dep_file)
findings = self._convert_to_findings(audit_result, dep_file, workspace, ignore_vulns)
all_findings.extend(findings)
files_scanned += 1
# Calculate summary
unique_packages = len(set(f.metadata.get("package") for f in all_findings))
execution_time = time.time() - start_time
return ModuleResult(
module=metadata.name,
version=metadata.version,
status="success",
execution_time=execution_time,
findings=all_findings,
summary={
"total_files": files_scanned,
"total_vulnerabilities": len(all_findings),
"vulnerable_packages": unique_packages
},
metadata={
"scanned_files": [str(f.name) for f in dep_files if f.exists()]
}
)
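`_convert_to_findings` consumes pip-audit's JSON shape (`{"dependencies": [{name, version, vulns: [...]}]}`). A minimal stand-alone sketch of that traversal, including the ignore list and the fix-version recommendation; the package names and vulnerability IDs are made up for illustration:

```python
# Hypothetical pip-audit output (IDs and versions are illustrative)
audit_result = {
    "dependencies": [
        {"name": "examplepkg", "version": "1.0.0",
         "vulns": [
             {"id": "PYSEC-XXXX-1", "fix_versions": ["1.0.1"],
              "description": "Example issue"},
             {"id": "PYSEC-XXXX-2", "fix_versions": [],
              "description": "Another example issue"},
         ]},
        {"name": "cleanpkg", "version": "2.0.0", "vulns": []},
    ]
}

def summarize(audit_result, ignore_vulns=()):
    findings = []
    for dep in audit_result.get("dependencies", []):
        for vuln in dep.get("vulns", []):
            if vuln.get("id") in ignore_vulns:
                continue  # honour the ignore list, as in _convert_to_findings
            fixes = vuln.get("fix_versions", [])
            findings.append({
                "package": dep["name"],
                "id": vuln["id"],
                "recommendation": (
                    f"Upgrade {dep['name']} to: {', '.join(fixes)}"
                    if fixes else f"Check for updates to {dep['name']}"
                ),
            })
    return findings

findings = summarize(audit_result, ignore_vulns={"PYSEC-XXXX-2"})
```

Packages with an empty `vulns` list contribute nothing, and an entry without `fix_versions` degrades to a generic "check for updates" recommendation, matching the module's behavior.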


@@ -107,7 +107,8 @@ class LLMSecretDetectorModule(BaseModule):
         )
         agent_url = config.get("agent_url")
-        if not agent_url or not isinstance(agent_url, str):
+        # agent_url is optional - will have default from metadata.yaml
+        if agent_url is not None and not isinstance(agent_url, str):
             raise ValueError("agent_url must be a valid URL string")
         max_files = config.get("max_files", 20)
@@ -131,14 +132,14 @@ class LLMSecretDetectorModule(BaseModule):
         logger.info(f"Starting LLM secret detection in workspace: {workspace}")
-        # Extract configuration
-        agent_url = config.get("agent_url", "http://fuzzforge-task-agent:8000/a2a/litellm_agent")
-        llm_model = config.get("llm_model", "gpt-4o-mini")
-        llm_provider = config.get("llm_provider", "openai")
-        file_patterns = config.get("file_patterns", ["*.py", "*.js", "*.ts", "*.java", "*.go", "*.env", "*.yaml", "*.yml", "*.json", "*.xml", "*.ini", "*.sql", "*.properties", "*.sh", "*.bat", "*.config", "*.conf", "*.toml", "*id_rsa*", "*.txt"])
-        max_files = config.get("max_files", 20)
-        max_file_size = config.get("max_file_size", 30000)
-        timeout = config.get("timeout", 30)  # Reduced from 45s
+        # Extract configuration (defaults come from metadata.yaml via API)
+        agent_url = config["agent_url"]
+        llm_model = config["llm_model"]
+        llm_provider = config["llm_provider"]
+        file_patterns = config["file_patterns"]
+        max_files = config["max_files"]
+        max_file_size = config["max_file_size"]
+        timeout = config["timeout"]
         # Find files to analyze
         # Skip files that are unlikely to contain secrets
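This hunk replaces per-call `config.get(key, default)` with direct indexing: the intent, per the new comment, is that defaults are merged in upstream from metadata.yaml, so a missing key fails fast instead of silently drifting from the documented default. A sketch of that merge pattern (names and values are illustrative, not the actual metadata.yaml contents):

```python
# Defaults as they might be declared in metadata.yaml (illustrative values)
metadata_defaults = {
    "llm_model": "gpt-4o-mini",
    "llm_provider": "openai",
    "timeout": 30,
}

def resolve_config(user_config, defaults):
    """User-supplied keys win; everything else falls back to the defaults."""
    return {**defaults, **user_config}

config = resolve_config({"timeout": 60}, metadata_defaults)

# Direct indexing is now safe for every documented key, and a typo'd or
# undocumented key raises KeyError instead of returning a stale default.
timeout = config["timeout"]
model = config["llm_model"]
```

The design trade-off: `.get(key, default)` duplicates the default in two places (code and metadata.yaml), which can diverge; single-source defaults plus hard indexing keeps them consistent.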


@@ -0,0 +1,35 @@
"""
Android Static Analysis Workflow
Comprehensive Android application security testing combining:
- Jadx APK decompilation
- OpenGrep/Semgrep static analysis with Android-specific rules
- MobSF mobile security framework analysis
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
from .workflow import AndroidStaticAnalysisWorkflow
from .activities import (
decompile_with_jadx_activity,
scan_with_opengrep_activity,
scan_with_mobsf_activity,
generate_android_sarif_activity,
)
__all__ = [
"AndroidStaticAnalysisWorkflow",
"decompile_with_jadx_activity",
"scan_with_opengrep_activity",
"scan_with_mobsf_activity",
"generate_android_sarif_activity",
]


@@ -0,0 +1,213 @@
"""
Android Static Analysis Workflow Activities
Activities for the Android security testing workflow:
- decompile_with_jadx_activity: Decompile APK using Jadx
- scan_with_opengrep_activity: Analyze code with OpenGrep/Semgrep
- scan_with_mobsf_activity: Scan APK with MobSF
- generate_android_sarif_activity: Generate combined SARIF report
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import logging
import sys
from pathlib import Path
from temporalio import activity
# Configure logging
logger = logging.getLogger(__name__)
# Add toolbox to path for module imports
sys.path.insert(0, '/app/toolbox')
@activity.defn(name="decompile_with_jadx")
async def decompile_with_jadx_activity(workspace_path: str, config: dict) -> dict:
"""
Decompile Android APK to Java source code using Jadx.
Args:
workspace_path: Path to the workspace directory
config: JadxDecompiler configuration
Returns:
Decompilation results dictionary
"""
logger.info(f"Activity: decompile_with_jadx (workspace={workspace_path})")
try:
from modules.android import JadxDecompiler
workspace = Path(workspace_path)
if not workspace.exists():
raise FileNotFoundError(f"Workspace not found: {workspace_path}")
decompiler = JadxDecompiler()
result = await decompiler.execute(config, workspace)
logger.info(
f"✓ Jadx decompilation completed: "
f"{result.summary.get('java_files', 0)} Java files generated"
)
return result.dict()
except Exception as e:
logger.error(f"Jadx decompilation failed: {e}", exc_info=True)
raise
@activity.defn(name="scan_with_opengrep")
async def scan_with_opengrep_activity(workspace_path: str, config: dict) -> dict:
"""
Analyze Android code for security issues using OpenGrep/Semgrep.
Args:
workspace_path: Path to the workspace directory
config: OpenGrepAndroid configuration
Returns:
Analysis results dictionary
"""
logger.info(f"Activity: scan_with_opengrep (workspace={workspace_path})")
try:
from modules.android import OpenGrepAndroid
workspace = Path(workspace_path)
if not workspace.exists():
raise FileNotFoundError(f"Workspace not found: {workspace_path}")
analyzer = OpenGrepAndroid()
result = await analyzer.execute(config, workspace)
logger.info(
f"✓ OpenGrep analysis completed: "
f"{result.summary.get('total_findings', 0)} security issues found"
)
return result.dict()
except Exception as e:
logger.error(f"OpenGrep analysis failed: {e}", exc_info=True)
raise
@activity.defn(name="scan_with_mobsf")
async def scan_with_mobsf_activity(workspace_path: str, config: dict) -> dict:
"""
Analyze Android APK for security issues using MobSF.
Args:
workspace_path: Path to the workspace directory
config: MobSFScanner configuration
Returns:
Scan results dictionary (or skipped status if MobSF unavailable)
"""
logger.info(f"Activity: scan_with_mobsf (workspace={workspace_path})")
# Check if MobSF is installed (graceful degradation for ARM64 platform)
mobsf_path = Path("/app/mobsf")
if not mobsf_path.exists():
logger.warning("MobSF not installed on this platform (ARM64/Rosetta limitation)")
return {
"status": "skipped",
"findings": [],
"summary": {
"total_findings": 0,
"skip_reason": "MobSF unavailable on ARM64 platform (Rosetta 2 incompatibility)"
}
}
try:
from modules.android import MobSFScanner
workspace = Path(workspace_path)
if not workspace.exists():
raise FileNotFoundError(f"Workspace not found: {workspace_path}")
scanner = MobSFScanner()
result = await scanner.execute(config, workspace)
logger.info(
f"✓ MobSF scan completed: "
f"{result.summary.get('total_findings', 0)} findings"
)
return result.dict()
except Exception as e:
logger.error(f"MobSF scan failed: {e}", exc_info=True)
raise
@activity.defn(name="generate_android_sarif")
async def generate_android_sarif_activity(
jadx_result: dict,
opengrep_result: dict,
mobsf_result: dict,
config: dict,
workspace_path: str
) -> dict:
"""
Generate combined SARIF report from all Android security findings.
Args:
jadx_result: Jadx decompilation results
opengrep_result: OpenGrep analysis results
mobsf_result: MobSF scan results (may be None if disabled)
config: Reporter configuration
workspace_path: Workspace path
Returns:
SARIF report dictionary
"""
logger.info("Activity: generate_android_sarif")
try:
from modules.reporter import SARIFReporter
workspace = Path(workspace_path)
# Collect all findings
all_findings = []
all_findings.extend(opengrep_result.get("findings", []))
if mobsf_result:
all_findings.extend(mobsf_result.get("findings", []))
# Prepare reporter config
reporter_config = {
**(config or {}),
"findings": all_findings,
"tool_name": "FuzzForge Android Static Analysis",
"tool_version": "1.0.0",
"metadata": {
"jadx_version": "1.5.0",
"opengrep_version": "1.45.0",
"mobsf_version": "3.9.7",
"java_files_decompiled": jadx_result.get("summary", {}).get("java_files", 0),
}
}
reporter = SARIFReporter()
result = await reporter.execute(reporter_config, workspace)
sarif_report = result.dict().get("sarif", {})
logger.info(f"✓ SARIF report generated with {len(all_findings)} findings")
return sarif_report
except Exception as e:
logger.error(f"SARIF report generation failed: {e}", exc_info=True)
raise
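The `scan_with_mobsf` activity degrades gracefully: when the tool is absent it returns a `"skipped"` result instead of raising, so the workflow can continue with the remaining scanners. That contract can be sketched independently of Temporal; the `tool_path` argument and result shape below are illustrative assumptions, not the platform's actual API:

```python
from pathlib import Path


def run_optional_scanner(tool_path: str, scan) -> dict:
    """Run a scanner only if its tool directory exists; otherwise degrade gracefully."""
    if not Path(tool_path).exists():
        # Mirror the activity's skip contract: callers branch on "status".
        return {
            "status": "skipped",
            "findings": [],
            "summary": {"total_findings": 0, "skip_reason": f"{tool_path} unavailable"},
        }
    findings = scan()
    return {
        "status": "success",
        "findings": findings,
        "summary": {"total_findings": len(findings)},
    }
```

Because skipped and successful results share the same shape, downstream report generation needs no special casing beyond checking `status`.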

View File

@@ -0,0 +1,172 @@
name: android_static_analysis
version: "1.0.0"
vertical: android
description: "Comprehensive Android application security testing using Jadx decompilation, OpenGrep static analysis, and MobSF mobile security framework"
author: "FuzzForge Team"
tags:
- "android"
- "mobile"
- "static-analysis"
- "security"
- "opengrep"
- "semgrep"
- "mobsf"
- "jadx"
- "apk"
- "sarif"
# Workspace isolation mode
# Using "shared" mode for read-only APK analysis (no file modifications except decompilation output)
workspace_isolation: "shared"
parameters:
type: object
properties:
apk_path:
type: string
description: "Path to the APK file to analyze (relative to uploaded target or absolute within workspace)"
default: ""
decompile_apk:
type: boolean
description: "Whether to decompile APK with Jadx before OpenGrep analysis"
default: true
jadx_config:
type: object
description: "Jadx decompiler configuration"
properties:
output_dir:
type: string
description: "Output directory for decompiled sources"
default: "jadx_output"
overwrite:
type: boolean
description: "Overwrite existing decompilation output"
default: true
threads:
type: integer
description: "Number of decompilation threads"
default: 4
minimum: 1
maximum: 32
decompiler_args:
type: array
items:
type: string
description: "Additional Jadx arguments"
default: []
opengrep_config:
type: object
description: "OpenGrep/Semgrep static analysis configuration"
properties:
config:
type: string
enum: ["auto", "p/security-audit", "p/owasp-top-ten", "p/cwe-top-25"]
description: "Preset OpenGrep ruleset (ignored if custom_rules_path is set)"
default: "auto"
custom_rules_path:
type: string
description: "Path to custom OpenGrep rules directory (use Android-specific rules for best results)"
default: "/app/toolbox/modules/android/custom_rules"
languages:
type: array
items:
type: string
description: "Programming languages to analyze (defaults to java, kotlin for Android)"
default: ["java", "kotlin"]
include_patterns:
type: array
items:
type: string
description: "File patterns to include in scan"
default: []
exclude_patterns:
type: array
items:
type: string
description: "File patterns to exclude from scan"
default: []
max_target_bytes:
type: integer
description: "Maximum file size to analyze (bytes)"
default: 1000000
timeout:
type: integer
description: "Analysis timeout in seconds"
default: 300
severity:
type: array
items:
type: string
enum: ["ERROR", "WARNING", "INFO"]
description: "Severity levels to include in results"
default: ["ERROR", "WARNING", "INFO"]
confidence:
type: array
items:
type: string
enum: ["HIGH", "MEDIUM", "LOW"]
description: "Confidence levels to include in results"
default: ["HIGH", "MEDIUM", "LOW"]
mobsf_config:
type: object
description: "MobSF scanner configuration"
properties:
enabled:
type: boolean
description: "Enable MobSF analysis (requires APK file)"
default: true
mobsf_url:
type: string
description: "MobSF server URL"
default: "http://localhost:8877"
api_key:
type: string
description: "MobSF API key (if not provided, uses MOBSF_API_KEY env var)"
default: null
rescan:
type: boolean
description: "Force rescan even if APK was previously analyzed"
default: false
reporter_config:
type: object
description: "SARIF reporter configuration"
properties:
include_code_flows:
type: boolean
description: "Include code flow information in SARIF output"
default: false
logical_id:
type: string
description: "Custom identifier for the SARIF report"
default: null
output_schema:
type: object
properties:
sarif:
type: object
description: "SARIF-formatted findings from all Android security tools"
summary:
type: object
description: "Android security analysis summary"
properties:
total_findings:
type: integer
decompiled_java_files:
type: integer
description: "Number of Java files decompiled by Jadx"
opengrep_findings:
type: integer
description: "Findings from OpenGrep/Semgrep analysis"
mobsf_findings:
type: integer
description: "Findings from MobSF analysis"
severity_distribution:
type: object
category_distribution:
type: object
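The `parameters` block above is a JSON-Schema-style description whose `default` entries mirror the fallback configs hard-coded in the workflow. A minimal sketch of that defaults merge (the property names here are copied from the `jadx_config` schema; the helper itself is hypothetical, not FuzzForge code):

```python
def apply_defaults(user_config, schema_props):
    """Fill missing keys from the schema's 'default' entries (shallow merge)."""
    merged = {k: v["default"] for k, v in schema_props.items() if "default" in v}
    merged.update(user_config or {})
    return merged
```

Keeping defaults in one place (the schema) would avoid the duplication between metadata and workflow code.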

View File

@@ -0,0 +1,289 @@
"""
Android Static Analysis Workflow - Temporal Version
Comprehensive security testing for Android applications using Jadx, OpenGrep, and MobSF.
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
from datetime import timedelta
from typing import Dict, Any, Optional
from pathlib import Path
from temporalio import workflow
from temporalio.common import RetryPolicy
# Import activity interfaces (will be executed by worker)
with workflow.unsafe.imports_passed_through():
import logging
logger = logging.getLogger(__name__)
@workflow.defn
class AndroidStaticAnalysisWorkflow:
"""
Android Static Application Security Testing workflow.
This workflow:
1. Downloads target (APK) from MinIO
2. (Optional) Decompiles APK using Jadx
3. Runs OpenGrep/Semgrep static analysis on decompiled code
4. (Optional) Runs MobSF comprehensive security scan
5. Generates a SARIF report with all findings
6. Uploads results to MinIO
7. Cleans up cache
"""
@workflow.run
async def run(
self,
target_id: str,
apk_path: Optional[str] = None,
decompile_apk: bool = True,
jadx_config: Optional[Dict[str, Any]] = None,
opengrep_config: Optional[Dict[str, Any]] = None,
mobsf_config: Optional[Dict[str, Any]] = None,
reporter_config: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
"""
Main workflow execution.
Args:
target_id: UUID of the uploaded target (APK) in MinIO
apk_path: Path to APK file within target (if target is not a single APK)
decompile_apk: Whether to decompile APK with Jadx before OpenGrep
jadx_config: Configuration for Jadx decompiler
opengrep_config: Configuration for OpenGrep analyzer
mobsf_config: Configuration for MobSF scanner
reporter_config: Configuration for SARIF reporter
Returns:
Dictionary containing SARIF report and summary
"""
workflow_id = workflow.info().workflow_id
workflow.logger.info(
f"Starting AndroidStaticAnalysisWorkflow "
f"(workflow_id={workflow_id}, target_id={target_id})"
)
# Default configurations
if not jadx_config:
jadx_config = {
"output_dir": "jadx_output",
"overwrite": True,
"threads": 4,
"decompiler_args": []
}
if not opengrep_config:
opengrep_config = {
"config": "auto",
"custom_rules_path": "/app/toolbox/modules/android/custom_rules",
"languages": ["java", "kotlin"],
"severity": ["ERROR", "WARNING", "INFO"],
"confidence": ["HIGH", "MEDIUM", "LOW"],
"timeout": 300,
}
if not mobsf_config:
mobsf_config = {
"enabled": True,
"mobsf_url": "http://localhost:8877",
"api_key": None,
"rescan": False,
}
if not reporter_config:
reporter_config = {
"include_code_flows": False
}
# Activity retry policy
retry_policy = RetryPolicy(
initial_interval=timedelta(seconds=1),
maximum_interval=timedelta(seconds=60),
maximum_attempts=3,
backoff_coefficient=2.0,
)
# Phase 0: Download target from MinIO
workflow.logger.info(f"Phase 0: Downloading target from MinIO (target_id={target_id})")
workspace_path = await workflow.execute_activity(
"get_target",
args=[target_id, workflow.info().workflow_id, "shared"],
start_to_close_timeout=timedelta(minutes=10),
retry_policy=retry_policy,
)
workflow.logger.info(f"✓ Target downloaded to: {workspace_path}")
# Handle case where workspace_path is a file (single APK upload)
# vs. a directory containing files
workspace_path_obj = Path(workspace_path)
# Determine actual workspace directory and APK path
if apk_path:
# User explicitly provided apk_path
actual_apk_path = apk_path
# workspace_path could be either a file or directory
# If it's a file and apk_path matches the filename, use parent as workspace
if workspace_path_obj.name == apk_path:
workspace_path = str(workspace_path_obj.parent)
workflow.logger.info(f"Adjusted workspace to parent directory: {workspace_path}")
else:
# No apk_path provided - check if workspace_path is an APK file
if workspace_path_obj.suffix.lower() == '.apk':
# workspace_path is the APK file itself
actual_apk_path = workspace_path_obj.name
workspace_path = str(workspace_path_obj.parent)
workflow.logger.info(f"Detected single APK file: {actual_apk_path}, workspace: {workspace_path}")
else:
# workspace_path is a directory, need to find APK within it
actual_apk_path = None
workflow.logger.info("Workspace is a directory, APK detection will be handled by modules")
# Phase 1: Jadx decompilation (if enabled and APK provided)
jadx_result = None
analysis_workspace = workspace_path
if decompile_apk and actual_apk_path:
workflow.logger.info(f"Phase 1: Decompiling APK with Jadx (apk={actual_apk_path})")
jadx_activity_config = {
**jadx_config,
"apk_path": actual_apk_path
}
jadx_result = await workflow.execute_activity(
"decompile_with_jadx",
args=[workspace_path, jadx_activity_config],
start_to_close_timeout=timedelta(minutes=15),
retry_policy=retry_policy,
)
if jadx_result.get("status") == "success":
# Use decompiled sources as workspace for OpenGrep
source_dir = jadx_result.get("summary", {}).get("source_dir")
if source_dir:
analysis_workspace = source_dir
workflow.logger.info(
f"✓ Jadx decompiled {jadx_result.get('summary', {}).get('java_files', 0)} Java files"
)
else:
workflow.logger.warning(f"Jadx decompilation failed: {jadx_result.get('error')}")
else:
workflow.logger.info("Phase 1: Jadx decompilation skipped")
# Phase 2: OpenGrep static analysis
workflow.logger.info(f"Phase 2: OpenGrep analysis on {analysis_workspace}")
opengrep_result = await workflow.execute_activity(
"scan_with_opengrep",
args=[analysis_workspace, opengrep_config],
start_to_close_timeout=timedelta(minutes=20),
retry_policy=retry_policy,
)
workflow.logger.info(
f"✓ OpenGrep completed: {opengrep_result.get('summary', {}).get('total_findings', 0)} findings"
)
# Phase 3: MobSF analysis (if enabled and APK provided)
mobsf_result = None
if mobsf_config.get("enabled", True) and actual_apk_path:
workflow.logger.info(f"Phase 3: MobSF scan on APK: {actual_apk_path}")
mobsf_activity_config = {
**mobsf_config,
"file_path": actual_apk_path
}
try:
mobsf_result = await workflow.execute_activity(
"scan_with_mobsf",
args=[workspace_path, mobsf_activity_config],
start_to_close_timeout=timedelta(minutes=30),
retry_policy=RetryPolicy(
maximum_attempts=2 # MobSF can be flaky, limit retries
),
)
# Handle skipped or completed status
if mobsf_result.get("status") == "skipped":
workflow.logger.warning(
f"⚠️ MobSF skipped: {mobsf_result.get('summary', {}).get('skip_reason', 'Unknown reason')}"
)
else:
workflow.logger.info(
f"✓ MobSF completed: {mobsf_result.get('summary', {}).get('total_findings', 0)} findings"
)
except Exception as e:
workflow.logger.warning(f"MobSF scan failed (continuing without it): {e}")
mobsf_result = None
else:
workflow.logger.info("Phase 3: MobSF scan skipped (disabled or no APK)")
# Phase 4: Generate SARIF report
workflow.logger.info("Phase 4: Generating SARIF report")
sarif_report = await workflow.execute_activity(
"generate_android_sarif",
args=[jadx_result or {}, opengrep_result, mobsf_result, reporter_config, workspace_path],
start_to_close_timeout=timedelta(minutes=5),
retry_policy=retry_policy,
)
# Phase 5: Upload results to MinIO
workflow.logger.info("Phase 5: Uploading results to MinIO")
result_url = await workflow.execute_activity(
"upload_results",
args=[workflow.info().workflow_id, sarif_report, "sarif"],
start_to_close_timeout=timedelta(minutes=10),
retry_policy=retry_policy,
)
workflow.logger.info(f"✓ Results uploaded: {result_url}")
# Phase 6: Cleanup cache
workflow.logger.info("Phase 6: Cleaning up cache")
await workflow.execute_activity(
"cleanup_cache",
args=[workspace_path, "shared"],
start_to_close_timeout=timedelta(minutes=5),
retry_policy=RetryPolicy(maximum_attempts=1), # Don't retry cleanup
)
# Calculate summary
total_findings = len((sarif_report.get("runs") or [{}])[0].get("results", []))
summary = {
"workflow": "android_static_analysis",
"target_id": target_id,
"total_findings": total_findings,
"decompiled_java_files": (jadx_result or {}).get("summary", {}).get("java_files", 0) if jadx_result else 0,
"opengrep_findings": opengrep_result.get("summary", {}).get("total_findings", 0),
"mobsf_findings": mobsf_result.get("summary", {}).get("total_findings", 0) if mobsf_result else 0,
"result_url": result_url,
}
workflow.logger.info(
f"✅ AndroidStaticAnalysisWorkflow completed successfully: {total_findings} findings"
)
return {
"sarif": sarif_report,
"summary": summary,
}
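The workflow's retry policy (initial interval 1s, backoff coefficient 2.0, cap 60s) produces a nominal exponential schedule. Temporal's actual timing also involves server-side behavior, so this is only the nominal sequence of sleeps between attempts:

```python
def nominal_backoff(initial: float, coefficient: float, maximum: float, attempts: int):
    """Nominal sleep before each retry (the first attempt has no preceding sleep)."""
    return [min(initial * coefficient ** i, maximum) for i in range(attempts - 1)]
```

With `maximum_attempts=3` as configured above, at most two retries occur, sleeping roughly 1s and then 2s.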

View File

@@ -16,11 +16,6 @@ tags:
# - "copy-on-write": Download once, copy for each run (balances performance and isolation)
workspace_isolation: "isolated"
default_parameters:
target_file: null
max_iterations: 1000000
timeout_seconds: 1800
parameters:
type: object
properties:

View File

@@ -16,12 +16,6 @@ tags:
# - "copy-on-write": Download once, copy for each run (balances performance and isolation)
workspace_isolation: "isolated"
default_parameters:
target_name: null
max_iterations: 1000000
timeout_seconds: 1800
sanitizer: "address"
parameters:
type: object
properties:

View File

@@ -30,13 +30,5 @@ parameters:
default: false
description: "Scan files without Git context"
default_parameters:
scan_mode: "detect"
redact: true
no_git: false
required_modules:
- "gitleaks"
supported_volume_modes:
- "ro"

View File

@@ -13,38 +13,84 @@ tags:
# Workspace isolation mode
workspace_isolation: "shared"
default_parameters:
agent_url: "http://fuzzforge-task-agent:8000/a2a/litellm_agent"
llm_model: "gpt-5-mini"
llm_provider: "openai"
max_files: 5
parameters:
type: object
properties:
agent_url:
type: string
description: "A2A agent endpoint URL"
default: "http://fuzzforge-task-agent:8000/a2a/litellm_agent"
llm_model:
type: string
description: "LLM model to use (e.g., gpt-4o-mini, claude-3-5-sonnet)"
default: "gpt-5-mini"
llm_provider:
type: string
description: "LLM provider (openai, anthropic, etc.)"
default: "openai"
file_patterns:
type: array
items:
type: string
description: "File patterns to analyze (e.g., ['*.py', '*.js'])"
default:
- "*.py"
- "*.js"
- "*.ts"
- "*.jsx"
- "*.tsx"
- "*.java"
- "*.go"
- "*.rs"
- "*.c"
- "*.cpp"
- "*.h"
- "*.hpp"
- "*.cs"
- "*.php"
- "*.rb"
- "*.swift"
- "*.kt"
- "*.scala"
- "*.env"
- "*.yaml"
- "*.yml"
- "*.json"
- "*.xml"
- "*.ini"
- "*.sql"
- "*.properties"
- "*.sh"
- "*.bat"
- "*.ps1"
- "*.config"
- "*.conf"
- "*.toml"
- "*id_rsa*"
- "*id_dsa*"
- "*id_ecdsa*"
- "*id_ed25519*"
- "*.pem"
- "*.key"
- "*.pub"
- "*.txt"
- "*.md"
- "Dockerfile"
- "docker-compose.yml"
- ".gitignore"
- ".dockerignore"
description: "File patterns to analyze for security issues and secrets"
max_files:
type: integer
description: "Maximum number of files to analyze"
default: 10
max_file_size:
type: integer
description: "Maximum file size in bytes"
default: 100000
timeout:
type: integer
description: "Timeout per file in seconds"
default: 90
output_schema:
type: object

View File

@@ -17,27 +17,55 @@ parameters:
agent_url:
type: string
default: "http://fuzzforge-task-agent:8000/a2a/litellm_agent"
llm_model:
type: string
default: "gpt-4o-mini"
default: "gpt-5-mini"
llm_provider:
type: string
default: "openai"
max_files:
type: integer
default: 20
default_parameters:
agent_url: "http://fuzzforge-task-agent:8000/a2a/litellm_agent"
llm_model: "gpt-4o-mini"
llm_provider: "openai"
max_files: 20
max_file_size:
type: integer
default: 30000
description: "Maximum file size in bytes"
timeout:
type: integer
default: 30
description: "Timeout per file in seconds"
file_patterns:
type: array
items:
type: string
default:
- "*.py"
- "*.js"
- "*.ts"
- "*.java"
- "*.go"
- "*.env"
- "*.yaml"
- "*.yml"
- "*.json"
- "*.xml"
- "*.ini"
- "*.sql"
- "*.properties"
- "*.sh"
- "*.bat"
- "*.config"
- "*.conf"
- "*.toml"
- "*id_rsa*"
- "*.txt"
description: "File patterns to scan for secrets"
required_modules:
- "llm_secret_detector"
supported_volume_modes:
- "ro"

View File

@@ -17,6 +17,7 @@ class LlmSecretDetectionWorkflow:
llm_model: Optional[str] = None,
llm_provider: Optional[str] = None,
max_files: Optional[int] = None,
max_file_size: Optional[int] = None,
timeout: Optional[int] = None,
file_patterns: Optional[list] = None
) -> Dict[str, Any]:
@@ -67,6 +68,8 @@ class LlmSecretDetectionWorkflow:
config["llm_provider"] = llm_provider
if max_files:
config["max_files"] = max_files
if max_file_size:
config["max_file_size"] = max_file_size
if timeout:
config["timeout"] = timeout
if file_patterns:
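The override pattern in the hunk above (`if max_files:`) uses truthiness, which silently ignores an explicit `0` or `False`. A sketch of an `is not None` variant (a generic helper, not the project's code):

```python
def apply_overrides(config: dict, **overrides) -> dict:
    """Copy config, applying only overrides that were explicitly provided."""
    merged = dict(config)
    for key, value in overrides.items():
        if value is not None:  # unlike truthiness, this preserves 0 and False
            merged[key] = value
    return merged
```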

View File

@@ -16,13 +16,6 @@ tags:
# OSS-Fuzz campaigns use isolated mode for safe concurrent campaigns
workspace_isolation: "isolated"
default_parameters:
project_name: null
campaign_duration_hours: 1
override_engine: null
override_sanitizer: null
max_iterations: null
parameters:
type: object
required:

View File

@@ -0,0 +1,10 @@
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.

View File

@@ -0,0 +1,191 @@
"""
Python SAST Workflow Activities
Activities specific to the Python SAST workflow:
- scan_dependencies_activity: Scan Python dependencies for CVEs using pip-audit
- analyze_with_bandit_activity: Analyze Python code for security issues using Bandit
- analyze_with_mypy_activity: Analyze Python code for type safety using Mypy
- generate_python_sast_sarif_activity: Generate SARIF report from all findings
"""
import logging
import sys
from pathlib import Path
from temporalio import activity
# Configure logging
logger = logging.getLogger(__name__)
# Add toolbox to path for module imports
sys.path.insert(0, '/app/toolbox')
@activity.defn(name="scan_dependencies")
async def scan_dependencies_activity(workspace_path: str, config: dict) -> dict:
"""
Scan Python dependencies for known vulnerabilities using pip-audit.
Args:
workspace_path: Path to the workspace directory
config: DependencyScanner configuration
Returns:
Scanner results dictionary
"""
logger.info(f"Activity: scan_dependencies (workspace={workspace_path})")
try:
from modules.scanner import DependencyScanner
workspace = Path(workspace_path)
if not workspace.exists():
raise FileNotFoundError(f"Workspace not found: {workspace_path}")
scanner = DependencyScanner()
result = await scanner.execute(config, workspace)
logger.info(
f"✓ Dependency scanning completed: "
f"{result.summary.get('total_vulnerabilities', 0)} vulnerabilities found"
)
return result.dict()
except Exception as e:
logger.error(f"Dependency scanning failed: {e}", exc_info=True)
raise
@activity.defn(name="analyze_with_bandit")
async def analyze_with_bandit_activity(workspace_path: str, config: dict) -> dict:
"""
Analyze Python code for security issues using Bandit.
Args:
workspace_path: Path to the workspace directory
config: BanditAnalyzer configuration
Returns:
Analysis results dictionary
"""
logger.info(f"Activity: analyze_with_bandit (workspace={workspace_path})")
try:
from modules.analyzer import BanditAnalyzer
workspace = Path(workspace_path)
if not workspace.exists():
raise FileNotFoundError(f"Workspace not found: {workspace_path}")
analyzer = BanditAnalyzer()
result = await analyzer.execute(config, workspace)
logger.info(
f"✓ Bandit analysis completed: "
f"{result.summary.get('total_issues', 0)} security issues found"
)
return result.dict()
except Exception as e:
logger.error(f"Bandit analysis failed: {e}", exc_info=True)
raise
@activity.defn(name="analyze_with_mypy")
async def analyze_with_mypy_activity(workspace_path: str, config: dict) -> dict:
"""
Analyze Python code for type safety issues using Mypy.
Args:
workspace_path: Path to the workspace directory
config: MypyAnalyzer configuration
Returns:
Analysis results dictionary
"""
logger.info(f"Activity: analyze_with_mypy (workspace={workspace_path})")
try:
from modules.analyzer import MypyAnalyzer
workspace = Path(workspace_path)
if not workspace.exists():
raise FileNotFoundError(f"Workspace not found: {workspace_path}")
analyzer = MypyAnalyzer()
result = await analyzer.execute(config, workspace)
logger.info(
f"✓ Mypy analysis completed: "
f"{result.summary.get('total_errors', 0)} type errors found"
)
return result.dict()
except Exception as e:
logger.error(f"Mypy analysis failed: {e}", exc_info=True)
raise
@activity.defn(name="generate_python_sast_sarif")
async def generate_python_sast_sarif_activity(
dependency_results: dict,
bandit_results: dict,
mypy_results: dict,
config: dict,
workspace_path: str
) -> dict:
"""
Generate SARIF report from all SAST analysis results.
Args:
dependency_results: Results from dependency scanner
bandit_results: Results from Bandit analyzer
mypy_results: Results from Mypy analyzer
config: Reporter configuration
workspace_path: Path to the workspace
Returns:
SARIF report dictionary
"""
logger.info("Activity: generate_python_sast_sarif")
try:
from modules.reporter import SARIFReporter
workspace = Path(workspace_path)
# Combine findings from all modules
all_findings = []
# Add dependency scanner findings
dependency_findings = dependency_results.get("findings", [])
all_findings.extend(dependency_findings)
# Add Bandit findings
bandit_findings = bandit_results.get("findings", [])
all_findings.extend(bandit_findings)
# Add Mypy findings
mypy_findings = mypy_results.get("findings", [])
all_findings.extend(mypy_findings)
# Prepare reporter config
reporter_config = {
**config,
"findings": all_findings,
"tool_name": "FuzzForge Python SAST",
"tool_version": "1.0.0"
}
reporter = SARIFReporter()
result = await reporter.execute(reporter_config, workspace)
# Extract SARIF from result
sarif = result.dict().get("sarif", {})
logger.info(f"✓ SARIF report generated with {len(all_findings)} findings")
return sarif
except Exception as e:
logger.error(f"SARIF report generation failed: {e}", exc_info=True)
raise
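`generate_python_sast_sarif` delegates the actual SARIF construction to the `SARIFReporter` module, which is not shown in this diff. The merge-then-report shape can be approximated as follows; the SARIF skeleton and finding keys (`rule_id`, `message`) are a hand-built sketch, not the module's real output format:

```python
def merge_to_sarif(*tool_results, tool_name="FuzzForge Python SAST"):
    """Flatten findings from several tool result dicts into one minimal SARIF run."""
    findings = []
    for result in tool_results:
        findings.extend((result or {}).get("findings", []))
    return {
        "version": "2.1.0",
        "runs": [{
            "tool": {"driver": {"name": tool_name}},
            "results": [
                {"ruleId": f.get("rule_id", "unknown"),
                 "message": {"text": f.get("message", "")}}
                for f in findings
            ],
        }],
    }
```

Passing `None` for a disabled tool (as the Android workflow does for MobSF) simply contributes no results.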

View File

@@ -0,0 +1,110 @@
name: python_sast
version: "1.0.0"
vertical: python
description: "Python Static Application Security Testing (SAST) workflow combining dependency scanning (pip-audit), security linting (Bandit), and type checking (Mypy)"
author: "FuzzForge Team"
tags:
- "python"
- "sast"
- "security"
- "type-checking"
- "dependencies"
- "bandit"
- "mypy"
- "pip-audit"
- "sarif"
# Workspace isolation mode (system-level configuration)
# Using "shared" mode for read-only SAST analysis (no file modifications)
workspace_isolation: "shared"
parameters:
type: object
properties:
dependency_config:
type: object
description: "Dependency scanner (pip-audit) configuration"
properties:
dependency_files:
type: array
items:
type: string
description: "List of dependency files to scan (auto-discovered if empty)"
default: []
ignore_vulns:
type: array
items:
type: string
description: "List of vulnerability IDs to ignore"
default: []
bandit_config:
type: object
description: "Bandit security analyzer configuration"
properties:
severity_level:
type: string
enum: ["low", "medium", "high"]
description: "Minimum severity level to report"
default: "low"
confidence_level:
type: string
enum: ["low", "medium", "high"]
description: "Minimum confidence level to report"
default: "medium"
exclude_tests:
type: boolean
description: "Exclude test files from analysis"
default: true
skip_ids:
type: array
items:
type: string
description: "List of Bandit test IDs to skip"
default: []
mypy_config:
type: object
description: "Mypy type checker configuration"
properties:
strict_mode:
type: boolean
description: "Enable strict type checking"
default: false
ignore_missing_imports:
type: boolean
description: "Ignore errors about missing imports"
default: true
follow_imports:
type: string
enum: ["normal", "silent", "skip", "error"]
description: "How to handle imports"
default: "silent"
reporter_config:
type: object
description: "SARIF reporter configuration"
properties:
include_code_flows:
type: boolean
description: "Include code flow information"
default: false
output_schema:
type: object
properties:
sarif:
type: object
description: "SARIF-formatted SAST findings from all tools"
summary:
type: object
description: "SAST execution summary"
properties:
total_findings:
type: integer
vulnerabilities:
type: integer
description: "CVEs found in dependencies"
security_issues:
type: integer
description: "Security issues found by Bandit"
type_errors:
type: integer
description: "Type errors found by Mypy"

View File

@@ -0,0 +1,265 @@
"""
Python SAST Workflow - Temporal Version
Static Application Security Testing for Python projects using multiple tools.
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
from datetime import timedelta
from typing import Dict, Any, Optional
from temporalio import workflow
from temporalio.common import RetryPolicy
# Import activity interfaces (will be executed by worker)
with workflow.unsafe.imports_passed_through():
import logging
logger = logging.getLogger(__name__)
@workflow.defn
class PythonSastWorkflow:
"""
Python Static Application Security Testing workflow.
This workflow:
1. Downloads target from MinIO
2. Runs dependency scanning (pip-audit for CVEs)
3. Runs security linting (Bandit for security issues)
4. Runs type checking (Mypy for type safety)
5. Generates a SARIF report with all findings
6. Uploads results to MinIO
7. Cleans up cache
"""
@workflow.run
async def run(
self,
target_id: str,
dependency_config: Optional[Dict[str, Any]] = None,
bandit_config: Optional[Dict[str, Any]] = None,
mypy_config: Optional[Dict[str, Any]] = None,
reporter_config: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
"""
Main workflow execution.
Args:
target_id: UUID of the uploaded target in MinIO
dependency_config: Configuration for dependency scanner
bandit_config: Configuration for Bandit analyzer
mypy_config: Configuration for Mypy analyzer
reporter_config: Configuration for SARIF reporter
Returns:
Dictionary containing SARIF report and summary
"""
workflow_id = workflow.info().workflow_id
workflow.logger.info(
f"Starting PythonSASTWorkflow "
f"(workflow_id={workflow_id}, target_id={target_id})"
)
# Default configurations
if not dependency_config:
dependency_config = {
"dependency_files": [], # Auto-discover
"ignore_vulns": []
}
if not bandit_config:
bandit_config = {
"severity_level": "low",
"confidence_level": "medium",
"exclude_tests": True,
"skip_ids": []
}
if not mypy_config:
mypy_config = {
"strict_mode": False,
"ignore_missing_imports": True,
"follow_imports": "silent"
}
if not reporter_config:
reporter_config = {
"include_code_flows": False
}
results = {
"workflow_id": workflow_id,
"target_id": target_id,
"status": "running",
"steps": []
}
try:
# Get run ID for workspace isolation (using shared mode for read-only analysis)
run_id = workflow.info().run_id
# Step 1: Download target from MinIO
workflow.logger.info("Step 1: Downloading target from MinIO")
target_path = await workflow.execute_activity(
"get_target",
args=[target_id, run_id, "shared"], # target_id, run_id, workspace_isolation
start_to_close_timeout=timedelta(minutes=5),
retry_policy=RetryPolicy(
initial_interval=timedelta(seconds=1),
maximum_interval=timedelta(seconds=30),
maximum_attempts=3
)
)
results["steps"].append({
"step": "download_target",
"status": "success",
"target_path": target_path
})
workflow.logger.info(f"✓ Target downloaded to: {target_path}")
# Step 2: Dependency scanning (pip-audit)
workflow.logger.info("Step 2: Scanning dependencies for vulnerabilities")
dependency_results = await workflow.execute_activity(
"scan_dependencies",
args=[target_path, dependency_config],
start_to_close_timeout=timedelta(minutes=10),
retry_policy=RetryPolicy(
initial_interval=timedelta(seconds=2),
maximum_interval=timedelta(seconds=60),
maximum_attempts=2
)
)
results["steps"].append({
"step": "dependency_scanning",
"status": "success",
"vulnerabilities": dependency_results.get("summary", {}).get("total_vulnerabilities", 0)
})
workflow.logger.info(
f"✓ Dependency scanning completed: "
f"{dependency_results.get('summary', {}).get('total_vulnerabilities', 0)} vulnerabilities"
)
# Step 3: Security linting (Bandit)
workflow.logger.info("Step 3: Analyzing security issues with Bandit")
bandit_results = await workflow.execute_activity(
"analyze_with_bandit",
args=[target_path, bandit_config],
start_to_close_timeout=timedelta(minutes=10),
retry_policy=RetryPolicy(
initial_interval=timedelta(seconds=2),
maximum_interval=timedelta(seconds=60),
maximum_attempts=2
)
)
results["steps"].append({
"step": "bandit_analysis",
"status": "success",
"issues": bandit_results.get("summary", {}).get("total_issues", 0)
})
workflow.logger.info(
f"✓ Bandit analysis completed: "
f"{bandit_results.get('summary', {}).get('total_issues', 0)} security issues"
)
# Step 4: Type checking (Mypy)
workflow.logger.info("Step 4: Type checking with Mypy")
mypy_results = await workflow.execute_activity(
"analyze_with_mypy",
args=[target_path, mypy_config],
start_to_close_timeout=timedelta(minutes=10),
retry_policy=RetryPolicy(
initial_interval=timedelta(seconds=2),
maximum_interval=timedelta(seconds=60),
maximum_attempts=2
)
)
results["steps"].append({
"step": "mypy_analysis",
"status": "success",
"type_errors": mypy_results.get("summary", {}).get("total_errors", 0)
})
workflow.logger.info(
f"✓ Mypy analysis completed: "
f"{mypy_results.get('summary', {}).get('total_errors', 0)} type errors"
)
# Step 5: Generate SARIF report
workflow.logger.info("Step 5: Generating SARIF report")
sarif_report = await workflow.execute_activity(
"generate_python_sast_sarif",
args=[dependency_results, bandit_results, mypy_results, reporter_config, target_path],
start_to_close_timeout=timedelta(minutes=5)
)
results["steps"].append({
"step": "report_generation",
"status": "success"
})
# Count total findings in SARIF
total_findings = 0
if sarif_report and "runs" in sarif_report:
total_findings = len(sarif_report["runs"][0].get("results", []))
workflow.logger.info(f"✓ SARIF report generated with {total_findings} findings")
# Step 6: Upload results to MinIO
workflow.logger.info("Step 6: Uploading results")
try:
results_url = await workflow.execute_activity(
"upload_results",
args=[workflow_id, sarif_report, "sarif"],
start_to_close_timeout=timedelta(minutes=2)
)
results["results_url"] = results_url
workflow.logger.info(f"✓ Results uploaded to: {results_url}")
except Exception as e:
workflow.logger.warning(f"Failed to upload results: {e}")
results["results_url"] = None
# Step 7: Cleanup cache
workflow.logger.info("Step 7: Cleaning up cache")
try:
await workflow.execute_activity(
"cleanup_cache",
args=[target_path, "shared"], # target_path, workspace_isolation
start_to_close_timeout=timedelta(minutes=1)
)
workflow.logger.info("✓ Cache cleaned up (skipped for shared mode)")
except Exception as e:
workflow.logger.warning(f"Cache cleanup failed: {e}")
# Mark workflow as successful
results["status"] = "success"
results["sarif"] = sarif_report
results["summary"] = {
"total_findings": total_findings,
"vulnerabilities": dependency_results.get("summary", {}).get("total_vulnerabilities", 0),
"security_issues": bandit_results.get("summary", {}).get("total_issues", 0),
"type_errors": mypy_results.get("summary", {}).get("total_errors", 0)
}
workflow.logger.info(f"✓ Workflow completed successfully: {workflow_id}")
return results
except Exception as e:
workflow.logger.error(f"Workflow failed: {e}")
results["status"] = "error"
results["error"] = str(e)
results["steps"].append({
"step": "error",
"status": "failed",
"error": str(e)
})
raise
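The workflow above counts findings from the first SARIF run only. A minimal standalone sketch of that counting step (helper name is hypothetical, with an extra guard for an empty `runs` list):

```python
from typing import Any, Dict

def count_sarif_findings(sarif_report: Dict[str, Any]) -> int:
    """Count results in the first SARIF run, mirroring the workflow logic above."""
    if not sarif_report or "runs" not in sarif_report:
        return 0
    runs = sarif_report["runs"]
    if not runs:
        return 0
    return len(runs[0].get("results", []))
```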


@@ -18,11 +18,6 @@ tags:
# Using "shared" mode for read-only security analysis (no file modifications)
workspace_isolation: "shared"
default_parameters:
scanner_config: {}
analyzer_config: {}
reporter_config: {}
parameters:
type: object
properties:


@@ -23,12 +23,5 @@ parameters:
default: 10
description: "Maximum directory depth to scan"
default_parameters:
verify: true
max_depth: 10
required_modules:
- "trufflehog"
supported_volume_modes:
- "ro"
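These workflow manifests pair `default_parameters` with a user-supplied parameter schema. A hedged sketch of how defaults might be merged with overrides at submission time (helper name is an assumption, not from the source):

```python
from typing import Any, Dict

def merge_parameters(defaults: Dict[str, Any], overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Shallow-merge user-supplied parameters over workflow defaults.

    None values in overrides are treated as "not provided" and skipped.
    """
    merged = dict(defaults)
    merged.update({k: v for k, v in overrides.items() if v is not None})
    return merged
```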


@@ -1,6 +1,6 @@
[project]
name = "fuzzforge-cli"
version = "0.7.0"
version = "0.7.3"
description = "FuzzForge CLI - Command-line interface for FuzzForge security testing platform"
readme = "README.md"
authors = [


@@ -16,4 +16,4 @@ with local project management and persistent storage.
# Additional attribution and requirements are provided in the NOTICE file.
__version__ = "0.6.0"
__version__ = "0.7.3"


@@ -12,3 +12,6 @@ Command modules for FuzzForge CLI.
#
# Additional attribution and requirements are provided in the NOTICE file.
from . import worker
__all__ = ["worker"]


@@ -140,11 +140,145 @@ def get_findings(
else: # table format
display_findings_table(findings.sarif)
# Suggest export command and show command
console.print(f"\n💡 View full details of a finding: [bold cyan]ff finding show {run_id} --rule <rule-id>[/bold cyan]")
console.print(f"💡 Export these findings: [bold cyan]ff findings export {run_id} --format sarif[/bold cyan]")
console.print(" Supported formats: [cyan]sarif[/cyan] (standard), [cyan]json[/cyan], [cyan]csv[/cyan], [cyan]html[/cyan]")
except Exception as e:
console.print(f"❌ Failed to get findings: {e}", style="red")
raise typer.Exit(1)
def show_finding(
run_id: str = typer.Argument(..., help="Run ID to get finding from"),
rule_id: str = typer.Option(..., "--rule", "-r", help="Rule ID of the specific finding to show")
):
"""
🔍 Show detailed information about a specific finding
This function is registered as a command in main.py under the finding (singular) command group.
"""
try:
require_project()
validate_run_id(run_id)
# Try to get from database first, fallback to API
db = get_project_db()
findings_data = None
if db:
findings_data = db.get_findings(run_id)
if not findings_data:
with get_client() as client:
console.print(f"🔍 Fetching findings for run: {run_id}")
findings = client.get_run_findings(run_id)
sarif_data = findings.sarif
else:
sarif_data = findings_data.sarif_data
# Find the specific finding by rule_id
runs = sarif_data.get("runs", [])
if not runs:
console.print("❌ No findings data available", style="red")
raise typer.Exit(1)
run_data = runs[0]
results = run_data.get("results", [])
tool = run_data.get("tool", {}).get("driver", {})
# Search for matching finding
matching_finding = None
for result in results:
if result.get("ruleId") == rule_id:
matching_finding = result
break
if not matching_finding:
console.print(f"❌ No finding found with rule ID: {rule_id}", style="red")
console.print(f"💡 Use [bold cyan]ff findings get {run_id}[/bold cyan] to see all findings", style="dim")
raise typer.Exit(1)
# Display detailed finding
display_finding_detail(matching_finding, tool, run_id)
except Exception as e:
console.print(f"❌ Failed to get finding: {e}", style="red")
raise typer.Exit(1)
def display_finding_detail(finding: Dict[str, Any], tool: Dict[str, Any], run_id: str):
"""Display detailed information about a single finding"""
rule_id = finding.get("ruleId", "unknown")
level = finding.get("level", "note")
message = finding.get("message", {})
message_text = message.get("text", "No summary available")
message_markdown = message.get("markdown", message_text)
# Get location
locations = finding.get("locations", [])
location_str = "Unknown location"
code_snippet = None
if locations:
physical_location = locations[0].get("physicalLocation", {})
artifact_location = physical_location.get("artifactLocation", {})
region = physical_location.get("region", {})
file_path = artifact_location.get("uri", "")
if file_path:
location_str = file_path
if region.get("startLine"):
location_str += f":{region['startLine']}"
if region.get("startColumn"):
location_str += f":{region['startColumn']}"
# Get code snippet if available
if region.get("snippet", {}).get("text"):
code_snippet = region["snippet"]["text"].strip()
# Get severity style
severity_color = {
"error": "red",
"warning": "yellow",
"note": "blue",
"info": "cyan"
}.get(level.lower(), "white")
# Build detailed content
content_lines = []
content_lines.append(f"[bold]Rule ID:[/bold] {rule_id}")
content_lines.append(f"[bold]Severity:[/bold] [{severity_color}]{level.upper()}[/{severity_color}]")
content_lines.append(f"[bold]Location:[/bold] {location_str}")
content_lines.append(f"[bold]Tool:[/bold] {tool.get('name', 'Unknown')} v{tool.get('version', 'unknown')}")
content_lines.append(f"[bold]Run ID:[/bold] {run_id}")
content_lines.append("")
content_lines.append("[bold]Summary:[/bold]")
content_lines.append(message_text)
content_lines.append("")
content_lines.append("[bold]Description:[/bold]")
content_lines.append(message_markdown)
if code_snippet:
content_lines.append("")
content_lines.append("[bold]Code Snippet:[/bold]")
content_lines.append(f"[dim]{code_snippet}[/dim]")
content = "\n".join(content_lines)
# Display in panel
console.print()
console.print(Panel(
content,
title="🔍 Finding Detail",
border_style=severity_color,
box=box.ROUNDED,
padding=(1, 2)
))
console.print()
console.print(f"💡 Export this run: [bold cyan]ff findings export {run_id} --format sarif[/bold cyan]")
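The location string built in `display_finding_detail` follows the SARIF `physicalLocation` layout (`artifactLocation.uri` plus optional `region.startLine`/`startColumn`). A self-contained sketch of that formatting logic (function name hypothetical):

```python
def format_location(physical_location: dict) -> str:
    """Build a file:line:col string from a SARIF physicalLocation dict."""
    artifact = physical_location.get("artifactLocation", {})
    region = physical_location.get("region", {})
    uri = artifact.get("uri", "")
    if not uri:
        return "Unknown location"
    location = uri
    if region.get("startLine"):
        location += f":{region['startLine']}"
    if region.get("startColumn"):
        location += f":{region['startColumn']}"
    return location
```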
def display_findings_table(sarif_data: Dict[str, Any]):
"""Display SARIF findings in a rich table format"""
runs = sarif_data.get("runs", [])
@@ -195,8 +329,8 @@ def display_findings_table(sarif_data: Dict[str, Any]):
# Detailed results - Rich Text-based table with proper emoji alignment
results_table = Table(box=box.ROUNDED)
results_table.add_column("Severity", width=12, justify="left", no_wrap=True)
results_table.add_column("Rule", width=25, justify="left", style="bold cyan", no_wrap=True)
results_table.add_column("Message", width=55, justify="left", no_wrap=True)
results_table.add_column("Rule", justify="left", style="bold cyan", no_wrap=True)
results_table.add_column("Message", width=45, justify="left", no_wrap=True)
results_table.add_column("Location", width=20, justify="left", style="dim", no_wrap=True)
for result in results[:50]: # Limit to first 50 results
@@ -224,18 +358,16 @@ def display_findings_table(sarif_data: Dict[str, Any]):
severity_text = Text(level.upper(), style=severity_style(level))
severity_text.truncate(12, overflow="ellipsis")
rule_text = Text(rule_id)
rule_text.truncate(25, overflow="ellipsis")
# Show full rule ID without truncation
message_text = Text(message)
message_text.truncate(55, overflow="ellipsis")
message_text.truncate(45, overflow="ellipsis")
location_text = Text(location_str)
location_text.truncate(20, overflow="ellipsis")
results_table.add_row(
severity_text,
rule_text,
rule_id, # Pass string directly to show full UUID
message_text,
location_text
)
@@ -307,16 +439,20 @@ def findings_history(
def export_findings(
run_id: str = typer.Argument(..., help="Run ID to export findings for"),
format: str = typer.Option(
"json", "--format", "-f",
help="Export format: json, csv, html, sarif"
"sarif", "--format", "-f",
help="Export format: sarif (standard), json, csv, html"
),
output: Optional[str] = typer.Option(
None, "--output", "-o",
help="Output file path (defaults to findings-<run-id>.<format>)"
help="Output file path (defaults to findings-<run-id>-<timestamp>.<format>)"
)
):
"""
📤 Export security findings in various formats
SARIF is the standard format for security findings and is recommended
for interoperability with other security tools. Filenames are automatically
made unique with timestamps to prevent overwriting previous exports.
"""
db = get_project_db()
if not db:
@@ -334,9 +470,10 @@ def export_findings(
else:
sarif_data = findings_data.sarif_data
# Generate output filename
# Generate output filename with timestamp for uniqueness
if not output:
output = f"findings-{run_id[:8]}.{format}"
timestamp = datetime.now().strftime("%Y%m%d-%H%M%S")
output = f"findings-{run_id[:8]}-{timestamp}.{format}"
output_path = Path(output)
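The timestamped default filename introduced above can be reproduced in isolation; a small sketch (helper name assumed):

```python
from datetime import datetime
from pathlib import Path

def default_export_path(run_id: str, fmt: str) -> Path:
    """findings-<short-run-id>-<timestamp>.<format>, as generated by export_findings."""
    timestamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    return Path(f"findings-{run_id[:8]}-{timestamp}.{fmt}")
```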


@@ -187,19 +187,40 @@ def _ensure_env_file(fuzzforge_dir: Path, force: bool) -> None:
console.print("🧠 Configuring AI environment...")
console.print(" • Default LLM provider: openai")
console.print(" • Default LLM model: gpt-5-mini")
console.print(" • Default LLM model: litellm_proxy/gpt-5-mini")
console.print(" • To customise provider/model later, edit .fuzzforge/.env")
llm_provider = "openai"
llm_model = "gpt-5-mini"
llm_model = "litellm_proxy/gpt-5-mini"
# Check for global virtual keys from volumes/env/.env
global_env_key = None
for parent in fuzzforge_dir.parents:
global_env = parent / "volumes" / "env" / ".env"
if global_env.exists():
try:
for line in global_env.read_text(encoding="utf-8").splitlines():
if line.strip().startswith("OPENAI_API_KEY=") and "=" in line:
key_value = line.split("=", 1)[1].strip()
if key_value and not key_value.startswith("your-") and key_value.startswith("sk-"):
global_env_key = key_value
console.print(f" • Found virtual key in {global_env.relative_to(parent)}")
break
except Exception:
pass
break
api_key = Prompt.ask(
"OpenAI API key (leave blank to fill manually)",
"OpenAI API key (leave blank to use global virtual key)" if global_env_key else "OpenAI API key (leave blank to fill manually)",
default="",
show_default=False,
console=console,
)
# Use global key if user didn't provide one
if not api_key and global_env_key:
api_key = global_env_key
session_db_path = fuzzforge_dir / "fuzzforge_sessions.db"
session_db_rel = session_db_path.relative_to(fuzzforge_dir.parent)
@@ -210,14 +231,20 @@ def _ensure_env_file(fuzzforge_dir: Path, force: bool) -> None:
f"LLM_PROVIDER={llm_provider}",
f"LLM_MODEL={llm_model}",
f"LITELLM_MODEL={llm_model}",
"LLM_ENDPOINT=http://localhost:10999",
"LLM_API_KEY=",
"LLM_EMBEDDING_MODEL=litellm_proxy/text-embedding-3-large",
"LLM_EMBEDDING_ENDPOINT=http://localhost:10999",
f"OPENAI_API_KEY={api_key}",
"FUZZFORGE_MCP_URL=http://localhost:8010/mcp",
"",
"# Cognee configuration mirrors the primary LLM by default",
f"LLM_COGNEE_PROVIDER={llm_provider}",
f"LLM_COGNEE_MODEL={llm_model}",
f"LLM_COGNEE_API_KEY={api_key}",
"LLM_COGNEE_ENDPOINT=",
"LLM_COGNEE_ENDPOINT=http://localhost:10999",
"LLM_COGNEE_API_KEY=",
"LLM_COGNEE_EMBEDDING_MODEL=litellm_proxy/text-embedding-3-large",
"LLM_COGNEE_EMBEDDING_ENDPOINT=http://localhost:10999",
"COGNEE_MCP_URL=",
"",
"# Session persistence options: inmemory | sqlite",
@@ -239,6 +266,8 @@ def _ensure_env_file(fuzzforge_dir: Path, force: bool) -> None:
for line in env_lines:
if line.startswith("OPENAI_API_KEY="):
template_lines.append("OPENAI_API_KEY=")
elif line.startswith("LLM_API_KEY="):
template_lines.append("LLM_API_KEY=")
elif line.startswith("LLM_COGNEE_API_KEY="):
template_lines.append("LLM_COGNEE_API_KEY=")
else:


@@ -59,66 +59,6 @@ def format_number(num: int) -> str:
return str(num)
@app.command("stats")
def fuzzing_stats(
run_id: str = typer.Argument(..., help="Run ID to get statistics for"),
refresh: int = typer.Option(
5, "--refresh", "-r",
help="Refresh interval in seconds"
),
once: bool = typer.Option(
False, "--once",
help="Show stats once and exit"
)
):
"""
📊 Show current fuzzing statistics for a run
"""
try:
with get_client() as client:
if once:
# Show stats once
stats = client.get_fuzzing_stats(run_id)
display_stats_table(stats)
else:
# Live updating stats
console.print(f"📊 [bold]Live Fuzzing Statistics[/bold] (Run: {run_id[:12]}...)")
console.print(f"Refreshing every {refresh}s. Press Ctrl+C to stop.\n")
with Live(auto_refresh=False, console=console) as live:
while True:
try:
# Check workflow status
run_status = client.get_run_status(run_id)
stats = client.get_fuzzing_stats(run_id)
table = create_stats_table(stats)
live.update(table, refresh=True)
# Exit if workflow completed or failed
if getattr(run_status, 'is_completed', False) or getattr(run_status, 'is_failed', False):
final_status = getattr(run_status, 'status', 'Unknown')
if getattr(run_status, 'is_completed', False):
console.print("\n✅ [bold green]Workflow completed[/bold green]", style="green")
else:
console.print(f"\n⚠️ [bold yellow]Workflow ended[/bold yellow] | Status: {final_status}", style="yellow")
break
time.sleep(refresh)
except KeyboardInterrupt:
console.print("\n📊 Monitoring stopped", style="yellow")
break
except Exception as e:
console.print(f"❌ Failed to get fuzzing stats: {e}", style="red")
raise typer.Exit(1)
def display_stats_table(stats):
"""Display stats in a simple table"""
table = create_stats_table(stats)
console.print(table)
def create_stats_table(stats) -> Panel:
"""Create a rich table for fuzzing statistics"""
# Create main stats table
@@ -266,8 +206,8 @@ def crash_reports(
raise typer.Exit(1)
def _live_monitor(run_id: str, refresh: int):
"""Helper for live monitoring with inline real-time display"""
def _live_monitor(run_id: str, refresh: int, once: bool = False, style: str = "inline"):
"""Helper for live monitoring with inline real-time display or table display"""
with get_client() as client:
start_time = time.time()
@@ -319,16 +259,29 @@ def _live_monitor(run_id: str, refresh: int):
self.elapsed_time = 0
self.last_crash_time = None
with Live(auto_refresh=False, console=console) as live:
# Initial fetch
try:
run_status = client.get_run_status(run_id)
stats = client.get_fuzzing_stats(run_id)
except Exception:
stats = FallbackStats(run_id)
run_status = type("RS", (), {"status":"Unknown","is_completed":False,"is_failed":False})()
# Initial fetch
try:
run_status = client.get_run_status(run_id)
stats = client.get_fuzzing_stats(run_id)
except Exception:
stats = FallbackStats(run_id)
run_status = type("RS", (), {"status":"Unknown","is_completed":False,"is_failed":False})()
live.update(render_inline_stats(run_status, stats), refresh=True)
# Handle --once mode: show stats once and exit
if once:
if style == "table":
console.print(create_stats_table(stats))
else:
console.print(render_inline_stats(run_status, stats))
return
# Live monitoring mode
with Live(auto_refresh=False, console=console) as live:
# Render based on style
if style == "table":
live.update(create_stats_table(stats), refresh=True)
else:
live.update(render_inline_stats(run_status, stats), refresh=True)
# Polling loop
consecutive_errors = 0
@@ -354,8 +307,11 @@ def _live_monitor(run_id: str, refresh: int):
except Exception:
stats = FallbackStats(run_id)
# Update display
live.update(render_inline_stats(run_status, stats), refresh=True)
# Update display based on style
if style == "table":
live.update(create_stats_table(stats), refresh=True)
else:
live.update(render_inline_stats(run_status, stats), refresh=True)
# Check if completed
if getattr(run_status, 'is_completed', False) or getattr(run_status, 'is_failed', False):
@@ -386,17 +342,36 @@ def live_monitor(
refresh: int = typer.Option(
2, "--refresh", "-r",
help="Refresh interval in seconds"
),
once: bool = typer.Option(
False, "--once",
help="Show stats once and exit"
),
style: str = typer.Option(
"inline", "--style",
help="Display style: 'inline' (default) or 'table'"
)
):
"""
📺 Real-time inline monitoring with live statistics updates
📺 Real-time monitoring with live statistics updates
Display styles:
- inline: Visual inline display with emojis (default)
- table: Clean table-based display
Use --once to show stats once without continuous monitoring (useful for scripts)
"""
try:
_live_monitor(run_id, refresh)
# Validate style
if style not in ["inline", "table"]:
console.print(f"❌ Invalid style: {style}. Must be 'inline' or 'table'", style="red")
raise typer.Exit(1)
_live_monitor(run_id, refresh, once, style)
except KeyboardInterrupt:
console.print("\n\n📊 Monitoring stopped by user.", style="yellow")
except Exception as e:
console.print(f"\n❌ Failed to start live monitoring: {e}", style="red")
console.print(f"\n❌ Failed to start monitoring: {e}", style="red")
raise typer.Exit(1)
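The `--style` option dispatches between two renderers after validation. A minimal sketch of that dispatch with stand-in renderers (the real `render_inline_stats`/`create_stats_table` are Rich renderables and far richer):

```python
def render_stats(stats: dict, style: str = "inline") -> str:
    """Dispatch on display style, mirroring the --style option above."""
    renderers = {
        "inline": lambda s: " | ".join(f"{k}={v}" for k, v in s.items()),
        "table": lambda s: "\n".join(f"{k:<12} {v}" for k, v in s.items()),
    }
    if style not in renderers:
        raise ValueError(f"Invalid style: {style}. Must be 'inline' or 'table'")
    return renderers[style](stats)
```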
@@ -423,6 +398,5 @@ def monitor_callback(ctx: typer.Context):
console = Console()
console.print("📊 [bold cyan]Monitor Command[/bold cyan]")
console.print("\nAvailable subcommands:")
console.print(" • [cyan]ff monitor stats <run-id>[/cyan] - Show execution statistics")
console.print(" • [cyan]ff monitor live <run-id>[/cyan] - Monitor with live updates (supports --once, --style)")
console.print(" • [cyan]ff monitor crashes <run-id>[/cyan] - Show crash reports")
console.print(" • [cyan]ff monitor live <run-id>[/cyan] - Real-time inline monitoring")


@@ -0,0 +1,225 @@
"""
Worker management commands for FuzzForge CLI.
Provides commands to start, stop, and list Temporal workers.
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import subprocess
import sys
import typer
from pathlib import Path
from rich.console import Console
from rich.table import Table
from typing import Optional
from ..worker_manager import WorkerManager
console = Console()
app = typer.Typer(
name="worker",
help="🔧 Manage Temporal workers",
no_args_is_help=True,
)
@app.command("stop")
def stop_workers(
all: bool = typer.Option(
False, "--all",
help="Stop all workers (default behavior, flag for clarity)"
)
):
"""
🛑 Stop all running FuzzForge workers.
This command stops all worker containers using the proper Docker Compose
profile flag to ensure workers are actually stopped (since they're in profiles).
Examples:
$ ff worker stop
$ ff worker stop --all
"""
try:
worker_mgr = WorkerManager()
success = worker_mgr.stop_all_workers()
if success:
sys.exit(0)
else:
console.print("⚠️ Some workers may not have stopped properly", style="yellow")
sys.exit(1)
except Exception as e:
console.print(f"❌ Error: {e}", style="red")
sys.exit(1)
@app.command("list")
def list_workers(
all: bool = typer.Option(
False, "--all", "-a",
help="Show all workers (including stopped)"
)
):
"""
📋 List FuzzForge workers and their status.
By default, shows only running workers. Use --all to see all workers.
Examples:
$ ff worker list
$ ff worker list --all
"""
try:
# Get list of running workers
result = subprocess.run(
["docker", "ps", "--filter", "name=fuzzforge-worker-",
"--format", "{{.Names}}\t{{.Status}}\t{{.RunningFor}}"],
capture_output=True,
text=True,
check=False
)
running_workers = []
if result.stdout.strip():
for line in result.stdout.strip().splitlines():
parts = line.split('\t')
if len(parts) >= 3:
running_workers.append({
"name": parts[0].replace("fuzzforge-worker-", ""),
"status": "Running",
"uptime": parts[2]
})
# If --all, also get stopped workers
stopped_workers = []
if all:
result_all = subprocess.run(
["docker", "ps", "-a", "--filter", "name=fuzzforge-worker-",
"--format", "{{.Names}}\t{{.Status}}"],
capture_output=True,
text=True,
check=False
)
all_worker_names = set()
for line in result_all.stdout.strip().splitlines():
parts = line.split('\t')
if len(parts) >= 2:
worker_name = parts[0].replace("fuzzforge-worker-", "")
all_worker_names.add(worker_name)
# If not running, it's stopped
if not any(w["name"] == worker_name for w in running_workers):
stopped_workers.append({
"name": worker_name,
"status": "Stopped",
"uptime": "-"
})
# Display results
if not running_workers and not stopped_workers:
console.print(" No workers found", style="cyan")
console.print("\n💡 Start a worker with: [cyan]docker compose up -d worker-<name>[/cyan]")
console.print(" Or run a workflow, which auto-starts workers: [cyan]ff workflow run <workflow> <target>[/cyan]")
return
# Create table
table = Table(title="FuzzForge Workers", show_header=True, header_style="bold cyan")
table.add_column("Worker", style="cyan", no_wrap=True)
table.add_column("Status", style="green")
table.add_column("Uptime", style="dim")
# Add running workers
for worker in running_workers:
table.add_row(
worker["name"],
f"[green]●[/green] {worker['status']}",
worker["uptime"]
)
# Add stopped workers if --all
for worker in stopped_workers:
table.add_row(
worker["name"],
f"[red]●[/red] {worker['status']}",
worker["uptime"]
)
console.print(table)
# Summary
if running_workers:
console.print(f"\n{len(running_workers)} worker(s) running")
if stopped_workers:
console.print(f"⏹️ {len(stopped_workers)} worker(s) stopped")
except Exception as e:
console.print(f"❌ Error listing workers: {e}", style="red")
sys.exit(1)
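`ff worker list` parses tab-separated `docker ps` output (`{{.Names}}\t{{.Status}}\t{{.RunningFor}}`). The parsing step can be factored out and tested without Docker; a sketch (helper name hypothetical):

```python
from typing import Dict, List

def parse_docker_ps(output: str, prefix: str = "fuzzforge-worker-") -> List[Dict[str, str]]:
    """Parse tab-separated `docker ps` lines into worker dicts, stripping the name prefix."""
    workers = []
    for line in output.strip().splitlines():
        parts = line.split("\t")
        if len(parts) >= 3:
            workers.append({
                "name": parts[0].replace(prefix, ""),
                "status": "Running",
                "uptime": parts[2],
            })
    return workers
```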
@app.command("start")
def start_worker(
name: str = typer.Argument(
...,
help="Worker name (e.g., 'python', 'android', 'secrets')"
),
build: bool = typer.Option(
False, "--build",
help="Rebuild worker image before starting"
)
):
"""
🚀 Start a specific worker.
The worker name should be the vertical name (e.g., 'python', 'android', 'rust').
Examples:
$ ff worker start python
$ ff worker start android --build
"""
try:
service_name = f"worker-{name}"
console.print(f"🚀 Starting worker: [cyan]{service_name}[/cyan]")
# Build docker compose command
cmd = ["docker", "compose", "up", "-d"]
if build:
cmd.append("--build")
cmd.append(service_name)
result = subprocess.run(
cmd,
capture_output=True,
text=True,
check=False
)
if result.returncode == 0:
console.print(f"✅ Worker [cyan]{service_name}[/cyan] started successfully")
else:
console.print(f"❌ Failed to start worker: {result.stderr}", style="red")
console.print(
f"\n💡 Try manually: [yellow]docker compose up -d {service_name}[/yellow]",
style="dim"
)
sys.exit(1)
except Exception as e:
console.print(f"❌ Error: {e}", style="red")
sys.exit(1)
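The compose invocation assembled by `ff worker start` is easy to isolate; a sketch (helper name assumed):

```python
from typing import List

def build_compose_command(name: str, build: bool = False) -> List[str]:
    """Assemble the docker compose invocation used to start worker-<name>."""
    cmd = ["docker", "compose", "up", "-d"]
    if build:
        cmd.append("--build")
    cmd.append(f"worker-{name}")
    return cmd
```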
if __name__ == "__main__":
app()


@@ -39,7 +39,7 @@ from ..validation import (
)
from ..progress import step_progress
from ..constants import (
STATUS_EMOJIS, MAX_RUN_ID_DISPLAY_LENGTH, DEFAULT_VOLUME_MODE,
STATUS_EMOJIS, MAX_RUN_ID_DISPLAY_LENGTH,
PROGRESS_STEP_DELAYS, MAX_RETRIES, RETRY_DELAY, POLL_INTERVAL
)
from ..worker_manager import WorkerManager
@@ -112,7 +112,6 @@ def execute_workflow_submission(
workflow: str,
target_path: str,
parameters: Dict[str, Any],
volume_mode: str,
timeout: Optional[int],
interactive: bool
) -> Any:
@@ -160,13 +159,10 @@ def execute_workflow_submission(
except ValueError as e:
console.print(f"❌ Invalid {param_type}: {e}", style="red")
# Note: volume_mode is no longer used (Temporal uses MinIO storage)
# Show submission summary
console.print("\n🎯 [bold]Executing workflow:[/bold]")
console.print(f" Workflow: {workflow}")
console.print(f" Target: {target_path}")
console.print(f" Volume Mode: {volume_mode}")
if parameters:
console.print(f" Parameters: {len(parameters)} provided")
if timeout:
@@ -252,8 +248,6 @@ def execute_workflow_submission(
progress.next_step() # Submitting
submission = WorkflowSubmission(
target_path=target_path,
volume_mode=volume_mode,
parameters=parameters,
timeout=timeout
)
@@ -281,10 +275,6 @@ def execute_workflow(
None, "--param-file", "-f",
help="JSON file containing workflow parameters"
),
volume_mode: str = typer.Option(
DEFAULT_VOLUME_MODE, "--volume-mode", "-v",
help="Volume mount mode: ro (read-only) or rw (read-write)"
),
timeout: Optional[int] = typer.Option(
None, "--timeout", "-t",
help="Execution timeout in seconds"
@@ -365,7 +355,7 @@ def execute_workflow(
should_auto_start = auto_start if auto_start is not None else config.workers.auto_start_workers
should_auto_stop = auto_stop if auto_stop is not None else config.workers.auto_stop_workers
worker_container = None # Track for cleanup
worker_service = None # Track for cleanup
worker_mgr = None
wait_completed = False # Track if wait completed successfully
@@ -384,7 +374,6 @@ def execute_workflow(
)
# Ensure worker is running
worker_container = worker_info["worker_container"]
worker_service = worker_info.get("worker_service", f"worker-{worker_info['vertical']}")
if not worker_mgr.ensure_worker_running(worker_info, auto_start=should_auto_start):
console.print(
@@ -411,7 +400,7 @@ def execute_workflow(
response = execute_workflow_submission(
client, workflow, target_path, parameters,
volume_mode, timeout, interactive
timeout, interactive
)
console.print("✅ Workflow execution started!", style="green")
@@ -434,7 +423,7 @@ def execute_workflow(
# Don't fail the whole operation if database save fails
console.print(f"⚠️ Failed to save execution to database: {e}", style="yellow")
console.print(f"\n💡 Monitor progress: [bold cyan]fuzzforge monitor stats {response.run_id}[/bold cyan]")
console.print(f"\n💡 Monitor progress: [bold cyan]fuzzforge monitor live {response.run_id}[/bold cyan]")
console.print(f"💡 Check status: [bold cyan]fuzzforge workflow status {response.run_id}[/bold cyan]")
# Suggest --live for fuzzing workflows
@@ -454,14 +443,14 @@ def execute_workflow(
console.print("Press Ctrl+C to stop monitoring (execution continues in background).\n")
try:
from ..commands.monitor import live_monitor
# Import monitor command and run it
live_monitor(response.run_id, refresh=3)
from ..commands.monitor import _live_monitor
# Call helper function directly with proper parameters
_live_monitor(response.run_id, refresh=3, once=False, style="inline")
except KeyboardInterrupt:
console.print("\n⏹️ Live monitoring stopped (execution continues in background)", style="yellow")
except Exception as e:
console.print(f"⚠️ Failed to start live monitoring: {e}", style="yellow")
console.print(f"💡 You can still monitor manually: [bold cyan]fuzzforge monitor {response.run_id}[/bold cyan]")
console.print(f"💡 You can still monitor manually: [bold cyan]fuzzforge monitor live {response.run_id}[/bold cyan]")
# Wait for completion if requested
elif wait:
@@ -527,11 +516,11 @@ def execute_workflow(
handle_error(e, "executing workflow")
finally:
# Stop worker if auto-stop is enabled and wait completed
if should_auto_stop and worker_container and worker_mgr and wait_completed:
if should_auto_stop and worker_service and worker_mgr and wait_completed:
try:
console.print("\n🛑 Stopping worker (auto-stop enabled)...")
if worker_mgr.stop_worker(worker_container):
console.print(f"✅ Worker stopped: {worker_container}")
if worker_mgr.stop_worker(worker_service):
console.print(f"✅ Worker stopped: {worker_service}")
except Exception as e:
console.print(
f"⚠️ Failed to stop worker: {e}",
@@ -608,7 +597,7 @@ def workflow_status(
# Show next steps
if status.is_running:
console.print(f"\n💡 Monitor live: [bold cyan]fuzzforge monitor {execution_id}[/bold cyan]")
console.print(f"\n💡 Monitor live: [bold cyan]fuzzforge monitor live {execution_id}[/bold cyan]")
elif status.is_completed:
console.print(f"💡 View findings: [bold cyan]fuzzforge finding {execution_id}[/bold cyan]")
elif status.is_failed:
@@ -770,7 +759,7 @@ def retry_workflow(
except Exception as e:
console.print(f"⚠️ Failed to save execution to database: {e}", style="yellow")
console.print(f"\n💡 Monitor progress: [bold cyan]fuzzforge monitor stats {response.run_id}[/bold cyan]")
console.print(f"\n💡 Monitor progress: [bold cyan]fuzzforge monitor live {response.run_id}[/bold cyan]")
except Exception as e:
handle_error(e, "retrying workflow")


@@ -95,12 +95,6 @@ def complete_target_paths(incomplete: str) -> List[str]:
return []
def complete_volume_modes(incomplete: str) -> List[str]:
"""Auto-complete volume mount modes."""
modes = ["ro", "rw"]
return [mode for mode in modes if mode.startswith(incomplete)]
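Each completer here filters a candidate list by prefix; a generic sketch of that pattern (helper name hypothetical):

```python
from typing import List

def complete_prefix(options: List[str], incomplete: str) -> List[str]:
    """Generic prefix-based shell completion, as used by the completers above."""
    return [opt for opt in options if opt.startswith(incomplete)]
```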
def complete_export_formats(incomplete: str) -> List[str]:
"""Auto-complete export formats."""
formats = ["json", "csv", "html", "sarif"]
@@ -139,7 +133,6 @@ def complete_config_keys(incomplete: str) -> List[str]:
"api_url",
"api_timeout",
"default_workflow",
"default_volume_mode",
"project_name",
"data_retention_days",
"auto_save_findings",
@@ -164,11 +157,6 @@ TargetPathComplete = typer.Argument(
help="Target path (tab completion available)"
)
VolumeModetComplete = typer.Option(
autocompletion=complete_volume_modes,
help="Volume mode: ro or rw (tab completion available)"
)
ExportFormatComplete = typer.Option(
autocompletion=complete_export_formats,
help="Export format (tab completion available)"


@@ -28,6 +28,58 @@ try: # Optional dependency; fall back if not installed
except ImportError: # pragma: no cover - optional dependency
load_dotenv = None
def _load_env_file_if_exists(path: Path, override: bool = False) -> bool:
if not path.exists():
return False
# Always use manual parsing to handle empty values correctly
try:
for line in path.read_text(encoding="utf-8").splitlines():
stripped = line.strip()
if not stripped or stripped.startswith("#") or "=" not in stripped:
continue
key, value = stripped.split("=", 1)
key = key.strip()
value = value.strip()
if override:
# Only override if value is non-empty
if value:
os.environ[key] = value
else:
# Set if not already in environment and value is non-empty
if key not in os.environ and value:
os.environ[key] = value
return True
except Exception: # pragma: no cover - best effort fallback
return False
def _find_shared_env_file(project_dir: Path) -> Path | None:
for directory in [project_dir] + list(project_dir.parents):
candidate = directory / "volumes" / "env" / ".env"
if candidate.exists():
return candidate
return None
def load_project_env(project_dir: Optional[Path] = None) -> Path | None:
"""Load project-local env, falling back to shared volumes/env/.env."""
project_dir = Path(project_dir or Path.cwd())
shared_env = _find_shared_env_file(project_dir)
loaded_shared = False
if shared_env:
loaded_shared = _load_env_file_if_exists(shared_env, override=False)
project_env = project_dir / ".fuzzforge" / ".env"
if _load_env_file_if_exists(project_env, override=True):
return project_env
if loaded_shared:
return shared_env
return None
import yaml
from pydantic import BaseModel, Field
@@ -312,23 +364,7 @@ class ProjectConfigManager:
if not cognee.get("enabled", True):
return
# Load project-specific environment overrides from .fuzzforge/.env if available
env_file = self.project_dir / ".fuzzforge" / ".env"
if env_file.exists():
if load_dotenv:
load_dotenv(env_file, override=False)
else:
try:
for line in env_file.read_text(encoding="utf-8").splitlines():
stripped = line.strip()
if not stripped or stripped.startswith("#"):
continue
if "=" not in stripped:
continue
key, value = stripped.split("=", 1)
os.environ.setdefault(key.strip(), value.strip())
except Exception: # pragma: no cover - best effort fallback
pass
load_project_env(self.project_dir)
backend_access = "true" if cognee.get("backend_access_control", True) else "false"
os.environ["ENABLE_BACKEND_ACCESS_CONTROL"] = backend_access
@@ -374,6 +410,17 @@ class ProjectConfigManager:
"OPENAI_API_KEY",
)
endpoint = _env("LLM_COGNEE_ENDPOINT", "COGNEE_LLM_ENDPOINT", "LLM_ENDPOINT")
embedding_model = _env(
"LLM_COGNEE_EMBEDDING_MODEL",
"COGNEE_LLM_EMBEDDING_MODEL",
"LLM_EMBEDDING_MODEL",
)
embedding_endpoint = _env(
"LLM_COGNEE_EMBEDDING_ENDPOINT",
"COGNEE_LLM_EMBEDDING_ENDPOINT",
"LLM_EMBEDDING_ENDPOINT",
"LLM_ENDPOINT",
)
api_version = _env(
"LLM_COGNEE_API_VERSION",
"COGNEE_LLM_API_VERSION",
@@ -398,6 +445,20 @@ class ProjectConfigManager:
os.environ.setdefault("OPENAI_API_KEY", api_key)
if endpoint:
os.environ["LLM_ENDPOINT"] = endpoint
os.environ.setdefault("LLM_API_BASE", endpoint)
os.environ.setdefault("LLM_EMBEDDING_ENDPOINT", endpoint)
os.environ.setdefault("LLM_EMBEDDING_API_BASE", endpoint)
os.environ.setdefault("OPENAI_API_BASE", endpoint)
# Set LiteLLM proxy environment variables for SDK usage
os.environ.setdefault("LITELLM_PROXY_API_BASE", endpoint)
if api_key:
# Set LiteLLM proxy API key from the virtual key
os.environ.setdefault("LITELLM_PROXY_API_KEY", api_key)
if embedding_model:
os.environ["LLM_EMBEDDING_MODEL"] = embedding_model
if embedding_endpoint:
os.environ["LLM_EMBEDDING_ENDPOINT"] = embedding_endpoint
os.environ.setdefault("LLM_EMBEDDING_API_BASE", embedding_endpoint)
if api_version:
os.environ["LLM_API_VERSION"] = api_version
if max_tokens:


@@ -57,10 +57,6 @@ SEVERITY_STYLES = {
"info": "bold cyan"
}
# Default volume modes
DEFAULT_VOLUME_MODE = "ro"
SUPPORTED_VOLUME_MODES = ["ro", "rw"]
# Default export formats
DEFAULT_EXPORT_FORMAT = "sarif"
SUPPORTED_EXPORT_FORMATS = ["sarif", "json", "csv"]


@@ -52,7 +52,6 @@ class FuzzyMatcher:
# Common parameter names
self.parameter_names = [
"target_path",
"volume_mode",
"timeout",
"workflow",
"param",
@@ -70,7 +69,6 @@ class FuzzyMatcher:
# Common values
self.common_values = {
"volume_mode": ["ro", "rw"],
"format": ["json", "csv", "html", "sarif"],
"severity": ["critical", "high", "medium", "low", "info"],
}


@@ -19,6 +19,8 @@ from rich.traceback import install
from typing import Optional, List
import sys
from .config import load_project_env
from .commands import (
workflows,
workflow_exec,
@@ -27,13 +29,16 @@ from .commands import (
config as config_cmd,
ai,
ingest,
worker,
)
from .constants import DEFAULT_VOLUME_MODE
from .fuzzy import enhanced_command_not_found_handler
# Install rich traceback handler
install(show_locals=True)
# Ensure environment variables are available before command execution
load_project_env()
# Create console for rich output
console = Console()
@@ -184,10 +189,6 @@ def run_workflow(
None, "--param-file", "-f",
help="JSON file containing workflow parameters"
),
volume_mode: str = typer.Option(
DEFAULT_VOLUME_MODE, "--volume-mode", "-v",
help="Volume mount mode: ro (read-only) or rw (read-write)"
),
timeout: Optional[int] = typer.Option(
None, "--timeout", "-t",
help="Execution timeout in seconds"
@@ -234,7 +235,6 @@ def run_workflow(
target_path=target,
params=params,
param_file=param_file,
volume_mode=volume_mode,
timeout=timeout,
interactive=interactive,
wait=wait,
@@ -260,57 +260,17 @@ def workflow_main():
# === Finding commands (singular) ===
@finding_app.command("export")
def export_finding(
execution_id: Optional[str] = typer.Argument(None, help="Execution ID (defaults to latest)"),
format: str = typer.Option(
"sarif", "--format", "-f",
help="Export format: sarif, json, csv"
),
output: Optional[str] = typer.Option(
None, "--output", "-o",
help="Output file (defaults to stdout)"
)
@finding_app.command("show")
def show_finding_detail(
run_id: str = typer.Argument(..., help="Run ID to get finding from"),
rule_id: str = typer.Option(..., "--rule", "-r", help="Rule ID of the specific finding to show")
):
"""
📤 Export findings to file
🔍 Show detailed information about a specific finding
"""
from .commands.findings import export_findings
from .database import get_project_db
from .exceptions import require_project
from .commands.findings import show_finding
show_finding(run_id=run_id, rule_id=rule_id)
try:
require_project()
# If no ID provided, get the latest
if not execution_id:
db = get_project_db()
if db:
recent_runs = db.list_runs(limit=1)
if recent_runs:
execution_id = recent_runs[0].run_id
console.print(f"🔍 Using most recent execution: {execution_id}")
else:
console.print("⚠️ No findings found in project database", style="yellow")
return
else:
console.print("❌ No project database found", style="red")
return
export_findings(run_id=execution_id, format=format, output=output)
except Exception as e:
console.print(f"❌ Failed to export findings: {e}", style="red")
@finding_app.command("analyze")
def analyze_finding(
finding_id: Optional[str] = typer.Argument(None, help="Finding ID to analyze")
):
"""
🤖 AI analysis of a finding
"""
from .commands.ai import analyze_finding as ai_analyze
ai_analyze(finding_id)
@finding_app.callback(invoke_without_command=True)
def finding_main(
@@ -320,9 +280,9 @@ def finding_main(
View and analyze individual findings
Examples:
fuzzforge finding # Show latest finding
fuzzforge finding <id> # Show specific finding
fuzzforge finding export # Export latest findings
fuzzforge finding # Show latest finding
fuzzforge finding <id> # Show specific finding
fuzzforge finding show <run-id> --rule <id> # Show specific finding detail
"""
# Check if a subcommand is being invoked
if ctx.invoked_subcommand is not None:
@@ -375,6 +335,7 @@ app.add_typer(finding_app, name="finding", help="🔍 View and analyze findings"
app.add_typer(monitor.app, name="monitor", help="📊 Real-time monitoring")
app.add_typer(ai.app, name="ai", help="🤖 AI integration features")
app.add_typer(ingest.app, name="ingest", help="🧠 Ingest knowledge into AI")
app.add_typer(worker.app, name="worker", help="🔧 Manage Temporal workers")
# Help and utility commands
@app.command()
@@ -418,7 +379,7 @@ def main():
# Handle finding command with pattern recognition
if len(args) >= 2 and args[0] == 'finding':
finding_subcommands = ['export', 'analyze']
finding_subcommands = ['show']
# Skip custom dispatching if help flags are present
if not any(arg in ['--help', '-h', '--version', '-v'] for arg in args):
if args[1] not in finding_subcommands:
@@ -450,7 +411,7 @@ def main():
'init', 'status', 'config', 'clean',
'workflows', 'workflow',
'findings', 'finding',
'monitor', 'ai', 'ingest',
'monitor', 'ai', 'ingest', 'worker',
'version'
]


@@ -17,7 +17,7 @@ import re
from pathlib import Path
from typing import Any, Dict, List, Optional
from .constants import SUPPORTED_VOLUME_MODES, SUPPORTED_EXPORT_FORMATS
from .constants import SUPPORTED_EXPORT_FORMATS
from .exceptions import ValidationError
@@ -65,15 +65,6 @@ def validate_target_path(target_path: str, must_exist: bool = True) -> Path:
return path
def validate_volume_mode(volume_mode: str) -> None:
"""Validate volume mode"""
if volume_mode not in SUPPORTED_VOLUME_MODES:
raise ValidationError(
"volume_mode", volume_mode,
f"one of: {', '.join(SUPPORTED_VOLUME_MODES)}"
)
def validate_export_format(export_format: str) -> None:
"""Validate export format"""
if export_format not in SUPPORTED_EXPORT_FORMATS:


@@ -15,12 +15,17 @@ Manages on-demand startup and shutdown of Temporal workers using Docker Compose.
# Additional attribution and requirements are provided in the NOTICE file.
import logging
import os
import platform
import subprocess
import time
from pathlib import Path
from typing import Optional, Dict, Any
import requests
import yaml
from rich.console import Console
from rich.status import Status
logger = logging.getLogger(__name__)
console = Console()
@@ -57,27 +62,206 @@ class WorkerManager:
def _find_compose_file(self) -> Path:
"""
Auto-detect docker-compose.yml location.
Auto-detect docker-compose.yml location using multiple strategies.
Searches upward from current directory to find the compose file.
Strategies (in order):
1. Query backend API for host path
2. Search upward for .fuzzforge marker directory
3. Use FUZZFORGE_ROOT environment variable
4. Fallback to current directory
Returns:
Path to docker-compose.yml
Raises:
FileNotFoundError: If docker-compose.yml cannot be located
"""
current = Path.cwd()
# Strategy 1: Ask backend for location
try:
backend_url = os.getenv("FUZZFORGE_API_URL", "http://localhost:8000")
response = requests.get(f"{backend_url}/system/info", timeout=2)
if response.ok:
info = response.json()
if compose_path_str := info.get("docker_compose_path"):
compose_path = Path(compose_path_str)
if compose_path.exists():
logger.debug(f"Found docker-compose.yml via backend API: {compose_path}")
return compose_path
except Exception as e:
logger.debug(f"Backend API not reachable for path lookup: {e}")
# Try current directory and parents
# Strategy 2: Search upward for .fuzzforge marker directory
current = Path.cwd()
for parent in [current] + list(current.parents):
compose_path = parent / "docker-compose.yml"
if (parent / ".fuzzforge").exists():
compose_path = parent / "docker-compose.yml"
if compose_path.exists():
logger.debug(f"Found docker-compose.yml via .fuzzforge marker: {compose_path}")
return compose_path
# Strategy 3: Environment variable
if fuzzforge_root := os.getenv("FUZZFORGE_ROOT"):
compose_path = Path(fuzzforge_root) / "docker-compose.yml"
if compose_path.exists():
logger.debug(f"Found docker-compose.yml via FUZZFORGE_ROOT: {compose_path}")
return compose_path
# Fallback to default location
return Path("docker-compose.yml")
# Strategy 4: Fallback to current directory
compose_path = Path("docker-compose.yml")
if compose_path.exists():
return compose_path
def _run_docker_compose(self, *args: str) -> subprocess.CompletedProcess:
raise FileNotFoundError(
"Cannot find docker-compose.yml. Ensure backend is running, "
"run from FuzzForge directory, or set FUZZFORGE_ROOT environment variable."
)
def _get_workers_dir(self) -> Path:
"""
Run docker-compose command.
Get the workers directory path.
Uses same strategy as _find_compose_file():
1. Query backend API
2. Derive from compose_file location
3. Use FUZZFORGE_ROOT
Returns:
Path to workers directory
"""
# Strategy 1: Ask backend
try:
backend_url = os.getenv("FUZZFORGE_API_URL", "http://localhost:8000")
response = requests.get(f"{backend_url}/system/info", timeout=2)
if response.ok:
info = response.json()
if workers_dir_str := info.get("workers_dir"):
workers_dir = Path(workers_dir_str)
if workers_dir.exists():
return workers_dir
except Exception:
pass
# Strategy 2: Derive from compose file location
if self.compose_file.exists():
workers_dir = self.compose_file.parent / "workers"
if workers_dir.exists():
return workers_dir
# Strategy 3: Use environment variable
if fuzzforge_root := os.getenv("FUZZFORGE_ROOT"):
workers_dir = Path(fuzzforge_root) / "workers"
if workers_dir.exists():
return workers_dir
# Fallback
return Path("workers")
def _detect_platform(self) -> str:
"""
Detect the current platform.
Returns:
Platform string: "linux/amd64" or "linux/arm64"
"""
machine = platform.machine().lower()
system = platform.system().lower()
logger.debug(f"Platform detection: machine={machine}, system={system}")
# Normalize machine architecture
if machine in ["x86_64", "amd64", "x64"]:
detected = "linux/amd64"
elif machine in ["arm64", "aarch64", "armv8", "arm64v8"]:
detected = "linux/arm64"
else:
# Fallback to amd64 for unknown architectures
logger.warning(
f"Unknown architecture '{machine}' detected, falling back to linux/amd64. "
f"Please report this issue if you're experiencing problems."
)
detected = "linux/amd64"
logger.info(f"Detected platform: {detected}")
return detected
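`_detect_platform` above normalizes the various spellings `platform.machine()` can return into the two Docker platform strings the workers support. The mapping can be isolated as a pure function (same alias lists and same amd64 fallback as the code):

```python
def normalize_arch(machine: str) -> str:
    """Map a platform.machine() value to a Docker platform string."""
    machine = machine.lower()
    if machine in ("x86_64", "amd64", "x64"):
        return "linux/amd64"
    if machine in ("arm64", "aarch64", "armv8", "arm64v8"):
        return "linux/arm64"
    # Unknown architectures fall back to amd64, matching the warning path above
    return "linux/amd64"

print(normalize_arch("AArch64"))   # linux/arm64
print(normalize_arch("x86_64"))    # linux/amd64
print(normalize_arch("riscv64"))   # linux/amd64 (fallback)
```

Lower-casing first keeps the alias lists short, since `platform.machine()` is case-sensitive and varies by OS (e.g. `AMD64` on Windows, `arm64` on macOS, `aarch64` on Linux).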
def _read_worker_metadata(self, vertical: str) -> dict:
"""
Read worker metadata.yaml for a vertical.
Args:
*args: Arguments to pass to docker-compose
vertical: Worker vertical name (e.g., "android", "python")
Returns:
Dictionary containing metadata, or empty dict if not found
"""
try:
workers_dir = self._get_workers_dir()
metadata_file = workers_dir / vertical / "metadata.yaml"
if not metadata_file.exists():
logger.debug(f"No metadata.yaml found for {vertical}")
return {}
with open(metadata_file, 'r') as f:
return yaml.safe_load(f) or {}
except Exception as e:
logger.debug(f"Failed to read metadata for {vertical}: {e}")
return {}
def _select_dockerfile(self, vertical: str) -> str:
"""
Select the appropriate Dockerfile for the current platform.
Args:
vertical: Worker vertical name
Returns:
Dockerfile name (e.g., "Dockerfile.amd64", "Dockerfile.arm64")
"""
detected_platform = self._detect_platform()
metadata = self._read_worker_metadata(vertical)
if not metadata:
# No metadata: use default Dockerfile
logger.debug(f"No metadata for {vertical}, using Dockerfile")
return "Dockerfile"
platforms = metadata.get("platforms", {})
if not platforms:
# Metadata exists but no platform definitions
logger.debug(f"No platform definitions in metadata for {vertical}, using Dockerfile")
return "Dockerfile"
# Try detected platform first
if detected_platform in platforms:
dockerfile = platforms[detected_platform].get("dockerfile", "Dockerfile")
logger.info(f"✓ Selected {dockerfile} for {vertical} on {detected_platform}")
return dockerfile
# Fallback to default platform
default_platform = metadata.get("default_platform", "linux/amd64")
logger.warning(
f"Platform {detected_platform} not found in metadata for {vertical}, "
f"falling back to default: {default_platform}"
)
if default_platform in platforms:
dockerfile = platforms[default_platform].get("dockerfile", "Dockerfile.amd64")
logger.info(f"Using default platform {default_platform}: {dockerfile}")
return dockerfile
# Last resort: just use Dockerfile
logger.warning(f"No suitable Dockerfile found for {vertical}, using 'Dockerfile'")
return "Dockerfile"
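The selection logic above assumes each worker ships a `metadata.yaml` mapping platforms to Dockerfiles. A hypothetical metadata dict shaped like that file (field names — `platforms`, `default_platform`, `dockerfile` — are taken from the code; the values are illustrative) and the three-step lookup order it implies:

```python
# Hypothetical metadata, shaped like the metadata.yaml the code reads.
metadata = {
    "default_platform": "linux/amd64",
    "platforms": {
        "linux/amd64": {"dockerfile": "Dockerfile.amd64"},
        "linux/arm64": {"dockerfile": "Dockerfile.arm64"},
    },
}

def pick_dockerfile(metadata: dict, detected: str) -> str:
    """Detected platform first, then default_platform, then plain Dockerfile."""
    platforms = metadata.get("platforms", {})
    if detected in platforms:
        return platforms[detected].get("dockerfile", "Dockerfile")
    default = metadata.get("default_platform", "linux/amd64")
    if default in platforms:
        return platforms[default].get("dockerfile", "Dockerfile.amd64")
    return "Dockerfile"

print(pick_dockerfile(metadata, "linux/arm64"))    # Dockerfile.arm64
print(pick_dockerfile(metadata, "linux/riscv64"))  # Dockerfile.amd64 (default)
```

Each step degrades gracefully: a worker with no metadata, or metadata without a matching platform, still builds from a plain `Dockerfile` rather than failing.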
def _run_docker_compose(self, *args: str, env: Optional[Dict[str, str]] = None) -> subprocess.CompletedProcess:
"""
Run docker compose command with optional environment variables.
Args:
*args: Arguments to pass to docker compose
env: Optional environment variables to set
Returns:
CompletedProcess with result
@@ -85,27 +269,47 @@ class WorkerManager:
Raises:
subprocess.CalledProcessError: If command fails
"""
cmd = ["docker-compose", "-f", str(self.compose_file)] + list(args)
cmd = ["docker", "compose", "-f", str(self.compose_file)] + list(args)
logger.debug(f"Running: {' '.join(cmd)}")
# Merge with current environment
full_env = os.environ.copy()
if env:
full_env.update(env)
logger.debug(f"Environment overrides: {env}")
return subprocess.run(
cmd,
capture_output=True,
text=True,
check=True
check=True,
env=full_env
)
def is_worker_running(self, container_name: str) -> bool:
def _service_to_container_name(self, service_name: str) -> str:
"""
Check if a worker container is running.
Convert service name to container name based on docker-compose naming convention.
Args:
container_name: Name of the Docker container (e.g., "fuzzforge-worker-ossfuzz")
service_name: Docker Compose service name (e.g., "worker-python")
Returns:
Container name (e.g., "fuzzforge-worker-python")
"""
return f"fuzzforge-{service_name}"
def is_worker_running(self, service_name: str) -> bool:
"""
Check if a worker service is running.
Args:
service_name: Name of the Docker Compose service (e.g., "worker-ossfuzz")
Returns:
True if container is running, False otherwise
"""
try:
container_name = self._service_to_container_name(service_name)
result = subprocess.run(
["docker", "inspect", "-f", "{{.State.Running}}", container_name],
capture_output=True,
@@ -120,131 +324,282 @@ class WorkerManager:
logger.debug(f"Failed to check worker status: {e}")
return False
def start_worker(self, container_name: str) -> bool:
def start_worker(self, service_name: str) -> bool:
"""
Start a worker container using docker.
Start a worker service using docker-compose with platform-specific Dockerfile.
Args:
container_name: Name of the Docker container to start
service_name: Name of the Docker Compose service to start (e.g., "worker-android")
Returns:
True if started successfully, False otherwise
"""
try:
console.print(f"🚀 Starting worker: {container_name}")
# Extract vertical name from service name
vertical = service_name.replace("worker-", "")
# Use docker start directly (works with container name)
subprocess.run(
["docker", "start", container_name],
capture_output=True,
text=True,
check=True
# Detect platform and select appropriate Dockerfile
detected_platform = self._detect_platform()
dockerfile = self._select_dockerfile(vertical)
# Set environment variable for docker-compose
env_var_name = f"{vertical.upper()}_DOCKERFILE"
env = {env_var_name: dockerfile}
console.print(
f"🚀 Starting worker: {service_name} "
f"(platform: {detected_platform}, using {dockerfile})"
)
logger.info(f"Worker {container_name} started")
# Use docker compose up with --build to ensure the correct Dockerfile is used
self._run_docker_compose("up", "-d", "--build", service_name, env=env)
logger.info(f"Worker {service_name} started with {dockerfile}")
return True
except subprocess.CalledProcessError as e:
logger.error(f"Failed to start worker {container_name}: {e.stderr}")
logger.error(f"Failed to start worker {service_name}: {e.stderr}")
console.print(f"❌ Failed to start worker: {e.stderr}", style="red")
console.print(f"💡 Start the worker manually: docker compose up -d {service_name}", style="yellow")
return False
except Exception as e:
logger.error(f"Unexpected error starting worker {container_name}: {e}")
logger.error(f"Unexpected error starting worker {service_name}: {e}")
console.print(f"❌ Unexpected error: {e}", style="red")
return False
def wait_for_worker_ready(self, container_name: str, timeout: Optional[int] = None) -> bool:
def _get_container_state(self, service_name: str) -> str:
"""
Wait for a worker to be healthy and ready to process tasks.
Get the current state of a container (running, created, restarting, etc.).
Args:
container_name: Name of the Docker container
service_name: Name of the Docker Compose service
Returns:
Container state string (running, created, restarting, exited, etc.) or "unknown"
"""
try:
container_name = self._service_to_container_name(service_name)
result = subprocess.run(
["docker", "inspect", "-f", "{{.State.Status}}", container_name],
capture_output=True,
text=True,
check=False
)
if result.returncode == 0:
return result.stdout.strip()
return "unknown"
except Exception as e:
logger.debug(f"Failed to get container state: {e}")
return "unknown"
def _get_health_status(self, container_name: str) -> str:
"""
Get container health status.
Args:
container_name: Docker container name
Returns:
Health status: "healthy", "unhealthy", "starting", "none", or "unknown"
"""
try:
result = subprocess.run(
["docker", "inspect", "-f", "{{.State.Health.Status}}", container_name],
capture_output=True,
text=True,
check=False
)
if result.returncode != 0:
return "unknown"
health_status = result.stdout.strip()
if health_status == "<no value>" or health_status == "":
return "none" # No health check defined
return health_status # healthy, unhealthy, starting
except Exception as e:
logger.debug(f"Failed to check health: {e}")
return "unknown"
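`docker inspect -f '{{.State.Health.Status}}'` prints `<no value>` when a container defines no HEALTHCHECK; the method above folds that (and empty output) into the sentinel `"none"`. The normalization step on its own, as a sketch:

```python
def normalize_health(raw: str) -> str:
    """Fold docker inspect's '<no value>' / empty output into 'none'."""
    raw = raw.strip()
    if raw in ("<no value>", ""):
        return "none"          # no health check defined on the container
    return raw                 # healthy, unhealthy, or starting pass through

print(normalize_health("<no value>"))  # none
print(normalize_health("healthy"))     # healthy
```

Keeping `"none"` distinct from `"unknown"` (inspect failed) lets the wait loop treat a running container without a health check as ready, while still retrying when Docker itself could not be queried.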
def wait_for_worker_ready(self, service_name: str, timeout: Optional[int] = None) -> bool:
"""
Wait for a worker to be healthy and ready to process tasks.
Shows live progress updates during startup.
Args:
service_name: Name of the Docker Compose service
timeout: Maximum seconds to wait (uses instance default if not specified)
Returns:
True if worker is ready, False if timeout reached
Raises:
TimeoutError: If worker doesn't become ready within timeout
"""
timeout = timeout or self.startup_timeout
start_time = time.time()
container_name = self._service_to_container_name(service_name)
last_status_msg = ""
console.print("⏳ Waiting for worker to be ready...")
with Status("[bold cyan]Starting worker...", console=console, spinner="dots") as status:
while time.time() - start_time < timeout:
elapsed = int(time.time() - start_time)
# Get container state
container_state = self._get_container_state(service_name)
# Get health status
health_status = self._get_health_status(container_name)
# Build status message based on current state
if container_state == "created":
status_msg = f"[cyan]Worker starting... ({elapsed}s)[/cyan]"
elif container_state == "restarting":
status_msg = f"[yellow]Worker restarting... ({elapsed}s)[/yellow]"
elif container_state == "running":
if health_status == "starting":
status_msg = f"[cyan]Worker running, health check starting... ({elapsed}s)[/cyan]"
elif health_status == "unhealthy":
status_msg = f"[yellow]Worker running, health check: unhealthy ({elapsed}s)[/yellow]"
elif health_status == "healthy":
status_msg = f"[green]Worker healthy! ({elapsed}s)[/green]"
status.update(status_msg)
console.print(f"✅ Worker ready: {service_name} (took {elapsed}s)")
logger.info(f"Worker {service_name} is healthy (took {elapsed}s)")
return True
elif health_status == "none":
# No health check defined, assume ready
status_msg = f"[green]Worker running (no health check) ({elapsed}s)[/green]"
status.update(status_msg)
console.print(f"✅ Worker ready: {service_name} (took {elapsed}s)")
logger.info(f"Worker {service_name} is running, no health check (took {elapsed}s)")
return True
else:
status_msg = f"[cyan]Worker running ({elapsed}s)[/cyan]"
elif not container_state or container_state == "exited":
status_msg = f"[yellow]Waiting for container to start... ({elapsed}s)[/yellow]"
else:
status_msg = f"[cyan]Worker state: {container_state} ({elapsed}s)[/cyan]"
# Show helpful hints at certain intervals
if elapsed == 10:
status_msg += " [dim](pulling image if not cached)[/dim]"
elif elapsed == 30:
status_msg += " [dim](large images can take time)[/dim]"
elif elapsed == 60:
status_msg += " [dim](still working...)[/dim]"
# Update status if changed
if status_msg != last_status_msg:
status.update(status_msg)
last_status_msg = status_msg
logger.debug(f"Worker {service_name} - state: {container_state}, health: {health_status}")
while time.time() - start_time < timeout:
# Check if container is running
if not self.is_worker_running(container_name):
logger.debug(f"Worker {container_name} not running yet")
time.sleep(self.health_check_interval)
continue
# Check container health status
try:
result = subprocess.run(
["docker", "inspect", "-f", "{{.State.Health.Status}}", container_name],
capture_output=True,
text=True,
check=False
)
# Timeout reached
elapsed = int(time.time() - start_time)
logger.warning(f"Worker {service_name} did not become ready within {elapsed}s")
console.print(f"⚠️ Worker startup timeout after {elapsed}s", style="yellow")
console.print(f" Last state: {container_state}, health: {health_status}", style="dim")
return False
health_status = result.stdout.strip()
# If no health check is defined, assume healthy after running
if health_status == "<no value>" or health_status == "":
logger.info(f"Worker {container_name} is running (no health check)")
console.print(f"✅ Worker ready: {container_name}")
return True
if health_status == "healthy":
logger.info(f"Worker {container_name} is healthy")
console.print(f"✅ Worker ready: {container_name}")
return True
logger.debug(f"Worker {container_name} health: {health_status}")
except Exception as e:
logger.debug(f"Failed to check health: {e}")
time.sleep(self.health_check_interval)
elapsed = time.time() - start_time
logger.warning(f"Worker {container_name} did not become ready within {elapsed:.1f}s")
console.print(f"⚠️ Worker startup timeout after {elapsed:.1f}s", style="yellow")
return False
def stop_worker(self, container_name: str) -> bool:
def stop_worker(self, service_name: str) -> bool:
"""
Stop a worker container using docker.
Stop a worker service using docker-compose.
Args:
container_name: Name of the Docker container to stop
service_name: Name of the Docker Compose service to stop
Returns:
True if stopped successfully, False otherwise
"""
try:
console.print(f"🛑 Stopping worker: {container_name}")
console.print(f"🛑 Stopping worker: {service_name}")
# Use docker stop directly (works with container name)
subprocess.run(
["docker", "stop", container_name],
capture_output=True,
text=True,
check=True
)
# Use docker compose stop so the service can be restarted later
self._run_docker_compose("stop", service_name)
logger.info(f"Worker {container_name} stopped")
logger.info(f"Worker {service_name} stopped")
return True
except subprocess.CalledProcessError as e:
logger.error(f"Failed to stop worker {container_name}: {e.stderr}")
logger.error(f"Failed to stop worker {service_name}: {e.stderr}")
console.print(f"❌ Failed to stop worker: {e.stderr}", style="red")
return False
except Exception as e:
logger.error(f"Unexpected error stopping worker {container_name}: {e}")
logger.error(f"Unexpected error stopping worker {service_name}: {e}")
console.print(f"❌ Unexpected error: {e}", style="red")
return False
def stop_all_workers(self) -> bool:
"""
Stop all running FuzzForge worker containers.
This uses `docker stop` to stop worker containers individually,
avoiding the Docker Compose profile issue and preventing accidental
shutdown of core services.
Returns:
True if all workers stopped successfully, False otherwise
"""
try:
console.print("🛑 Stopping all FuzzForge workers...")
# Get list of all running worker containers
result = subprocess.run(
["docker", "ps", "--filter", "name=fuzzforge-worker-", "--format", "{{.Names}}"],
capture_output=True,
text=True,
check=False
)
running_workers = [name.strip() for name in result.stdout.splitlines() if name.strip()]
if not running_workers:
console.print("✓ No workers running")
return True
console.print(f"Found {len(running_workers)} running worker(s):")
for worker in running_workers:
console.print(f" - {worker}")
# Stop each worker container individually using docker stop
# This is safer than docker compose down and won't affect core services
failed_workers = []
for worker in running_workers:
try:
logger.info(f"Stopping {worker}...")
result = subprocess.run(
["docker", "stop", worker],
capture_output=True,
text=True,
check=True,
timeout=30
)
console.print(f" ✓ Stopped {worker}")
except subprocess.CalledProcessError as e:
logger.error(f"Failed to stop {worker}: {e.stderr}")
failed_workers.append(worker)
console.print(f" ✗ Failed to stop {worker}", style="red")
except subprocess.TimeoutExpired:
logger.error(f"Timeout stopping {worker}")
failed_workers.append(worker)
console.print(f" ✗ Timeout stopping {worker}", style="red")
if failed_workers:
console.print(f"\n⚠️ {len(failed_workers)} worker(s) failed to stop", style="yellow")
console.print("💡 Try manually: docker stop " + " ".join(failed_workers), style="dim")
return False
console.print("\n✅ All workers stopped")
logger.info("All workers stopped successfully")
return True
except Exception as e:
logger.error(f"Unexpected error stopping workers: {e}")
console.print(f"❌ Unexpected error: {e}", style="red")
return False
@@ -257,17 +612,18 @@ class WorkerManager:
Ensure a worker is running, starting it if necessary.
Args:
worker_info: Worker information dict from API (contains worker_container, etc.)
worker_info: Worker information dict from API (contains worker_service, etc.)
auto_start: Whether to automatically start the worker if not running
Returns:
True if worker is running, False otherwise
"""
container_name = worker_info["worker_container"]
# Get worker_service (docker-compose service name)
service_name = worker_info.get("worker_service", f"worker-{worker_info['vertical']}")
vertical = worker_info["vertical"]
# Check if already running
if self.is_worker_running(container_name):
if self.is_worker_running(service_name):
console.print(f"✓ Worker already running: {vertical}")
return True
@@ -279,8 +635,8 @@ class WorkerManager:
return False
# Start the worker
if not self.start_worker(container_name):
if not self.start_worker(service_name):
return False
# Wait for it to be ready
return self.wait_for_worker_ready(container_name)
return self.wait_for_worker_ready(service_name)
