Mirror of https://github.com/FuzzingLabs/fuzzforge_ai.git (synced 2026-04-12 05:38:30 +02:00)

# Compare commits: feat/skill...feat/artif (1 commit, SHA1 85420c2328)
## .github/ISSUE_TEMPLATE/bug_report.md (vendored, new file, 48 lines)

@@ -0,0 +1,48 @@
---
name: 🐛 Bug Report
about: Create a report to help us improve FuzzForge
title: "[BUG] "
labels: bug
assignees: ''
---

## Description

A clear and concise description of the bug you encountered.

## Environment

Please provide details about your environment:

- **OS**: (e.g., macOS 14.0, Ubuntu 22.04, Windows 11)
- **Python version**: (e.g., 3.9.7)
- **Docker version**: (e.g., 24.0.6)
- **FuzzForge version**: (e.g., 0.6.0)

## Steps to Reproduce

Clear steps to recreate the issue:

1. Go to '...'
2. Run command '...'
3. Click on '...'
4. See error

## Expected Behavior

A clear and concise description of what should happen.

## Actual Behavior

A clear and concise description of what actually happens.

## Logs

Please include relevant error messages and stack traces:

```
Paste logs here
```

## Screenshots

If applicable, add screenshots to help explain your problem.

## Additional Context

Add any other context about the problem here (workflow used, specific target, configuration, etc.).

---

💬 **Need help?** Join our [Discord Community](https://discord.com/invite/acqv9FVG) for real-time support.
## .github/ISSUE_TEMPLATE/config.yml (vendored, new file, 8 lines)

@@ -0,0 +1,8 @@
blank_issues_enabled: false
contact_links:
  - name: 💬 Community Discord
    url: https://discord.com/invite/acqv9FVG
    about: Join our Discord to discuss ideas, workflows, and security research with the community.
  - name: 📖 Documentation
    url: https://github.com/FuzzingLabs/fuzzforge_ai/tree/main/docs
    about: Check our documentation for guides, tutorials, and API reference.
## .github/ISSUE_TEMPLATE/feature_request.md (vendored, new file, 38 lines)

@@ -0,0 +1,38 @@
---
name: ✨ Feature Request
about: Suggest an idea for FuzzForge
title: "[FEATURE] "
labels: enhancement
assignees: ''
---

## Use Case

Why is this feature needed? Describe the problem you're trying to solve or the improvement you'd like to see.

## Proposed Solution

How should it work? Describe your ideal solution in detail.

## Alternatives

What other approaches have you considered? List any alternative solutions or features you've thought about.

## Implementation

**(Optional)** Do you have any technical considerations or implementation ideas?

## Category

What area of FuzzForge would this feature enhance?

- [ ] 🤖 AI Agents for Security
- [ ] 🛠 Workflow Automation
- [ ] 📈 Vulnerability Research
- [ ] 🔗 Fuzzer Integration
- [ ] 🌐 Community Marketplace
- [ ] 🔒 Enterprise Features
- [ ] 📚 Documentation
- [ ] 🎯 Other

## Additional Context

Add any other context, screenshots, references, or examples about the feature request here.

---

💬 **Want to discuss this idea?** Join our [Discord Community](https://discord.com/invite/acqv9FVG) to collaborate with other contributors!
## .github/ISSUE_TEMPLATE/workflow_submission.md (vendored, new file, 67 lines)

@@ -0,0 +1,67 @@
---
name: 🔄 Workflow Submission
about: Contribute a security workflow or module to the FuzzForge community
title: "[WORKFLOW] "
labels: workflow, community
assignees: ''
---

## Workflow Name

Provide a short, descriptive name for your workflow.

## Description

Explain what this workflow does and what security problems it solves.

## Category

What type of security workflow is this?

- [ ] 🛡️ **Security Assessment** - Static analysis, vulnerability scanning
- [ ] 🔍 **Secret Detection** - Credential and secret scanning
- [ ] 🎯 **Fuzzing** - Dynamic testing and fuzz testing
- [ ] 🔄 **Reverse Engineering** - Binary analysis and decompilation
- [ ] 🌐 **Infrastructure Security** - Container, cloud, network security
- [ ] 🔒 **Penetration Testing** - Offensive security testing
- [ ] 📋 **Other** - Please describe

## Files

Please attach or provide links to your workflow files:

- [ ] `workflow.py` - Main Prefect flow implementation
- [ ] `Dockerfile` - Container definition
- [ ] `metadata.yaml` - Workflow metadata
- [ ] Test files or examples
- [ ] Documentation

## Testing

How did you test this workflow? Please describe:

- **Test targets used**: (e.g., vulnerable_app, custom test cases)
- **Expected outputs**: (e.g., SARIF format, specific vulnerabilities detected)
- **Validation results**: (e.g., X vulnerabilities found, Y false positives)

## SARIF Compliance

- [ ] My workflow outputs results in SARIF format
- [ ] Results include severity levels and descriptions
- [ ] Code flow information is provided where applicable

## Security Guidelines

- [ ] This workflow focuses on **defensive security** purposes only
- [ ] I have not included any malicious tools or capabilities
- [ ] All secrets/credentials are parameterized (no hardcoded values)
- [ ] I have followed responsible disclosure practices

## Registry Integration

Have you updated the workflow registry?

- [ ] Added import statement to `backend/toolbox/workflows/registry.py`
- [ ] Added registry entry with proper metadata
- [ ] Tested workflow registration and deployment

## Additional Notes

Anything else the maintainers should know about this workflow?

---

🚀 **Thank you for contributing to FuzzForge!** Your workflow will help the security community automate and scale their testing efforts.

💬 **Questions?** Join our [Discord Community](https://discord.com/invite/acqv9FVG) to discuss your contribution!
## .github/workflows/ci-python.yml (vendored, new file, 70 lines)

@@ -0,0 +1,70 @@
name: Python CI

# This is a dumb CI to ensure that the Python client and backend build correctly.
# It could be optimized to run faster, building, testing, and linting only changed code,
# but for now it is good enough. It runs on every push and PR to any branch.
# It also runs on demand.

on:
  workflow_dispatch:

  push:
    paths:
      - "ai/**"
      - "backend/**"
      - "cli/**"
      - "sdk/**"
      - "src/**"
  pull_request:
    paths:
      - "ai/**"
      - "backend/**"
      - "cli/**"
      - "sdk/**"
      - "src/**"

jobs:
  ci:
    name: ci
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v5

      - name: Setup uv
        uses: astral-sh/setup-uv@v6
        with:
          enable-cache: true

      - name: Set up Python
        run: uv python install

      # Validate no obvious issues.
      # Quick hack because the CLI returns a non-zero exit code when no args are provided.
      # The exit code is captured into `rc` immediately, because the `[ ... ]` test
      # itself overwrites `$?`.
      - name: Run base command
        run: |
          set +e
          uv run ff
          rc=$?
          if [ "$rc" -ne 2 ]; then
            echo "Expected exit code 2 from 'uv run ff', got $rc"
            exit 1
          fi

      - name: Build fuzzforge_ai package
        run: uv build

      - name: Build ai package
        working-directory: ai
        run: uv build

      - name: Build cli package
        working-directory: cli
        run: uv build

      - name: Build sdk package
        working-directory: sdk
        run: uv build

      - name: Build backend package
        working-directory: backend
        run: uv build
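The "Run base command" step above only works because `$?` is captured into a variable immediately; any later command (including the `[ ... ]` test itself) overwrites `$?`. A minimal sketch of the pattern, with `false` (exit code 1) standing in for `uv run ff`:

```shell
# Capture-then-test pattern: grab $? right after the command of interest.
set +e
false                 # stand-in for 'uv run ff'; exits with code 1
rc=$?                 # capture immediately -- the next command clobbers $?
if [ "$rc" -ne 1 ]; then
  echo "unexpected exit code $rc"
  exit 1
fi
echo "captured rc=$rc"
```

Testing `$?` directly twice, as in `if [ $? -ne 2 ]; then echo "got $?"`, reports the exit code of the `[ ... ]` test rather than of the original command.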
## .github/workflows/ci.yml (vendored, deleted, 86 lines)

@@ -1,86 +0,0 @@
name: CI

on:
  push:
    branches: [main, dev, feature/*]
  pull_request:
    branches: [main, dev]
  workflow_dispatch:

jobs:
  lint-and-typecheck:
    name: Lint & Type Check
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install uv
        uses: astral-sh/setup-uv@v5
        with:
          version: "latest"

      - name: Set up Python
        run: uv python install 3.14

      - name: Install dependencies
        run: uv sync

      - name: Ruff check (fuzzforge-cli)
        run: |
          cd fuzzforge-cli
          uv run --extra lints ruff check src/

      - name: Ruff check (fuzzforge-mcp)
        run: |
          cd fuzzforge-mcp
          uv run --extra lints ruff check src/

      - name: Ruff check (fuzzforge-common)
        run: |
          cd fuzzforge-common
          uv run --extra lints ruff check src/

      - name: Mypy type check (fuzzforge-cli)
        run: |
          cd fuzzforge-cli
          uv run --extra lints mypy src/

      - name: Mypy type check (fuzzforge-mcp)
        run: |
          cd fuzzforge-mcp
          uv run --extra lints mypy src/

      # NOTE: Mypy check for fuzzforge-common temporarily disabled
      # due to 37 pre-existing type errors in legacy code.
      # TODO: Fix type errors and re-enable strict checking
      #- name: Mypy type check (fuzzforge-common)
      #  run: |
      #    cd fuzzforge-common
      #    uv run --extra lints mypy src/

  test:
    name: Tests
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install uv
        uses: astral-sh/setup-uv@v5
        with:
          version: "latest"

      - name: Set up Python
        run: uv python install 3.14

      - name: Install dependencies
        run: uv sync --all-extras

      - name: Run MCP tests
        run: |
          cd fuzzforge-mcp
          uv run --extra tests pytest -v

      - name: Run common tests
        run: |
          cd fuzzforge-common
          uv run --extra tests pytest -v
## .github/workflows/docs-deploy.yml (vendored, new file, 57 lines)

@@ -0,0 +1,57 @@
name: Deploy Docusaurus to GitHub Pages

on:
  workflow_dispatch:

  push:
    branches:
      - master
    paths:
      - "docs/**"

jobs:
  build:
    name: Build Docusaurus
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: ./docs
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - uses: actions/setup-node@v4
        with:
          node-version: 24
          cache: npm
          cache-dependency-path: "**/package-lock.json"

      - name: Install dependencies
        run: npm ci
      - name: Build website
        run: npm run build

      - name: Upload Build Artifact
        uses: actions/upload-pages-artifact@v3
        with:
          path: ./docs/build

  deploy:
    name: Deploy to GitHub Pages
    needs: build

    # Grant GITHUB_TOKEN the permissions required to make a Pages deployment
    permissions:
      pages: write      # to deploy to Pages
      id-token: write   # to verify the deployment originates from an appropriate source

    # Deploy to the github-pages environment
    environment:
      name: github-pages
      url: ${{ steps.deployment.outputs.page_url }}

    runs-on: ubuntu-latest
    steps:
      - name: Deploy to GitHub Pages
        id: deployment
        uses: actions/deploy-pages@v4
## .github/workflows/docs-test-deploy.yml (vendored, new file, 33 lines)

@@ -0,0 +1,33 @@
name: Docusaurus test deployment

on:
  workflow_dispatch:

  push:
    paths:
      - "docs/**"
  pull_request:
    paths:
      - "docs/**"

jobs:
  test-deploy:
    name: Test deployment
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: ./docs
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - uses: actions/setup-node@v4
        with:
          node-version: 24
          cache: npm
          cache-dependency-path: "**/package-lock.json"

      - name: Install dependencies
        run: npm ci
      - name: Test build website
        run: npm run build
## .github/workflows/mcp-server.yml (vendored, deleted, 49 lines)

@@ -1,49 +0,0 @@
name: MCP Server Smoke Test

on:
  push:
    branches: [main, dev]
  pull_request:
    branches: [main, dev]
  workflow_dispatch:

jobs:
  mcp-server:
    name: MCP Server Test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install uv
        uses: astral-sh/setup-uv@v5
        with:
          version: "latest"

      - name: Set up Python
        run: uv python install 3.14

      - name: Install dependencies
        run: uv sync --all-extras

      - name: Start MCP server in background
        run: |
          cd fuzzforge-mcp
          nohup uv run python -m fuzzforge_mcp.server > server.log 2>&1 &
          echo $! > server.pid
          sleep 3

      - name: Run MCP tool tests
        run: |
          cd fuzzforge-mcp
          uv run --extra tests pytest tests/test_resources.py -v

      - name: Stop MCP server
        if: always()
        run: |
          if [ -f fuzzforge-mcp/server.pid ]; then
            kill $(cat fuzzforge-mcp/server.pid) || true
          fi

      - name: Show server logs
        if: failure()
        run: cat fuzzforge-mcp/server.log || true
## .gitignore (vendored, 298 lines)

@@ -1,15 +1,291 @@
*.egg-info
*.whl
# ========================================
# FuzzForge Platform .gitignore
# ========================================

# -------------------- Python --------------------
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/

# Environments
.env
.mypy_cache
.pytest_cache
.ruff_cache
.venv
.vscode
__pycache__
env/
venv/
ENV/
env.bak/
venv.bak/
.python-version

# Podman/Docker container storage artifacts
~/.fuzzforge/
# UV package manager
uv.lock
# But allow uv.lock in CLI and SDK for reproducible builds
!cli/uv.lock
!sdk/uv.lock
!backend/uv.lock

# User-specific hub config (generated at runtime)
hub-config.json
# MyPy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# -------------------- IDE / Editor --------------------
# VSCode
.vscode/
*.code-workspace

# PyCharm
.idea/

# Vim
*.swp
*.swo
*~

# Emacs
*~
\#*\#
/.emacs.desktop
/.emacs.desktop.lock
*.elc
auto-save-list
tramp
.\#*

# Sublime Text
*.sublime-project
*.sublime-workspace

# -------------------- Operating System --------------------
# macOS
.DS_Store
.AppleDouble
.LSOverride
Icon
._*
.DocumentRevisions-V100
.fseventsd
.Spotlight-V100
.TemporaryItems
.Trashes
.VolumeIcon.icns
.com.apple.timemachine.donotpresent
.AppleDB
.AppleDesktop
Network Trash Folder
Temporary Items
.apdisk

# Windows
Thumbs.db
Thumbs.db:encryptable
ehthumbs.db
ehthumbs_vista.db
*.stackdump
[Dd]esktop.ini
$RECYCLE.BIN/
*.cab
*.msi
*.msix
*.msm
*.msp
*.lnk

# Linux
*~
.fuse_hidden*
.directory
.Trash-*
.nfs*

# -------------------- Docker --------------------
# Docker volumes and data
docker-volumes/
.dockerignore.bak

# Docker Compose override files
docker-compose.override.yml
docker-compose.override.yaml

# -------------------- Database --------------------
# SQLite
*.sqlite
*.sqlite3
*.db
*.db-journal
*.db-shm
*.db-wal

# PostgreSQL
*.sql.backup

# -------------------- Logs --------------------
# General logs
*.log
logs/
*.log.*

# -------------------- FuzzForge Specific --------------------
# FuzzForge project directories (user projects should manage their own .gitignore)
.fuzzforge/

# Test project databases and configurations
test_projects/*/.fuzzforge/
test_projects/*/findings.db*
test_projects/*/config.yaml
test_projects/*/.gitignore

# Local development configurations
local_config.yaml
dev_config.yaml
.env.local
.env.development

# Generated reports and outputs
reports/
output/
findings/
*.sarif.json
*.html.report
security_report.*

# Temporary files
tmp/
temp/
*.tmp
*.temp

# Backup files
*.bak
*.backup
*~

# -------------------- Node.js (for any JS tooling) --------------------
node_modules/
npm-debug.log*
yarn-debug.log*
yarn-error.log*
.npm

# -------------------- Security --------------------
# Never commit these files
*.pem
*.key
*.p12
*.pfx
secret*
secrets/
credentials*
api_keys*
.env.production
.env.staging

# AWS credentials
.aws/

# -------------------- Build Artifacts --------------------
# Python builds
build/
dist/
*.wheel

# Documentation builds
docs/_build/
site/

# -------------------- Miscellaneous --------------------
# Jupyter Notebook checkpoints
.ipynb_checkpoints

# IPython history
.ipython/

# Rope project settings
.ropeproject

# spyderproject
.spyderproject
.spyproject

# mkdocs documentation
/site

# Local Netlify folder
.netlify

# -------------------- Project Specific Overrides --------------------
# Allow specific test project files that should be tracked
!test_projects/*/src/
!test_projects/*/scripts/
!test_projects/*/config/
!test_projects/*/data/
!test_projects/*/README.md
!test_projects/*/*.py
!test_projects/*/*.js
!test_projects/*/*.php
!test_projects/*/*.java

# But exclude their sensitive content
test_projects/*/.env
test_projects/*/private_key.pem
test_projects/*/wallet.json
test_projects/*/.npmrc
test_projects/*/.git-credentials
test_projects/*/credentials.*
test_projects/*/api_keys.*
@@ -1 +0,0 @@
3.14.2
## CONTRIBUTING.md (520 lines)

@@ -1,21 +1,17 @@
# Contributing to FuzzForge AI
# Contributing to FuzzForge 🤝

Thank you for your interest in contributing to FuzzForge AI! We welcome contributions from the community and are excited to collaborate with you.
Thank you for your interest in contributing to FuzzForge! We welcome contributions from the community and are excited to collaborate with you.

**Our Vision**: FuzzForge aims to be a **universal platform for security research** across all cybersecurity domains. Through our modular architecture, any security tool—from fuzzing engines to cloud scanners, from mobile app analyzers to IoT security tools—can be integrated as a containerized module and controlled via AI agents.

## 🌟 Ways to Contribute
## Ways to Contribute

- 🐛 **Bug Reports** - Help us identify and fix issues
- 💡 **Feature Requests** - Suggest new capabilities and improvements
- 🔧 **Code Contributions** - Submit bug fixes, features, and enhancements
- 📚 **Documentation** - Improve guides, tutorials, and API documentation
- 🧪 **Testing** - Help test new features and report issues
- 🛡️ **Security Workflows** - Contribute new security analysis workflows

- **Security Modules** - Create modules for any cybersecurity domain (AppSec, NetSec, Cloud, IoT, etc.)
- **Bug Reports** - Help us identify and fix issues
- **Feature Requests** - Suggest new capabilities and improvements
- **Core Features** - Contribute to the MCP server, runner, or CLI
- **Documentation** - Improve guides, tutorials, and module documentation
- **Testing** - Help test new features and report issues
- **AI Integration** - Improve MCP tools and AI agent interactions
- **Tool Integrations** - Wrap existing security tools as FuzzForge modules

## Contribution Guidelines
## 📋 Contribution Guidelines

### Code Style

@@ -48,10 +44,9 @@ We use conventional commits for clear history:

**Examples:**
```
feat(modules): add cloud security scanner module
fix(mcp): resolve module listing timeout
docs(sdk): update module development guide
test(runner): add container execution tests
feat(workflows): add new static analysis workflow for Go
fix(api): resolve authentication timeout issue
docs(readme): update installation instructions
```

### Pull Request Process

@@ -70,14 +65,9 @@

3. **Test Your Changes**
   ```bash
   # Test modules
   FUZZFORGE_MODULES_PATH=./fuzzforge-modules uv run fuzzforge modules list

   # Run a module
   uv run fuzzforge modules run your-module --assets ./test-assets

   # Test MCP integration (if applicable)
   uv run fuzzforge mcp status
   # Test workflows
   cd test_projects/vulnerable_app/
   ff workflow security_assessment .
   ```

4. **Submit Pull Request**
@@ -86,353 +76,64 @@

   - Link related issues using `Fixes #123` or `Closes #123`
   - Ensure all CI checks pass

## Module Development
## 🛡️ Security Workflow Development

FuzzForge uses a modular architecture where security tools run as isolated containers. The `fuzzforge-modules-sdk` provides everything you need to create new modules.

### Creating New Workflows

**Documentation:**
- [Module SDK Documentation](fuzzforge-modules/fuzzforge-modules-sdk/README.md) - Complete SDK reference
- [Module Template](fuzzforge-modules/fuzzforge-module-template/) - Starting point for new modules
- [USAGE Guide](USAGE.md) - Setup and installation instructions

### Creating a New Module

1. **Use the Module Template**
   ```bash
   # Generate a new module from template
   cd fuzzforge-modules/
   cp -r fuzzforge-module-template my-new-module
   cd my-new-module
   ```

1. **Workflow Structure**
   ```
   backend/toolbox/workflows/your_workflow/
   ├── __init__.py
   ├── workflow.py      # Main Prefect flow
   ├── metadata.yaml    # Workflow metadata
   └── Dockerfile       # Container definition
   ```
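The workflow tree above lists a `metadata.yaml` without showing its contents. As a rough sketch only, such a file might look like the following; the field names here are guesses, not FuzzForge's actual schema, so check an existing workflow under `backend/toolbox/workflows/` for the real structure:

```yaml
# Illustrative only -- field names are assumptions, not the real schema.
name: your_workflow
version: "1.0.0"
description: One-line summary of what the workflow does
author: Your Name
tags:
  - static-analysis
```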
2. **Module Structure**
   ```
   my-new-module/
   ├── Dockerfile        # Container definition
   ├── Makefile          # Build commands
   ├── README.md         # Module documentation
   ├── pyproject.toml    # Python dependencies
   ├── mypy.ini          # Type checking config
   ├── ruff.toml         # Linting config
   └── src/
       └── module/
           ├── __init__.py
           ├── __main__.py  # Entry point
           ├── mod.py       # Main module logic
           ├── models.py    # Pydantic models
           └── settings.py  # Configuration
   ```

3. **Implement Your Module**

   Edit `src/module/mod.py`:
2. **Register Your Workflow**
   Add your workflow to `backend/toolbox/workflows/registry.py`:
   ```python
   from fuzzforge_modules_sdk.api.modules import BaseModule
   from fuzzforge_modules_sdk.api.models import ModuleResult
   from .models import MyModuleConfig, MyModuleOutput

   class MyModule(BaseModule[MyModuleConfig, MyModuleOutput]):
       """Your module description."""

       def execute(self) -> ModuleResult[MyModuleOutput]:
           """Main execution logic."""
           # Access input assets
           assets = self.input_path

           # Your security tool logic here
           results = self.run_analysis(assets)

           # Return structured results
           return ModuleResult(
               success=True,
               output=MyModuleOutput(
                   findings=results,
                   summary="Analysis complete"
               )
           )
   # Import your workflow
   from .your_workflow.workflow import main_flow as your_workflow_flow

   # Add to registry
   WORKFLOW_REGISTRY["your_workflow"] = {
       "flow": your_workflow_flow,
       "module_path": "toolbox.workflows.your_workflow.workflow",
       "function_name": "main_flow",
       "description": "Description of your workflow",
       "version": "1.0.0",
       "author": "Your Name",
       "tags": ["tag1", "tag2"]
   }
   ```
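Stripped of FuzzForge specifics, the registry entry above is just a dict mapping a workflow name to a flow callable plus metadata. A self-contained sketch of that shape (every name here is illustrative; the real registry lives in `backend/toolbox/workflows/registry.py`):

```python
# Toy registry mirroring the shape of WORKFLOW_REGISTRY above.
# Illustrative only -- not FuzzForge's actual API.
WORKFLOW_REGISTRY: dict[str, dict] = {}

def register(name: str, flow, description: str, version: str = "1.0.0") -> None:
    """Add a flow callable plus metadata under a unique name."""
    WORKFLOW_REGISTRY[name] = {
        "flow": flow,
        "description": description,
        "version": version,
    }

def example_flow(target: str) -> str:
    """Stand-in for a Prefect flow function."""
    return f"scanned {target}"

register("example_workflow", example_flow, "Toy workflow")
result = WORKFLOW_REGISTRY["example_workflow"]["flow"]("vulnerable_app")
print(result)  # → scanned vulnerable_app
```

Keying by name like this is what lets the backend look up and deploy a workflow from its string identifier at runtime.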
4. **Define Configuration Models**

   Edit `src/module/models.py`:
   ```python
   from pydantic import BaseModel, Field
   from fuzzforge_modules_sdk.api.models import BaseModuleConfig, BaseModuleOutput

   class MyModuleConfig(BaseModuleConfig):
       """Configuration for your module."""
       timeout: int = Field(default=300, description="Timeout in seconds")
       max_iterations: int = Field(default=1000, description="Max iterations")

   class MyModuleOutput(BaseModuleOutput):
       """Output from your module."""
       findings: list[dict] = Field(default_factory=list)
       coverage: float = Field(default=0.0)
   ```

5. **Build Your Module**
   ```bash
   # Build the SDK first (if not already done)
   cd ../fuzzforge-modules-sdk
   uv build
   mkdir -p .wheels
   cp ../../dist/fuzzforge_modules_sdk-*.whl .wheels/
   cd ../..
   docker build -t localhost/fuzzforge-modules-sdk:0.1.0 fuzzforge-modules/fuzzforge-modules-sdk/

   # Build your module
   cd fuzzforge-modules/my-new-module
   docker build -t fuzzforge-my-new-module:0.1.0 .
   ```

6. **Test Your Module**
   ```bash
   # Run with test assets
   uv run fuzzforge modules run my-new-module --assets ./test-assets

   # Check module info
   uv run fuzzforge modules info my-new-module
   ```

### Module Development Guidelines

**Important Conventions:**
- **Input/Output**: Use `/fuzzforge/input` for assets and `/fuzzforge/output` for results
- **Configuration**: Support JSON configuration via stdin or file
- **Logging**: Use structured logging (structlog is pre-configured)
- **Error Handling**: Return proper exit codes and error messages
- **Security**: Run as non-root user when possible
- **Documentation**: Include clear README with usage examples
- **Dependencies**: Minimize container size, use multi-stage builds

**See also:**
- [Module SDK API Reference](fuzzforge-modules/fuzzforge-modules-sdk/src/fuzzforge_modules_sdk/api/)
- [Dockerfile Best Practices](https://docs.docker.com/develop/develop-images/dockerfile_best-practices/)
### Module Types

FuzzForge is designed to support modules across **all cybersecurity domains**. The modular architecture allows any security tool to be containerized and integrated. Here are the main categories:

**Application Security**
- Fuzzing engines (coverage-guided, grammar-based, mutation-based)
- Static analysis (SAST, code quality, dependency scanning)
- Dynamic analysis (DAST, runtime analysis, instrumentation)
- Test validation and coverage analysis
- Crash analysis and exploit detection

**Network & Infrastructure Security**
- Network scanning and service enumeration
- Protocol analysis and fuzzing
- Firewall and configuration testing
- Cloud security (AWS/Azure/GCP misconfiguration detection, IAM analysis)
- Container security (image scanning, Kubernetes security)

**Web & API Security**
- Web vulnerability scanners (XSS, SQL injection, CSRF)
- Authentication and session testing
- API security (REST/GraphQL/gRPC testing, fuzzing)
- SSL/TLS analysis

**Binary & Reverse Engineering**
- Binary analysis and disassembly
- Malware sandboxing and behavior analysis
- Exploit development tools
- Firmware extraction and analysis

**Mobile & IoT Security**
- Mobile app analysis (Android/iOS static/dynamic analysis)
- IoT device security and firmware analysis
- SCADA/ICS and industrial protocol testing
- Automotive security (CAN bus, ECU testing)

**Data & Compliance**
- Database security testing
- Encryption and cryptography analysis
- Secrets and credential detection
- Privacy tools (PII detection, GDPR compliance)
- Compliance checkers (PCI-DSS, HIPAA, SOC2, ISO27001)

**Threat Intelligence & Risk**
- OSINT and reconnaissance tools
- Threat hunting and IOC correlation
- Risk assessment and attack surface mapping
- Security audit and policy validation

**Emerging Technologies**
- AI/ML security (model poisoning, adversarial testing)
- Blockchain and smart contract analysis
- Quantum-safe cryptography testing

**Custom & Integration**
- Domain-specific security tools
- Bridges to existing security tools
- Multi-tool orchestration and result aggregation
### Example: Simple Security Scanner Module

```python
# src/module/mod.py
from pathlib import Path
from fuzzforge_modules_sdk.api.modules import BaseModule
from fuzzforge_modules_sdk.api.models import ModuleResult
from .models import ScannerConfig, ScannerOutput


class SecurityScanner(BaseModule[ScannerConfig, ScannerOutput]):
    """Scans for common security issues in code."""

    def execute(self) -> ModuleResult[ScannerOutput]:
        findings = []
        files_scanned = 0

        # Scan all source files (count only regular files, not directories)
        for file_path in self.input_path.rglob("*"):
            if file_path.is_file():
                files_scanned += 1
                findings.extend(self.scan_file(file_path))

        return ModuleResult(
            success=True,
            output=ScannerOutput(
                findings=findings,
                files_scanned=files_scanned,
            ),
        )

    def scan_file(self, path: Path) -> list[dict]:
        """Scan a single file for security issues."""
        # Your scanning logic here
        return []
```
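The `ScannerConfig` and `ScannerOutput` models imported above live in `src/module/models.py`. Here is a minimal sketch of what they might look like, written with stdlib dataclasses for illustration; the real SDK base classes and field names may differ:

```python
# src/module/models.py -- illustrative shape for the scanner's models.
# Sketched with stdlib dataclasses; the actual SDK may use its own base types.
from dataclasses import dataclass, field


@dataclass
class ScannerConfig:
    """User-facing configuration for the scanner module (fields are examples)."""
    timeout: int = 300              # seconds before the scan is aborted
    max_file_size: int = 1_000_000  # skip files larger than this (bytes)


@dataclass
class ScannerOutput:
    """Structured result returned by SecurityScanner.execute()."""
    findings: list = field(default_factory=list)
    files_scanned: int = 0
```

Keeping configuration and output in typed models like this is what lets the runner validate user input and serialize results consistently.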
### Testing Modules

Create tests in `tests/`:

```python
from pathlib import Path

import pytest
from module.mod import MyModule
from module.models import MyModuleConfig


def test_module_execution():
    config = MyModuleConfig(timeout=60)
    module = MyModule(config=config, input_path=Path("test_assets"))
    result = module.execute()

    assert result.success
    assert len(result.output.findings) >= 0
```

Run tests:
```bash
uv run pytest
```
3. **Testing Workflows**
   - Create test cases in `test_projects/vulnerable_app/`
   - Ensure SARIF output format compliance
   - Test with various input scenarios
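For the SARIF compliance point above, a quick structural sanity check in a test can catch malformed output early. This is a minimal sketch, not a full validator; the two required top-level keys checked here (`version` set to `"2.1.0"` and a `runs` array) come from the SARIF 2.1.0 specification:

```python
import json


def looks_like_sarif(text: str) -> bool:
    """Cheap structural check for SARIF 2.1.0 output (not a full validator)."""
    try:
        doc = json.loads(text)
    except json.JSONDecodeError:
        return False
    # SARIF 2.1.0 requires a version string and a list of runs at the top level
    return (
        isinstance(doc, dict)
        and doc.get("version") == "2.1.0"
        and isinstance(doc.get("runs"), list)
    )
```

A workflow test can then assert `looks_like_sarif(report_text)` before doing any deeper inspection of the findings.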
### Security Guidelines

**Critical Requirements:**
- Never commit secrets, API keys, or credentials
- Focus on **defensive security** tools and analysis
- Do not create tools for malicious purposes
- Test modules and workflows thoroughly before submission
- Follow responsible disclosure for security issues
- Use minimal, secure base images for containers
- Avoid running containers as root when possible

**Security Resources:**
- [OWASP Container Security](https://cheatsheetseries.owasp.org/cheatsheets/Docker_Security_Cheat_Sheet.html)
- [CIS Docker Benchmarks](https://www.cisecurity.org/benchmark/docker)
## Contributing to Core Features

Beyond modules, you can contribute to FuzzForge's core components.

**Useful Resources:**
- [Project Structure](README.md) - Overview of the codebase
- [USAGE Guide](USAGE.md) - Installation and setup
- Python best practices: [PEP 8](https://pep8.org/)

### Core Components

- **fuzzforge-mcp** - MCP server for AI agent integration
- **fuzzforge-runner** - Module execution engine
- **fuzzforge-cli** - Command-line interface
- **fuzzforge-common** - Shared utilities and sandbox engines
- **fuzzforge-types** - Type definitions and schemas
### Development Setup

1. **Clone and Install**
   ```bash
   git clone https://github.com/FuzzingLabs/fuzzforge_ai.git
   cd fuzzforge_ai
   uv sync --all-extras
   ```

2. **Run Tests**
   ```bash
   # Run all tests
   make test

   # Run specific package tests
   cd fuzzforge-mcp
   uv run pytest
   ```

3. **Type Checking**
   ```bash
   # Type check all packages
   make typecheck

   # Type check specific package
   cd fuzzforge-runner
   uv run mypy .
   ```

4. **Linting and Formatting**
   ```bash
   # Format code
   make format

   # Lint code
   make lint
   ```
## 🐛 Bug Reports

When reporting bugs, please include:

- **Environment**: OS, Python version, Docker version, uv version
- **FuzzForge Version**: Output of `uv run fuzzforge --version`
- **Module**: Which module or component is affected
- **Steps to Reproduce**: Clear steps to recreate the issue
- **Expected Behavior**: What should happen
- **Actual Behavior**: What actually happens
- **Logs**: Relevant error messages and stack traces
- **Container Logs**: For module issues, include Docker/Podman logs
- **Screenshots**: If applicable

Use our [Bug Report Template](.github/ISSUE_TEMPLATE/bug_report.md).

**Example:**
````markdown
**Environment:**
- OS: Ubuntu 22.04
- Python: 3.14.2
- Docker: 24.0.7
- uv: 0.5.13

**Module:** my-custom-scanner

**Steps to Reproduce:**
1. Run `uv run fuzzforge modules run my-scanner --assets ./test-target`
2. Module fails with timeout error

**Expected:** Module completes analysis
**Actual:** Times out after 30 seconds

**Logs:**
```
ERROR: Module execution timeout
...
```
````
## 💡 Feature Requests

For new features, please provide:

- **Proposed Solution**: How should it work?
- **Alternatives**: Other approaches considered
- **Implementation**: Technical considerations (optional)
- **Module vs Core**: Should this be a module or core feature?

**Example Feature Requests:**
- New module for cloud security posture management (CSPM)
- Module for analyzing smart contract vulnerabilities
- MCP tool for orchestrating multi-module workflows
- CLI command for batch module execution across multiple targets
- Support for distributed fuzzing campaigns
- Integration with CI/CD pipelines
- Module marketplace/registry features

Use our [Feature Request Template](.github/ISSUE_TEMPLATE/feature_request.md).
## 📚 Documentation

Help improve our documentation:

- **Module Documentation**: Document your modules in their README.md
- **API Documentation**: Update docstrings and type hints
- **User Guides**: Improve USAGE.md, tutorials, and how-to guides
- **Module SDK Guides**: Help document the SDK for module developers
- **MCP Integration**: Document AI agent integration patterns
- **Workflow Documentation**: Document new security workflows
- **Examples**: Add practical usage examples and workflows

### Documentation Standards

- Use clear, concise language
- Include code examples
- Add command-line examples with expected output
- Document all configuration options
- Explain error messages and troubleshooting
### Module README Template

````markdown
# Module Name

Brief description of what this module does.

## Features

- Feature 1
- Feature 2

## Configuration

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| timeout | int | 300 | Timeout in seconds |

## Usage

```bash
uv run fuzzforge modules run module-name --assets ./path/to/assets
```

## Output

Describes the output structure and format.

## Examples

Practical usage examples.
````
## 🙏 Recognition

Contributors will be:

- Listed in our [Contributors](CONTRIBUTORS.md) file
- Mentioned in release notes for significant contributions
- Credited in module documentation (for module authors)
- Invited to join our [Discord community](https://discord.gg/8XEX33UUwZ)
- Eligible for FuzzingLabs Academy courses and swag
## Module Submission Checklist

Before submitting a new module:

- [ ] Module follows SDK structure and conventions
- [ ] Dockerfile builds successfully
- [ ] Module executes without errors
- [ ] Configuration options are documented
- [ ] README.md is complete with examples
- [ ] Tests are included (pytest)
- [ ] Type hints are used throughout
- [ ] Linting passes (ruff)
- [ ] Security best practices followed
- [ ] No secrets or credentials in code
- [ ] License headers included

## Review Process

1. **Initial Review** - Maintainers review for completeness
2. **Technical Review** - Code quality and security assessment
3. **Testing** - Module tested in isolated environment
4. **Documentation Review** - Ensure docs are clear and complete
5. **Approval** - Module merged and included in next release
## 📜 License

By contributing to FuzzForge, you agree that your contributions will be licensed under the same [Business Source License 1.1](LICENSE) as the project.

For module contributions:
- Modules you create remain under the project license
- You retain credit as the module author
- Your module may be used by others under the project license terms

---

## Getting Help

Need help contributing?

- Join our [Discord](https://discord.gg/8XEX33UUwZ)
- Read the [Module SDK Documentation](fuzzforge-modules/fuzzforge-modules-sdk/README.md)
- Check the module template for examples
- Contact: contact@fuzzinglabs.com

---

**Thank you for making FuzzForge better! 🚀**

Every contribution, no matter how small, helps build a stronger security research platform. Whether you're creating a module for web security, cloud scanning, mobile analysis, or any other cybersecurity domain, your work makes FuzzForge more powerful and versatile for the entire security community!
---

**Makefile** — 78 lines

```makefile
.PHONY: help install sync format lint typecheck test build-hub-images clean

SHELL := /bin/bash

# Default target
help:
	@echo "FuzzForge AI Development Commands"
	@echo ""
	@echo "  make install          - Install all dependencies"
	@echo "  make sync             - Sync shared packages from upstream"
	@echo "  make format           - Format code with ruff"
	@echo "  make lint             - Lint code with ruff"
	@echo "  make typecheck        - Type check with mypy"
	@echo "  make test             - Run all tests"
	@echo "  make build-hub-images - Build all mcp-security-hub images"
	@echo "  make clean            - Clean build artifacts"
	@echo ""

# Install all dependencies
install:
	uv sync

# Sync shared packages from upstream fuzzforge-core
sync:
	@if [ -z "$(UPSTREAM)" ]; then \
		echo "Usage: make sync UPSTREAM=/path/to/fuzzforge-core"; \
		exit 1; \
	fi
	./scripts/sync-upstream.sh $(UPSTREAM)

# Format all packages
format:
	@for pkg in packages/fuzzforge-*/; do \
		if [ -f "$$pkg/pyproject.toml" ]; then \
			echo "Formatting $$pkg..."; \
			cd "$$pkg" && uv run ruff format . && cd -; \
		fi \
	done

# Lint all packages
lint:
	@for pkg in packages/fuzzforge-*/; do \
		if [ -f "$$pkg/pyproject.toml" ]; then \
			echo "Linting $$pkg..."; \
			cd "$$pkg" && uv run ruff check . && cd -; \
		fi \
	done

# Type check all packages
typecheck:
	@for pkg in packages/fuzzforge-*/; do \
		if [ -f "$$pkg/pyproject.toml" ] && [ -f "$$pkg/mypy.ini" ]; then \
			echo "Type checking $$pkg..."; \
			cd "$$pkg" && uv run mypy . && cd -; \
		fi \
	done

# Run all tests
test:
	@for pkg in packages/fuzzforge-*/; do \
		if [ -f "$$pkg/pytest.ini" ]; then \
			echo "Testing $$pkg..."; \
			cd "$$pkg" && uv run pytest && cd -; \
		fi \
	done

# Build all mcp-security-hub images for the firmware analysis pipeline
build-hub-images:
	@bash scripts/build-hub-images.sh

# Clean build artifacts
clean:
	find . -type d -name "__pycache__" -exec rm -rf {} + 2>/dev/null || true
	find . -type d -name ".pytest_cache" -exec rm -rf {} + 2>/dev/null || true
	find . -type d -name ".mypy_cache" -exec rm -rf {} + 2>/dev/null || true
	find . -type d -name ".ruff_cache" -exec rm -rf {} + 2>/dev/null || true
	find . -type d -name "*.egg-info" -exec rm -rf {} + 2>/dev/null || true
	find . -type f -name "*.pyc" -delete 2>/dev/null || true
```
---

**README.md** — 341 lines

<h1 align="center"> FuzzForge AI</h1>
<h3 align="center">AI-Powered Security Research Orchestration via MCP</h3>

<p align="center">
  <a href="https://discord.gg/8XEX33UUwZ"><img src="https://img.shields.io/discord/1420767905255133267?logo=discord&label=Discord" alt="Discord"></a>
  <a href="LICENSE"><img src="https://img.shields.io/badge/license-BSL%201.1-blue" alt="License: BSL 1.1"></a>
  <a href="https://www.python.org/downloads/"><img src="https://img.shields.io/badge/python-3.12%2B-blue" alt="Python 3.12+"/></a>
  <a href="https://modelcontextprotocol.io"><img src="https://img.shields.io/badge/MCP-compatible-green" alt="MCP Compatible"/></a>
  <a href="https://fuzzforge.ai"><img src="https://img.shields.io/badge/Website-fuzzforge.ai-purple" alt="Website"/></a>
  <img src="https://img.shields.io/badge/version-0.6.0-green" alt="Version">
  <a href="https://github.com/FuzzingLabs/fuzzforge_ai/stargazers"><img src="https://img.shields.io/github/stars/FuzzingLabs/fuzzforge_ai?style=social" alt="GitHub Stars"></a>
</p>

<p align="center">
  <img src="docs/static/img/fuzzforge_banner_github.png" alt="FuzzForge Banner" width="100%">
</p>

<p align="center">
  <strong>Let AI agents orchestrate your security research workflows locally</strong>
</p>

<p align="center">
  <sub>
    <a href="#-overview"><b>Overview</b></a> •
    <a href="#-features"><b>Features</b></a> •
    <a href="#-mcp-security-hub"><b>Security Hub</b></a> •
    <a href="#-installation"><b>Installation</b></a> •
    <a href="USAGE.md"><b>Usage Guide</b></a> •
    <a href="#-contributing"><b>Contributing</b></a> •
    <a href="#%EF%B8%8F-roadmap"><b>Roadmap</b></a>
  </sub>
</p>

---

> 🚧 **FuzzForge AI is under active development.** Expect breaking changes and new features!

---
## 🚀 Overview

**FuzzForge AI** is an open-source MCP server that enables AI agents (GitHub Copilot, Claude, etc.) to orchestrate security research workflows through the **Model Context Protocol (MCP)**.

FuzzForge connects your AI assistant to **MCP tool hubs** — collections of containerized security tools that the agent can discover, chain, and execute autonomously. Instead of manually running security tools, describe what you want and let your AI assistant handle it:

- Orchestrate static & dynamic analysis
- Automate vulnerability research
- Scale AppSec testing with AI agents
- Build, share & reuse workflows across teams

### The Core: Hub Architecture

FuzzForge acts as a **meta-MCP server** — a single MCP endpoint that gives your AI agent access to tools from multiple MCP hub servers. Each hub server is a containerized security tool (Binwalk, YARA, Radare2, Nmap, etc.) that the agent can discover at runtime.

- **🔍 Discovery**: The agent lists available hub servers and discovers their tools
- **🤖 AI-Native**: Hub tools provide agent context — usage tips, workflow guidance, and domain knowledge
- **🔗 Composable**: Chain tools from different hubs into automated pipelines
- **📦 Extensible**: Add your own MCP servers to the hub registry
### 🎬 Use Case: Firmware Vulnerability Research

> **Scenario**: Analyze a firmware image to find security vulnerabilities — fully automated by an AI agent.

```
User: "Search for vulnerabilities in firmware.bin"

Agent → Binwalk: Extract filesystem from firmware image
Agent → YARA: Scan extracted files for vulnerability patterns
Agent → Radare2: Trace dangerous function calls in prioritized binaries
Agent → Report: 8 vulnerabilities found (2 critical, 4 high, 2 medium)
```

### 🎬 Use Case: Rust Fuzzing Pipeline

> **Scenario**: Fuzz a Rust crate to discover vulnerabilities using AI-assisted harness generation and parallel fuzzing.

```
User: "Fuzz the blurhash crate for vulnerabilities"

Agent → Rust Analyzer: Identify fuzzable functions and attack surface
Agent → Harness Gen: Generate and validate fuzzing harnesses
Agent → Cargo Fuzzer: Run parallel coverage-guided fuzzing sessions
Agent → Crash Analysis: Deduplicate and triage discovered crashes
```
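The transcripts above are essentially tool pipelines: each step feeds the next. The chaining idea can be sketched in Python as follows. The hub and tool names mirror the firmware transcript, but `run_tool` and its argument shape are hypothetical stand-ins for however an agent invokes a hub tool, not the actual FuzzForge API:

```python
# Hypothetical sketch of the firmware pipeline; `run_tool` is a stand-in
# for a hub-tool invocation, not a real FuzzForge function.
def run_tool(hub: str, tool: str, **kwargs) -> dict:
    """Pretend invocation; a real call would go to an MCP hub server."""
    return {"hub": hub, "tool": tool, "args": kwargs}


def firmware_pipeline(firmware: str) -> list[dict]:
    steps = []
    # 1. Extract the filesystem from the firmware image
    steps.append(run_tool("binwalk", "extract", target=firmware))
    # 2. Scan the extracted files for vulnerability patterns
    steps.append(run_tool("yara", "scan", target="extracted/"))
    # 3. Trace dangerous function calls in prioritized binaries
    steps.append(run_tool("radare2", "analyze", target="extracted/bin/"))
    return steps
```

In practice the agent, not a hand-written script, decides which hubs to call and in what order; the sketch only illustrates the data flow between steps.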
---

## ⭐ Support the Project

If you find FuzzForge useful, please **star the repo** to support development! 🚀

<a href="https://github.com/FuzzingLabs/fuzzforge_ai/stargazers">
  <img src="https://img.shields.io/github/stars/FuzzingLabs/fuzzforge_ai?style=social" alt="GitHub Stars">
</a>

---

## ✨ Features

| Feature | Description |
|---------|-------------|
| 🤖 **AI-Native** | Built for MCP — works with GitHub Copilot, Claude, and any MCP-compatible agent |
| 🔌 **Hub System** | Connect to MCP tool hubs — each hub brings dozens of containerized security tools |
| 🔍 **Tool Discovery** | Agents discover available tools at runtime with built-in usage guidance |
| 🔗 **Pipelines** | Chain tools from different hubs into automated multi-step workflows |
| 🔄 **Persistent Sessions** | Long-running tools (Radare2, fuzzers) with stateful container sessions |
| 🏠 **Local First** | All execution happens on your machine — no cloud required |
| 🔒 **Sandboxed** | Every tool runs in an isolated container via Docker or Podman |
---

## 🏗️ Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                    AI Agent (Copilot/Claude)                    │
└───────────────────────────┬─────────────────────────────────────┘
                            │ MCP Protocol (stdio)
                            ▼
┌─────────────────────────────────────────────────────────────────┐
│                      FuzzForge MCP Server                       │
│                                                                 │
│  Projects           Hub Discovery          Hub Execution        │
│  ┌──────────────┐   ┌──────────────────┐  ┌───────────────────┐ │
│  │init_project  │   │list_hub_servers  │  │execute_hub_tool   │ │
│  │set_assets    │   │discover_hub_tools│  │start_hub_server   │ │
│  │list_results  │   │get_tool_schema   │  │stop_hub_server    │ │
│  └──────────────┘   └──────────────────┘  └───────────────────┘ │
└───────────────────────────┬─────────────────────────────────────┘
                            │ Docker/Podman
                            ▼
┌─────────────────────────────────────────────────────────────────┐
│                         MCP Hub Servers                         │
│                                                                 │
│    ┌───────────┐  ┌───────────┐  ┌───────────┐  ┌───────────┐   │
│    │  Binwalk  │  │   YARA    │  │  Radare2  │  │   Nmap    │   │
│    │  6 tools  │  │  5 tools  │  │ 32 tools  │  │  8 tools  │   │
│    └───────────┘  └───────────┘  └───────────┘  └───────────┘   │
│    ┌───────────┐  ┌───────────┐  ┌───────────┐  ┌───────────┐   │
│    │  Nuclei   │  │  SQLMap   │  │   Trivy   │  │    ...    │   │
│    │  7 tools  │  │  8 tools  │  │  7 tools  │  │  36 hubs  │   │
│    └───────────┘  └───────────┘  └───────────┘  └───────────┘   │
└─────────────────────────────────────────────────────────────────┘
```
---

## 🔧 MCP Security Hub

FuzzForge ships with built-in support for the **[MCP Security Hub](https://github.com/FuzzingLabs/mcp-security-hub)** — a collection of 36 production-ready, Dockerized MCP servers covering offensive security:

| Category | Servers | Examples |
|----------|---------|----------|
| 🔍 **Reconnaissance** | 8 | Nmap, Masscan, Shodan, WhatWeb |
| 🌐 **Web Security** | 6 | Nuclei, SQLMap, ffuf, Nikto |
| 🔬 **Binary Analysis** | 6 | Radare2, Binwalk, YARA, Capa, Ghidra |
| ⛓️ **Blockchain** | 3 | Medusa, Solazy, DAML Viewer |
| ☁️ **Cloud Security** | 3 | Trivy, Prowler, RoadRecon |
| 💻 **Code Security** | 1 | Semgrep |
| 🔑 **Secrets Detection** | 1 | Gitleaks |
| 💥 **Exploitation** | 1 | SearchSploit |
| 🎯 **Fuzzing** | 2 | Boofuzz, Dharma |
| 🕵️ **OSINT** | 2 | Maigret, DNSTwist |
| 🛡️ **Threat Intel** | 2 | VirusTotal, AlienVault OTX |
| 🏰 **Active Directory** | 1 | BloodHound |

> 185+ individual tools accessible through a single MCP connection.

The hub is open source and can be extended with your own MCP servers. See the [mcp-security-hub repository](https://github.com/FuzzingLabs/mcp-security-hub) for details.

---
## 📦 Installation

### Prerequisites

- **Python 3.12+**
- **[uv](https://docs.astral.sh/uv/)** package manager
- **Docker** ([Install Docker](https://docs.docker.com/get-docker/)) or Podman

Install uv if you don't have it yet:

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```

#### Configure Docker Daemon

Before running `docker compose up`, configure Docker to allow insecure registries (required for the local registry).

Add the following to your Docker daemon configuration:

```json
{
  "insecure-registries": [
    "localhost:5000",
    "host.docker.internal:5001",
    "registry:5000"
  ]
}
```

**macOS (Docker Desktop):**
1. Open Docker Desktop
2. Go to Settings → Docker Engine
3. Add the `insecure-registries` configuration to the JSON
4. Click "Apply & Restart"

**Linux:**
1. Edit `/etc/docker/daemon.json` (create it if it doesn't exist):
   ```bash
   sudo nano /etc/docker/daemon.json
   ```
2. Add the configuration above
3. Restart Docker:
   ```bash
   sudo systemctl restart docker
   ```
### CLI Installation

After installing the requirements, install the FuzzForge CLI:

```bash
# Clone the repository
git clone https://github.com/FuzzingLabs/fuzzforge_ai.git
cd fuzzforge_ai

# Install dependencies
uv sync

# Install the CLI with uv (from the root directory)
uv tool install --python python3.12 .
```

### Link the Security Hub

```bash
# Clone the MCP Security Hub
git clone https://github.com/FuzzingLabs/mcp-security-hub.git ~/.fuzzforge/hubs/mcp-security-hub

# Build the Docker images for the hub tools
./scripts/build-hub-images.sh
```

Or use the terminal UI (`uv run fuzzforge ui`) to link hubs interactively.

### Configure MCP for Your AI Agent

```bash
# For GitHub Copilot
uv run fuzzforge mcp install copilot

# For Claude Code (CLI)
uv run fuzzforge mcp install claude-code

# For Claude Desktop (standalone app)
uv run fuzzforge mcp install claude-desktop

# Verify installation
uv run fuzzforge mcp status
```

**Restart your editor** and your AI agent will have access to FuzzForge tools!

---
## 🧑‍💻 Usage

Once installed, just talk to your AI agent:

```
"What security tools are available?"
"Scan this firmware image for vulnerabilities"
"Analyze this binary with radare2"
"Run nuclei against https://example.com"
```

The agent will use FuzzForge to discover the right hub tools, chain them into a pipeline, and return results — all without you touching a terminal.

See the [Usage Guide](USAGE.md) for detailed setup and advanced workflows.

### ⚡ Quickstart (Docker Compose)

```bash
# 1. Clone the repo
git clone https://github.com/fuzzinglabs/fuzzforge_ai.git
cd fuzzforge_ai

# 2. Build & run with Docker
# Set the registry host for your OS (the local registry is mandatory)
# macOS/Windows (Docker Desktop):
export REGISTRY_HOST=host.docker.internal
# Linux (default):
# export REGISTRY_HOST=localhost
docker compose up -d
```

> The first launch can take 5-10 minutes due to Docker image building - a good time for a coffee break ☕

```bash
# 3. Run your first workflow
cd test_projects/vulnerable_app/       # Go into the test directory
fuzzforge init                         # Init a FuzzForge project
ff workflow run security_assessment .  # Start a workflow (you can also use the ff command)
```

### Manual Workflow Setup

![FuzzForge Workflow Interface](docs/gif/fuzzforge_workflow.gif)

_Setting up and running security workflows through the interface_

👉 More installation options in the [Documentation](https://docs.fuzzforge.ai).

---
## 📁 Project Structure

```
fuzzforge_ai/
├── fuzzforge-mcp/        # MCP server — the core of FuzzForge
├── fuzzforge-cli/        # Command-line interface & terminal UI
├── fuzzforge-common/     # Shared abstractions (containers, storage)
├── fuzzforge-runner/     # Container execution engine (Docker/Podman)
├── fuzzforge-tests/      # Integration tests
├── mcp-security-hub/     # Default hub: 36 offensive security MCP servers
└── scripts/              # Hub image build scripts
```

### AI-Powered Workflow Execution

![AI Agent Analysis](docs/gif/fuzzforge_ai_agent.gif)

_AI agents automatically analyzing code and providing security insights_
## 📚 Resources

- 🌐 [Website](https://fuzzforge.ai)
- 📖 [Documentation](https://docs.fuzzforge.ai)
- 💬 [Community Discord](https://discord.gg/8XEX33UUwZ)
- 🎓 [FuzzingLabs Academy](https://academy.fuzzinglabs.com/?coupon=GITHUB_FUZZFORGE)

---
## 🤝 Contributing

We welcome contributions from the community! There are many ways to help:

- 🐛 Report bugs via [GitHub Issues](../../issues)
- 💡 Suggest features or improvements
- 🔧 Submit pull requests
- 🔌 Add new MCP servers to the [Security Hub](https://github.com/FuzzingLabs/mcp-security-hub)
- 📦 Share workflows, corpora, or modules with the community

See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.

---
## 🗺️ Roadmap

Planned features and improvements:

- 📦 Public workflow & module marketplace
- 🤖 New specialized AI agents (Rust, Go, Android, Automotive)
- 🔗 Expanded fuzzer integrations (LibFuzzer, Jazzer, more network fuzzers)
- ☁️ Multi-tenant SaaS platform with team collaboration
- 📊 Advanced reporting & analytics

👉 Follow updates in the [GitHub issues](../../issues) and [Discord](https://discord.gg/8XEX33UUwZ)

---

## 📄 License

FuzzForge is released under the **Business Source License (BSL) 1.1**, with an automatic fallback to **Apache 2.0** after 4 years.
See [LICENSE](LICENSE) and [LICENSE-APACHE](LICENSE-APACHE) for details.

---

<p align="center">
  <strong>Maintained by <a href="https://fuzzinglabs.com">FuzzingLabs</a></strong>
  <br>
</p>
---

**ROADMAP.md**

# FuzzForge AI Roadmap

This document outlines the planned features and development direction for FuzzForge AI.

---

## 🎯 Upcoming Features

### 1. MCP Security Hub Integration

**Status:** 🔄 Planned

Integrate [mcp-security-hub](https://github.com/FuzzingLabs/mcp-security-hub) tools into FuzzForge, giving AI agents access to 28 MCP servers and 163+ security tools through a unified interface.

#### How It Works

Unlike native FuzzForge modules (built with the SDK), mcp-security-hub tools are **standalone MCP servers**. The integration will bridge these tools so they can be:

- Discovered via `list_modules` alongside native modules
- Executed through FuzzForge's orchestration layer
- Chained with native modules in workflows

| Aspect | Native Modules | MCP Hub Tools |
|--------|----------------|---------------|
| **Runtime** | FuzzForge SDK container | Standalone MCP server container |
| **Protocol** | Direct execution | MCP-to-MCP bridge |
| **Configuration** | Module config | Tool-specific args |
| **Output** | FuzzForge results format | Tool-native format (normalized) |

#### Goals

- Unified discovery of all available tools (native + hub)
- Orchestrate hub tools through FuzzForge's workflow engine
- Normalize outputs for consistent result handling
- No modification required to mcp-security-hub tools

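Normalizing hub output could look roughly like the sketch below. The `Finding` shape and the nuclei field names are illustrative assumptions, not FuzzForge's actual result schema:

```python
from dataclasses import dataclass

@dataclass
class Finding:
    """Minimal shared result shape that every hub tool's output is mapped onto."""
    tool: str
    severity: str
    title: str
    target: str

def normalize_nuclei(raw: dict) -> Finding:
    # nuclei emits JSON lines with an "info" object and a "host" field
    info = raw.get("info", {})
    return Finding(
        tool="nuclei",
        severity=info.get("severity", "unknown"),
        title=info.get("name", "unnamed"),
        target=raw.get("host", ""),
    )
```

One such adapter per tool keeps the workflow engine ignorant of tool-native formats.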
#### Planned Tool Categories

| Category | Tools | Example Use Cases |
|----------|-------|-------------------|
| **Reconnaissance** | nmap, masscan, whatweb, shodan | Network scanning, service discovery |
| **Web Security** | nuclei, sqlmap, ffuf, nikto | Vulnerability scanning, fuzzing |
| **Binary Analysis** | radare2, binwalk, yara, capa, ghidra | Reverse engineering, malware analysis |
| **Cloud Security** | trivy, prowler | Container scanning, cloud auditing |
| **Secrets Detection** | gitleaks | Credential scanning |
| **OSINT** | maigret, dnstwist | Username tracking, typosquatting |
| **Threat Intel** | virustotal, otx | Malware analysis, IOC lookup |

#### Example Workflow

```
You: "Scan example.com for vulnerabilities and analyze any suspicious binaries"

AI Agent:
1. Uses nmap module for port discovery
2. Uses nuclei module for vulnerability scanning
3. Uses binwalk module to extract firmware
4. Uses yara module for malware detection
5. Generates consolidated report
```

---

### 2. User Interface

**Status:** 🔄 Planned

A graphical interface to manage FuzzForge without the command line.

#### Goals

- Provide an alternative to CLI for users who prefer visual tools
- Make configuration and monitoring more accessible
- Complement (not replace) the CLI experience

#### Planned Capabilities

| Capability | Description |
|------------|-------------|
| **Configuration** | Change MCP server settings, engine options, paths |
| **Module Management** | Browse, configure, and launch modules |
| **Execution Monitoring** | View running tasks, logs, progress, metrics |
| **Project Overview** | Manage projects and browse execution results |
| **Workflow Management** | Create and run multi-module workflows |

---

## 📋 Backlog

Features under consideration for future releases:

| Feature | Description |
|---------|-------------|
| **Module Marketplace** | Browse and install community modules |
| **Scheduled Executions** | Run modules on a schedule (cron-style) |
| **Team Collaboration** | Share projects, results, and workflows |
| **Reporting Engine** | Generate PDF/HTML security reports |
| **Notifications** | Slack, Discord, email alerts for findings |

---

## ✅ Completed

| Feature | Version | Date |
|---------|---------|------|
| Docker as default engine | 0.1.0 | Jan 2026 |
| MCP server for AI agents | 0.1.0 | Jan 2026 |
| CLI for project management | 0.1.0 | Jan 2026 |
| Continuous execution mode | 0.1.0 | Jan 2026 |
| Workflow orchestration | 0.1.0 | Jan 2026 |

---

## 💬 Feedback

Have suggestions for the roadmap?

- Open an issue on [GitHub](https://github.com/FuzzingLabs/fuzzforge_ai/issues)
- Join our [Discord](https://discord.gg/8XEX33UUwZ)

---

<p align="center">
  <strong>Built with ❤️ by <a href="https://fuzzinglabs.com">FuzzingLabs</a></strong>
</p>

---

**USAGE.md**

# FuzzForge AI Usage Guide

This guide covers everything you need to know to get started with FuzzForge AI — from installation to linking your first MCP hub and running security research workflows with AI.

> **FuzzForge is designed to be used with AI agents** (GitHub Copilot, Claude, etc.) via MCP.
> A terminal UI (`fuzzforge ui`) is provided for managing agents and hubs.
> The CLI is available for advanced users, but the primary experience is through natural language interaction with your AI assistant.

---

## Table of Contents

- [Quick Start](#quick-start)
- [Prerequisites](#prerequisites)
- [Installation](#installation)
- [Terminal UI](#terminal-ui)
  - [Launching the UI](#launching-the-ui)
  - [Dashboard](#dashboard)
  - [Agent Setup](#agent-setup)
  - [Hub Manager](#hub-manager)
- [MCP Hub System](#mcp-hub-system)
  - [What is an MCP Hub?](#what-is-an-mcp-hub)
  - [FuzzingLabs Security Hub](#fuzzinglabs-security-hub)
  - [Linking a Custom Hub](#linking-a-custom-hub)
  - [Building Hub Images](#building-hub-images)
- [MCP Server Configuration (CLI)](#mcp-server-configuration-cli)
  - [GitHub Copilot](#github-copilot)
  - [Claude Code (CLI)](#claude-code-cli)
  - [Claude Desktop](#claude-desktop)
- [Using FuzzForge with AI](#using-fuzzforge-with-ai)
- [CLI Reference](#cli-reference)
- [Environment Variables](#environment-variables)
- [Troubleshooting](#troubleshooting)

---

## Quick Start

> **Prerequisites:** You need [uv](https://docs.astral.sh/uv/) and [Docker](https://docs.docker.com/get-docker/) installed.
> See the [Prerequisites](#prerequisites) section for details.

```bash
# 1. Clone and install
git clone https://github.com/FuzzingLabs/fuzzforge_ai.git
cd fuzzforge_ai
uv sync

# 2. Launch the terminal UI
uv run fuzzforge ui

# 3. Press 'h' → "FuzzingLabs Hub" to clone & link the default security hub
# 4. Select an agent row and press Enter to install the MCP server for your agent

# 5. Build the Docker images for the hub tools (required before tools can run)
./scripts/build-hub-images.sh

# 6. Restart your AI agent and start talking:
#    "What security tools are available?"
#    "Scan this binary with binwalk and yara"
#    "Analyze this Rust crate for fuzzable functions"
```

Or do it entirely from the command line:

```bash
# Install MCP for your AI agent
uv run fuzzforge mcp install copilot       # For VS Code + GitHub Copilot
# OR
uv run fuzzforge mcp install claude-code   # For Claude Code CLI

# Clone and link the default security hub
git clone git@github.com:FuzzingLabs/mcp-security-hub.git ~/.fuzzforge/hubs/mcp-security-hub

# Build hub tool images (required — tools only run once their image is built)
./scripts/build-hub-images.sh

# Restart your AI agent — done!
```

> **Note:** FuzzForge uses Docker by default. Podman is also supported via `--engine podman`.

---

## Prerequisites

Before installing FuzzForge AI, ensure you have:

- **Python 3.12+** — [Download Python](https://www.python.org/downloads/)
- **uv** package manager — [Install uv](https://docs.astral.sh/uv/)
- **Docker** — Container runtime ([Install Docker](https://docs.docker.com/get-docker/))
- **Git** — For cloning hub repositories

### Installing uv

```bash
# Linux/macOS
curl -LsSf https://astral.sh/uv/install.sh | sh

# Or with pip
pip install uv
```

### Installing Docker

```bash
# Linux (Ubuntu/Debian)
curl -fsSL https://get.docker.com | sh
sudo usermod -aG docker $USER
# Log out and back in for group changes to take effect

# macOS/Windows
# Install Docker Desktop from https://docs.docker.com/get-docker/
```

> **Note:** Podman is also supported. Use `--engine podman` with CLI commands
> or set the `FUZZFORGE_ENGINE=podman` environment variable.

---

## Installation

### 1. Clone the Repository

```bash
git clone https://github.com/FuzzingLabs/fuzzforge_ai.git
cd fuzzforge_ai
```

### 2. Install Dependencies

```bash
uv sync
```

This installs all FuzzForge components in a virtual environment.

### 3. Verify Installation

```bash
uv run fuzzforge --help
```

---

## Terminal UI

FuzzForge ships with a terminal user interface (TUI) built on [Textual](https://textual.textualize.io/) for managing AI agents and MCP hub servers from a single dashboard.

### Launching the UI

```bash
uv run fuzzforge ui
```

### Dashboard

The main screen is split into two panels:

| Panel | Content |
|-------|---------|
| **AI Agents** (left) | Shows GitHub Copilot, Claude Desktop, and Claude Code with live link status and config file path |
| **Hub Servers** (right) | Shows all configured MCP hub tools with Docker image name, source hub, and build status (✓ Ready / ✗ Not built) |

### Keyboard Shortcuts

| Key | Action |
|-----|--------|
| `Enter` | **Select** — Act on the selected row (set up or unlink an agent) |
| `h` | **Hub Manager** — Open the hub management screen |
| `r` | **Refresh** — Re-check all agent and hub statuses |
| `q` | **Quit** |

### Agent Setup

Select an agent row in the AI Agents table and press `Enter`:

- **If the agent is not linked** → a setup dialog opens asking for your container engine (Docker or Podman), then installs the FuzzForge MCP configuration
- **If the agent is already linked** → a confirmation dialog offers to unlink it (removes the `fuzzforge` entry without touching other MCP servers)

The setup auto-detects:

- FuzzForge installation root
- Docker/Podman socket path
- Hub configuration from `hub-config.json`

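The unlink step amounts to deleting one key from the agent's MCP config file. A minimal sketch, assuming a JSON config with an `mcpServers` map (the common MCP convention; the exact key can differ per agent):

```python
import json
from pathlib import Path

def unlink_fuzzforge(config_path: Path) -> bool:
    """Drop the 'fuzzforge' server entry, leaving every other MCP server intact."""
    config = json.loads(config_path.read_text())
    servers = config.get("mcpServers", {})
    if "fuzzforge" not in servers:
        return False
    del servers["fuzzforge"]
    config_path.write_text(json.dumps(config, indent=2))
    return True
```

Rewriting only the one entry is what keeps your other MCP servers untouched.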
### Hub Manager

Press `h` to open the hub manager. This is where you manage your MCP hub repositories:

| Button | Action |
|--------|--------|
| **FuzzingLabs Hub** | One-click clone of the official [mcp-security-hub](https://github.com/FuzzingLabs/mcp-security-hub) repository — clones to `~/.fuzzforge/hubs/mcp-security-hub`, scans for tools, and registers them in `hub-config.json` |
| **Link Path** | Link any local directory as a hub — enter a name and path, and FuzzForge scans it for `category/tool-name/Dockerfile` patterns |
| **Clone URL** | Clone any git repository and link it as a hub |
| **Remove** | Unlink the selected hub and remove its servers from the configuration |

The hub table shows:

- **Name** — Hub name (★ prefix for the default hub)
- **Path** — Local directory path
- **Servers** — Number of MCP tools discovered
- **Source** — Git URL or "local"

---

## MCP Hub System

### What is an MCP Hub?

An MCP hub is a directory containing one or more containerized MCP tools, organized by category:

```
my-hub/
├── category-a/
│   ├── tool-1/
│   │   └── Dockerfile
│   └── tool-2/
│       └── Dockerfile
├── category-b/
│   └── tool-3/
│       └── Dockerfile
└── ...
```

FuzzForge scans for the pattern `category/tool-name/Dockerfile` and auto-generates server configuration entries for each discovered tool.

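That discovery pass is essentially a recursive glob. A sketch of the idea (the fields in the generated entry are illustrative, not the exact `hub-config.json` schema):

```python
from pathlib import Path

def discover_hub_tools(hub_root: Path) -> list[dict]:
    """Find every category/tool-name/Dockerfile and derive a server entry from it."""
    tools = []
    for dockerfile in sorted(hub_root.glob("*/*/Dockerfile")):
        tool_dir = dockerfile.parent
        tools.append({
            "name": tool_dir.name,             # e.g. "semgrep-mcp"
            "category": tool_dir.parent.name,  # e.g. "code-security"
            "image": f"{tool_dir.name}:latest",
        })
    return tools
```

Because only the directory layout matters, any repository following this convention can be linked as a hub without changes.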
### FuzzingLabs Security Hub

The default MCP hub is [mcp-security-hub](https://github.com/FuzzingLabs/mcp-security-hub), maintained by FuzzingLabs. It includes **40+ security tools** across categories:

| Category | Tools |
|----------|-------|
| **Reconnaissance** | nmap, masscan, shodan, zoomeye, whatweb, pd-tools, externalattacker, networksdb |
| **Binary Analysis** | binwalk, yara, capa, radare2, ghidra, ida |
| **Code Security** | semgrep, rust-analyzer, harness-tester, cargo-fuzzer, crash-analyzer |
| **Web Security** | nuclei, nikto, sqlmap, ffuf, burp, waybackurls |
| **Fuzzing** | boofuzz, dharma |
| **Exploitation** | searchsploit |
| **Secrets** | gitleaks |
| **Cloud Security** | trivy, prowler, roadrecon |
| **OSINT** | maigret, dnstwist |
| **Threat Intel** | virustotal, otx |
| **Password Cracking** | hashcat |
| **Blockchain** | medusa, solazy, daml-viewer |

**Clone it via the UI:**

1. `uv run fuzzforge ui`
2. Press `h` → click **FuzzingLabs Hub**
3. Wait for the clone to finish — servers are auto-registered

**Or clone manually:**

```bash
git clone git@github.com:FuzzingLabs/mcp-security-hub.git ~/.fuzzforge/hubs/mcp-security-hub
```

### Linking a Custom Hub

You can link any directory that follows the `category/tool-name/Dockerfile` layout:

**Via the UI:**

1. Press `h` → **Link Path**
2. Enter a name and the directory path

**Via the CLI:** Planned, but not yet available — use the UI.

### Building Hub Images

After linking a hub, you need to build the Docker images before the tools can be used:

```bash
# Build all images from the default security hub
./scripts/build-hub-images.sh

# Or build a single tool image
docker build -t semgrep-mcp:latest mcp-security-hub/code-security/semgrep-mcp/
```

The dashboard hub table shows ✓ Ready for built images and ✗ Not built for missing ones.

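The ✓/✗ status comes down to whether each tool's image exists locally. A minimal sketch that checks expected image names against the output of `docker images --format '{{.Repository}}:{{.Tag}}'` (the helper is illustrative, not FuzzForge's actual check):

```python
def image_build_status(docker_images_output: str, expected: list[str]) -> dict[str, bool]:
    """Map each expected repo:tag to whether it appears in the `docker images` listing."""
    built = set(docker_images_output.split())
    return {image: image in built for image in expected}
```

Pressing `r` in the dashboard re-runs this kind of check and refreshes the table.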
---
## MCP Server Configuration (CLI)

If you prefer the command line over the TUI, you can configure agents directly:

### GitHub Copilot

```bash
uv run fuzzforge mcp install copilot
```

The command auto-detects:

- **FuzzForge root** — where FuzzForge is installed
- **Docker socket** — typically `/var/run/docker.sock`

**Optional overrides:**

```bash
uv run fuzzforge mcp install copilot --engine podman
```

**After installation:** Restart VS Code. FuzzForge tools appear in GitHub Copilot Chat.

### Claude Code (CLI)

```bash
uv run fuzzforge mcp install claude-code
```

Installs to `~/.claude.json`. FuzzForge tools are available from any directory after restarting Claude.

### Claude Desktop

```bash
uv run fuzzforge mcp install claude-desktop
```

**After installation:** Restart Claude Desktop.

### Check Status

```bash
uv run fuzzforge mcp status
```

### Remove Configuration

```bash
uv run fuzzforge mcp uninstall copilot
uv run fuzzforge mcp uninstall claude-code
uv run fuzzforge mcp uninstall claude-desktop
```

---

## Using FuzzForge with AI

Once MCP is configured and hub images are built, interact with FuzzForge through natural language with your AI assistant.

### Example Conversations

**Discover available tools:**
```
You: "What security tools are available in FuzzForge?"
AI: Queries hub tools → "I found 15 tools across categories: nmap for
    port scanning, binwalk for firmware analysis, semgrep for code
    scanning, cargo-fuzzer for Rust fuzzing..."
```

**Analyze a binary:**
```
You: "Extract and analyze this firmware image"
AI: Uses binwalk to extract → yara for pattern matching → capa for
    capability detection → "Found 3 embedded filesystems, 2 YARA
    matches for known vulnerabilities..."
```

**Fuzz Rust code:**
```
You: "Analyze this Rust crate for functions I should fuzz"
AI: Uses rust-analyzer → "Found 3 fuzzable entry points..."

You: "Start fuzzing parse_input for 10 minutes"
AI: Uses cargo-fuzzer → "Fuzzing session started. 2 crashes found..."
```

**Scan for vulnerabilities:**
```
You: "Scan this codebase with semgrep for security issues"
AI: Uses semgrep-mcp → "Found 5 findings: 2 high severity SQL injection
    patterns, 3 medium severity hardcoded secrets..."
```

---

## CLI Reference

### UI Command

```bash
uv run fuzzforge ui   # Launch the terminal dashboard
```

### MCP Commands

```bash
uv run fuzzforge mcp status              # Check agent configuration status
uv run fuzzforge mcp install <agent>     # Install MCP config (copilot|claude-code|claude-desktop)
uv run fuzzforge mcp uninstall <agent>   # Remove MCP config
uv run fuzzforge mcp generate <agent>    # Preview config without installing
```

### Project Commands

```bash
uv run fuzzforge project init            # Initialize a project
uv run fuzzforge project info            # Show project info
uv run fuzzforge project executions      # List executions
uv run fuzzforge project results <id>    # Get execution results
```

---

## Environment Variables

Configure FuzzForge using environment variables:

```bash
# Override the FuzzForge installation root (auto-detected from cwd by default)
export FUZZFORGE_ROOT=/path/to/fuzzforge_ai

# Override the user-global data directory (default: ~/.fuzzforge)
# Useful for isolated testing without touching your real installation
export FUZZFORGE_USER_DIR=/tmp/my-fuzzforge-test

# Storage path for projects and execution results (default: <workspace>/.fuzzforge/storage)
export FUZZFORGE_STORAGE__PATH=/path/to/storage

# Container engine (Docker is default)
export FUZZFORGE_ENGINE__TYPE=docker   # or podman

# Podman-specific container storage paths
export FUZZFORGE_ENGINE__GRAPHROOT=~/.fuzzforge/containers/storage
export FUZZFORGE_ENGINE__RUNROOT=~/.fuzzforge/containers/run
```

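The double underscore in names like `FUZZFORGE_STORAGE__PATH` follows the usual nested-settings convention (as in pydantic-settings): `__` separates nesting levels. A sketch of that mapping, not FuzzForge's actual loader:

```python
import os

def nested_settings(prefix: str = "FUZZFORGE_") -> dict:
    """Turn FUZZFORGE_ENGINE__TYPE=docker into {'engine': {'type': 'docker'}}."""
    config: dict = {}
    for key, value in os.environ.items():
        if not key.startswith(prefix):
            continue
        parts = key[len(prefix):].lower().split("__")
        node = config
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return config
```

So `FUZZFORGE_ENGINE__GRAPHROOT` ends up under the same `engine` section as `FUZZFORGE_ENGINE__TYPE`.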
---
## Troubleshooting

### Docker Not Running

```
Error: Cannot connect to Docker daemon
```

**Solution:**

```bash
# Linux: Start the Docker service
sudo systemctl start docker

# macOS/Windows: Start the Docker Desktop application

# Verify Docker is running
docker run --rm hello-world
```

### Permission Denied on Docker Socket

```
Error: Permission denied connecting to Docker socket
```

**Solution:**

```bash
sudo usermod -aG docker $USER
# Log out and back in, then verify:
docker run --rm hello-world
```

### Hub Images Not Built

If the dashboard shows ✗ Not built for tools:

```bash
# Build all hub images
./scripts/build-hub-images.sh

# Or build a single tool
docker build -t <tool-name>:latest mcp-security-hub/<category>/<tool-name>/
```

### MCP Server Not Starting

```bash
# Check agent configuration
uv run fuzzforge mcp status

# Verify the config file path exists and contains valid JSON
cat ~/.config/Code/User/mcp.json   # Copilot
cat ~/.claude.json                 # Claude Code
```

### Using Podman Instead of Docker

```bash
# Install with the Podman engine
uv run fuzzforge mcp install copilot --engine podman

# Or set the environment variable
export FUZZFORGE_ENGINE=podman
```

### Hub Registry

FuzzForge stores linked hub information in `~/.fuzzforge/hubs.json`. If something goes wrong:

```bash
# View the registry
cat ~/.fuzzforge/hubs.json

# Reset the registry
rm ~/.fuzzforge/hubs.json
```

---

## Next Steps

- 🖥️ Launch `uv run fuzzforge ui` and explore the dashboard
- 🔒 Clone the [mcp-security-hub](https://github.com/FuzzingLabs/mcp-security-hub) for 40+ security tools
- 💬 Join our [Discord](https://discord.gg/8XEX33UUwZ) for support

---

<p align="center">
  <strong>Built with ❤️ by <a href="https://fuzzinglabs.com">FuzzingLabs</a></strong>
</p>

---

**ai/.gitignore**

```
.env
__pycache__/
*.pyc
fuzzforge_sessions.db
agentops.log
*.log
```

---

**ai/README.md**

# FuzzForge AI Module

FuzzForge AI is the multi-agent layer that lets you operate the FuzzForge security platform through natural language. It orchestrates local tooling, registered Agent-to-Agent (A2A) peers, and the Prefect-powered backend while keeping long-running context in memory and project knowledge graphs.

## Quick Start

1. **Initialise a project**
   ```bash
   cd /path/to/project
   fuzzforge init
   ```
2. **Review environment settings** – copy `.fuzzforge/.env.template` to `.fuzzforge/.env`, then edit the values to match your provider. The template ships with commented defaults for OpenAI-style usage and placeholders for Cognee keys.
   ```env
   LLM_PROVIDER=openai
   LITELLM_MODEL=gpt-5-mini
   OPENAI_API_KEY=sk-your-key
   FUZZFORGE_MCP_URL=http://localhost:8010/mcp
   SESSION_PERSISTENCE=sqlite
   ```
   Optional flags you may want to enable early:
   ```env
   MEMORY_SERVICE=inmemory
   AGENTOPS_API_KEY=sk-your-agentops-key   # Enable hosted tracing
   LOG_LEVEL=INFO                          # CLI / server log level
   ```
3. **Populate the knowledge graph**
   ```bash
   fuzzforge ingest --path . --recursive
   # alias: fuzzforge rag ingest --path . --recursive
   ```
4. **Launch the agent shell**
   ```bash
   fuzzforge ai agent
   ```
   Keep the backend running (Prefect API at `FUZZFORGE_MCP_URL`) so workflow commands succeed.

## Everyday Workflow

- Run `fuzzforge ai agent` and start with `list available fuzzforge workflows` or `/memory status` to confirm everything is wired.
- Use natural prompts for automation (`run fuzzforge workflow …`, `search project knowledge for …`) and fall back to slash commands for precision (`/recall`, `/sendfile`).
- Keep `/memory datasets` handy to see which Cognee datasets are available after each ingest.
- Start the HTTP surface with `python -m fuzzforge_ai` when external agents need access to artifacts or graph queries. The CLI stays usable at the same time.
- Refresh the knowledge graph regularly: `fuzzforge ingest --path . --recursive --force` keeps responses aligned with recent code changes.

## What the Agent Can Do

- **Route requests** – automatically selects the right local tool or remote agent using the A2A capability registry.
- **Run security workflows** – list, submit, and monitor FuzzForge workflows via MCP wrappers.
- **Manage artifacts** – create downloadable files for reports, code edits, and shared attachments.
- **Maintain context** – stores session history, semantic recall, and Cognee project graphs.
- **Serve over HTTP** – expose the same agent as an A2A server using `python -m fuzzforge_ai`.

## Essential Commands

Inside `fuzzforge ai agent` you can mix slash commands and free-form prompts:

```text
/list                     # Show registered A2A agents
/register http://:10201   # Add a remote agent
/artifacts                # List generated files
/sendfile SecurityAgent src/report.md "Please review"
You> route_to SecurityAnalyzer: scan ./backend for secrets
You> run fuzzforge workflow static_analysis_scan on ./test_projects/demo
You> search project knowledge for "prefect status" using INSIGHTS
```

Artifacts created during the conversation are served from `.fuzzforge/artifacts/` and exposed through the A2A HTTP API.

## Memory & Knowledge

The module layers three storage systems:

- **Session persistence** (SQLite or in-memory) for chat transcripts.
- **Semantic recall** via the ADK memory service for fuzzy search.
- **Cognee graphs** for project-wide knowledge built from ingestion runs.

Re-run ingestion after major code changes to keep graph answers relevant. If Cognee variables are not set, graph-specific tools automatically respond with a polite "not configured" message.

## Sample Prompts

Use these to validate the setup once the agent shell is running:

- `list available fuzzforge workflows`
- `run fuzzforge workflow static_analysis_scan on ./backend with target_branch=main`
- `show findings for that run once it finishes`
- `refresh the project knowledge graph for ./backend`
- `search project knowledge for "prefect readiness" using INSIGHTS`
- `/recall terraform secrets`
- `/memory status`
- `ROUTE_TO SecurityAnalyzer: audit infrastructure_vulnerable`

## Need More Detail?

Dive into the dedicated guides under `ai/docs/advanced/`:

- [Architecture](https://docs.fuzzforge.ai/docs/ai/intro) – High-level architecture with diagrams and component breakdowns.
- [Ingestion](https://docs.fuzzforge.ai/docs/ai/ingestion.md) – Command options, Cognee persistence, and prompt examples.
- [Configuration](https://docs.fuzzforge.ai/docs/ai/configuration.md) – LLM provider matrix, local model setup, and tracing options.
- [Prompts](https://docs.fuzzforge.ai/docs/ai/prompts.md) – Slash commands, workflow prompts, and routing tips.
- [A2A Services](https://docs.fuzzforge.ai/docs/ai/a2a-services.md) – HTTP endpoints, agent card, and collaboration flow.
- [Memory Persistence](https://docs.fuzzforge.ai/docs/ai/architecture.md#memory--persistence) – Deep dive on memory storage, datasets, and how `/memory status` inspects them.

## Development Notes

- Entry point for the CLI: `ai/src/fuzzforge_ai/cli.py`
- A2A HTTP server: `ai/src/fuzzforge_ai/a2a_server.py`
- Tool routing & workflow glue: `ai/src/fuzzforge_ai/agent_executor.py`
- Ingestion helpers: `ai/src/fuzzforge_ai/ingest_utils.py`

Install the module in editable mode (`pip install -e ai`) while iterating so CLI changes are picked up immediately.

---

**ai/llm.txt**

FuzzForge AI LLM Configuration Guide
====================================

This note summarises the environment variables and libraries that drive LiteLLM (via the Google ADK runtime) inside the FuzzForge AI module. For complete matrices and advanced examples, read `docs/advanced/configuration.md`.

Core Libraries
--------------
- `google-adk` – hosts the agent runtime, memory services, and LiteLLM bridge.
- `litellm` – provider-agnostic LLM client used by ADK and the executor.
- Provider SDKs – install the SDK that matches your target backend (`openai`, `anthropic`, `google-cloud-aiplatform`, `groq`, etc.).
- Optional extras: `agentops` for tracing, `cognee[all]` for knowledge-graph ingestion, the `ollama` CLI for running local models.

Quick install foundation::

```
pip install google-adk litellm openai
```

Add any provider-specific SDKs (for example `pip install anthropic groq`) on top of that base.

Baseline Setup
--------------
Copy `.fuzzforge/.env.template` to `.fuzzforge/.env` and set the core fields:

```
LLM_PROVIDER=openai
LITELLM_MODEL=gpt-5-mini
OPENAI_API_KEY=sk-your-key
FUZZFORGE_MCP_URL=http://localhost:8010/mcp
SESSION_PERSISTENCE=sqlite
MEMORY_SERVICE=inmemory
```

LiteLLM Provider Examples
-------------------------

OpenAI-compatible (Azure, etc.)::
```
LLM_PROVIDER=azure_openai
LITELLM_MODEL=gpt-4o-mini
LLM_API_KEY=sk-your-azure-key
LLM_ENDPOINT=https://your-resource.openai.azure.com
```

Anthropic::
```
LLM_PROVIDER=anthropic
LITELLM_MODEL=claude-3-haiku-20240307
ANTHROPIC_API_KEY=sk-your-key
```

Ollama (local)::
```
LLM_PROVIDER=ollama_chat
LITELLM_MODEL=codellama:latest
OLLAMA_API_BASE=http://localhost:11434
```
Run `ollama pull codellama:latest` so the adapter can respond immediately.

Vertex AI::
```
LLM_PROVIDER=vertex_ai
LITELLM_MODEL=gemini-1.5-pro
GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
```

Provider Checklist
------------------

- **OpenAI / Azure OpenAI**: `LLM_PROVIDER`, `LITELLM_MODEL`, API key, optional endpoint + API version (Azure).
- **Anthropic**: `LLM_PROVIDER=anthropic`, `LITELLM_MODEL`, `ANTHROPIC_API_KEY`.
- **Google Vertex AI**: `LLM_PROVIDER=vertex_ai`, `LITELLM_MODEL`, `GOOGLE_APPLICATION_CREDENTIALS`, `GOOGLE_CLOUD_PROJECT`.
- **Groq**: `LLM_PROVIDER=groq`, `LITELLM_MODEL`, `GROQ_API_KEY`.
- **Ollama / Local**: `LLM_PROVIDER=ollama_chat`, `LITELLM_MODEL`, `OLLAMA_API_BASE`, and the model pulled locally (`ollama pull <model>`).
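The checklist can be expressed as a small validator. The mapping below mirrors the bullets above but is a hypothetical helper for illustration, not a FuzzForge API, and is not exhaustive:

```python
import os

# Required variables per provider, mirroring the checklist above
# (assumption: this grouping is illustrative, not exhaustive).
REQUIRED_VARS = {
    "anthropic": ("LITELLM_MODEL", "ANTHROPIC_API_KEY"),
    "vertex_ai": ("LITELLM_MODEL", "GOOGLE_APPLICATION_CREDENTIALS", "GOOGLE_CLOUD_PROJECT"),
    "groq": ("LITELLM_MODEL", "GROQ_API_KEY"),
    "ollama_chat": ("LITELLM_MODEL", "OLLAMA_API_BASE"),
}


def missing_vars(provider: str, env=None) -> list:
    """Return the checklist variables that are not set for the chosen provider."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS.get(provider, ()) if not env.get(name)]


print(missing_vars("groq", env={"LITELLM_MODEL": "llama-3.1-8b-instant"}))
# ['GROQ_API_KEY']
```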
Knowledge Graph Add-ons
-----------------------

Set these only if you plan to use Cognee project graphs:

```
LLM_COGNEE_PROVIDER=openai
LLM_COGNEE_MODEL=gpt-5-mini
LLM_COGNEE_API_KEY=sk-your-key
```
Tracing & Debugging
-------------------

- Provide `AGENTOPS_API_KEY` to enable hosted traces for every conversation.
- Set `FUZZFORGE_DEBUG=1` (and optionally `LOG_LEVEL=DEBUG`) for verbose executor output.
- Restart the agent after changing environment variables; LiteLLM loads configuration on boot.
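A hands-on debugging session might start like this (exporting the flags named above; restart the agent afterwards so the new values are picked up at boot):

```shell
# Both flags are read once at startup, so export them before launching.
export FUZZFORGE_DEBUG=1
export LOG_LEVEL=DEBUG
```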
Further Reading
---------------

`docs/advanced/configuration.md` – provider comparison, debugging flags, and referenced modules.
44
ai/pyproject.toml
Normal file
@@ -0,0 +1,44 @@
[project]
name = "fuzzforge-ai"
version = "0.6.0"
description = "FuzzForge AI orchestration module"
readme = "README.md"
requires-python = ">=3.11"
dependencies = [
    "google-adk",
    "a2a-sdk",
    "litellm",
    "python-dotenv",
    "httpx",
    "uvicorn",
    "rich",
    "agentops",
    "fastmcp",
    "mcp",
    "typing-extensions",
    "cognee>=0.3.0",
]

[project.optional-dependencies]
dev = [
    "pytest",
    "pytest-asyncio",
    "black",
    "ruff",
]

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[tool.hatch.build.targets.wheel]
packages = ["src/fuzzforge_ai"]

[tool.hatch.metadata]
allow-direct-references = true

[tool.uv]
dev-dependencies = [
    "pytest",
    "pytest-asyncio",
]
24
ai/src/fuzzforge_ai/__init__.py
Normal file
@@ -0,0 +1,24 @@
"""
FuzzForge AI Module - Agent-to-Agent orchestration system

This module integrates the fuzzforge_ai components into FuzzForge,
providing intelligent AI agent capabilities for security analysis.
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.


__version__ = "0.6.0"

from .agent import FuzzForgeAgent
from .config_manager import ConfigManager

__all__ = ['FuzzForgeAgent', 'ConfigManager']
109
ai/src/fuzzforge_ai/__main__.py
Normal file
@@ -0,0 +1,109 @@
"""
FuzzForge A2A Server
Run this to expose FuzzForge as an A2A-compatible agent
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.


import os
import warnings
import logging
from dotenv import load_dotenv

from fuzzforge_ai.config_bridge import ProjectConfigManager

# Suppress warnings
warnings.filterwarnings("ignore")
logging.getLogger("google.adk").setLevel(logging.ERROR)
logging.getLogger("google.adk.tools.base_authenticated_tool").setLevel(logging.ERROR)

# Load .env from .fuzzforge directory first, then fallback
from pathlib import Path

# Ensure Cognee logs stay inside the project workspace
project_root = Path.cwd()
default_log_dir = project_root / ".fuzzforge" / "logs"
default_log_dir.mkdir(parents=True, exist_ok=True)
log_path = default_log_dir / "cognee.log"
os.environ.setdefault("COGNEE_LOG_PATH", str(log_path))
fuzzforge_env = Path.cwd() / ".fuzzforge" / ".env"
if fuzzforge_env.exists():
    load_dotenv(fuzzforge_env, override=True)
else:
    load_dotenv(override=True)

# Ensure Cognee uses the project-specific storage paths when available
try:
    project_config = ProjectConfigManager()
    project_config.setup_cognee_environment()
except Exception:
    # Project may not be initialized; fall through with default settings
    pass

# Check configuration
if not os.getenv('LITELLM_MODEL'):
    print("[ERROR] LITELLM_MODEL not set in .env file")
    print("Please set LITELLM_MODEL to your desired model (e.g., gpt-4o-mini)")
    exit(1)

from .agent import get_fuzzforge_agent
from .a2a_server import create_a2a_app as create_custom_a2a_app


def create_a2a_app():
    """Create the A2A application"""
    # Get configuration
    port = int(os.getenv('FUZZFORGE_PORT', 10100))

    # Get the FuzzForge agent
    fuzzforge = get_fuzzforge_agent(auto_start_server=False)

    # Print ASCII banner
    print("\033[95m")  # Purple color
    print(" ███████╗██╗ ██╗███████╗███████╗███████╗ ██████╗ ██████╗ ██████╗ ███████╗ █████╗ ██╗")
    print(" ██╔════╝██║ ██║╚══███╔╝╚══███╔╝██╔════╝██╔═══██╗██╔══██╗██╔════╝ ██╔════╝ ██╔══██╗██║")
    print(" █████╗ ██║ ██║ ███╔╝ ███╔╝ █████╗ ██║ ██║██████╔╝██║ ███╗█████╗ ███████║██║")
    print(" ██╔══╝ ██║ ██║ ███╔╝ ███╔╝ ██╔══╝ ██║ ██║██╔══██╗██║ ██║██╔══╝ ██╔══██║██║")
    print(" ██║ ╚██████╔╝███████╗███████╗██║ ╚██████╔╝██║ ██║╚██████╔╝███████╗ ██║ ██║██║")
    print(" ╚═╝ ╚═════╝ ╚══════╝╚══════╝╚═╝ ╚═════╝ ╚═╝ ╚═╝ ╚═════╝ ╚══════╝ ╚═╝ ╚═╝╚═╝")
    print("\033[0m")  # Reset color

    # Create A2A app
    print("🚀 Starting FuzzForge A2A Server")
    print(f" Model: {fuzzforge.model}")
    if fuzzforge.cognee_url:
        print(f" Memory: Cognee at {fuzzforge.cognee_url}")
    print(f" Port: {port}")

    app = create_custom_a2a_app(fuzzforge.adk_agent, port=port, executor=fuzzforge.executor)

    print("\n✅ FuzzForge A2A Server ready!")
    print(f" Agent card: http://localhost:{port}/.well-known/agent-card.json")
    print(f" A2A endpoint: http://localhost:{port}/")
    print(f"\n📡 Other agents can register FuzzForge at: http://localhost:{port}")

    return app


def main():
    """Start the A2A server using uvicorn."""
    import uvicorn

    app = create_a2a_app()
    port = int(os.getenv('FUZZFORGE_PORT', 10100))

    print("\n🎯 Starting server with uvicorn...")
    uvicorn.run(app, host="127.0.0.1", port=port)


if __name__ == "__main__":
    main()
230
ai/src/fuzzforge_ai/a2a_server.py
Normal file
@@ -0,0 +1,230 @@
"""Custom A2A wiring so we can access task store and queue manager."""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.


from __future__ import annotations

import logging
from typing import Optional, Union

from starlette.applications import Starlette
from starlette.responses import Response, FileResponse
from starlette.routing import Route

from google.adk.a2a.executor.a2a_agent_executor import A2aAgentExecutor
from google.adk.a2a.utils.agent_card_builder import AgentCardBuilder
from google.adk.a2a.experimental import a2a_experimental
from google.adk.agents.base_agent import BaseAgent
from google.adk.artifacts.in_memory_artifact_service import InMemoryArtifactService
from google.adk.auth.credential_service.in_memory_credential_service import InMemoryCredentialService
from google.adk.cli.utils.logs import setup_adk_logger
from google.adk.memory.in_memory_memory_service import InMemoryMemoryService
from google.adk.runners import Runner
from google.adk.sessions.in_memory_session_service import InMemorySessionService

from a2a.server.apps import A2AStarletteApplication
from a2a.server.request_handlers.default_request_handler import DefaultRequestHandler
from a2a.server.tasks.inmemory_task_store import InMemoryTaskStore
from a2a.server.events.in_memory_queue_manager import InMemoryQueueManager
from a2a.types import AgentCard

from .agent_executor import FuzzForgeExecutor


import json


async def serve_artifact(request):
    """Serve artifact files via HTTP for A2A agents"""
    artifact_id = request.path_params["artifact_id"]

    # Try to get the executor instance to access the artifact cache.
    # A reference to it is stored during app creation.
    executor = getattr(serve_artifact, '_executor', None)
    if not executor:
        return Response("Artifact service not available", status_code=503)

    try:
        # Look in the artifact cache directory
        artifact_cache_dir = executor._artifact_cache_dir
        artifact_dir = artifact_cache_dir / artifact_id

        if not artifact_dir.exists():
            return Response("Artifact not found", status_code=404)

        # Find the artifact file (should be only one file in the directory)
        artifact_files = list(artifact_dir.glob("*"))
        if not artifact_files:
            return Response("Artifact file not found", status_code=404)

        artifact_file = artifact_files[0]  # Take the first (and should be only) file

        # Determine mime type from file extension or default to octet-stream
        import mimetypes
        mime_type, _ = mimetypes.guess_type(str(artifact_file))
        if not mime_type:
            mime_type = 'application/octet-stream'

        return FileResponse(
            path=str(artifact_file),
            media_type=mime_type,
            filename=artifact_file.name
        )

    except Exception as e:
        return Response(f"Error serving artifact: {str(e)}", status_code=500)


async def knowledge_query(request):
    """Expose knowledge graph search over HTTP for external agents."""
    executor = getattr(knowledge_query, '_executor', None)
    if not executor:
        return Response("Knowledge service not available", status_code=503)

    try:
        payload = await request.json()
    except Exception:
        return Response("Invalid JSON body", status_code=400)

    query = payload.get("query")
    if not query:
        return Response("'query' is required", status_code=400)

    search_type = payload.get("search_type", "INSIGHTS")
    dataset = payload.get("dataset")

    result = await executor.query_project_knowledge_api(
        query=query,
        search_type=search_type,
        dataset=dataset,
    )

    status = 200 if not isinstance(result, dict) or "error" not in result else 400
    return Response(
        json.dumps(result, default=str),
        status_code=status,
        media_type="application/json",
    )


async def create_file_artifact(request):
    """Create an artifact from a project file via HTTP."""
    executor = getattr(create_file_artifact, '_executor', None)
    if not executor:
        return Response("File service not available", status_code=503)

    try:
        payload = await request.json()
    except Exception:
        return Response("Invalid JSON body", status_code=400)

    path = payload.get("path")
    if not path:
        return Response("'path' is required", status_code=400)

    result = await executor.create_project_file_artifact_api(path)
    status = 200 if not isinstance(result, dict) or "error" not in result else 400
    return Response(
        json.dumps(result, default=str),
        status_code=status,
        media_type="application/json",
    )


def _load_agent_card(agent_card: Optional[Union[AgentCard, str]]) -> Optional[AgentCard]:
    if agent_card is None:
        return None
    if isinstance(agent_card, AgentCard):
        return agent_card

    import json
    from pathlib import Path

    path = Path(agent_card)
    with path.open('r', encoding='utf-8') as handle:
        data = json.load(handle)
    return AgentCard(**data)


@a2a_experimental
def create_a2a_app(
    agent: BaseAgent,
    *,
    host: str = "localhost",
    port: int = 8000,
    protocol: str = "http",
    agent_card: Optional[Union[AgentCard, str]] = None,
    executor=None,  # Accept executor reference
) -> Starlette:
    """Variant of google.adk.a2a.utils.to_a2a that exposes task-store handles."""

    setup_adk_logger(logging.INFO)

    async def create_runner() -> Runner:
        return Runner(
            agent=agent,
            app_name=agent.name or "fuzzforge",
            artifact_service=InMemoryArtifactService(),
            session_service=InMemorySessionService(),
            memory_service=InMemoryMemoryService(),
            credential_service=InMemoryCredentialService(),
        )

    task_store = InMemoryTaskStore()
    queue_manager = InMemoryQueueManager()

    agent_executor = A2aAgentExecutor(runner=create_runner)
    request_handler = DefaultRequestHandler(
        agent_executor=agent_executor,
        task_store=task_store,
        queue_manager=queue_manager,
    )

    rpc_url = f"{protocol}://{host}:{port}/"
    provided_card = _load_agent_card(agent_card)

    card_builder = AgentCardBuilder(agent=agent, rpc_url=rpc_url)

    app = Starlette()

    async def setup() -> None:
        if provided_card is not None:
            final_card = provided_card
        else:
            final_card = await card_builder.build()

        a2a_app = A2AStarletteApplication(
            agent_card=final_card,
            http_handler=request_handler,
        )
        a2a_app.add_routes_to_app(app)

    # Add artifact serving route
    app.router.add_route("/artifacts/{artifact_id}", serve_artifact, methods=["GET"])
    app.router.add_route("/graph/query", knowledge_query, methods=["POST"])
    app.router.add_route("/project/files", create_file_artifact, methods=["POST"])

    app.add_event_handler("startup", setup)

    # Expose handles so the executor can emit task updates later
    FuzzForgeExecutor.task_store = task_store
    FuzzForgeExecutor.queue_manager = queue_manager

    # Store reference to executor for artifact serving
    serve_artifact._executor = executor
    knowledge_query._executor = executor
    create_file_artifact._executor = executor

    return app


__all__ = ["create_a2a_app"]
218
ai/src/fuzzforge_ai/agent.py
Normal file
@@ -0,0 +1,218 @@
"""
FuzzForge Agent Definition
The core agent that combines all components
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.


import os
import threading
import time
import socket
import asyncio
from pathlib import Path
from typing import Dict, Any, List, Optional
from google.adk import Agent
from google.adk.models.lite_llm import LiteLlm
from .agent_card import get_fuzzforge_agent_card
from .agent_executor import FuzzForgeExecutor
from .memory_service import FuzzForgeMemoryService, HybridMemoryManager

# Load environment variables from the AI module's .env file
try:
    from dotenv import load_dotenv
    _ai_dir = Path(__file__).parent
    _env_file = _ai_dir / ".env"
    if _env_file.exists():
        load_dotenv(_env_file, override=False)  # Don't override existing env vars
except ImportError:
    # dotenv not available, skip loading
    pass


class FuzzForgeAgent:
    """The main FuzzForge agent that combines card, executor, and ADK agent"""

    def __init__(
        self,
        model: str = None,
        cognee_url: str = None,
        port: int = 10100,
        auto_start_server: Optional[bool] = None,
    ):
        """Initialize FuzzForge agent with configuration"""
        self.model = model or os.getenv('LITELLM_MODEL', 'gpt-4o-mini')
        self.cognee_url = cognee_url or os.getenv('COGNEE_MCP_URL')
        self.port = int(os.getenv('FUZZFORGE_PORT', port))
        self._auto_start_server = (
            auto_start_server
            if auto_start_server is not None
            else os.getenv('FUZZFORGE_AUTO_A2A_SERVER', '1') not in {'0', 'false', 'False'}
        )
        self._uvicorn_server = None
        self._a2a_server_thread: Optional[threading.Thread] = None

        # Initialize ADK Memory Service for conversational memory
        memory_type = os.getenv('MEMORY_SERVICE', 'inmemory')
        self.memory_service = FuzzForgeMemoryService(memory_type=memory_type)

        # Create the executor (the brain) with memory and session services
        self.executor = FuzzForgeExecutor(
            model=self.model,
            cognee_url=self.cognee_url,
            debug=os.getenv('FUZZFORGE_DEBUG', '0') == '1',
            memory_service=self.memory_service,
            session_persistence=os.getenv('SESSION_PERSISTENCE', 'inmemory'),
            fuzzforge_mcp_url=os.getenv('FUZZFORGE_MCP_URL'),
        )

        # Create Hybrid Memory Manager (ADK + Cognee direct integration)
        # MCP tools removed - using direct Cognee integration only
        self.memory_manager = HybridMemoryManager(
            memory_service=self.memory_service,
            cognee_tools=None  # No MCP tools, direct integration used instead
        )

        # Get the agent card (the identity)
        self.agent_card = get_fuzzforge_agent_card(f"http://localhost:{self.port}")

        # Create the ADK agent (for A2A server mode)
        self.adk_agent = self._create_adk_agent()

        if self._auto_start_server:
            self._ensure_a2a_server_running()

    def _create_adk_agent(self) -> Agent:
        """Create the ADK agent for A2A server mode"""
        # Build instruction
        instruction = f"""You are {self.agent_card.name}, {self.agent_card.description}

Your capabilities include:
"""
        for skill in self.agent_card.skills:
            instruction += f"\n- {skill.name}: {skill.description}"

        instruction += """

When responding to requests:
1. Use your registered agents when appropriate
2. Use Cognee memory tools when available
3. Provide helpful, concise responses
4. Maintain context across conversations
"""

        # Create ADK agent
        return Agent(
            model=LiteLlm(model=self.model),
            name=self.agent_card.name,
            description=self.agent_card.description,
            instruction=instruction,
            tools=self.executor.agent.tools if hasattr(self.executor.agent, 'tools') else []
        )

    async def process_message(self, message: str, context_id: str = None) -> str:
        """Process a message using the executor"""
        result = await self.executor.execute(message, context_id or "default")
        return result.get("response", "No response generated")

    async def register_agent(self, url: str) -> Dict[str, Any]:
        """Register a new agent"""
        return await self.executor.register_agent(url)

    def list_agents(self) -> List[Dict[str, Any]]:
        """List registered agents"""
        return self.executor.list_agents()

    async def cleanup(self):
        """Clean up resources"""
        await self._stop_a2a_server()
        await self.executor.cleanup()

    def _ensure_a2a_server_running(self):
        """Start the A2A server in the background if it's not already running."""
        if self._a2a_server_thread and self._a2a_server_thread.is_alive():
            return

        try:
            from uvicorn import Config, Server
            from .a2a_server import create_a2a_app as create_custom_a2a_app
        except ImportError as exc:
            if os.getenv('FUZZFORGE_DEBUG', '0') == '1':
                print(f"[DEBUG] Unable to start A2A server automatically: {exc}")
            return

        app = create_custom_a2a_app(
            self.adk_agent,
            port=self.port,
            executor=self.executor,
        )

        log_level = os.getenv('FUZZFORGE_UVICORN_LOG_LEVEL', 'error')
        config = Config(app=app, host='127.0.0.1', port=self.port, log_level=log_level, loop='asyncio')
        server = Server(config=config)
        self._uvicorn_server = server

        def _run_server():
            loop = asyncio.new_event_loop()
            asyncio.set_event_loop(loop)

            async def _serve():
                await server.serve()

            try:
                loop.run_until_complete(_serve())
            finally:
                loop.close()

        thread = threading.Thread(target=_run_server, name='FuzzForgeA2AServer', daemon=True)
        thread.start()
        self._a2a_server_thread = thread

        # Give the server a moment to bind to the port for downstream agents
        for _ in range(50):
            if server.should_exit:
                break
            try:
                with socket.create_connection(('127.0.0.1', self.port), timeout=0.1):
                    if os.getenv('FUZZFORGE_DEBUG', '0') == '1':
                        print(f"[DEBUG] Auto-started A2A server on http://127.0.0.1:{self.port}")
                    break
            except OSError:
                time.sleep(0.1)

    async def _stop_a2a_server(self):
        """Shut down the background A2A server if we started one."""
        server = self._uvicorn_server
        if server is None:
            return

        server.should_exit = True
        if self._a2a_server_thread and self._a2a_server_thread.is_alive():
            # Allow server loop to exit gracefully without blocking event loop
            try:
                await asyncio.wait_for(asyncio.to_thread(self._a2a_server_thread.join, 5), timeout=6)
            except (asyncio.TimeoutError, RuntimeError):
                pass

        self._uvicorn_server = None
        self._a2a_server_thread = None


# Create a singleton instance for import
_instance = None

def get_fuzzforge_agent(auto_start_server: Optional[bool] = None) -> FuzzForgeAgent:
    """Get the singleton FuzzForge agent instance"""
    global _instance
    if _instance is None:
        _instance = FuzzForgeAgent(auto_start_server=auto_start_server)
    return _instance
183
ai/src/fuzzforge_ai/agent_card.py
Normal file
@@ -0,0 +1,183 @@
"""
FuzzForge Agent Card and Skills Definition
Defines what FuzzForge can do and how others can discover it
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.


from dataclasses import dataclass
from typing import List, Optional, Dict, Any

@dataclass
class AgentSkill:
    """Represents a specific capability of the agent"""
    id: str
    name: str
    description: str
    tags: List[str]
    examples: List[str]
    input_modes: List[str] = None
    output_modes: List[str] = None

    def to_dict(self) -> Dict[str, Any]:
        """Convert to dictionary for JSON serialization"""
        return {
            "id": self.id,
            "name": self.name,
            "description": self.description,
            "tags": self.tags,
            "examples": self.examples,
            "inputModes": self.input_modes or ["text/plain"],
            "outputModes": self.output_modes or ["text/plain"]
        }


@dataclass
class AgentCapabilities:
    """Defines agent capabilities for A2A protocol"""
    streaming: bool = False
    push_notifications: bool = False
    multi_turn: bool = True
    context_retention: bool = True

    def to_dict(self) -> Dict[str, Any]:
        return {
            "streaming": self.streaming,
            "pushNotifications": self.push_notifications,
            "multiTurn": self.multi_turn,
            "contextRetention": self.context_retention
        }


@dataclass
class AgentCard:
    """The agent's business card - tells others what this agent can do"""
    name: str
    description: str
    version: str
    url: str
    skills: List[AgentSkill]
    capabilities: AgentCapabilities
    default_input_modes: List[str] = None
    default_output_modes: List[str] = None
    preferred_transport: str = "JSONRPC"
    protocol_version: str = "0.3.0"

    def to_dict(self) -> Dict[str, Any]:
        """Convert to A2A-compliant agent card JSON"""
        return {
            "name": self.name,
            "description": self.description,
            "version": self.version,
            "url": self.url,
            "protocolVersion": self.protocol_version,
            "preferredTransport": self.preferred_transport,
            "defaultInputModes": self.default_input_modes or ["text/plain"],
            "defaultOutputModes": self.default_output_modes or ["text/plain"],
            "capabilities": self.capabilities.to_dict(),
            "skills": [skill.to_dict() for skill in self.skills]
        }


# Define FuzzForge's skills
orchestration_skill = AgentSkill(
    id="orchestration",
    name="Agent Orchestration",
    description="Route requests to appropriate registered agents based on their capabilities",
    tags=["orchestration", "routing", "coordination"],
    examples=[
        "Route this to the calculator",
        "Send this to the appropriate agent",
        "Which agent should handle this?"
    ]
)

memory_skill = AgentSkill(
    id="memory",
    name="Memory Management",
    description="Store and retrieve information using Cognee knowledge graph",
    tags=["memory", "knowledge", "storage", "cognee"],
    examples=[
        "Remember that my favorite color is blue",
        "What do you remember about me?",
        "Search your memory for project details"
    ]
)

conversation_skill = AgentSkill(
    id="conversation",
    name="General Conversation",
    description="Engage in general conversation and answer questions using LLM",
    tags=["chat", "conversation", "qa", "llm"],
    examples=[
        "What is the meaning of life?",
        "Explain quantum computing",
        "Help me understand this concept"
    ]
)

workflow_automation_skill = AgentSkill(
    id="workflow_automation",
    name="Workflow Automation",
    description="Operate project workflows via MCP, monitor runs, and share results",
    tags=["workflow", "automation", "mcp", "orchestration"],
    examples=[
        "Submit the security assessment workflow",
        "Kick off the infrastructure scan and monitor it",
        "Summarise findings for run abc123"
    ]
)

agent_management_skill = AgentSkill(
    id="agent_management",
    name="Agent Registry Management",
    description="Register, list, and manage connections to other A2A agents",
    tags=["registry", "management", "discovery"],
    examples=[
        "Register agent at http://localhost:10201",
        "List all registered agents",
        "Show agent capabilities"
    ]
)

# Define FuzzForge's capabilities
fuzzforge_capabilities = AgentCapabilities(
    streaming=False,
    push_notifications=True,
    multi_turn=True,  # We support multi-turn conversations
    context_retention=True  # We maintain context across turns
)

# Create the public agent card
def get_fuzzforge_agent_card(url: str = "http://localhost:10100") -> AgentCard:
    """Get FuzzForge's agent card with current configuration"""
    return AgentCard(
        name="ProjectOrchestrator",
        description=(
            "An A2A-capable project agent that can launch and monitor FuzzForge workflows, "
            "consult the project knowledge graph, and coordinate with speciality agents."
        ),
        version="project-agent",
        url=url,
        skills=[
            orchestration_skill,
            memory_skill,
            conversation_skill,
            workflow_automation_skill,
            agent_management_skill
        ],
        capabilities=fuzzforge_capabilities,
        default_input_modes=["text/plain", "application/json"],
        default_output_modes=["text/plain", "application/json"],
        preferred_transport="JSONRPC",
        protocol_version="0.3.0"
    )
2427
ai/src/fuzzforge_ai/agent_executor.py
Normal file
File diff suppressed because it is too large
977
ai/src/fuzzforge_ai/cli.py
Executable file
@@ -0,0 +1,977 @@
|
||||
#!/usr/bin/env python3
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.

"""
FuzzForge CLI - clean, modular version.

Uses the separated agent components.
"""

import asyncio
import shlex
import os
import sys
import signal
import warnings
import logging
import random
from datetime import datetime
from contextlib import contextmanager
from pathlib import Path
from typing import Any

from dotenv import load_dotenv
# Ensure Cognee writes logs inside the project workspace
project_root = Path.cwd()
default_log_dir = project_root / ".fuzzforge" / "logs"
default_log_dir.mkdir(parents=True, exist_ok=True)
log_path = default_log_dir / "cognee.log"
os.environ.setdefault("COGNEE_LOG_PATH", str(log_path))

# Suppress warnings
warnings.filterwarnings("ignore")
logging.basicConfig(level=logging.ERROR)

# Load the .env file with explicit path handling:
# 1. First check the current working directory for .fuzzforge/.env
fuzzforge_env = Path.cwd() / ".fuzzforge" / ".env"
if fuzzforge_env.exists():
    load_dotenv(fuzzforge_env, override=True)
else:
    # 2. Then check parent directories for .fuzzforge projects
    current_path = Path.cwd()
    for parent in [current_path] + list(current_path.parents):
        fuzzforge_dir = parent / ".fuzzforge"
        if fuzzforge_dir.exists():
            project_env = fuzzforge_dir / ".env"
            if project_env.exists():
                load_dotenv(project_env, override=True)
            break
    else:
        # 3. Fall back to a generic load_dotenv
        load_dotenv(override=True)
# Enhanced readline configuration for Rich Console input compatibility
try:
    import readline

    # Enable Rich-compatible input features
    readline.parse_and_bind("tab: complete")
    readline.parse_and_bind("set editing-mode emacs")
    readline.parse_and_bind("set show-all-if-ambiguous on")
    readline.parse_and_bind("set completion-ignore-case on")
    readline.parse_and_bind("set colored-completion-prefix on")
    readline.parse_and_bind("set enable-bracketed-paste on")  # Better paste support

    # Navigation bindings for better editing
    readline.parse_and_bind("Control-a: beginning-of-line")
    readline.parse_and_bind("Control-e: end-of-line")
    readline.parse_and_bind("Control-u: unix-line-discard")
    readline.parse_and_bind("Control-k: kill-line")
    readline.parse_and_bind("Control-w: unix-word-rubout")
    readline.parse_and_bind("Meta-Backspace: backward-kill-word")

    # History and completion
    readline.set_history_length(2000)
    readline.set_startup_hook(None)

    # Enable multiline editing hints
    readline.parse_and_bind("set horizontal-scroll-mode off")
    readline.parse_and_bind("set mark-symlinked-directories on")
    READLINE_AVAILABLE = True
except ImportError:
    READLINE_AVAILABLE = False
from rich.console import Console
from rich.table import Table
from rich.panel import Panel
from rich.prompt import Prompt
from rich import box

from google.adk.events.event import Event
from google.adk.events.event_actions import EventActions
from google.genai import types as gen_types

from .agent import FuzzForgeAgent
from .agent_card import get_fuzzforge_agent_card
from .config_manager import ConfigManager
from .config_bridge import ProjectConfigManager
from .remote_agent import RemoteAgentConnection

console = Console()

# Global shutdown flag
shutdown_requested = False

# Dynamic status messages for better UX
THINKING_MESSAGES = [
    "Thinking", "Processing", "Computing", "Analyzing", "Working",
    "Pondering", "Deliberating", "Calculating", "Reasoning", "Evaluating"
]

WORKING_MESSAGES = [
    "Working", "Processing", "Handling", "Executing", "Running",
    "Operating", "Performing", "Conducting", "Managing", "Coordinating"
]

SEARCH_MESSAGES = [
    "Searching", "Scanning", "Exploring", "Investigating", "Hunting",
    "Seeking", "Probing", "Examining", "Inspecting", "Browsing"
]

# Cool prompt symbols
PROMPT_STYLES = [
    "▶", "❯", "➤", "→", "»", "⟩", "▷", "⇨", "⟶", "◆"
]


def get_dynamic_status(action_type="thinking"):
    """Get a random status message based on action type"""
    if action_type == "thinking":
        return f"{random.choice(THINKING_MESSAGES)}..."
    elif action_type == "working":
        return f"{random.choice(WORKING_MESSAGES)}..."
    elif action_type == "searching":
        return f"{random.choice(SEARCH_MESSAGES)}..."
    else:
        return f"{random.choice(THINKING_MESSAGES)}..."


def get_prompt_symbol():
    """Get the prompt symbol indicating where to write"""
    return ">>"
def signal_handler(signum, frame):
    """Handle Ctrl+C gracefully"""
    global shutdown_requested
    shutdown_requested = True
    console.print("\n\n[yellow]Shutting down gracefully...[/yellow]")
    sys.exit(0)


signal.signal(signal.SIGINT, signal_handler)


@contextmanager
def safe_status(message: str):
    """Safe status context manager"""
    status = console.status(message, spinner="dots")
    try:
        status.start()
        yield
    finally:
        status.stop()
class FuzzForgeCLI:
    """Command-line interface for FuzzForge"""

    def __init__(self):
        """Initialize the CLI"""
        # Ensure .env is loaded from the .fuzzforge directory
        fuzzforge_env = Path.cwd() / ".fuzzforge" / ".env"
        if fuzzforge_env.exists():
            load_dotenv(fuzzforge_env, override=True)

        # Load configuration for the agent registry
        self.config_manager = ConfigManager()

        # Check environment configuration
        if not os.getenv('LITELLM_MODEL'):
            console.print("[red]ERROR: LITELLM_MODEL not set in .env file[/red]")
            console.print("Please set LITELLM_MODEL to your desired model")
            sys.exit(1)

        # Create the agent (uses env vars directly)
        self.agent = FuzzForgeAgent()

        # Create a consistent context ID for this CLI session
        self.context_id = f"cli_{datetime.now().strftime('%Y%m%d_%H%M%S')}"

        # Track registered agents for config persistence
        self.agents_modified = False

        # Command handlers
        self.commands = {
            "/help": self.cmd_help,
            "/register": self.cmd_register,
            "/unregister": self.cmd_unregister,
            "/list": self.cmd_list,
            "/memory": self.cmd_memory,
            "/recall": self.cmd_recall,
            "/artifacts": self.cmd_artifacts,
            "/tasks": self.cmd_tasks,
            "/skills": self.cmd_skills,
            "/sessions": self.cmd_sessions,
            "/clear": self.cmd_clear,
            "/sendfile": self.cmd_sendfile,
            "/quit": self.cmd_quit,
            "/exit": self.cmd_quit,
        }

        self.background_tasks: set[asyncio.Task] = set()
    def print_banner(self):
        """Print the welcome banner"""
        card = self.agent.agent_card

        # Print ASCII banner
        console.print("[medium_purple3] ███████╗██╗ ██╗███████╗███████╗███████╗ ██████╗ ██████╗ ██████╗ ███████╗ █████╗ ██╗[/medium_purple3]")
        console.print("[medium_purple3] ██╔════╝██║ ██║╚══███╔╝╚══███╔╝██╔════╝██╔═══██╗██╔══██╗██╔════╝ ██╔════╝ ██╔══██╗██║[/medium_purple3]")
        console.print("[medium_purple3] █████╗ ██║ ██║ ███╔╝ ███╔╝ █████╗ ██║ ██║██████╔╝██║ ███╗█████╗ ███████║██║[/medium_purple3]")
        console.print("[medium_purple3] ██╔══╝ ██║ ██║ ███╔╝ ███╔╝ ██╔══╝ ██║ ██║██╔══██╗██║ ██║██╔══╝ ██╔══██║██║[/medium_purple3]")
        console.print("[medium_purple3] ██║ ╚██████╔╝███████╗███████╗██║ ╚██████╔╝██║ ██║╚██████╔╝███████╗ ██║ ██║██║[/medium_purple3]")
        console.print("[medium_purple3] ╚═╝ ╚═════╝ ╚══════╝╚══════╝╚═╝ ╚═════╝ ╚═╝ ╚═╝ ╚═════╝ ╚══════╝ ╚═╝ ╚═╝╚═╝[/medium_purple3]")
        console.print(f"\n[dim]{card.description}[/dim]\n")

        provider = (
            os.getenv("LLM_PROVIDER")
            or os.getenv("LLM_COGNEE_PROVIDER")
            or os.getenv("COGNEE_LLM_PROVIDER")
            or "unknown"
        )

        console.print(f"LLM Provider: [medium_purple1]{provider}[/medium_purple1]")
        console.print(f"LLM Model: [medium_purple1]{self.agent.model}[/medium_purple1]")
        if self.agent.executor.agentops_trace:
            console.print("Tracking: [medium_purple1]AgentOps active[/medium_purple1]")

        # Show skills
        console.print("\nSkills:")
        for skill in card.skills:
            console.print(
                f" • [deep_sky_blue1]{skill.name}[/deep_sky_blue1] – {skill.description}"
            )
        console.print("\nType /help for commands or just chat\n")
    async def cmd_help(self, args: str = "") -> None:
        """Show help"""
        help_text = """
[bold]Commands:[/bold]
  /register <url>      - Register an A2A agent (saves to config)
  /unregister <name>   - Remove agent from registry and config
  /list                - List registered agents

[bold]Memory Systems:[/bold]
  /recall <query>      - Search past conversations (ADK Memory)
  /memory              - Show knowledge graph (Cognee)
  /memory save         - Save to knowledge graph
  /memory search       - Search knowledge graph

[bold]Other:[/bold]
  /artifacts           - List created artifacts
  /artifacts <id>      - Show artifact content
  /tasks [id]          - Show task list or details
  /skills              - Show FuzzForge skills
  /sessions            - List active sessions
  /sendfile <agent> <path> [message] - Attach file as artifact and route to agent
  /clear               - Clear screen
  /help                - Show this help
  /quit                - Exit

[bold]Sample prompts:[/bold]
  run fuzzforge workflow security_assessment on /absolute/path --volume-mode ro
  list fuzzforge runs limit=5
  get fuzzforge summary <run_id>
  query project knowledge about "unsafe Rust" using GRAPH_COMPLETION
  export project file src/lib.rs as artifact
  /memory search "recent findings"

[bold]Input Editing:[/bold]
  Arrow keys           - Move cursor
  Ctrl+A/E             - Start/end of line
  Up/Down              - Command history
"""
        console.print(help_text)
    async def cmd_register(self, args: str) -> None:
        """Register an agent"""
        if not args:
            console.print("Usage: /register <url>")
            return

        with safe_status(f"{get_dynamic_status('working')} Registering {args}"):
            result = await self.agent.register_agent(args.strip())

        if result["success"]:
            console.print(f"✅ Registered: [bold]{result['name']}[/bold]")
            console.print(f"   Capabilities: {result['capabilities']} skills")

            # Get the description from the agent's card
            agents = self.agent.list_agents()
            description = ""
            for agent in agents:
                if agent['name'] == result['name']:
                    description = agent.get('description', '')
                    break

            # Add to config for persistence
            self.config_manager.add_registered_agent(
                name=result['name'],
                url=args.strip(),
                description=description
            )
            console.print("   [dim]Saved to config for auto-registration[/dim]")
        else:
            console.print(f"[red]Failed: {result['error']}[/red]")

    async def cmd_unregister(self, args: str) -> None:
        """Unregister an agent and remove it from config"""
        if not args:
            console.print("Usage: /unregister <name or url>")
            return

        # Try to find the agent
        agents = self.agent.list_agents()
        agent_to_remove = None

        for agent in agents:
            if agent['name'].lower() == args.lower() or agent['url'] == args:
                agent_to_remove = agent
                break

        if not agent_to_remove:
            console.print(f"[yellow]Agent '{args}' not found[/yellow]")
            return

        # Remove from config
        if self.config_manager.remove_registered_agent(name=agent_to_remove['name'], url=agent_to_remove['url']):
            console.print(f"✅ Unregistered: [bold]{agent_to_remove['name']}[/bold]")
            console.print("   [dim]Removed from config (won't auto-register next time)[/dim]")
        else:
            console.print("[yellow]Agent unregistered from session but not found in config[/yellow]")
    async def cmd_list(self, args: str = "") -> None:
        """List registered agents"""
        agents = self.agent.list_agents()

        if not agents:
            console.print("No agents registered. Use /register <url>")
            return

        table = Table(title="Registered Agents", box=box.ROUNDED)
        table.add_column("Name", style="medium_purple3")
        table.add_column("URL", style="deep_sky_blue3")
        table.add_column("Skills", style="plum3")
        table.add_column("Description", style="dim")

        for agent in agents:
            desc = agent['description']
            if len(desc) > 40:
                desc = desc[:37] + "..."
            table.add_row(
                agent['name'],
                agent['url'],
                str(agent['skills']),
                desc
            )

        console.print(table)
    async def cmd_recall(self, args: str = "") -> None:
        """Search conversational memory (past conversations)"""
        if not args:
            console.print("Usage: /recall <query>")
            return

        await self._sync_conversational_memory()

        # First try the MemoryService (for ingested memories)
        with safe_status(get_dynamic_status('searching')):
            results = await self.agent.memory_manager.search_conversational_memory(args)

        if results and results.memories:
            console.print(f"[bold]Found {len(results.memories)} memories:[/bold]\n")
            for i, memory in enumerate(results.memories, 1):
                # MemoryEntry has a 'text' field, not 'content'
                text = getattr(memory, 'text', str(memory))
                if len(text) > 200:
                    text = text[:200] + "..."
                console.print(f"{i}. {text}")
        else:
            # If the MemoryService is empty, search SQLite directly
            console.print("[yellow]No memories in MemoryService, searching SQLite sessions...[/yellow]")

            # Check whether we are using DatabaseSessionService
            if hasattr(self.agent.executor, 'session_service'):
                service_type = type(self.agent.executor.session_service).__name__
                if service_type == 'DatabaseSessionService':
                    # Search the SQLite database directly
                    import json
                    import sqlite3
                    db_path = os.getenv('SESSION_DB_PATH', './fuzzforge_sessions.db')

                    if os.path.exists(db_path):
                        conn = sqlite3.connect(db_path)
                        cursor = conn.cursor()

                        # Search in the events table
                        query = f"%{args}%"
                        cursor.execute(
                            "SELECT content FROM events WHERE content LIKE ? LIMIT 10",
                            (query,)
                        )

                        rows = cursor.fetchall()
                        conn.close()

                        if rows:
                            console.print(f"[green]Found {len(rows)} matches in SQLite sessions:[/green]\n")
                            for i, (content,) in enumerate(rows, 1):
                                # Parse the JSON content; fall back to raw text on malformed rows
                                try:
                                    data = json.loads(content)
                                    if 'parts' in data and data['parts']:
                                        text = data['parts'][0].get('text', '')[:150]
                                        role = data.get('role', 'unknown')
                                        console.print(f"{i}. [{role}]: {text}...")
                                except (json.JSONDecodeError, KeyError, IndexError, AttributeError):
                                    console.print(f"{i}. {content[:150]}...")
                        else:
                            console.print("[yellow]No matches found in SQLite either[/yellow]")
                    else:
                        console.print("[yellow]SQLite database not found[/yellow]")
                else:
                    console.print(f"[dim]Using {service_type} (not searchable)[/dim]")
            else:
                console.print("[yellow]No session history available[/yellow]")
    async def cmd_memory(self, args: str = "") -> None:
        """Inspect conversational memory and knowledge graph state."""
        raw_args = (args or "").strip()
        lower_args = raw_args.lower()

        if not raw_args or lower_args in {"status", "info"}:
            await self._show_memory_status()
            return

        if lower_args == "datasets":
            await self._show_dataset_summary()
            return

        if lower_args.startswith("search ") or lower_args.startswith("recall "):
            query = raw_args.split(" ", 1)[1].strip() if " " in raw_args else ""
            if not query:
                console.print("Usage: /memory search <query>")
                return
            await self.cmd_recall(query)
            return

        console.print("Usage: /memory [status|datasets|search <query>]")
        console.print("[dim]/memory search <query> is an alias for /recall <query>[/dim]")

    async def _sync_conversational_memory(self) -> None:
        """Ensure the ADK memory service ingests any completed sessions."""
        memory_service = getattr(self.agent.memory_manager, "memory_service", None)
        executor_sessions = getattr(self.agent.executor, "sessions", {})
        metadata_map = getattr(self.agent.executor, "session_metadata", {})

        if not memory_service or not executor_sessions:
            return

        for context_id, session in list(executor_sessions.items()):
            meta = metadata_map.get(context_id, {})
            if meta.get('memory_synced'):
                continue

            add_session = getattr(memory_service, "add_session_to_memory", None)
            if not callable(add_session):
                return

            try:
                await add_session(session)
                meta['memory_synced'] = True
                metadata_map[context_id] = meta
            except Exception as exc:  # pragma: no cover - defensive logging
                if os.getenv('FUZZFORGE_DEBUG', '0') == '1':
                    console.print(f"[yellow]Memory sync failed:[/yellow] {exc}")
    async def _show_memory_status(self) -> None:
        """Render conversational memory, session store, and knowledge graph status."""
        await self._sync_conversational_memory()

        status = self.agent.memory_manager.get_status()

        conversational = status.get("conversational_memory", {})
        conv_type = conversational.get("type", "unknown")
        conv_active = "yes" if conversational.get("active") else "no"
        conv_details = conversational.get("details", "")

        session_service = getattr(self.agent.executor, "session_service", None)
        session_service_name = type(session_service).__name__ if session_service else "Unavailable"

        session_lines = [
            f"[bold]Service:[/bold] {session_service_name}"
        ]

        session_count = None
        event_count = None
        db_path_display = None

        if session_service_name == "DatabaseSessionService":
            import sqlite3

            db_path = os.getenv('SESSION_DB_PATH', './fuzzforge_sessions.db')
            session_path = Path(db_path).expanduser().resolve()
            db_path_display = str(session_path)

            if session_path.exists():
                try:
                    with sqlite3.connect(session_path) as conn:
                        cursor = conn.cursor()
                        cursor.execute("SELECT COUNT(*) FROM sessions")
                        session_count = cursor.fetchone()[0]
                        cursor.execute("SELECT COUNT(*) FROM events")
                        event_count = cursor.fetchone()[0]
                except Exception as exc:
                    session_lines.append(f"[yellow]Warning:[/yellow] Unable to read session database ({exc})")
            else:
                session_lines.append("[yellow]SQLite session database not found yet[/yellow]")

        elif session_service_name == "InMemorySessionService":
            session_lines.append("[dim]Session data persists for the current process only[/dim]")

        if db_path_display:
            session_lines.append(f"[bold]Database:[/bold] {db_path_display}")
        if session_count is not None:
            session_lines.append(f"[bold]Sessions Recorded:[/bold] {session_count}")
        if event_count is not None:
            session_lines.append(f"[bold]Events Logged:[/bold] {event_count}")

        conv_lines = [
            f"[bold]Type:[/bold] {conv_type}",
            f"[bold]Active:[/bold] {conv_active}"
        ]
        if conv_details:
            conv_lines.append(f"[bold]Details:[/bold] {conv_details}")

        console.print(Panel("\n".join(conv_lines), title="Conversation Memory", border_style="medium_purple3"))
        console.print(Panel("\n".join(session_lines), title="Session Store", border_style="deep_sky_blue3"))

        # Knowledge graph section
        knowledge = status.get("knowledge_graph", {})
        kg_active = knowledge.get("active", False)
        kg_lines = [
            f"[bold]Active:[/bold] {'yes' if kg_active else 'no'}",
            f"[bold]Purpose:[/bold] {knowledge.get('purpose', 'N/A')}"
        ]

        cognee_data = None
        cognee_error = None
        try:
            project_config = ProjectConfigManager()
            cognee_data = project_config.get_cognee_config()
        except Exception as exc:  # pragma: no cover - defensive
            cognee_error = str(exc)

        if cognee_data:
            data_dir = cognee_data.get('data_directory')
            system_dir = cognee_data.get('system_directory')
            if data_dir:
                kg_lines.append(f"[bold]Data dir:[/bold] {data_dir}")
            if system_dir:
                kg_lines.append(f"[bold]System dir:[/bold] {system_dir}")
        elif cognee_error:
            kg_lines.append(f"[yellow]Config unavailable:[/yellow] {cognee_error}")

        dataset_summary = None
        if kg_active:
            try:
                integration = await self.agent.executor._get_knowledge_integration()
                if integration:
                    dataset_summary = await integration.list_datasets()
            except Exception as exc:  # pragma: no cover - defensive
                kg_lines.append(f"[yellow]Dataset listing failed:[/yellow] {exc}")

        if dataset_summary:
            if dataset_summary.get("error"):
                kg_lines.append(f"[yellow]Dataset listing failed:[/yellow] {dataset_summary['error']}")
            else:
                datasets = dataset_summary.get("datasets", [])
                total = dataset_summary.get("total_datasets")
                if total is not None:
                    kg_lines.append(f"[bold]Datasets:[/bold] {total}")
                if datasets:
                    preview = ", ".join(sorted(datasets)[:5])
                    if len(datasets) > 5:
                        preview += ", …"
                    kg_lines.append(f"[bold]Samples:[/bold] {preview}")
                else:
                    kg_lines.append("[dim]Run `fuzzforge ingest` to populate the knowledge graph[/dim]")

        console.print(Panel("\n".join(kg_lines), title="Knowledge Graph", border_style="spring_green4"))
        console.print("\n[dim]Subcommands: /memory datasets | /memory search <query>[/dim]")
    async def _show_dataset_summary(self) -> None:
        """List datasets available in the Cognee knowledge graph."""
        try:
            integration = await self.agent.executor._get_knowledge_integration()
        except Exception as exc:
            console.print(f"[yellow]Knowledge graph unavailable:[/yellow] {exc}")
            return

        if not integration:
            console.print("[yellow]Knowledge graph is not initialised yet.[/yellow]")
            console.print("[dim]Run `fuzzforge ingest --path . --recursive` to create the project dataset.[/dim]")
            return

        with safe_status(get_dynamic_status('searching')):
            dataset_info = await integration.list_datasets()

        if dataset_info.get("error"):
            console.print(f"[red]{dataset_info['error']}[/red]")
            return

        datasets = dataset_info.get("datasets", [])
        if not datasets:
            console.print("[yellow]No datasets found.[/yellow]")
            console.print("[dim]Run `fuzzforge ingest` to populate the knowledge graph.[/dim]")
            return

        table = Table(title="Cognee Datasets", box=box.ROUNDED)
        table.add_column("Dataset", style="medium_purple3")
        table.add_column("Notes", style="dim")

        for name in sorted(datasets):
            note = ""
            if name.endswith("_codebase"):
                note = "primary project dataset"
            table.add_row(name, note)

        console.print(table)
        console.print(
            "[dim]Use knowledge graph prompts (e.g. `search project knowledge for \"topic\" using INSIGHTS`) to query these datasets.[/dim]"
        )
    async def cmd_artifacts(self, args: str = "") -> None:
        """List or show artifacts"""
        if args:
            # Show a specific artifact
            artifacts = await self.agent.executor.get_artifacts(self.context_id)
            for artifact in artifacts:
                if artifact['id'] == args or args in artifact['id']:
                    console.print(Panel(
                        f"[bold]{artifact['title']}[/bold]\n"
                        f"Type: {artifact['type']} | Created: {artifact['created_at'][:19]}\n\n"
                        f"[code]{artifact['content']}[/code]",
                        title=f"Artifact: {artifact['id']}",
                        border_style="medium_purple3"
                    ))
                    return
            console.print(f"[yellow]Artifact {args} not found[/yellow]")
            return

        # List all artifacts
        artifacts = await self.agent.executor.get_artifacts(self.context_id)

        if not artifacts:
            console.print("No artifacts created yet")
            console.print("[dim]Artifacts are created when generating code, configs, or documents[/dim]")
            return

        table = Table(title="Artifacts", box=box.ROUNDED)
        table.add_column("ID", style="medium_purple3")
        table.add_column("Type", style="deep_sky_blue3")
        table.add_column("Title", style="plum3")
        table.add_column("Size", style="dim")
        table.add_column("Created", style="dim")

        for artifact in artifacts:
            size = f"{len(artifact['content'])} chars"
            created = artifact['created_at'][:19]  # Just the date and time

            table.add_row(
                artifact['id'],
                artifact['type'],
                artifact['title'][:40] + "..." if len(artifact['title']) > 40 else artifact['title'],
                size,
                created
            )

        console.print(table)
        console.print("\n[dim]Use /artifacts <id> to view artifact content[/dim]")
    async def cmd_tasks(self, args: str = "") -> None:
        """List tasks or show details for a specific task."""
        store = getattr(self.agent.executor, "task_store", None)
        if not store or not hasattr(store, "tasks"):
            console.print("Task store not available")
            return

        task_id = args.strip()

        async with store.lock:
            tasks = dict(store.tasks)

        if not tasks:
            console.print("No tasks recorded yet")
            return

        if task_id:
            task = tasks.get(task_id)
            if not task:
                console.print(f"Task '{task_id}' not found")
                return

            state_str = task.status.state.value if hasattr(task.status.state, "value") else str(task.status.state)
            console.print(f"\n[bold]Task {task.id}[/bold]")
            console.print(f"Context: {task.context_id}")
            console.print(f"State: {state_str}")
            console.print(f"Timestamp: {task.status.timestamp}")
            if task.metadata:
                console.print("Metadata:")
                for key, value in task.metadata.items():
                    console.print(f"  • {key}: {value}")
            if task.history:
                console.print("History:")
                for entry in task.history[-5:]:
                    text = getattr(entry, "text", None)
                    if not text and hasattr(entry, "parts"):
                        text = " ".join(
                            getattr(part, "text", "") for part in getattr(entry, "parts", [])
                        )
                    console.print(f"  - {text}")
            return

        table = Table(title="FuzzForge Tasks", box=box.ROUNDED)
        table.add_column("ID", style="medium_purple3")
        table.add_column("State", style="white")
        table.add_column("Workflow", style="deep_sky_blue3")
        table.add_column("Updated", style="green")

        for task in tasks.values():
            state_value = task.status.state.value if hasattr(task.status.state, "value") else str(task.status.state)
            workflow = ""
            if task.metadata:
                workflow = task.metadata.get("workflow") or task.metadata.get("workflow_name") or ""
            timestamp = task.status.timestamp if task.status else ""
            table.add_row(task.id, state_value, workflow, timestamp)

        console.print(table)
        console.print("\n[dim]Use /tasks <id> to view task details[/dim]")
    async def cmd_sessions(self, args: str = "") -> None:
        """List active sessions"""
        sessions = self.agent.executor.sessions

        if not sessions:
            console.print("No active sessions")
            return

        table = Table(title="Active Sessions", box=box.ROUNDED)
        table.add_column("Context ID", style="medium_purple3")
        table.add_column("Session ID", style="deep_sky_blue3")
        table.add_column("User ID", style="plum3")
        table.add_column("State", style="dim")

        for context_id, session in sessions.items():
            # Get session info
            session_id = getattr(session, 'id', 'N/A')
            user_id = getattr(session, 'user_id', 'N/A')
            state = getattr(session, 'state', {})

            # Format state info
            agents_count = len(state.get('registered_agents', []))
            state_info = f"{agents_count} agents registered"

            table.add_row(
                context_id[:20] + "..." if len(context_id) > 20 else context_id,
                session_id[:20] + "..." if len(str(session_id)) > 20 else str(session_id),
                user_id,
                state_info
            )

        console.print(table)
        console.print(f"\n[dim]Current session: {self.context_id}[/dim]")

    async def cmd_skills(self, args: str = "") -> None:
        """Show FuzzForge skills"""
        card = self.agent.agent_card

        table = Table(title=f"{card.name} Skills", box=box.ROUNDED)
        table.add_column("Skill", style="medium_purple3")
        table.add_column("Description", style="white")
        table.add_column("Tags", style="deep_sky_blue3")

        for skill in card.skills:
            table.add_row(
                skill.name,
                skill.description,
                ", ".join(skill.tags[:3])
            )

        console.print(table)
    async def cmd_clear(self, args: str = "") -> None:
        """Clear the screen"""
        console.clear()
        self.print_banner()

    async def cmd_sendfile(self, args: str) -> None:
        """Encode a local file as an artifact and route it to a registered agent."""
        tokens = shlex.split(args)
        if len(tokens) < 2:
            console.print("Usage: /sendfile <agent_name> <path> [message]")
            return

        agent_name = tokens[0]
        file_arg = tokens[1]
        note = " ".join(tokens[2:]).strip()

        file_path = Path(file_arg).expanduser()
        if not file_path.exists():
            console.print(f"[red]File not found:[/red] {file_path}")
            return

        session = self.agent.executor.sessions.get(self.context_id)
        if not session:
            console.print("[red]No active session available. Try sending a prompt first.[/red]")
            return

        console.print(f"[dim]Delegating {file_path.name} to {agent_name}...[/dim]")

        async def _delegate() -> None:
            try:
                response = await self.agent.executor.delegate_file_to_agent(
                    agent_name,
                    str(file_path),
                    note,
                    session=session,
                    context_id=self.context_id,
                )
                console.print(f"[{agent_name}]: {response}")
            except Exception as exc:
                console.print(f"[red]Failed to delegate file:[/red] {exc}")
            finally:
                self.background_tasks.discard(asyncio.current_task())

        task = asyncio.create_task(_delegate())
        self.background_tasks.add(task)
        console.print("[dim]Delegation in progress… you can continue working.[/dim]")
async def cmd_quit(self, args: str = "") -> None:
|
||||
"""Exit the CLI"""
|
||||
console.print("\n[green]Shutting down...[/green]")
|
||||
await self.agent.cleanup()
|
||||
if self.background_tasks:
|
||||
for task in list(self.background_tasks):
|
||||
task.cancel()
|
||||
await asyncio.gather(*self.background_tasks, return_exceptions=True)
|
||||
console.print("Goodbye!\n")
|
||||
sys.exit(0)
|
||||
|
||||
async def process_command(self, text: str) -> bool:
|
||||
"""Process slash commands"""
|
||||
if not text.startswith('/'):
|
||||
return False
|
||||
|
||||
parts = text.split(maxsplit=1)
|
||||
cmd = parts[0].lower()
|
||||
args = parts[1] if len(parts) > 1 else ""
|
||||
|
||||
if cmd in self.commands:
|
||||
await self.commands[cmd](args)
|
||||
return True
|
||||
|
||||
console.print(f"Unknown command: {cmd}")
|
||||
return True
|
||||
|
||||
async def auto_register_agents(self):
|
||||
"""Auto-register agents from config on startup"""
|
||||
agents_to_register = self.config_manager.get_registered_agents()
|
||||
|
||||
if agents_to_register:
|
||||
console.print(f"\n[dim]Auto-registering {len(agents_to_register)} agents from config...[/dim]")
|
||||
|
||||
for agent_config in agents_to_register:
|
||||
url = agent_config.get('url')
|
||||
name = agent_config.get('name', 'Unknown')
|
||||
|
||||
if url:
|
||||
try:
|
||||
with safe_status(f"Registering {name}..."):
|
||||
result = await self.agent.register_agent(url)
|
||||
|
||||
if result["success"]:
|
||||
console.print(f" ✅ {name}: [green]Connected[/green]")
|
||||
else:
|
||||
console.print(f" ⚠️ {name}: [yellow]Failed - {result.get('error', 'Unknown error')}[/yellow]")
|
||||
except Exception as e:
|
||||
console.print(f" ⚠️ {name}: [yellow]Failed - {e}[/yellow]")
|
||||
|
||||
console.print("") # Empty line for spacing
|
||||
|
||||
async def run(self):
|
||||
"""Main CLI loop"""
|
||||
self.print_banner()
|
||||
|
||||
# Auto-register agents from config
|
||||
await self.auto_register_agents()
|
||||
|
||||
while not shutdown_requested:
|
||||
try:
|
||||
# Use standard input with non-deletable colored prompt
|
||||
prompt_symbol = get_prompt_symbol()
|
||||
try:
|
||||
# Print colored prompt then use input() for non-deletable behavior
|
||||
console.print(f"[medium_purple3]{prompt_symbol}[/medium_purple3] ", end="")
|
||||
user_input = input().strip()
|
||||
except (EOFError, KeyboardInterrupt):
|
||||
raise
|
||||
|
||||
if not user_input:
|
||||
continue
|
||||
|
||||
# Check for commands
|
||||
if await self.process_command(user_input):
|
||||
continue
|
||||
|
||||
# Process message
|
||||
with safe_status(get_dynamic_status('thinking')):
|
||||
response = await self.agent.process_message(user_input, self.context_id)
|
||||
|
||||
# Display response
|
||||
console.print(f"\n{response}\n")
|
||||
|
||||
except KeyboardInterrupt:
|
||||
await self.cmd_quit()
|
||||
|
||||
except EOFError:
|
||||
await self.cmd_quit()
|
||||
|
||||
except Exception as e:
|
||||
console.print(f"[red]Error: {e}[/red]")
|
||||
if os.getenv('FUZZFORGE_DEBUG') == '1':
|
||||
console.print_exception()
|
||||
console.print("")
|
||||
|
||||
await self.agent.cleanup()
|
||||
|
||||
|
||||
def main():
|
||||
"""Main entry point"""
|
||||
try:
|
||||
cli = FuzzForgeCLI()
|
||||
asyncio.run(cli.run())
|
||||
except KeyboardInterrupt:
|
||||
console.print("\n[yellow]Interrupted[/yellow]")
|
||||
sys.exit(0)
|
||||
except Exception as e:
|
||||
console.print(f"[red]Fatal error: {e}[/red]")
|
||||
if os.getenv('FUZZFORGE_DEBUG') == '1':
|
||||
console.print_exception()
|
||||
sys.exit(1)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
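The dispatch logic in `process_command` is a simple split-and-lookup: the first whitespace-separated token (lowercased) selects a handler, and the remainder of the line is passed as its argument string. A minimal standalone sketch of that pattern — the `COMMANDS` table and `cmd_skills` handler here are hypothetical stand-ins for the CLI's `self.commands` mapping:

```python
import asyncio

# Hypothetical handler standing in for one of the CLI's command coroutines
async def cmd_skills(args: str) -> str:
    return f"skills({args!r})"

COMMANDS = {"/skills": cmd_skills}

async def process_command(text: str):
    """Return the handler result, or None if text is not a slash command."""
    if not text.startswith("/"):
        return None
    parts = text.split(maxsplit=1)          # "/skills foo bar" -> ["/skills", "foo bar"]
    cmd = parts[0].lower()
    args = parts[1] if len(parts) > 1 else ""
    handler = COMMANDS.get(cmd)
    if handler is None:
        return f"Unknown command: {cmd}"
    return await handler(args)

print(asyncio.run(process_command("/Skills foo bar")))  # → skills('foo bar')
```

Note that lowercasing only the command token keeps the argument string (file paths, messages) case-sensitive, which matters for commands like `/sendfile`.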
435 ai/src/fuzzforge_ai/cognee_integration.py Normal file
@@ -0,0 +1,435 @@
"""
Cognee Integration Module for FuzzForge
Provides standardized access to project-specific knowledge graphs
Can be reused by external agents and other components
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.


import os
import asyncio
import json
from typing import Dict, List, Any, Optional, Union
from pathlib import Path


class CogneeProjectIntegration:
    """
    Standardized Cognee integration that can be reused across agents
    Automatically detects project context and provides knowledge graph access
    """

    def __init__(self, project_dir: Optional[str] = None):
        """
        Initialize with project directory (defaults to current working directory)

        Args:
            project_dir: Path to project directory (optional, defaults to cwd)
        """
        self.project_dir = Path(project_dir) if project_dir else Path.cwd()
        self.config_file = self.project_dir / ".fuzzforge" / "config.yaml"
        self.project_context = None
        self._cognee = None
        self._initialized = False

    async def initialize(self) -> bool:
        """
        Initialize Cognee with project context

        Returns:
            bool: True if initialization successful
        """
        try:
            # Import Cognee
            import cognee
            self._cognee = cognee

            # Load project context
            if not self._load_project_context():
                return False

            # Configure Cognee for this project
            await self._setup_cognee_config()

            self._initialized = True
            return True

        except ImportError:
            print("Cognee not installed. Install with: pip install cognee")
            return False
        except Exception as e:
            print(f"Failed to initialize Cognee: {e}")
            return False

    def _load_project_context(self) -> bool:
        """Load project context from FuzzForge config"""
        try:
            if not self.config_file.exists():
                print(f"No FuzzForge config found at {self.config_file}")
                return False

            import yaml
            with open(self.config_file, 'r') as f:
                config = yaml.safe_load(f)

            self.project_context = {
                "project_name": config.get("project", {}).get("name", "default"),
                "project_id": config.get("project", {}).get("id", "default"),
                "tenant_id": config.get("cognee", {}).get("tenant", "default")
            }
            return True

        except Exception as e:
            print(f"Error loading project context: {e}")
            return False

    async def _setup_cognee_config(self):
        """Configure Cognee for project-specific access"""
        # Set API key and model
        api_key = os.getenv('OPENAI_API_KEY')
        model = os.getenv('LITELLM_MODEL', 'gpt-4o-mini')

        if not api_key:
            raise ValueError("OPENAI_API_KEY required for Cognee operations")

        # Configure Cognee
        self._cognee.config.set_llm_api_key(api_key)
        self._cognee.config.set_llm_model(model)
        self._cognee.config.set_llm_provider("openai")

        # Set project-specific directories
        project_cognee_dir = self.project_dir / ".fuzzforge" / "cognee" / f"project_{self.project_context['project_id']}"

        self._cognee.config.data_root_directory(str(project_cognee_dir / "data"))
        self._cognee.config.system_root_directory(str(project_cognee_dir / "system"))

        # Ensure directories exist
        project_cognee_dir.mkdir(parents=True, exist_ok=True)
        (project_cognee_dir / "data").mkdir(exist_ok=True)
        (project_cognee_dir / "system").mkdir(exist_ok=True)

    async def search_knowledge_graph(self, query: str, search_type: str = "GRAPH_COMPLETION", dataset: str = None) -> Dict[str, Any]:
        """
        Search the project's knowledge graph

        Args:
            query: Search query
            search_type: Type of search ("GRAPH_COMPLETION", "INSIGHTS", "CHUNKS", etc.)
            dataset: Specific dataset to search (optional)

        Returns:
            Dict containing search results
        """
        if not self._initialized:
            await self.initialize()

        if not self._initialized:
            return {"error": "Cognee not initialized"}

        try:
            from cognee.modules.search.types import SearchType

            # Resolve search type dynamically; fallback to GRAPH_COMPLETION
            try:
                search_type_enum = getattr(SearchType, search_type.upper())
            except AttributeError:
                search_type_enum = SearchType.GRAPH_COMPLETION
                search_type = "GRAPH_COMPLETION"

            # Prepare search kwargs
            search_kwargs = {
                "query_type": search_type_enum,
                "query_text": query
            }

            # Add dataset filter if specified
            if dataset:
                search_kwargs["datasets"] = [dataset]

            results = await self._cognee.search(**search_kwargs)

            return {
                "query": query,
                "search_type": search_type,
                "dataset": dataset,
                "results": results,
                "project": self.project_context["project_name"]
            }
        except Exception as e:
            return {"error": f"Search failed: {e}"}

    async def list_knowledge_data(self) -> Dict[str, Any]:
        """
        List available data in the knowledge graph

        Returns:
            Dict containing available data
        """
        if not self._initialized:
            await self.initialize()

        if not self._initialized:
            return {"error": "Cognee not initialized"}

        try:
            data = await self._cognee.list_data()
            return {
                "project": self.project_context["project_name"],
                "available_data": data
            }
        except Exception as e:
            return {"error": f"Failed to list data: {e}"}

    async def ingest_text_to_dataset(self, text: str, dataset: str = None) -> Dict[str, Any]:
        """
        Ingest text content into a specific dataset

        Args:
            text: Text to ingest
            dataset: Dataset name (defaults to project_name_codebase)

        Returns:
            Dict containing ingest results
        """
        if not self._initialized:
            await self.initialize()

        if not self._initialized:
            return {"error": "Cognee not initialized"}

        if not dataset:
            dataset = f"{self.project_context['project_name']}_codebase"

        try:
            # Add text to dataset
            await self._cognee.add([text], dataset_name=dataset)

            # Process (cognify) the dataset
            await self._cognee.cognify([dataset])

            return {
                "text_length": len(text),
                "dataset": dataset,
                "project": self.project_context["project_name"],
                "status": "success"
            }
        except Exception as e:
            return {"error": f"Ingest failed: {e}"}

    async def ingest_files_to_dataset(self, file_paths: list, dataset: str = None) -> Dict[str, Any]:
        """
        Ingest multiple files into a specific dataset

        Args:
            file_paths: List of file paths to ingest
            dataset: Dataset name (defaults to project_name_codebase)

        Returns:
            Dict containing ingest results
        """
        if not self._initialized:
            await self.initialize()

        if not self._initialized:
            return {"error": "Cognee not initialized"}

        if not dataset:
            dataset = f"{self.project_context['project_name']}_codebase"

        try:
            # Validate and filter readable files
            valid_files = []
            for file_path in file_paths:
                try:
                    path = Path(file_path)
                    if path.exists() and path.is_file():
                        # Test if file is readable
                        with open(path, 'r', encoding='utf-8') as f:
                            f.read(1)
                        valid_files.append(str(path))
                except (UnicodeDecodeError, PermissionError, OSError):
                    continue

            if not valid_files:
                return {"error": "No valid files found to ingest"}

            # Add files to dataset
            await self._cognee.add(valid_files, dataset_name=dataset)

            # Process (cognify) the dataset
            await self._cognee.cognify([dataset])

            return {
                "files_processed": len(valid_files),
                "total_files_requested": len(file_paths),
                "dataset": dataset,
                "project": self.project_context["project_name"],
                "status": "success"
            }
        except Exception as e:
            return {"error": f"Ingest failed: {e}"}

    async def list_datasets(self) -> Dict[str, Any]:
        """
        List all datasets available in the project

        Returns:
            Dict containing available datasets
        """
        if not self._initialized:
            await self.initialize()

        if not self._initialized:
            return {"error": "Cognee not initialized"}

        try:
            # Get available datasets by searching for data
            data = await self._cognee.list_data()

            # Extract unique dataset names from the data
            datasets = set()
            if isinstance(data, list):
                for item in data:
                    if isinstance(item, dict) and 'dataset_name' in item:
                        datasets.add(item['dataset_name'])

            return {
                "project": self.project_context["project_name"],
                "datasets": list(datasets),
                "total_datasets": len(datasets)
            }
        except Exception as e:
            return {"error": f"Failed to list datasets: {e}"}

    async def create_dataset(self, dataset: str) -> Dict[str, Any]:
        """
        Create a new dataset (dataset is created automatically when data is added)

        Args:
            dataset: Dataset name to create

        Returns:
            Dict containing creation result
        """
        if not self._initialized:
            await self.initialize()

        if not self._initialized:
            return {"error": "Cognee not initialized"}

        try:
            # In Cognee, datasets are created implicitly when data is added
            # We'll add empty content to create the dataset
            await self._cognee.add([f"Dataset {dataset} initialized for project {self.project_context['project_name']}"],
                                   dataset_name=dataset)

            return {
                "dataset": dataset,
                "project": self.project_context["project_name"],
                "status": "created"
            }
        except Exception as e:
            return {"error": f"Failed to create dataset: {e}"}

    def get_project_context(self) -> Optional[Dict[str, str]]:
        """Get current project context"""
        return self.project_context

    def is_initialized(self) -> bool:
        """Check if Cognee is initialized"""
        return self._initialized


# Convenience functions for easy integration
async def search_project_codebase(query: str, project_dir: Optional[str] = None, dataset: str = None, search_type: str = "GRAPH_COMPLETION") -> str:
    """
    Convenience function to search project codebase

    Args:
        query: Search query
        project_dir: Project directory (optional, defaults to cwd)
        dataset: Specific dataset to search (optional)
        search_type: Type of search ("GRAPH_COMPLETION", "INSIGHTS", "CHUNKS")

    Returns:
        Formatted search results as string
    """
    cognee_integration = CogneeProjectIntegration(project_dir)
    result = await cognee_integration.search_knowledge_graph(query, search_type, dataset)

    if "error" in result:
        return f"Error searching codebase: {result['error']}"

    project_name = result.get("project", "Unknown")
    results = result.get("results", [])

    if not results:
        return f"No results found for '{query}' in project {project_name}"

    output = f"Search results for '{query}' in project {project_name}:\n\n"

    # Format results
    if isinstance(results, list):
        for i, item in enumerate(results, 1):
            if isinstance(item, dict):
                # Handle structured results
                output += f"{i}. "
                if "search_result" in item:
                    output += f"Dataset: {item.get('dataset_name', 'Unknown')}\n"
                    for result_item in item["search_result"]:
                        if isinstance(result_item, dict):
                            if "name" in result_item:
                                output += f"  - {result_item['name']}: {result_item.get('description', '')}\n"
                            elif "text" in result_item:
                                text = result_item["text"][:200] + "..." if len(result_item["text"]) > 200 else result_item["text"]
                                output += f"  - {text}\n"
                            else:
                                output += f"  - {str(result_item)[:200]}...\n"
                else:
                    output += f"{str(item)[:200]}...\n"
                output += "\n"
            else:
                output += f"{i}. {str(item)[:200]}...\n\n"
    else:
        output += f"{str(results)[:500]}..."

    return output


async def list_project_knowledge(project_dir: Optional[str] = None) -> str:
    """
    Convenience function to list project knowledge

    Args:
        project_dir: Project directory (optional, defaults to cwd)

    Returns:
        Formatted list of available data
    """
    cognee_integration = CogneeProjectIntegration(project_dir)
    result = await cognee_integration.list_knowledge_data()

    if "error" in result:
        return f"Error listing knowledge: {result['error']}"

    project_name = result.get("project", "Unknown")
    data = result.get("available_data", [])

    output = f"Available knowledge in project {project_name}:\n\n"

    if not data:
        output += "No data available in knowledge graph"
    else:
        for i, item in enumerate(data, 1):
            output += f"{i}. {item}\n"

    return output
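Two small conventions recur in the module above: default dataset names are derived as `<project_name>_codebase`, and long result texts are previewed at 200 characters before display. A minimal sketch of both, assuming nothing beyond the code shown (the helper names here are illustrative, not part of the module):

```python
def default_dataset(project_name: str, suffix: str = "codebase") -> str:
    # Mirrors the f"{project_name}_codebase" fallback used when no dataset is given
    return f"{project_name}_{suffix}"

def preview(text: str, limit: int = 200) -> str:
    # Mirrors the truncation applied to result texts in search_project_codebase
    return text[:limit] + "..." if len(text) > limit else text

print(default_dataset("demo"))  # → demo_codebase
print(preview("short result"))  # → short result
```

Keeping both conventions in one place would make it easier for external agents to address the same datasets the CLI creates.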
416 ai/src/fuzzforge_ai/cognee_service.py Normal file
@@ -0,0 +1,416 @@
"""
Cognee Service for FuzzForge
Provides integrated Cognee functionality for codebase analysis and knowledge graphs
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.


import os
import asyncio
import logging
from pathlib import Path
from typing import Dict, List, Any, Optional
from datetime import datetime

logger = logging.getLogger(__name__)


class CogneeService:
    """
    Service for managing Cognee integration with FuzzForge
    Handles multi-tenant isolation and project-specific knowledge graphs
    """

    def __init__(self, config):
        """Initialize with FuzzForge config"""
        self.config = config
        self.cognee_config = config.get_cognee_config()
        self.project_context = config.get_project_context()
        self._cognee = None
        self._user = None
        self._initialized = False

    async def initialize(self):
        """Initialize Cognee with project-specific configuration"""
        try:
            # Ensure environment variables for Cognee are set before import
            self.config.setup_cognee_environment()
            logger.debug(
                "Cognee environment configured",
                extra={
                    "data": self.cognee_config.get("data_directory"),
                    "system": self.cognee_config.get("system_directory"),
                },
            )

            import cognee
            self._cognee = cognee

            # Configure LLM with API key BEFORE any other cognee operations
            provider = os.getenv("LLM_PROVIDER", "openai")
            model = os.getenv("LLM_MODEL") or os.getenv("LITELLM_MODEL", "gpt-4o-mini")
            api_key = os.getenv("LLM_API_KEY") or os.getenv("OPENAI_API_KEY")
            endpoint = os.getenv("LLM_ENDPOINT")
            api_version = os.getenv("LLM_API_VERSION")
            max_tokens = os.getenv("LLM_MAX_TOKENS")

            if provider.lower() in {"openai", "azure_openai", "custom"} and not api_key:
                raise ValueError(
                    "OpenAI-compatible API key is required for Cognee LLM operations. "
                    "Set OPENAI_API_KEY, LLM_API_KEY, or COGNEE_LLM_API_KEY in your .env"
                )

            # Expose environment variables for downstream libraries
            os.environ["LLM_PROVIDER"] = provider
            os.environ["LITELLM_MODEL"] = model
            os.environ["LLM_MODEL"] = model
            if api_key:
                os.environ["LLM_API_KEY"] = api_key
                # Maintain compatibility with components still expecting OPENAI_API_KEY
                if provider.lower() in {"openai", "azure_openai", "custom"}:
                    os.environ.setdefault("OPENAI_API_KEY", api_key)
            if endpoint:
                os.environ["LLM_ENDPOINT"] = endpoint
            if api_version:
                os.environ["LLM_API_VERSION"] = api_version
            if max_tokens:
                os.environ["LLM_MAX_TOKENS"] = str(max_tokens)

            # Configure Cognee's runtime using its configuration helpers when available
            if hasattr(cognee.config, "set_llm_provider"):
                cognee.config.set_llm_provider(provider)
            if hasattr(cognee.config, "set_llm_model"):
                cognee.config.set_llm_model(model)
            if api_key and hasattr(cognee.config, "set_llm_api_key"):
                cognee.config.set_llm_api_key(api_key)
            if endpoint and hasattr(cognee.config, "set_llm_endpoint"):
                cognee.config.set_llm_endpoint(endpoint)
            if api_version and hasattr(cognee.config, "set_llm_api_version"):
                cognee.config.set_llm_api_version(api_version)
            if max_tokens and hasattr(cognee.config, "set_llm_max_tokens"):
                cognee.config.set_llm_max_tokens(int(max_tokens))

            # Configure graph database
            cognee.config.set_graph_db_config({
                "graph_database_provider": self.cognee_config.get("graph_database_provider", "kuzu"),
            })

            # Set data directories
            data_dir = self.cognee_config.get("data_directory")
            system_dir = self.cognee_config.get("system_directory")

            if data_dir:
                logger.debug("Setting cognee data root", extra={"path": data_dir})
                cognee.config.data_root_directory(data_dir)
            if system_dir:
                logger.debug("Setting cognee system root", extra={"path": system_dir})
                cognee.config.system_root_directory(system_dir)

            # Setup multi-tenant user context
            await self._setup_user_context()

            self._initialized = True
            logger.info(f"Cognee initialized for project {self.project_context['project_name']} "
                        f"with Kuzu at {system_dir}")

        except ImportError:
            logger.error("Cognee not installed. Install with: pip install cognee")
            raise
        except Exception as e:
            logger.error(f"Failed to initialize Cognee: {e}")
            raise

    async def create_dataset(self):
        """Create dataset for this project if it doesn't exist"""
        if not self._initialized:
            await self.initialize()

        try:
            # Dataset creation is handled automatically by Cognee when adding files
            # We just ensure we have the right context set up
            dataset_name = f"{self.project_context['project_name']}_codebase"
            logger.info(f"Dataset {dataset_name} ready for project {self.project_context['project_name']}")
            return dataset_name
        except Exception as e:
            logger.error(f"Failed to create dataset: {e}")
            raise

    async def _setup_user_context(self):
        """Setup user context for multi-tenant isolation"""
        try:
            from cognee.modules.users.methods import create_user, get_user

            # Always try fallback email first to avoid validation issues
            fallback_email = f"project_{self.project_context['project_id']}@fuzzforge.example"
            user_tenant = self.project_context['tenant_id']

            # Try to get existing fallback user first
            try:
                self._user = await get_user(fallback_email)
                logger.info(f"Using existing user: {fallback_email}")
                return
            except Exception:
                # User doesn't exist, try to create fallback
                pass

            # Create fallback user
            try:
                self._user = await create_user(fallback_email, user_tenant)
                logger.info(f"Created fallback user: {fallback_email} for tenant: {user_tenant}")
                return
            except Exception as fallback_error:
                logger.warning(f"Fallback user creation failed: {fallback_error}")
                self._user = None
                return

        except Exception as e:
            logger.warning(f"Could not setup multi-tenant user context: {e}")
            logger.info("Proceeding with default context")
            self._user = None

    def get_project_dataset_name(self, dataset_suffix: str = "codebase") -> str:
        """Get project-specific dataset name"""
        return f"{self.project_context['project_name']}_{dataset_suffix}"

    async def ingest_text(self, content: str, dataset: str = "fuzzforge") -> bool:
        """Ingest text content into knowledge graph"""
        if not self._initialized:
            await self.initialize()

        try:
            await self._cognee.add([content], dataset)
            await self._cognee.cognify([dataset])
            return True
        except Exception as e:
            logger.error(f"Failed to ingest text: {e}")
            return False

    async def ingest_files(self, file_paths: List[Path], dataset: str = "fuzzforge") -> Dict[str, Any]:
        """Ingest multiple files into knowledge graph"""
        if not self._initialized:
            await self.initialize()

        results = {
            "success": 0,
            "failed": 0,
            "errors": []
        }

        try:
            ingest_paths: List[str] = []
            for file_path in file_paths:
                try:
                    with open(file_path, 'r', encoding='utf-8'):
                        ingest_paths.append(str(file_path))
                        results["success"] += 1
                except (UnicodeDecodeError, PermissionError) as exc:
                    results["failed"] += 1
                    results["errors"].append(f"{file_path}: {exc}")
                    logger.warning("Skipping %s: %s", file_path, exc)

            if ingest_paths:
                await self._cognee.add(ingest_paths, dataset_name=dataset)
                await self._cognee.cognify([dataset])

        except Exception as e:
            logger.error(f"Failed to ingest files: {e}")
            results["errors"].append(f"Cognify error: {str(e)}")

        return results

    async def search_insights(self, query: str, dataset: str = None) -> List[str]:
        """Search for insights in the knowledge graph"""
        if not self._initialized:
            await self.initialize()

        try:
            from cognee.modules.search.types import SearchType

            kwargs = {
                "query_type": SearchType.INSIGHTS,
                "query_text": query
            }

            if dataset:
                kwargs["datasets"] = [dataset]

            results = await self._cognee.search(**kwargs)
            return results if isinstance(results, list) else []

        except Exception as e:
            logger.error(f"Failed to search insights: {e}")
            return []

    async def search_chunks(self, query: str, dataset: str = None) -> List[str]:
        """Search for relevant text chunks"""
        if not self._initialized:
            await self.initialize()

        try:
            from cognee.modules.search.types import SearchType

            kwargs = {
                "query_type": SearchType.CHUNKS,
                "query_text": query
            }

            if dataset:
                kwargs["datasets"] = [dataset]

            results = await self._cognee.search(**kwargs)
            return results if isinstance(results, list) else []

        except Exception as e:
            logger.error(f"Failed to search chunks: {e}")
            return []

    async def search_graph_completion(self, query: str) -> List[str]:
        """Search for graph completion (relationships)"""
        if not self._initialized:
            await self.initialize()

        try:
            from cognee.modules.search.types import SearchType

            results = await self._cognee.search(
                query_type=SearchType.GRAPH_COMPLETION,
                query_text=query
            )
            return results if isinstance(results, list) else []

        except Exception as e:
            logger.error(f"Failed to search graph completion: {e}")
            return []

    async def get_status(self) -> Dict[str, Any]:
        """Get service status and statistics"""
        status = {
            "initialized": self._initialized,
            "enabled": self.cognee_config.get("enabled", True),
            "provider": self.cognee_config.get("graph_database_provider", "kuzu"),
            "data_directory": self.cognee_config.get("data_directory"),
            "system_directory": self.cognee_config.get("system_directory"),
        }

        if self._initialized:
            try:
                # Check if directories exist and get sizes
                data_dir = Path(status["data_directory"])
                system_dir = Path(status["system_directory"])

                status.update({
                    "data_dir_exists": data_dir.exists(),
                    "system_dir_exists": system_dir.exists(),
                    "kuzu_db_exists": (system_dir / "kuzu_db").exists(),
                    "lancedb_exists": (system_dir / "lancedb").exists(),
                })

            except Exception as e:
                status["status_error"] = str(e)

        return status

    async def clear_data(self, confirm: bool = False):
        """Clear all ingested data (dangerous!)"""
        if not confirm:
            raise ValueError("Must confirm data clearing with confirm=True")

        if not self._initialized:
            await self.initialize()

        try:
            await self._cognee.prune.prune_data()
            await self._cognee.prune.prune_system(metadata=True)
            logger.info("Cognee data cleared")
        except Exception as e:
            logger.error(f"Failed to clear data: {e}")
            raise


class FuzzForgeCogneeIntegration:
    """
    Main integration class for FuzzForge + Cognee
    Provides high-level operations for security analysis
    """

    def __init__(self, config):
        self.service = CogneeService(config)

    async def analyze_codebase(self, path: Path, recursive: bool = True) -> Dict[str, Any]:
        """
        Analyze a codebase and extract security-relevant insights
        """
        # Collect code files
        from fuzzforge_ai.ingest_utils import collect_ingest_files

        files = collect_ingest_files(path, recursive, None, [])

        if not files:
            return {"error": "No files found to analyze"}

        # Ingest files
        results = await self.service.ingest_files(files, "security_analysis")

        if results["success"] == 0:
            return {"error": "Failed to ingest any files", "details": results}

        # Extract security insights
        security_queries = [
            "vulnerabilities security risks",
            "authentication authorization",
            "input validation sanitization",
            "encryption cryptography",
            "error handling exceptions",
            "logging sensitive data"
        ]

        insights = {}
        for query in security_queries:
            insight_results = await self.service.search_insights(query, "security_analysis")
            if insight_results:
                insights[query.replace(" ", "_")] = insight_results

        return {
            "files_processed": results["success"],
            "files_failed": results["failed"],
            "errors": results["errors"],
            "security_insights": insights
        }

    async def query_codebase(self, query: str, search_type: str = "insights") -> List[str]:
        """Query the ingested codebase"""
        if search_type == "insights":
            return await self.service.search_insights(query)
        elif search_type == "chunks":
            return await self.service.search_chunks(query)
        elif search_type == "graph":
            return await self.service.search_graph_completion(query)
        else:
            raise ValueError(f"Unknown search type: {search_type}")

    async def get_project_summary(self) -> Dict[str, Any]:
        """Get a summary of the analyzed project"""
        # Search for general project insights
        summary_queries = [
            "project structure components",
            "main functionality features",
            "programming languages frameworks",
            "dependencies libraries"
        ]

        summary = {}
        for query in summary_queries:
            results = await self.service.search_insights(query)
            if results:
                summary[query.replace(" ", "_")] = results[:3]  # Top 3 results

        return summary
||||
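The query loops in `analyze_codebase` and `get_project_summary` share one shape: run each canned query, keep only non-empty result lists, and key them by the underscored query string. A standalone sketch of that pattern, where the hypothetical `fake_search` stands in for `service.search_insights`:

```python
import asyncio
from typing import Dict, List


async def fake_search(query: str) -> List[str]:
    # Hypothetical stand-in for CogneeService.search_insights()
    return [f"insight about {query}"] if "auth" in query else []


async def collect_insights(queries: List[str]) -> Dict[str, List[str]]:
    insights: Dict[str, List[str]] = {}
    for query in queries:
        results = await fake_search(query)
        if results:  # drop queries that returned nothing
            insights[query.replace(" ", "_")] = results
    return insights


print(asyncio.run(collect_insights(["authentication authorization", "encryption cryptography"])))
```

Queries that come back empty never appear in the result dict, which is why the returned `security_insights` only lists categories with actual findings.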
ai/src/fuzzforge_ai/config.yaml (Normal file, 9 lines)
@@ -0,0 +1,9 @@
# FuzzForge Registered Agents
# These agents will be automatically registered on startup

registered_agents:

# Example entries:
# - name: Calculator
#   url: http://localhost:10201
#   description: Mathematical calculations agent
ai/src/fuzzforge_ai/config_bridge.py (Normal file, 31 lines)
@@ -0,0 +1,31 @@
"""Bridge module providing access to the host CLI configuration manager."""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.


try:
    from fuzzforge_cli.config import ProjectConfigManager as _ProjectConfigManager
except ImportError as exc:  # pragma: no cover - used when CLI not available
    class _ProjectConfigManager:  # type: ignore[no-redef]
        """Fallback implementation that raises a helpful error."""

        def __init__(self, *args, **kwargs):
            raise ImportError(
                "ProjectConfigManager is unavailable. Install the FuzzForge CLI "
                "package or supply a compatible configuration object."
            ) from exc

        def __getattr__(name):  # pragma: no cover - defensive
            raise ImportError("ProjectConfigManager unavailable") from exc

ProjectConfigManager = _ProjectConfigManager

__all__ = ["ProjectConfigManager"]
ai/src/fuzzforge_ai/config_manager.py (Normal file, 134 lines)
@@ -0,0 +1,134 @@
"""
Configuration manager for FuzzForge
Handles loading and saving registered agents
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.


import os
import yaml
from typing import Any, Dict, List, Optional


class ConfigManager:
    """Manages FuzzForge agent registry configuration"""

    def __init__(self, config_path: Optional[str] = None):
        """Initialize config manager"""
        if config_path:
            self.config_path = config_path
        else:
            # Check for local .fuzzforge/agents.yaml first, then fall back to global
            local_config = os.path.join(os.getcwd(), '.fuzzforge', 'agents.yaml')
            global_config = os.path.join(os.path.dirname(__file__), 'config.yaml')

            if os.path.exists(local_config):
                self.config_path = local_config
                if os.getenv("FUZZFORGE_DEBUG", "0") == "1":
                    print(f"[CONFIG] Using local config: {local_config}")
            else:
                self.config_path = global_config
                if os.getenv("FUZZFORGE_DEBUG", "0") == "1":
                    print(f"[CONFIG] Using global config: {global_config}")

        self.config = self.load_config()

    def load_config(self) -> Dict[str, Any]:
        """Load configuration from YAML file"""
        if not os.path.exists(self.config_path):
            # Create default config if it doesn't exist
            return {'registered_agents': []}

        try:
            with open(self.config_path, 'r') as f:
                config = yaml.safe_load(f) or {}
            # Ensure registered_agents is a list
            if 'registered_agents' not in config or config['registered_agents'] is None:
                config['registered_agents'] = []
            return config
        except Exception as e:
            print(f"[WARNING] Failed to load config: {e}")
            return {'registered_agents': []}

    def save_config(self) -> bool:
        """Save current configuration to file"""
        try:
            # Create a clean config with comments
            config_content = """# FuzzForge Registered Agents
# These agents will be automatically registered on startup

"""
            # Add the agents list
            if self.config.get('registered_agents'):
                config_content += yaml.dump({'registered_agents': self.config['registered_agents']},
                                            default_flow_style=False, sort_keys=False)
            else:
                config_content += "registered_agents: []\n"

            config_content += """
# Example entries:
# - name: Calculator
#   url: http://localhost:10201
#   description: Mathematical calculations agent
"""

            with open(self.config_path, 'w') as f:
                f.write(config_content)

            return True
        except Exception as e:
            print(f"[ERROR] Failed to save config: {e}")
            return False

    def get_registered_agents(self) -> List[Dict[str, Any]]:
        """Get list of registered agents from config"""
        return self.config.get('registered_agents', [])

    def add_registered_agent(self, name: str, url: str, description: str = "") -> bool:
        """Add a new registered agent to config"""
        if 'registered_agents' not in self.config:
            self.config['registered_agents'] = []

        # Check if agent already exists
        for agent in self.config['registered_agents']:
            if agent.get('url') == url:
                # Update existing agent
                agent['name'] = name
                agent['description'] = description
                return self.save_config()

        # Add new agent
        self.config['registered_agents'].append({
            'name': name,
            'url': url,
            'description': description
        })

        return self.save_config()

    def remove_registered_agent(self, name: Optional[str] = None, url: Optional[str] = None) -> bool:
        """Remove a registered agent from config"""
        if 'registered_agents' not in self.config:
            return False

        original_count = len(self.config['registered_agents'])

        # Filter out the agent
        self.config['registered_agents'] = [
            agent for agent in self.config['registered_agents']
            if not ((name and agent.get('name') == name) or
                    (url and agent.get('url') == url))
        ]

        if len(self.config['registered_agents']) < original_count:
            return self.save_config()

        return False
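`add_registered_agent` treats the URL as the identity key: a matching URL updates the existing entry in place, otherwise a new entry is appended. The upsert logic in isolation, assuming a plain list of dicts like the config above holds:

```python
from typing import Any, Dict, List


def upsert_agent(agents: List[Dict[str, Any]], name: str, url: str, description: str = "") -> List[Dict[str, Any]]:
    for agent in agents:
        if agent.get("url") == url:
            # Same URL: update the existing registration in place
            agent["name"] = name
            agent["description"] = description
            return agents
    # New URL: append a fresh registration
    agents.append({"name": name, "url": url, "description": description})
    return agents


agents: List[Dict[str, Any]] = []
upsert_agent(agents, "Calculator", "http://localhost:10201")
upsert_agent(agents, "CalcV2", "http://localhost:10201", "updated")
print(agents)  # one entry, renamed to CalcV2
```

Keying on URL rather than name means re-registering an agent at the same endpoint never produces duplicates, which is what lets startup registration be idempotent.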
ai/src/fuzzforge_ai/ingest_utils.py (Normal file, 104 lines)
@@ -0,0 +1,104 @@
"""Utilities for collecting files to ingest into Cognee."""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.


from __future__ import annotations

import fnmatch
from pathlib import Path
from typing import Iterable, List, Optional

_DEFAULT_FILE_TYPES = [
    ".py",
    ".js",
    ".ts",
    ".java",
    ".cpp",
    ".c",
    ".h",
    ".rs",
    ".go",
    ".rb",
    ".php",
    ".cs",
    ".swift",
    ".kt",
    ".scala",
    ".clj",
    ".hs",
    ".md",
    ".txt",
    ".yaml",
    ".yml",
    ".json",
    ".toml",
    ".cfg",
    ".ini",
]

_DEFAULT_EXCLUDE = [
    "*.pyc",
    "__pycache__",
    ".git",
    ".svn",
    ".hg",
    "node_modules",
    ".venv",
    "venv",
    ".env",
    "dist",
    "build",
    ".pytest_cache",
    ".mypy_cache",
    ".tox",
    "coverage",
    "*.log",
    "*.tmp",
]


def collect_ingest_files(
    path: Path,
    recursive: bool = True,
    file_types: Optional[Iterable[str]] = None,
    exclude: Optional[Iterable[str]] = None,
) -> List[Path]:
    """Return a list of files eligible for ingestion."""
    path = path.resolve()
    files: List[Path] = []

    extensions = list(file_types) if file_types else list(_DEFAULT_FILE_TYPES)
    exclusions = list(exclude) if exclude else []
    exclusions.extend(_DEFAULT_EXCLUDE)

    def should_exclude(file_path: Path) -> bool:
        file_str = str(file_path)
        for pattern in exclusions:
            if fnmatch.fnmatch(file_str, f"*{pattern}*") or fnmatch.fnmatch(file_path.name, pattern):
                return True
        return False

    if path.is_file():
        if not should_exclude(path) and any(str(path).endswith(ext) for ext in extensions):
            files.append(path)
        return files

    pattern = "**/*" if recursive else "*"
    for file_path in path.glob(pattern):
        if file_path.is_file() and not should_exclude(file_path):
            if any(str(file_path).endswith(ext) for ext in extensions):
                files.append(file_path)

    return files


__all__ = ["collect_ingest_files"]
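The `should_exclude` helper matches each pattern two ways: as a substring-style glob against the full path (so directory names like `__pycache__` anywhere in the path match) and as an exact glob against the file name (so `*.pyc` matches). A self-contained sketch of that check with a trimmed pattern list:

```python
import fnmatch
from pathlib import Path

EXCLUDE = ["*.pyc", "__pycache__", "node_modules"]


def should_exclude(file_path: Path) -> bool:
    file_str = str(file_path)
    for pattern in EXCLUDE:
        # Match the pattern anywhere in the path, or exactly against the name
        if fnmatch.fnmatch(file_str, f"*{pattern}*") or fnmatch.fnmatch(file_path.name, pattern):
            return True
    return False


print(should_exclude(Path("src/__pycache__/mod.pyc")))  # True
print(should_exclude(Path("src/app.py")))  # False
```

The substring wrapping (`f"*{pattern}*"`) is deliberately loose: it rejects a file if any ancestor directory matches, at the cost of also matching accidental substrings (e.g. a path containing `mydist/` would hit the `dist` pattern in the full list).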
ai/src/fuzzforge_ai/memory_service.py (Normal file, 247 lines)
@@ -0,0 +1,247 @@
"""
FuzzForge Memory Service
Implements ADK MemoryService pattern for conversational memory
Separate from Cognee which will be used for RAG/codebase analysis
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.


import os
import json
from typing import Dict, List, Any, Optional
from datetime import datetime
import logging

# ADK Memory imports
from google.adk.memory import InMemoryMemoryService, BaseMemoryService
from google.adk.memory.base_memory_service import SearchMemoryResponse
from google.adk.memory.memory_entry import MemoryEntry

# Optional VertexAI Memory Bank
try:
    from google.adk.memory import VertexAiMemoryBankService
    VERTEX_AVAILABLE = True
except ImportError:
    VERTEX_AVAILABLE = False

logger = logging.getLogger(__name__)


class FuzzForgeMemoryService:
    """
    Manages conversational memory using ADK patterns
    This is separate from Cognee which will handle RAG/codebase
    """

    def __init__(self, memory_type: str = "inmemory", **kwargs):
        """
        Initialize memory service

        Args:
            memory_type: "inmemory" or "vertexai"
            **kwargs: Additional args for specific memory service
                For vertexai: project, location, agent_engine_id
        """
        self.memory_type = memory_type
        self.service = self._create_service(memory_type, **kwargs)

    def _create_service(self, memory_type: str, **kwargs) -> BaseMemoryService:
        """Create the appropriate memory service"""

        if memory_type == "inmemory":
            # Use ADK's InMemoryMemoryService for local development
            logger.info("Using InMemory MemoryService for conversational memory")
            return InMemoryMemoryService()

        elif memory_type == "vertexai" and VERTEX_AVAILABLE:
            # Use VertexAI Memory Bank for production
            project = kwargs.get('project') or os.getenv('GOOGLE_CLOUD_PROJECT')
            location = kwargs.get('location') or os.getenv('GOOGLE_CLOUD_LOCATION', 'us-central1')
            agent_engine_id = kwargs.get('agent_engine_id') or os.getenv('AGENT_ENGINE_ID')

            if not all([project, location, agent_engine_id]):
                logger.warning("VertexAI config missing, falling back to InMemory")
                return InMemoryMemoryService()

            logger.info(f"Using VertexAI MemoryBank: {agent_engine_id}")
            return VertexAiMemoryBankService(
                project=project,
                location=location,
                agent_engine_id=agent_engine_id
            )
        else:
            # Default to in-memory
            logger.info("Defaulting to InMemory MemoryService")
            return InMemoryMemoryService()

    async def add_session_to_memory(self, session: Any) -> None:
        """
        Add a completed session to long-term memory
        This extracts meaningful information from the conversation

        Args:
            session: The session object to process
        """
        try:
            # Let the underlying service handle the ingestion
            # It will extract relevant information based on the implementation
            await self.service.add_session_to_memory(session)

            logger.debug(f"Added session {session.id} to {self.memory_type} memory")

        except Exception as e:
            logger.error(f"Failed to add session to memory: {e}")

    async def search_memory(self,
                            query: str,
                            app_name: str = "fuzzforge",
                            user_id: Optional[str] = None,
                            max_results: int = 10) -> SearchMemoryResponse:
        """
        Search long-term memory for relevant information

        Args:
            query: The search query
            app_name: Application name for filtering
            user_id: User ID for filtering (optional)
            max_results: Maximum number of results

        Returns:
            SearchMemoryResponse with relevant memories
        """
        try:
            # Search the memory service
            results = await self.service.search_memory(
                app_name=app_name,
                user_id=user_id,
                query=query
            )

            logger.debug(f"Memory search for '{query}' returned {len(results.memories)} results")
            return results

        except Exception as e:
            logger.error(f"Memory search failed: {e}")
            # Return empty results on error
            return SearchMemoryResponse(memories=[])

    async def ingest_completed_sessions(self, session_service) -> int:
        """
        Batch ingest all completed sessions into memory
        Useful for initial memory population

        Args:
            session_service: The session service containing sessions

        Returns:
            Number of sessions ingested
        """
        ingested = 0

        try:
            # Get all sessions from the session service
            sessions = await session_service.list_sessions(app_name="fuzzforge")

            for session_info in sessions:
                # Load full session
                session = await session_service.load_session(
                    app_name="fuzzforge",
                    user_id=session_info.get('user_id'),
                    session_id=session_info.get('id')
                )

                if session and len(session.get_events()) > 0:
                    await self.add_session_to_memory(session)
                    ingested += 1

            logger.info(f"Ingested {ingested} sessions into {self.memory_type} memory")

        except Exception as e:
            logger.error(f"Failed to batch ingest sessions: {e}")

        return ingested

    def get_status(self) -> Dict[str, Any]:
        """Get memory service status"""
        return {
            "type": self.memory_type,
            "active": self.service is not None,
            "vertex_available": VERTEX_AVAILABLE,
            "details": {
                "inmemory": "Non-persistent, keyword search",
                "vertexai": "Persistent, semantic search with LLM extraction"
            }.get(self.memory_type, "Unknown")
        }


class HybridMemoryManager:
    """
    Manages both ADK MemoryService (conversational) and Cognee (RAG/codebase)
    Provides unified interface for both memory systems
    """

    def __init__(self,
                 memory_service: Optional[FuzzForgeMemoryService] = None,
                 cognee_tools=None):
        """
        Initialize with both memory systems

        Args:
            memory_service: ADK-pattern memory for conversations
            cognee_tools: Cognee MCP tools for RAG/codebase
        """
        # ADK memory for conversations
        self.memory_service = memory_service or FuzzForgeMemoryService()

        # Cognee for knowledge graphs and RAG (future)
        self.cognee_tools = cognee_tools

    async def search_conversational_memory(self, query: str) -> SearchMemoryResponse:
        """Search past conversations using ADK memory"""
        return await self.memory_service.search_memory(query)

    async def search_knowledge_graph(self, query: str, search_type: str = "GRAPH_COMPLETION"):
        """Search Cognee knowledge graph (for RAG/codebase in future)"""
        if not self.cognee_tools:
            return None

        try:
            # Use Cognee's graph search
            return await self.cognee_tools.search(
                query=query,
                search_type=search_type
            )
        except Exception as e:
            logger.debug(f"Cognee search failed: {e}")
            return None

    async def store_in_graph(self, content: str):
        """Store in Cognee knowledge graph (for codebase analysis later)"""
        if not self.cognee_tools:
            return None

        try:
            # Use cognify to create graph structures
            return await self.cognee_tools.cognify(content)
        except Exception as e:
            logger.debug(f"Cognee store failed: {e}")
            return None

    def get_status(self) -> Dict[str, Any]:
        """Get status of both memory systems"""
        return {
            "conversational_memory": self.memory_service.get_status(),
            "knowledge_graph": {
                "active": self.cognee_tools is not None,
                "purpose": "RAG/codebase analysis (future)"
            }
        }
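`_create_service` only commits to Vertex AI when all three settings (project, location, agent engine id) resolve; anything missing falls back to the in-memory service. The selection rule, reduced to a pure function over an explicit `env` dict whose keys mirror the environment variables read above:

```python
from typing import Dict


def choose_memory_backend(memory_type: str, env: Dict[str, str]) -> str:
    """Return which backend _create_service-style logic would pick."""
    if memory_type == "vertexai":
        required = ("GOOGLE_CLOUD_PROJECT", "GOOGLE_CLOUD_LOCATION", "AGENT_ENGINE_ID")
        if all(env.get(key) for key in required):
            return "vertexai"
        return "inmemory"  # incomplete config falls back silently
    return "inmemory"  # default for local development


print(choose_memory_backend("vertexai", {"GOOGLE_CLOUD_PROJECT": "p"}))  # inmemory
```

Making the fallback explicit like this is easy to unit-test, whereas the real method mixes the decision with service construction and `os.getenv` lookups.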
ai/src/fuzzforge_ai/remote_agent.py (Normal file, 148 lines)
@@ -0,0 +1,148 @@
"""
Remote Agent Connection Handler
Handles A2A protocol communication with remote agents
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.


import httpx
import uuid
from typing import Dict, Any, Optional, List


class RemoteAgentConnection:
    """Handles A2A protocol communication with remote agents"""

    def __init__(self, url: str):
        """Initialize connection to a remote agent"""
        self.url = url.rstrip('/')
        self.agent_card = None
        self.client = httpx.AsyncClient(timeout=120.0)
        self.context_id = None

    async def get_agent_card(self) -> Optional[Dict[str, Any]]:
        """Get the agent card from the remote agent"""
        try:
            # Try new path first (A2A 0.3.0+)
            response = await self.client.get(f"{self.url}/.well-known/agent-card.json")
            response.raise_for_status()
            self.agent_card = response.json()
            return self.agent_card
        except Exception:
            # Try old path for compatibility
            try:
                response = await self.client.get(f"{self.url}/.well-known/agent.json")
                response.raise_for_status()
                self.agent_card = response.json()
                return self.agent_card
            except Exception as e:
                print(f"Failed to get agent card from {self.url}: {e}")
                return None

    async def send_message(self, message: str | Dict[str, Any] | List[Dict[str, Any]]) -> str:
        """Send a message to the remote agent using A2A protocol"""
        try:
            parts: List[Dict[str, Any]]
            metadata: Dict[str, Any] | None = None
            if isinstance(message, dict):
                metadata = message.get("metadata") if isinstance(message.get("metadata"), dict) else None
                raw_parts = message.get("parts", [])
                if not raw_parts:
                    text_value = message.get("text") or message.get("message")
                    if isinstance(text_value, str):
                        raw_parts = [{"type": "text", "text": text_value}]
                parts = [raw_part for raw_part in raw_parts if isinstance(raw_part, dict)]
            elif isinstance(message, list):
                parts = [part for part in message if isinstance(part, dict)]
                metadata = None
            else:
                parts = [{"type": "text", "text": message}]
                metadata = None

            if not parts:
                parts = [{"type": "text", "text": ""}]

            # Build JSON-RPC request per A2A spec
            payload = {
                "jsonrpc": "2.0",
                "method": "message/send",
                "params": {
                    "message": {
                        "messageId": str(uuid.uuid4()),
                        "role": "user",
                        "parts": parts,
                    }
                },
                "id": 1
            }

            if metadata:
                payload["params"]["message"]["metadata"] = metadata

            # Include context if we have one
            if self.context_id:
                payload["params"]["contextId"] = self.context_id

            # Send to root endpoint per A2A protocol
            response = await self.client.post(f"{self.url}/", json=payload)
            response.raise_for_status()
            result = response.json()

            # Extract response based on A2A JSON-RPC format
            if isinstance(result, dict):
                # Update context for continuity
                if "result" in result and isinstance(result["result"], dict):
                    if "contextId" in result["result"]:
                        self.context_id = result["result"]["contextId"]

                    # Extract text from artifacts
                    if "artifacts" in result["result"]:
                        texts = []
                        for artifact in result["result"]["artifacts"]:
                            if isinstance(artifact, dict) and "parts" in artifact:
                                for part in artifact["parts"]:
                                    if isinstance(part, dict) and "text" in part:
                                        texts.append(part["text"])
                        if texts:
                            return " ".join(texts)

                    # Extract from message format
                    if "message" in result["result"]:
                        msg = result["result"]["message"]
                        if isinstance(msg, dict) and "parts" in msg:
                            texts = []
                            for part in msg["parts"]:
                                if isinstance(part, dict) and "text" in part:
                                    texts.append(part["text"])
                            return " ".join(texts) if texts else str(msg)
                        return str(msg)

                    return str(result["result"])

                # Handle error response
                elif "error" in result:
                    error = result["error"]
                    if isinstance(error, dict):
                        return f"Error: {error.get('message', str(error))}"
                    return f"Error: {error}"

                # Fallback
                return result.get("response", result.get("message", str(result)))

            return str(result)

        except Exception as e:
            return f"Error communicating with agent: {e}"

    async def close(self):
        """Close the connection properly"""
        await self.client.aclose()
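`send_message` wraps every outgoing message in a JSON-RPC 2.0 envelope with method `message/send`, a fresh `messageId`, and an optional `contextId` for conversation continuity. A minimal builder mirroring that payload shape:

```python
import uuid
from typing import Any, Dict, Optional


def build_a2a_payload(text: str, context_id: Optional[str] = None) -> Dict[str, Any]:
    payload: Dict[str, Any] = {
        "jsonrpc": "2.0",
        "method": "message/send",
        "params": {
            "message": {
                "messageId": str(uuid.uuid4()),  # fresh id per message
                "role": "user",
                "parts": [{"type": "text", "text": text}],
            }
        },
        "id": 1,
    }
    if context_id:
        payload["params"]["contextId"] = context_id  # continue an existing conversation
    return payload


payload = build_a2a_payload("hello", context_id="ctx-123")
print(payload["params"]["contextId"])  # ctx-123
```

The connection class stores the `contextId` returned by the remote agent and replays it on the next call, which is what keeps a multi-turn exchange attached to one conversation on the agent side.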
backend/Dockerfile (Normal file, 41 lines)
@@ -0,0 +1,41 @@
FROM python:3.11-slim

WORKDIR /app

# Install system dependencies including the Docker client and rsync
RUN apt-get update && apt-get install -y \
    curl \
    ca-certificates \
    gnupg \
    lsb-release \
    rsync \
    && curl -fsSL https://download.docker.com/linux/debian/gpg | gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg \
    && echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/debian $(lsb_release -cs) stable" | tee /etc/apt/sources.list.d/docker.list > /dev/null \
    && apt-get update \
    && apt-get install -y docker-ce-cli \
    && rm -rf /var/lib/apt/lists/*

# Docker client configuration removed - localhost:5001 doesn't require insecure registry config

# Install uv for faster package management
RUN pip install uv

# Copy project files
COPY pyproject.toml ./
COPY uv.lock ./

# Install dependencies
RUN uv sync --no-dev

# Copy source code
COPY . .

# Expose port
EXPOSE 8000

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:8000/health || exit 1

# Start the application
CMD ["uv", "run", "uvicorn", "src.main:app", "--host", "0.0.0.0", "--port", "8000"]
257
backend/README.md
Normal file
257
backend/README.md
Normal file
@@ -0,0 +1,257 @@
|
||||
# FuzzForge Backend
|
||||
|
||||
A stateless API server for security testing workflow orchestration using Prefect. This system dynamically discovers workflows, executes them in isolated Docker containers with volume mounting, and returns findings in SARIF format.
|
||||
|
||||
## Architecture Overview
|
||||
|
||||
### Core Components
|
||||
|
||||
1. **Workflow Discovery System**: Automatically discovers workflows at startup
|
||||
2. **Module System**: Reusable components (scanner, analyzer, reporter) with a common interface
|
||||
3. **Prefect Integration**: Handles container orchestration, workflow execution, and monitoring
|
||||
4. **Volume Mounting**: Secure file access with configurable permissions (ro/rw)
|
||||
5. **SARIF Output**: Standardized security findings format
|
||||
|
||||
### Key Features
|
||||
|
||||
- **Stateless**: No persistent data, fully scalable
|
||||
- **Generic**: No hardcoded workflows, automatic discovery
|
||||
- **Isolated**: Each workflow runs in its own Docker container
|
||||
- **Extensible**: Easy to add new workflows and modules
|
||||
- **Secure**: Read-only volume mounts by default, path validation
|
||||
- **Observable**: Comprehensive logging and status tracking
|
||||
|
||||
## Quick Start
|
||||
|
||||
### Prerequisites
|
||||
|
||||
- Docker and Docker Compose
|
||||
|
||||
### Installation
|
||||
|
||||
From the project root, start all services:
|
||||
|
||||
```bash
|
||||
docker-compose up -d
|
||||
```
|
||||
|
||||
This will start:
|
||||
- Prefect server (API at http://localhost:4200/api)
|
||||
- PostgreSQL database
|
||||
- Redis cache
|
||||
- Docker registry (port 5001)
|
||||
- Prefect worker (for running workflows)
|
||||
- FuzzForge backend API (port 8000)
|
||||
- FuzzForge MCP server (port 8010)
|
||||
|
||||
**Note**: The Prefect UI at http://localhost:4200 is not currently accessible from the host due to the API being configured for inter-container communication. Use the REST API or MCP interface instead.
|
||||
|
||||
## API Endpoints
|
||||
|
||||
### Workflows
|
||||
|
||||
- `GET /workflows` - List all discovered workflows
|
||||
- `GET /workflows/{name}/metadata` - Get workflow metadata and parameters
|
||||
- `GET /workflows/{name}/parameters` - Get workflow parameter schema
|
||||
- `GET /workflows/metadata/schema` - Get metadata.yaml schema
|
||||
- `POST /workflows/{name}/submit` - Submit a workflow for execution
|
||||
|
||||
### Runs
|
||||
|
||||
- `GET /runs/{run_id}/status` - Get run status
|
||||
- `GET /runs/{run_id}/findings` - Get SARIF findings from completed run
|
||||
- `GET /runs/{workflow_name}/findings/{run_id}` - Alternative findings endpoint with workflow name
|
||||
|
||||
## Workflow Structure
|
||||
|
||||
Each workflow must have:
|
||||
|
||||
```
|
||||
toolbox/workflows/{workflow_name}/
|
||||
workflow.py # Prefect flow definition
|
||||
metadata.yaml # Mandatory metadata (parameters, version, etc.)
|
||||
Dockerfile # Optional custom container definition
|
||||
requirements.txt # Optional Python dependencies
|
||||
```
|
||||
|
||||
### Example metadata.yaml
|
||||
|
||||
```yaml
|
||||
name: security_assessment
|
||||
version: "1.0.0"
|
||||
description: "Comprehensive security analysis workflow"
|
||||
author: "FuzzForge Team"
|
||||
category: "comprehensive"
|
||||
tags:
|
||||
- "security"
|
||||
- "analysis"
|
||||
- "comprehensive"
|
||||
|
||||
supported_volume_modes:
|
||||
- "ro"
|
||||
- "rw"
|
||||
|
||||
requirements:
|
||||
tools:
|
||||
- "file_scanner"
|
||||
- "security_analyzer"
|
||||
- "sarif_reporter"
|
||||
resources:
|
||||
memory: "512Mi"
|
||||
cpu: "500m"
|
||||
timeout: 1800
|
||||
|
||||
has_docker: true
|
||||
|
||||
parameters:
|
||||
type: object
|
||||
properties:
|
||||
target_path:
|
||||
type: string
|
||||
default: "/workspace"
|
||||
description: "Path to analyze"
|
||||
volume_mode:
|
||||
type: string
|
||||
enum: ["ro", "rw"]
|
||||
default: "ro"
|
||||
description: "Volume mount mode"
|
||||
scanner_config:
|
||||
type: object
|
||||
description: "Scanner configuration"
|
||||
properties:
|
||||
max_file_size:
|
||||
type: integer
|
||||
description: "Maximum file size to scan (bytes)"
|
||||
|
||||
output_schema:
|
||||
type: object
|
||||
properties:
|
||||
sarif:
|
||||
type: object
|
||||
description: "SARIF-formatted security findings"
|
||||
summary:
|
||||
type: object
|
||||
description: "Scan execution summary"
|
||||
```
|
||||
|
||||
### Metadata Field Descriptions

- **name**: Workflow identifier (must match directory name)
- **version**: Semantic version (x.y.z format)
- **description**: Human-readable description of the workflow
- **author**: Workflow author/maintainer
- **category**: Workflow category (comprehensive, specialized, fuzzing, focused)
- **tags**: Array of descriptive tags for categorization
- **requirements.tools**: Required security tools that the workflow uses
- **requirements.resources**: Resource requirements enforced at runtime:
  - `memory`: Memory limit (e.g., "512Mi", "1Gi")
  - `cpu`: CPU limit (e.g., "500m" for 0.5 cores, "1" for 1 core)
  - `timeout`: Maximum execution time in seconds
- **parameters**: JSON Schema object defining workflow parameters
- **output_schema**: Expected output format (typically SARIF)

### Resource Requirements

Resource requirements defined in workflow metadata are automatically enforced. Users can override defaults when submitting workflows:

```bash
curl -X POST "http://localhost:8000/workflows/security_assessment/submit" \
  -H "Content-Type: application/json" \
  -d '{
    "target_path": "/tmp/project",
    "volume_mode": "ro",
    "resource_limits": {
      "memory_limit": "1Gi",
      "cpu_limit": "1"
    }
  }'
```

Resource precedence: User limits > Workflow requirements > System defaults
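The precedence rule can be sketched as a three-way merge, where each layer overrides the one below it (illustrative only; `resolve_limits` is not a function from the backend):

```python
def resolve_limits(user: dict, workflow: dict, defaults: dict) -> dict:
    """Merge resource limits: user limits > workflow requirements > system defaults."""
    merged = dict(defaults)   # lowest precedence: system defaults
    merged.update(workflow)   # workflow metadata overrides defaults
    merged.update(user)       # user-supplied limits win
    return merged

# Example: the user only overrides memory; cpu comes from the workflow,
# timeout falls through to the system default.
limits = resolve_limits(
    user={"memory": "1Gi"},
    workflow={"memory": "512Mi", "cpu": "500m"},
    defaults={"memory": "256Mi", "cpu": "250m", "timeout": 1800},
)
```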

## Module Development

Modules implement the `BaseModule` interface:

```python
from pathlib import Path
from typing import Dict

from src.toolbox.modules.base import BaseModule, ModuleMetadata, ModuleResult

class MyModule(BaseModule):
    def get_metadata(self) -> ModuleMetadata:
        return ModuleMetadata(
            name="my_module",
            version="1.0.0",
            description="Module description",
            category="scanner",
            ...
        )

    async def execute(self, config: Dict, workspace: Path) -> ModuleResult:
        # Module logic here
        findings = [...]
        return self.create_result(findings=findings)

    def validate_config(self, config: Dict) -> bool:
        # Validate configuration
        return True
```

## Submitting a Workflow

```bash
curl -X POST "http://localhost:8000/workflows/security_assessment/submit" \
  -H "Content-Type: application/json" \
  -d '{
    "target_path": "/home/user/project",
    "volume_mode": "ro",
    "parameters": {
      "scanner_config": {"patterns": ["*.py"]},
      "analyzer_config": {"check_secrets": true}
    }
  }'
```

## Getting Findings

```bash
curl "http://localhost:8000/runs/{run_id}/findings"
```

Returns SARIF-formatted findings:

```json
{
  "workflow": "security_assessment",
  "run_id": "abc-123",
  "sarif": {
    "version": "2.1.0",
    "runs": [{
      "tool": {...},
      "results": [...]
    }]
  }
}
```
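A typical client polls the status endpoint until the run completes, then fetches the findings. A minimal client-side sketch (not code from this repository; `get_json` is a hypothetical helper wrapping an HTTP GET that returns decoded JSON):

```python
import time

def wait_for_findings(get_json, run_id: str, poll_interval: float = 2.0, max_polls: int = 60):
    """Poll /runs/{run_id}/status until completion, then fetch the SARIF findings.

    get_json(path) is any callable that performs a GET against the API
    and returns the decoded JSON body (e.g. a thin wrapper around requests.get).
    """
    for _ in range(max_polls):
        status = get_json(f"/runs/{run_id}/status")
        if status["is_completed"]:
            return get_json(f"/runs/{run_id}/findings")
        if status["is_failed"]:
            raise RuntimeError(f"Run {run_id} failed: {status['status']}")
        time.sleep(poll_interval)  # run still pending or executing
    raise TimeoutError(f"Run {run_id} did not complete in time")
```

Injecting the fetcher keeps the helper independent of any particular HTTP library.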

## Security Considerations

1. **Volume Mounting**: Only allowed directories can be mounted
2. **Read-Only Default**: Volumes mounted as read-only unless explicitly set
3. **Container Isolation**: Each workflow runs in an isolated container
4. **Resource Limits**: Can set CPU/memory limits via Prefect
5. **Network Isolation**: Containers use bridge networking

## Development

### Adding a New Workflow

1. Create directory: `toolbox/workflows/my_workflow/`
2. Add `workflow.py` with a Prefect flow
3. Add mandatory `metadata.yaml`
4. Restart backend: `docker-compose restart fuzzforge-backend`

### Adding a New Module

1. Create module in `toolbox/modules/{category}/`
2. Implement `BaseModule` interface
3. Use in workflows via import
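To make steps 2 and 3 concrete, here is a self-contained toy sketch; the `ModuleResult`/`BaseModule` stand-ins below are simplified stubs for illustration, not the real classes from `src.toolbox.modules.base`:

```python
import asyncio
from pathlib import Path
from typing import Dict, List

# Minimal stand-ins so the sketch runs on its own; the real base
# classes in src.toolbox.modules.base have a richer interface.
class ModuleResult:
    def __init__(self, findings: List[Dict]):
        self.findings = findings

class BaseModule:
    def create_result(self, findings: List[Dict]) -> ModuleResult:
        return ModuleResult(findings)

class TodoScanner(BaseModule):
    """Toy module: flag Python files in the workspace containing a TODO marker."""
    async def execute(self, config: Dict, workspace: Path) -> ModuleResult:
        findings = [
            {"file": str(p), "rule": "todo-marker"}
            for p in sorted(workspace.rglob("*.py"))
            if "TODO" in p.read_text(errors="ignore")
        ]
        return self.create_result(findings=findings)

# A workflow task would then do the equivalent of:
#   result = await TodoScanner().execute(config={}, workspace=Path("/workspace"))
```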

122
backend/mcp-config.json
Normal file
@@ -0,0 +1,122 @@
{
  "name": "FuzzForge Security Testing Platform",
  "description": "MCP server for FuzzForge security testing workflows via Docker Compose",
  "version": "0.6.0",
  "connection": {
    "type": "http",
    "host": "localhost",
    "port": 8010,
    "base_url": "http://localhost:8010",
    "mcp_endpoint": "/mcp"
  },
  "docker_compose": {
    "service": "fuzzforge-backend",
    "command": "docker compose up -d",
    "health_check": "http://localhost:8000/health"
  },
  "capabilities": {
    "tools": [
      {
        "name": "submit_security_scan_mcp",
        "description": "Submit a security scanning workflow for execution",
        "parameters": {
          "workflow_name": "string",
          "target_path": "string",
          "volume_mode": "string (ro|rw)",
          "parameters": "object"
        }
      },
      {
        "name": "get_comprehensive_scan_summary",
        "description": "Get a comprehensive summary of scan results with analysis",
        "parameters": {
          "run_id": "string"
        }
      }
    ],
    "fastapi_routes": [
      {
        "method": "GET",
        "path": "/",
        "description": "Get API status and loaded workflows count"
      },
      {
        "method": "GET",
        "path": "/workflows/",
        "description": "List all available security testing workflows"
      },
      {
        "method": "POST",
        "path": "/workflows/{workflow_name}/submit",
        "description": "Submit a security scanning workflow for execution"
      },
      {
        "method": "GET",
        "path": "/runs/{run_id}/status",
        "description": "Get the current status of a security scan run"
      },
      {
        "method": "GET",
        "path": "/runs/{run_id}/findings",
        "description": "Get security findings from a completed scan"
      },
      {
        "method": "GET",
        "path": "/fuzzing/{run_id}/stats",
        "description": "Get fuzzing statistics for a run"
      }
    ]
  },
  "examples": {
    "start_infrastructure_scan": {
      "description": "Run infrastructure security scan on a project",
      "steps": [
        "1. Start Docker Compose: docker compose up -d",
        "2. Submit scan via MCP tool: submit_security_scan_mcp",
        "3. Monitor status and get results"
      ],
      "workflow_name": "infrastructure_scan",
      "target_path": "/Users/tduhamel/Documents/FuzzingLabs/fuzzforge_alpha/test_projects/infrastructure_vulnerable",
      "parameters": {
        "checkov_config": {
          "severity": ["HIGH", "MEDIUM", "LOW"]
        },
        "hadolint_config": {
          "severity": ["error", "warning", "info", "style"]
        }
      }
    },
    "static_analysis_scan": {
      "description": "Run static analysis security scan",
      "workflow_name": "static_analysis_scan",
      "target_path": "/Users/tduhamel/Documents/FuzzingLabs/fuzzforge_alpha/test_projects/static_analysis_vulnerable",
      "parameters": {
        "bandit_config": {
          "severity": ["HIGH", "MEDIUM", "LOW"]
        },
        "opengrep_config": {
          "severity": ["HIGH", "MEDIUM", "LOW"]
        }
      }
    },
    "secret_detection_scan": {
      "description": "Run secret detection scan",
      "workflow_name": "secret_detection_scan",
      "target_path": "/Users/tduhamel/Documents/FuzzingLabs/fuzzforge_alpha/test_projects/secret_detection_vulnerable",
      "parameters": {
        "trufflehog_config": {
          "verified_only": false
        },
        "gitleaks_config": {
          "no_git": true
        }
      }
    }
  },
  "usage": {
    "via_mcp": "Connect MCP client to http://localhost:8010/mcp after starting Docker Compose",
    "via_api": "Use FastAPI endpoints directly at http://localhost:8000",
    "start_system": "docker compose up -d",
    "stop_system": "docker compose down"
  }
}
25
backend/pyproject.toml
Normal file
@@ -0,0 +1,25 @@
[project]
name = "backend"
version = "0.6.0"
description = "FuzzForge OSS backend"
authors = []
readme = "README.md"
requires-python = ">=3.11"
dependencies = [
    "fastapi>=0.116.1",
    "prefect>=3.4.18",
    "pydantic>=2.0.0",
    "pyyaml>=6.0",
    "docker>=7.0.0",
    "aiofiles>=23.0.0",
    "uvicorn>=0.30.0",
    "aiohttp>=3.12.15",
    "fastmcp",
]

[project.optional-dependencies]
dev = [
    "pytest>=8.0.0",
    "pytest-asyncio>=0.23.0",
    "httpx>=0.27.0",
]
11
backend/src/__init__.py
Normal file
@@ -0,0 +1,11 @@
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
11
backend/src/api/__init__.py
Normal file
@@ -0,0 +1,11 @@
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
325
backend/src/api/fuzzing.py
Normal file
@@ -0,0 +1,325 @@
"""
API endpoints for fuzzing workflow management and real-time monitoring
"""

# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.

import logging
from typing import List, Dict, Any
from fastapi import APIRouter, HTTPException, Depends, WebSocket, WebSocketDisconnect
from fastapi.responses import StreamingResponse
import asyncio
import json
from datetime import datetime

from src.models.findings import (
    FuzzingStats,
    CrashReport
)
from src.core.workflow_discovery import WorkflowDiscovery

logger = logging.getLogger(__name__)

router = APIRouter(prefix="/fuzzing", tags=["fuzzing"])

# In-memory storage for real-time stats (in production, use Redis or similar)
fuzzing_stats: Dict[str, FuzzingStats] = {}
crash_reports: Dict[str, List[CrashReport]] = {}
active_connections: Dict[str, List[WebSocket]] = {}


def initialize_fuzzing_tracking(run_id: str, workflow_name: str):
    """
    Initialize fuzzing tracking for a new run.

    This function should be called when a workflow is submitted to enable
    real-time monitoring and stats collection.

    Args:
        run_id: The run identifier
        workflow_name: Name of the workflow
    """
    fuzzing_stats[run_id] = FuzzingStats(
        run_id=run_id,
        workflow=workflow_name
    )
    crash_reports[run_id] = []
    active_connections[run_id] = []


@router.get("/{run_id}/stats", response_model=FuzzingStats)
async def get_fuzzing_stats(run_id: str) -> FuzzingStats:
    """
    Get current fuzzing statistics for a run.

    Args:
        run_id: The fuzzing run ID

    Returns:
        Current fuzzing statistics

    Raises:
        HTTPException: 404 if run not found
    """
    if run_id not in fuzzing_stats:
        raise HTTPException(
            status_code=404,
            detail=f"Fuzzing run not found: {run_id}"
        )

    return fuzzing_stats[run_id]


@router.get("/{run_id}/crashes", response_model=List[CrashReport])
async def get_crash_reports(run_id: str) -> List[CrashReport]:
    """
    Get crash reports for a fuzzing run.

    Args:
        run_id: The fuzzing run ID

    Returns:
        List of crash reports

    Raises:
        HTTPException: 404 if run not found
    """
    if run_id not in crash_reports:
        raise HTTPException(
            status_code=404,
            detail=f"Fuzzing run not found: {run_id}"
        )

    return crash_reports[run_id]


@router.post("/{run_id}/stats")
async def update_fuzzing_stats(run_id: str, stats: FuzzingStats):
    """
    Update fuzzing statistics (called by fuzzing workflows).

    Args:
        run_id: The fuzzing run ID
        stats: Updated statistics

    Raises:
        HTTPException: 404 if run not found
    """
    if run_id not in fuzzing_stats:
        raise HTTPException(
            status_code=404,
            detail=f"Fuzzing run not found: {run_id}"
        )

    # Update stats
    fuzzing_stats[run_id] = stats

    # Debug: log reception for live instrumentation
    try:
        logger.info(
            "Received fuzzing stats update: run_id=%s exec=%s eps=%.2f crashes=%s corpus=%s elapsed=%ss",
            run_id,
            stats.executions,
            stats.executions_per_sec,
            stats.crashes,
            stats.corpus_size,
            stats.elapsed_time,
        )
    except Exception:
        pass

    # Notify connected WebSocket clients
    if run_id in active_connections:
        message = {
            "type": "stats_update",
            "data": stats.model_dump()
        }
        for websocket in active_connections[run_id][:]:  # Copy to avoid modification during iteration
            try:
                await websocket.send_text(json.dumps(message))
            except Exception:
                # Remove disconnected clients
                active_connections[run_id].remove(websocket)


@router.post("/{run_id}/crash")
async def report_crash(run_id: str, crash: CrashReport):
    """
    Report a new crash (called by fuzzing workflows).

    Args:
        run_id: The fuzzing run ID
        crash: Crash report details
    """
    if run_id not in crash_reports:
        crash_reports[run_id] = []

    # Add crash report
    crash_reports[run_id].append(crash)

    # Update stats
    if run_id in fuzzing_stats:
        fuzzing_stats[run_id].crashes += 1
        fuzzing_stats[run_id].last_crash_time = crash.timestamp

    # Notify connected WebSocket clients
    if run_id in active_connections:
        message = {
            "type": "crash_report",
            "data": crash.model_dump()
        }
        for websocket in active_connections[run_id][:]:
            try:
                await websocket.send_text(json.dumps(message))
            except Exception:
                active_connections[run_id].remove(websocket)


@router.websocket("/{run_id}/live")
async def websocket_endpoint(websocket: WebSocket, run_id: str):
    """
    WebSocket endpoint for real-time fuzzing updates.

    Args:
        websocket: WebSocket connection
        run_id: The fuzzing run ID to monitor
    """
    await websocket.accept()

    # Initialize connection tracking
    if run_id not in active_connections:
        active_connections[run_id] = []
    active_connections[run_id].append(websocket)

    try:
        # Send current stats on connection
        if run_id in fuzzing_stats:
            current = fuzzing_stats[run_id]
            if isinstance(current, dict):
                payload = current
            elif hasattr(current, "model_dump"):
                payload = current.model_dump()
            elif hasattr(current, "dict"):
                payload = current.dict()
            else:
                payload = getattr(current, "__dict__", {"run_id": run_id})
            message = {"type": "stats_update", "data": payload}
            await websocket.send_text(json.dumps(message))

        # Keep connection alive
        while True:
            try:
                # Wait for ping or handle disconnect
                data = await asyncio.wait_for(websocket.receive_text(), timeout=30.0)
                # Echo back for ping-pong
                if data == "ping":
                    await websocket.send_text("pong")
            except asyncio.TimeoutError:
                # Send periodic heartbeat
                await websocket.send_text(json.dumps({"type": "heartbeat"}))

    except WebSocketDisconnect:
        # Clean up connection
        if run_id in active_connections and websocket in active_connections[run_id]:
            active_connections[run_id].remove(websocket)
    except Exception as e:
        logger.error(f"WebSocket error for run {run_id}: {e}")
        if run_id in active_connections and websocket in active_connections[run_id]:
            active_connections[run_id].remove(websocket)


@router.get("/{run_id}/stream")
async def stream_fuzzing_updates(run_id: str):
    """
    Server-Sent Events endpoint for real-time fuzzing updates.

    Args:
        run_id: The fuzzing run ID to monitor

    Returns:
        Streaming response with real-time updates
    """
    if run_id not in fuzzing_stats:
        raise HTTPException(
            status_code=404,
            detail=f"Fuzzing run not found: {run_id}"
        )

    async def event_stream():
        """Generate server-sent events for fuzzing updates"""
        last_stats_time = datetime.utcnow()

        while True:
            try:
                # Send current stats
                if run_id in fuzzing_stats:
                    current_stats = fuzzing_stats[run_id]
                    if isinstance(current_stats, dict):
                        stats_payload = current_stats
                    elif hasattr(current_stats, "model_dump"):
                        stats_payload = current_stats.model_dump()
                    elif hasattr(current_stats, "dict"):
                        stats_payload = current_stats.dict()
                    else:
                        stats_payload = getattr(current_stats, "__dict__", {"run_id": run_id})
                    event_data = f"data: {json.dumps({'type': 'stats', 'data': stats_payload})}\n\n"
                    yield event_data

                # Send recent crashes
                if run_id in crash_reports:
                    recent_crashes = [
                        crash for crash in crash_reports[run_id]
                        if crash.timestamp > last_stats_time
                    ]
                    for crash in recent_crashes:
                        event_data = f"data: {json.dumps({'type': 'crash', 'data': crash.model_dump()})}\n\n"
                        yield event_data

                last_stats_time = datetime.utcnow()
                await asyncio.sleep(5)  # Update every 5 seconds

            except Exception as e:
                logger.error(f"Error in event stream for run {run_id}: {e}")
                break

    return StreamingResponse(
        event_stream(),
        media_type="text/event-stream",
        headers={
            "Cache-Control": "no-cache",
            "Connection": "keep-alive",
        }
    )


@router.delete("/{run_id}")
async def cleanup_fuzzing_run(run_id: str):
    """
    Clean up fuzzing run data.

    Args:
        run_id: The fuzzing run ID to clean up
    """
    # Clean up tracking data
    fuzzing_stats.pop(run_id, None)
    crash_reports.pop(run_id, None)

    # Close any active WebSocket connections
    if run_id in active_connections:
        for websocket in active_connections[run_id]:
            try:
                await websocket.close()
            except Exception:
                pass
        del active_connections[run_id]

    return {"message": f"Cleaned up fuzzing run {run_id}"}
184
backend/src/api/runs.py
Normal file
@@ -0,0 +1,184 @@
"""
API endpoints for workflow run management and findings retrieval
"""

# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.

import logging
from typing import Dict, Any
from fastapi import APIRouter, HTTPException, Depends

from src.models.findings import WorkflowFindings, WorkflowStatus

logger = logging.getLogger(__name__)

router = APIRouter(prefix="/runs", tags=["runs"])


def get_prefect_manager():
    """Dependency to get the Prefect manager instance"""
    from src.main import prefect_mgr
    return prefect_mgr


@router.get("/{run_id}/status", response_model=WorkflowStatus)
async def get_run_status(
    run_id: str,
    prefect_mgr=Depends(get_prefect_manager)
) -> WorkflowStatus:
    """
    Get the current status of a workflow run.

    Args:
        run_id: The flow run ID

    Returns:
        Status information including state, timestamps, and completion flags

    Raises:
        HTTPException: 404 if run not found
    """
    try:
        status = await prefect_mgr.get_flow_run_status(run_id)

        # Find workflow name from deployment
        workflow_name = "unknown"
        workflow_deployment_id = status.get("workflow", "")
        for name, deployment_id in prefect_mgr.deployments.items():
            if str(deployment_id) == str(workflow_deployment_id):
                workflow_name = name
                break

        return WorkflowStatus(
            run_id=status["run_id"],
            workflow=workflow_name,
            status=status["status"],
            is_completed=status["is_completed"],
            is_failed=status["is_failed"],
            is_running=status["is_running"],
            created_at=status["created_at"],
            updated_at=status["updated_at"]
        )

    except Exception as e:
        logger.error(f"Failed to get status for run {run_id}: {e}")
        raise HTTPException(
            status_code=404,
            detail=f"Run not found: {run_id}"
        )


@router.get("/{run_id}/findings", response_model=WorkflowFindings)
async def get_run_findings(
    run_id: str,
    prefect_mgr=Depends(get_prefect_manager)
) -> WorkflowFindings:
    """
    Get the findings from a completed workflow run.

    Args:
        run_id: The flow run ID

    Returns:
        SARIF-formatted findings from the workflow execution

    Raises:
        HTTPException: 404 if run not found, 400 if run not completed
    """
    try:
        # Get run status first
        status = await prefect_mgr.get_flow_run_status(run_id)

        if not status["is_completed"]:
            if status["is_running"]:
                raise HTTPException(
                    status_code=400,
                    detail=f"Run {run_id} is still running. Current status: {status['status']}"
                )
            elif status["is_failed"]:
                raise HTTPException(
                    status_code=400,
                    detail=f"Run {run_id} failed. Status: {status['status']}"
                )
            else:
                raise HTTPException(
                    status_code=400,
                    detail=f"Run {run_id} not completed. Status: {status['status']}"
                )

        # Get the findings
        findings = await prefect_mgr.get_flow_run_findings(run_id)

        # Find workflow name
        workflow_name = "unknown"
        workflow_deployment_id = status.get("workflow", "")
        for name, deployment_id in prefect_mgr.deployments.items():
            if str(deployment_id) == str(workflow_deployment_id):
                workflow_name = name
                break

        # Get workflow version if available
        metadata = {
            "completion_time": status["updated_at"],
            "workflow_version": "unknown"
        }

        if workflow_name in prefect_mgr.workflows:
            workflow_info = prefect_mgr.workflows[workflow_name]
            metadata["workflow_version"] = workflow_info.metadata.get("version", "unknown")

        return WorkflowFindings(
            workflow=workflow_name,
            run_id=run_id,
            sarif=findings,
            metadata=metadata
        )

    except HTTPException:
        raise
    except Exception as e:
        logger.error(f"Failed to get findings for run {run_id}: {e}")
        raise HTTPException(
            status_code=500,
            detail=f"Failed to retrieve findings: {str(e)}"
        )


@router.get("/{workflow_name}/findings/{run_id}", response_model=WorkflowFindings)
async def get_workflow_findings(
    workflow_name: str,
    run_id: str,
    prefect_mgr=Depends(get_prefect_manager)
) -> WorkflowFindings:
    """
    Get findings for a specific workflow run.

    Alternative endpoint that includes workflow name in the path for clarity.

    Args:
        workflow_name: Name of the workflow
        run_id: The flow run ID

    Returns:
        SARIF-formatted findings from the workflow execution

    Raises:
        HTTPException: 404 if workflow or run not found, 400 if run not completed
    """
    if workflow_name not in prefect_mgr.workflows:
        raise HTTPException(
            status_code=404,
            detail=f"Workflow not found: {workflow_name}"
        )

    # Delegate to the main findings endpoint
    return await get_run_findings(run_id, prefect_mgr)
386
backend/src/api/workflows.py
Normal file
@@ -0,0 +1,386 @@
|
||||
"""
|
||||
API endpoints for workflow management with enhanced error handling
|
||||
"""
|
||||
|
||||
# Copyright (c) 2025 FuzzingLabs
|
||||
#
|
||||
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
|
||||
# at the root of this repository for details.
|
||||
#
|
||||
# After the Change Date (four years from publication), this version of the
|
||||
# Licensed Work will be made available under the Apache License, Version 2.0.
|
||||
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Additional attribution and requirements are provided in the NOTICE file.
|
||||
|
||||
import logging
|
||||
import traceback
|
||||
from typing import List, Dict, Any, Optional
|
||||
from fastapi import APIRouter, HTTPException, Depends
|
||||
from pathlib import Path
|
||||
|
||||
from src.models.findings import (
|
||||
WorkflowSubmission,
|
||||
WorkflowMetadata,
|
||||
WorkflowListItem,
|
||||
RunSubmissionResponse
|
||||
)
|
||||
from src.core.workflow_discovery import WorkflowDiscovery
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
router = APIRouter(prefix="/workflows", tags=["workflows"])
|
||||
|
||||
|
||||
def create_structured_error_response(
|
||||
error_type: str,
|
||||
message: str,
|
||||
workflow_name: Optional[str] = None,
|
||||
run_id: Optional[str] = None,
|
||||
container_info: Optional[Dict[str, Any]] = None,
|
||||
deployment_info: Optional[Dict[str, Any]] = None,
|
||||
suggestions: Optional[List[str]] = None
|
||||
) -> Dict[str, Any]:
|
||||
"""Create a structured error response with rich context."""
|
||||
error_response = {
|
||||
"error": {
|
||||
"type": error_type,
|
||||
"message": message,
|
||||
"timestamp": __import__("datetime").datetime.utcnow().isoformat() + "Z"
|
||||
}
|
||||
}
|
||||
|
||||
if workflow_name:
|
||||
error_response["error"]["workflow_name"] = workflow_name
|
||||
|
||||
if run_id:
|
||||
error_response["error"]["run_id"] = run_id
|
||||
|
||||
if container_info:
|
||||
error_response["error"]["container"] = container_info
|
||||
|
||||
if deployment_info:
|
||||
error_response["error"]["deployment"] = deployment_info
|
||||
|
||||
if suggestions:
|
||||
error_response["error"]["suggestions"] = suggestions
|
||||
|
||||
return error_response
|
||||
|
||||
|
||||
def get_prefect_manager():
|
||||
"""Dependency to get the Prefect manager instance"""
|
||||
from src.main import prefect_mgr
|
||||
return prefect_mgr
|
||||
|
||||
|
||||
@router.get("/", response_model=List[WorkflowListItem])
|
||||
async def list_workflows(
|
||||
prefect_mgr=Depends(get_prefect_manager)
|
||||
) -> List[WorkflowListItem]:
|
||||
"""
|
||||
List all discovered workflows with their metadata.
|
||||
|
||||
Returns a summary of each workflow including name, version, description,
|
||||
author, and tags.
|
||||
"""
|
||||
workflows = []
|
||||
for name, info in prefect_mgr.workflows.items():
|
||||
workflows.append(WorkflowListItem(
|
||||
name=name,
|
||||
version=info.metadata.get("version", "0.6.0"),
|
||||
description=info.metadata.get("description", ""),
|
||||
author=info.metadata.get("author"),
|
||||
tags=info.metadata.get("tags", [])
|
||||
))
|
||||
|
||||
return workflows
|
||||
|
||||
|
||||
@router.get("/metadata/schema")
|
||||
async def get_metadata_schema() -> Dict[str, Any]:
|
||||
"""
|
||||
Get the JSON schema for workflow metadata files.
|
||||
|
||||
    This schema defines the structure and requirements for metadata.yaml files
    that must accompany each workflow.
    """
    return WorkflowDiscovery.get_metadata_schema()


@router.get("/{workflow_name}/metadata", response_model=WorkflowMetadata)
async def get_workflow_metadata(
    workflow_name: str,
    prefect_mgr=Depends(get_prefect_manager)
) -> WorkflowMetadata:
    """
    Get complete metadata for a specific workflow.

    Args:
        workflow_name: Name of the workflow

    Returns:
        Complete metadata including parameters schema, supported volume modes,
        required modules, and more.

    Raises:
        HTTPException: 404 if workflow not found
    """
    if workflow_name not in prefect_mgr.workflows:
        available_workflows = list(prefect_mgr.workflows.keys())
        error_response = create_structured_error_response(
            error_type="WorkflowNotFound",
            message=f"Workflow '{workflow_name}' not found",
            workflow_name=workflow_name,
            suggestions=[
                f"Available workflows: {', '.join(available_workflows)}",
                "Use GET /workflows/ to see all available workflows",
                "Check workflow name spelling and case sensitivity"
            ]
        )
        raise HTTPException(
            status_code=404,
            detail=error_response
        )

    info = prefect_mgr.workflows[workflow_name]
    metadata = info.metadata

    return WorkflowMetadata(
        name=workflow_name,
        version=metadata.get("version", "0.6.0"),
        description=metadata.get("description", ""),
        author=metadata.get("author"),
        tags=metadata.get("tags", []),
        parameters=metadata.get("parameters", {}),
        default_parameters=metadata.get("default_parameters", {}),
        required_modules=metadata.get("required_modules", []),
        supported_volume_modes=metadata.get("supported_volume_modes", ["ro", "rw"]),
        has_custom_docker=info.has_docker
    )
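The 404 path above delegates to `create_structured_error_response`, which is defined elsewhere in the backend. A minimal sketch of the payload shape implied by the call sites (the helper name `build_error_response` and the exact field set are assumptions, not the backend's actual implementation):

```python
from typing import Optional

def build_error_response(
    error_type: str,
    message: str,
    workflow_name: Optional[str] = None,
    container_info: Optional[dict] = None,
    deployment_info: Optional[dict] = None,
    suggestions: Optional[list] = None,
) -> dict:
    """Uniform error payload passed as HTTPException(detail=...)."""
    return {
        "error_type": error_type,
        "message": message,
        "workflow_name": workflow_name,
        "container_info": container_info,
        "deployment_info": deployment_info,
        "suggestions": suggestions or [],
    }

detail = build_error_response(
    error_type="WorkflowNotFound",
    message="Workflow 'nope' not found",
    workflow_name="nope",
    suggestions=["Use GET /workflows/ to see all available workflows"],
)
```

Keeping one builder for every error path is what lets clients branch on `error_type` instead of parsing free-form messages.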


@router.post("/{workflow_name}/submit", response_model=RunSubmissionResponse)
async def submit_workflow(
    workflow_name: str,
    submission: WorkflowSubmission,
    prefect_mgr=Depends(get_prefect_manager)
) -> RunSubmissionResponse:
    """
    Submit a workflow for execution with volume mounting.

    Args:
        workflow_name: Name of the workflow to execute
        submission: Submission parameters including target path and volume mode

    Returns:
        Run submission response with run_id and initial status

    Raises:
        HTTPException: 404 if workflow not found, 400 for invalid parameters
    """
    if workflow_name not in prefect_mgr.workflows:
        available_workflows = list(prefect_mgr.workflows.keys())
        error_response = create_structured_error_response(
            error_type="WorkflowNotFound",
            message=f"Workflow '{workflow_name}' not found",
            workflow_name=workflow_name,
            suggestions=[
                f"Available workflows: {', '.join(available_workflows)}",
                "Use GET /workflows/ to see all available workflows",
                "Check workflow name spelling and case sensitivity"
            ]
        )
        raise HTTPException(
            status_code=404,
            detail=error_response
        )

    try:
        # Convert ResourceLimits to dict if provided
        resource_limits_dict = None
        if submission.resource_limits:
            resource_limits_dict = {
                "cpu_limit": submission.resource_limits.cpu_limit,
                "memory_limit": submission.resource_limits.memory_limit,
                "cpu_request": submission.resource_limits.cpu_request,
                "memory_request": submission.resource_limits.memory_request
            }

        # Submit the workflow with enhanced parameters
        flow_run = await prefect_mgr.submit_workflow(
            workflow_name=workflow_name,
            target_path=submission.target_path,
            volume_mode=submission.volume_mode,
            parameters=submission.parameters,
            resource_limits=resource_limits_dict,
            additional_volumes=submission.additional_volumes,
            timeout=submission.timeout
        )

        run_id = str(flow_run.id)

        # Initialize fuzzing tracking if this looks like a fuzzing workflow
        workflow_info = prefect_mgr.workflows.get(workflow_name, {})
        workflow_tags = workflow_info.metadata.get("tags", []) if hasattr(workflow_info, 'metadata') else []
        if "fuzzing" in workflow_tags or "fuzz" in workflow_name.lower():
            from src.api.fuzzing import initialize_fuzzing_tracking
            initialize_fuzzing_tracking(run_id, workflow_name)

        return RunSubmissionResponse(
            run_id=run_id,
            status=flow_run.state.name if flow_run.state else "PENDING",
            workflow=workflow_name,
            message=f"Workflow '{workflow_name}' submitted successfully"
        )

    except ValueError as e:
        # Parameter validation errors
        error_response = create_structured_error_response(
            error_type="ValidationError",
            message=str(e),
            workflow_name=workflow_name,
            suggestions=[
                "Check parameter types and values",
                "Use GET /workflows/{workflow_name}/parameters for schema",
                "Ensure all required parameters are provided"
            ]
        )
        raise HTTPException(status_code=400, detail=error_response)

    except Exception as e:
        logger.error(f"Failed to submit workflow '{workflow_name}': {e}")
        logger.error(f"Traceback: {traceback.format_exc()}")

        # Try to get more context about the error
        container_info = None
        deployment_info = None
        suggestions = []

        error_message = str(e)
        error_type = "WorkflowSubmissionError"

        # Detect specific error patterns
        if "deployment" in error_message.lower():
            error_type = "DeploymentError"
            deployment_info = {
                "status": "failed",
                "error": error_message
            }
            suggestions.extend([
                "Check if Prefect server is running and accessible",
                "Verify Docker is running and has sufficient resources",
                "Check container image availability",
                "Ensure volume paths exist and are accessible"
            ])

        elif "volume" in error_message.lower() or "mount" in error_message.lower():
            error_type = "VolumeError"
            suggestions.extend([
                "Check if the target path exists and is accessible",
                "Verify file permissions (Docker needs read access)",
                "Ensure the path is not in use by another process",
                "Try using an absolute path instead of a relative path"
            ])

        elif "memory" in error_message.lower() or "resource" in error_message.lower():
            error_type = "ResourceError"
            suggestions.extend([
                "Check system memory and CPU availability",
                "Consider reducing resource limits or dataset size",
                "Monitor Docker resource usage",
                "Increase Docker memory limits if needed"
            ])

        elif "image" in error_message.lower():
            error_type = "ImageError"
            suggestions.extend([
                "Check if the workflow image exists",
                "Verify Docker registry access",
                "Try rebuilding the workflow image",
                "Check network connectivity to registries"
            ])

        else:
            suggestions.extend([
                "Check FuzzForge backend logs for details",
                "Verify all services are running (docker-compose up -d)",
                "Try restarting the workflow deployment",
                "Contact support if the issue persists"
            ])

        error_response = create_structured_error_response(
            error_type=error_type,
            message=f"Failed to submit workflow: {error_message}",
            workflow_name=workflow_name,
            container_info=container_info,
            deployment_info=deployment_info,
            suggestions=suggestions
        )

        raise HTTPException(
            status_code=500,
            detail=error_response
        )
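The catch-all handler above classifies failures by substring matching on the exception message, checked in a fixed order so that, for example, a message mentioning both "deployment" and "volume" is reported as a deployment problem. That dispatch can be sketched in isolation (the function name here is illustrative, not part of the backend):

```python
def classify_submission_error(error_message: str) -> str:
    """Map an exception message to a coarse error type, mirroring the
    ordered substring checks in the submit endpoint above."""
    msg = error_message.lower()
    if "deployment" in msg:
        return "DeploymentError"
    if "volume" in msg or "mount" in msg:
        return "VolumeError"
    if "memory" in msg or "resource" in msg:
        return "ResourceError"
    if "image" in msg:
        return "ImageError"
    return "WorkflowSubmissionError"

assert classify_submission_error("failed to mount volume /tmp/x") == "VolumeError"
assert classify_submission_error("deployment not found") == "DeploymentError"
assert classify_submission_error("image pull backoff") == "ImageError"
```

Substring classification is deliberately coarse; it only steers the `suggestions` list and HTTP status, while the original message is preserved verbatim in the payload.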


@router.get("/{workflow_name}/parameters")
async def get_workflow_parameters(
    workflow_name: str,
    prefect_mgr=Depends(get_prefect_manager)
) -> Dict[str, Any]:
    """
    Get the parameters schema for a workflow.

    Args:
        workflow_name: Name of the workflow

    Returns:
        Parameters schema with types, descriptions, and defaults

    Raises:
        HTTPException: 404 if workflow not found
    """
    if workflow_name not in prefect_mgr.workflows:
        available_workflows = list(prefect_mgr.workflows.keys())
        error_response = create_structured_error_response(
            error_type="WorkflowNotFound",
            message=f"Workflow '{workflow_name}' not found",
            workflow_name=workflow_name,
            suggestions=[
                f"Available workflows: {', '.join(available_workflows)}",
                "Use GET /workflows/ to see all available workflows"
            ]
        )
        raise HTTPException(
            status_code=404,
            detail=error_response
        )

    info = prefect_mgr.workflows[workflow_name]
    metadata = info.metadata

    # Return parameters with enhanced schema information
    parameters_schema = metadata.get("parameters", {})

    # Extract the actual parameter definitions from the JSON schema structure
    if "properties" in parameters_schema:
        param_definitions = parameters_schema["properties"]
    else:
        param_definitions = parameters_schema

    # Add default values to the schema
    default_params = metadata.get("default_parameters", {})
    for param_name, param_schema in param_definitions.items():
        if isinstance(param_schema, dict) and param_name in default_params:
            param_schema["default"] = default_params[param_name]

    return {
        "workflow": workflow_name,
        "parameters": param_definitions,
        "default_parameters": default_params,
        "required_parameters": [
            name for name, schema in param_definitions.items()
            if isinstance(schema, dict) and schema.get("required", False)
        ]
    }
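The default-merging step above can be exercised on its own: given a JSON-schema-style parameters block and a `default_parameters` map, each default is copied into the matching property, and required parameters are collected from per-property `required` flags (the sample data below is illustrative):

```python
parameters_schema = {
    "properties": {
        "iterations": {"type": "integer", "required": True},
        "timeout": {"type": "integer"},
    }
}
default_params = {"timeout": 300}

# Unwrap the JSON-schema "properties" layer if present, as the endpoint does
param_definitions = parameters_schema.get("properties", parameters_schema)

# Copy defaults into the matching property definitions
for name, schema in param_definitions.items():
    if isinstance(schema, dict) and name in default_params:
        schema["default"] = default_params[name]

# Collect parameters flagged required
required = [
    name for name, schema in param_definitions.items()
    if isinstance(schema, dict) and schema.get("required", False)
]
```

After this runs, `param_definitions["timeout"]` carries `"default": 300` while `iterations` is the only entry in `required`. Note the mutation happens in place on the metadata dict, which is why the endpoint returns the same objects it annotated.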
backend/src/core/__init__.py (Normal file, 11 lines)
@@ -0,0 +1,11 @@
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
backend/src/core/prefect_manager.py (Normal file, 770 lines)
@@ -0,0 +1,770 @@
"""
Prefect Manager - Core orchestration for workflow deployment and execution
"""

# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.

import logging
import os
import platform
import re
from pathlib import Path
from typing import Dict, Optional, Any

from prefect import get_client
from prefect.docker import DockerImage
from prefect.client.schemas import FlowRun

from src.core.workflow_discovery import WorkflowDiscovery, WorkflowInfo

logger = logging.getLogger(__name__)


def get_registry_url(context: str = "default") -> str:
    """
    Get the container registry URL to use for a given operation context.

    Goals:
    - Work reliably across Linux and macOS Docker Desktop
    - Prefer in-network service discovery when running inside containers
    - Allow full override via env vars from docker-compose

    Env overrides:
    - FUZZFORGE_REGISTRY_PUSH_URL: used for image builds/pushes
    - FUZZFORGE_REGISTRY_PULL_URL: used for workers to pull images
    """
    # Normalize context
    ctx = (context or "default").lower()

    # Always honor explicit overrides first
    if ctx in ("push", "build"):
        push_url = os.getenv("FUZZFORGE_REGISTRY_PUSH_URL")
        if push_url:
            logger.debug("Using FUZZFORGE_REGISTRY_PUSH_URL: %s", push_url)
            return push_url
        # Default to host-published registry for Docker daemon operations
        return "localhost:5001"

    if ctx == "pull":
        pull_url = os.getenv("FUZZFORGE_REGISTRY_PULL_URL")
        if pull_url:
            logger.debug("Using FUZZFORGE_REGISTRY_PULL_URL: %s", pull_url)
            return pull_url
        # Prefect worker pulls via host Docker daemon as well
        return "localhost:5001"

    # Default/fallback
    return os.getenv("FUZZFORGE_REGISTRY_PULL_URL", os.getenv("FUZZFORGE_REGISTRY_PUSH_URL", "localhost:5001"))
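The resolution order documented above (explicit env override first, then the host-published `localhost:5001` fallback) can be demonstrated with a standalone mirror of the function; `resolve_registry_url` is a renamed sketch, not the module's actual export:

```python
import os

def resolve_registry_url(context: str = "default") -> str:
    """Mirror of get_registry_url above: env overrides win, with
    localhost:5001 as the host-published registry fallback."""
    ctx = (context or "default").lower()
    if ctx in ("push", "build"):
        return os.getenv("FUZZFORGE_REGISTRY_PUSH_URL", "localhost:5001")
    if ctx == "pull":
        return os.getenv("FUZZFORGE_REGISTRY_PULL_URL", "localhost:5001")
    return os.getenv("FUZZFORGE_REGISTRY_PULL_URL",
                     os.getenv("FUZZFORGE_REGISTRY_PUSH_URL", "localhost:5001"))

# With no overrides set, every context falls back to the local registry
os.environ.pop("FUZZFORGE_REGISTRY_PUSH_URL", None)
os.environ.pop("FUZZFORGE_REGISTRY_PULL_URL", None)
assert resolve_registry_url("push") == "localhost:5001"

# A compose-provided override redirects pulls without touching pushes
os.environ["FUZZFORGE_REGISTRY_PULL_URL"] = "registry:5000"
assert resolve_registry_url("pull") == "registry:5000"
assert resolve_registry_url("push") == "localhost:5001"
```

Splitting push and pull URLs matters because the backend builds images via the host Docker daemon while workers may resolve the registry under a different hostname on the compose network.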


def _compose_project_name(default: str = "fuzzforge") -> str:
    """Return the docker-compose project name used for network/volume naming.

    Always returns 'fuzzforge' regardless of environment variables.
    """
    return "fuzzforge"


class PrefectManager:
    """
    Manages Prefect deployments and flow runs for discovered workflows.

    This class handles:
    - Workflow discovery and registration
    - Docker image building through Prefect
    - Deployment creation and management
    - Flow run submission with volume mounting
    - Findings retrieval from completed runs
    """

    def __init__(self, workflows_dir: Path = None):
        """
        Initialize the Prefect manager.

        Args:
            workflows_dir: Path to the workflows directory (default: toolbox/workflows)
        """
        if workflows_dir is None:
            workflows_dir = Path("toolbox/workflows")

        self.discovery = WorkflowDiscovery(workflows_dir)
        self.workflows: Dict[str, WorkflowInfo] = {}
        self.deployments: Dict[str, str] = {}  # workflow_name -> deployment_id

        # Security: Define allowed and forbidden paths for host mounting
        self.allowed_base_paths = [
            "/tmp",
            "/home",
            "/Users",  # macOS users
            "/opt",
            "/var/tmp",
            "/workspace",  # Common container workspace
            "/app"  # Container application directory (for test projects)
        ]

        self.forbidden_paths = [
            "/etc",
            "/root",
            "/var/run",
            "/sys",
            "/proc",
            "/dev",
            "/boot",
            "/var/lib/docker",  # Critical Docker data
            "/var/log",  # System logs
            "/usr/bin",  # System binaries
            "/usr/sbin",
            "/sbin",
            "/bin"
        ]

    @staticmethod
    def _parse_memory_to_bytes(memory_str: str) -> int:
        """
        Parse memory string (like '512Mi', '1Gi') to bytes.

        Args:
            memory_str: Memory string with unit suffix

        Returns:
            Memory in bytes

        Raises:
            ValueError: If format is invalid
        """
        if not memory_str:
            return 0

        match = re.match(r'^(\d+(?:\.\d+)?)\s*([GMK]i?)$', memory_str.strip())
        if not match:
            raise ValueError(f"Invalid memory format: {memory_str}. Expected format like '512Mi', '1Gi'")

        value, unit = match.groups()
        value = float(value)

        # Convert to bytes based on unit (binary units: Ki, Mi, Gi)
        if unit in ['K', 'Ki']:
            multiplier = 1024
        elif unit in ['M', 'Mi']:
            multiplier = 1024 * 1024
        elif unit in ['G', 'Gi']:
            multiplier = 1024 * 1024 * 1024
        else:
            raise ValueError(f"Unsupported memory unit: {unit}")

        return int(value * multiplier)
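Note that the regex above accepts both bare (`K`, `M`, `G`) and binary (`Ki`, `Mi`, `Gi`) suffixes but always applies binary (1024-based) multipliers. A standalone sketch of the same conversion:

```python
import re

def parse_memory_to_bytes(memory_str: str) -> int:
    """Standalone version of PrefectManager._parse_memory_to_bytes above."""
    if not memory_str:
        return 0
    match = re.match(r'^(\d+(?:\.\d+)?)\s*([GMK]i?)$', memory_str.strip())
    if not match:
        raise ValueError(f"Invalid memory format: {memory_str}")
    value, unit = match.groups()
    # Bare and 'i' suffixes share the same binary multiplier
    multiplier = {"K": 1024, "Ki": 1024,
                  "M": 1024 ** 2, "Mi": 1024 ** 2,
                  "G": 1024 ** 3, "Gi": 1024 ** 3}[unit]
    return int(float(value) * multiplier)

assert parse_memory_to_bytes("512Mi") == 512 * 1024 * 1024
assert parse_memory_to_bytes("1Gi") == 1073741824
assert parse_memory_to_bytes("1.5Gi") == 1610612736
```

The manager only calls this for validation; Docker receives the original string (e.g. `'512Mi'`) untouched.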

    @staticmethod
    def _parse_cpu_to_millicores(cpu_str: str) -> int:
        """
        Parse CPU string (like '500m', '1', '2.5') to millicores.

        Args:
            cpu_str: CPU string

        Returns:
            CPU in millicores (1 core = 1000 millicores)

        Raises:
            ValueError: If format is invalid
        """
        if not cpu_str:
            return 0

        cpu_str = cpu_str.strip()

        # Handle millicores format (e.g., '500m')
        if cpu_str.endswith('m'):
            try:
                return int(cpu_str[:-1])
            except ValueError:
                raise ValueError(f"Invalid CPU format: {cpu_str}")

        # Handle core format (e.g., '1', '2.5')
        try:
            cores = float(cpu_str)
            return int(cores * 1000)  # Convert to millicores
        except ValueError:
            raise ValueError(f"Invalid CPU format: {cpu_str}")
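The CPU parser accepts the two Kubernetes-style spellings shown in its docstring: millicores (`'500m'`) and whole or fractional cores (`'1'`, `'2.5'`). A condensed sketch of the same logic:

```python
def parse_cpu_to_millicores(cpu_str: str) -> int:
    """Standalone version of PrefectManager._parse_cpu_to_millicores above."""
    if not cpu_str:
        return 0
    cpu_str = cpu_str.strip()
    if cpu_str.endswith("m"):  # millicores format, e.g. '500m'
        return int(cpu_str[:-1])
    return int(float(cpu_str) * 1000)  # core format, e.g. '1' or '2.5'

assert parse_cpu_to_millicores("500m") == 500
assert parse_cpu_to_millicores("1") == 1000
assert parse_cpu_to_millicores("2.5") == 2500
```

As with memory, the manager validates the string and then passes the original value (e.g. `'2.5'`) through to Docker as the `cpus` job variable.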

    def _extract_resource_requirements(self, workflow_info: WorkflowInfo) -> Dict[str, str]:
        """
        Extract resource requirements from workflow metadata.

        Args:
            workflow_info: Workflow information with metadata

        Returns:
            Dictionary with resource requirements in Docker format
        """
        metadata = workflow_info.metadata
        requirements = metadata.get("requirements", {})
        resources = requirements.get("resources", {})

        resource_config = {}

        # Extract memory requirement
        memory = resources.get("memory")
        if memory:
            try:
                # Validate memory format and store original string for Docker
                self._parse_memory_to_bytes(memory)
                resource_config["memory"] = memory
            except ValueError as e:
                logger.warning(f"Invalid memory requirement in {workflow_info.name}: {e}")

        # Extract CPU requirement
        cpu = resources.get("cpu")
        if cpu:
            try:
                # Validate CPU format and store original string for Docker
                self._parse_cpu_to_millicores(cpu)
                resource_config["cpus"] = cpu
            except ValueError as e:
                logger.warning(f"Invalid CPU requirement in {workflow_info.name}: {e}")

        # Extract timeout
        timeout = resources.get("timeout")
        if timeout and isinstance(timeout, int):
            resource_config["timeout"] = str(timeout)

        return resource_config

    async def initialize(self):
        """
        Initialize the manager by discovering and deploying all workflows.

        This method:
        1. Discovers all valid workflows in the workflows directory
        2. Validates their metadata
        3. Deploys each workflow to Prefect with Docker images
        """
        try:
            # Discover workflows
            self.workflows = await self.discovery.discover_workflows()

            if not self.workflows:
                logger.warning("No workflows discovered")
                return

            logger.info(f"Discovered {len(self.workflows)} workflows: {list(self.workflows.keys())}")

            # Deploy each workflow
            for name, info in self.workflows.items():
                try:
                    await self._deploy_workflow(name, info)
                except Exception as e:
                    logger.error(f"Failed to deploy workflow '{name}': {e}")

        except Exception as e:
            logger.error(f"Failed to initialize Prefect manager: {e}")
            raise

    async def _deploy_workflow(self, name: str, info: WorkflowInfo):
        """
        Deploy a single workflow to Prefect with Docker image.

        Args:
            name: Workflow name
            info: Workflow information including metadata and paths
        """
        logger.info(f"Deploying workflow '{name}'...")

        # Get the flow function from registry
        flow_func = self.discovery.get_flow_function(name)
        if not flow_func:
            logger.error(
                f"Failed to get flow function for '{name}' from registry. "
                f"Ensure the workflow is properly registered in toolbox/workflows/registry.py"
            )
            return

        # Use the mandatory Dockerfile with absolute paths for Docker Compose
        # Get absolute paths for build context and dockerfile
        toolbox_path = info.path.parent.parent.resolve()
        dockerfile_abs_path = info.dockerfile.resolve()

        # Calculate relative dockerfile path from toolbox context
        try:
            dockerfile_rel_path = dockerfile_abs_path.relative_to(toolbox_path)
        except ValueError:
            # If relative path fails, use the workflow-specific path
            dockerfile_rel_path = Path("workflows") / name / "Dockerfile"

        # Determine deployment strategy based on Dockerfile presence
        base_image = "prefecthq/prefect:3-python3.11"
        has_custom_dockerfile = info.has_docker and info.dockerfile.exists()

        logger.info(f"=== DEPLOYMENT DEBUG for '{name}' ===")
        logger.info(f"info.has_docker: {info.has_docker}")
        logger.info(f"info.dockerfile: {info.dockerfile}")
        logger.info(f"info.dockerfile.exists(): {info.dockerfile.exists()}")
        logger.info(f"has_custom_dockerfile: {has_custom_dockerfile}")
        logger.info(f"toolbox_path: {toolbox_path}")
        logger.info(f"dockerfile_rel_path: {dockerfile_rel_path}")

        if has_custom_dockerfile:
            logger.info(f"Workflow '{name}' has custom Dockerfile - building custom image")
            # Decide whether to use registry or keep images local to host engine
            # Default to using the local registry; set FUZZFORGE_USE_REGISTRY=false to bypass (not recommended)
            use_registry = os.getenv("FUZZFORGE_USE_REGISTRY", "true").lower() == "true"

            if use_registry:
                registry_url = get_registry_url(context="push")
                image_spec = DockerImage(
                    name=f"{registry_url}/fuzzforge/{name}",
                    tag="latest",
                    dockerfile=str(dockerfile_rel_path),
                    context=str(toolbox_path)
                )
                deploy_image = f"{registry_url}/fuzzforge/{name}:latest"
                build_custom = True
                push_custom = True
                logger.info(f"Using registry: {registry_url} for '{name}'")
            else:
                # Single-host mode: build into host engine cache; no push required
                image_spec = DockerImage(
                    name=f"fuzzforge/{name}",
                    tag="latest",
                    dockerfile=str(dockerfile_rel_path),
                    context=str(toolbox_path)
                )
                deploy_image = f"fuzzforge/{name}:latest"
                build_custom = True
                push_custom = False
                logger.info("Using single-host image (no registry push): %s", deploy_image)
        else:
            logger.info(f"Workflow '{name}' using base image - no custom dependencies needed")
            deploy_image = base_image
            build_custom = False
            push_custom = False

        # Pre-validate registry connectivity when pushing
        if push_custom:
            try:
                from .setup import validate_registry_connectivity
                await validate_registry_connectivity(registry_url)
                logger.info(f"Registry connectivity validated for {registry_url}")
            except Exception as e:
                logger.error(f"Registry connectivity validation failed for {registry_url}: {e}")
                raise RuntimeError(f"Cannot deploy workflow '{name}': Registry {registry_url} is not accessible. {e}")

        # Deploy the workflow
        try:
            # Ensure any previous deployment is removed so job variables are updated
            try:
                async with get_client() as client:
                    existing = await client.read_deployment_by_name(
                        f"{name}/{name}-deployment"
                    )
                    if existing:
                        logger.info(f"Removing existing deployment for '{name}' to refresh settings...")
                        await client.delete_deployment(existing.id)
            except Exception:
                # If not found or deletion fails, continue with deployment
                pass

            # Extract resource requirements from metadata
            workflow_resource_requirements = self._extract_resource_requirements(info)
            logger.info(f"Workflow '{name}' resource requirements: {workflow_resource_requirements}")

            # Build job variables with resource requirements
            job_variables = {
                "image": deploy_image,  # Use the worker-accessible registry name
                "volumes": [],  # Populated at run submission with toolbox mount
                "env": {
                    "PYTHONPATH": "/opt/prefect/toolbox:/opt/prefect",
                    "WORKFLOW_NAME": name
                }
            }

            # Add resource requirements to job variables if present
            if workflow_resource_requirements:
                job_variables["resources"] = workflow_resource_requirements

            # Prepare deployment parameters
            deploy_params = {
                "name": f"{name}-deployment",
                "work_pool_name": "docker-pool",
                "image": image_spec if has_custom_dockerfile else deploy_image,
                "push": push_custom,
                "build": build_custom,
                "job_variables": job_variables
            }

            deployment = await flow_func.deploy(**deploy_params)

            self.deployments[name] = str(deployment.id) if hasattr(deployment, 'id') else name
            logger.info(f"Successfully deployed workflow '{name}'")

        except Exception as e:
            # Enhanced error reporting with more context
            import traceback
            logger.error(f"Failed to deploy workflow '{name}': {e}")
            logger.error(f"Deployment traceback: {traceback.format_exc()}")

            # Try to capture Docker-specific context
            error_context = {
                "workflow_name": name,
                "has_dockerfile": has_custom_dockerfile,
                "image_name": deploy_image if 'deploy_image' in locals() else "unknown",
                "registry_url": registry_url if 'registry_url' in locals() else "unknown",
                "error_type": type(e).__name__,
                "error_message": str(e)
            }

            # Check for specific error patterns with detailed categorization
            error_msg_lower = str(e).lower()
            if "registry" in error_msg_lower and ("no such host" in error_msg_lower or "connection" in error_msg_lower):
                error_context["category"] = "registry_connectivity_error"
                error_context["solution"] = f"Cannot reach registry at {error_context['registry_url']}. Check Docker network and registry service."
            elif "docker" in error_msg_lower:
                error_context["category"] = "docker_error"
                if "build" in error_msg_lower:
                    error_context["subcategory"] = "image_build_failed"
                    error_context["solution"] = "Check Dockerfile syntax and dependencies."
                elif "pull" in error_msg_lower:
                    error_context["subcategory"] = "image_pull_failed"
                    error_context["solution"] = "Check if image exists in registry and network connectivity."
                elif "push" in error_msg_lower:
                    error_context["subcategory"] = "image_push_failed"
                    error_context["solution"] = f"Check registry connectivity and push permissions to {error_context['registry_url']}."
            elif "registry" in error_msg_lower:
                error_context["category"] = "registry_error"
                error_context["solution"] = "Check registry configuration and accessibility."
            elif "prefect" in error_msg_lower:
                error_context["category"] = "prefect_error"
                error_context["solution"] = "Check Prefect server connectivity and deployment configuration."
            else:
                error_context["category"] = "unknown_deployment_error"
                error_context["solution"] = "Check logs for more specific error details."

            logger.error(f"Deployment error context: {error_context}")

            # Raise enhanced exception with context
            enhanced_error = Exception(f"Deployment failed for workflow '{name}': {str(e)} | Context: {error_context}")
            enhanced_error.original_error = e
            enhanced_error.context = error_context
            raise enhanced_error

    async def submit_workflow(
        self,
        workflow_name: str,
        target_path: str,
        volume_mode: str = "ro",
        parameters: Dict[str, Any] = None,
        resource_limits: Dict[str, str] = None,
        additional_volumes: list = None,
        timeout: int = None
    ) -> FlowRun:
        """
        Submit a workflow for execution with volume mounting.

        Args:
            workflow_name: Name of the workflow to execute
            target_path: Host path to mount as volume
            volume_mode: Volume mount mode ("ro" for read-only, "rw" for read-write)
            parameters: Workflow-specific parameters
            resource_limits: CPU/memory limits for container
            additional_volumes: List of additional volume mounts
            timeout: Timeout in seconds

        Returns:
            FlowRun object with run information

        Raises:
            ValueError: If workflow not found or volume mode not supported
        """
        if workflow_name not in self.workflows:
            raise ValueError(f"Unknown workflow: {workflow_name}")

        # Validate volume mode
        workflow_info = self.workflows[workflow_name]
        supported_modes = workflow_info.metadata.get("supported_volume_modes", ["ro", "rw"])

        if volume_mode not in supported_modes:
            raise ValueError(
                f"Workflow '{workflow_name}' doesn't support volume mode '{volume_mode}'. "
                f"Supported modes: {supported_modes}"
            )

        # Validate target path with security checks
        self._validate_target_path(target_path)

        # Validate additional volumes if provided
        if additional_volumes:
            for volume in additional_volumes:
                self._validate_target_path(volume.host_path)

        async with get_client() as client:
            # Get the deployment, auto-redeploy once if missing
            try:
                deployment = await client.read_deployment_by_name(
                    f"{workflow_name}/{workflow_name}-deployment"
                )
            except Exception as e:
                import traceback
                logger.error(f"Failed to find deployment for workflow '{workflow_name}': {e}")
                logger.error(f"Deployment lookup traceback: {traceback.format_exc()}")

                # Attempt a one-time auto-deploy to recover from startup races
                try:
                    logger.info(f"Auto-deploying missing workflow '{workflow_name}' and retrying...")
                    await self._deploy_workflow(workflow_name, workflow_info)
                    deployment = await client.read_deployment_by_name(
                        f"{workflow_name}/{workflow_name}-deployment"
                    )
                except Exception as redeploy_exc:
                    # Enhanced error with context
                    error_context = {
                        "workflow_name": workflow_name,
                        "error_type": type(e).__name__,
                        "error_message": str(e),
                        "redeploy_error": str(redeploy_exc),
                        "available_deployments": list(self.deployments.keys()),
                    }
                    enhanced_error = ValueError(
                        f"Deployment not found and redeploy failed for workflow '{workflow_name}': {e} | Context: {error_context}"
                    )
                    enhanced_error.context = error_context
                    raise enhanced_error

            # Determine the Docker Compose network name and volume names
            # Hardcoded to 'fuzzforge' to avoid directory name dependencies
            compose_project = "fuzzforge"
            docker_network = "fuzzforge_default"

            # Build volume mounts
            # Add toolbox volume mount for workflow code access
            backend_toolbox_path = "/app/toolbox"  # Path in backend container

            # Hardcoded volume names
            prefect_storage_volume = "fuzzforge_prefect_storage"
            toolbox_code_volume = "fuzzforge_toolbox_code"

            volumes = [
                f"{target_path}:/workspace:{volume_mode}",
                f"{prefect_storage_volume}:/prefect-storage",  # Shared storage for results
                f"{toolbox_code_volume}:/opt/prefect/toolbox:ro"  # Mount workflow code
            ]

            # Add additional volumes if provided
            if additional_volumes:
                for volume in additional_volumes:
                    volume_spec = f"{volume.host_path}:{volume.container_path}:{volume.mode}"
                    volumes.append(volume_spec)

            # Build environment variables
            env_vars = {
                "PREFECT_API_URL": "http://prefect-server:4200/api",  # Use internal network hostname
                "PREFECT_LOGGING_LEVEL": "INFO",
                "PREFECT_LOCAL_STORAGE_PATH": "/prefect-storage",  # Use shared storage
                "PREFECT_RESULTS_PERSIST_BY_DEFAULT": "true",  # Enable result persistence
                "PREFECT_DEFAULT_RESULT_STORAGE_BLOCK": "local-file-system/fuzzforge-results",  # Use our storage block
                "WORKSPACE_PATH": "/workspace",
                "VOLUME_MODE": volume_mode,
                "WORKFLOW_NAME": workflow_name
            }

            # Add additional volume paths to environment for easy access
            if additional_volumes:
                for i, volume in enumerate(additional_volumes):
                    env_vars[f"ADDITIONAL_VOLUME_{i}_PATH"] = volume.container_path

            # Determine which image to use based on workflow configuration
            workflow_info = self.workflows[workflow_name]
            has_custom_dockerfile = workflow_info.has_docker and workflow_info.dockerfile.exists()
            # Use pull context for worker to pull from registry
            registry_url = get_registry_url(context="pull")
            workflow_image = f"{registry_url}/fuzzforge/{workflow_name}:latest" if has_custom_dockerfile else "prefecthq/prefect:3-python3.11"
            logger.debug(f"Worker will pull image: {workflow_image} (Registry: {registry_url})")

            # Configure job variables with volume mounting and network access
            job_variables = {
                # Use custom image if available, otherwise base Prefect image
                "image": workflow_image,
                "volumes": volumes,
                "networks": [docker_network],  # Connect to Docker Compose network
                "env": {
                    **env_vars,
                    "PYTHONPATH": "/opt/prefect/toolbox:/opt/prefect/toolbox/workflows",
                    "WORKFLOW_NAME": workflow_name
                }
            }

            # Apply resource requirements from workflow metadata and user overrides
            workflow_resource_requirements = self._extract_resource_requirements(workflow_info)
            final_resource_config = {}

            # Start with workflow requirements as base
            if workflow_resource_requirements:
                final_resource_config.update(workflow_resource_requirements)

            # Apply user-provided resource limits (overrides workflow defaults)
            if resource_limits:
                user_resource_config = {}
                if resource_limits.get("cpu_limit"):
                    user_resource_config["cpus"] = resource_limits["cpu_limit"]
                if resource_limits.get("memory_limit"):
                    user_resource_config["memory"] = resource_limits["memory_limit"]
                # Note: cpu_request and memory_request are not directly supported by Docker
                # but could be used for Kubernetes in the future

                # User overrides take precedence
                final_resource_config.update(user_resource_config)

            # Apply final resource configuration
            if final_resource_config:
                job_variables["resources"] = final_resource_config
                logger.info(f"Applied resource limits: {final_resource_config}")

            # Merge parameters with defaults from metadata
            default_params = workflow_info.metadata.get("default_parameters", {})
            final_params = {**default_params, **(parameters or {})}

            # Set flow parameters that match the flow signature
            final_params["target_path"] = "/workspace"  # Container path where volume is mounted
            final_params["volume_mode"] = volume_mode

            # Create and submit the flow run
            # Pass job_variables to ensure network, volumes, and environment are configured
            logger.info(f"Submitting flow with job_variables: {job_variables}")
|
||||
logger.info(f"Submitting flow with parameters: {final_params}")
|
||||
|
||||
# Prepare flow run creation parameters
|
||||
flow_run_params = {
|
||||
"deployment_id": deployment.id,
|
||||
"parameters": final_params,
|
||||
"job_variables": job_variables
|
||||
}
|
||||
|
||||
# Note: Timeout is handled through workflow-level configuration
|
||||
# Additional timeout configuration can be added to deployment metadata if needed
|
||||
|
||||
flow_run = await client.create_flow_run_from_deployment(**flow_run_params)
|
||||
|
||||
logger.info(
|
||||
f"Submitted workflow '{workflow_name}' with run_id: {flow_run.id}, "
|
||||
f"target: {target_path}, mode: {volume_mode}"
|
||||
)
|
||||
|
||||
return flow_run
|
||||
|
||||
async def get_flow_run_findings(self, run_id: str) -> Dict[str, Any]:
|
||||
"""
|
||||
Retrieve findings from a completed flow run.
|
||||
|
||||
Args:
|
||||
run_id: The flow run ID
|
||||
|
||||
Returns:
|
||||
Dictionary containing SARIF-formatted findings
|
||||
|
||||
Raises:
|
||||
ValueError: If run not completed or not found
|
||||
"""
|
||||
async with get_client() as client:
|
||||
flow_run = await client.read_flow_run(run_id)
|
||||
|
||||
if not flow_run.state.is_completed():
|
||||
raise ValueError(
|
||||
f"Flow run {run_id} not completed. Current status: {flow_run.state.name}"
|
||||
)
|
||||
|
||||
# Get the findings from the flow run result
|
||||
try:
|
||||
findings = await flow_run.state.result()
|
||||
return findings
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to retrieve findings for run {run_id}: {e}")
|
||||
raise ValueError(f"Failed to retrieve findings: {e}")
|
||||
|
||||
async def get_flow_run_status(self, run_id: str) -> Dict[str, Any]:
|
||||
"""
|
||||
Get the current status of a flow run.
|
||||
|
||||
Args:
|
||||
run_id: The flow run ID
|
||||
|
||||
Returns:
|
||||
Dictionary with status information
|
||||
"""
|
||||
async with get_client() as client:
|
||||
flow_run = await client.read_flow_run(run_id)
|
||||
|
||||
return {
|
||||
"run_id": str(flow_run.id),
|
||||
"workflow": flow_run.deployment_id,
|
||||
"status": flow_run.state.name,
|
||||
"is_completed": flow_run.state.is_completed(),
|
||||
"is_failed": flow_run.state.is_failed(),
|
||||
"is_running": flow_run.state.is_running(),
|
||||
"created_at": flow_run.created,
|
||||
"updated_at": flow_run.updated
|
||||
}
|
||||
|
||||
def _validate_target_path(self, target_path: str) -> None:
|
||||
"""
|
||||
Validate target path for security before mounting as volume.
|
||||
|
||||
Args:
|
||||
target_path: Host path to validate
|
||||
|
||||
Raises:
|
||||
ValueError: If path is not allowed for security reasons
|
||||
"""
|
||||
target = Path(target_path)
|
||||
|
||||
# Path must be absolute
|
||||
if not target.is_absolute():
|
||||
raise ValueError(f"Target path must be absolute: {target_path}")
|
||||
|
||||
# Resolve path to handle symlinks and relative components
|
||||
try:
|
||||
resolved_path = target.resolve()
|
||||
except (OSError, RuntimeError) as e:
|
||||
raise ValueError(f"Cannot resolve target path: {target_path} - {e}")
|
||||
|
||||
resolved_str = str(resolved_path)
|
||||
|
||||
# Check against forbidden paths first (more restrictive)
|
||||
for forbidden in self.forbidden_paths:
|
||||
if resolved_str.startswith(forbidden):
|
||||
raise ValueError(
|
||||
f"Access denied: Path '{target_path}' resolves to forbidden directory '{forbidden}'. "
|
||||
f"This path contains sensitive system files and cannot be mounted."
|
||||
)
|
||||
|
||||
# Check if path starts with any allowed base path
|
||||
path_allowed = False
|
||||
for allowed in self.allowed_base_paths:
|
||||
if resolved_str.startswith(allowed):
|
||||
path_allowed = True
|
||||
break
|
||||
|
||||
if not path_allowed:
|
||||
allowed_list = ", ".join(self.allowed_base_paths)
|
||||
raise ValueError(
|
||||
f"Access denied: Path '{target_path}' is not in allowed directories. "
|
||||
f"Allowed base paths: {allowed_list}"
|
||||
)
|
||||
|
||||
# Additional security checks
|
||||
if resolved_str == "/":
|
||||
raise ValueError("Cannot mount root filesystem")
|
||||
|
||||
# Warn if path doesn't exist (but don't block - it might be created later)
|
||||
if not resolved_path.exists():
|
||||
logger.warning(f"Target path does not exist: {target_path}")
|
||||
|
||||
logger.info(f"Path validation passed for: {target_path} -> {resolved_str}")
|
||||
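The allow/deny logic in `_validate_target_path` above reduces to a prefix check over two path lists. A minimal standalone sketch of that check (the `ALLOWED` and `FORBIDDEN` values here are illustrative assumptions, not FuzzForge's shipped defaults, and symlink resolution via `Path.resolve()` is omitted):

```python
from pathlib import PurePosixPath

def is_mount_allowed(target: str, allowed: list[str], forbidden: list[str]) -> bool:
    """Return True only if target is under an allowed prefix and under no forbidden one."""
    resolved = str(PurePosixPath(target))  # the real code resolves symlinks first
    if resolved == "/":
        return False  # never mount the root filesystem
    if any(resolved.startswith(f) for f in forbidden):
        return False  # forbidden prefixes win over allowed ones
    return any(resolved.startswith(a) for a in allowed)

# Illustrative policy only
ALLOWED = ["/home", "/tmp", "/workspace"]
FORBIDDEN = ["/etc", "/var/run"]

print(is_mount_allowed("/home/user/project", ALLOWED, FORBIDDEN))  # True
print(is_mount_allowed("/etc/passwd", ALLOWED, FORBIDDEN))         # False
```

Note that checking forbidden prefixes before allowed ones matters: a forbidden directory nested under an allowed base is still rejected.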
backend/src/core/setup.py (Normal file, 402 lines)
@@ -0,0 +1,402 @@
"""
Setup utilities for Prefect infrastructure
"""

# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.

import logging
from prefect import get_client
from prefect.client.schemas.actions import WorkPoolCreate
from prefect.client.schemas.objects import WorkPool
from .prefect_manager import get_registry_url

logger = logging.getLogger(__name__)


async def setup_docker_pool():
    """
    Create or update the Docker work pool for container execution.

    This work pool is configured to:
    - Connect to the local Docker daemon
    - Support volume mounting at runtime
    - Clean up containers after execution
    - Use bridge networking by default
    """
    import os

    async with get_client() as client:
        pool_name = "docker-pool"

        # Add force recreation flag for debugging fresh install issues
        force_recreate = os.getenv('FORCE_RECREATE_WORK_POOL', 'false').lower() == 'true'
        debug_setup = os.getenv('DEBUG_WORK_POOL_SETUP', 'false').lower() == 'true'

        if force_recreate:
            logger.warning("FORCE_RECREATE_WORK_POOL=true - Will recreate work pool regardless of existing configuration")
        if debug_setup:
            logger.warning("DEBUG_WORK_POOL_SETUP=true - Enhanced logging enabled")
            # Temporarily set logging level to DEBUG for this function
            original_level = logger.level
            logger.setLevel(logging.DEBUG)

        try:
            # Check if pool already exists and supports custom images
            existing_pools = await client.read_work_pools()
            existing_pool = None
            for pool in existing_pools:
                if pool.name == pool_name:
                    existing_pool = pool
                    break

            if existing_pool and not force_recreate:
                logger.info(f"Found existing work pool '{pool_name}' - validating configuration...")

                # Check if the existing pool has the correct configuration
                base_template = existing_pool.base_job_template or {}
                logger.debug(f"Base template keys: {list(base_template.keys())}")

                job_config = base_template.get("job_configuration", {})
                logger.debug(f"Job config keys: {list(job_config.keys())}")

                image_config = job_config.get("image", "")
                has_image_variable = "{{ image }}" in str(image_config)
                logger.debug(f"Image config: '{image_config}' -> has_image_variable: {has_image_variable}")

                # Check if volume defaults include toolbox mount
                variables = base_template.get("variables", {})
                properties = variables.get("properties", {})
                volume_config = properties.get("volumes", {})
                volume_defaults = volume_config.get("default", [])
                has_toolbox_volume = any("toolbox_code" in str(vol) for vol in volume_defaults) if volume_defaults else False
                logger.debug(f"Volume defaults: {volume_defaults}")
                logger.debug(f"Has toolbox volume: {has_toolbox_volume}")

                # Check if environment defaults include required settings
                env_config = properties.get("env", {})
                env_defaults = env_config.get("default", {})
                has_api_url = "PREFECT_API_URL" in env_defaults
                has_storage_path = "PREFECT_LOCAL_STORAGE_PATH" in env_defaults
                has_results_persist = "PREFECT_RESULTS_PERSIST_BY_DEFAULT" in env_defaults
                has_required_env = has_api_url and has_storage_path and has_results_persist
                logger.debug(f"Environment defaults: {env_defaults}")
                logger.debug(f"Has API URL: {has_api_url}, Has storage path: {has_storage_path}, Has results persist: {has_results_persist}")
                logger.debug(f"Has required env: {has_required_env}")

                # Log the full validation result
                logger.info(f"Work pool validation - Image: {has_image_variable}, Toolbox: {has_toolbox_volume}, Environment: {has_required_env}")

                if has_image_variable and has_toolbox_volume and has_required_env:
                    logger.info(f"Docker work pool '{pool_name}' already exists with correct configuration")
                    return
                else:
                    reasons = []
                    if not has_image_variable:
                        reasons.append("missing image template")
                    if not has_toolbox_volume:
                        reasons.append("missing toolbox volume mount")
                    if not has_required_env:
                        if not has_api_url:
                            reasons.append("missing PREFECT_API_URL")
                        if not has_storage_path:
                            reasons.append("missing PREFECT_LOCAL_STORAGE_PATH")
                        if not has_results_persist:
                            reasons.append("missing PREFECT_RESULTS_PERSIST_BY_DEFAULT")

                    logger.warning(f"Docker work pool '{pool_name}' exists but lacks: {', '.join(reasons)}. Recreating...")
                    # Delete the old pool and recreate it
                    try:
                        await client.delete_work_pool(pool_name)
                        logger.info(f"Deleted old work pool '{pool_name}'")
                    except Exception as e:
                        logger.warning(f"Failed to delete old work pool: {e}")
            elif force_recreate and existing_pool:
                logger.warning(f"Force recreation enabled - deleting existing work pool '{pool_name}'")
                try:
                    await client.delete_work_pool(pool_name)
                    logger.info("Deleted existing work pool for force recreation")
                except Exception as e:
                    logger.warning(f"Failed to delete work pool for force recreation: {e}")

            logger.info(f"Creating Docker work pool '{pool_name}' with custom image support...")

            # Create the work pool with proper Docker configuration
            work_pool = WorkPoolCreate(
                name=pool_name,
                type="docker",
                description="Docker work pool for FuzzForge workflows with custom image support",
                base_job_template={
                    "job_configuration": {
                        "image": "{{ image }}",  # Template variable for custom images
                        "volumes": "{{ volumes }}",  # List of volume mounts
                        "env": "{{ env }}",  # Environment variables
                        "networks": "{{ networks }}",  # Docker networks
                        "stream_output": True,
                        "auto_remove": True,
                        "privileged": False,
                        "network_mode": None,  # Use networks instead
                        "labels": {},
                        "command": None  # Let the image's CMD/ENTRYPOINT run
                    },
                    "variables": {
                        "type": "object",
                        "properties": {
                            "image": {
                                "type": "string",
                                "title": "Docker Image",
                                "default": "prefecthq/prefect:3-python3.11",
                                "description": "Docker image for the flow run"
                            },
                            "volumes": {
                                "type": "array",
                                "title": "Volume Mounts",
                                "default": [
                                    "fuzzforge_prefect_storage:/prefect-storage",
                                    "fuzzforge_toolbox_code:/opt/prefect/toolbox:ro"
                                ],
                                "description": "Volume mounts in format 'host:container:mode'",
                                "items": {
                                    "type": "string"
                                }
                            },
                            "networks": {
                                "type": "array",
                                "title": "Docker Networks",
                                "default": ["fuzzforge_default"],
                                "description": "Docker networks to connect container to",
                                "items": {
                                    "type": "string"
                                }
                            },
                            "env": {
                                "type": "object",
                                "title": "Environment Variables",
                                "default": {
                                    "PREFECT_API_URL": "http://prefect-server:4200/api",
                                    "PREFECT_LOCAL_STORAGE_PATH": "/prefect-storage",
                                    "PREFECT_RESULTS_PERSIST_BY_DEFAULT": "true"
                                },
                                "description": "Environment variables for the container",
                                "additionalProperties": {
                                    "type": "string"
                                }
                            }
                        }
                    }
                }
            )

            await client.create_work_pool(work_pool)
            logger.info(f"Created Docker work pool '{pool_name}'")

        except Exception as e:
            logger.error(f"Failed to setup Docker work pool: {e}")
            raise
        finally:
            # Restore original logging level if debug mode was enabled
            if debug_setup and 'original_level' in locals():
                logger.setLevel(original_level)


def get_actual_compose_project_name():
    """
    Return the hardcoded compose project name for FuzzForge.

    Always returns 'fuzzforge' as per system requirements.
    """
    logger.info("Using hardcoded compose project name: fuzzforge")
    return "fuzzforge"


async def setup_result_storage():
    """
    Create or update Prefect result storage block for findings persistence.

    This sets up a LocalFileSystem storage block pointing to the shared
    /prefect-storage volume for result persistence.
    """
    from prefect.filesystems import LocalFileSystem

    storage_name = "fuzzforge-results"

    try:
        # Create the storage block, overwrite if it exists
        logger.info(f"Setting up storage block '{storage_name}'...")
        storage = LocalFileSystem(basepath="/prefect-storage")

        block_doc_id = await storage.save(name=storage_name, overwrite=True)
        logger.info(f"Storage block '{storage_name}' configured successfully")
        return str(block_doc_id)

    except Exception as e:
        logger.error(f"Failed to setup result storage: {e}")
        # Don't raise the exception - continue without storage block
        logger.warning("Continuing without result storage block - findings may not persist")
        return None


async def validate_docker_connection():
    """
    Validate that Docker is accessible and running.

    Note: In containerized deployments with Docker socket proxy,
    the backend doesn't need direct Docker access.

    Raises:
        RuntimeError: If Docker is not accessible
    """
    import os

    # Skip Docker validation if running in container without socket access
    if os.path.exists("/.dockerenv") and not os.path.exists("/var/run/docker.sock"):
        logger.info("Running in container without Docker socket - skipping Docker validation")
        return

    try:
        import docker
        client = docker.from_env()
        client.ping()
        logger.info("Docker connection validated")
    except Exception as e:
        logger.error(f"Docker is not accessible: {e}")
        raise RuntimeError(
            "Docker is not running or not accessible. "
            "Please ensure Docker is installed and running."
        )


async def validate_registry_connectivity(registry_url: str = None):
    """
    Validate that the Docker registry is accessible.

    Args:
        registry_url: URL of the Docker registry to validate (auto-detected if None)

    Raises:
        RuntimeError: If registry is not accessible
    """
    import os

    # Resolve a reachable test URL from within this process
    if registry_url is None:
        # If not specified, prefer internal service name in containers, host port on host
        if os.path.exists('/.dockerenv'):
            registry_url = "registry:5000"
        else:
            registry_url = "localhost:5001"

    # If we're running inside a container and asked to probe localhost:PORT,
    # the probe would hit the container, not the host. Use host.docker.internal instead.
    try:
        host_part, port_part = registry_url.split(":", 1)
    except ValueError:
        host_part, port_part = registry_url, "80"

    if os.path.exists('/.dockerenv') and host_part in ("localhost", "127.0.0.1"):
        test_host = "host.docker.internal"
    else:
        test_host = host_part
    test_url = f"http://{test_host}:{port_part}/v2/"

    import aiohttp
    import asyncio

    logger.info(f"Validating registry connectivity to {registry_url}...")

    try:
        async with aiohttp.ClientSession(timeout=aiohttp.ClientTimeout(total=10)) as session:
            async with session.get(test_url) as response:
                if response.status == 200:
                    logger.info(f"Registry at {registry_url} is accessible (tested via {test_host})")
                    return
                else:
                    raise RuntimeError(f"Registry returned status {response.status}")
    except asyncio.TimeoutError:
        raise RuntimeError(f"Registry at {registry_url} is not responding (timeout)")
    except aiohttp.ClientError as e:
        raise RuntimeError(f"Registry at {registry_url} is not accessible: {e}")
    except Exception as e:
        raise RuntimeError(f"Failed to validate registry connectivity: {e}")


async def validate_docker_network(network_name: str):
    """
    Validate that the specified Docker network exists.

    Args:
        network_name: Name of the Docker network to validate

    Raises:
        RuntimeError: If network doesn't exist
    """
    import os

    # Skip network validation if running in container without Docker socket
    if os.path.exists("/.dockerenv") and not os.path.exists("/var/run/docker.sock"):
        logger.info("Running in container without Docker socket - skipping network validation")
        return

    try:
        import docker
        client = docker.from_env()

        # List all networks
        networks = client.networks.list(names=[network_name])

        if not networks:
            # Try to find networks with similar names
            all_networks = client.networks.list()
            similar_networks = [n.name for n in all_networks if "fuzzforge" in n.name.lower()]

            error_msg = f"Docker network '{network_name}' not found."
            if similar_networks:
                error_msg += f" Available networks: {similar_networks}"
            else:
                error_msg += " Please ensure Docker Compose is running."

            raise RuntimeError(error_msg)

        logger.info(f"Docker network '{network_name}' validated")

    except Exception as e:
        if isinstance(e, RuntimeError):
            raise
        logger.error(f"Network validation failed: {e}")
        raise RuntimeError(f"Failed to validate Docker network: {e}")


async def validate_infrastructure():
    """
    Validate all required infrastructure components.

    This should be called during startup to ensure everything is ready.
    """
    logger.info("Validating infrastructure...")

    # Validate Docker connection
    await validate_docker_connection()

    # Validate registry connectivity for custom image building
    await validate_registry_connectivity()

    # Validate network (hardcoded to avoid directory name dependencies)
    compose_project = "fuzzforge"
    docker_network = "fuzzforge_default"

    try:
        await validate_docker_network(docker_network)
    except RuntimeError as e:
        logger.warning(f"Network validation failed: {e}")
        logger.warning("Workflows may not be able to connect to Prefect services")

    logger.info("Infrastructure validation completed")
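The `{{ image }}`-style placeholders in the work pool's base job template above are filled in from the pool's variables when a flow run is submitted. A rough sketch of that substitution mechanism (a deliberate simplification of Prefect's actual templating engine, shown only to illustrate how `job_configuration` fields map to `variables`):

```python
import re

def render_job_config(template: dict, values: dict) -> dict:
    """Replace '{{ name }}' placeholders with values; non-string fields pass through."""
    rendered = {}
    for key, val in template.items():
        if isinstance(val, str):
            # A whole-string placeholder like "{{ image }}" is swapped for its value
            match = re.fullmatch(r"\{\{\s*(\w+)\s*\}\}", val)
            rendered[key] = values.get(match.group(1), val) if match else val
        else:
            rendered[key] = val  # booleans, None, dicts, lists are kept as-is
    return rendered

job_configuration = {"image": "{{ image }}", "volumes": "{{ volumes }}", "stream_output": True}
values = {"image": "registry:5000/fuzzforge/demo:latest", "volumes": ["vol:/data"]}
print(render_job_config(job_configuration, values))
```

Because substitution happens per run, the same pool can launch a custom-built workflow image or fall back to the `prefecthq/prefect:3-python3.11` default declared in the variables schema.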
backend/src/core/workflow_discovery.py (Normal file, 459 lines)
@@ -0,0 +1,459 @@
|
||||
"""
|
||||
Workflow Discovery - Registry-based discovery and loading of workflows
|
||||
"""
|
||||
|
||||
# Copyright (c) 2025 FuzzingLabs
|
||||
#
|
||||
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
|
||||
# at the root of this repository for details.
|
||||
#
|
||||
# After the Change Date (four years from publication), this version of the
|
||||
# Licensed Work will be made available under the Apache License, Version 2.0.
|
||||
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Additional attribution and requirements are provided in the NOTICE file.
|
||||
|
||||
import logging
|
||||
import yaml
|
||||
from pathlib import Path
|
||||
from typing import Dict, Optional, Any, Callable
|
||||
from pydantic import BaseModel, Field, ConfigDict
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class WorkflowInfo(BaseModel):
|
||||
"""Information about a discovered workflow"""
|
||||
name: str = Field(..., description="Workflow name")
|
||||
path: Path = Field(..., description="Path to workflow directory")
|
||||
workflow_file: Path = Field(..., description="Path to workflow.py file")
|
||||
dockerfile: Path = Field(..., description="Path to Dockerfile")
|
||||
has_docker: bool = Field(..., description="Whether workflow has custom Dockerfile")
|
||||
metadata: Dict[str, Any] = Field(..., description="Workflow metadata from YAML")
|
||||
flow_function_name: str = Field(default="main_flow", description="Name of the flow function")
|
||||
|
||||
model_config = ConfigDict(arbitrary_types_allowed=True)
|
||||
|
||||
|
||||
class WorkflowDiscovery:
|
||||
"""
|
||||
Discovers workflows from the filesystem and validates them against the registry.
|
||||
|
||||
This system:
|
||||
1. Scans for workflows with metadata.yaml files
|
||||
2. Cross-references them with the manual registry
|
||||
3. Provides registry-based flow functions for deployment
|
||||
|
||||
Workflows must have:
|
||||
- workflow.py: Contains the Prefect flow
|
||||
- metadata.yaml: Mandatory metadata file
|
||||
- Entry in toolbox/workflows/registry.py: Manual registration
|
||||
- Dockerfile (optional): Custom container definition
|
||||
- requirements.txt (optional): Python dependencies
|
||||
"""
|
||||
|
||||
def __init__(self, workflows_dir: Path):
|
||||
"""
|
||||
Initialize workflow discovery.
|
||||
|
||||
Args:
|
||||
workflows_dir: Path to the workflows directory
|
||||
"""
|
||||
self.workflows_dir = workflows_dir
|
||||
if not self.workflows_dir.exists():
|
||||
self.workflows_dir.mkdir(parents=True, exist_ok=True)
|
||||
logger.info(f"Created workflows directory: {self.workflows_dir}")
|
||||
|
||||
# Import registry - this validates it on import
|
||||
try:
|
||||
from toolbox.workflows.registry import WORKFLOW_REGISTRY, list_registered_workflows
|
||||
self.registry = WORKFLOW_REGISTRY
|
||||
logger.info(f"Loaded workflow registry with {len(self.registry)} registered workflows")
|
||||
except ImportError as e:
|
||||
logger.error(f"Failed to import workflow registry: {e}")
|
||||
self.registry = {}
|
||||
except Exception as e:
|
||||
logger.error(f"Registry validation failed: {e}")
|
||||
self.registry = {}
|
||||
|
||||
# Cache for discovered workflows
|
||||
self._workflow_cache: Optional[Dict[str, WorkflowInfo]] = None
|
||||
self._cache_timestamp: Optional[float] = None
|
||||
self._cache_ttl = 60.0 # Cache TTL in seconds
|
||||
|
||||
async def discover_workflows(self) -> Dict[str, WorkflowInfo]:
|
||||
"""
|
||||
Discover workflows by cross-referencing filesystem with registry.
|
||||
Uses caching to avoid frequent filesystem scans.
|
||||
|
||||
Returns:
|
||||
Dictionary mapping workflow names to their information
|
||||
"""
|
||||
# Check cache validity
|
||||
import time
|
||||
current_time = time.time()
|
||||
|
||||
if (self._workflow_cache is not None and
|
||||
self._cache_timestamp is not None and
|
||||
(current_time - self._cache_timestamp) < self._cache_ttl):
|
||||
# Return cached results
|
||||
logger.debug(f"Returning cached workflow discovery ({len(self._workflow_cache)} workflows)")
|
||||
return self._workflow_cache
|
||||
workflows = {}
|
||||
discovered_dirs = set()
|
||||
registry_names = set(self.registry.keys())
|
||||
|
||||
if not self.workflows_dir.exists():
|
||||
logger.warning(f"Workflows directory does not exist: {self.workflows_dir}")
|
||||
return workflows
|
||||
|
||||
# Recursively scan all directories and subdirectories
|
||||
await self._scan_directory_recursive(self.workflows_dir, workflows, discovered_dirs)
|
||||
|
||||
# Check for registry entries without corresponding directories
|
||||
missing_dirs = registry_names - discovered_dirs
|
||||
if missing_dirs:
|
||||
logger.warning(
|
||||
f"Registry contains workflows without filesystem directories: {missing_dirs}. "
|
||||
f"These workflows cannot be deployed."
|
||||
)
|
||||
|
||||
logger.info(
|
||||
f"Discovery complete: {len(workflows)} workflows ready for deployment, "
|
||||
f"{len(missing_dirs)} registry entries missing directories, "
|
||||
f"{len(discovered_dirs - registry_names)} filesystem workflows not registered"
|
||||
)
|
||||
|
||||
# Update cache
|
||||
self._workflow_cache = workflows
|
||||
self._cache_timestamp = current_time
|
||||
|
||||
return workflows
|
||||
|
||||
async def _scan_directory_recursive(self, directory: Path, workflows: Dict[str, WorkflowInfo], discovered_dirs: set):
|
||||
"""
|
||||
Recursively scan directory for workflows.
|
||||
|
||||
Args:
|
||||
directory: Directory to scan
|
||||
workflows: Dictionary to populate with discovered workflows
|
||||
discovered_dirs: Set to track discovered workflow names
|
||||
"""
|
||||
for item in directory.iterdir():
|
||||
if not item.is_dir():
|
||||
continue
|
||||
|
||||
if item.name.startswith('_') or item.name.startswith('.'):
|
||||
continue # Skip hidden or private directories
|
||||
|
||||
# Check if this directory contains workflow files (workflow.py and metadata.yaml)
|
||||
workflow_file = item / "workflow.py"
|
||||
metadata_file = item / "metadata.yaml"
|
||||
|
||||
if workflow_file.exists() and metadata_file.exists():
|
||||
# This is a workflow directory
|
||||
workflow_name = item.name
|
||||
discovered_dirs.add(workflow_name)
|
||||
|
||||
# Only process workflows that are in the registry
|
||||
if workflow_name not in self.registry:
|
||||
logger.warning(
|
||||
f"Workflow '{workflow_name}' found in filesystem but not in registry. "
|
||||
f"Add it to toolbox/workflows/registry.py to enable deployment."
|
||||
)
|
||||
continue
|
||||
|
||||
try:
|
||||
workflow_info = await self._load_workflow(item)
|
||||
if workflow_info:
|
||||
workflows[workflow_info.name] = workflow_info
|
||||
logger.info(f"Discovered and registered workflow: {workflow_info.name}")
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to load workflow from {item}: {e}")
|
||||
else:
|
||||
# This is a category directory, recurse into it
|
||||
await self._scan_directory_recursive(item, workflows, discovered_dirs)
|
||||
|
||||
async def _load_workflow(self, workflow_dir: Path) -> Optional[WorkflowInfo]:
|
||||
"""
|
||||
Load and validate a single workflow.
|
||||
|
||||
Args:
|
||||
workflow_dir: Path to the workflow directory
|
||||
|
||||
Returns:
|
||||
WorkflowInfo if valid, None otherwise
|
||||
"""
|
||||
workflow_name = workflow_dir.name
|
||||
|
||||
# Check for mandatory files
|
||||
workflow_file = workflow_dir / "workflow.py"
|
||||
metadata_file = workflow_dir / "metadata.yaml"
|
||||
|
||||
if not workflow_file.exists():
|
||||
logger.warning(f"Workflow {workflow_name} missing workflow.py")
|
||||
return None
|
||||
|
||||
if not metadata_file.exists():
|
||||
logger.error(f"Workflow {workflow_name} missing mandatory metadata.yaml")
|
||||
return None
|
||||
|
||||
# Load and validate metadata
|
||||
try:
|
||||
metadata = self._load_metadata(metadata_file)
|
||||
if not self._validate_metadata(metadata, workflow_name):
|
||||
return None
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to load metadata for {workflow_name}: {e}")
|
||||
return None
|
||||
|
||||
# Check for mandatory Dockerfile
|
||||
dockerfile = workflow_dir / "Dockerfile"
|
||||
if not dockerfile.exists():
|
||||
logger.error(f"Workflow {workflow_name} missing mandatory Dockerfile")
|
||||
return None
|
||||
|
||||
has_docker = True # Always True since Dockerfile is mandatory
|
||||
|
||||
# Get flow function name from metadata or use default
|
||||
flow_function_name = metadata.get("flow_function", "main_flow")
|
||||
|
||||
return WorkflowInfo(
|
||||
name=workflow_name,
|
||||
path=workflow_dir,
|
||||
workflow_file=workflow_file,
|
||||
dockerfile=dockerfile,
|
||||
has_docker=has_docker,
|
||||
metadata=metadata,
|
||||
flow_function_name=flow_function_name
|
||||
)
|
||||
|
||||
def _load_metadata(self, metadata_file: Path) -> Dict[str, Any]:
|
||||
"""
|
||||
Load metadata from YAML file.
|
||||
|
||||
Args:
|
||||
metadata_file: Path to metadata.yaml
|
||||
|
||||
Returns:
|
||||
Dictionary containing metadata
|
||||
"""
|
||||
with open(metadata_file, 'r') as f:
|
||||
metadata = yaml.safe_load(f)
|
||||
|
||||
if metadata is None:
|
||||
raise ValueError("Empty metadata file")
|
||||
|
||||
return metadata
|
||||
|
||||
    def _validate_metadata(self, metadata: Dict[str, Any], workflow_name: str) -> bool:
        """
        Validate that metadata contains all required fields.

        Args:
            metadata: Metadata dictionary
            workflow_name: Name of the workflow for logging

        Returns:
            True if valid, False otherwise
        """
        required_fields = ["name", "version", "description", "author", "category", "parameters", "requirements"]

        missing_fields = []
        for field in required_fields:
            if field not in metadata:
                missing_fields.append(field)

        if missing_fields:
            logger.error(
                f"Workflow {workflow_name} metadata missing required fields: {missing_fields}"
            )
            return False

        # Validate version format (semantic versioning)
        version = metadata.get("version", "")
        if not self._is_valid_version(version):
            logger.error(f"Workflow {workflow_name} has invalid version format: {version}")
            return False

        # Validate parameters structure
        parameters = metadata.get("parameters", {})
        if not isinstance(parameters, dict):
            logger.error(f"Workflow {workflow_name} parameters must be a dictionary")
            return False

        return True

    def _is_valid_version(self, version: str) -> bool:
        """
        Check if version follows semantic versioning (x.y.z).

        Args:
            version: Version string

        Returns:
            True if valid semantic version
        """
        try:
            parts = version.split('.')
            if len(parts) != 3:
                return False
            for part in parts:
                int(part)  # Check if each part is a number
            return True
        except (ValueError, AttributeError):
            return False

    def invalidate_cache(self) -> None:
        """
        Invalidate the workflow discovery cache.
        Useful when workflows are added or modified.
        """
        self._workflow_cache = None
        self._cache_timestamp = None
        logger.debug("Workflow discovery cache invalidated")

    def get_flow_function(self, workflow_name: str) -> Optional[Callable]:
        """
        Get the flow function from the registry.

        Args:
            workflow_name: Name of the workflow

        Returns:
            The flow function if found in registry, None otherwise
        """
        if workflow_name not in self.registry:
            logger.error(
                f"Workflow '{workflow_name}' not found in registry. "
                f"Available workflows: {list(self.registry.keys())}"
            )
            return None

        try:
            from toolbox.workflows.registry import get_workflow_flow
            flow_func = get_workflow_flow(workflow_name)
            logger.debug(f"Retrieved flow function for '{workflow_name}' from registry")
            return flow_func
        except Exception as e:
            logger.error(f"Failed to get flow function for '{workflow_name}': {e}")
            return None

    def get_registry_info(self, workflow_name: str) -> Optional[Dict[str, Any]]:
        """
        Get registry information for a workflow.

        Args:
            workflow_name: Name of the workflow

        Returns:
            Registry information if found, None otherwise
        """
        if workflow_name not in self.registry:
            return None

        try:
            from toolbox.workflows.registry import get_workflow_info
            return get_workflow_info(workflow_name)
        except Exception as e:
            logger.error(f"Failed to get registry info for '{workflow_name}': {e}")
            return None

    @staticmethod
    def get_metadata_schema() -> Dict[str, Any]:
        """
        Get the JSON schema for workflow metadata.

        Returns:
            JSON schema dictionary
        """
        return {
            "type": "object",
            "required": ["name", "version", "description", "author", "category", "parameters", "requirements"],
            "properties": {
                "name": {
                    "type": "string",
                    "description": "Workflow name"
                },
                "version": {
                    "type": "string",
                    "pattern": "^\\d+\\.\\d+\\.\\d+$",
                    "description": "Semantic version (x.y.z)"
                },
                "description": {
                    "type": "string",
                    "description": "Workflow description"
                },
                "author": {
                    "type": "string",
                    "description": "Workflow author"
                },
                "category": {
                    "type": "string",
                    "enum": ["comprehensive", "specialized", "fuzzing", "focused"],
                    "description": "Workflow category"
                },
                "tags": {
                    "type": "array",
                    "items": {"type": "string"},
                    "description": "Workflow tags for categorization"
                },
                "requirements": {
                    "type": "object",
                    "required": ["tools", "resources"],
                    "properties": {
                        "tools": {
                            "type": "array",
                            "items": {"type": "string"},
                            "description": "Required security tools"
                        },
                        "resources": {
                            "type": "object",
                            "required": ["memory", "cpu", "timeout"],
                            "properties": {
                                "memory": {
                                    "type": "string",
                                    "pattern": "^\\d+[GMK]i$",
                                    "description": "Memory limit (e.g., 1Gi, 512Mi)"
                                },
                                "cpu": {
                                    "type": "string",
                                    "pattern": "^\\d+m?$",
                                    "description": "CPU limit (e.g., 1000m, 2)"
                                },
                                "timeout": {
                                    "type": "integer",
                                    "minimum": 60,
                                    "maximum": 7200,
                                    "description": "Workflow timeout in seconds"
                                }
                            }
                        }
                    }
                },
                "parameters": {
                    "type": "object",
                    "description": "Workflow parameters schema"
                },
                "default_parameters": {
                    "type": "object",
                    "description": "Default parameter values"
                },
                "required_modules": {
                    "type": "array",
                    "items": {"type": "string"},
                    "description": "Required module names"
                },
                "supported_volume_modes": {
                    "type": "array",
                    "items": {"enum": ["ro", "rw"]},
                    "default": ["ro", "rw"],
                    "description": "Supported volume mount modes"
                },
                "flow_function": {
                    "type": "string",
                    "default": "main_flow",
                    "description": "Name of the flow function in workflow.py"
                }
            }
        }
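The schema above is easiest to sanity-check against a concrete document. Below is a minimal sketch of a metadata document that satisfies the required fields and patterns; every value here (workflow name, tool, resource limits) is illustrative and not taken from any shipped workflow:

```python
import re

# Hypothetical metadata document; all values are illustrative
metadata = {
    "name": "example_scan",
    "version": "1.0.0",
    "description": "Illustrative workflow metadata",
    "author": "example",
    "category": "fuzzing",
    "parameters": {},
    "requirements": {
        "tools": ["example-fuzzer"],
        "resources": {"memory": "1Gi", "cpu": "1000m", "timeout": 600},
    },
}

# The same constraints the schema encodes, checked by hand
required = ["name", "version", "description", "author",
            "category", "parameters", "requirements"]
assert all(field in metadata for field in required)
assert re.fullmatch(r"\d+\.\d+\.\d+", metadata["version"])

resources = metadata["requirements"]["resources"]
assert re.fullmatch(r"\d+[GMK]i", resources["memory"])
assert re.fullmatch(r"\d+m?", resources["cpu"])
assert 60 <= resources["timeout"] <= 7200
print("metadata OK")
```

A document missing any required field, or with a version like `1.0`, would be rejected by `_validate_metadata` before the workflow is registered.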
864	backend/src/main.py	(new file)
@@ -0,0 +1,864 @@
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.

import asyncio
import logging
import os
from uuid import UUID
from contextlib import AsyncExitStack, asynccontextmanager, suppress
from typing import Any, Dict, Optional, List

import uvicorn
from fastapi import FastAPI
from starlette.applications import Starlette
from starlette.routing import Mount

from fastmcp.server.http import create_sse_app

from src.core.prefect_manager import PrefectManager
from src.core.setup import setup_docker_pool, setup_result_storage, validate_infrastructure
from src.core.workflow_discovery import WorkflowDiscovery
from src.api import workflows, runs, fuzzing
from src.services.prefect_stats_monitor import prefect_stats_monitor

from fastmcp import FastMCP
from prefect.client.orchestration import get_client
from prefect.client.schemas.filters import (
    FlowRunFilter,
    FlowRunFilterDeploymentId,
    FlowRunFilterState,
    FlowRunFilterStateType,
)
from prefect.client.schemas.sorting import FlowRunSort
from prefect.states import StateType

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

prefect_mgr = PrefectManager()


class PrefectBootstrapState:
    """Tracks Prefect initialization progress for API and MCP consumers."""

    def __init__(self) -> None:
        self.ready: bool = False
        self.status: str = "not_started"
        self.last_error: Optional[str] = None
        self.task_running: bool = False

    def as_dict(self) -> Dict[str, Any]:
        return {
            "ready": self.ready,
            "status": self.status,
            "last_error": self.last_error,
            "task_running": self.task_running,
        }


prefect_bootstrap_state = PrefectBootstrapState()

# Configure retry strategy for bootstrapping Prefect + infrastructure
STARTUP_RETRY_SECONDS = max(1, int(os.getenv("FUZZFORGE_STARTUP_RETRY_SECONDS", "5")))
STARTUP_RETRY_MAX_SECONDS = max(
    STARTUP_RETRY_SECONDS,
    int(os.getenv("FUZZFORGE_STARTUP_RETRY_MAX_SECONDS", "60")),
)

prefect_bootstrap_task: Optional[asyncio.Task] = None

# ---------------------------------------------------------------------------
# FastAPI application (REST API remains unchanged)
# ---------------------------------------------------------------------------

app = FastAPI(
    title="FuzzForge API",
    description="Security testing workflow orchestration API with fuzzing support",
    version="0.6.0",
)

app.include_router(workflows.router)
app.include_router(runs.router)
app.include_router(fuzzing.router)

def get_prefect_status() -> Dict[str, Any]:
    """Return a snapshot of Prefect bootstrap state for diagnostics."""
    status = prefect_bootstrap_state.as_dict()
    status["workflows_loaded"] = len(prefect_mgr.workflows)
    status["deployments_tracked"] = len(prefect_mgr.deployments)
    status["bootstrap_task_running"] = (
        prefect_bootstrap_task is not None and not prefect_bootstrap_task.done()
    )
    return status


def _prefect_not_ready_status() -> Optional[Dict[str, Any]]:
    """Return status details if Prefect is not ready yet."""
    status = get_prefect_status()
    if status.get("ready"):
        return None
    return status

@app.get("/")
async def root() -> Dict[str, Any]:
    status = get_prefect_status()
    return {
        "name": "FuzzForge API",
        "version": "0.6.0",
        "status": "ready" if status.get("ready") else "initializing",
        "workflows_loaded": status.get("workflows_loaded", 0),
        "prefect": status,
    }


@app.get("/health")
async def health() -> Dict[str, str]:
    status = get_prefect_status()
    health_status = "healthy" if status.get("ready") else "initializing"
    return {"status": health_status}

# Map FastAPI OpenAPI operationIds to readable MCP tool names
FASTAPI_MCP_NAME_OVERRIDES: Dict[str, str] = {
    "list_workflows_workflows__get": "api_list_workflows",
    "get_metadata_schema_workflows_metadata_schema_get": "api_get_metadata_schema",
    "get_workflow_metadata_workflows__workflow_name__metadata_get": "api_get_workflow_metadata",
    "submit_workflow_workflows__workflow_name__submit_post": "api_submit_workflow",
    "get_workflow_parameters_workflows__workflow_name__parameters_get": "api_get_workflow_parameters",
    "get_run_status_runs__run_id__status_get": "api_get_run_status",
    "get_run_findings_runs__run_id__findings_get": "api_get_run_findings",
    "get_workflow_findings_runs__workflow_name__findings__run_id__get": "api_get_workflow_findings",
    "get_fuzzing_stats_fuzzing__run_id__stats_get": "api_get_fuzzing_stats",
    "update_fuzzing_stats_fuzzing__run_id__stats_post": "api_update_fuzzing_stats",
    "get_crash_reports_fuzzing__run_id__crashes_get": "api_get_crash_reports",
    "report_crash_fuzzing__run_id__crash_post": "api_report_crash",
    "stream_fuzzing_updates_fuzzing__run_id__stream_get": "api_stream_fuzzing_updates",
    "cleanup_fuzzing_run_fuzzing__run_id__delete": "api_cleanup_fuzzing_run",
    "root__get": "api_root",
    "health_health_get": "api_health",
}


# Create an MCP adapter exposing all FastAPI endpoints via OpenAPI parsing
FASTAPI_MCP_ADAPTER = FastMCP.from_fastapi(
    app,
    name="FuzzForge FastAPI",
    mcp_names=FASTAPI_MCP_NAME_OVERRIDES,
)
_fastapi_mcp_imported = False


# ---------------------------------------------------------------------------
# FastMCP server (runs on dedicated port outside FastAPI)
# ---------------------------------------------------------------------------

mcp = FastMCP(name="FuzzForge MCP")

async def _bootstrap_prefect_with_retries() -> None:
    """Initialize Prefect infrastructure with exponential backoff retries."""

    attempt = 0

    while True:
        attempt += 1
        prefect_bootstrap_state.task_running = True
        prefect_bootstrap_state.status = "starting"
        prefect_bootstrap_state.ready = False
        prefect_bootstrap_state.last_error = None

        try:
            logger.info("Bootstrapping Prefect infrastructure...")
            await validate_infrastructure()
            await setup_docker_pool()
            await setup_result_storage()
            await prefect_mgr.initialize()
            await prefect_stats_monitor.start_monitoring()

            prefect_bootstrap_state.ready = True
            prefect_bootstrap_state.status = "ready"
            prefect_bootstrap_state.task_running = False
            logger.info("Prefect infrastructure ready")
            return

        except asyncio.CancelledError:
            prefect_bootstrap_state.status = "cancelled"
            prefect_bootstrap_state.task_running = False
            logger.info("Prefect bootstrap task cancelled")
            raise

        except Exception as exc:  # pragma: no cover - defensive logging on infra startup
            logger.exception("Prefect bootstrap failed")
            prefect_bootstrap_state.ready = False
            prefect_bootstrap_state.status = "error"
            prefect_bootstrap_state.last_error = str(exc)

            # Ensure partial initialization does not leave stale state behind
            prefect_mgr.workflows.clear()
            prefect_mgr.deployments.clear()
            await prefect_stats_monitor.stop_monitoring()

            wait_time = min(
                STARTUP_RETRY_SECONDS * (2 ** (attempt - 1)),
                STARTUP_RETRY_MAX_SECONDS,
            )
            logger.info("Retrying Prefect bootstrap in %s second(s)", wait_time)

            try:
                await asyncio.sleep(wait_time)
            except asyncio.CancelledError:
                prefect_bootstrap_state.status = "cancelled"
                prefect_bootstrap_state.task_running = False
                raise

def _lookup_workflow(workflow_name: str):
    info = prefect_mgr.workflows.get(workflow_name)
    if not info:
        return None
    metadata = info.metadata
    defaults = metadata.get("default_parameters", {})
    default_target_path = metadata.get("default_target_path") or defaults.get("target_path")
    supported_modes = metadata.get("supported_volume_modes") or ["ro", "rw"]
    if not isinstance(supported_modes, list) or not supported_modes:
        supported_modes = ["ro", "rw"]
    default_volume_mode = (
        metadata.get("default_volume_mode")
        or defaults.get("volume_mode")
        or supported_modes[0]
    )
    return {
        "name": workflow_name,
        "version": metadata.get("version", "0.6.0"),
        "description": metadata.get("description", ""),
        "author": metadata.get("author"),
        "tags": metadata.get("tags", []),
        "parameters": metadata.get("parameters", {}),
        "default_parameters": metadata.get("default_parameters", {}),
        "required_modules": metadata.get("required_modules", []),
        "supported_volume_modes": supported_modes,
        "default_target_path": default_target_path,
        "default_volume_mode": default_volume_mode,
        "has_custom_docker": bool(info.has_docker),
    }

@mcp.tool
async def list_workflows_mcp() -> Dict[str, Any]:
    """List all discovered workflows and their metadata summary."""
    not_ready = _prefect_not_ready_status()
    if not_ready:
        return {
            "workflows": [],
            "prefect": not_ready,
            "message": "Prefect infrastructure is still initializing",
        }

    workflows_summary = []
    for name, info in prefect_mgr.workflows.items():
        metadata = info.metadata
        defaults = metadata.get("default_parameters", {})
        workflows_summary.append({
            "name": name,
            "version": metadata.get("version", "0.6.0"),
            "description": metadata.get("description", ""),
            "author": metadata.get("author"),
            "tags": metadata.get("tags", []),
            "supported_volume_modes": metadata.get("supported_volume_modes", ["ro", "rw"]),
            "default_volume_mode": metadata.get("default_volume_mode")
            or defaults.get("volume_mode")
            or "ro",
            "default_target_path": metadata.get("default_target_path")
            or defaults.get("target_path"),
            "has_custom_docker": bool(info.has_docker),
        })
    return {"workflows": workflows_summary, "prefect": get_prefect_status()}

@mcp.tool
async def get_workflow_metadata_mcp(workflow_name: str) -> Dict[str, Any]:
    """Fetch detailed metadata for a workflow."""
    not_ready = _prefect_not_ready_status()
    if not_ready:
        return {
            "error": "Prefect infrastructure not ready",
            "prefect": not_ready,
        }

    data = _lookup_workflow(workflow_name)
    if not data:
        return {"error": f"Workflow not found: {workflow_name}"}
    return data


@mcp.tool
async def get_workflow_parameters_mcp(workflow_name: str) -> Dict[str, Any]:
    """Return the parameter schema and defaults for a workflow."""
    not_ready = _prefect_not_ready_status()
    if not_ready:
        return {
            "error": "Prefect infrastructure not ready",
            "prefect": not_ready,
        }

    data = _lookup_workflow(workflow_name)
    if not data:
        return {"error": f"Workflow not found: {workflow_name}"}
    return {
        "parameters": data.get("parameters", {}),
        "defaults": data.get("default_parameters", {}),
    }


@mcp.tool
async def get_workflow_metadata_schema_mcp() -> Dict[str, Any]:
    """Return the JSON schema describing workflow metadata files."""
    return WorkflowDiscovery.get_metadata_schema()

@mcp.tool
async def submit_security_scan_mcp(
    workflow_name: str,
    target_path: str | None = None,
    volume_mode: str | None = None,
    parameters: Dict[str, Any] | None = None,
) -> Dict[str, Any] | Dict[str, str]:
    """Submit a Prefect workflow via MCP."""
    try:
        not_ready = _prefect_not_ready_status()
        if not_ready:
            return {
                "error": "Prefect infrastructure not ready",
                "prefect": not_ready,
            }

        workflow_info = prefect_mgr.workflows.get(workflow_name)
        if not workflow_info:
            return {"error": f"Workflow '{workflow_name}' not found"}

        metadata = workflow_info.metadata or {}
        defaults = metadata.get("default_parameters", {})

        resolved_target_path = target_path or metadata.get("default_target_path") or defaults.get("target_path")
        if not resolved_target_path:
            return {
                "error": (
                    "target_path is required and no default_target_path is defined in metadata"
                ),
                "metadata": {
                    "workflow": workflow_name,
                    "default_target_path": metadata.get("default_target_path"),
                },
            }

        requested_volume_mode = volume_mode or metadata.get("default_volume_mode") or defaults.get("volume_mode")
        if not requested_volume_mode:
            requested_volume_mode = "ro"

        normalised_volume_mode = (
            str(requested_volume_mode).strip().lower().replace("-", "_")
        )
        if normalised_volume_mode in {"read_only", "readonly", "ro"}:
            normalised_volume_mode = "ro"
        elif normalised_volume_mode in {"read_write", "readwrite", "rw"}:
            normalised_volume_mode = "rw"
        else:
            supported_modes = metadata.get("supported_volume_modes", ["ro", "rw"])
            if isinstance(supported_modes, list) and normalised_volume_mode in supported_modes:
                pass
            else:
                normalised_volume_mode = "ro"

        parameters = parameters or {}

        cleaned_parameters: Dict[str, Any] = {**defaults, **parameters}

        # Ensure *_config structures default to dicts so Prefect validation passes.
        for key, value in list(cleaned_parameters.items()):
            if isinstance(key, str) and key.endswith("_config") and value is None:
                cleaned_parameters[key] = {}

        # Some workflows expect configuration dictionaries even when omitted.
        parameter_definitions = (
            metadata.get("parameters", {}).get("properties", {})
            if isinstance(metadata.get("parameters"), dict)
            else {}
        )
        for key, definition in parameter_definitions.items():
            if not isinstance(key, str) or not key.endswith("_config"):
                continue
            if key not in cleaned_parameters:
                default_value = definition.get("default") if isinstance(definition, dict) else None
                cleaned_parameters[key] = default_value if default_value is not None else {}
            elif cleaned_parameters[key] is None:
                cleaned_parameters[key] = {}

        flow_run = await prefect_mgr.submit_workflow(
            workflow_name=workflow_name,
            target_path=resolved_target_path,
            volume_mode=normalised_volume_mode,
            parameters=cleaned_parameters,
        )

        return {
            "run_id": str(flow_run.id),
            "status": flow_run.state.name if flow_run.state else "PENDING",
            "workflow": workflow_name,
            "message": f"Workflow '{workflow_name}' submitted successfully",
            "target_path": resolved_target_path,
            "volume_mode": normalised_volume_mode,
            "parameters": cleaned_parameters,
            "mcp_enabled": True,
        }
    except Exception as exc:  # pragma: no cover - defensive logging
        logger.exception("MCP submit failed")
        return {"error": f"Failed to submit workflow: {exc}"}

@mcp.tool
async def get_comprehensive_scan_summary(run_id: str) -> Dict[str, Any] | Dict[str, str]:
    """Return a summary for the given flow run via MCP."""
    try:
        not_ready = _prefect_not_ready_status()
        if not_ready:
            return {
                "error": "Prefect infrastructure not ready",
                "prefect": not_ready,
            }

        status = await prefect_mgr.get_flow_run_status(run_id)
        findings = await prefect_mgr.get_flow_run_findings(run_id)

        workflow_name = "unknown"
        deployment_id = status.get("workflow", "")
        for name, deployment in prefect_mgr.deployments.items():
            if str(deployment) == str(deployment_id):
                workflow_name = name
                break

        total_findings = 0
        severity_summary = {"critical": 0, "high": 0, "medium": 0, "low": 0, "info": 0}

        if findings and "sarif" in findings:
            sarif = findings["sarif"]
            if isinstance(sarif, dict):
                total_findings = sarif.get("total_findings", 0)

        return {
            "run_id": run_id,
            "workflow": workflow_name,
            "status": status.get("status", "unknown"),
            "is_completed": status.get("is_completed", False),
            "total_findings": total_findings,
            "severity_summary": severity_summary,
            "scan_duration": status.get("updated_at", "")
            if status.get("is_completed")
            else "In progress",
            "recommendations": (
                [
                    "Review high and critical severity findings first",
                    "Implement security fixes based on finding recommendations",
                    "Re-run scan after applying fixes to verify remediation",
                ]
                if total_findings > 0
                else ["No security issues found"]
            ),
            "mcp_analysis": True,
        }
    except Exception as exc:  # pragma: no cover
        logger.exception("MCP summary failed")
        return {"error": f"Failed to summarize run: {exc}"}

@mcp.tool
async def get_run_status_mcp(run_id: str) -> Dict[str, Any]:
    """Return current status information for a Prefect run."""
    try:
        not_ready = _prefect_not_ready_status()
        if not_ready:
            return {
                "error": "Prefect infrastructure not ready",
                "prefect": not_ready,
            }

        status = await prefect_mgr.get_flow_run_status(run_id)
        workflow_name = "unknown"
        deployment_id = status.get("workflow", "")
        for name, deployment in prefect_mgr.deployments.items():
            if str(deployment) == str(deployment_id):
                workflow_name = name
                break

        return {
            "run_id": status["run_id"],
            "workflow": workflow_name,
            "status": status["status"],
            "is_completed": status["is_completed"],
            "is_failed": status["is_failed"],
            "is_running": status["is_running"],
            "created_at": status["created_at"],
            "updated_at": status["updated_at"],
        }
    except Exception as exc:
        logger.exception("MCP run status failed")
        return {"error": f"Failed to get run status: {exc}"}

@mcp.tool
async def get_run_findings_mcp(run_id: str) -> Dict[str, Any]:
    """Return SARIF findings for a completed run."""
    try:
        not_ready = _prefect_not_ready_status()
        if not_ready:
            return {
                "error": "Prefect infrastructure not ready",
                "prefect": not_ready,
            }

        status = await prefect_mgr.get_flow_run_status(run_id)
        if not status.get("is_completed"):
            return {"error": f"Run {run_id} not completed. Status: {status.get('status')}"}

        findings = await prefect_mgr.get_flow_run_findings(run_id)

        workflow_name = "unknown"
        deployment_id = status.get("workflow", "")
        for name, deployment in prefect_mgr.deployments.items():
            if str(deployment) == str(deployment_id):
                workflow_name = name
                break

        metadata = {
            "completion_time": status.get("updated_at"),
            "workflow_version": "unknown",
        }
        info = prefect_mgr.workflows.get(workflow_name)
        if info:
            metadata["workflow_version"] = info.metadata.get("version", "unknown")

        return {
            "workflow": workflow_name,
            "run_id": run_id,
            "sarif": findings,
            "metadata": metadata,
        }
    except Exception as exc:
        logger.exception("MCP findings failed")
        return {"error": f"Failed to retrieve findings: {exc}"}

@mcp.tool
async def list_recent_runs_mcp(
    limit: int = 10,
    workflow_name: str | None = None,
    states: List[str] | None = None,
) -> Dict[str, Any]:
    """List recent Prefect runs with optional workflow/state filters."""

    not_ready = _prefect_not_ready_status()
    if not_ready:
        return {
            "runs": [],
            "prefect": not_ready,
            "message": "Prefect infrastructure is still initializing",
        }

    try:
        limit_value = int(limit)
    except (TypeError, ValueError):
        limit_value = 10
    limit_value = max(1, min(limit_value, 100))

    deployment_map = {
        str(deployment_id): workflow
        for workflow, deployment_id in prefect_mgr.deployments.items()
    }

    deployment_filter_value = None
    if workflow_name:
        deployment_id = prefect_mgr.deployments.get(workflow_name)
        if not deployment_id:
            return {
                "runs": [],
                "prefect": get_prefect_status(),
                "error": f"Workflow '{workflow_name}' has no registered deployment",
            }
        try:
            deployment_filter_value = UUID(str(deployment_id))
        except ValueError:
            return {
                "runs": [],
                "prefect": get_prefect_status(),
                "error": (
                    f"Deployment id '{deployment_id}' for workflow '{workflow_name}' is invalid"
                ),
            }

    desired_state_types: List[StateType] = []
    if states:
        for raw_state in states:
            if not raw_state:
                continue
            normalised = raw_state.strip().upper()
            if normalised == "ALL":
                desired_state_types = []
                break
            try:
                desired_state_types.append(StateType[normalised])
            except KeyError:
                continue
    if not desired_state_types:
        desired_state_types = [
            StateType.RUNNING,
            StateType.COMPLETED,
            StateType.FAILED,
            StateType.CANCELLED,
        ]

    flow_filter = FlowRunFilter()
    if desired_state_types:
        flow_filter.state = FlowRunFilterState(
            type=FlowRunFilterStateType(any_=desired_state_types)
        )
    if deployment_filter_value:
        flow_filter.deployment_id = FlowRunFilterDeploymentId(
            any_=[deployment_filter_value]
        )

    async with get_client() as client:
        flow_runs = await client.read_flow_runs(
            limit=limit_value,
            flow_run_filter=flow_filter,
            sort=FlowRunSort.START_TIME_DESC,
        )

    results: List[Dict[str, Any]] = []
    for flow_run in flow_runs:
        deployment_id = getattr(flow_run, "deployment_id", None)
        workflow = deployment_map.get(str(deployment_id), "unknown")
        state = getattr(flow_run, "state", None)
        state_name = getattr(state, "name", None) if state else None
        state_type = getattr(state, "type", None) if state else None

        results.append(
            {
                "run_id": str(flow_run.id),
                "workflow": workflow,
                "deployment_id": str(deployment_id) if deployment_id else None,
                "state": state_name or (state_type.name if state_type else None),
                "state_type": state_type.name if state_type else None,
                "is_completed": bool(getattr(state, "is_completed", lambda: False)()),
                "is_running": bool(getattr(state, "is_running", lambda: False)()),
                "is_failed": bool(getattr(state, "is_failed", lambda: False)()),
                "created_at": getattr(flow_run, "created", None),
                "updated_at": getattr(flow_run, "updated", None),
                "expected_start_time": getattr(flow_run, "expected_start_time", None),
                "start_time": getattr(flow_run, "start_time", None),
            }
        )

    # Normalise datetimes to ISO 8601 strings for serialization
    for entry in results:
        for key in ("created_at", "updated_at", "expected_start_time", "start_time"):
            value = entry.get(key)
            if value is None:
                continue
            try:
                entry[key] = value.isoformat()
            except AttributeError:
                entry[key] = str(value)

    return {"runs": results, "prefect": get_prefect_status()}

@mcp.tool
async def get_fuzzing_stats_mcp(run_id: str) -> Dict[str, Any]:
    """Return fuzzing statistics for a run if available."""
    not_ready = _prefect_not_ready_status()
    if not_ready:
        return {
            "error": "Prefect infrastructure not ready",
            "prefect": not_ready,
        }

    stats = fuzzing.fuzzing_stats.get(run_id)
    if not stats:
        return {"error": f"Fuzzing run not found: {run_id}"}
    # Be resilient if a plain dict slipped into the cache
    if isinstance(stats, dict):
        return stats
    if hasattr(stats, "model_dump"):
        return stats.model_dump()
    if hasattr(stats, "dict"):
        return stats.dict()
    # Last resort
    return getattr(stats, "__dict__", {"run_id": run_id})


@mcp.tool
async def get_fuzzing_crash_reports_mcp(run_id: str) -> Dict[str, Any]:
    """Return crash reports collected for a fuzzing run."""
    not_ready = _prefect_not_ready_status()
    if not_ready:
        return {
            "error": "Prefect infrastructure not ready",
            "prefect": not_ready,
        }

    reports = fuzzing.crash_reports.get(run_id)
    if reports is None:
        return {"error": f"Fuzzing run not found: {run_id}"}
    return {"run_id": run_id, "crashes": [report.model_dump() for report in reports]}

@mcp.tool
|
||||
async def get_backend_status_mcp() -> Dict[str, Any]:
|
||||
"""Expose backend readiness, workflows, and registered MCP tools."""
|
||||
|
||||
status = get_prefect_status()
|
||||
response: Dict[str, Any] = {"prefect": status}
|
||||
|
||||
if status.get("ready"):
|
||||
response["workflows"] = list(prefect_mgr.workflows.keys())
|
||||
|
||||
try:
|
||||
tools = await mcp._tool_manager.list_tools()
|
||||
response["mcp_tools"] = sorted(tool.name for tool in tools)
|
||||
except Exception as exc: # pragma: no cover - defensive logging
|
||||
logger.debug("Failed to enumerate MCP tools: %s", exc)
|
||||
|
||||
return response
|
||||
|
||||
|
||||
def create_mcp_transport_app() -> Starlette:
|
||||
"""Build a Starlette app serving HTTP + SSE transports on one port."""
|
||||
|
||||
http_app = mcp.http_app(path="/", transport="streamable-http")
|
||||
sse_app = create_sse_app(
|
||||
server=mcp,
|
||||
message_path="/messages",
|
||||
sse_path="/",
|
||||
auth=mcp.auth,
|
||||
)
|
||||
|
||||
routes = [
|
||||
Mount("/mcp", app=http_app),
|
||||
Mount("/mcp/sse", app=sse_app),
|
||||
]
|
||||
|
||||
@asynccontextmanager
|
||||
async def lifespan(app: Starlette): # pragma: no cover - integration wiring
|
||||
async with AsyncExitStack() as stack:
|
||||
await stack.enter_async_context(
|
||||
http_app.router.lifespan_context(http_app)
|
||||
)
|
||||
await stack.enter_async_context(
|
||||
sse_app.router.lifespan_context(sse_app)
|
||||
)
|
||||
yield
|
||||
|
||||
combined_app = Starlette(routes=routes, lifespan=lifespan)
|
||||
combined_app.state.fastmcp_server = mcp
|
||||
combined_app.state.http_app = http_app
|
||||
combined_app.state.sse_app = sse_app
|
||||
return combined_app
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Combined lifespan: Prefect init + dedicated MCP transports
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
@asynccontextmanager
|
||||
async def combined_lifespan(app: FastAPI):
|
||||
global prefect_bootstrap_task, _fastapi_mcp_imported
|
||||
|
||||
logger.info("Starting FuzzForge backend...")
|
||||
|
||||
# Ensure FastAPI endpoints are exposed via MCP once
|
||||
if not _fastapi_mcp_imported:
|
||||
try:
|
||||
await mcp.import_server(FASTAPI_MCP_ADAPTER)
|
||||
_fastapi_mcp_imported = True
|
||||
logger.info("Mounted FastAPI endpoints as MCP tools")
|
||||
except Exception as exc:
|
||||
logger.exception("Failed to import FastAPI endpoints into MCP", exc_info=exc)
|
||||
|
||||
# Kick off Prefect bootstrap in the background if needed
|
||||
if prefect_bootstrap_task is None or prefect_bootstrap_task.done():
|
||||
prefect_bootstrap_task = asyncio.create_task(_bootstrap_prefect_with_retries())
|
||||
logger.info("Prefect bootstrap task started")
|
||||
else:
|
||||
logger.info("Prefect bootstrap task already running")
|
||||
|
||||
# Start MCP transports on shared port (HTTP + SSE)
|
||||
mcp_app = create_mcp_transport_app()
|
||||
mcp_config = uvicorn.Config(
|
||||
app=mcp_app,
|
||||
host="0.0.0.0",
|
||||
port=8010,
|
||||
log_level="info",
|
||||
lifespan="on",
|
||||
)
|
||||
mcp_server = uvicorn.Server(mcp_config)
|
||||
mcp_server.install_signal_handlers = lambda: None # type: ignore[assignment]
|
||||
mcp_task = asyncio.create_task(mcp_server.serve())
|
||||
|
||||
async def _wait_for_uvicorn_startup() -> None:
|
||||
started_attr = getattr(mcp_server, "started", None)
|
||||
if hasattr(started_attr, "wait"):
|
||||
await asyncio.wait_for(started_attr.wait(), timeout=10)
|
||||
return
|
||||
|
||||
# Fallback for uvicorn versions where "started" is a bool
|
||||
poll_interval = 0.1
|
||||
checks = int(10 / poll_interval)
|
||||
for _ in range(checks):
|
||||
if getattr(mcp_server, "started", False):
|
||||
return
|
||||
await asyncio.sleep(poll_interval)
|
||||
raise asyncio.TimeoutError
|
||||
|
||||
try:
|
||||
await _wait_for_uvicorn_startup()
|
||||
except asyncio.TimeoutError: # pragma: no cover - defensive logging
|
||||
if mcp_task.done():
|
||||
raise RuntimeError("MCP server failed to start") from mcp_task.exception()
|
||||
logger.warning("Timed out waiting for MCP server startup; continuing anyway")
|
||||
|
||||
logger.info("MCP HTTP available at http://0.0.0.0:8010/mcp")
|
||||
logger.info("MCP SSE available at http://0.0.0.0:8010/mcp/sse")
|
||||
|
||||
try:
|
||||
yield
|
||||
finally:
|
||||
logger.info("Shutting down MCP transports...")
|
||||
mcp_server.should_exit = True
|
||||
mcp_server.force_exit = True
|
||||
await asyncio.gather(mcp_task, return_exceptions=True)
|
||||
|
||||
if prefect_bootstrap_task and not prefect_bootstrap_task.done():
|
||||
prefect_bootstrap_task.cancel()
|
||||
with suppress(asyncio.CancelledError):
|
||||
await prefect_bootstrap_task
|
||||
prefect_bootstrap_state.task_running = False
|
||||
if not prefect_bootstrap_state.ready:
|
||||
prefect_bootstrap_state.status = "stopped"
|
||||
prefect_bootstrap_state.next_retry_seconds = None
|
||||
prefect_bootstrap_task = None
|
||||
|
||||
logger.info("Shutting down Prefect statistics monitor...")
|
||||
await prefect_stats_monitor.stop_monitoring()
|
||||
logger.info("Shutting down FuzzForge backend...")
|
||||
|
||||
|
||||
app.router.lifespan_context = combined_lifespan
|
||||
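The combined lifespan above enters both sub-apps' lifespans on a single `AsyncExitStack` so one outer context starts and stops them together, unwinding in reverse order. A minimal, self-contained sketch of that pattern (the `Dummy` class stands in for the mounted Starlette/FastMCP apps and is purely illustrative):

```python
import asyncio
from contextlib import AsyncExitStack, asynccontextmanager


class Dummy:
    """Stand-in for a mounted sub-application with its own lifespan."""

    def __init__(self, name, events):
        self.name, self.events = name, events

    @asynccontextmanager
    async def lifespan_context(self):
        self.events.append(f"{self.name}:start")
        try:
            yield
        finally:
            self.events.append(f"{self.name}:stop")


async def main():
    events = []
    http_app, sse_app = Dummy("http", events), Dummy("sse", events)
    # Enter both lifespans on one stack; exit unwinds them LIFO.
    async with AsyncExitStack() as stack:
        await stack.enter_async_context(http_app.lifespan_context())
        await stack.enter_async_context(sse_app.lifespan_context())
        events.append("serving")
    return events


events = asyncio.run(main())
print(events)  # ['http:start', 'sse:start', 'serving', 'sse:stop', 'http:stop']
```

Because the stack unwinds last-in-first-out, the SSE app shuts down before the HTTP app, mirroring reversed startup order.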
11 backend/src/models/__init__.py Normal file
@@ -0,0 +1,11 @@
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
182 backend/src/models/findings.py Normal file
@@ -0,0 +1,182 @@
"""
Models for workflow findings and submissions
"""

# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.

from pydantic import BaseModel, Field, field_validator
from typing import Dict, Any, Optional, Literal, List
from datetime import datetime
from pathlib import Path


class WorkflowFindings(BaseModel):
    """Findings from a workflow execution in SARIF format"""
    workflow: str = Field(..., description="Workflow name")
    run_id: str = Field(..., description="Unique run identifier")
    sarif: Dict[str, Any] = Field(..., description="SARIF formatted findings")
    metadata: Dict[str, Any] = Field(default_factory=dict, description="Additional metadata")


class ResourceLimits(BaseModel):
    """Resource limits for workflow execution"""
    cpu_limit: Optional[str] = Field(None, description="CPU limit (e.g., '2' for 2 cores, '500m' for 0.5 cores)")
    memory_limit: Optional[str] = Field(None, description="Memory limit (e.g., '1Gi', '512Mi')")
    cpu_request: Optional[str] = Field(None, description="CPU request (guaranteed)")
    memory_request: Optional[str] = Field(None, description="Memory request (guaranteed)")


class VolumeMount(BaseModel):
    """Volume mount specification"""
    host_path: str = Field(..., description="Host path to mount")
    container_path: str = Field(..., description="Container path for mount")
    mode: Literal["ro", "rw"] = Field(default="ro", description="Mount mode")

    @field_validator("host_path")
    @classmethod
    def validate_host_path(cls, v):
        """Validate that the host path is absolute (existence checked at runtime)"""
        path = Path(v)
        if not path.is_absolute():
            raise ValueError(f"Host path must be absolute: {v}")
        # Note: Path existence is validated at workflow runtime
        # We can't validate existence here as this runs inside Docker container
        return str(path)

    @field_validator("container_path")
    @classmethod
    def validate_container_path(cls, v):
        """Validate that the container path is absolute"""
        if not v.startswith('/'):
            raise ValueError(f"Container path must be absolute: {v}")
        return v


class WorkflowSubmission(BaseModel):
    """Submit a workflow with configurable settings"""
    target_path: str = Field(..., description="Absolute path to analyze")
    volume_mode: Literal["ro", "rw"] = Field(
        default="ro",
        description="Volume mount mode: read-only (ro) or read-write (rw)"
    )
    parameters: Dict[str, Any] = Field(
        default_factory=dict,
        description="Workflow-specific parameters"
    )
    timeout: Optional[int] = Field(
        default=None,  # Allow workflow-specific defaults
        description="Timeout in seconds (None for workflow default)",
        ge=1,
        le=604800  # Max 7 days to support fuzzing campaigns
    )
    resource_limits: Optional[ResourceLimits] = Field(
        None,
        description="Resource limits for workflow container"
    )
    additional_volumes: List[VolumeMount] = Field(
        default_factory=list,
        description="Additional volume mounts (e.g., for corpus, output directories)"
    )

    @field_validator("target_path")
    @classmethod
    def validate_path(cls, v):
        """Validate that the target path is absolute (existence checked at runtime)"""
        path = Path(v)
        if not path.is_absolute():
            raise ValueError(f"Path must be absolute: {v}")
        # Note: Path existence is validated at workflow runtime when volumes are mounted
        # We can't validate existence here as this runs inside Docker container
        return str(path)


class WorkflowStatus(BaseModel):
    """Status of a workflow run"""
    run_id: str = Field(..., description="Unique run identifier")
    workflow: str = Field(..., description="Workflow name")
    status: str = Field(..., description="Current status")
    is_completed: bool = Field(..., description="Whether the run is completed")
    is_failed: bool = Field(..., description="Whether the run failed")
    is_running: bool = Field(..., description="Whether the run is currently running")
    created_at: datetime = Field(..., description="Run creation time")
    updated_at: datetime = Field(..., description="Last update time")


class WorkflowMetadata(BaseModel):
    """Complete metadata for a workflow"""
    name: str = Field(..., description="Workflow name")
    version: str = Field(..., description="Semantic version")
    description: str = Field(..., description="Workflow description")
    author: Optional[str] = Field(None, description="Workflow author")
    tags: List[str] = Field(default_factory=list, description="Workflow tags")
    parameters: Dict[str, Any] = Field(..., description="Parameters schema")
    default_parameters: Dict[str, Any] = Field(
        default_factory=dict,
        description="Default parameter values"
    )
    required_modules: List[str] = Field(
        default_factory=list,
        description="Required module names"
    )
    supported_volume_modes: List[Literal["ro", "rw"]] = Field(
        default=["ro", "rw"],
        description="Supported volume mount modes"
    )
    has_custom_docker: bool = Field(
        default=False,
        description="Whether workflow has custom Dockerfile"
    )


class WorkflowListItem(BaseModel):
    """Summary information for a workflow in list views"""
    name: str = Field(..., description="Workflow name")
    version: str = Field(..., description="Semantic version")
    description: str = Field(..., description="Workflow description")
    author: Optional[str] = Field(None, description="Workflow author")
    tags: List[str] = Field(default_factory=list, description="Workflow tags")


class RunSubmissionResponse(BaseModel):
    """Response after submitting a workflow"""
    run_id: str = Field(..., description="Unique run identifier")
    status: str = Field(..., description="Initial status")
    workflow: str = Field(..., description="Workflow name")
    message: str = Field(default="Workflow submitted successfully")


class FuzzingStats(BaseModel):
    """Real-time fuzzing statistics"""
    run_id: str = Field(..., description="Unique run identifier")
    workflow: str = Field(..., description="Workflow name")
    executions: int = Field(default=0, description="Total executions")
    executions_per_sec: float = Field(default=0.0, description="Current execution rate")
    crashes: int = Field(default=0, description="Total crashes found")
    unique_crashes: int = Field(default=0, description="Unique crashes")
    coverage: Optional[float] = Field(None, description="Code coverage percentage")
    corpus_size: int = Field(default=0, description="Current corpus size")
    elapsed_time: int = Field(default=0, description="Elapsed time in seconds")
    last_crash_time: Optional[datetime] = Field(None, description="Time of last crash")


class CrashReport(BaseModel):
    """Individual crash report from fuzzing"""
    run_id: str = Field(..., description="Run identifier")
    crash_id: str = Field(..., description="Unique crash identifier")
    timestamp: datetime = Field(default_factory=datetime.utcnow)
    signal: Optional[str] = Field(None, description="Crash signal (SIGSEGV, etc.)")
    crash_type: Optional[str] = Field(None, description="Type of crash")
    stack_trace: Optional[str] = Field(None, description="Stack trace")
    input_file: Optional[str] = Field(None, description="Path to crashing input")
    reproducer: Optional[str] = Field(None, description="Minimized reproducer")
    severity: str = Field(default="medium", description="Crash severity")
    exploitability: Optional[str] = Field(None, description="Exploitability assessment")
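To illustrate how the path validators in `findings.py` behave, here is a usage sketch with condensed re-statements of `VolumeMount` and `WorkflowSubmission` (trimmed to the fields exercised; assumes Pydantic v2 is installed, and the paths are illustrative):

```python
from pathlib import Path
from typing import Any, Dict, List, Literal, Optional

from pydantic import BaseModel, Field, ValidationError, field_validator


class VolumeMount(BaseModel):
    host_path: str
    container_path: str
    mode: Literal["ro", "rw"] = "ro"

    @field_validator("host_path")
    @classmethod
    def validate_host_path(cls, v):
        # Only absoluteness is checked here; existence is a runtime concern.
        if not Path(v).is_absolute():
            raise ValueError(f"Host path must be absolute: {v}")
        return str(Path(v))


class WorkflowSubmission(BaseModel):
    target_path: str
    volume_mode: Literal["ro", "rw"] = "ro"
    parameters: Dict[str, Any] = Field(default_factory=dict)
    timeout: Optional[int] = Field(default=None, ge=1, le=604800)
    additional_volumes: List[VolumeMount] = Field(default_factory=list)

    @field_validator("target_path")
    @classmethod
    def validate_path(cls, v):
        if not Path(v).is_absolute():
            raise ValueError(f"Path must be absolute: {v}")
        return str(Path(v))


# An absolute target plus an extra read-write corpus mount validates cleanly.
submission = WorkflowSubmission(
    target_path="/workspace/project",
    volume_mode="rw",
    timeout=3600,
    additional_volumes=[VolumeMount(host_path="/tmp/corpus", container_path="/corpus", mode="rw")],
)
print(submission.additional_volumes[0].mode)  # rw

# Relative paths are rejected before any container is ever started.
try:
    WorkflowSubmission(target_path="relative/path")
    rejected = False
except ValidationError:
    rejected = True
print(rejected)  # True
```

Deferring existence checks to workflow runtime is deliberate: the models are validated inside the backend container, where host paths are not visible.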
394 backend/src/services/prefect_stats_monitor.py Normal file
@@ -0,0 +1,394 @@
"""
Generic Prefect Statistics Monitor Service

This service monitors ALL workflows for structured live data logging and
updates the appropriate statistics APIs. Works with any workflow that follows
the standard LIVE_STATS logging pattern.
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.


import asyncio
import json
import logging
from datetime import datetime, timedelta, timezone
from typing import Dict, Any, Optional
from prefect.client.orchestration import get_client
from prefect.client.schemas.objects import FlowRun, TaskRun
from src.models.findings import FuzzingStats
from src.api.fuzzing import fuzzing_stats, initialize_fuzzing_tracking, active_connections

logger = logging.getLogger(__name__)


class PrefectStatsMonitor:
    """Monitors Prefect flows and tasks for live statistics from any workflow"""

    def __init__(self):
        self.monitoring = False
        self.monitor_task = None
        self.monitored_runs = set()
        self.last_log_ts: Dict[str, datetime] = {}
        self._client = None
        self._client_refresh_time = None
        self._client_refresh_interval = 300  # Refresh connection every 5 minutes

    async def start_monitoring(self):
        """Start the Prefect statistics monitoring service"""
        if self.monitoring:
            logger.warning("Prefect stats monitor already running")
            return

        self.monitoring = True
        self.monitor_task = asyncio.create_task(self._monitor_flows())
        logger.info("Started Prefect statistics monitor")

    async def stop_monitoring(self):
        """Stop the monitoring service"""
        self.monitoring = False
        if self.monitor_task:
            self.monitor_task.cancel()
            try:
                await self.monitor_task
            except asyncio.CancelledError:
                pass
        logger.info("Stopped Prefect statistics monitor")

    async def _get_or_refresh_client(self):
        """Get or refresh Prefect client with connection pooling."""
        now = datetime.now(timezone.utc)

        if (self._client is None or
                self._client_refresh_time is None or
                (now - self._client_refresh_time).total_seconds() > self._client_refresh_interval):

            if self._client:
                try:
                    await self._client.aclose()
                except Exception:
                    pass

            self._client = get_client()
            self._client_refresh_time = now
            await self._client.__aenter__()

        return self._client

    async def _monitor_flows(self):
        """Main monitoring loop that watches Prefect flows"""
        try:
            while self.monitoring:
                try:
                    # Use connection pooling for better performance
                    client = await self._get_or_refresh_client()

                    # Get recent flow runs (limit to reduce load)
                    flow_runs = await client.read_flow_runs(
                        limit=50,
                        sort="START_TIME_DESC",
                    )

                    # Only consider runs from the last 15 minutes
                    recent_cutoff = datetime.now(timezone.utc) - timedelta(minutes=15)
                    for flow_run in flow_runs:
                        created = getattr(flow_run, "created", None)
                        if created is None:
                            continue
                        try:
                            # Ensure timezone-aware comparison
                            if created.tzinfo is None:
                                created = created.replace(tzinfo=timezone.utc)
                            if created >= recent_cutoff:
                                await self._monitor_flow_run(client, flow_run)
                        except Exception:
                            # If comparison fails, attempt monitoring anyway
                            await self._monitor_flow_run(client, flow_run)

                    await asyncio.sleep(5)  # Check every 5 seconds

                except Exception as e:
                    logger.error(f"Error in Prefect monitoring: {e}")
                    await asyncio.sleep(10)

        except asyncio.CancelledError:
            logger.info("Prefect monitoring cancelled")
        except Exception as e:
            logger.error(f"Fatal error in Prefect monitoring: {e}")
        finally:
            # Clean up client on exit
            if self._client:
                try:
                    await self._client.__aexit__(None, None, None)
                except Exception:
                    pass
                self._client = None

    async def _monitor_flow_run(self, client, flow_run: FlowRun):
        """Monitor a specific flow run for statistics"""
        run_id = str(flow_run.id)
        workflow_name = flow_run.name or "unknown"

        try:
            # Initialize tracking if not exists - only for workflows that might have live stats
            if run_id not in fuzzing_stats:
                initialize_fuzzing_tracking(run_id, workflow_name)
                self.monitored_runs.add(run_id)

            # Skip corrupted entries (should not happen after startup cleanup, but defensive)
            elif not isinstance(fuzzing_stats[run_id], FuzzingStats):
                logger.warning(f"Skipping corrupted stats entry for {run_id}, reinitializing")
                initialize_fuzzing_tracking(run_id, workflow_name)
                self.monitored_runs.add(run_id)

            # Get task runs for this flow
            task_runs = await client.read_task_runs(
                flow_run_filter={"id": {"any_": [flow_run.id]}},
                limit=25,
            )

            # Check all tasks for live statistics logging
            for task_run in task_runs:
                await self._extract_stats_from_task(client, run_id, task_run, workflow_name)

            # Also scan flow-level logs as a fallback
            await self._extract_stats_from_flow_logs(client, run_id, flow_run, workflow_name)

        except Exception as e:
            logger.warning(f"Error monitoring flow run {run_id}: {e}")

    async def _extract_stats_from_task(self, client, run_id: str, task_run: TaskRun, workflow_name: str):
        """Extract statistics from any task that logs live stats"""
        try:
            # Get task run logs
            logs = await client.read_logs(
                log_filter={
                    "task_run_id": {"any_": [task_run.id]}
                },
                limit=100,
                sort="TIMESTAMP_ASC"
            )

            # Parse logs for LIVE_STATS entries (generic pattern for any workflow)
            latest_stats = None
            for log in logs:
                # Prefer structured extra field if present
                extra_data = getattr(log, "extra", None) or getattr(log, "extra_fields", None) or None
                if isinstance(extra_data, dict):
                    stat_type = extra_data.get("stats_type")
                    if stat_type in ["fuzzing_live_update", "scan_progress", "analysis_update", "live_stats"]:
                        latest_stats = extra_data
                        continue

                # Fallback to parsing from message text
                if ("FUZZ_STATS" in log.message or "LIVE_STATS" in log.message):
                    stats = self._parse_stats_from_log(log.message)
                    if stats:
                        latest_stats = stats

            # Update statistics if we found any
            if latest_stats:
                # Calculate elapsed time from task start
                elapsed_time = 0
                if task_run.start_time:
                    # Ensure timezone-aware arithmetic
                    now = datetime.now(timezone.utc)
                    try:
                        elapsed_time = int((now - task_run.start_time).total_seconds())
                    except Exception:
                        # Fallback to naive UTC if types mismatch
                        elapsed_time = int((datetime.utcnow() - task_run.start_time.replace(tzinfo=None)).total_seconds())

                updated_stats = FuzzingStats(
                    run_id=run_id,
                    workflow=workflow_name,
                    executions=latest_stats.get("executions", 0),
                    executions_per_sec=latest_stats.get("executions_per_sec", 0.0),
                    crashes=latest_stats.get("crashes", 0),
                    unique_crashes=latest_stats.get("unique_crashes", 0),
                    corpus_size=latest_stats.get("corpus_size", 0),
                    elapsed_time=elapsed_time
                )

                # Update the global stats
                previous = fuzzing_stats.get(run_id)
                fuzzing_stats[run_id] = updated_stats

                # Broadcast to any active WebSocket clients for this run
                if active_connections.get(run_id):
                    # Handle both Pydantic objects and plain dicts
                    if isinstance(updated_stats, dict):
                        stats_data = updated_stats
                    elif hasattr(updated_stats, 'model_dump'):
                        stats_data = updated_stats.model_dump()
                    elif hasattr(updated_stats, 'dict'):
                        stats_data = updated_stats.dict()
                    else:
                        stats_data = updated_stats.__dict__

                    message = {
                        "type": "stats_update",
                        "data": stats_data,
                    }
                    disconnected = []
                    for ws in active_connections[run_id]:
                        try:
                            await ws.send_text(json.dumps(message))
                        except Exception:
                            disconnected.append(ws)
                    # Clean up disconnected sockets
                    for ws in disconnected:
                        try:
                            active_connections[run_id].remove(ws)
                        except ValueError:
                            pass

                logger.debug(f"Updated Prefect stats for {run_id}: {updated_stats.executions} execs")

        except Exception as e:
            logger.warning(f"Error extracting stats from task {task_run.id}: {e}")

    async def _extract_stats_from_flow_logs(self, client, run_id: str, flow_run: FlowRun, workflow_name: str):
        """Extract statistics by scanning flow-level logs for LIVE/FUZZ stats"""
        try:
            logs = await client.read_logs(
                log_filter={
                    "flow_run_id": {"any_": [flow_run.id]}
                },
                limit=200,
                sort="TIMESTAMP_ASC"
            )

            latest_stats = None
            last_seen = self.last_log_ts.get(run_id)
            max_ts = last_seen

            for log in logs:
                # Skip logs we've already processed
                ts = getattr(log, "timestamp", None)
                if last_seen and ts and ts <= last_seen:
                    continue
                if ts and (max_ts is None or ts > max_ts):
                    max_ts = ts

                # Prefer structured extra field if available
                extra_data = getattr(log, "extra", None) or getattr(log, "extra_fields", None) or None
                if isinstance(extra_data, dict):
                    stat_type = extra_data.get("stats_type")
                    if stat_type in ["fuzzing_live_update", "scan_progress", "analysis_update", "live_stats"]:
                        latest_stats = extra_data
                        continue

                # Fallback to message parse
                if ("FUZZ_STATS" in log.message or "LIVE_STATS" in log.message):
                    stats = self._parse_stats_from_log(log.message)
                    if stats:
                        latest_stats = stats

            if max_ts:
                self.last_log_ts[run_id] = max_ts

            if latest_stats:
                # Use flow_run timestamps for elapsed time if available
                elapsed_time = 0
                start_time = getattr(flow_run, "start_time", None)
                if start_time:
                    now = datetime.now(timezone.utc)
                    try:
                        if start_time.tzinfo is None:
                            start_time = start_time.replace(tzinfo=timezone.utc)
                        elapsed_time = int((now - start_time).total_seconds())
                    except Exception:
                        elapsed_time = int((datetime.utcnow() - start_time.replace(tzinfo=None)).total_seconds())

                updated_stats = FuzzingStats(
                    run_id=run_id,
                    workflow=workflow_name,
                    executions=latest_stats.get("executions", 0),
                    executions_per_sec=latest_stats.get("executions_per_sec", 0.0),
                    crashes=latest_stats.get("crashes", 0),
                    unique_crashes=latest_stats.get("unique_crashes", 0),
                    corpus_size=latest_stats.get("corpus_size", 0),
                    elapsed_time=elapsed_time
                )

                fuzzing_stats[run_id] = updated_stats

                # Broadcast if listeners exist
                if active_connections.get(run_id):
                    # Handle both Pydantic objects and plain dicts
                    if isinstance(updated_stats, dict):
                        stats_data = updated_stats
                    elif hasattr(updated_stats, 'model_dump'):
                        stats_data = updated_stats.model_dump()
                    elif hasattr(updated_stats, 'dict'):
                        stats_data = updated_stats.dict()
                    else:
                        stats_data = updated_stats.__dict__

                    message = {
                        "type": "stats_update",
                        "data": stats_data,
                    }
                    disconnected = []
                    for ws in active_connections[run_id]:
                        try:
                            await ws.send_text(json.dumps(message))
                        except Exception:
                            disconnected.append(ws)
                    for ws in disconnected:
                        try:
                            active_connections[run_id].remove(ws)
                        except ValueError:
                            pass

        except Exception as e:
            logger.warning(f"Error extracting stats from flow logs {run_id}: {e}")

    def _parse_stats_from_log(self, log_message: str) -> Optional[Dict[str, Any]]:
        """Parse statistics from a log message"""
        try:
            import re

            # Prefer explicit JSON after marker tokens
            m = re.search(r'(?:FUZZ_STATS|LIVE_STATS)\s+(\{.*\})', log_message)
            if m:
                try:
                    return json.loads(m.group(1))
                except Exception:
                    pass

            # Fallback: Extract the extra= dict and coerce to JSON
            stats_match = re.search(r'extra=({.*?})', log_message)
            if not stats_match:
                return None

            extra_str = stats_match.group(1)
            extra_str = extra_str.replace("'", '"')
            extra_str = extra_str.replace('None', 'null')
            extra_str = extra_str.replace('True', 'true')
            extra_str = extra_str.replace('False', 'false')

            stats_data = json.loads(extra_str)

            # Support multiple stat types for different workflows
            stat_type = stats_data.get("stats_type")
            if stat_type in ["fuzzing_live_update", "scan_progress", "analysis_update", "live_stats"]:
                return stats_data

        except Exception as e:
            logger.debug(f"Error parsing log stats: {e}")

        return None


# Global instance
prefect_stats_monitor = PrefectStatsMonitor()
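The two-stage parsing in `_parse_stats_from_log` (JSON after a `FUZZ_STATS`/`LIVE_STATS` marker first, then a repr-style `extra={...}` fallback) can be exercised standalone. A condensed re-statement of the method for illustration, with the same accepted `stats_type` values:

```python
import json
import re
from typing import Any, Dict, Optional

ACCEPTED_TYPES = {"fuzzing_live_update", "scan_progress", "analysis_update", "live_stats"}


def parse_stats_from_log(log_message: str) -> Optional[Dict[str, Any]]:
    """Mirror of PrefectStatsMonitor._parse_stats_from_log for demonstration."""
    # Stage 1: explicit JSON payload after the marker token.
    m = re.search(r'(?:FUZZ_STATS|LIVE_STATS)\s+(\{.*\})', log_message)
    if m:
        try:
            return json.loads(m.group(1))
        except json.JSONDecodeError:
            pass

    # Stage 2: coerce a Python-repr `extra={...}` dict into JSON.
    m = re.search(r"extra=({.*?})", log_message)
    if not m:
        return None
    extra = (m.group(1)
             .replace("'", '"')
             .replace("None", "null")
             .replace("True", "true")
             .replace("False", "false"))
    data = json.loads(extra)
    return data if data.get("stats_type") in ACCEPTED_TYPES else None


# A worker would emit a line like this; the JSON form is parsed directly.
line = 'LIVE_STATS {"stats_type": "fuzzing_live_update", "executions": 42, "crashes": 1}'
print(parse_stats_from_log(line)["executions"])  # 42
```

Emitting the marker followed by plain JSON is the more robust convention: the repr fallback's quote and `None`/`True`/`False` substitutions break on values that themselves contain quotes.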
19 backend/tests/conftest.py Normal file
@@ -0,0 +1,19 @@
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.

import sys
from pathlib import Path

# Ensure project root is on sys.path so `src` is importable
ROOT = Path(__file__).resolve().parents[1]
if str(ROOT) not in sys.path:
    sys.path.insert(0, str(ROOT))
82 backend/tests/test_prefect_stats_monitor.py Normal file
@@ -0,0 +1,82 @@
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.

import asyncio
from datetime import datetime, timezone, timedelta

from src.services.prefect_stats_monitor import PrefectStatsMonitor
from src.api import fuzzing


class FakeLog:
    def __init__(self, message: str):
        self.message = message


class FakeClient:
    def __init__(self, logs):
        self._logs = logs

    async def read_logs(self, log_filter=None, limit=100, sort="TIMESTAMP_ASC"):
        return self._logs


class FakeTaskRun:
    def __init__(self):
        self.id = "task-1"
        self.start_time = datetime.now(timezone.utc) - timedelta(seconds=5)


def test_parse_stats_from_log_fuzzing():
    mon = PrefectStatsMonitor()
    msg = (
        "INFO LIVE_STATS extra={'stats_type': 'fuzzing_live_update', "
        "'executions': 42, 'executions_per_sec': 3.14, 'crashes': 1, 'unique_crashes': 1, 'corpus_size': 9}"
    )
    stats = mon._parse_stats_from_log(msg)
    assert stats is not None
    assert stats["stats_type"] == "fuzzing_live_update"
    assert stats["executions"] == 42


def test_extract_stats_updates_and_broadcasts():
    mon = PrefectStatsMonitor()
    run_id = "run-123"
    workflow = "wf"
    fuzzing.initialize_fuzzing_tracking(run_id, workflow)

    # Prepare a fake websocket to capture messages
    sent = []

    class FakeWS:
        async def send_text(self, text: str):
            sent.append(text)

    fuzzing.active_connections[run_id] = [FakeWS()]

    # Craft a log line the parser understands
    msg = (
        "INFO LIVE_STATS extra={'stats_type': 'fuzzing_live_update', "
        "'executions': 10, 'executions_per_sec': 1.5, 'crashes': 0, 'unique_crashes': 0, 'corpus_size': 2}"
    )
    fake_client = FakeClient([FakeLog(msg)])
    task_run = FakeTaskRun()

    asyncio.run(mon._extract_stats_from_task(fake_client, run_id, task_run, workflow))

    # Verify stats updated
    stats = fuzzing.fuzzing_stats[run_id]
    assert stats.executions == 10
    assert stats.executions_per_sec == 1.5

    # Verify a message was sent to the WebSocket
    assert sent, "Expected a stats_update message to be sent"
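The parser under test extracts the `extra={...}` dict embedded in a LIVE_STATS log line. A minimal standalone sketch of that extraction (a hypothetical helper for illustration, not the actual `PrefectStatsMonitor._parse_stats_from_log` implementation):

```python
import ast
import re


def parse_stats_from_log(message: str):
    """Recover the stats dict that follows 'extra=' in a LIVE_STATS line (sketch)."""
    match = re.search(r"extra=(\{.*\})", message)
    if not match:
        return None
    try:
        # The dict uses Python literal syntax, so literal_eval is sufficient.
        return ast.literal_eval(match.group(1))
    except (ValueError, SyntaxError):
        return None


msg = (
    "INFO LIVE_STATS extra={'stats_type': 'fuzzing_live_update', "
    "'executions': 42, 'executions_per_sec': 3.14, 'crashes': 1, "
    "'unique_crashes': 1, 'corpus_size': 9}"
)
stats = parse_stats_from_log(msg)
```

This mirrors the shape the tests assert on: a `None` result for non-matching lines, and a plain dict with `stats_type`, `executions`, and the other counters otherwise.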
11  backend/toolbox/__init__.py  Normal file
@@ -0,0 +1,11 @@
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
11  backend/toolbox/modules/__init__.py  Normal file
@@ -0,0 +1,11 @@
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
14  backend/toolbox/modules/analyzer/__init__.py  Normal file
@@ -0,0 +1,14 @@
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.

from .security_analyzer import SecurityAnalyzer

__all__ = ["SecurityAnalyzer"]
368  backend/toolbox/modules/analyzer/security_analyzer.py  Normal file
@@ -0,0 +1,368 @@
"""
Security Analyzer Module - Analyzes code for security vulnerabilities
"""

# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.

import logging
import re
from pathlib import Path
from typing import Dict, Any, List, Optional

try:
    from toolbox.modules.base import BaseModule, ModuleMetadata, ModuleResult, ModuleFinding
except ImportError:
    try:
        from modules.base import BaseModule, ModuleMetadata, ModuleResult, ModuleFinding
    except ImportError:
        from src.toolbox.modules.base import BaseModule, ModuleMetadata, ModuleResult, ModuleFinding

logger = logging.getLogger(__name__)


class SecurityAnalyzer(BaseModule):
    """
    Analyzes source code for common security vulnerabilities.

    This module:
    - Detects hardcoded secrets and credentials
    - Identifies dangerous function calls
    - Finds SQL injection vulnerabilities
    - Detects insecure configurations
    """

    def get_metadata(self) -> ModuleMetadata:
        """Get module metadata"""
        return ModuleMetadata(
            name="security_analyzer",
            version="1.0.0",
            description="Analyzes code for security vulnerabilities",
            author="FuzzForge Team",
            category="analyzer",
            tags=["security", "vulnerabilities", "static-analysis"],
            input_schema={
                "file_extensions": {
                    "type": "array",
                    "items": {"type": "string"},
                    "description": "File extensions to analyze",
                    "default": [".py", ".js", ".java", ".php", ".rb", ".go"]
                },
                "check_secrets": {
                    "type": "boolean",
                    "description": "Check for hardcoded secrets",
                    "default": True
                },
                "check_sql": {
                    "type": "boolean",
                    "description": "Check for SQL injection risks",
                    "default": True
                },
                "check_dangerous_functions": {
                    "type": "boolean",
                    "description": "Check for dangerous function calls",
                    "default": True
                }
            },
            output_schema={
                "findings": {
                    "type": "array",
                    "description": "List of security findings"
                }
            },
            requires_workspace=True
        )

    def validate_config(self, config: Dict[str, Any]) -> bool:
        """Validate module configuration"""
        extensions = config.get("file_extensions", [])
        if not isinstance(extensions, list):
            raise ValueError("file_extensions must be a list")

        return True

    async def execute(self, config: Dict[str, Any], workspace: Path) -> ModuleResult:
        """
        Execute the security analysis module.

        Args:
            config: Module configuration
            workspace: Path to the workspace directory

        Returns:
            ModuleResult with security findings
        """
        self.start_timer()
        self.validate_workspace(workspace)
        self.validate_config(config)

        findings = []
        files_analyzed = 0

        # Get configuration
        file_extensions = config.get("file_extensions", [".py", ".js", ".java", ".php", ".rb", ".go"])
        check_secrets = config.get("check_secrets", True)
        check_sql = config.get("check_sql", True)
        check_dangerous = config.get("check_dangerous_functions", True)

        logger.info(f"Analyzing files with extensions: {file_extensions}")

        try:
            # Analyze each file
            for ext in file_extensions:
                for file_path in workspace.rglob(f"*{ext}"):
                    if not file_path.is_file():
                        continue

                    files_analyzed += 1
                    relative_path = file_path.relative_to(workspace)

                    try:
                        content = file_path.read_text(encoding='utf-8', errors='ignore')
                        lines = content.splitlines()

                        # Check for secrets
                        if check_secrets:
                            secret_findings = self._check_hardcoded_secrets(
                                content, lines, relative_path
                            )
                            findings.extend(secret_findings)

                        # Check for SQL injection
                        if check_sql and ext in [".py", ".php", ".java", ".js"]:
                            sql_findings = self._check_sql_injection(
                                content, lines, relative_path
                            )
                            findings.extend(sql_findings)

                        # Check for dangerous functions
                        if check_dangerous:
                            dangerous_findings = self._check_dangerous_functions(
                                content, lines, relative_path, ext
                            )
                            findings.extend(dangerous_findings)

                    except Exception as e:
                        logger.error(f"Error analyzing file {relative_path}: {e}")

            # Create summary
            summary = {
                "files_analyzed": files_analyzed,
                "total_findings": len(findings),
                "extensions_scanned": file_extensions
            }

            return self.create_result(
                findings=findings,
                status="success" if files_analyzed > 0 else "partial",
                summary=summary,
                metadata={
                    "workspace": str(workspace),
                    "config": config
                }
            )

        except Exception as e:
            logger.error(f"Security analyzer failed: {e}")
            return self.create_result(
                findings=findings,
                status="failed",
                error=str(e)
            )

    def _check_hardcoded_secrets(
        self, content: str, lines: List[str], file_path: Path
    ) -> List[ModuleFinding]:
        """
        Check for hardcoded secrets in code.

        Args:
            content: File content
            lines: File lines
            file_path: Relative file path

        Returns:
            List of findings
        """
        findings = []

        # Patterns for secrets
        secret_patterns = [
            (r'api[_-]?key\s*=\s*["\']([^"\']{20,})["\']', 'API Key'),
            (r'api[_-]?secret\s*=\s*["\']([^"\']{20,})["\']', 'API Secret'),
            (r'password\s*=\s*["\']([^"\']+)["\']', 'Hardcoded Password'),
            (r'token\s*=\s*["\']([^"\']{20,})["\']', 'Authentication Token'),
            (r'aws[_-]?access[_-]?key\s*=\s*["\']([^"\']+)["\']', 'AWS Access Key'),
            (r'aws[_-]?secret[_-]?key\s*=\s*["\']([^"\']+)["\']', 'AWS Secret Key'),
            (r'private[_-]?key\s*=\s*["\']([^"\']+)["\']', 'Private Key'),
            (r'["\']([A-Za-z0-9]{32,})["\']', 'Potential Secret Hash'),
            (r'Bearer\s+([A-Za-z0-9\-_]+\.[A-Za-z0-9\-_]+\.[A-Za-z0-9\-_]+)', 'JWT Token'),
        ]

        for pattern, secret_type in secret_patterns:
            for match in re.finditer(pattern, content, re.IGNORECASE):
                # Find line number
                line_num = content[:match.start()].count('\n') + 1
                line_content = lines[line_num - 1] if line_num <= len(lines) else ""

                # Skip common false positives
                if self._is_false_positive_secret(match.group(0)):
                    continue

                findings.append(self.create_finding(
                    title=f"Hardcoded {secret_type} detected",
                    description=f"Found potential hardcoded {secret_type} in {file_path}",
                    severity="high" if "key" in secret_type.lower() else "medium",
                    category="hardcoded_secret",
                    file_path=str(file_path),
                    line_start=line_num,
                    code_snippet=line_content.strip()[:100],
                    recommendation=f"Remove the hardcoded {secret_type} and use environment variables or a secure vault",
                    metadata={"secret_type": secret_type}
                ))

        return findings

    def _check_sql_injection(
        self, content: str, lines: List[str], file_path: Path
    ) -> List[ModuleFinding]:
        """
        Check for potential SQL injection vulnerabilities.

        Args:
            content: File content
            lines: File lines
            file_path: Relative file path

        Returns:
            List of findings
        """
        findings = []

        # SQL injection patterns
        sql_patterns = [
            (r'(SELECT|INSERT|UPDATE|DELETE).*\+\s*[\'"]?\s*\+?\s*\w+', 'String concatenation in SQL'),
            (r'(SELECT|INSERT|UPDATE|DELETE).*%\s*[\'"]?\s*%?\s*\w+', 'String formatting in SQL'),
            (r'f[\'"].*?(SELECT|INSERT|UPDATE|DELETE).*?\{.*?\}', 'F-string in SQL query'),
            (r'query\s*=.*?\+', 'Dynamic query building'),
            (r'execute\s*\(.*?\+.*?\)', 'Dynamic execute statement'),
        ]

        for pattern, vuln_type in sql_patterns:
            for match in re.finditer(pattern, content, re.IGNORECASE):
                line_num = content[:match.start()].count('\n') + 1
                line_content = lines[line_num - 1] if line_num <= len(lines) else ""

                findings.append(self.create_finding(
                    title=f"Potential SQL Injection: {vuln_type}",
                    description=f"Detected potential SQL injection vulnerability via {vuln_type}",
                    severity="high",
                    category="sql_injection",
                    file_path=str(file_path),
                    line_start=line_num,
                    code_snippet=line_content.strip()[:100],
                    recommendation="Use parameterized queries or prepared statements instead",
                    metadata={"vulnerability_type": vuln_type}
                ))

        return findings

    def _check_dangerous_functions(
        self, content: str, lines: List[str], file_path: Path, ext: str
    ) -> List[ModuleFinding]:
        """
        Check for dangerous function calls.

        Args:
            content: File content
            lines: File lines
            file_path: Relative file path
            ext: File extension

        Returns:
            List of findings
        """
        findings = []

        # Language-specific dangerous functions
        dangerous_functions = {
            ".py": [
                (r'eval\s*\(', 'eval()', 'Arbitrary code execution'),
                (r'exec\s*\(', 'exec()', 'Arbitrary code execution'),
                (r'os\.system\s*\(', 'os.system()', 'Command injection risk'),
                (r'subprocess\.call\s*\(.*shell=True', 'subprocess with shell=True', 'Command injection risk'),
                (r'pickle\.loads?\s*\(', 'pickle.load()', 'Deserialization vulnerability'),
            ],
            ".js": [
                (r'eval\s*\(', 'eval()', 'Arbitrary code execution'),
                (r'new\s+Function\s*\(', 'new Function()', 'Arbitrary code execution'),
                (r'innerHTML\s*=', 'innerHTML', 'XSS vulnerability'),
                (r'document\.write\s*\(', 'document.write()', 'XSS vulnerability'),
            ],
            ".php": [
                (r'eval\s*\(', 'eval()', 'Arbitrary code execution'),
                (r'exec\s*\(', 'exec()', 'Command execution'),
                (r'system\s*\(', 'system()', 'Command execution'),
                (r'shell_exec\s*\(', 'shell_exec()', 'Command execution'),
                (r'\$_GET\[', 'Direct $_GET usage', 'Input validation missing'),
                (r'\$_POST\[', 'Direct $_POST usage', 'Input validation missing'),
            ]
        }

        if ext in dangerous_functions:
            for pattern, func_name, risk_type in dangerous_functions[ext]:
                for match in re.finditer(pattern, content):
                    line_num = content[:match.start()].count('\n') + 1
                    line_content = lines[line_num - 1] if line_num <= len(lines) else ""

                    findings.append(self.create_finding(
                        title=f"Dangerous function: {func_name}",
                        description=f"Use of potentially dangerous function {func_name}: {risk_type}",
                        severity="medium",
                        category="dangerous_function",
                        file_path=str(file_path),
                        line_start=line_num,
                        code_snippet=line_content.strip()[:100],
                        recommendation=f"Consider safer alternatives to {func_name}",
                        metadata={
                            "function": func_name,
                            "risk": risk_type
                        }
                    ))

        return findings

    def _is_false_positive_secret(self, value: str) -> bool:
        """
        Check if a potential secret is likely a false positive.

        Args:
            value: Potential secret value

        Returns:
            True if likely a false positive
        """
        false_positive_patterns = [
            'example',
            'test',
            'demo',
            'sample',
            'dummy',
            'placeholder',
            'xxx',
            '123',
            'change',
            'your',
            'here'
        ]

        value_lower = value.lower()
        return any(pattern in value_lower for pattern in false_positive_patterns)
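To see one of the patterns above in isolation, the API-key regex and the line-number arithmetic from `_check_hardcoded_secrets` can be exercised standalone (the sample `content` string below is invented for illustration):

```python
import re

# The analyzer's API-key pattern, copied verbatim from secret_patterns.
pattern = r'api[_-]?key\s*=\s*["\']([^"\']{20,})["\']'

# Hypothetical file contents: a 24-character value on line 2.
content = 'debug = True\nAPI_KEY = "abcdefghijklmnopqrstuvwx"\n'

match = re.search(pattern, content, re.IGNORECASE)
secret = match.group(1)  # the captured value inside the quotes

# Line-number recovery, exactly as in the module: count newlines
# before the match start, then convert to a 1-based line number.
line_num = content[:match.start()].count('\n') + 1
```

The `{20,}` quantifier is why short values such as `"test"` never trigger this rule; short strings are additionally filtered by `_is_false_positive_secret`.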
272  backend/toolbox/modules/base.py  Normal file
@@ -0,0 +1,272 @@
"""
Base module interface for all FuzzForge modules
"""

# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.

from abc import ABC, abstractmethod
from pathlib import Path
from typing import Dict, Any, List, Optional
from pydantic import BaseModel, Field
from datetime import datetime
import logging

logger = logging.getLogger(__name__)


class ModuleMetadata(BaseModel):
    """Metadata describing a module's capabilities and requirements"""
    name: str = Field(..., description="Module name")
    version: str = Field(..., description="Module version")
    description: str = Field(..., description="Module description")
    author: Optional[str] = Field(None, description="Module author")
    category: str = Field(..., description="Module category (scanner, analyzer, reporter, etc.)")
    tags: List[str] = Field(default_factory=list, description="Module tags")
    input_schema: Dict[str, Any] = Field(default_factory=dict, description="Expected input schema")
    output_schema: Dict[str, Any] = Field(default_factory=dict, description="Output schema")
    requires_workspace: bool = Field(True, description="Whether module requires workspace access")


class ModuleFinding(BaseModel):
    """Individual finding from a module"""
    id: str = Field(..., description="Unique finding ID")
    title: str = Field(..., description="Finding title")
    description: str = Field(..., description="Detailed description")
    severity: str = Field(..., description="Severity level (info, low, medium, high, critical)")
    category: str = Field(..., description="Finding category")
    file_path: Optional[str] = Field(None, description="Affected file path relative to workspace")
    line_start: Optional[int] = Field(None, description="Starting line number")
    line_end: Optional[int] = Field(None, description="Ending line number")
    code_snippet: Optional[str] = Field(None, description="Relevant code snippet")
    recommendation: Optional[str] = Field(None, description="Remediation recommendation")
    metadata: Dict[str, Any] = Field(default_factory=dict, description="Additional metadata")


class ModuleResult(BaseModel):
    """Standard result format from module execution"""
    module: str = Field(..., description="Module name")
    version: str = Field(..., description="Module version")
    status: str = Field(default="success", description="Execution status (success, partial, failed)")
    execution_time: float = Field(..., description="Execution time in seconds")
    findings: List[ModuleFinding] = Field(default_factory=list, description="List of findings")
    summary: Dict[str, Any] = Field(default_factory=dict, description="Summary statistics")
    metadata: Dict[str, Any] = Field(default_factory=dict, description="Additional metadata")
    error: Optional[str] = Field(None, description="Error message if failed")
    sarif: Optional[Dict[str, Any]] = Field(None, description="SARIF report if generated by reporter module")


class BaseModule(ABC):
    """
    Base interface for all security testing modules.

    All modules must inherit from this class and implement the required methods.
    Modules are designed to be stateless and reusable across different workflows.
    """

    def __init__(self):
        """Initialize the module"""
        self._metadata = self.get_metadata()
        self._start_time = None
        logger.info(f"Initialized module: {self._metadata.name} v{self._metadata.version}")

    @abstractmethod
    def get_metadata(self) -> ModuleMetadata:
        """
        Get module metadata.

        Returns:
            ModuleMetadata object describing the module
        """
        pass

    @abstractmethod
    async def execute(self, config: Dict[str, Any], workspace: Path) -> ModuleResult:
        """
        Execute the module with the given configuration and workspace.

        Args:
            config: Module-specific configuration parameters
            workspace: Path to the mounted workspace directory

        Returns:
            ModuleResult containing findings and metadata
        """
        pass

    @abstractmethod
    def validate_config(self, config: Dict[str, Any]) -> bool:
        """
        Validate the provided configuration against module requirements.

        Args:
            config: Configuration to validate

        Returns:
            True if configuration is valid, False otherwise

        Raises:
            ValueError: If configuration is invalid, with details
        """
        pass

    def validate_workspace(self, workspace: Path) -> bool:
        """
        Validate that the workspace exists and is accessible.

        Args:
            workspace: Path to the workspace

        Returns:
            True if workspace is valid

        Raises:
            ValueError: If workspace is invalid
        """
        if not workspace.exists():
            raise ValueError(f"Workspace does not exist: {workspace}")

        if not workspace.is_dir():
            raise ValueError(f"Workspace is not a directory: {workspace}")

        return True

    def create_finding(
        self,
        title: str,
        description: str,
        severity: str,
        category: str,
        **kwargs
    ) -> ModuleFinding:
        """
        Helper method to create a standardized finding.

        Args:
            title: Finding title
            description: Detailed description
            severity: Severity level
            category: Finding category
            **kwargs: Additional finding fields

        Returns:
            ModuleFinding object
        """
        import uuid
        finding_id = str(uuid.uuid4())

        return ModuleFinding(
            id=finding_id,
            title=title,
            description=description,
            severity=severity,
            category=category,
            **kwargs
        )

    def start_timer(self):
        """Start the execution timer"""
        from time import time
        self._start_time = time()

    def get_execution_time(self) -> float:
        """Get the execution time in seconds"""
        from time import time
        if self._start_time is None:
            return 0.0
        return time() - self._start_time

    def create_result(
        self,
        findings: List[ModuleFinding],
        status: str = "success",
        summary: Dict[str, Any] = None,
        metadata: Dict[str, Any] = None,
        error: str = None
    ) -> ModuleResult:
        """
        Helper method to create a module result.

        Args:
            findings: List of findings
            status: Execution status
            summary: Summary statistics
            metadata: Additional metadata
            error: Error message if failed

        Returns:
            ModuleResult object
        """
        return ModuleResult(
            module=self._metadata.name,
            version=self._metadata.version,
            status=status,
            execution_time=self.get_execution_time(),
            findings=findings,
            summary=summary or self._generate_summary(findings),
            metadata=metadata or {},
            error=error
        )

    def _generate_summary(self, findings: List[ModuleFinding]) -> Dict[str, Any]:
        """
        Generate summary statistics from findings.

        Args:
            findings: List of findings

        Returns:
            Summary dictionary
        """
        severity_counts = {
            "info": 0,
            "low": 0,
            "medium": 0,
            "high": 0,
            "critical": 0
        }

        category_counts = {}

        for finding in findings:
            # Count by severity
            if finding.severity in severity_counts:
                severity_counts[finding.severity] += 1

            # Count by category
            if finding.category not in category_counts:
                category_counts[finding.category] = 0
            category_counts[finding.category] += 1

        return {
            "total_findings": len(findings),
            "severity_counts": severity_counts,
            "category_counts": category_counts,
            "highest_severity": self._get_highest_severity(findings)
        }

    def _get_highest_severity(self, findings: List[ModuleFinding]) -> str:
        """
        Get the highest severity from findings.

        Args:
            findings: List of findings

        Returns:
            Highest severity level
        """
        severity_order = ["critical", "high", "medium", "low", "info"]

        for severity in severity_order:
            if any(f.severity == severity for f in findings):
                return severity

        return "none"
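The severity bookkeeping in `_generate_summary` and `_get_highest_severity` can be sketched without the Pydantic models, using plain dicts in place of `ModuleFinding` (an assumption made here for brevity):

```python
# Plain-dict stand-ins for ModuleFinding objects (illustrative only).
findings = [
    {"severity": "low", "category": "style"},
    {"severity": "high", "category": "sql_injection"},
    {"severity": "high", "category": "hardcoded_secret"},
]

# Mirror of _generate_summary's counting loops.
severity_counts = {s: 0 for s in ("info", "low", "medium", "high", "critical")}
category_counts = {}
for finding in findings:
    if finding["severity"] in severity_counts:
        severity_counts[finding["severity"]] += 1
    category_counts[finding["category"]] = category_counts.get(finding["category"], 0) + 1

# Mirror of _get_highest_severity: first non-empty bucket in descending order.
highest_severity = next(
    (s for s in ("critical", "high", "medium", "low", "info") if severity_counts[s]),
    "none",
)
```

Because the fallback is the string `"none"`, an empty findings list yields a well-defined summary rather than an exception.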
14  backend/toolbox/modules/reporter/__init__.py  Normal file
@@ -0,0 +1,14 @@
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.

from .sarif_reporter import SARIFReporter

__all__ = ["SARIFReporter"]
401  backend/toolbox/modules/reporter/sarif_reporter.py  Normal file
@@ -0,0 +1,401 @@
"""
SARIF Reporter Module - Generates SARIF-formatted security reports
"""

# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.

import logging
from pathlib import Path
from typing import Dict, Any, List
from datetime import datetime
import json

try:
    from toolbox.modules.base import BaseModule, ModuleMetadata, ModuleResult, ModuleFinding
except ImportError:
    try:
        from modules.base import BaseModule, ModuleMetadata, ModuleResult, ModuleFinding
    except ImportError:
        from src.toolbox.modules.base import BaseModule, ModuleMetadata, ModuleResult, ModuleFinding

logger = logging.getLogger(__name__)


class SARIFReporter(BaseModule):
    """
    Generates SARIF (Static Analysis Results Interchange Format) reports.

    This module:
    - Converts findings to SARIF format
    - Aggregates results from multiple modules
    - Adds metadata and context
    - Provides actionable recommendations
    """

    def get_metadata(self) -> ModuleMetadata:
        """Get module metadata"""
        return ModuleMetadata(
            name="sarif_reporter",
            version="1.0.0",
            description="Generates SARIF-formatted security reports",
            author="FuzzForge Team",
            category="reporter",
            tags=["reporting", "sarif", "output"],
            input_schema={
                "findings": {
                    "type": "array",
                    "description": "List of findings to report",
                    "required": True
                },
                "tool_name": {
                    "type": "string",
                    "description": "Name of the tool",
                    "default": "FuzzForge Security Assessment"
                },
                "tool_version": {
                    "type": "string",
                    "description": "Tool version",
                    "default": "1.0.0"
                },
                "include_code_flows": {
                    "type": "boolean",
                    "description": "Include code flow information",
                    "default": False
                }
            },
            output_schema={
                "sarif": {
                    "type": "object",
                    "description": "SARIF 2.1.0 formatted report"
                }
            },
            requires_workspace=False  # Reporter doesn't need direct workspace access
        )

    def validate_config(self, config: Dict[str, Any]) -> bool:
        """Validate module configuration"""
        if "findings" not in config and "modules_results" not in config:
            raise ValueError("Either 'findings' or 'modules_results' must be provided")
        return True

    async def execute(self, config: Dict[str, Any], workspace: Path = None) -> ModuleResult:
        """
        Execute the SARIF reporter module.

        Args:
            config: Module configuration with findings
            workspace: Optional workspace path for context

        Returns:
            ModuleResult with SARIF report
        """
        self.start_timer()
        self.validate_config(config)

        # Get configuration
        tool_name = config.get("tool_name", "FuzzForge Security Assessment")
        tool_version = config.get("tool_version", "1.0.0")
        include_code_flows = config.get("include_code_flows", False)

        # Collect findings from either direct findings or module results
        all_findings = []

        if "findings" in config:
            # Direct findings provided
            all_findings = config["findings"]
            if isinstance(all_findings, list) and all(isinstance(f, dict) for f in all_findings):
                # Convert dict findings to ModuleFinding objects
                all_findings = [ModuleFinding(**f) if isinstance(f, dict) else f for f in all_findings]
        elif "modules_results" in config:
            # Aggregate from module results
            for module_result in config["modules_results"]:
                if isinstance(module_result, dict):
                    findings = module_result.get("findings", [])
                    all_findings.extend(findings)
                elif hasattr(module_result, "findings"):
                    all_findings.extend(module_result.findings)

        logger.info(f"Generating SARIF report for {len(all_findings)} findings")

        try:
            # Generate SARIF report
            sarif_report = self._generate_sarif(
                findings=all_findings,
                tool_name=tool_name,
                tool_version=tool_version,
                include_code_flows=include_code_flows,
                workspace_path=str(workspace) if workspace else None
            )

            # Create summary
            summary = self._generate_report_summary(all_findings)

            return ModuleResult(
                module=self.get_metadata().name,
                version=self.get_metadata().version,
                status="success",
                execution_time=self.get_execution_time(),
                findings=[],  # Reporter doesn't generate new findings
                summary=summary,
                metadata={
                    "tool_name": tool_name,
                    "tool_version": tool_version,
                    "report_format": "SARIF 2.1.0",
                    "total_findings": len(all_findings)
                },
                error=None,
                sarif=sarif_report  # Add SARIF as a custom field
            )

        except Exception as e:
            logger.error(f"SARIF reporter failed: {e}")
            return self.create_result(
                findings=[],
                status="failed",
                error=str(e)
            )

    def _generate_sarif(
        self,
        findings: List[ModuleFinding],
        tool_name: str,
        tool_version: str,
        include_code_flows: bool,
        workspace_path: str = None
    ) -> Dict[str, Any]:
        """
        Generate a SARIF 2.1.0 formatted report.

        Args:
            findings: List of findings to report
            tool_name: Name of the tool
            tool_version: Tool version
            include_code_flows: Whether to include code flow information
            workspace_path: Optional workspace path

        Returns:
            SARIF formatted dictionary
        """
        # Create rules from unique finding types
        rules = self._create_rules(findings)

        # Create results from findings
        results = self._create_results(findings, include_code_flows)

        # Build SARIF structure
        sarif = {
            "$schema": "https://raw.githubusercontent.com/oasis-tcs/sarif-spec/master/Schemata/sarif-schema-2.1.0.json",
            "version": "2.1.0",
            "runs": [
                {
                    "tool": {
                        "driver": {
                            "name": tool_name,
                            "version": tool_version,
                            "informationUri": "https://fuzzforge.io",
                            "rules": rules
                        }
                    },
                    "results": results,
                    "invocations": [
|
||||
{
|
||||
"executionSuccessful": True,
|
||||
"endTimeUtc": datetime.utcnow().isoformat() + "Z"
|
||||
}
|
||||
]
|
||||
}
|
||||
]
|
||||
}
|
||||
|
||||
# Add workspace information if available
|
||||
if workspace_path:
|
||||
sarif["runs"][0]["originalUriBaseIds"] = {
|
||||
"WORKSPACE": {
|
||||
"uri": f"file://{workspace_path}/",
|
||||
"description": "The workspace root directory"
|
||||
}
|
||||
}
|
||||
|
||||
return sarif
|
||||
|
||||
def _create_rules(self, findings: List[ModuleFinding]) -> List[Dict[str, Any]]:
|
||||
"""
|
||||
Create SARIF rules from findings.
|
||||
|
||||
Args:
|
||||
findings: List of findings
|
||||
|
||||
Returns:
|
||||
List of SARIF rule objects
|
||||
"""
|
||||
rules_dict = {}
|
||||
|
||||
for finding in findings:
|
||||
rule_id = f"{finding.category}_{finding.severity}"
|
||||
|
||||
if rule_id not in rules_dict:
|
||||
rules_dict[rule_id] = {
|
||||
"id": rule_id,
|
||||
"name": finding.category.replace("_", " ").title(),
|
||||
"shortDescription": {
|
||||
"text": f"{finding.category} vulnerability"
|
||||
},
|
||||
"fullDescription": {
|
||||
"text": f"Detection rule for {finding.category} vulnerabilities with {finding.severity} severity"
|
||||
},
|
||||
"defaultConfiguration": {
|
||||
"level": self._severity_to_sarif_level(finding.severity)
|
||||
},
|
||||
"properties": {
|
||||
"category": finding.category,
|
||||
"severity": finding.severity,
|
||||
"tags": ["security", finding.category, finding.severity]
|
||||
}
|
||||
}
|
||||
|
||||
return list(rules_dict.values())
|
||||
|
||||
def _create_results(
|
||||
self, findings: List[ModuleFinding], include_code_flows: bool
|
||||
) -> List[Dict[str, Any]]:
|
||||
"""
|
||||
Create SARIF results from findings.
|
||||
|
||||
Args:
|
||||
findings: List of findings
|
||||
include_code_flows: Whether to include code flows
|
||||
|
||||
Returns:
|
||||
List of SARIF result objects
|
||||
"""
|
||||
results = []
|
||||
|
||||
for finding in findings:
|
||||
result = {
|
||||
"ruleId": f"{finding.category}_{finding.severity}",
|
||||
"level": self._severity_to_sarif_level(finding.severity),
|
||||
"message": {
|
||||
"text": finding.description
|
||||
},
|
||||
"locations": []
|
||||
}
|
||||
|
||||
# Add location information if available
|
||||
if finding.file_path:
|
||||
location = {
|
||||
"physicalLocation": {
|
||||
"artifactLocation": {
|
||||
"uri": finding.file_path,
|
||||
"uriBaseId": "WORKSPACE"
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
# Add line information if available
|
||||
if finding.line_start:
|
||||
location["physicalLocation"]["region"] = {
|
||||
"startLine": finding.line_start
|
||||
}
|
||||
if finding.line_end:
|
||||
location["physicalLocation"]["region"]["endLine"] = finding.line_end
|
||||
|
||||
# Add code snippet if available
|
||||
if finding.code_snippet:
|
||||
location["physicalLocation"]["region"]["snippet"] = {
|
||||
"text": finding.code_snippet
|
||||
}
|
||||
|
||||
result["locations"].append(location)
|
||||
|
||||
# Add fix suggestions if available
|
||||
if finding.recommendation:
|
||||
result["fixes"] = [
|
||||
{
|
||||
"description": {
|
||||
"text": finding.recommendation
|
||||
}
|
||||
}
|
||||
]
|
||||
|
||||
# Add properties
|
||||
result["properties"] = {
|
||||
"findingId": finding.id,
|
||||
"title": finding.title,
|
||||
"metadata": finding.metadata
|
||||
}
|
||||
|
||||
results.append(result)
|
||||
|
||||
return results
|
||||
|
||||
def _severity_to_sarif_level(self, severity: str) -> str:
|
||||
"""
|
||||
Convert severity to SARIF level.
|
||||
|
||||
Args:
|
||||
severity: Finding severity
|
||||
|
||||
Returns:
|
||||
SARIF level string
|
||||
"""
|
||||
mapping = {
|
||||
"critical": "error",
|
||||
"high": "error",
|
||||
"medium": "warning",
|
||||
"low": "note",
|
||||
"info": "none"
|
||||
}
|
||||
return mapping.get(severity.lower(), "warning")
|
||||
|
||||
def _generate_report_summary(self, findings: List[ModuleFinding]) -> Dict[str, Any]:
|
||||
"""
|
||||
Generate summary statistics for the report.
|
||||
|
||||
Args:
|
||||
findings: List of findings
|
||||
|
||||
Returns:
|
||||
Summary dictionary
|
||||
"""
|
||||
severity_counts = {
|
||||
"critical": 0,
|
||||
"high": 0,
|
||||
"medium": 0,
|
||||
"low": 0,
|
||||
"info": 0
|
||||
}
|
||||
|
||||
category_counts = {}
|
||||
affected_files = set()
|
||||
|
||||
for finding in findings:
|
||||
# Count by severity
|
||||
if finding.severity in severity_counts:
|
||||
severity_counts[finding.severity] += 1
|
||||
|
||||
# Count by category
|
||||
if finding.category not in category_counts:
|
||||
category_counts[finding.category] = 0
|
||||
category_counts[finding.category] += 1
|
||||
|
||||
# Track affected files
|
||||
if finding.file_path:
|
||||
affected_files.add(finding.file_path)
|
||||
|
||||
return {
|
||||
"total_findings": len(findings),
|
||||
"severity_distribution": severity_counts,
|
||||
"category_distribution": category_counts,
|
||||
"affected_files": len(affected_files),
|
||||
"report_format": "SARIF 2.1.0",
|
||||
"generated_at": datetime.utcnow().isoformat()
|
||||
}
|
||||
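The severity mapping above is the piece of the reporter most worth sanity-checking in isolation. A minimal standalone sketch (the table is copied from `_severity_to_sarif_level`; nothing here imports the module itself):

```python
# Standalone copy of the severity -> SARIF level table used by the reporter.
# Unknown severities fall back to "warning", matching mapping.get(..., "warning").
SEVERITY_TO_SARIF = {
    "critical": "error",
    "high": "error",
    "medium": "warning",
    "low": "note",
    "info": "none",
}


def severity_to_sarif_level(severity: str) -> str:
    # Lowercase first, so "Critical" and "critical" map identically
    return SEVERITY_TO_SARIF.get(severity.lower(), "warning")


print(severity_to_sarif_level("Critical"))  # error
print(severity_to_sarif_level("bogus"))     # warning
```

Note that both `critical` and `high` collapse to SARIF's `error` level: SARIF 2.1.0 only defines `error`, `warning`, `note`, and `none`, so the finer-grained severity is preserved separately in the rule's `properties`.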
14
backend/toolbox/modules/scanner/__init__.py
Normal file
@@ -0,0 +1,14 @@
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.

from .file_scanner import FileScanner

__all__ = ["FileScanner"]
315
backend/toolbox/modules/scanner/file_scanner.py
Normal file
@@ -0,0 +1,315 @@
"""
File Scanner Module - Scans and enumerates files in the workspace
"""

# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.

import hashlib
import logging
import mimetypes
from pathlib import Path
from typing import Dict, Any, List, Optional

try:
    from toolbox.modules.base import BaseModule, ModuleMetadata, ModuleResult, ModuleFinding
except ImportError:
    try:
        from modules.base import BaseModule, ModuleMetadata, ModuleResult, ModuleFinding
    except ImportError:
        from src.toolbox.modules.base import BaseModule, ModuleMetadata, ModuleResult, ModuleFinding

logger = logging.getLogger(__name__)


class FileScanner(BaseModule):
    """
    Scans files in the mounted workspace and collects information.

    This module:
    - Enumerates files based on patterns
    - Detects file types
    - Calculates file hashes
    - Identifies potentially sensitive files
    """

    def get_metadata(self) -> ModuleMetadata:
        """Get module metadata"""
        return ModuleMetadata(
            name="file_scanner",
            version="1.0.0",
            description="Scans and enumerates files in the workspace",
            author="FuzzForge Team",
            category="scanner",
            tags=["files", "enumeration", "discovery"],
            input_schema={
                "patterns": {
                    "type": "array",
                    "items": {"type": "string"},
                    "description": "File patterns to scan (e.g., ['*.py', '*.js'])",
                    "default": ["*"]
                },
                "max_file_size": {
                    "type": "integer",
                    "description": "Maximum file size to scan in bytes",
                    "default": 10485760  # 10MB
                },
                "check_sensitive": {
                    "type": "boolean",
                    "description": "Check for sensitive file patterns",
                    "default": True
                },
                "calculate_hashes": {
                    "type": "boolean",
                    "description": "Calculate SHA256 hashes for files",
                    "default": False
                }
            },
            output_schema={
                "findings": {
                    "type": "array",
                    "description": "List of discovered files with metadata"
                }
            },
            requires_workspace=True
        )

    def validate_config(self, config: Dict[str, Any]) -> bool:
        """Validate module configuration"""
        patterns = config.get("patterns", ["*"])
        if not isinstance(patterns, list):
            raise ValueError("patterns must be a list")

        max_size = config.get("max_file_size", 10485760)
        if not isinstance(max_size, int) or max_size <= 0:
            raise ValueError("max_file_size must be a positive integer")

        return True

    async def execute(self, config: Dict[str, Any], workspace: Path) -> ModuleResult:
        """
        Execute the file scanning module.

        Args:
            config: Module configuration
            workspace: Path to the workspace directory

        Returns:
            ModuleResult with file findings
        """
        self.start_timer()
        self.validate_workspace(workspace)
        self.validate_config(config)

        findings = []
        file_count = 0
        total_size = 0
        file_types = {}

        # Get configuration
        patterns = config.get("patterns", ["*"])
        max_file_size = config.get("max_file_size", 10485760)
        check_sensitive = config.get("check_sensitive", True)
        calculate_hashes = config.get("calculate_hashes", False)

        logger.info(f"Scanning workspace with patterns: {patterns}")

        try:
            # Scan for each pattern
            for pattern in patterns:
                for file_path in workspace.rglob(pattern):
                    if not file_path.is_file():
                        continue

                    file_count += 1
                    relative_path = file_path.relative_to(workspace)

                    # Get file stats
                    try:
                        stats = file_path.stat()
                        file_size = stats.st_size
                        total_size += file_size

                        # Skip large files
                        if file_size > max_file_size:
                            logger.warning(f"Skipping large file: {relative_path} ({file_size} bytes)")
                            continue

                        # Detect file type
                        file_type = self._detect_file_type(file_path)
                        if file_type not in file_types:
                            file_types[file_type] = 0
                        file_types[file_type] += 1

                        # Check for sensitive files
                        if check_sensitive and self._is_sensitive_file(file_path):
                            findings.append(self.create_finding(
                                title=f"Potentially sensitive file: {relative_path.name}",
                                description=f"Found potentially sensitive file at {relative_path}",
                                severity="medium",
                                category="sensitive_file",
                                file_path=str(relative_path),
                                metadata={
                                    "file_size": file_size,
                                    "file_type": file_type
                                }
                            ))

                        # Calculate hash if requested
                        file_hash = None
                        if calculate_hashes and file_size < 1048576:  # Only hash files < 1MB
                            file_hash = self._calculate_hash(file_path)

                        # Create informational finding for each file
                        findings.append(self.create_finding(
                            title=f"File discovered: {relative_path.name}",
                            description=f"File: {relative_path}",
                            severity="info",
                            category="file_enumeration",
                            file_path=str(relative_path),
                            metadata={
                                "file_size": file_size,
                                "file_type": file_type,
                                "file_hash": file_hash
                            }
                        ))

                    except Exception as e:
                        logger.error(f"Error processing file {relative_path}: {e}")

            # Create summary
            summary = {
                "total_files": file_count,
                "total_size_bytes": total_size,
                "file_types": file_types,
                "patterns_scanned": patterns
            }

            return self.create_result(
                findings=findings,
                status="success",
                summary=summary,
                metadata={
                    "workspace": str(workspace),
                    "config": config
                }
            )

        except Exception as e:
            logger.error(f"File scanner failed: {e}")
            return self.create_result(
                findings=findings,
                status="failed",
                error=str(e)
            )

    def _detect_file_type(self, file_path: Path) -> str:
        """
        Detect the type of a file.

        Args:
            file_path: Path to the file

        Returns:
            File type string
        """
        # Try to determine from extension
        mime_type, _ = mimetypes.guess_type(str(file_path))
        if mime_type:
            return mime_type

        # Check by extension
        ext = file_path.suffix.lower()
        type_map = {
            '.py': 'text/x-python',
            '.js': 'application/javascript',
            '.java': 'text/x-java',
            '.cpp': 'text/x-c++',
            '.c': 'text/x-c',
            '.go': 'text/x-go',
            '.rs': 'text/x-rust',
            '.rb': 'text/x-ruby',
            '.php': 'text/x-php',
            '.yaml': 'text/yaml',
            '.yml': 'text/yaml',
            '.json': 'application/json',
            '.xml': 'text/xml',
            '.md': 'text/markdown',
            '.txt': 'text/plain',
            '.sh': 'text/x-shellscript',
            '.bat': 'text/x-batch',
            '.ps1': 'text/x-powershell'
        }

        return type_map.get(ext, 'application/octet-stream')

    def _is_sensitive_file(self, file_path: Path) -> bool:
        """
        Check if a file might contain sensitive information.

        Args:
            file_path: Path to the file

        Returns:
            True if potentially sensitive
        """
        sensitive_patterns = [
            '.env',
            '.env.local',
            '.env.production',
            'credentials',
            'password',
            'secret',
            'private_key',
            'id_rsa',
            'id_dsa',
            '.pem',
            '.key',
            '.pfx',
            '.p12',
            'wallet',
            '.ssh',
            'token',
            'api_key',
            'config.json',
            'settings.json',
            '.git-credentials',
            '.npmrc',
            '.pypirc',
            '.docker/config.json'
        ]

        file_name_lower = file_path.name.lower()
        for pattern in sensitive_patterns:
            if pattern in file_name_lower:
                return True

        return False

    def _calculate_hash(self, file_path: Path) -> Optional[str]:
        """
        Calculate SHA256 hash of a file.

        Args:
            file_path: Path to the file

        Returns:
            Hex string of the SHA256 hash, or None if hashing failed
        """
        try:
            sha256_hash = hashlib.sha256()
            with open(file_path, "rb") as f:
                for byte_block in iter(lambda: f.read(4096), b""):
                    sha256_hash.update(byte_block)
            return sha256_hash.hexdigest()
        except Exception as e:
            logger.error(f"Failed to calculate hash for {file_path}: {e}")
            return None
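The chunked-read hashing idiom in `_calculate_hash` is worth seeing on its own: it streams the file in fixed-size blocks so even large files never have to fit in memory. A self-contained sketch of the same pattern, checked against hashing the bytes directly:

```python
import hashlib
import os
import tempfile


def sha256_of_file(path: str, chunk_size: int = 4096) -> str:
    """Hash a file in fixed-size chunks, as _calculate_hash above does.

    iter(callable, sentinel) keeps calling f.read(chunk_size) until it
    returns b"" (end of file), so memory use stays bounded by chunk_size.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


# Sanity check: streaming hash equals hashing the same bytes in one shot
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello fuzzforge")
    name = tmp.name

assert sha256_of_file(name) == hashlib.sha256(b"hello fuzzforge").hexdigest()
os.unlink(name)
```

The module only applies this to files under 1 MB; the streaming idiom would handle larger files equally well, so that cutoff is a throughput trade-off rather than a correctness one.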
36
backend/toolbox/modules/secret_detection/__init__.py
Normal file
@@ -0,0 +1,36 @@
"""
Secret Detection Modules

This package contains modules for detecting secrets, credentials, and sensitive information
in codebases and repositories.

Available modules:
- TruffleHog: Comprehensive secret detection with verification
- Gitleaks: Git-specific secret scanning and leak detection
"""

# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.


from typing import List, Type

from ..base import BaseModule

# Module registry for automatic discovery
SECRET_DETECTION_MODULES: List[Type[BaseModule]] = []


def register_module(module_class: Type[BaseModule]):
    """Register a secret detection module"""
    SECRET_DETECTION_MODULES.append(module_class)
    return module_class


def get_available_modules() -> List[Type[BaseModule]]:
    """Get all available secret detection modules"""
    return SECRET_DETECTION_MODULES.copy()
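The `register_module` decorator fills the registry as a side effect of importing each module file, which is why the detector classes below carry `@register_module` and need no explicit registration call. A self-contained sketch of the same pattern (with a stand-in base class, since `BaseModule` lives in `toolbox.modules.base`):

```python
from typing import List, Type


class BaseModule:  # stand-in for toolbox.modules.base.BaseModule
    pass


# Registry populated at import time, mirroring SECRET_DETECTION_MODULES
SECRET_DETECTION_MODULES: List[Type[BaseModule]] = []


def register_module(module_class: Type[BaseModule]):
    """Append the class to the registry; return it unchanged so the
    decorated name still refers to the class itself."""
    SECRET_DETECTION_MODULES.append(module_class)
    return module_class


@register_module
class DummyDetector(BaseModule):
    pass


print([m.__name__ for m in SECRET_DETECTION_MODULES])  # ['DummyDetector']
```

`get_available_modules` returns a copy of the list, so callers can filter or reorder the modules without mutating the shared registry.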
351
backend/toolbox/modules/secret_detection/gitleaks.py
Normal file
@@ -0,0 +1,351 @@
"""
Gitleaks Secret Detection Module

This module uses Gitleaks to detect secrets and sensitive information in Git repositories
and file systems.
"""

# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.


import asyncio
import json
import logging
import os
import tempfile
from pathlib import Path
from typing import Dict, Any, List

from ..base import BaseModule, ModuleMetadata, ModuleFinding, ModuleResult
from . import register_module

logger = logging.getLogger(__name__)


@register_module
class GitleaksModule(BaseModule):
    """Gitleaks secret detection module"""

    def get_metadata(self) -> ModuleMetadata:
        """Get module metadata"""
        return ModuleMetadata(
            name="gitleaks",
            version="8.18.0",
            description="Git-specific secret scanning and leak detection using Gitleaks",
            author="FuzzForge Team",
            category="secret_detection",
            tags=["secrets", "git", "leak-detection", "credentials"],
            input_schema={
                "type": "object",
                "properties": {
                    "scan_mode": {
                        "type": "string",
                        "enum": ["detect", "protect"],
                        "default": "detect",
                        "description": "Scan mode: detect (entire repo history) or protect (staged changes)"
                    },
                    "config_file": {
                        "type": "string",
                        "description": "Path to custom Gitleaks configuration file"
                    },
                    "baseline_file": {
                        "type": "string",
                        "description": "Path to baseline file to ignore known findings"
                    },
                    "max_target_megabytes": {
                        "type": "integer",
                        "default": 100,
                        "description": "Maximum size of files to scan (in MB)"
                    },
                    "redact": {
                        "type": "boolean",
                        "default": True,
                        "description": "Redact secrets in output"
                    },
                    "no_git": {
                        "type": "boolean",
                        "default": False,
                        "description": "Scan files without Git context"
                    }
                }
            },
            output_schema={
                "type": "object",
                "properties": {
                    "findings": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "rule_id": {"type": "string"},
                                "category": {"type": "string"},
                                "file_path": {"type": "string"},
                                "line_number": {"type": "integer"},
                                "secret": {"type": "string"}
                            }
                        }
                    }
                }
            }
        )

    def validate_config(self, config: Dict[str, Any]) -> bool:
        """Validate configuration"""
        scan_mode = config.get("scan_mode", "detect")
        if scan_mode not in ["detect", "protect"]:
            raise ValueError("scan_mode must be 'detect' or 'protect'")

        max_size = config.get("max_target_megabytes", 100)
        if not isinstance(max_size, int) or max_size < 1 or max_size > 1000:
            raise ValueError("max_target_megabytes must be between 1 and 1000")

        return True

    async def execute(self, config: Dict[str, Any], workspace: Path) -> ModuleResult:
        """Execute Gitleaks secret detection"""
        self.start_timer()

        try:
            # Validate inputs
            self.validate_config(config)
            self.validate_workspace(workspace)

            logger.info(f"Running Gitleaks on {workspace}")

            # Build Gitleaks command
            scan_mode = config.get("scan_mode", "detect")
            cmd = ["gitleaks", scan_mode]

            # Add source path
            cmd.extend(["--source", str(workspace)])

            # Create temp file for JSON output
            output_file = tempfile.NamedTemporaryFile(mode='w+', suffix='.json', delete=False)
            output_path = output_file.name
            output_file.close()

            # Add report format and output file
            cmd.extend(["--report-format", "json"])
            cmd.extend(["--report-path", output_path])

            # Add redact option
            if config.get("redact", True):
                cmd.append("--redact")

            # Add max target size
            max_size = config.get("max_target_megabytes", 100)
            cmd.extend(["--max-target-megabytes", str(max_size)])

            # Add config file if specified
            if config.get("config_file"):
                config_path = Path(config["config_file"])
                if config_path.exists():
                    cmd.extend(["--config", str(config_path)])

            # Add baseline file if specified
            if config.get("baseline_file"):
                baseline_path = Path(config["baseline_file"])
                if baseline_path.exists():
                    cmd.extend(["--baseline-path", str(baseline_path)])

            # Add no-git flag if specified
            if config.get("no_git", False):
                cmd.append("--no-git")

            # Add verbose output
            cmd.append("--verbose")

            logger.debug(f"Running command: {' '.join(cmd)}")

            # Run Gitleaks
            process = await asyncio.create_subprocess_exec(
                *cmd,
                stdout=asyncio.subprocess.PIPE,
                stderr=asyncio.subprocess.PIPE,
                cwd=workspace
            )

            stdout, stderr = await process.communicate()

            # Parse results
            findings = []
            try:
                # Read the JSON output from file
                with open(output_path, 'r') as f:
                    output_content = f.read()

                if process.returncode == 0:
                    # No secrets found
                    logger.info("No secrets detected by Gitleaks")
                elif process.returncode == 1:
                    # Secrets found - parse from file content
                    findings = self._parse_gitleaks_output(output_content, workspace)
                else:
                    # Error occurred
                    error_msg = stderr.decode()
                    logger.error(f"Gitleaks failed: {error_msg}")
                    return self.create_result(
                        findings=[],
                        status="failed",
                        error=f"Gitleaks execution failed: {error_msg}"
                    )
            finally:
                # Clean up temp file
                try:
                    os.unlink(output_path)
                except OSError:
                    pass

            # Create summary
            summary = {
                "total_leaks": len(findings),
                "unique_rules": len(set(f.metadata.get("rule_id", "") for f in findings)),
                "files_with_leaks": len(set(f.file_path for f in findings if f.file_path)),
                "scan_mode": scan_mode
            }

            logger.info(f"Gitleaks found {len(findings)} potential leaks")

            return self.create_result(
                findings=findings,
                status="success",
                summary=summary
            )

        except Exception as e:
            logger.error(f"Gitleaks module failed: {e}")
            return self.create_result(
                findings=[],
                status="failed",
                error=str(e)
            )

    def _parse_gitleaks_output(self, output: str, workspace: Path) -> List[ModuleFinding]:
        """Parse Gitleaks JSON output into findings"""
        findings = []

        if not output.strip():
            return findings

        try:
            # Gitleaks outputs a JSON array
            results = json.loads(output)
            if not isinstance(results, list):
                logger.warning("Unexpected Gitleaks output format")
                return findings

            for result in results:
                # Extract information
                rule_id = result.get("RuleID", "unknown")
                description = result.get("Description", "")
                file_path = result.get("File", "")
                line_number = result.get("LineNumber", 0)
                secret = result.get("Secret", "")
                match_text = result.get("Match", "")

                # Commit info (if available)
                commit = result.get("Commit", "")
                author = result.get("Author", "")
                email = result.get("Email", "")
                date = result.get("Date", "")

                # Make file path relative to workspace
                if file_path:
                    try:
                        rel_path = Path(file_path).relative_to(workspace)
                        file_path = str(rel_path)
                    except ValueError:
                        # If file is outside workspace, keep absolute path
                        pass

                # Determine severity based on rule type
                severity = self._get_leak_severity(rule_id, description)

                # Create finding
                finding = self.create_finding(
                    title=f"Secret leak detected: {rule_id}",
                    description=self._get_leak_description(rule_id, description, commit),
                    severity=severity,
                    category="secret_leak",
                    file_path=file_path if file_path else None,
                    line_start=line_number if line_number > 0 else None,
                    code_snippet=match_text if match_text else secret,
                    recommendation=self._get_leak_recommendation(rule_id),
                    metadata={
                        "rule_id": rule_id,
                        "secret_type": description,
                        "commit": commit,
                        "author": author,
                        "email": email,
                        "date": date,
                        "entropy": result.get("Entropy", 0),
                        "fingerprint": result.get("Fingerprint", "")
                    }
                )

                findings.append(finding)

        except json.JSONDecodeError as e:
            logger.warning(f"Failed to parse Gitleaks output: {e}")
        except Exception as e:
            logger.warning(f"Error processing Gitleaks results: {e}")

        return findings

    def _get_leak_severity(self, rule_id: str, description: str) -> str:
        """Determine severity based on secret type"""
        critical_patterns = [
            "aws", "amazon", "gcp", "google", "azure", "microsoft",
            "private_key", "rsa", "ssh", "certificate", "database",
            "password", "auth", "token", "secret", "key"
        ]

        rule_lower = rule_id.lower()
        desc_lower = description.lower()

        # Check for critical patterns
        for pattern in critical_patterns:
            if pattern in rule_lower or pattern in desc_lower:
                if any(x in rule_lower for x in ["aws", "gcp", "azure"]):
                    return "critical"
                elif any(x in rule_lower for x in ["private", "key", "password"]):
                    return "high"
                else:
                    return "medium"

        return "low"

    def _get_leak_description(self, rule_id: str, description: str, commit: str) -> str:
        """Get description for the leak finding"""
        base_desc = f"Gitleaks detected a potential secret leak matching rule '{rule_id}'"
        if description:
            base_desc += f" ({description})"

        if commit:
            base_desc += f" in commit {commit[:8]}"

        base_desc += ". This may indicate sensitive information has been committed to version control."

        return base_desc

    def _get_leak_recommendation(self, rule_id: str) -> str:
        """Get remediation recommendation"""
        base_rec = "Remove the secret from the codebase and Git history. "

        if any(pattern in rule_id.lower() for pattern in ["aws", "gcp", "azure"]):
            base_rec += "Revoke the cloud credentials immediately and rotate them. "

        base_rec += "Consider using Git history rewriting tools (git-filter-branch, BFG) " \
                    "to remove sensitive data from commit history. Implement pre-commit hooks " \
                    "to prevent future secret commits."

        return base_rec
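`_parse_gitleaks_output` expects the JSON array that Gitleaks writes with `--report-format json`. The parsing and severity steps can be sketched against a hand-written record (the record values and the simplified heuristic below are illustrative, not real scanner output or the module's exact logic):

```python
import json


def leak_severity(rule_id: str) -> str:
    """Simplified version of the _get_leak_severity heuristic:
    cloud-provider rules are treated as critical, everything else low."""
    return "critical" if any(x in rule_id.lower() for x in ("aws", "gcp", "azure")) else "low"


# Hand-written record in the shape Gitleaks emits (keys as in the parser above)
sample = json.dumps([{
    "RuleID": "aws-access-key-id",
    "Description": "AWS Access Key",
    "File": "config/settings.py",
    "LineNumber": 12,
    "Secret": "REDACTED",
}])

records = json.loads(sample)
assert isinstance(records, list)  # the parser bails out on any other shape

for rec in records:
    print(rec.get("RuleID", "unknown"),
          leak_severity(rec.get("RuleID", "")),
          rec.get("File"),
          rec.get("LineNumber"))
# aws-access-key-id critical config/settings.py 12
```

Note the exit-code convention the module relies on: Gitleaks returns 0 when no leaks are found and 1 when leaks are found, so return code 1 is a successful scan with findings, not an error.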
294
backend/toolbox/modules/secret_detection/trufflehog.py
Normal file
@@ -0,0 +1,294 @@
"""
TruffleHog Secret Detection Module

This module uses TruffleHog to detect secrets, credentials, and sensitive information
with verification capabilities.
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.


import asyncio
import json
import tempfile
from pathlib import Path
from typing import Dict, Any, List
import subprocess
import logging

from ..base import BaseModule, ModuleMetadata, ModuleFinding, ModuleResult
from . import register_module

logger = logging.getLogger(__name__)


@register_module
class TruffleHogModule(BaseModule):
    """TruffleHog secret detection module"""

    def get_metadata(self) -> ModuleMetadata:
        """Get module metadata"""
        return ModuleMetadata(
            name="trufflehog",
            version="3.63.2",
            description="Comprehensive secret detection with verification using TruffleHog",
            author="FuzzForge Team",
            category="secret_detection",
            tags=["secrets", "credentials", "sensitive-data", "verification"],
            input_schema={
                "type": "object",
                "properties": {
                    "verify": {
                        "type": "boolean",
                        "default": False,
                        "description": "Verify discovered secrets"
                    },
                    "include_detectors": {
                        "type": "array",
                        "items": {"type": "string"},
                        "description": "Specific detectors to include"
                    },
                    "exclude_detectors": {
                        "type": "array",
                        "items": {"type": "string"},
                        "description": "Specific detectors to exclude"
                    },
                    "max_depth": {
                        "type": "integer",
                        "default": 10,
                        "description": "Maximum directory depth to scan"
                    },
                    "concurrency": {
                        "type": "integer",
                        "default": 10,
                        "description": "Number of concurrent workers"
                    }
                }
            },
            output_schema={
                "type": "object",
                "properties": {
                    "findings": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "detector": {"type": "string"},
                                "verified": {"type": "boolean"},
                                "file_path": {"type": "string"},
                                "line": {"type": "integer"},
                                "secret": {"type": "string"}
                            }
                        }
                    }
                }
            }
        )

    def validate_config(self, config: Dict[str, Any]) -> bool:
        """Validate configuration"""
        # Check concurrency bounds
        concurrency = config.get("concurrency", 10)
        if not isinstance(concurrency, int) or concurrency < 1 or concurrency > 50:
            raise ValueError("Concurrency must be between 1 and 50")

        # Check max_depth bounds
        max_depth = config.get("max_depth", 10)
        if not isinstance(max_depth, int) or max_depth < 1 or max_depth > 20:
            raise ValueError("Max depth must be between 1 and 20")

        return True

    async def execute(self, config: Dict[str, Any], workspace: Path) -> ModuleResult:
        """Execute TruffleHog secret detection"""
        self.start_timer()

        try:
            # Validate inputs
            self.validate_config(config)
            self.validate_workspace(workspace)

            logger.info(f"Running TruffleHog on {workspace}")

            # Build TruffleHog command
            cmd = ["trufflehog", "filesystem", str(workspace)]

            # Add verification flag
            if config.get("verify", False):
                cmd.append("--verify")

            # Add JSON output
            cmd.extend(["--json", "--no-update"])

            # Add concurrency
            cmd.extend(["--concurrency", str(config.get("concurrency", 10))])

            # Add max depth
            cmd.extend(["--max-depth", str(config.get("max_depth", 10))])

            # Add include/exclude detectors
            if config.get("include_detectors"):
                cmd.extend(["--include-detectors", ",".join(config["include_detectors"])])

            if config.get("exclude_detectors"):
                cmd.extend(["--exclude-detectors", ",".join(config["exclude_detectors"])])

            logger.debug(f"Running command: {' '.join(cmd)}")

            # Run TruffleHog
            process = await asyncio.create_subprocess_exec(
                *cmd,
                stdout=asyncio.subprocess.PIPE,
                stderr=asyncio.subprocess.PIPE,
                cwd=workspace
            )

            stdout, stderr = await process.communicate()

            # Parse results
            findings = []
            if process.returncode == 0 or process.returncode == 1:  # 1 indicates secrets found
                findings = self._parse_trufflehog_output(stdout.decode(), workspace)
            else:
                error_msg = stderr.decode()
                logger.error(f"TruffleHog failed: {error_msg}")
                return self.create_result(
                    findings=[],
                    status="failed",
                    error=f"TruffleHog execution failed: {error_msg}"
                )

            # Create summary
            summary = {
                "total_secrets": len(findings),
                "verified_secrets": len([f for f in findings if f.metadata.get("verified", False)]),
                "detectors_triggered": len(set(f.metadata.get("detector", "") for f in findings)),
                "files_with_secrets": len(set(f.file_path for f in findings if f.file_path))
            }

            logger.info(f"TruffleHog found {len(findings)} secrets")

            return self.create_result(
                findings=findings,
                status="success",
                summary=summary
            )

        except Exception as e:
            logger.error(f"TruffleHog module failed: {e}")
            return self.create_result(
                findings=[],
                status="failed",
                error=str(e)
            )

    def _parse_trufflehog_output(self, output: str, workspace: Path) -> List[ModuleFinding]:
        """Parse TruffleHog JSON output into findings"""
        findings = []

        for line in output.strip().split('\n'):
            if not line.strip():
                continue

            try:
                result = json.loads(line)

                # Extract information
                detector = result.get("DetectorName", "unknown")
                verified = result.get("Verified", False)
                raw_secret = result.get("Raw", "")

                # Source info
                source_metadata = result.get("SourceMetadata", {})
                source_data = source_metadata.get("Data", {})
                file_path = source_data.get("Filesystem", {}).get("file", "")
                line_num = source_data.get("Filesystem", {}).get("line", 0)

                # Make file path relative to workspace
                if file_path:
                    try:
                        rel_path = Path(file_path).relative_to(workspace)
                        file_path = str(rel_path)
                    except ValueError:
                        # If file is outside workspace, keep absolute path
                        pass

                # Determine severity based on verification and detector type
                severity = self._get_secret_severity(detector, verified, raw_secret)

                # Create finding
                finding = self.create_finding(
                    title=f"{detector} secret detected",
                    description=self._get_secret_description(detector, verified),
                    severity=severity,
                    category="secret_detection",
                    file_path=file_path if file_path else None,
                    line_start=line_num if line_num > 0 else None,
                    code_snippet=self._truncate_secret(raw_secret),
                    recommendation=self._get_secret_recommendation(detector, verified),
                    metadata={
                        "detector": detector,
                        "verified": verified,
                        "detector_type": result.get("DetectorType", ""),
                        "decoder_type": result.get("DecoderType", ""),
                        "structured_data": result.get("StructuredData", {})
                    }
                )

                findings.append(finding)

            except json.JSONDecodeError as e:
                logger.warning(f"Failed to parse TruffleHog output line: {e}")
                continue
            except Exception as e:
                logger.warning(f"Error processing TruffleHog result: {e}")
                continue

        return findings

    def _get_secret_severity(self, detector: str, verified: bool, secret: str) -> str:
        """Determine severity based on secret type and verification status"""
        if verified:
            # Verified secrets are always high risk
            critical_detectors = ["aws", "gcp", "azure", "github", "gitlab", "database"]
            if any(crit in detector.lower() for crit in critical_detectors):
                return "critical"
            return "high"

        # Unverified secrets
        high_risk_detectors = ["private_key", "certificate", "password", "token"]
        if any(high in detector.lower() for high in high_risk_detectors):
            return "medium"

        return "low"

    def _get_secret_description(self, detector: str, verified: bool) -> str:
        """Get description for the secret finding"""
        verification_status = "verified and active" if verified else "unverified"
        return f"A {detector} secret was detected and is {verification_status}. " \
               f"This may represent a security risk if the credential is valid."

    def _get_secret_recommendation(self, detector: str, verified: bool) -> str:
        """Get remediation recommendation"""
        if verified:
            return f"IMMEDIATE ACTION REQUIRED: This {detector} secret is verified and active. " \
                   f"Revoke the credential immediately, remove it from the codebase, and " \
                   f"implement proper secret management practices."
        else:
            return f"Review this {detector} secret to determine if it's valid. " \
                   f"If real, revoke the credential and remove it from the codebase. " \
                   f"Consider implementing secret scanning in CI/CD pipelines."

    def _truncate_secret(self, secret: str, max_length: int = 50) -> str:
        """Truncate secret for display purposes"""
        if len(secret) <= max_length:
            return secret
        return secret[:max_length] + "..."
11
backend/toolbox/workflows/__init__.py
Normal file
@@ -0,0 +1,11 @@
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
12
backend/toolbox/workflows/comprehensive/__init__.py
Normal file
@@ -0,0 +1,12 @@
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
@@ -0,0 +1,47 @@
# Secret Detection Workflow Dockerfile
FROM prefecthq/prefect:3-python3.11

# Install system dependencies
RUN apt-get update && apt-get install -y \
    curl \
    wget \
    git \
    ca-certificates \
    gnupg \
    && rm -rf /var/lib/apt/lists/*

# Install TruffleHog (use direct binary download to avoid install script issues)
RUN curl -sSfL "https://github.com/trufflesecurity/trufflehog/releases/download/v3.63.2/trufflehog_3.63.2_linux_amd64.tar.gz" -o trufflehog.tar.gz \
    && tar -xzf trufflehog.tar.gz \
    && mv trufflehog /usr/local/bin/ \
    && rm trufflehog.tar.gz

# Install Gitleaks (use specific version to avoid API rate limiting)
RUN wget https://github.com/gitleaks/gitleaks/releases/download/v8.18.2/gitleaks_8.18.2_linux_x64.tar.gz \
    && tar -xzf gitleaks_8.18.2_linux_x64.tar.gz \
    && mv gitleaks /usr/local/bin/ \
    && rm gitleaks_8.18.2_linux_x64.tar.gz

# Verify installations
RUN trufflehog --version && gitleaks version

# Set working directory
WORKDIR /opt/prefect

# Create toolbox directory structure
RUN mkdir -p /opt/prefect/toolbox

# Set environment variables
ENV PYTHONPATH=/opt/prefect/toolbox:/opt/prefect/toolbox/workflows
ENV WORKFLOW_NAME=secret_detection_scan

# The toolbox code will be mounted at runtime from the backend container
# This includes:
# - /opt/prefect/toolbox/modules/base.py
# - /opt/prefect/toolbox/modules/secret_detection/ (TruffleHog, Gitleaks modules)
# - /opt/prefect/toolbox/modules/reporter/ (SARIF reporter)
# - /opt/prefect/toolbox/workflows/comprehensive/secret_detection_scan/
VOLUME /opt/prefect/toolbox

# Set working directory for execution
WORKDIR /opt/prefect
@@ -0,0 +1,58 @@
# Secret Detection Workflow Dockerfile - Self-Contained Version
# This version copies all required modules into the image for complete isolation
FROM prefecthq/prefect:3-python3.11

# Install system dependencies
RUN apt-get update && apt-get install -y \
    curl \
    wget \
    git \
    ca-certificates \
    gnupg \
    && rm -rf /var/lib/apt/lists/*

# Install TruffleHog
RUN curl -sSfL https://raw.githubusercontent.com/trufflesecurity/trufflehog/main/scripts/install.sh | sh -s -- -b /usr/local/bin

# Install Gitleaks
RUN wget https://github.com/gitleaks/gitleaks/releases/latest/download/gitleaks_linux_x64.tar.gz \
    && tar -xzf gitleaks_linux_x64.tar.gz \
    && mv gitleaks /usr/local/bin/ \
    && rm gitleaks_linux_x64.tar.gz

# Verify installations
RUN trufflehog --version && gitleaks version

# Set working directory
WORKDIR /opt/prefect

# Create directory structure
RUN mkdir -p /opt/prefect/toolbox/modules/secret_detection \
    /opt/prefect/toolbox/modules/reporter \
    /opt/prefect/toolbox/workflows/comprehensive/secret_detection_scan

# Copy the base module and required modules
COPY toolbox/modules/base.py /opt/prefect/toolbox/modules/base.py
COPY toolbox/modules/__init__.py /opt/prefect/toolbox/modules/__init__.py
COPY toolbox/modules/secret_detection/ /opt/prefect/toolbox/modules/secret_detection/
COPY toolbox/modules/reporter/ /opt/prefect/toolbox/modules/reporter/

# Copy the workflow code
COPY toolbox/workflows/comprehensive/secret_detection_scan/ /opt/prefect/toolbox/workflows/comprehensive/secret_detection_scan/

# Copy toolbox init files
COPY toolbox/__init__.py /opt/prefect/toolbox/__init__.py
COPY toolbox/workflows/__init__.py /opt/prefect/toolbox/workflows/__init__.py
COPY toolbox/workflows/comprehensive/__init__.py /opt/prefect/toolbox/workflows/comprehensive/__init__.py

# Install Python dependencies for the modules
# (asyncio is part of the Python standard library and must not be pip-installed)
RUN pip install --no-cache-dir pydantic

# Set environment variables
ENV PYTHONPATH=/opt/prefect/toolbox:/opt/prefect/toolbox/workflows
ENV WORKFLOW_NAME=secret_detection_scan

# Set default command (can be overridden)
CMD ["python", "-m", "toolbox.workflows.comprehensive.secret_detection_scan.workflow"]
@@ -0,0 +1,130 @@
# Secret Detection Scan Workflow

This workflow performs comprehensive secret detection using multiple industry-standard tools:

- **TruffleHog**: Comprehensive secret detection with verification capabilities
- **Gitleaks**: Git-specific secret scanning and leak detection

## Features

- **Parallel Execution**: Runs TruffleHog and Gitleaks concurrently for faster results
- **Deduplication**: Automatically removes duplicate findings across tools
- **SARIF Output**: Generates standardized SARIF reports for integration with security tools
- **Configurable**: Supports extensive configuration for both tools
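The deduplication step can be sketched as a signature check over the file path, start line, and a title prefix — a minimal illustration, with field names assumed to mirror the workflow's finding dictionaries:

```python
# Minimal sketch of cross-tool deduplication: findings that share a file
# path, start line, and title prefix are treated as the same secret.
# Field names (file_path, line_start, title) are assumptions mirroring
# the workflow's finding dictionaries.
def dedupe_findings(findings):
    unique, seen = [], set()
    for f in findings:
        sig = (
            f.get("file_path", ""),
            f.get("line_start", 0),
            f.get("title", "").lower()[:50],
        )
        if sig not in seen:
            seen.add(sig)
            unique.append(f)
    return unique

merged = dedupe_findings([
    {"file_path": "a.py", "line_start": 3, "title": "AWS secret detected"},
    {"file_path": "a.py", "line_start": 3, "title": "AWS secret detected"},
    {"file_path": "b.py", "line_start": 7, "title": "GitHub token detected"},
])
print(len(merged))  # 2 unique findings remain
```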
## Dependencies

### Required Modules
- `toolbox.modules.secret_detection.trufflehog`
- `toolbox.modules.secret_detection.gitleaks`
- `toolbox.modules.reporter` (SARIF reporter)
- `toolbox.modules.base` (Base module interface)

### External Tools
- TruffleHog v3.63.2+
- Gitleaks v8.18.0+

## Docker Deployment

This workflow provides two Docker deployment approaches:

### 1. Volume-Based Approach (Default: `Dockerfile`)

**Advantages:**
- Live code updates without rebuilding images
- Smaller image sizes
- Consistent module versions across workflows
- Faster development iteration

**How it works:**
- Docker image contains only external tools (TruffleHog, Gitleaks)
- Python modules are mounted at runtime from the backend container
- Backend manages code synchronization via shared volumes

### 2. Self-Contained Approach (`Dockerfile.self-contained`)

**Advantages:**
- Complete isolation and reproducibility
- No runtime dependencies on backend code
- Can run independently of FuzzForge platform
- Better for CI/CD integration

**How it works:**
- All required Python modules are copied into the Docker image
- Image is completely self-contained
- Larger image size but fully portable

## Configuration

### TruffleHog Configuration

```json
{
  "trufflehog_config": {
    "verify": true,              // Verify discovered secrets
    "concurrency": 10,           // Number of concurrent workers
    "max_depth": 10,             // Maximum directory depth
    "include_detectors": [],     // Specific detectors to include
    "exclude_detectors": []      // Specific detectors to exclude
  }
}
```

### Gitleaks Configuration

```json
{
  "gitleaks_config": {
    "scan_mode": "detect",           // "detect" or "protect"
    "redact": true,                  // Redact secrets in output
    "max_target_megabytes": 100,     // Maximum file size (MB)
    "no_git": false,                 // Scan without Git context
    "config_file": "",               // Custom Gitleaks config
    "baseline_file": ""              // Baseline file for known findings
  }
}
```
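User-supplied configs are layered over workflow defaults, so a partial override keeps the remaining defaults intact. A minimal sketch of that merge pattern (default values taken from this README; `merge_config` is an illustrative helper, not part of the codebase):

```python
# User-supplied config is layered over defaults; keys the user omits keep
# their default values. This mirrors the dict-merge pattern used by the
# workflow when applying trufflehog_config / gitleaks_config.
DEFAULT_GITLEAKS = {
    "scan_mode": "detect",
    "redact": True,
    "max_target_megabytes": 100,
}

def merge_config(user_config=None, defaults=DEFAULT_GITLEAKS):
    return {**defaults, **(user_config or {})}

cfg = merge_config({"max_target_megabytes": 200})
print(cfg["max_target_megabytes"], cfg["scan_mode"])  # 200 detect
```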
## Usage Example

```bash
curl -X POST "http://localhost:8000/workflows/secret_detection_scan/submit" \
  -H "Content-Type: application/json" \
  -d '{
    "target_path": "/path/to/scan",
    "volume_mode": "ro",
    "parameters": {
      "trufflehog_config": {
        "verify": true,
        "concurrency": 15
      },
      "gitleaks_config": {
        "scan_mode": "detect",
        "max_target_megabytes": 200
      }
    }
  }'
```

## Output Format

The workflow generates a SARIF report containing:
- All unique findings from both tools
- Severity levels mapped to standard scale
- File locations and line numbers
- Detailed descriptions and recommendations
- Tool-specific metadata
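For reference, a minimal sketch of the SARIF 2.1.0 envelope this report follows — the result entry is illustrative, not real tool output:

```python
import json

# Illustrative minimal SARIF 2.1.0 envelope matching the report shape the
# workflow returns; the single result entry is made up for demonstration.
report = {
    "version": "2.1.0",
    "runs": [{
        "tool": {"driver": {"name": "FuzzForge Secret Detection",
                            "version": "1.0.0"}},
        "results": [{
            "ruleId": "trufflehog/aws",
            "level": "error",
            "message": {"text": "AWS secret detected (verified)"},
            "locations": [{"physicalLocation": {
                "artifactLocation": {"uri": "config/settings.py"},
                "region": {"startLine": 12},
            }}],
        }],
    }],
}
print(json.dumps(report, indent=2)[:80])
```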
## Performance Considerations

- **TruffleHog**: CPU-intensive with verification enabled
- **Gitleaks**: Memory-intensive for large repositories
- **Recommended Resources**: 512Mi memory, 500m CPU
- **Typical Runtime**: 1-5 minutes for small repos, 10-30 minutes for large ones

## Security Notes

- Secrets are redacted in output by default
- Verified secrets are marked with higher severity
- Both tools support custom rules and exclusions
- Consider using baseline files for known false positives
@@ -0,0 +1,17 @@
"""
Secret Detection Scan Workflow

This package contains the comprehensive secret detection workflow that combines
multiple secret detection tools for thorough analysis.
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
@@ -0,0 +1,113 @@
name: secret_detection_scan
version: "2.0.0"
description: "Comprehensive secret detection using TruffleHog and Gitleaks"
author: "FuzzForge Team"
category: "comprehensive"
tags:
  - "secrets"
  - "credentials"
  - "detection"
  - "trufflehog"
  - "gitleaks"
  - "comprehensive"

supported_volume_modes:
  - "ro"
  - "rw"

default_volume_mode: "ro"
default_target_path: "/workspace"

requirements:
  tools:
    - "trufflehog"
    - "gitleaks"
  resources:
    memory: "512Mi"
    cpu: "500m"
  timeout: 1800

has_docker: true

default_parameters:
  target_path: "/workspace"
  volume_mode: "ro"
  trufflehog_config: {}
  gitleaks_config: {}
  reporter_config: {}

parameters:
  type: object
  properties:
    target_path:
      type: string
      default: "/workspace"
      description: "Path to analyze"
    volume_mode:
      type: string
      enum: ["ro", "rw"]
      default: "ro"
      description: "Volume mount mode"
    trufflehog_config:
      type: object
      description: "TruffleHog configuration"
      properties:
        verify:
          type: boolean
          description: "Verify discovered secrets"
        concurrency:
          type: integer
          description: "Number of concurrent workers"
        max_depth:
          type: integer
          description: "Maximum directory depth to scan"
        include_detectors:
          type: array
          items:
            type: string
          description: "Specific detectors to include"
        exclude_detectors:
          type: array
          items:
            type: string
          description: "Specific detectors to exclude"
    gitleaks_config:
      type: object
      description: "Gitleaks configuration"
      properties:
        scan_mode:
          type: string
          enum: ["detect", "protect"]
          description: "Scan mode"
        redact:
          type: boolean
          description: "Redact secrets in output"
        max_target_megabytes:
          type: integer
          description: "Maximum file size to scan (MB)"
        no_git:
          type: boolean
          description: "Scan files without Git context"
        config_file:
          type: string
          description: "Path to custom configuration file"
        baseline_file:
          type: string
          description: "Path to baseline file"
    reporter_config:
      type: object
      description: "SARIF reporter configuration"
      properties:
        output_file:
          type: string
          description: "Output SARIF file name"
        include_code_flows:
          type: boolean
          description: "Include code flow information"

output_schema:
  type: object
  properties:
    sarif:
      type: object
      description: "SARIF-formatted security findings"
@@ -0,0 +1,290 @@
|
||||
"""
|
||||
Secret Detection Scan Workflow
|
||||
|
||||
This workflow performs comprehensive secret detection using multiple tools:
|
||||
- TruffleHog: Comprehensive secret detection with verification
|
||||
- Gitleaks: Git-specific secret scanning
|
||||
"""
|
||||
# Copyright (c) 2025 FuzzingLabs
|
||||
#
|
||||
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
|
||||
# at the root of this repository for details.
|
||||
#
|
||||
# After the Change Date (four years from publication), this version of the
|
||||
# Licensed Work will be made available under the Apache License, Version 2.0.
|
||||
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Additional attribution and requirements are provided in the NOTICE file.
|
||||
|
||||
|
||||
import sys
|
||||
import logging
|
||||
from pathlib import Path
|
||||
from typing import Dict, Any, List, Optional
|
||||
from prefect import flow, task
|
||||
from prefect.artifacts import create_markdown_artifact, create_table_artifact
|
||||
import asyncio
|
||||
import json
|
||||
|
||||
# Add modules to path
|
||||
sys.path.insert(0, '/app')
|
||||
|
||||
# Import modules
|
||||
from toolbox.modules.secret_detection.trufflehog import TruffleHogModule
|
||||
from toolbox.modules.secret_detection.gitleaks import GitleaksModule
|
||||
from toolbox.modules.reporter import SARIFReporter
|
||||
|
||||
# Configure logging
|
||||
logging.basicConfig(level=logging.INFO)
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
@task(name="trufflehog_scan")
|
||||
async def run_trufflehog_task(workspace: Path, config: Dict[str, Any]) -> Dict[str, Any]:
|
||||
"""
|
||||
Task to run TruffleHog secret detection.
|
||||
|
||||
Args:
|
||||
workspace: Path to the workspace
|
||||
config: TruffleHog configuration
|
||||
|
||||
Returns:
|
||||
TruffleHog results
|
||||
"""
|
||||
logger.info("Running TruffleHog secret detection")
|
||||
module = TruffleHogModule()
|
||||
result = await module.execute(config, workspace)
|
||||
logger.info(f"TruffleHog completed: {result.summary.get('total_secrets', 0)} secrets found")
|
||||
return result.dict()
|
||||
|
||||
|
||||
@task(name="gitleaks_scan")
|
||||
async def run_gitleaks_task(workspace: Path, config: Dict[str, Any]) -> Dict[str, Any]:
|
||||
"""
|
||||
Task to run Gitleaks secret detection.
|
||||
|
||||
Args:
|
||||
workspace: Path to the workspace
|
||||
config: Gitleaks configuration
|
||||
|
||||
Returns:
|
||||
Gitleaks results
|
||||
"""
|
||||
logger.info("Running Gitleaks secret detection")
|
||||
module = GitleaksModule()
|
||||
result = await module.execute(config, workspace)
|
||||
logger.info(f"Gitleaks completed: {result.summary.get('total_leaks', 0)} leaks found")
|
||||
return result.dict()
|
||||
|
||||
|
||||
@task(name="aggregate_findings")
|
||||
async def aggregate_findings_task(
|
||||
trufflehog_results: Dict[str, Any],
|
||||
gitleaks_results: Dict[str, Any],
|
||||
config: Dict[str, Any],
|
||||
workspace: Path
|
||||
) -> Dict[str, Any]:
|
||||
"""
|
||||
Task to aggregate findings from all secret detection tools.
|
||||
|
||||
Args:
|
||||
trufflehog_results: Results from TruffleHog
|
||||
gitleaks_results: Results from Gitleaks
|
||||
config: Reporter configuration
|
||||
workspace: Path to workspace
|
||||
|
||||
Returns:
|
||||
Aggregated SARIF report
|
||||
"""
|
||||
logger.info("Aggregating secret detection findings")
|
||||
|
||||
# Combine all findings
|
||||
all_findings = []
|
||||
|
||||
# Add TruffleHog findings
|
||||
trufflehog_findings = trufflehog_results.get("findings", [])
|
||||
all_findings.extend(trufflehog_findings)
|
||||
|
||||
# Add Gitleaks findings
|
||||
gitleaks_findings = gitleaks_results.get("findings", [])
|
||||
all_findings.extend(gitleaks_findings)
|
||||
|
||||
# Deduplicate findings based on file path and line number
|
||||
unique_findings = []
|
||||
seen_signatures = set()
|
||||
|
||||
for finding in all_findings:
|
||||
# Create signature for deduplication
|
||||
signature = (
|
||||
finding.get("file_path", ""),
|
||||
finding.get("line_start", 0),
|
||||
finding.get("title", "").lower()[:50] # First 50 chars of title
|
||||
)
|
||||
|
||||
if signature not in seen_signatures:
|
||||
seen_signatures.add(signature)
|
||||
unique_findings.append(finding)
|
||||
else:
|
||||
logger.debug(f"Deduplicated finding: {signature}")
|
||||
|
||||
logger.info(f"Aggregated {len(unique_findings)} unique findings from {len(all_findings)} total")
|
||||
|
||||
# Generate SARIF report
|
||||
reporter = SARIFReporter()
|
||||
reporter_config = {
|
||||
**config,
|
||||
"findings": unique_findings,
|
||||
"tool_name": "FuzzForge Secret Detection",
|
||||
"tool_version": "1.0.0",
|
||||
"tool_description": "Comprehensive secret detection using TruffleHog and Gitleaks"
|
||||
}
|
||||
|
||||
result = await reporter.execute(reporter_config, workspace)
|
||||
return result.dict().get("sarif", {})
|
||||
|
||||
|
||||
@flow(name="secret_detection_scan", log_prints=True)
|
||||
async def main_flow(
|
||||
target_path: str = "/workspace",
|
||||
volume_mode: str = "ro",
|
||||
trufflehog_config: Optional[Dict[str, Any]] = None,
|
||||
gitleaks_config: Optional[Dict[str, Any]] = None,
|
||||
reporter_config: Optional[Dict[str, Any]] = None
|
||||
) -> Dict[str, Any]:
|
||||
"""
|
||||
Main secret detection workflow.
|
||||
|
||||
This workflow:
|
||||
1. Runs TruffleHog for comprehensive secret detection
|
||||
2. Runs Gitleaks for Git-specific secret detection
|
||||
3. Aggregates and deduplicates findings
|
||||
4. Generates a unified SARIF report
|
||||
|
||||
Args:
|
||||
target_path: Path to the mounted workspace (default: /workspace)
|
||||
volume_mode: Volume mount mode (ro/rw)
|
||||
trufflehog_config: Configuration for TruffleHog
|
||||
gitleaks_config: Configuration for Gitleaks
|
||||
reporter_config: Configuration for SARIF reporter
|
||||
|
||||
Returns:
|
||||
SARIF-formatted findings report
|
||||
"""
|
||||
logger.info("Starting comprehensive secret detection workflow")
|
||||
logger.info(f"Workspace: {target_path}, Mode: {volume_mode}")
|
||||
|
||||
# Set workspace path
|
||||
workspace = Path(target_path)
|
||||
|
||||
if not workspace.exists():
|
||||
logger.error(f"Workspace does not exist: {workspace}")
|
||||
return {
|
||||
"error": f"Workspace not found: {workspace}",
|
||||
"sarif": None
|
||||
}
|
||||
|
||||
# Default configurations - merge with provided configs to ensure defaults are always applied
|
||||
default_trufflehog_config = {
|
||||
"verify": False,
|
||||
"concurrency": 10,
|
||||
"max_depth": 10,
|
||||
        "no_git": True  # Add no_git for filesystem scanning
    }
    trufflehog_config = {**default_trufflehog_config, **(trufflehog_config or {})}

    default_gitleaks_config = {
        "scan_mode": "detect",
        "redact": True,
        "max_target_megabytes": 100,
        "no_git": True  # Critical for non-git directories
    }
    gitleaks_config = {**default_gitleaks_config, **(gitleaks_config or {})}

    default_reporter_config = {
        "include_code_flows": False
    }
    reporter_config = {**default_reporter_config, **(reporter_config or {})}

    try:
        # Run secret detection tools in parallel
        logger.info("Phase 1: Running secret detection tools")

        # Create tasks for parallel execution
        trufflehog_task_result = run_trufflehog_task(workspace, trufflehog_config)
        gitleaks_task_result = run_gitleaks_task(workspace, gitleaks_config)

        # Wait for both to complete
        trufflehog_results, gitleaks_results = await asyncio.gather(
            trufflehog_task_result,
            gitleaks_task_result,
            return_exceptions=True
        )

        # Handle any exceptions
        if isinstance(trufflehog_results, Exception):
            logger.error(f"TruffleHog failed: {trufflehog_results}")
            trufflehog_results = {"findings": [], "status": "failed"}

        if isinstance(gitleaks_results, Exception):
            logger.error(f"Gitleaks failed: {gitleaks_results}")
            gitleaks_results = {"findings": [], "status": "failed"}

        # Aggregate findings
        logger.info("Phase 2: Aggregating findings")
        sarif_report = await aggregate_findings_task(
            trufflehog_results,
            gitleaks_results,
            reporter_config,
            workspace
        )

        # Log summary
        if sarif_report and "runs" in sarif_report:
            results_count = len(sarif_report["runs"][0].get("results", []))
            logger.info(f"Workflow completed successfully with {results_count} unique secret findings")

            # Log tool-specific stats
            trufflehog_count = len(trufflehog_results.get("findings", []))
            gitleaks_count = len(gitleaks_results.get("findings", []))
            logger.info(f"Tool results - TruffleHog: {trufflehog_count}, Gitleaks: {gitleaks_count}")
        else:
            logger.info("Workflow completed successfully with no findings")

        return sarif_report

    except Exception as e:
        logger.error(f"Secret detection workflow failed: {e}")
        # Return error in SARIF format
        return {
            "$schema": "https://raw.githubusercontent.com/oasis-tcs/sarif-spec/master/Schemata/sarif-schema-2.1.0.json",
            "version": "2.1.0",
            "runs": [
                {
                    "tool": {
                        "driver": {
                            "name": "FuzzForge Secret Detection",
                            "version": "1.0.0"
                        }
                    },
                    "results": [],
                    "invocations": [
                        {
                            "executionSuccessful": False,
                            "exitCode": 1,
                            "exitCodeDescription": str(e)
                        }
                    ]
                }
            ]
        }


if __name__ == "__main__":
    # For local testing
    import asyncio

    asyncio.run(main_flow(
        target_path="/tmp/test",
        trufflehog_config={"verify": True, "max_depth": 5},
        gitleaks_config={"scan_mode": "detect"}
    ))
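The parallel phase above leans on `asyncio.gather(..., return_exceptions=True)` so that one failing tool does not abort the other. A minimal, self-contained sketch of that pattern (the tool names and `run_tool` helper here are placeholders, not the real task functions):

```python
import asyncio


async def run_tool(name: str, fail: bool = False) -> dict:
    """Stand-in for a scanner task; raises when asked to simulate a tool failure."""
    await asyncio.sleep(0)
    if fail:
        raise RuntimeError(f"{name} crashed")
    return {"findings": [f"{name}-finding"], "status": "success"}


async def run_all() -> tuple:
    # return_exceptions=True delivers exceptions as return values instead of raising
    a, b = await asyncio.gather(
        run_tool("trufflehog"),
        run_tool("gitleaks", fail=True),
        return_exceptions=True,
    )
    # Replace a failed tool's result with an empty placeholder, as the flow does
    if isinstance(a, Exception):
        a = {"findings": [], "status": "failed"}
    if isinstance(b, Exception):
        b = {"findings": [], "status": "failed"}
    return a, b


a, b = asyncio.run(run_all())
print(a["status"], b["status"])  # -> success failed
```

Without `return_exceptions=True`, the first raised exception would propagate out of `gather` and the other tool's result would be lost.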
187
backend/toolbox/workflows/registry.py
Normal file
@@ -0,0 +1,187 @@
"""
Manual Workflow Registry for Prefect Deployment

This file contains the manual registry of all workflows that can be deployed.
Developers MUST add their workflows here after creating them.

This approach is required because:
1. Prefect cannot deploy dynamically imported flows
2. Docker deployment needs static flow references
3. Explicit registration provides better control and visibility
"""

# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.

from typing import Dict, Any, Callable
import logging

logger = logging.getLogger(__name__)

# Import only essential workflows.
# Import each workflow individually so a single failure is handled gracefully.
security_assessment_flow = None
secret_detection_flow = None

try:
    from .security_assessment.workflow import main_flow as security_assessment_flow
except ImportError as e:
    logger.warning(f"Failed to import security_assessment workflow: {e}")

try:
    from .comprehensive.secret_detection_scan.workflow import main_flow as secret_detection_flow
except ImportError as e:
    logger.warning(f"Failed to import secret_detection_scan workflow: {e}")


# Manual registry - developers add workflows here after creation.
# Only workflows that were successfully imported are included.
WORKFLOW_REGISTRY: Dict[str, Dict[str, Any]] = {}

if security_assessment_flow is not None:
    WORKFLOW_REGISTRY["security_assessment"] = {
        "flow": security_assessment_flow,
        "module_path": "toolbox.workflows.security_assessment.workflow",
        "function_name": "main_flow",
        "description": "Comprehensive security assessment workflow that scans files, analyzes code for vulnerabilities, and generates SARIF reports",
        "version": "1.0.0",
        "author": "FuzzForge Team",
        "tags": ["security", "scanner", "analyzer", "static-analysis", "sarif"]
    }

if secret_detection_flow is not None:
    WORKFLOW_REGISTRY["secret_detection_scan"] = {
        "flow": secret_detection_flow,
        "module_path": "toolbox.workflows.comprehensive.secret_detection_scan.workflow",
        "function_name": "main_flow",
        "description": "Comprehensive secret detection using TruffleHog and Gitleaks for thorough credential scanning",
        "version": "1.0.0",
        "author": "FuzzForge Team",
        "tags": ["secrets", "credentials", "detection", "trufflehog", "gitleaks", "comprehensive"]
    }

# To add a new workflow, follow this pattern:
#
# "my_new_workflow": {
#     "flow": my_new_flow_function,  # Import the flow function above
#     "module_path": "toolbox.workflows.my_new_workflow.workflow",
#     "function_name": "my_new_flow_function",
#     "description": "Description of what this workflow does",
#     "version": "1.0.0",
#     "author": "Developer Name",
#     "tags": ["tag1", "tag2"]
# }


def get_workflow_flow(workflow_name: str) -> Callable:
    """
    Get the flow function for a workflow.

    Args:
        workflow_name: Name of the workflow

    Returns:
        Flow function

    Raises:
        KeyError: If workflow not found in registry
    """
    if workflow_name not in WORKFLOW_REGISTRY:
        available = list(WORKFLOW_REGISTRY.keys())
        raise KeyError(
            f"Workflow '{workflow_name}' not found in registry. "
            f"Available workflows: {available}. "
            f"Please add the workflow to toolbox/workflows/registry.py"
        )

    return WORKFLOW_REGISTRY[workflow_name]["flow"]


def get_workflow_info(workflow_name: str) -> Dict[str, Any]:
    """
    Get registry information for a workflow.

    Args:
        workflow_name: Name of the workflow

    Returns:
        Registry information dictionary

    Raises:
        KeyError: If workflow not found in registry
    """
    if workflow_name not in WORKFLOW_REGISTRY:
        available = list(WORKFLOW_REGISTRY.keys())
        raise KeyError(
            f"Workflow '{workflow_name}' not found in registry. "
            f"Available workflows: {available}"
        )

    return WORKFLOW_REGISTRY[workflow_name]


def list_registered_workflows() -> Dict[str, Dict[str, Any]]:
    """
    Get all registered workflows.

    Returns:
        Dictionary of all workflow registry entries
    """
    return WORKFLOW_REGISTRY.copy()


def validate_registry() -> bool:
    """
    Validate the workflow registry for consistency.

    Returns:
        True if valid, raises exceptions if not

    Raises:
        ValueError: If registry is invalid
    """
    if not WORKFLOW_REGISTRY:
        raise ValueError("Workflow registry is empty")

    required_fields = ["flow", "module_path", "function_name", "description"]

    for name, entry in WORKFLOW_REGISTRY.items():
        # Check required fields
        missing_fields = [field for field in required_fields if field not in entry]
        if missing_fields:
            raise ValueError(
                f"Workflow '{name}' missing required fields: {missing_fields}"
            )

        # Check if flow is callable
        if not callable(entry["flow"]):
            raise ValueError(f"Workflow '{name}' flow is not callable")

        # Check if flow has the required Prefect attributes
        if not hasattr(entry["flow"], "deploy"):
            raise ValueError(
                f"Workflow '{name}' flow is not a Prefect flow (missing deploy method)"
            )

    logger.info(f"Registry validation passed. {len(WORKFLOW_REGISTRY)} workflows registered.")
    return True


# Validate registry on import
try:
    validate_registry()
    logger.info(f"Workflow registry loaded successfully with {len(WORKFLOW_REGISTRY)} workflows")
except Exception as e:
    logger.error(f"Workflow registry validation failed: {e}")
    raise
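The lookup helpers above reduce to a plain dict keyed by workflow name, with a descriptive `KeyError` on misses. A stripped-down sketch of the same registry pattern (no Prefect dependency; `demo_flow` is a placeholder, not a real workflow):

```python
from typing import Any, Callable, Dict

REGISTRY: Dict[str, Dict[str, Any]] = {}


def register(name: str, flow: Callable, description: str) -> None:
    """Add a flow to the registry under an explicit name."""
    REGISTRY[name] = {"flow": flow, "description": description}


def get_flow(name: str) -> Callable:
    """Look up a flow; listing the available names makes typos easy to diagnose."""
    if name not in REGISTRY:
        raise KeyError(f"Workflow '{name}' not found. Available: {list(REGISTRY)}")
    return REGISTRY[name]["flow"]


def demo_flow() -> str:
    return "ran"


register("demo", demo_flow, "example workflow")
print(get_flow("demo")())  # -> ran
```

Explicit registration trades a little boilerplate for static references that a deployment tool can discover without importing arbitrary modules.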
30
backend/toolbox/workflows/security_assessment/Dockerfile
Normal file
@@ -0,0 +1,30 @@
FROM prefecthq/prefect:3-python3.11

WORKDIR /app

# Create toolbox directory structure to match expected import paths
RUN mkdir -p /app/toolbox/workflows /app/toolbox/modules

# Copy base module infrastructure
COPY modules/__init__.py /app/toolbox/modules/
COPY modules/base.py /app/toolbox/modules/

# Copy only required modules (manual selection)
COPY modules/scanner /app/toolbox/modules/scanner
COPY modules/analyzer /app/toolbox/modules/analyzer
COPY modules/reporter /app/toolbox/modules/reporter

# Copy this workflow
COPY workflows/security_assessment /app/toolbox/workflows/security_assessment

# Install workflow-specific requirements if they exist
RUN if [ -f /app/toolbox/workflows/security_assessment/requirements.txt ]; then pip install --no-cache-dir -r /app/toolbox/workflows/security_assessment/requirements.txt; fi

# Install common requirements
RUN pip install --no-cache-dir pyyaml

# Set Python path
ENV PYTHONPATH=/app:$PYTHONPATH

# Create workspace directory
RUN mkdir -p /workspace
11
backend/toolbox/workflows/security_assessment/__init__.py
Normal file
@@ -0,0 +1,11 @@
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
111
backend/toolbox/workflows/security_assessment/metadata.yaml
Normal file
@@ -0,0 +1,111 @@
name: security_assessment
version: "2.0.0"
description: "Comprehensive security assessment workflow that scans files, analyzes code for vulnerabilities, and generates SARIF reports"
author: "FuzzForge Team"
category: "comprehensive"
tags:
  - "security"
  - "scanner"
  - "analyzer"
  - "static-analysis"
  - "sarif"
  - "comprehensive"

supported_volume_modes:
  - "ro"
  - "rw"

default_volume_mode: "ro"
default_target_path: "/workspace"

requirements:
  tools:
    - "file_scanner"
    - "security_analyzer"
    - "sarif_reporter"
  resources:
    memory: "512Mi"
    cpu: "500m"
  timeout: 1800

has_docker: true

default_parameters:
  target_path: "/workspace"
  volume_mode: "ro"
  scanner_config: {}
  analyzer_config: {}
  reporter_config: {}

parameters:
  type: object
  properties:
    target_path:
      type: string
      default: "/workspace"
      description: "Path to analyze"
    volume_mode:
      type: string
      enum: ["ro", "rw"]
      default: "ro"
      description: "Volume mount mode"
    scanner_config:
      type: object
      description: "File scanner configuration"
      properties:
        patterns:
          type: array
          items:
            type: string
          description: "File patterns to scan"
        check_sensitive:
          type: boolean
          description: "Check for sensitive files"
        calculate_hashes:
          type: boolean
          description: "Calculate file hashes"
        max_file_size:
          type: integer
          description: "Maximum file size to scan (bytes)"
    analyzer_config:
      type: object
      description: "Security analyzer configuration"
      properties:
        file_extensions:
          type: array
          items:
            type: string
          description: "File extensions to analyze"
        check_secrets:
          type: boolean
          description: "Check for hardcoded secrets"
        check_sql:
          type: boolean
          description: "Check for SQL injection risks"
        check_dangerous_functions:
          type: boolean
          description: "Check for dangerous function calls"
    reporter_config:
      type: object
      description: "SARIF reporter configuration"
      properties:
        include_code_flows:
          type: boolean
          description: "Include code flow information"

output_schema:
  type: object
  properties:
    sarif:
      type: object
      description: "SARIF-formatted security findings"
    summary:
      type: object
      description: "Scan execution summary"
      properties:
        total_findings:
          type: integer
        severity_counts:
          type: object
        tool_counts:
          type: object
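The `parameters` block above is JSON-Schema-style, so `default` values can be applied mechanically before a run. A minimal hand-rolled sketch (not using a schema-validation library) that fills in missing top-level properties:

```python
def apply_defaults(schema: dict, params: dict) -> dict:
    """Fill missing top-level parameters from a JSON-Schema-style 'properties' map."""
    merged = dict(params)
    for name, spec in schema.get("properties", {}).items():
        if name not in merged and "default" in spec:
            merged[name] = spec["default"]
    return merged


# Abbreviated version of the schema in metadata.yaml
schema = {
    "type": "object",
    "properties": {
        "target_path": {"type": "string", "default": "/workspace"},
        "volume_mode": {"type": "string", "enum": ["ro", "rw"], "default": "ro"},
        "scanner_config": {"type": "object"},  # no default: left absent
    },
}

print(apply_defaults(schema, {"volume_mode": "rw"}))
# -> {'volume_mode': 'rw', 'target_path': '/workspace'}
```

User-supplied values win, defaults fill only the gaps, and properties without a `default` (like `scanner_config`) stay absent.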
4
backend/toolbox/workflows/security_assessment/requirements.txt
Normal file
@@ -0,0 +1,4 @@
# Requirements for security assessment workflow
pydantic>=2.0.0
pyyaml>=6.0
aiofiles>=23.0.0
252
backend/toolbox/workflows/security_assessment/workflow.py
Normal file
@@ -0,0 +1,252 @@
"""
Security Assessment Workflow - Comprehensive security analysis using multiple modules
"""

# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.

import sys
import logging
from pathlib import Path
from typing import Dict, Any, Optional
from prefect import flow, task
import json

# Add modules to path
sys.path.insert(0, '/app')

# Import modules
from toolbox.modules.scanner import FileScanner
from toolbox.modules.analyzer import SecurityAnalyzer
from toolbox.modules.reporter import SARIFReporter

# Configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


@task(name="file_scanning")
async def scan_files_task(workspace: Path, config: Dict[str, Any]) -> Dict[str, Any]:
    """
    Task to scan files in the workspace.

    Args:
        workspace: Path to the workspace
        config: Scanner configuration

    Returns:
        Scanner results
    """
    logger.info(f"Starting file scanning in {workspace}")
    scanner = FileScanner()

    result = await scanner.execute(config, workspace)

    logger.info(f"File scanning completed: {result.summary.get('total_files', 0)} files found")
    return result.dict()


@task(name="security_analysis")
async def analyze_security_task(workspace: Path, config: Dict[str, Any]) -> Dict[str, Any]:
    """
    Task to analyze security vulnerabilities.

    Args:
        workspace: Path to the workspace
        config: Analyzer configuration

    Returns:
        Analysis results
    """
    logger.info("Starting security analysis")
    analyzer = SecurityAnalyzer()

    result = await analyzer.execute(config, workspace)

    logger.info(
        f"Security analysis completed: {result.summary.get('total_findings', 0)} findings"
    )
    return result.dict()


@task(name="report_generation")
async def generate_report_task(
    scan_results: Dict[str, Any],
    analysis_results: Dict[str, Any],
    config: Dict[str, Any],
    workspace: Path
) -> Dict[str, Any]:
    """
    Task to generate SARIF report from all findings.

    Args:
        scan_results: Results from scanner
        analysis_results: Results from analyzer
        config: Reporter configuration
        workspace: Path to the workspace

    Returns:
        SARIF report
    """
    logger.info("Generating SARIF report")
    reporter = SARIFReporter()

    # Combine findings from all modules
    all_findings = []

    # Add scanner findings (only sensitive files, not all files)
    scanner_findings = scan_results.get("findings", [])
    sensitive_findings = [f for f in scanner_findings if f.get("severity") != "info"]
    all_findings.extend(sensitive_findings)

    # Add analyzer findings
    analyzer_findings = analysis_results.get("findings", [])
    all_findings.extend(analyzer_findings)

    # Prepare reporter config
    reporter_config = {
        **config,
        "findings": all_findings,
        "tool_name": "FuzzForge Security Assessment",
        "tool_version": "1.0.0"
    }

    result = await reporter.execute(reporter_config, workspace)

    # Extract SARIF from result
    sarif = result.dict().get("sarif", {})

    logger.info(f"Report generated with {len(all_findings)} total findings")
    return sarif


@flow(name="security_assessment", log_prints=True)
async def main_flow(
    target_path: str = "/workspace",
    volume_mode: str = "ro",
    scanner_config: Optional[Dict[str, Any]] = None,
    analyzer_config: Optional[Dict[str, Any]] = None,
    reporter_config: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """
    Main security assessment workflow.

    This workflow:
    1. Scans files in the workspace
    2. Analyzes code for security vulnerabilities
    3. Generates a SARIF report with all findings

    Args:
        target_path: Path to the mounted workspace (default: /workspace)
        volume_mode: Volume mount mode (ro/rw)
        scanner_config: Configuration for file scanner
        analyzer_config: Configuration for security analyzer
        reporter_config: Configuration for SARIF reporter

    Returns:
        SARIF-formatted findings report
    """
    logger.info("Starting security assessment workflow")
    logger.info(f"Workspace: {target_path}, Mode: {volume_mode}")

    # Set workspace path
    workspace = Path(target_path)

    if not workspace.exists():
        logger.error(f"Workspace does not exist: {workspace}")
        return {
            "error": f"Workspace not found: {workspace}",
            "sarif": None
        }

    # Default configurations
    if not scanner_config:
        scanner_config = {
            "patterns": ["*"],
            "check_sensitive": True,
            "calculate_hashes": False,
            "max_file_size": 10485760  # 10MB
        }

    if not analyzer_config:
        analyzer_config = {
            "file_extensions": [".py", ".js", ".java", ".php", ".rb", ".go"],
            "check_secrets": True,
            "check_sql": True,
            "check_dangerous_functions": True
        }

    if not reporter_config:
        reporter_config = {
            "include_code_flows": False
        }

    try:
        # Execute workflow tasks
        logger.info("Phase 1: File scanning")
        scan_results = await scan_files_task(workspace, scanner_config)

        logger.info("Phase 2: Security analysis")
        analysis_results = await analyze_security_task(workspace, analyzer_config)

        logger.info("Phase 3: Report generation")
        sarif_report = await generate_report_task(
            scan_results,
            analysis_results,
            reporter_config,
            workspace
        )

        # Log summary
        if sarif_report and "runs" in sarif_report:
            results_count = len(sarif_report["runs"][0].get("results", []))
            logger.info(f"Workflow completed successfully with {results_count} findings")
        else:
            logger.info("Workflow completed successfully")

        return sarif_report

    except Exception as e:
        logger.error(f"Workflow failed: {e}")
        # Return error in SARIF format
        return {
            "$schema": "https://raw.githubusercontent.com/oasis-tcs/sarif-spec/master/Schemata/sarif-schema-2.1.0.json",
            "version": "2.1.0",
            "runs": [
                {
                    "tool": {
                        "driver": {
                            "name": "FuzzForge Security Assessment",
                            "version": "1.0.0"
                        }
                    },
                    "results": [],
                    "invocations": [
                        {
                            "executionSuccessful": False,
                            "exitCode": 1,
                            "exitCodeDescription": str(e)
                        }
                    ]
                }
            ]
        }


if __name__ == "__main__":
    # For local testing
    import asyncio

    asyncio.run(main_flow(
        target_path="/tmp/test",
        scanner_config={"patterns": ["*.py"]},
        analyzer_config={"check_secrets": True}
    ))
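Both workflows summarize results by indexing into `runs[0].results` of the SARIF dict. A small sketch of summarizing a SARIF 2.1.0 report the same way (the sample report below is illustrative; note the SARIF default `level` is `"warning"` when a result omits it):

```python
from collections import Counter


def summarize_sarif(report: dict) -> dict:
    """Count results and severity levels in the first run of a SARIF 2.1.0 report."""
    if not report or not report.get("runs"):
        return {"total": 0, "levels": {}}
    results = report["runs"][0].get("results", [])
    # Per the SARIF spec, a result without an explicit level defaults to "warning"
    levels = Counter(r.get("level", "warning") for r in results)
    return {"total": len(results), "levels": dict(levels)}


sample = {
    "version": "2.1.0",
    "runs": [{"results": [
        {"ruleId": "hardcoded-secret", "level": "error"},
        {"ruleId": "sql-injection", "level": "error"},
        {"ruleId": "sensitive-file"},  # no level -> treated as "warning"
    ]}],
}

print(summarize_sarif(sample))
# -> {'total': 3, 'levels': {'error': 2, 'warning': 1}}
```

The empty-report guard mirrors the workflows' `if sarif_report and "runs" in sarif_report` check.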
2635
backend/uv.lock
generated
Normal file
File diff suppressed because it is too large
64
cli/.gitignore
vendored
Normal file
@@ -0,0 +1,64 @@
# FuzzForge CLI specific .gitignore

# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# Virtual environments
.venv/
venv/
ENV/
env/

# UV package manager - keep uv.lock for CLI
# uv.lock  # Commented out - we want to keep this for reproducible CLI builds

# IDE
.vscode/
.idea/
*.swp
*.swo

# OS
.DS_Store
Thumbs.db

# Testing
.coverage
.pytest_cache/
.tox/
htmlcov/

# MyPy
.mypy_cache/

# Local development
local_config.yaml
.env.local

# Generated files
*.log
*.tmp

# CLI specific
# Don't ignore uv.lock in CLI as it's needed for reproducible builds
!uv.lock
583
cli/README.md
Normal file
@@ -0,0 +1,583 @@
# FuzzForge CLI

🛡️ **FuzzForge CLI** - Command-line interface for the FuzzForge security testing platform

A comprehensive CLI for managing security testing workflows, monitoring runs in real-time, and analyzing findings, with rich terminal interfaces and persistent project management.

## ✨ Features

- 📁 **Project Management** - Initialize and manage FuzzForge projects with local databases
- 🔧 **Workflow Management** - Browse, configure, and run security testing workflows
- 🚀 **Workflow Execution** - Execute and manage security testing workflows
- 🔍 **Findings Analysis** - View, export, and analyze security findings in multiple formats
- 📊 **Real-time Monitoring** - Live dashboards for fuzzing statistics and crash reports
- ⚙️ **Configuration** - Flexible project and global configuration management
- 🎨 **Rich UI** - Beautiful tables, progress bars, and interactive prompts
- 💾 **Persistent Storage** - SQLite database for runs, findings, and crash data
- 🛡️ **Error Handling** - Comprehensive error handling with user-friendly messages
- 🔄 **Network Resilience** - Automatic retries and graceful degradation

## 🚀 Quick Start

### Installation

#### Prerequisites
- Python 3.11 or higher
- [uv](https://docs.astral.sh/uv/) package manager

#### Install FuzzForge CLI
```bash
# Clone the repository
git clone https://github.com/FuzzingLabs/fuzzforge_alpha.git
cd fuzzforge_alpha/cli

# Install globally with uv (recommended)
uv tool install .

# Alternative: install in development mode
uv sync
uv add --editable ../sdk
uv tool install --editable .

# Verify installation
fuzzforge --help
```

#### Shell Completion (Optional)
```bash
# Install completion for your shell
fuzzforge --install-completion
```

### Initialize Your First Project

```bash
# Create a new project directory
mkdir my-security-project
cd my-security-project

# Initialize FuzzForge project
ff init

# Check status
fuzzforge status
```

This creates a `.fuzzforge/` directory with:
- SQLite database for persistent storage
- Configuration file (`config.yaml`)
- Project metadata

### Run Your First Analysis

```bash
# List available workflows
fuzzforge workflows list

# Get workflow details
fuzzforge workflows info security_assessment

# Submit a workflow for analysis
fuzzforge workflow security_assessment /path/to/your/code

# View findings when complete
fuzzforge finding <execution-id>
```
## 📚 Command Reference

### Project Management

#### `ff init`
Initialize a new FuzzForge project in the current directory.

```bash
ff init --name "My Security Project" --api-url "http://localhost:8000"
```

**Options:**
- `--name, -n` - Project name (defaults to directory name)
- `--api-url, -u` - FuzzForge API URL (defaults to http://localhost:8000)
- `--force, -f` - Force initialization even if a project exists

#### `fuzzforge status`
Show comprehensive project and API status information.

```bash
fuzzforge status
```

Displays:
- Project information and configuration
- Database statistics (runs, findings, crashes)
- API connectivity and available workflows

### Workflow Management

#### `fuzzforge workflows list`
List all available security testing workflows.

```bash
fuzzforge workflows list
```

#### `fuzzforge workflows info <workflow-name>`
Show detailed information about a specific workflow.

```bash
fuzzforge workflows info security_assessment
```

Displays:
- Workflow metadata (version, author, description)
- Parameter schema and requirements
- Supported volume modes and features

#### `fuzzforge workflows parameters <workflow-name>`
Interactive parameter builder for workflows.

```bash
# Interactive mode
fuzzforge workflows parameters security_assessment

# Save parameters to file
fuzzforge workflows parameters security_assessment --output params.json

# Non-interactive mode (show schema only)
fuzzforge workflows parameters security_assessment --no-interactive
```

### Workflow Execution

#### `fuzzforge workflow <workflow> <target-path>`
Execute a security testing workflow.

```bash
# Basic execution
fuzzforge workflow security_assessment /path/to/code

# With parameters
fuzzforge workflow security_assessment /path/to/binary \
    --param timeout=3600 \
    --param iterations=10000

# With a parameter file
fuzzforge workflow security_assessment /path/to/code \
    --param-file my-params.json

# Wait for completion
fuzzforge workflow security_assessment /path/to/code --wait
```

**Options:**
- `--param, -p` - Parameter in key=value format (can be used multiple times)
- `--param-file, -f` - JSON file containing parameters
- `--volume-mode, -v` - Volume mount mode: `ro` (read-only) or `rw` (read-write)
- `--timeout, -t` - Execution timeout in seconds
- `--interactive/--no-interactive, -i/-n` - Interactive parameter input
- `--wait, -w` - Wait for execution to complete
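Repeatable `--param key=value` options imply splitting each argument on the first `=` and folding the pairs into a dict. A sketch of how that might look (illustrative only, not the CLI's actual parser):

```python
def parse_params(pairs: list) -> dict:
    """Turn ['timeout=3600', 'note=a=b'] into {'timeout': '3600', 'note': 'a=b'}."""
    params = {}
    for pair in pairs:
        if "=" not in pair:
            raise ValueError(f"Expected key=value, got: {pair!r}")
        key, _, value = pair.partition("=")  # split on the first '=' only
        params[key] = value
    return params


print(parse_params(["timeout=3600", "iterations=10000", "note=a=b"]))
# -> {'timeout': '3600', 'iterations': '10000', 'note': 'a=b'}
```

Using `str.partition` rather than `str.split("=")` keeps any `=` characters inside the value intact.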
#### `fuzzforge workflow status [execution-id]`
Check the status of a workflow execution.

```bash
# Check a specific execution
fuzzforge workflow status abc123def456

# Check the most recent execution
fuzzforge workflow status
```

#### `fuzzforge workflow history`
Show workflow execution history from the local database.

```bash
# List all executions
fuzzforge workflow history

# Filter by workflow
fuzzforge workflow history --workflow security_assessment

# Filter by status
fuzzforge workflow history --status completed

# Limit results
fuzzforge workflow history --limit 10
```

#### `fuzzforge workflow retry <execution-id>`
Retry a workflow with the same or modified parameters.

```bash
# Retry with the same parameters
fuzzforge workflow retry abc123def456

# Modify parameters interactively
fuzzforge workflow retry abc123def456 --modify-params
```

### Findings Management

#### `fuzzforge finding [execution-id]`
View security findings for a specific execution.

```bash
# Display latest findings
fuzzforge finding

# Display findings for a specific execution
fuzzforge finding abc123def456
```

#### `fuzzforge findings`
Browse all security findings from the local database.

```bash
# List all findings
fuzzforge findings

# Show findings history
fuzzforge findings history --limit 20
```

#### `fuzzforge finding export [execution-id]`
Export security findings in various formats.

```bash
# Export latest findings
fuzzforge finding export --format json

# Export findings for a specific execution
fuzzforge finding export abc123def456 --format sarif

# Export as CSV with an output file
fuzzforge finding export abc123def456 --format csv --output report.csv

# Export as an HTML report
fuzzforge finding export --format html --output report.html
```

### Configuration Management

#### `fuzzforge config show`
Display current configuration settings.

```bash
# Show project configuration
fuzzforge config show

# Show global configuration
fuzzforge config show --global
```

#### `fuzzforge config set <key> <value>`
Set a configuration value.

```bash
# Project settings
fuzzforge config set project.api_url "http://api.fuzzforge.com"
fuzzforge config set project.default_timeout 7200
fuzzforge config set project.default_workflow "security_assessment"

# Retention settings
fuzzforge config set retention.max_runs 200
fuzzforge config set retention.keep_findings_days 120

# Preferences
fuzzforge config set preferences.auto_save_findings true
fuzzforge config set preferences.show_progress_bars false

# Global configuration
fuzzforge config set project.api_url "http://global.api.com" --global
```
|
||||
#### `fuzzforge config get <key>`
|
||||
Get a specific configuration value.
|
||||
|
||||
```bash
|
||||
fuzzforge config get project.api_url
|
||||
fuzzforge config get retention.max_runs --global
|
||||
```
|
||||
|
||||
#### `fuzzforge config reset`
|
||||
Reset configuration to defaults.
|
||||
|
||||
```bash
|
||||
# Reset project configuration
|
||||
fuzzforge config reset
|
||||
|
||||
# Reset global configuration
|
||||
fuzzforge config reset --global
|
||||
|
||||
# Skip confirmation
|
||||
fuzzforge config reset --force
|
||||
```
|
||||
|
||||
#### `fuzzforge config edit`
|
||||
Open configuration file in default editor.
|
||||
|
||||
```bash
|
||||
# Edit project configuration
|
||||
fuzzforge config edit
|
||||
|
||||
# Edit global configuration
|
||||
fuzzforge config edit --global
|
||||
```
|
||||
|
||||
## 🏗️ Project Structure
|
||||
|
||||
When you initialize a FuzzForge project, the following structure is created:
|
||||
|
||||
```
|
||||
my-project/
|
||||
├── .fuzzforge/
|
||||
│ ├── config.yaml # Project configuration
|
||||
│ └── findings.db # SQLite database
|
||||
├── .gitignore # Updated with FuzzForge entries
|
||||
└── README.md # Project README (if created)
|
||||
```
|
||||
|
||||
### Database Schema
|
||||
|
||||
The SQLite database stores:
|
||||
|
||||
- **runs** - Workflow run history and metadata
|
||||
- **findings** - Security findings and SARIF data
|
||||
- **crashes** - Crash reports and fuzzing data
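
These tables can be inspected directly with Python's standard-library `sqlite3` module. A minimal sketch that lists the tables — the column layouts are not documented here, so the demo creates stand-in tables with the documented names rather than assuming real columns:

```python
import sqlite3

def list_tables(conn: sqlite3.Connection) -> list[str]:
    """Return the names of all user tables in a findings database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]

# Demo against an in-memory stand-in that mirrors the documented table names:
demo = sqlite3.connect(":memory:")
for table in ("runs", "findings", "crashes"):
    demo.execute(f"CREATE TABLE {table} (id INTEGER PRIMARY KEY)")
tables = list_tables(demo)  # ['crashes', 'findings', 'runs']
demo.close()
```

Against a real project, pass `sqlite3.connect(".fuzzforge/findings.db")` instead of the in-memory stand-in.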
### Configuration Format

Project configuration (`.fuzzforge/config.yaml`):

```yaml
project:
  name: "My Security Project"
  api_url: "http://localhost:8000"
  default_timeout: 3600
  default_workflow: null

retention:
  max_runs: 100
  keep_findings_days: 90

preferences:
  auto_save_findings: true
  show_progress_bars: true
  table_style: "rich"
  color_output: true
```
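
Since the file is plain YAML, scripts can read it with PyYAML (already a CLI dependency). A minimal sketch, shown against an inline copy of the config above rather than a real project file:

```python
import yaml  # PyYAML

CONFIG_TEXT = """
project:
  name: "My Security Project"
  api_url: "http://localhost:8000"
  default_timeout: 3600
retention:
  max_runs: 100
"""

config = yaml.safe_load(CONFIG_TEXT)
api_url = config["project"]["api_url"]
# Fall back to a sensible default when a key is absent:
max_runs = config.get("retention", {}).get("max_runs", 100)
```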
## 🔧 Advanced Usage

### Parameter Handling

FuzzForge CLI supports flexible parameter input:

1. **Command line parameters**:
   ```bash
   ff workflow workflow-name /path key1=value1 key2=value2
   ```

2. **Parameter files**:
   ```bash
   echo '{"timeout": 3600, "threads": 4}' > params.json
   ff workflow workflow-name /path --param-file params.json
   ```

3. **Interactive prompts**:
   ```bash
   ff workflow workflow-name /path --interactive
   ```

4. **Parameter builder**:
   ```bash
   ff workflows parameters workflow-name --output my-params.json
   ff workflow workflow-name /path --param-file my-params.json
   ```
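
When a parameter file grows beyond a one-liner, generating it from Python guarantees valid JSON. A small sketch — the parameter names here are illustrative, not taken from a real workflow schema:

```python
import json

# Hypothetical parameters for illustration only.
params = {
    "timeout": 3600,        # seconds
    "threads": 4,
    "target_extensions": [".c", ".cpp"],
}

with open("params.json", "w") as fh:
    json.dump(params, fh, indent=2)

# Round-trip check: the file parses back to the same structure.
with open("params.json") as fh:
    loaded = json.load(fh)
```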
### Environment Variables

Override configuration with environment variables:

```bash
export FUZZFORGE_API_URL="http://production.api.com"
export FUZZFORGE_TIMEOUT="7200"
```

### Data Retention

Configure automatic cleanup of old data:

```bash
# Keep only 50 runs
fuzzforge config set retention.max_runs 50

# Keep findings for 30 days
fuzzforge config set retention.keep_findings_days 30
```

### Export Formats

Support for multiple export formats:

- **JSON** - Simplified findings structure
- **CSV** - Tabular data for spreadsheets
- **HTML** - Interactive web report
- **SARIF** - Standard security analysis format
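
SARIF is the richest of these formats; when the built-in CSV export is not enough, its results can be flattened by hand. A minimal sketch assuming the standard SARIF 2.1.0 layout (field names beyond `ruleId`/`level`/`message` vary by tool; the sample document is hand-made for demonstration):

```python
import csv
import io

def sarif_to_rows(sarif: dict) -> list[dict]:
    """Flatten SARIF results into one row per finding."""
    rows = []
    for run in sarif.get("runs", []):
        tool = run.get("tool", {}).get("driver", {}).get("name", "unknown")
        for result in run.get("results", []):
            rows.append({
                "tool": tool,
                "rule": result.get("ruleId", "unknown"),
                "level": result.get("level", "note"),
                "message": result.get("message", {}).get("text", ""),
            })
    return rows

# Tiny hand-made SARIF document for demonstration:
sample = {
    "version": "2.1.0",
    "runs": [{
        "tool": {"driver": {"name": "demo-scanner"}},
        "results": [{"ruleId": "CWE-89", "level": "error",
                     "message": {"text": "Possible SQL injection"}}],
    }],
}

rows = sarif_to_rows(sample)
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["tool", "rule", "level", "message"])
writer.writeheader()
writer.writerows(rows)
```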
## 🛠️ Development

### Setup Development Environment

```bash
# Clone repository
git clone https://github.com/FuzzingLabs/fuzzforge_alpha.git
cd fuzzforge_alpha/cli

# Install in development mode
uv sync
uv add --editable ../sdk

# Install CLI in editable mode
uv tool install --editable .
```

### Project Structure

```
cli/
├── src/fuzzforge_cli/
│   ├── __init__.py
│   ├── main.py              # Main CLI app
│   ├── config.py            # Configuration management
│   ├── database.py          # Database operations
│   ├── exceptions.py        # Error handling
│   ├── api_validation.py    # API response validation
│   └── commands/            # Command implementations
│       ├── init.py          # Project initialization
│       ├── workflows.py     # Workflow management
│       ├── runs.py          # Run management
│       ├── findings.py      # Findings management
│       ├── config.py        # Configuration commands
│       └── status.py        # Status information
├── pyproject.toml           # Project configuration
└── README.md                # This file
```

### Running Tests

```bash
# Run tests (when available)
uv run pytest

# Code formatting
uv run black src/
uv run isort src/

# Type checking
uv run mypy src/
```

## ⚠️ Troubleshooting

### Common Issues

#### "No FuzzForge project found"
```bash
# Initialize a project first
ff init
```

#### API Connection Failed
```bash
# Check API URL configuration
fuzzforge config get project.api_url

# Test API connectivity
fuzzforge status

# Update API URL if needed
fuzzforge config set project.api_url "http://correct-url:8000"
```

#### Permission Errors
```bash
# Ensure proper permissions for the project directory
chmod -R 755 .fuzzforge/

# Check file ownership
ls -la .fuzzforge/
```

#### Database Issues
```bash
# Check that the database file exists
ls -la .fuzzforge/findings.db

# Reinitialize if corrupted (this deletes stored data)
rm .fuzzforge/findings.db
ff init --force
```

### Environment Variables

Set these environment variables for debugging:

```bash
export FUZZFORGE_DEBUG=1          # Enable debug logging
export FUZZFORGE_API_URL="..."    # Override API URL
export FUZZFORGE_TIMEOUT="30"     # Override timeout
```

### Getting Help

```bash
# General help
fuzzforge --help

# Command-specific help
ff workflows --help
ff workflow run --help

# Show version
fuzzforge --version
```

## 🏆 Example Workflow

Here's a complete example of analyzing a project:

```bash
# 1. Initialize project
mkdir my-security-audit
cd my-security-audit
ff init --name "Security Audit 2024"

# 2. Check available workflows
fuzzforge workflows list

# 3. Submit a comprehensive security assessment
ff workflow security_assessment /path/to/source/code --wait

# 4. View findings in table format
fuzzforge finding <run-id>

# 5. Export a detailed report
fuzzforge finding export <run-id> --format html --output security_report.html

# 6. Check project statistics
fuzzforge status
```

## 📜 License

This project is licensed under the terms specified in the main FuzzForge repository.

## 🤝 Contributing

Contributions are welcome! Please see the main FuzzForge repository for contribution guidelines.

---

**FuzzForge CLI** - Making security testing workflows accessible and efficient from the command line.
323	cli/completion_install.py	Normal file
@@ -0,0 +1,323 @@
#!/usr/bin/env python3
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.

"""
Install shell completion for FuzzForge CLI.

This script installs completion using Typer's built-in --install-completion command.
"""

import os
import sys
import subprocess
from pathlib import Path
import typer


def run_fuzzforge_completion_install(shell: str) -> bool:
    """Install completion using the fuzzforge CLI itself."""
    try:
        # Use the CLI's built-in completion installation
        result = subprocess.run([
            sys.executable, "-m", "fuzzforge_cli.main",
            "--install-completion", shell
        ], capture_output=True, text=True, cwd=Path(__file__).parent.parent)

        if result.returncode == 0:
            print(f"✅ {shell.capitalize()} completion installed successfully")
            return True
        else:
            print(f"❌ Failed to install {shell} completion: {result.stderr}")
            return False

    except Exception as e:
        print(f"❌ Error installing {shell} completion: {e}")
        return False


def create_manual_completion_scripts():
    """Create manual completion scripts as fallback."""
    scripts = {
        "bash": '''
# FuzzForge CLI completion for bash
_fuzzforge_completion() {
    local IFS=$'\\t'
    local response

    response=$(env COMP_WORDS="${COMP_WORDS[*]}" COMP_CWORD=$COMP_CWORD _FUZZFORGE_COMPLETE=bash_complete $1)

    for completion in $response; do
        IFS=',' read type value <<< "$completion"

        if [[ $type == 'dir' ]]; then
            COMPREPLY=()
            compopt -o dirnames
        elif [[ $type == 'file' ]]; then
            COMPREPLY=()
            compopt -o default
        elif [[ $type == 'plain' ]]; then
            COMPREPLY+=($value)
        fi
    done

    return 0
}

complete -o nosort -F _fuzzforge_completion fuzzforge
''',

        "zsh": '''
#compdef fuzzforge

_fuzzforge_completion() {
    local -a completions
    local -a completions_with_descriptions
    local -a response
    response=(${(f)"$(env COMP_WORDS="${words[*]}" COMP_CWORD=$((CURRENT-1)) _FUZZFORGE_COMPLETE=zsh_complete fuzzforge)"})

    for type_and_line in $response; do
        if [[ "$type_and_line" =~ ^([^,]*),(.*)$ ]]; then
            local type="$match[1]"
            local line="$match[2]"

            if [[ "$type" == "dir" ]]; then
                _path_files -/
            elif [[ "$type" == "file" ]]; then
                _path_files -f
            elif [[ "$type" == "plain" ]]; then
                if [[ "$line" =~ ^([^:]*):(.*)$ ]]; then
                    completions_with_descriptions+=("$match[1]":"$match[2]")
                else
                    completions+=("$line")
                fi
            fi
        fi
    done

    if [ -n "$completions_with_descriptions" ]; then
        _describe "" completions_with_descriptions -V unsorted
    fi

    if [ -n "$completions" ]; then
        compadd -U -V unsorted -a completions
    fi
}

compdef _fuzzforge_completion fuzzforge;
''',

        "fish": '''
# FuzzForge CLI completion for fish
function __fuzzforge_completion
    set -l response

    for value in (env _FUZZFORGE_COMPLETE=fish_complete COMP_WORDS=(commandline -cp) COMP_CWORD=(commandline -t) fuzzforge)
        set response $response $value
    end

    for completion in $response
        set -l metadata (string split "," $completion)

        if test $metadata[1] = "dir"
            __fish_complete_directories $metadata[2]
        else if test $metadata[1] = "file"
            __fish_complete_path $metadata[2]
        else if test $metadata[1] = "plain"
            echo $metadata[2]
        end
    end
end

complete --no-files --command fuzzforge --arguments "(__fuzzforge_completion)"
'''
    }

    return scripts


def install_bash_completion():
    """Install bash completion."""
    print("📝 Installing bash completion...")

    # Get the manual completion script
    scripts = create_manual_completion_scripts()
    completion_script = scripts["bash"]

    # Try different locations for bash completion
    completion_dirs = [
        Path.home() / ".bash_completion.d",
        Path("/usr/local/etc/bash_completion.d"),
        Path("/etc/bash_completion.d")
    ]

    for completion_dir in completion_dirs:
        try:
            completion_dir.mkdir(exist_ok=True)
            completion_file = completion_dir / "fuzzforge"
            completion_file.write_text(completion_script)
            print(f"✅ Bash completion installed to: {completion_file}")

            # Add source line to .bashrc if not present
            bashrc = Path.home() / ".bashrc"
            source_line = f"source {completion_file}"

            if bashrc.exists():
                bashrc_content = bashrc.read_text()
                if source_line not in bashrc_content:
                    with bashrc.open("a") as f:
                        f.write(f"\n# FuzzForge CLI completion\n{source_line}\n")
                    print("✅ Added completion source to ~/.bashrc")

            return True
        except PermissionError:
            continue
        except Exception as e:
            print(f"❌ Failed to install bash completion: {e}")
            continue

    print("❌ Could not install bash completion (permission denied)")
    return False


def install_zsh_completion():
    """Install zsh completion."""
    print("📝 Installing zsh completion...")

    # Get the manual completion script
    scripts = create_manual_completion_scripts()
    completion_script = scripts["zsh"]

    # Create completion directory
    comp_dir = Path.home() / ".zsh" / "completions"
    comp_dir.mkdir(parents=True, exist_ok=True)

    try:
        completion_file = comp_dir / "_fuzzforge"
        completion_file.write_text(completion_script)
        print(f"✅ Zsh completion installed to: {completion_file}")

        # Add fpath to .zshrc if not present
        zshrc = Path.home() / ".zshrc"
        fpath_line = f'fpath=(~/.zsh/completions $fpath)'
        autoload_line = 'autoload -U compinit && compinit'

        if zshrc.exists():
            zshrc_content = zshrc.read_text()
            lines_to_add = []

            if fpath_line not in zshrc_content:
                lines_to_add.append(fpath_line)

            if autoload_line not in zshrc_content:
                lines_to_add.append(autoload_line)

            if lines_to_add:
                with zshrc.open("a") as f:
                    f.write(f"\n# FuzzForge CLI completion\n")
                    for line in lines_to_add:
                        f.write(f"{line}\n")
                print("✅ Added completion setup to ~/.zshrc")

        return True
    except Exception as e:
        print(f"❌ Failed to install zsh completion: {e}")
        return False


def install_fish_completion():
    """Install fish completion."""
    print("📝 Installing fish completion...")

    # Get the manual completion script
    scripts = create_manual_completion_scripts()
    completion_script = scripts["fish"]

    # Fish completion directory
    comp_dir = Path.home() / ".config" / "fish" / "completions"
    comp_dir.mkdir(parents=True, exist_ok=True)

    try:
        completion_file = comp_dir / "fuzzforge.fish"
        completion_file.write_text(completion_script)
        print(f"✅ Fish completion installed to: {completion_file}")
        return True
    except Exception as e:
        print(f"❌ Failed to install fish completion: {e}")
        return False


def detect_shell():
    """Detect the current shell."""
    shell_path = os.environ.get('SHELL', '')
    if 'bash' in shell_path:
        return 'bash'
    elif 'zsh' in shell_path:
        return 'zsh'
    elif 'fish' in shell_path:
        return 'fish'
    else:
        return None


def main():
    """Install completion for the current shell or all shells."""
    print("🚀 FuzzForge CLI Completion Installer")
    print("=" * 50)

    current_shell = detect_shell()
    if current_shell:
        print(f"🐚 Detected shell: {current_shell}")

    # Check for command line arguments
    if len(sys.argv) > 1 and sys.argv[1] == "--all":
        install_all = True
        print("Installing completion for all shells...")
    else:
        # Ask which shells to install (default to the current shell only)
        if current_shell:
            install_all = typer.confirm("Install completion for all supported shells (bash, zsh, fish)?", default=False)
            if not install_all:
                print(f"Installing completion for {current_shell} only...")
        else:
            install_all = typer.confirm("Install completion for all supported shells (bash, zsh, fish)?", default=True)

    success_count = 0

    if install_all or current_shell == 'bash':
        if install_bash_completion():
            success_count += 1

    if install_all or current_shell == 'zsh':
        if install_zsh_completion():
            success_count += 1

    if install_all or current_shell == 'fish':
        if install_fish_completion():
            success_count += 1

    print("\n" + "=" * 50)
    if success_count > 0:
        print(f"✅ Successfully installed completion for {success_count} shell(s)!")
        print("\n📋 To activate completion:")
        print("  • Bash: Restart your terminal or run 'source ~/.bashrc'")
        print("  • Zsh: Restart your terminal or run 'source ~/.zshrc'")
        print("  • Fish: Completion is active immediately")
        print("\n💡 Try typing 'fuzzforge <TAB>' to test completion!")
    else:
        print("❌ No completions were installed successfully.")
        return 1

    return 0


if __name__ == "__main__":
    sys.exit(main())
22	cli/main.py	Normal file
@@ -0,0 +1,22 @@
"""
FuzzForge CLI - Command-line interface for FuzzForge security testing platform.

This module provides the main entry point for the FuzzForge CLI application.
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.


import typer
from src.fuzzforge_cli.main import app

if __name__ == "__main__":
    app()
41	cli/pyproject.toml	Normal file
@@ -0,0 +1,41 @@
[project]
name = "fuzzforge-cli"
version = "0.6.0"
description = "FuzzForge CLI - Command-line interface for FuzzForge security testing platform"
readme = "README.md"
authors = [
    { name = "Tanguy Duhamel", email = "tduhamel@fuzzinglabs.com" }
]
requires-python = ">=3.11"
dependencies = [
    "typer>=0.12.0",
    "rich>=13.0.0",
    "pyyaml>=6.0.0",
    "pydantic>=2.0.0",
    "httpx>=0.27.0",
    "websockets>=13.0",
    "sseclient-py>=1.8.0",
    "fuzzforge-sdk",
    "fuzzforge-ai",
]

[project.optional-dependencies]
dev = [
    "pytest>=8.0.0",
    "pytest-asyncio>=0.23.0",
    "black>=24.0.0",
    "isort>=5.13.0",
    "mypy>=1.11.0",
]

[project.scripts]
fuzzforge = "fuzzforge_cli.main:main"
ff = "fuzzforge_cli.main:main"

[build-system]
requires = ["uv_build>=0.8.17,<0.9.0"]
build-backend = "uv_build"

[tool.uv.sources]
fuzzforge-sdk = { path = "../sdk", editable = true }
fuzzforge-ai = { path = "../ai", editable = true }
19	cli/src/fuzzforge_cli/__init__.py	Normal file
@@ -0,0 +1,19 @@
"""
FuzzForge CLI - Command-line interface for FuzzForge security testing platform.

A comprehensive CLI for managing workflows, runs, findings, and real-time monitoring
with local project management and persistent storage.
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.


__version__ = "0.6.0"
311	cli/src/fuzzforge_cli/api_validation.py	Normal file
@@ -0,0 +1,311 @@
|
||||
"""
|
||||
API response validation and graceful degradation utilities.
|
||||
"""
|
||||
# Copyright (c) 2025 FuzzingLabs
|
||||
#
|
||||
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
|
||||
# at the root of this repository for details.
|
||||
#
|
||||
# After the Change Date (four years from publication), this version of the
|
||||
# Licensed Work will be made available under the Apache License, Version 2.0.
|
||||
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Additional attribution and requirements are provided in the NOTICE file.
|
||||
|
||||
|
||||
import logging
|
||||
from typing import Any, Dict, List, Optional, Union
|
||||
from pydantic import BaseModel, ValidationError as PydanticValidationError
|
||||
|
||||
from .exceptions import ValidationError, APIConnectionError
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class WorkflowMetadata(BaseModel):
|
||||
"""Expected workflow metadata structure"""
|
||||
name: str
|
||||
version: str
|
||||
author: Optional[str] = None
|
||||
description: Optional[str] = None
|
||||
parameters: Dict[str, Any] = {}
|
||||
supported_volume_modes: List[str] = ["ro", "rw"]
|
||||
|
||||
|
||||
class RunStatus(BaseModel):
|
||||
"""Expected run status structure"""
|
||||
run_id: str
|
||||
workflow: str
|
||||
status: str
|
||||
created_at: str
|
||||
updated_at: str
|
||||
|
||||
@property
|
||||
def is_completed(self) -> bool:
|
||||
"""Check if run is in a completed state"""
|
||||
return self.status.lower() in ["completed", "success", "finished"]
|
||||
|
||||
@property
|
||||
def is_running(self) -> bool:
|
||||
"""Check if run is currently running"""
|
||||
return self.status.lower() in ["running", "in_progress", "active"]
|
||||
|
||||
@property
|
||||
def is_failed(self) -> bool:
|
||||
"""Check if run has failed"""
|
||||
return self.status.lower() in ["failed", "error", "cancelled"]
|
||||
|
||||
|
||||
class FindingsResponse(BaseModel):
|
||||
"""Expected findings response structure"""
|
||||
run_id: str
|
||||
sarif: Dict[str, Any]
|
||||
total_issues: Optional[int] = None
|
||||
|
||||
def model_post_init(self, __context: Any) -> None:
|
||||
"""Validate SARIF structure after initialization"""
|
||||
if not self.sarif.get("runs"):
|
||||
logger.warning(f"SARIF data for run {self.run_id} missing 'runs' section")
|
||||
elif not isinstance(self.sarif["runs"], list):
|
||||
logger.warning(f"SARIF 'runs' section is not a list for run {self.run_id}")
|
||||
|
||||
|
||||
def validate_api_response(response_data: Any, expected_model: type[BaseModel],
|
||||
operation: str = "API operation") -> BaseModel:
|
||||
"""
|
||||
Validate API response against expected Pydantic model.
|
||||
|
||||
Args:
|
||||
response_data: Raw response data from API
|
||||
expected_model: Pydantic model class to validate against
|
||||
operation: Description of the operation for error messages
|
||||
|
||||
Returns:
|
||||
Validated model instance
|
||||
|
||||
Raises:
|
||||
ValidationError: If validation fails
|
||||
"""
|
||||
try:
|
||||
return expected_model.model_validate(response_data)
|
||||
except PydanticValidationError as e:
|
||||
logger.error(f"API response validation failed for {operation}: {e}")
|
||||
raise ValidationError(
|
||||
f"API response for {operation}",
|
||||
str(response_data)[:200] + "..." if len(str(response_data)) > 200 else str(response_data),
|
||||
f"valid {expected_model.__name__} format"
|
||||
) from e
|
||||
except Exception as e:
|
||||
logger.error(f"Unexpected error validating API response for {operation}: {e}")
|
||||
raise ValidationError(
|
||||
f"API response for {operation}",
|
||||
"invalid data",
|
||||
f"valid {expected_model.__name__} format"
|
||||
) from e
|
||||
|
||||
|
||||
def validate_sarif_structure(sarif_data: Dict[str, Any]) -> Dict[str, str]:
|
||||
"""
|
||||
Validate basic SARIF structure and return validation issues.
|
||||
|
||||
Args:
|
||||
sarif_data: SARIF data dictionary
|
||||
|
||||
Returns:
|
||||
Dictionary of validation issues found
|
||||
"""
|
||||
issues = {}
|
||||
|
||||
# Check basic SARIF structure
|
||||
if not isinstance(sarif_data, dict):
|
||||
issues["structure"] = "SARIF data is not a dictionary"
|
||||
return issues
|
||||
|
||||
if "runs" not in sarif_data:
|
||||
issues["runs"] = "Missing 'runs' section in SARIF data"
|
||||
elif not isinstance(sarif_data["runs"], list):
|
||||
issues["runs_type"] = "'runs' section is not a list"
|
||||
elif len(sarif_data["runs"]) == 0:
|
||||
issues["runs_empty"] = "'runs' section is empty"
|
||||
else:
|
||||
# Check first run structure
|
||||
run = sarif_data["runs"][0]
|
||||
if not isinstance(run, dict):
|
||||
issues["run_structure"] = "First run is not a dictionary"
|
||||
else:
|
||||
if "results" not in run:
|
||||
issues["results"] = "Missing 'results' section in run"
|
||||
elif not isinstance(run["results"], list):
|
||||
issues["results_type"] = "'results' section is not a list"
|
||||
|
||||
if "tool" not in run:
|
||||
issues["tool"] = "Missing 'tool' section in run"
|
||||
elif not isinstance(run["tool"], dict):
|
||||
issues["tool_type"] = "'tool' section is not a dictionary"
|
||||
|
||||
return issues
|
||||
|
||||
|
||||
def safe_extract_sarif_summary(sarif_data: Dict[str, Any]) -> Dict[str, Any]:
|
||||
"""
|
||||
Safely extract summary information from SARIF data with fallbacks.
|
||||
|
||||
Args:
|
||||
sarif_data: SARIF data dictionary
|
||||
|
||||
Returns:
|
||||
Summary dictionary with safe defaults
|
||||
"""
|
||||
summary = {
|
||||
"total_issues": 0,
|
||||
"by_severity": {},
|
||||
"by_rule": {},
|
||||
"tools": [],
|
||||
"validation_issues": []
|
||||
}
|
||||
|
||||
# Validate structure first
|
||||
validation_issues = validate_sarif_structure(sarif_data)
|
||||
if validation_issues:
|
||||
summary["validation_issues"] = list(validation_issues.values())
|
||||
logger.warning(f"SARIF validation issues: {validation_issues}")
|
||||
|
||||
try:
|
||||
runs = sarif_data.get("runs", [])
|
||||
if not runs:
|
||||
return summary
|
||||
|
||||
run = runs[0]
|
||||
results = run.get("results", [])
|
||||
|
||||
summary["total_issues"] = len(results)
|
||||
|
||||
# Count by severity/level
|
||||
for result in results:
|
||||
try:
|
||||
level = result.get("level", "note")
|
||||
rule_id = result.get("ruleId", "unknown")
|
||||
|
||||
summary["by_severity"][level] = summary["by_severity"].get(level, 0) + 1
|
||||
summary["by_rule"][rule_id] = summary["by_rule"].get(rule_id, 0) + 1
|
||||
except Exception as e:
|
||||
logger.warning(f"Failed to process result: {e}")
|
||||
continue
|
||||
|
||||
# Extract tool information safely
|
||||
try:
|
||||
tool = run.get("tool", {})
|
||||
driver = tool.get("driver", {})
|
||||
if driver.get("name"):
|
||||
summary["tools"].append({
|
||||
"name": driver.get("name", "unknown"),
|
||||
"version": driver.get("version", "unknown"),
|
||||
"rules": len(driver.get("rules", []))
|
||||
})
|
||||
except Exception as e:
|
||||
logger.warning(f"Failed to extract tool information: {e}")
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to extract SARIF summary: {e}")
|
||||
summary["validation_issues"].append(f"Summary extraction failed: {e}")
|
||||
|
||||
return summary
|
||||
|
||||
|
||||
def validate_workflow_parameters(parameters: Dict[str, Any],
|
||||
workflow_schema: Dict[str, Any]) -> List[str]:
|
||||
"""
|
||||
Validate workflow parameters against schema with detailed error messages.
|
||||
|
||||
Args:
|
||||
parameters: Parameters to validate
|
||||
workflow_schema: JSON schema for the workflow
|
||||
|
||||
Returns:
|
||||
List of validation error messages
|
||||
"""
|
||||
errors = []
|
||||
|
||||
try:
|
||||
properties = workflow_schema.get("properties", {})
|
||||
required = set(workflow_schema.get("required", []))
|
||||
|
||||
# Check required parameters
        missing_required = required - set(parameters.keys())
        if missing_required:
            errors.append(f"Missing required parameters: {', '.join(missing_required)}")

        # Validate individual parameters
        for param_name, param_value in parameters.items():
            if param_name not in properties:
                errors.append(f"Unknown parameter: {param_name}")
                continue

            param_schema = properties[param_name]
            param_type = param_schema.get("type", "string")

            # Type validation
            if param_type == "integer" and not isinstance(param_value, int):
                errors.append(f"Parameter '{param_name}' must be an integer")
            elif param_type == "number" and not isinstance(param_value, (int, float)):
                errors.append(f"Parameter '{param_name}' must be a number")
            elif param_type == "boolean" and not isinstance(param_value, bool):
                errors.append(f"Parameter '{param_name}' must be a boolean")
            elif param_type == "array" and not isinstance(param_value, list):
                errors.append(f"Parameter '{param_name}' must be an array")

            # Range validation for numbers
            if param_type in ["integer", "number"] and isinstance(param_value, (int, float)):
                minimum = param_schema.get("minimum")
                maximum = param_schema.get("maximum")

                if minimum is not None and param_value < minimum:
                    errors.append(f"Parameter '{param_name}' must be >= {minimum}")
                if maximum is not None and param_value > maximum:
                    errors.append(f"Parameter '{param_name}' must be <= {maximum}")

    except Exception as e:
        logger.error(f"Parameter validation failed: {e}")
        errors.append(f"Parameter validation error: {e}")

    return errors


def create_fallback_response(response_type: str, **kwargs) -> Dict[str, Any]:
    """
    Create fallback responses when API calls fail.

    Args:
        response_type: Type of response to create
        **kwargs: Additional data for the fallback

    Returns:
        Fallback response dictionary
    """
    fallbacks = {
        "workflow_list": {
            "workflows": [],
            "message": "Unable to fetch workflows from API"
        },
        "run_status": {
            "run_id": kwargs.get("run_id", "unknown"),
            "workflow": kwargs.get("workflow", "unknown"),
            "status": "unknown",
            "created_at": kwargs.get("created_at", "unknown"),
            "updated_at": kwargs.get("updated_at", "unknown"),
            "message": "Unable to fetch run status from API"
        },
        "findings": {
            "run_id": kwargs.get("run_id", "unknown"),
            "sarif": {
                "version": "2.1.0",
                "runs": []
            },
            "message": "Unable to fetch findings from API"
        }
    }

    fallback = fallbacks.get(response_type, {"message": f"No fallback available for {response_type}"})
    logger.info(f"Using fallback response for {response_type}: {fallback.get('message', 'Unknown fallback')}")

    return fallback
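The required/type/range checks above operate on a JSON-Schema-like dict of `properties` and `required` names (the setup comes from the truncated top of the function). A standalone sketch of the same rules; the function name and example schema here are illustrative, not part of the CLI:

```python
from typing import Any, Dict, List


def validate_params_sketch(parameters: Dict[str, Any], schema: Dict[str, Any]) -> List[str]:
    """Re-creation of the visible checks, for experimentation only."""
    errors: List[str] = []
    properties = schema.get("properties", {})
    required = set(schema.get("required", []))

    # Required-parameter check
    missing_required = required - set(parameters.keys())
    if missing_required:
        errors.append(f"Missing required parameters: {', '.join(sorted(missing_required))}")

    for name, value in parameters.items():
        if name not in properties:
            errors.append(f"Unknown parameter: {name}")
            continue
        spec = properties[name]
        ptype = spec.get("type", "string")

        # Type check (note: bool is a subclass of int in Python, so
        # isinstance(True, int) passes the integer check, as in the original)
        if ptype == "integer" and not isinstance(value, int):
            errors.append(f"Parameter '{name}' must be an integer")

        # Range check for numeric parameters
        if ptype in ("integer", "number") and isinstance(value, (int, float)):
            minimum = spec.get("minimum")
            if minimum is not None and value < minimum:
                errors.append(f"Parameter '{name}' must be >= {minimum}")

    return errors


schema = {"required": ["depth"], "properties": {"depth": {"type": "integer", "minimum": 1}}}
print(validate_params_sketch({"depth": 0}, schema))  # range violation
print(validate_params_sketch({}, schema))            # missing required
```

One error per violated rule is accumulated rather than failing fast, matching the list-of-strings return shape used above.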
14
cli/src/fuzzforge_cli/commands/__init__.py
Normal file
@@ -0,0 +1,14 @@
"""
Command modules for FuzzForge CLI.
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
133
cli/src/fuzzforge_cli/commands/ai.py
Normal file
@@ -0,0 +1,133 @@
"""AI integration commands for the FuzzForge CLI."""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.


from __future__ import annotations

import asyncio
import os
from datetime import datetime
from typing import Optional

import typer
from rich.console import Console
from rich.panel import Panel
from rich.table import Table

from ..config import ProjectConfigManager

console = Console()
app = typer.Typer(name="ai", help="Interact with the FuzzForge AI system")


@app.command("agent")
def ai_agent() -> None:
    """Launch the full AI agent CLI with A2A orchestration."""
    console.print("[cyan]🤖 Opening Project FuzzForge AI Agent session[/cyan]\n")

    try:
        from fuzzforge_ai.cli import FuzzForgeCLI

        cli = FuzzForgeCLI()
        asyncio.run(cli.run())
    except ImportError as exc:
        console.print(f"[red]Failed to import AI CLI:[/red] {exc}")
        console.print("[dim]Ensure AI dependencies are installed (pip install -e .)[/dim]")
        raise typer.Exit(1) from exc
    except Exception as exc:  # pragma: no cover - runtime safety
        console.print(f"[red]Failed to launch AI agent:[/red] {exc}")
        console.print("[dim]Check that .env contains LITELLM_MODEL and API keys[/dim]")
        raise typer.Exit(1) from exc


# Memory + health commands
@app.command("status")
def ai_status() -> None:
    """Show AI system health and configuration."""
    try:
        status = asyncio.run(get_ai_status_async())
    except Exception as exc:  # pragma: no cover
        console.print(f"[red]Failed to get AI status:[/red] {exc}")
        raise typer.Exit(1) from exc

    console.print("[bold cyan]🤖 FuzzForge AI System Status[/bold cyan]\n")

    config_table = Table(title="Configuration", show_header=True, header_style="bold magenta")
    config_table.add_column("Setting", style="bold")
    config_table.add_column("Value", style="cyan")
    config_table.add_column("Status", style="green")

    for key, info in status["config"].items():
        status_icon = "✅" if info["configured"] else "❌"
        display_value = info["value"] if info["value"] else "-"
        config_table.add_row(key, display_value, f"{status_icon}")

    console.print(config_table)
    console.print()

    components_table = Table(title="AI Components", show_header=True, header_style="bold magenta")
    components_table.add_column("Component", style="bold")
    components_table.add_column("Status", style="green")
    components_table.add_column("Details", style="dim")

    for component, info in status["components"].items():
        status_icon = "🟢" if info["available"] else "🔴"
        components_table.add_row(component, status_icon, info["details"])

    console.print(components_table)

    if status["agents"]:
        console.print()
        console.print(f"[bold green]✓[/bold green] {len(status['agents'])} agents registered")


@app.command("server")
def ai_server(
    port: int = typer.Option(10100, "--port", "-p", help="Server port (default: 10100)"),
) -> None:
    """Start AI system as an A2A server."""
    console.print(f"[cyan]🚀 Starting FuzzForge AI Server on port {port}[/cyan]")
    console.print("[dim]Other agents can register this instance at the A2A endpoint[/dim]\n")

    try:
        os.environ["FUZZFORGE_PORT"] = str(port)
        from fuzzforge_ai.__main__ import main as start_server

        start_server()
    except Exception as exc:  # pragma: no cover
        console.print(f"[red]Failed to start AI server:[/red] {exc}")
        raise typer.Exit(1) from exc


# ---------------------------------------------------------------------------
# Helper functions (largely adapted from the OSS implementation)
# ---------------------------------------------------------------------------


@app.callback(invoke_without_command=True)
def ai_callback(ctx: typer.Context):
    """
    🤖 AI integration features
    """
    # Check if a subcommand is being invoked
    if ctx.invoked_subcommand is not None:
        # Let the subcommand handle it
        return

    # Show not implemented message for default command
    console.print("🚧 [yellow]AI command is not fully implemented yet.[/yellow]")
    console.print("Please use specific subcommands:")
    console.print("  • [cyan]ff ai agent[/cyan] - Launch the full AI agent CLI")
    console.print("  • [cyan]ff ai status[/cyan] - Show AI system health and configuration")
    console.print("  • [cyan]ff ai server[/cyan] - Start AI system as an A2A server")
384
cli/src/fuzzforge_cli/commands/config.py
Normal file
@@ -0,0 +1,384 @@
"""
Configuration management commands.
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.


import typer
from pathlib import Path
from rich.console import Console
from rich.table import Table
from rich.panel import Panel
from rich.prompt import Prompt, Confirm
from rich import box
from typing import Optional

from ..config import (
    get_project_config,
    ensure_project_config,
    get_global_config,
    save_global_config,
    FuzzForgeConfig
)
from ..exceptions import require_project, ValidationError, handle_error

console = Console()
app = typer.Typer()


@app.command("show")
def show_config(
    global_config: bool = typer.Option(
        False, "--global", "-g",
        help="Show global configuration instead of project config"
    )
):
    """
    📋 Display current configuration settings
    """
    if global_config:
        config = get_global_config()
        config_type = "Global"
        config_path = Path.home() / ".config" / "fuzzforge" / "config.yaml"
    else:
        try:
            require_project()
            config = get_project_config()
            if not config:
                raise ValidationError("project configuration", "missing", "initialized project")
        except Exception as e:
            handle_error(e, "loading project configuration")
            return  # Unreachable, but makes static analysis happy
        config_type = "Project"
        config_path = Path.cwd() / ".fuzzforge" / "config.yaml"

    console.print(f"\n⚙️ [bold]{config_type} Configuration[/bold]\n")

    # Project settings
    project_table = Table(show_header=False, box=box.SIMPLE)
    project_table.add_column("Setting", style="bold cyan")
    project_table.add_column("Value")

    project_table.add_row("Project Name", config.project.name)
    project_table.add_row("API URL", config.project.api_url)
    project_table.add_row("Default Timeout", f"{config.project.default_timeout}s")
    if config.project.default_workflow:
        project_table.add_row("Default Workflow", config.project.default_workflow)

    console.print(
        Panel.fit(
            project_table,
            title="📁 Project Settings",
            box=box.ROUNDED
        )
    )

    # Retention settings
    retention_table = Table(show_header=False, box=box.SIMPLE)
    retention_table.add_column("Setting", style="bold cyan")
    retention_table.add_column("Value")

    retention_table.add_row("Max Runs", str(config.retention.max_runs))
    retention_table.add_row("Keep Findings (days)", str(config.retention.keep_findings_days))

    console.print(
        Panel.fit(
            retention_table,
            title="🗄️ Data Retention",
            box=box.ROUNDED
        )
    )

    # Preferences
    prefs_table = Table(show_header=False, box=box.SIMPLE)
    prefs_table.add_column("Setting", style="bold cyan")
    prefs_table.add_column("Value")

    prefs_table.add_row("Auto Save Findings", "✅ Yes" if config.preferences.auto_save_findings else "❌ No")
    prefs_table.add_row("Show Progress Bars", "✅ Yes" if config.preferences.show_progress_bars else "❌ No")
    prefs_table.add_row("Table Style", config.preferences.table_style)
    prefs_table.add_row("Color Output", "✅ Yes" if config.preferences.color_output else "❌ No")

    console.print(
        Panel.fit(
            prefs_table,
            title="🎨 Preferences",
            box=box.ROUNDED
        )
    )

    console.print(f"\n📍 Config file: [dim]{config_path}[/dim]")


@app.command("set")
def set_config(
    key: str = typer.Argument(..., help="Configuration key to set (e.g., 'project.name', 'project.api_url')"),
    value: str = typer.Argument(..., help="Value to set"),
    global_config: bool = typer.Option(
        False, "--global", "-g",
        help="Set in global configuration instead of project config"
    )
):
    """
    ⚙️ Set a configuration value
    """
    if global_config:
        config = get_global_config()
        config_type = "global"
    else:
        config = get_project_config()
        if not config:
            console.print("❌ No project configuration found. Run 'ff init' first.", style="red")
            raise typer.Exit(1)
        config_type = "project"

    # Parse the key path
    key_parts = key.split('.')
    if len(key_parts) != 2:
        console.print("❌ Key must be in format 'section.setting' (e.g., 'project.name')", style="red")
        raise typer.Exit(1)

    section, setting = key_parts

    try:
        # Update configuration
        if section == "project":
            if setting == "name":
                config.project.name = value
            elif setting == "api_url":
                config.project.api_url = value
            elif setting == "default_timeout":
                config.project.default_timeout = int(value)
            elif setting == "default_workflow":
                config.project.default_workflow = value if value.lower() != "none" else None
            else:
                console.print(f"❌ Unknown project setting: {setting}", style="red")
                raise typer.Exit(1)

        elif section == "retention":
            if setting == "max_runs":
                config.retention.max_runs = int(value)
            elif setting == "keep_findings_days":
                config.retention.keep_findings_days = int(value)
            else:
                console.print(f"❌ Unknown retention setting: {setting}", style="red")
                raise typer.Exit(1)

        elif section == "preferences":
            if setting == "auto_save_findings":
                config.preferences.auto_save_findings = value.lower() in ("true", "yes", "1", "on")
            elif setting == "show_progress_bars":
                config.preferences.show_progress_bars = value.lower() in ("true", "yes", "1", "on")
            elif setting == "table_style":
                config.preferences.table_style = value
            elif setting == "color_output":
                config.preferences.color_output = value.lower() in ("true", "yes", "1", "on")
            else:
                console.print(f"❌ Unknown preferences setting: {setting}", style="red")
                raise typer.Exit(1)

        else:
            console.print(f"❌ Unknown configuration section: {section}", style="red")
            console.print("Valid sections: project, retention, preferences", style="dim")
            raise typer.Exit(1)

        # Save configuration
        if global_config:
            save_global_config(config)
        else:
            config_path = Path.cwd() / ".fuzzforge" / "config.yaml"
            config.save_to_file(config_path)

        console.print(f"✅ Set {config_type} configuration: [bold cyan]{key}[/bold cyan] = [bold]{value}[/bold]", style="green")

    except ValueError as e:
        console.print(f"❌ Invalid value for {key}: {e}", style="red")
        raise typer.Exit(1)
    except Exception as e:
        console.print(f"❌ Failed to set configuration: {e}", style="red")
        raise typer.Exit(1)


@app.command("get")
def get_config(
    key: str = typer.Argument(..., help="Configuration key to get (e.g., 'project.name')"),
    global_config: bool = typer.Option(
        False, "--global", "-g",
        help="Get from global configuration instead of project config"
    )
):
    """
    📖 Get a specific configuration value
    """
    if global_config:
        config = get_global_config()
    else:
        config = get_project_config()
        if not config:
            console.print("❌ No project configuration found. Run 'ff init' first.", style="red")
            raise typer.Exit(1)

    # Parse the key path
    key_parts = key.split('.')
    if len(key_parts) != 2:
        console.print("❌ Key must be in format 'section.setting' (e.g., 'project.name')", style="red")
        raise typer.Exit(1)

    section, setting = key_parts

    try:
        # Get configuration value
        if section == "project":
            if setting == "name":
                value = config.project.name
            elif setting == "api_url":
                value = config.project.api_url
            elif setting == "default_timeout":
                value = config.project.default_timeout
            elif setting == "default_workflow":
                value = config.project.default_workflow or "none"
            else:
                console.print(f"❌ Unknown project setting: {setting}", style="red")
                raise typer.Exit(1)

        elif section == "retention":
            if setting == "max_runs":
                value = config.retention.max_runs
            elif setting == "keep_findings_days":
                value = config.retention.keep_findings_days
            else:
                console.print(f"❌ Unknown retention setting: {setting}", style="red")
                raise typer.Exit(1)

        elif section == "preferences":
            if setting == "auto_save_findings":
                value = config.preferences.auto_save_findings
            elif setting == "show_progress_bars":
                value = config.preferences.show_progress_bars
            elif setting == "table_style":
                value = config.preferences.table_style
            elif setting == "color_output":
                value = config.preferences.color_output
            else:
                console.print(f"❌ Unknown preferences setting: {setting}", style="red")
                raise typer.Exit(1)

        else:
            console.print(f"❌ Unknown configuration section: {section}", style="red")
            raise typer.Exit(1)

        console.print(f"{key}: [bold cyan]{value}[/bold cyan]")

    except Exception as e:
        console.print(f"❌ Failed to get configuration: {e}", style="red")
        raise typer.Exit(1)


@app.command("reset")
def reset_config(
    global_config: bool = typer.Option(
        False, "--global", "-g",
        help="Reset global configuration instead of project config"
    ),
    force: bool = typer.Option(
        False, "--force", "-f",
        help="Skip confirmation prompt"
    )
):
    """
    🔄 Reset configuration to defaults
    """
    config_type = "global" if global_config else "project"

    if not force:
        if not Confirm.ask(f"Reset {config_type} configuration to defaults?", default=False, console=console):
            console.print("❌ Reset cancelled", style="yellow")
            raise typer.Exit(0)

    try:
        # Create new default configuration
        new_config = FuzzForgeConfig()

        if global_config:
            save_global_config(new_config)
        else:
            if not Path.cwd().joinpath(".fuzzforge").exists():
                console.print("❌ No project configuration found. Run 'ff init' first.", style="red")
                raise typer.Exit(1)

            config_path = Path.cwd() / ".fuzzforge" / "config.yaml"
            new_config.save_to_file(config_path)

        console.print(f"✅ {config_type.title()} configuration reset to defaults", style="green")

    except Exception as e:
        console.print(f"❌ Failed to reset configuration: {e}", style="red")
        raise typer.Exit(1)


@app.command("edit")
def edit_config(
    global_config: bool = typer.Option(
        False, "--global", "-g",
        help="Edit global configuration instead of project config"
    )
):
    """
    📝 Open configuration file in default editor
    """
    import os
    import subprocess

    if global_config:
        config_path = Path.home() / ".config" / "fuzzforge" / "config.yaml"
        config_type = "global"
    else:
        config_path = Path.cwd() / ".fuzzforge" / "config.yaml"
        config_type = "project"

    if not config_path.exists():
        console.print("❌ No project configuration found. Run 'ff init' first.", style="red")
        raise typer.Exit(1)

    # Try to find a suitable editor
    editors = ["code", "vim", "nano", "notepad"]
    editor = None

    for e in editors:
        try:
            subprocess.run([e, "--version"], capture_output=True, check=True)
            editor = e
            break
        except (subprocess.CalledProcessError, FileNotFoundError):
            continue

    if not editor:
        console.print(f"📍 Configuration file: [bold cyan]{config_path}[/bold cyan]")
        console.print("❌ No suitable editor found. Please edit the file manually.", style="red")
        raise typer.Exit(1)

    try:
        console.print(f"📝 Opening {config_type} configuration in {editor}...")
        subprocess.run([editor, str(config_path)], check=True)
        console.print(f"✅ Configuration file edited", style="green")

    except subprocess.CalledProcessError as e:
        console.print(f"❌ Failed to open editor: {e}", style="red")
        raise typer.Exit(1)


@app.callback()
def config_callback():
    """
    ⚙️ Manage configuration settings
    """
    pass
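The boolean preferences in `set_config` above repeat the same truthy-string test four times. A small helper keeps that list in one place; the helper name here is hypothetical, not part of the CLI:

```python
def parse_bool(value: str) -> bool:
    """Interpret the truthy strings accepted by `ff config set` for booleans."""
    return value.lower() in ("true", "yes", "1", "on")


# Any casing of the accepted tokens is truthy; everything else is falsy.
print(parse_bool("Yes"), parse_bool("off"))  # True False
```

Note that this scheme treats unrecognized strings (including typos like "ture") as `False` rather than rejecting them, which mirrors the behavior of the inline checks.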
940
cli/src/fuzzforge_cli/commands/findings.py
Normal file
@@ -0,0 +1,940 @@
"""
Findings and security results management commands.
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.


import json
import csv
from datetime import datetime
from pathlib import Path
from typing import Optional, Dict, Any, List

import typer
from rich.console import Console
from rich.table import Table, Column
from rich.panel import Panel
from rich.syntax import Syntax
from rich.tree import Tree
from rich.text import Text
from rich import box

from ..config import get_project_config, FuzzForgeConfig
from ..database import get_project_db, ensure_project_db, FindingRecord
from ..exceptions import (
    handle_error, retry_on_network_error, validate_run_id,
    require_project, ValidationError, DatabaseError
)
from fuzzforge_sdk import FuzzForgeClient

console = Console()
app = typer.Typer()


@retry_on_network_error(max_retries=3, delay=1.0)
def get_client() -> FuzzForgeClient:
    """Get configured FuzzForge client with retry on network errors"""
    config = get_project_config() or FuzzForgeConfig()
    return FuzzForgeClient(base_url=config.get_api_url(), timeout=config.get_timeout())


def severity_style(severity: str) -> str:
    """Get rich style for severity level"""
    return {
        "error": "bold red",
        "warning": "bold yellow",
        "note": "bold blue",
        "info": "bold cyan"
    }.get(severity.lower(), "white")


@app.command("get")
def get_findings(
    run_id: str = typer.Argument(..., help="Run ID to get findings for"),
    save: bool = typer.Option(
        True, "--save/--no-save",
        help="Save findings to local database"
    ),
    format: str = typer.Option(
        "table", "--format", "-f",
        help="Output format: table, json, sarif"
    )
):
    """
    🔍 Retrieve and display security findings for a run
    """
    try:
        require_project()
        validate_run_id(run_id)

        if format not in ["table", "json", "sarif"]:
            raise ValidationError("format", format, "one of: table, json, sarif")

        with get_client() as client:
            console.print(f"🔍 Fetching findings for run: {run_id}")
            findings = client.get_run_findings(run_id)

        # Save to database if requested
        if save:
            try:
                db = ensure_project_db()

                # Extract summary from SARIF
                sarif_data = findings.sarif
                runs_data = sarif_data.get("runs", [])
                summary = {}

                if runs_data:
                    results = runs_data[0].get("results", [])
                    summary = {
                        "total_issues": len(results),
                        "by_severity": {},
                        "by_rule": {},
                        "tools": []
                    }

                    for result in results:
                        level = result.get("level", "note")
                        rule_id = result.get("ruleId", "unknown")

                        summary["by_severity"][level] = summary["by_severity"].get(level, 0) + 1
                        summary["by_rule"][rule_id] = summary["by_rule"].get(rule_id, 0) + 1

                    # Extract tool info
                    tool = runs_data[0].get("tool", {})
                    driver = tool.get("driver", {})
                    if driver.get("name"):
                        summary["tools"].append({
                            "name": driver.get("name"),
                            "version": driver.get("version"),
                            "rules": len(driver.get("rules", []))
                        })

                finding_record = FindingRecord(
                    run_id=run_id,
                    sarif_data=sarif_data,
                    summary=summary,
                    created_at=datetime.now()
                )
                db.save_findings(finding_record)
                console.print("✅ Findings saved to local database", style="green")
            except Exception as e:
                console.print(f"⚠️ Failed to save findings to database: {e}", style="yellow")

        # Display findings
        if format == "json":
            findings_json = json.dumps(findings.sarif, indent=2)
            console.print(Syntax(findings_json, "json", theme="monokai"))

        elif format == "sarif":
            sarif_json = json.dumps(findings.sarif, indent=2)
            console.print(sarif_json)

        else:  # table format
            display_findings_table(findings.sarif)

    except Exception as e:
        console.print(f"❌ Failed to get findings: {e}", style="red")
        raise typer.Exit(1)


def display_findings_table(sarif_data: Dict[str, Any]):
    """Display SARIF findings in a rich table format"""
    runs = sarif_data.get("runs", [])
    if not runs:
        console.print("ℹ️ No findings data available", style="dim")
        return

    run_data = runs[0]
    results = run_data.get("results", [])
    tool = run_data.get("tool", {})
    driver = tool.get("driver", {})

    # Tool information
    console.print(f"\n🔍 [bold]Security Analysis Results[/bold]")
    if driver.get("name"):
        console.print(f"Tool: {driver.get('name')} v{driver.get('version', 'unknown')}")

    if not results:
        console.print("✅ No security issues found!", style="green")
        return

    # Summary statistics
    summary_by_level = {}
    for result in results:
        level = result.get("level", "note")
        summary_by_level[level] = summary_by_level.get(level, 0) + 1

    summary_table = Table(show_header=False, box=box.SIMPLE)
    summary_table.add_column("Severity", width=15, justify="left", style="bold")
    summary_table.add_column("Count", width=8, justify="right", style="bold")

    for level, count in sorted(summary_by_level.items()):
        # Create Rich Text object with color styling
        level_text = level.upper()
        severity_text = Text(level_text, style=severity_style(level))
        count_text = Text(str(count))

        summary_table.add_row(severity_text, count_text)

    console.print(
        Panel.fit(
            summary_table,
            title=f"📊 Summary ({len(results)} total issues)",
            box=box.ROUNDED
        )
    )

    # Detailed results - Rich Text-based table with proper emoji alignment
    results_table = Table(box=box.ROUNDED)
    results_table.add_column("Severity", width=12, justify="left", no_wrap=True)
    results_table.add_column("Rule", width=25, justify="left", style="bold cyan", no_wrap=True)
    results_table.add_column("Message", width=55, justify="left", no_wrap=True)
    results_table.add_column("Location", width=20, justify="left", style="dim", no_wrap=True)

    for result in results[:50]:  # Limit to first 50 results
        level = result.get("level", "note")
        rule_id = result.get("ruleId", "unknown")
        message = result.get("message", {}).get("text", "No message")

        # Extract location information
        locations = result.get("locations", [])
        location_str = ""
        if locations:
            physical_location = locations[0].get("physicalLocation", {})
            artifact_location = physical_location.get("artifactLocation", {})
            region = physical_location.get("region", {})

            file_path = artifact_location.get("uri", "")
            if file_path:
                location_str = Path(file_path).name
                if region.get("startLine"):
                    location_str += f":{region['startLine']}"
                    if region.get("startColumn"):
                        location_str += f":{region['startColumn']}"

        # Create Rich Text objects with color styling
        severity_text = Text(level.upper(), style=severity_style(level))
        severity_text.truncate(12, overflow="ellipsis")

        rule_text = Text(rule_id)
        rule_text.truncate(25, overflow="ellipsis")

        message_text = Text(message)
        message_text.truncate(55, overflow="ellipsis")

        location_text = Text(location_str)
        location_text.truncate(20, overflow="ellipsis")

        results_table.add_row(
            severity_text,
            rule_text,
            message_text,
            location_text
        )

    console.print(f"\n📋 [bold]Detailed Results[/bold]")
    if len(results) > 50:
        console.print(f"Showing first 50 of {len(results)} results")
    console.print()
    console.print(results_table)
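The per-level tally that `display_findings_table` builds can be checked against a toy SARIF document; the scanner name and rule IDs below are made up for illustration, but the shape matches what the function reads:

```python
# Toy SARIF 2.1.0 document shaped like the input to display_findings_table.
sarif = {
    "version": "2.1.0",
    "runs": [{
        "tool": {"driver": {"name": "demo-scanner", "version": "1.0"}},
        "results": [
            {"level": "error", "ruleId": "R1", "message": {"text": "a"}},
            {"level": "warning", "ruleId": "R2", "message": {"text": "b"}},
            {"level": "error", "ruleId": "R1", "message": {"text": "c"}},
            {"ruleId": "R3", "message": {"text": "d"}},  # no level -> defaults to "note"
        ]
    }]
}

# Same aggregation the table code performs over runs[0]["results"].
summary_by_level = {}
for result in sarif["runs"][0].get("results", []):
    level = result.get("level", "note")
    summary_by_level[level] = summary_by_level.get(level, 0) + 1

print(summary_by_level)  # {'error': 2, 'warning': 1, 'note': 1}
```

Defaulting a missing `level` to "note" follows the SARIF convention the code relies on.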
@app.command("history")
def findings_history(
    limit: int = typer.Option(20, "--limit", "-l", help="Maximum number of findings to show")
):
    """
    📚 Show findings history from local database
    """
    db = get_project_db()
    if not db:
        console.print("❌ No FuzzForge project found. Run 'ff init' first.", style="red")
        raise typer.Exit(1)

    try:
        findings = db.list_findings(limit=limit)

        if not findings:
            console.print("❌ No findings found in database", style="red")
            return

        table = Table(box=box.ROUNDED)
        table.add_column("Run ID", style="bold cyan", width=36)  # Full UUID width
        table.add_column("Date", justify="center")
        table.add_column("Total Issues", justify="center", style="bold")
        table.add_column("Errors", justify="center", style="red")
        table.add_column("Warnings", justify="center", style="yellow")
        table.add_column("Notes", justify="center", style="blue")
        table.add_column("Tools", style="dim")

        for finding in findings:
            summary = finding.summary
            total_issues = summary.get("total_issues", 0)
            by_severity = summary.get("by_severity", {})
            tools = summary.get("tools", [])

            tool_names = ", ".join([tool.get("name", "Unknown") for tool in tools])

            table.add_row(
                finding.run_id,  # Show full Run ID
                finding.created_at.strftime("%m-%d %H:%M"),
                str(total_issues),
                str(by_severity.get("error", 0)),
                str(by_severity.get("warning", 0)),
                str(by_severity.get("note", 0)),
                tool_names[:30] + "..." if len(tool_names) > 30 else tool_names
            )

        console.print(f"\n📚 [bold]Findings History ({len(findings)})[/bold]\n")
        console.print(table)

        console.print(f"\n💡 Use [bold cyan]fuzzforge finding <run-id>[/bold cyan] to view detailed findings")

    except Exception as e:
        console.print(f"❌ Failed to get findings history: {e}", style="red")
        raise typer.Exit(1)


@app.command("export")
def export_findings(
    run_id: str = typer.Argument(..., help="Run ID to export findings for"),
    format: str = typer.Option(
        "json", "--format", "-f",
        help="Export format: json, csv, html, sarif"
    ),
    output: Optional[str] = typer.Option(
        None, "--output", "-o",
        help="Output file path (defaults to findings-<run-id>.<format>)"
    )
):
    """
    📤 Export security findings in various formats
    """
    db = get_project_db()
    if not db:
        console.print("❌ No FuzzForge project found. Run 'ff init' first.", style="red")
        raise typer.Exit(1)

    try:
        # Get findings from database first, fallback to API
        findings_data = db.get_findings(run_id)
        if not findings_data:
            console.print(f"📡 Fetching findings from API for run: {run_id}")
            with get_client() as client:
                findings = client.get_run_findings(run_id)
                sarif_data = findings.sarif
        else:
            sarif_data = findings_data.sarif_data

        # Generate output filename
        if not output:
            output = f"findings-{run_id[:8]}.{format}"

        output_path = Path(output)

        # Export based on format
        if format == "sarif":
            with open(output_path, 'w') as f:
                json.dump(sarif_data, f, indent=2)

        elif format == "json":
            # Simplified JSON format
            simplified_data = extract_simplified_findings(sarif_data)
            with open(output_path, 'w') as f:
                json.dump(simplified_data, f, indent=2)

        elif format == "csv":
            export_to_csv(sarif_data, output_path)

        elif format == "html":
            export_to_html(sarif_data, output_path, run_id)

        else:
            console.print(f"❌ Unsupported format: {format}", style="red")
            raise typer.Exit(1)

        console.print(f"✅ Findings exported to: [bold cyan]{output_path}[/bold cyan]")
|
||||
|
||||
except Exception as e:
|
||||
console.print(f"❌ Failed to export findings: {e}", style="red")
|
||||
raise typer.Exit(1)
|
||||
|
||||
|
||||
def extract_simplified_findings(sarif_data: Dict[str, Any]) -> Dict[str, Any]:
|
||||
"""Extract simplified findings structure from SARIF"""
|
||||
runs = sarif_data.get("runs", [])
|
||||
if not runs:
|
||||
return {"findings": [], "summary": {}}
|
||||
|
||||
run_data = runs[0]
|
||||
results = run_data.get("results", [])
|
||||
tool = run_data.get("tool", {}).get("driver", {})
|
||||
|
||||
simplified = {
|
||||
"tool": {
|
||||
"name": tool.get("name", "Unknown"),
|
||||
"version": tool.get("version", "Unknown")
|
||||
},
|
||||
"summary": {
|
||||
"total_issues": len(results),
|
||||
"by_severity": {}
|
||||
},
|
||||
"findings": []
|
||||
}
|
||||
|
||||
for result in results:
|
||||
level = result.get("level", "note")
|
||||
simplified["summary"]["by_severity"][level] = simplified["summary"]["by_severity"].get(level, 0) + 1
|
||||
|
||||
# Extract location
|
||||
location_info = {}
|
||||
locations = result.get("locations", [])
|
||||
if locations:
|
||||
physical_location = locations[0].get("physicalLocation", {})
|
||||
artifact_location = physical_location.get("artifactLocation", {})
|
||||
region = physical_location.get("region", {})
|
||||
|
||||
location_info = {
|
||||
"file": artifact_location.get("uri", ""),
|
||||
"line": region.get("startLine"),
|
||||
"column": region.get("startColumn")
|
||||
}
|
||||
|
||||
simplified["findings"].append({
|
||||
"rule_id": result.get("ruleId", "unknown"),
|
||||
"severity": level,
|
||||
"message": result.get("message", {}).get("text", ""),
|
||||
"location": location_info
|
||||
})
|
||||
|
||||
return simplified
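For reference, a minimal SARIF 2.1.0-shaped document exercising the fields read above looks like this; the tool name, rule ID, and file path are illustrative, not taken from FuzzForge output:

```python
import json

# Minimal SARIF-shaped document with one result (illustrative values only).
sarif_data = {
    "runs": [{
        "tool": {"driver": {"name": "demo-scanner", "version": "1.0"}},
        "results": [{
            "ruleId": "DEMO001",
            "level": "warning",
            "message": {"text": "Example issue"},
            "locations": [{
                "physicalLocation": {
                    "artifactLocation": {"uri": "src/app.py"},
                    "region": {"startLine": 10, "startColumn": 4},
                }
            }],
        }],
    }]
}

# The same defensive .get() walk used by extract_simplified_findings:
result = sarif_data["runs"][0]["results"][0]
physical = result["locations"][0].get("physicalLocation", {})
location = {
    "file": physical.get("artifactLocation", {}).get("uri", ""),
    "line": physical.get("region", {}).get("startLine"),
}
print(json.dumps(location))
```

Every lookup tolerates a missing key, which is why malformed or partial SARIF degrades to empty strings and `None` rather than raising.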


def export_to_csv(sarif_data: Dict[str, Any], output_path: Path):
    """Export findings to CSV format"""
    runs = sarif_data.get("runs", [])
    if not runs:
        return

    results = runs[0].get("results", [])

    with open(output_path, 'w', newline='', encoding='utf-8') as csvfile:
        fieldnames = ['rule_id', 'severity', 'message', 'file', 'line', 'column']
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
        writer.writeheader()

        for result in results:
            location_info = {"file": "", "line": "", "column": ""}
            locations = result.get("locations", [])
            if locations:
                physical_location = locations[0].get("physicalLocation", {})
                artifact_location = physical_location.get("artifactLocation", {})
                region = physical_location.get("region", {})

                location_info = {
                    "file": artifact_location.get("uri", ""),
                    "line": region.get("startLine", ""),
                    "column": region.get("startColumn", "")
                }

            writer.writerow({
                "rule_id": result.get("ruleId", ""),
                "severity": result.get("level", "note"),
                "message": result.get("message", {}).get("text", ""),
                **location_info
            })
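The `csv.DictWriter` pattern above can be checked in isolation. This sketch writes one row with the same header layout to an in-memory buffer and reads it back; all values are illustrative:

```python
import csv
import io

# Same header layout as export_to_csv, written to an in-memory buffer.
buf = io.StringIO()
writer = csv.DictWriter(
    buf, fieldnames=['rule_id', 'severity', 'message', 'file', 'line', 'column']
)
writer.writeheader()
writer.writerow({
    "rule_id": "DEMO001",
    "severity": "warning",
    "message": "Example issue",
    "file": "src/app.py",
    "line": 10,
    "column": 4,
})

# CSV is untyped: numeric fields come back as strings.
rows = list(csv.DictReader(io.StringIO(buf.getvalue())))
print(rows[0]["file"])  # src/app.py
```

Note that `newline=''` in the real function is required by the `csv` module to avoid doubled line endings on Windows.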


def export_to_html(sarif_data: Dict[str, Any], output_path: Path, run_id: str):
    """Export findings to HTML format"""
    runs = sarif_data.get("runs", [])
    if not runs:
        return

    run_data = runs[0]
    results = run_data.get("results", [])
    tool = run_data.get("tool", {}).get("driver", {})

    # Simple HTML template
    html_content = f"""<!DOCTYPE html>
<html>
<head>
    <title>Security Findings - {run_id}</title>
    <style>
        body {{ font-family: Arial, sans-serif; margin: 40px; }}
        .header {{ background: #f4f4f4; padding: 20px; border-radius: 5px; }}
        .summary {{ margin: 20px 0; }}
        .findings {{ margin: 20px 0; }}
        table {{ width: 100%; border-collapse: collapse; }}
        th, td {{ padding: 10px; text-align: left; border-bottom: 1px solid #ddd; }}
        th {{ background-color: #f2f2f2; }}
        .error {{ color: #d32f2f; }}
        .warning {{ color: #f57c00; }}
        .note {{ color: #1976d2; }}
        .info {{ color: #388e3c; }}
    </style>
</head>
<body>
    <div class="header">
        <h1>Security Findings Report</h1>
        <p><strong>Run ID:</strong> {run_id}</p>
        <p><strong>Tool:</strong> {tool.get('name', 'Unknown')} v{tool.get('version', 'Unknown')}</p>
        <p><strong>Generated:</strong> {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}</p>
    </div>

    <div class="summary">
        <h2>Summary</h2>
        <p><strong>Total Issues:</strong> {len(results)}</p>
    </div>

    <div class="findings">
        <h2>Detailed Findings</h2>
        <table>
            <thead>
                <tr>
                    <th>Rule ID</th>
                    <th>Severity</th>
                    <th>Message</th>
                    <th>Location</th>
                </tr>
            </thead>
            <tbody>
"""

    for result in results:
        level = result.get("level", "note")
        rule_id = result.get("ruleId", "unknown")
        message = result.get("message", {}).get("text", "")

        # Extract location
        location_str = ""
        locations = result.get("locations", [])
        if locations:
            physical_location = locations[0].get("physicalLocation", {})
            artifact_location = physical_location.get("artifactLocation", {})
            region = physical_location.get("region", {})

            file_path = artifact_location.get("uri", "")
            if file_path:
                location_str = file_path
                if region.get("startLine"):
                    location_str += f":{region['startLine']}"

        html_content += f"""
                <tr>
                    <td>{rule_id}</td>
                    <td class="{level}">{level}</td>
                    <td>{message}</td>
                    <td>{location_str}</td>
                </tr>
"""

    html_content += """
            </tbody>
        </table>
    </div>
</body>
</html>
"""

    with open(output_path, 'w', encoding='utf-8') as f:
        f.write(html_content)
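One caveat: the template interpolates finding messages straight into HTML, so a message containing markup would break or script the report. A suggested hardening (not what `export_to_html` currently does) is to pass untrusted text through `html.escape` before interpolation:

```python
import html

# Findings text can contain markup; escaping before interpolation keeps the
# report well-formed. The message below is an illustrative payload.
message = 'Possible XSS: "<script>alert(1)</script>" in template'
cell = f"<td>{html.escape(message)}</td>"
print(cell)
```

`html.escape` also escapes quotes by default, which matters for attribute values such as the `class="{level}"` cell.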


@app.command("all")
def all_findings(
    workflow: Optional[str] = typer.Option(
        None, "--workflow", "-w",
        help="Filter by workflow name"
    ),
    severity: Optional[str] = typer.Option(
        None, "--severity", "-s",
        help="Filter by severity levels (comma-separated: error,warning,note,info)"
    ),
    since: Optional[str] = typer.Option(
        None, "--since",
        help="Show findings since date (YYYY-MM-DD)"
    ),
    limit: Optional[int] = typer.Option(
        None, "--limit", "-l",
        help="Maximum number of findings to show"
    ),
    export_format: Optional[str] = typer.Option(
        None, "--export", "-e",
        help="Export format: json, csv, html"
    ),
    output: Optional[str] = typer.Option(
        None, "--output", "-o",
        help="Output file for export"
    ),
    stats_only: bool = typer.Option(
        False, "--stats",
        help="Show statistics only"
    ),
    show_findings: bool = typer.Option(
        False, "--show-findings", "-f",
        help="Show actual findings content, not just summary"
    ),
    max_findings: int = typer.Option(
        50, "--max-findings",
        help="Maximum number of individual findings to display"
    )
):
    """
    📊 Show all findings for the entire project
    """
    db = get_project_db()
    if not db:
        console.print("❌ No FuzzForge project found. Run 'ff init' first.", style="red")
        raise typer.Exit(1)

    try:
        # Parse filters
        severity_list = None
        if severity:
            severity_list = [s.strip().lower() for s in severity.split(",")]

        since_date = None
        if since:
            try:
                since_date = datetime.strptime(since, "%Y-%m-%d")
            except ValueError:
                console.print(f"❌ Invalid date format: {since}. Use YYYY-MM-DD", style="red")
                raise typer.Exit(1)

        # Get aggregated stats
        stats = db.get_aggregated_stats()

        # Show statistics
        if stats_only or not export_format:
            # Create summary panel
            summary_text = f"""[bold]📊 Project Security Summary[/bold]

[cyan]Total Findings Records:[/cyan] {stats['total_findings_records']}
[cyan]Total Runs Analyzed:[/cyan] {stats['total_runs']}
[cyan]Total Security Issues:[/cyan] {stats['total_issues']}
[cyan]Recent Findings (7 days):[/cyan] {stats['recent_findings']}

[bold]Severity Distribution:[/bold]
  🔴 Errors: {stats['severity_distribution'].get('error', 0)}
  🟡 Warnings: {stats['severity_distribution'].get('warning', 0)}
  🔵 Notes: {stats['severity_distribution'].get('note', 0)}
  ℹ️ Info: {stats['severity_distribution'].get('info', 0)}

[bold]By Workflow:[/bold]"""

            for wf_name, count in stats['workflows'].items():
                summary_text += f"\n  • {wf_name}: {count} findings"

            console.print(Panel(summary_text, box=box.ROUNDED, title="FuzzForge Project Analysis", border_style="cyan"))

            if stats_only:
                return

        # Get all findings with filters
        findings = db.get_all_findings(
            workflow=workflow,
            severity=severity_list,
            since_date=since_date,
            limit=limit
        )

        if not findings:
            console.print("ℹ️ No findings match the specified filters", style="dim")
            return

        # Export if requested
        if export_format:
            if not output:
                timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
                output = f"all_findings_{timestamp}.{export_format}"

            export_all_findings(findings, export_format, output)
            console.print(f"✅ Exported {len(findings)} findings to: {output}", style="green")
            return

        # Display findings table
        table = Table(box=box.ROUNDED, title=f"All Project Findings ({len(findings)} records)")
        table.add_column("Run ID", style="bold cyan", width=36)  # Full UUID width
        table.add_column("Workflow", style="dim", width=20)
        table.add_column("Date", justify="center")
        table.add_column("Issues", justify="center", style="bold")
        table.add_column("Errors", justify="center", style="red")
        table.add_column("Warnings", justify="center", style="yellow")
        table.add_column("Notes", justify="center", style="blue")

        # Get run info for each finding
        runs_info = {}
        for finding in findings:
            run_id = finding.run_id
            if run_id not in runs_info:
                run_info = db.get_run(run_id)
                runs_info[run_id] = run_info

        for finding in findings:
            run_id = finding.run_id
            run_info = runs_info.get(run_id)
            workflow_name = run_info.workflow if run_info else "unknown"

            summary = finding.summary
            total_issues = summary.get("total_issues", 0)
            by_severity = summary.get("by_severity", {})

            # Count issues from SARIF data if summary is incomplete
            if total_issues == 0 and "runs" in finding.sarif_data:
                for run in finding.sarif_data["runs"]:
                    total_issues += len(run.get("results", []))

            table.add_row(
                run_id,  # Show full Run ID
                workflow_name[:17] + "..." if len(workflow_name) > 20 else workflow_name,
                finding.created_at.strftime("%Y-%m-%d %H:%M"),
                str(total_issues),
                str(by_severity.get("error", 0)),
                str(by_severity.get("warning", 0)),
                str(by_severity.get("note", 0))
            )

        console.print(table)

        # Show actual findings if requested
        if show_findings:
            display_detailed_findings(findings, max_findings)

        console.print("\n💡 Use filters to refine results: --workflow, --severity, --since")
        console.print("💡 Show findings content: --show-findings")
        console.print("💡 Export findings: --export json --output report.json")
        console.print("💡 View specific findings: [bold cyan]fuzzforge finding <run-id>[/bold cyan]")

    except Exception as e:
        console.print(f"❌ Failed to get all findings: {e}", style="red")
        raise typer.Exit(1)


def display_detailed_findings(findings: List[FindingRecord], max_findings: int):
    """Display detailed findings content"""
    console.print(f"\n📋 [bold]Detailed Findings Content[/bold] (showing up to {max_findings} findings)\n")

    findings_count = 0

    for finding_record in findings:
        if findings_count >= max_findings:
            remaining = sum(len(run.get("results", []))
                            for f in findings[findings.index(finding_record):]
                            for run in f.sarif_data.get("runs", []))
            if remaining > 0:
                console.print(f"\n... and {remaining} more findings (use --max-findings to show more)")
            break

        # Get run info for this finding
        sarif_data = finding_record.sarif_data
        if not sarif_data or "runs" not in sarif_data:
            continue

        for run in sarif_data["runs"]:
            tool = run.get("tool", {})
            driver = tool.get("driver", {})
            tool_name = driver.get("name", "Unknown Tool")

            results = run.get("results", [])
            if not results:
                continue

            # Group results by severity
            for result in results:
                if findings_count >= max_findings:
                    break

                findings_count += 1

                # Extract key information
                rule_id = result.get("ruleId", "unknown")
                level = result.get("level", "note").upper()
                message_text = result.get("message", {}).get("text", "No description")

                # Get location information
                locations = result.get("locations", [])
                location_str = "Unknown location"
                if locations:
                    physical = locations[0].get("physicalLocation", {})
                    artifact = physical.get("artifactLocation", {})
                    region = physical.get("region", {})

                    file_path = artifact.get("uri", "")
                    line_number = region.get("startLine", "")

                    if file_path:
                        location_str = f"{file_path}"
                        if line_number:
                            location_str += f":{line_number}"

                # Get severity style
                severity_style = {
                    "ERROR": "bold red",
                    "WARNING": "bold yellow",
                    "NOTE": "bold blue",
                    "INFO": "bold cyan"
                }.get(level, "white")

                # Create finding panel
                finding_content = f"""[bold]Rule:[/bold] {rule_id}
[bold]Location:[/bold] {location_str}
[bold]Tool:[/bold] {tool_name}
[bold]Run:[/bold] {finding_record.run_id[:12]}...

[bold]Description:[/bold]
{message_text}"""

                # Add code context if available
                region = locations[0].get("physicalLocation", {}).get("region", {}) if locations else {}
                if region.get("snippet", {}).get("text"):
                    code_snippet = region["snippet"]["text"].strip()
                    finding_content += f"\n\n[bold]Code:[/bold]\n[dim]{code_snippet}[/dim]"

                console.print(Panel(
                    finding_content,
                    title=f"[{severity_style}]{level}[/{severity_style}] Finding #{findings_count}",
                    border_style=severity_style.split()[-1] if " " in severity_style else severity_style,
                    box=box.ROUNDED
                ))

                console.print()  # Add spacing between findings


def export_all_findings(findings: List[FindingRecord], format: str, output_path: str):
    """Export all findings to specified format"""
    output_file = Path(output_path)

    if format == "json":
        # Combine all SARIF data
        all_results = []
        for finding in findings:
            if "runs" in finding.sarif_data:
                for run in finding.sarif_data["runs"]:
                    for result in run.get("results", []):
                        result_entry = {
                            "run_id": finding.run_id,
                            "created_at": finding.created_at.isoformat(),
                            **result
                        }
                        all_results.append(result_entry)

        with open(output_file, 'w') as f:
            json.dump({
                "total_findings": len(findings),
                "export_date": datetime.now().isoformat(),
                "results": all_results
            }, f, indent=2)

    elif format == "csv":
        # Export to CSV
        with open(output_file, 'w', newline='') as f:
            writer = csv.writer(f)
            writer.writerow(["Run ID", "Date", "Severity", "Rule ID", "Message", "File", "Line"])

            for finding in findings:
                if "runs" in finding.sarif_data:
                    for run in finding.sarif_data["runs"]:
                        for result in run.get("results", []):
                            locations = result.get("locations", [])
                            location_info = locations[0] if locations else {}
                            physical = location_info.get("physicalLocation", {})
                            artifact = physical.get("artifactLocation", {})
                            region = physical.get("region", {})

                            writer.writerow([
                                finding.run_id[:12],
                                finding.created_at.strftime("%Y-%m-%d %H:%M"),
                                result.get("level", "note"),
                                result.get("ruleId", ""),
                                result.get("message", {}).get("text", ""),
                                artifact.get("uri", ""),
                                region.get("startLine", "")
                            ])

    elif format == "html":
        # Generate HTML report
        html_content = f"""<!DOCTYPE html>
<html>
<head>
    <title>FuzzForge Security Findings Report</title>
    <style>
        body {{ font-family: Arial, sans-serif; margin: 20px; }}
        h1 {{ color: #333; }}
        .stats {{ background: #f5f5f5; padding: 15px; border-radius: 5px; margin: 20px 0; }}
        table {{ width: 100%; border-collapse: collapse; }}
        th, td {{ padding: 10px; text-align: left; border-bottom: 1px solid #ddd; }}
        th {{ background: #4CAF50; color: white; }}
        .error {{ color: red; font-weight: bold; }}
        .warning {{ color: orange; font-weight: bold; }}
        .note {{ color: blue; }}
        .info {{ color: gray; }}
    </style>
</head>
<body>
    <h1>FuzzForge Security Findings Report</h1>
    <div class="stats">
        <p><strong>Generated:</strong> {datetime.now().strftime("%Y-%m-%d %H:%M:%S")}</p>
        <p><strong>Total Findings:</strong> {len(findings)}</p>
    </div>
    <table>
        <tr>
            <th>Run ID</th>
            <th>Date</th>
            <th>Severity</th>
            <th>Rule</th>
            <th>Message</th>
            <th>Location</th>
        </tr>"""

        for finding in findings:
            if "runs" in finding.sarif_data:
                for run in finding.sarif_data["runs"]:
                    for result in run.get("results", []):
                        level = result.get("level", "note")
                        locations = result.get("locations", [])
                        location_info = locations[0] if locations else {}
                        physical = location_info.get("physicalLocation", {})
                        artifact = physical.get("artifactLocation", {})
                        region = physical.get("region", {})

                        html_content += f"""
        <tr>
            <td>{finding.run_id[:12]}</td>
            <td>{finding.created_at.strftime("%Y-%m-%d %H:%M")}</td>
            <td class="{level}">{level.upper()}</td>
            <td>{result.get("ruleId", "")}</td>
            <td>{result.get("message", {}).get("text", "")}</td>
            <td>{artifact.get("uri", "")} : {region.get("startLine", "")}</td>
        </tr>"""

        html_content += """
    </table>
</body>
</html>"""

        with open(output_file, 'w') as f:
            f.write(html_content)


@app.callback(invoke_without_command=True)
def findings_callback(ctx: typer.Context):
    """
    🔍 View and export security findings
    """
    # Check if a subcommand is being invoked
    if ctx.invoked_subcommand is not None:
        # Let the subcommand handle it
        return

    # Default to history when no subcommand provided
    findings_history(limit=20)
251  cli/src/fuzzforge_cli/commands/ingest.py  Normal file
@@ -0,0 +1,251 @@
"""Cognee ingestion commands for FuzzForge CLI."""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.


from __future__ import annotations

import asyncio
import os
from pathlib import Path
from typing import List, Optional

import typer
from rich.console import Console
from rich.prompt import Confirm

from ..config import ProjectConfigManager
from ..ingest_utils import collect_ingest_files

console = Console()
app = typer.Typer(
    name="ingest",
    help="Ingest files or directories into the Cognee knowledge graph for the current project",
    invoke_without_command=True,
)


@app.callback()
def ingest_callback(
    ctx: typer.Context,
    path: Optional[Path] = typer.Argument(
        None,
        exists=True,
        file_okay=True,
        dir_okay=True,
        readable=True,
        resolve_path=True,
        help="File or directory to ingest (defaults to current directory)",
    ),
    recursive: bool = typer.Option(
        False,
        "--recursive",
        "-r",
        help="Recursively ingest directories",
    ),
    file_types: Optional[List[str]] = typer.Option(
        None,
        "--file-types",
        "-t",
        help="File extensions to include (e.g. --file-types .py --file-types .js)",
    ),
    exclude: Optional[List[str]] = typer.Option(
        None,
        "--exclude",
        "-e",
        help="Glob patterns to exclude",
    ),
    dataset: Optional[str] = typer.Option(
        None,
        "--dataset",
        "-d",
        help="Dataset name to ingest into",
    ),
    force: bool = typer.Option(
        False,
        "--force",
        "-f",
        help="Force re-ingestion and skip confirmation",
    ),
):
    """Entry point for `fuzzforge ingest` when no subcommand is provided."""
    if ctx.invoked_subcommand:
        return

    try:
        config = ProjectConfigManager()
    except FileNotFoundError as exc:
        console.print(f"[red]Error:[/red] {exc}")
        raise typer.Exit(1) from exc

    if not config.is_initialized():
        console.print("[red]Error: FuzzForge project not initialized. Run 'ff init' first.[/red]")
        raise typer.Exit(1)

    config.setup_cognee_environment()
    if os.getenv("FUZZFORGE_DEBUG", "0") == "1":
        console.print(
            "[dim]Cognee directories:\n"
            f"  DATA: {os.getenv('COGNEE_DATA_ROOT', 'unset')}\n"
            f"  SYSTEM: {os.getenv('COGNEE_SYSTEM_ROOT', 'unset')}\n"
            f"  USER: {os.getenv('COGNEE_USER_ID', 'unset')}\n",
        )
    project_context = config.get_project_context()

    target_path = path or Path.cwd()
    dataset_name = dataset or f"{project_context['project_name']}_codebase"

    try:
        import cognee  # noqa: F401  # Just to validate installation
    except ImportError as exc:
        console.print("[red]Cognee is not installed.[/red]")
        console.print("Install with: pip install 'cognee[all]' litellm")
        raise typer.Exit(1) from exc

    console.print(f"[bold]🔍 Ingesting {target_path} into Cognee knowledge graph[/bold]")
    console.print(
        f"Project: [cyan]{project_context['project_name']}[/cyan] "
        f"(ID: [dim]{project_context['project_id']}[/dim])"
    )
    console.print(f"Dataset: [cyan]{dataset_name}[/cyan]")
    console.print(f"Tenant: [dim]{project_context['tenant_id']}[/dim]")

    if not force:
        confirm_message = f"Ingest {target_path} into knowledge graph for this project?"
        if not Confirm.ask(confirm_message, console=console):
            console.print("[yellow]Ingestion cancelled[/yellow]")
            raise typer.Exit(0)

    try:
        asyncio.run(
            _run_ingestion(
                config=config,
                path=target_path.resolve(),
                recursive=recursive,
                file_types=file_types,
                exclude=exclude,
                dataset=dataset_name,
                force=force,
            )
        )
    except KeyboardInterrupt:
        console.print("\n[yellow]Ingestion cancelled by user[/yellow]")
        raise typer.Exit(1)
    except Exception as exc:  # pragma: no cover - rich reporting
        console.print(f"[red]Failed to ingest:[/red] {exc}")
        raise typer.Exit(1) from exc


async def _run_ingestion(
    *,
    config: ProjectConfigManager,
    path: Path,
    recursive: bool,
    file_types: Optional[List[str]],
    exclude: Optional[List[str]],
    dataset: str,
    force: bool,
) -> None:
    """Perform the actual ingestion work."""
    from fuzzforge_ai.cognee_service import CogneeService

    cognee_service = CogneeService(config)
    await cognee_service.initialize()

    # Always skip internal bookkeeping directories
    exclude_patterns = list(exclude or [])
    default_excludes = {
        ".fuzzforge/**",
        ".git/**",
    }
    added_defaults = []
    for pattern in default_excludes:
        if pattern not in exclude_patterns:
            exclude_patterns.append(pattern)
            added_defaults.append(pattern)

    if added_defaults and os.getenv("FUZZFORGE_DEBUG", "0") == "1":
        console.print(
            "[dim]Auto-excluding paths: {patterns}[/dim]".format(
                patterns=", ".join(added_defaults)
            )
        )
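The `collect_ingest_files` helper applies these patterns; its implementation is not shown here, but `**`-style excludes can be matched with the stdlib alone. The `is_excluded` function below is a hypothetical sketch, not the real helper:

```python
from fnmatch import fnmatch

# Hypothetical sketch (not the actual collect_ingest_files): decide whether a
# relative path is excluded by "**"-style glob patterns.
def is_excluded(rel_path: str, patterns: list[str]) -> bool:
    for pattern in patterns:
        if fnmatch(rel_path, pattern):
            return True
        # ".git/**" should also cover the directory entry itself
        prefix = pattern.split("/**")[0]
        if rel_path == prefix or rel_path.startswith(prefix + "/"):
            return True
    return False

print(is_excluded(".git/config", [".fuzzforge/**", ".git/**"]))  # True
```

`fnmatch` treats `*` as matching any characters including `/`, which is why the simple pattern check above already excludes nested paths such as `.git/hooks/pre-commit`.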

    try:
        files_to_ingest = collect_ingest_files(path, recursive, file_types, exclude_patterns)
    except Exception as exc:
        console.print(f"[red]Failed to collect files:[/red] {exc}")
        return

    if not files_to_ingest:
        console.print("[yellow]No files found to ingest[/yellow]")
        return

    console.print(f"Found [green]{len(files_to_ingest)}[/green] files to ingest")

    if force:
        console.print("Cleaning existing data for this project...")
        try:
            await cognee_service.clear_data(confirm=True)
        except Exception as exc:
            console.print(f"[yellow]Warning:[/yellow] Could not clean existing data: {exc}")

    console.print("Adding files to Cognee...")
    valid_file_paths = []
    for file_path in files_to_ingest:
        try:
            with open(file_path, "r", encoding="utf-8") as fh:
                fh.read(1)
            valid_file_paths.append(file_path)
            console.print(f"  ✓ {file_path}")
        except (UnicodeDecodeError, PermissionError) as exc:
            console.print(f"[yellow]Skipping {file_path}: {exc}[/yellow]")

    if not valid_file_paths:
        console.print("[yellow]No readable files found to ingest[/yellow]")
        return

    results = await cognee_service.ingest_files(valid_file_paths, dataset)

    console.print(
        f"[green]✅ Successfully ingested {results['success']} files into knowledge graph[/green]"
    )
    if results["failed"]:
        console.print(
            f"[yellow]⚠️ Skipped {results['failed']} files due to errors[/yellow]"
        )

    try:
        insights = await cognee_service.search_insights(
            query=f"What insights can you provide about the {dataset} dataset?",
            dataset=dataset,
        )
        if insights:
            console.print(f"\n[bold]📊 Generated {len(insights)} insights:[/bold]")
            for index, insight in enumerate(insights[:3], 1):
                console.print(f"  {index}. {insight}")
            if len(insights) > 3:
                console.print(f"  ... and {len(insights) - 3} more")

        chunks = await cognee_service.search_chunks(
            query=f"functions classes methods in {dataset}",
            dataset=dataset,
        )
        if chunks:
            console.print(
                f"\n[bold]🔍 Sample searchable content ({len(chunks)} chunks found):[/bold]"
            )
            for index, chunk in enumerate(chunks[:2], 1):
                preview = chunk[:100] + "..." if len(chunk) > 100 else chunk
                console.print(f"  {index}. {preview}")
    except Exception:
        # Best-effort stats; ignore failures here
        pass
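The one-byte `fh.read(1)` above is a cheap text-vs-binary probe: opening in text mode forces a UTF-8 decode, so most binary files raise `UnicodeDecodeError` immediately. The pattern in isolation, using throwaway temp files:

```python
import tempfile
from pathlib import Path

# Cheap text-vs-binary probe, mirroring the readability check used before
# ingestion. Note it is heuristic: a binary file whose first buffered chunk
# happens to decode as UTF-8 would pass.
def is_readable_text(path: Path) -> bool:
    try:
        with open(path, "r", encoding="utf-8") as fh:
            fh.read(1)
        return True
    except (UnicodeDecodeError, PermissionError):
        return False

with tempfile.TemporaryDirectory() as tmp:
    text_file = Path(tmp) / "ok.py"
    text_file.write_text("print('hi')\n", encoding="utf-8")
    binary_file = Path(tmp) / "blob.bin"
    binary_file.write_bytes(b"\xff\xfe\x00\x81")  # invalid UTF-8 start byte
    ok = is_readable_text(text_file)
    bad = is_readable_text(binary_file)
    print(ok, bad)
```

Because text-mode IO buffers a chunk before decoding, the probe effectively validates more than one byte of the file.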
277  cli/src/fuzzforge_cli/commands/init.py  Normal file
@@ -0,0 +1,277 @@
"""Project initialization commands."""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.

from __future__ import annotations

import os
from pathlib import Path
from textwrap import dedent
from typing import Optional

import typer
from rich.console import Console
from rich.prompt import Confirm, Prompt

from ..config import ensure_project_config
from ..database import ensure_project_db

console = Console()
app = typer.Typer()


@app.command()
def project(
    name: Optional[str] = typer.Option(
        None, "--name", "-n", help="Project name (defaults to current directory name)"
    ),
    api_url: Optional[str] = typer.Option(
        None,
        "--api-url",
        "-u",
        help="FuzzForge API URL (defaults to http://localhost:8000)",
    ),
    force: bool = typer.Option(
        False,
        "--force",
        "-f",
        help="Force initialization even if project already exists",
    ),
):
    """
    📁 Initialize a new FuzzForge project in the current directory.

    This creates a .fuzzforge directory with:
    • SQLite database for storing runs, findings, and crashes
    • Configuration file with project settings
    • Default ignore patterns and preferences
    """
    current_dir = Path.cwd()
    fuzzforge_dir = current_dir / ".fuzzforge"

    # Check if project already exists
    if fuzzforge_dir.exists() and not force:
        if fuzzforge_dir.is_dir() and any(fuzzforge_dir.iterdir()):
            console.print(
                "❌ FuzzForge project already exists in this directory", style="red"
            )
            console.print("Use --force to reinitialize", style="dim")
            raise typer.Exit(1)

    # Get project name
    if not name:
        name = Prompt.ask("Project name", default=current_dir.name, console=console)

    # Get API URL
    if not api_url:
        api_url = Prompt.ask(
            "FuzzForge API URL", default="http://localhost:8000", console=console
        )

    # Confirm initialization
    console.print(f"\n📁 Initializing FuzzForge project: [bold cyan]{name}[/bold cyan]")
    console.print(f"📍 Location: [dim]{current_dir}[/dim]")
    console.print(f"🔗 API URL: [dim]{api_url}[/dim]")

    if not Confirm.ask("\nProceed with initialization?", default=True, console=console):
        console.print("❌ Initialization cancelled", style="yellow")
        raise typer.Exit(0)

    try:
        # Create .fuzzforge directory
        console.print("\n🔨 Creating project structure...")
        fuzzforge_dir.mkdir(exist_ok=True)

        # Initialize configuration
        console.print("⚙️ Setting up configuration...")
        ensure_project_config(
            project_dir=current_dir,
            project_name=name,
            api_url=api_url,
        )

        # Initialize database
        console.print("🗄️ Initializing database...")
        ensure_project_db(current_dir)

        _ensure_env_file(fuzzforge_dir, force)
        _ensure_agents_registry(fuzzforge_dir, force)

        # Create .gitignore if needed
        gitignore_path = current_dir / ".gitignore"
        gitignore_entries = [
            "# FuzzForge CLI",
            ".fuzzforge/findings.db-*",  # SQLite temp files
            ".fuzzforge/cache/",
            ".fuzzforge/temp/",
        ]

        if gitignore_path.exists():
            with open(gitignore_path, "r") as f:
                existing_content = f.read()

            if "# FuzzForge CLI" not in existing_content:
                with open(gitignore_path, "a") as f:
                    f.write(f"\n{chr(10).join(gitignore_entries)}\n")
                console.print("📝 Updated .gitignore with FuzzForge entries")
        else:
            with open(gitignore_path, "w") as f:
                f.write(f"{chr(10).join(gitignore_entries)}\n")
            console.print("📝 Created .gitignore")

        # Create README if it doesn't exist
        readme_path = current_dir / "README.md"
        if not readme_path.exists():
            readme_content = f"""# {name}

FuzzForge security testing project.

## Quick Start

```bash
# List available workflows
fuzzforge workflows

# Submit a workflow for analysis
fuzzforge workflow <workflow-name> /path/to/target

# View findings
fuzzforge finding <run-id>
```

## Project Structure

- `.fuzzforge/` - Project data and configuration
- `.fuzzforge/config.yaml` - Project configuration
- `.fuzzforge/findings.db` - Local database for runs and findings
"""

            with open(readme_path, "w") as f:
                f.write(readme_content)
            console.print("📚 Created README.md")

        console.print("\n✅ FuzzForge project initialized successfully!", style="green")
        console.print("\n🎯 Next steps:")
        console.print("  • ff workflows - See available workflows")
        console.print("  • ff status - Check API connectivity")
        console.print("  • ff workflow <workflow> <path> - Start your first analysis")
        console.print("  • edit .fuzzforge/.env with API keys & provider settings")

    except Exception as e:
        console.print(f"\n❌ Initialization failed: {e}", style="red")
        raise typer.Exit(1)


@app.callback()
def init_callback():
    """
    📁 Initialize FuzzForge projects and components
    """


def _ensure_env_file(fuzzforge_dir: Path, force: bool) -> None:
    """Create or update the .fuzzforge/.env file with AI defaults."""

    env_path = fuzzforge_dir / ".env"
    if env_path.exists() and not force:
        console.print("🧪 Using existing .fuzzforge/.env (use --force to regenerate)")
        return

    console.print("🧠 Configuring AI environment...")
    console.print("  • Default LLM provider: openai")
    console.print("  • Default LLM model: gpt-5-mini")
    console.print("  • To customise provider/model later, edit .fuzzforge/.env")

    llm_provider = "openai"
    llm_model = "gpt-5-mini"

    api_key = Prompt.ask(
        "OpenAI API key (leave blank to fill manually)",
        default="",
        show_default=False,
        console=console,
    )

    enable_cognee = False
    cognee_url = ""

    session_db_path = fuzzforge_dir / "fuzzforge_sessions.db"
    session_db_rel = session_db_path.relative_to(fuzzforge_dir.parent)

    env_lines = [
        "# FuzzForge AI configuration",
        "# Populate the API key(s) that match your LLM provider",
        "",
        f"LLM_PROVIDER={llm_provider}",
        f"LLM_MODEL={llm_model}",
        f"LITELLM_MODEL={llm_model}",
        f"OPENAI_API_KEY={api_key}",
        f"FUZZFORGE_MCP_URL={os.getenv('FUZZFORGE_MCP_URL', 'http://localhost:8010/mcp')}",
        "",
        "# Cognee configuration mirrors the primary LLM by default",
        f"LLM_COGNEE_PROVIDER={llm_provider}",
        f"LLM_COGNEE_MODEL={llm_model}",
        f"LLM_COGNEE_API_KEY={api_key}",
        "LLM_COGNEE_ENDPOINT=",
        "COGNEE_MCP_URL=",
        "",
        "# Session persistence options: inmemory | sqlite",
        "SESSION_PERSISTENCE=sqlite",
        f"SESSION_DB_PATH={session_db_rel}",
        "",
        "# Optional integrations",
        "AGENTOPS_API_KEY=",
        "FUZZFORGE_DEBUG=0",
        "",
    ]

    env_path.write_text("\n".join(env_lines), encoding="utf-8")
    console.print(f"📝 Created {env_path.relative_to(fuzzforge_dir.parent)}")

    template_path = fuzzforge_dir / ".env.template"
    if not template_path.exists() or force:
        template_lines = []
        for line in env_lines:
            if line.startswith("OPENAI_API_KEY="):
                template_lines.append("OPENAI_API_KEY=")
            elif line.startswith("LLM_COGNEE_API_KEY="):
                template_lines.append("LLM_COGNEE_API_KEY=")
            else:
                template_lines.append(line)
        template_path.write_text("\n".join(template_lines), encoding="utf-8")
        console.print(f"📝 Created {template_path.relative_to(fuzzforge_dir.parent)}")

    # SQLite session DB will be created automatically when first used by the AI agent


def _ensure_agents_registry(fuzzforge_dir: Path, force: bool) -> None:
    """Create a starter agents.yaml registry if needed."""

    agents_path = fuzzforge_dir / "agents.yaml"
    if agents_path.exists() and not force:
        return

    template = dedent(
        """\
        # FuzzForge Registered Agents
        # Populate this list to auto-register remote agents when the AI CLI starts
        registered_agents: []

        # Example:
        # registered_agents:
        #   - name: Calculator
        #     url: http://localhost:10201
        #     description: Sample math agent
        """.strip()
    )

    agents_path.write_text(template + "\n", encoding="utf-8")
    console.print(f"📝 Created {agents_path.relative_to(fuzzforge_dir.parent)}")
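The `.env.template` derivation in `_ensure_env_file` above (secret-bearing lines blanked so a template can be shared while `.env` stays local) can be sketched standalone. The helper name `blank_secrets` is illustrative, not part of the CLI:

```python
# Illustrative sketch of the template-blanking step in _ensure_env_file:
# any line starting with a secret-bearing key keeps the key but drops the value.
SECRET_PREFIXES = ("OPENAI_API_KEY=", "LLM_COGNEE_API_KEY=")


def blank_secrets(env_lines):
    """Return a copy of env_lines with secret values removed."""
    out = []
    for line in env_lines:
        prefix = next((p for p in SECRET_PREFIXES if line.startswith(p)), None)
        out.append(prefix if prefix is not None else line)
    return out


lines = ["LLM_PROVIDER=openai", "OPENAI_API_KEY=sk-abc123", "FUZZFORGE_DEBUG=0"]
print(blank_secrets(lines))
```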
165  cli/src/fuzzforge_cli/commands/status.py  Normal file
@@ -0,0 +1,165 @@
"""
Status command for showing project and API information.
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.


from pathlib import Path
from rich.console import Console
from rich.table import Table
from rich.panel import Panel
from rich import box

from ..config import get_project_config, FuzzForgeConfig
from ..database import get_project_db
from fuzzforge_sdk import FuzzForgeClient

console = Console()


def show_status():
    """Show comprehensive project and API status"""
    current_dir = Path.cwd()
    fuzzforge_dir = current_dir / ".fuzzforge"

    # Project status
    console.print("\n📊 [bold]FuzzForge Project Status[/bold]\n")

    if not fuzzforge_dir.exists():
        console.print(
            Panel.fit(
                "❌ No FuzzForge project found in current directory\n\n"
                "Run [bold cyan]ff init[/bold cyan] to initialize a project",
                title="Project Status",
                box=box.ROUNDED
            )
        )
        return

    # Load project configuration
    config = get_project_config()
    if not config:
        config = FuzzForgeConfig()

    # Project info table
    project_table = Table(show_header=False, box=box.SIMPLE)
    project_table.add_column("Property", style="bold cyan")
    project_table.add_column("Value")

    project_table.add_row("Project Name", config.project.name)
    project_table.add_row("Location", str(current_dir))
    project_table.add_row("API URL", config.project.api_url)
    project_table.add_row("Default Timeout", f"{config.project.default_timeout}s")

    console.print(
        Panel.fit(
            project_table,
            title="✅ Project Information",
            box=box.ROUNDED
        )
    )

    # Database status
    db = get_project_db()
    if db:
        try:
            stats = db.get_stats()
            db_table = Table(show_header=False, box=box.SIMPLE)
            db_table.add_column("Metric", style="bold cyan")
            db_table.add_column("Count", justify="right")

            db_table.add_row("Total Runs", str(stats["total_runs"]))
            db_table.add_row("Total Findings", str(stats["total_findings"]))
            db_table.add_row("Total Crashes", str(stats["total_crashes"]))
            db_table.add_row("Runs (Last 7 days)", str(stats["runs_last_7_days"]))

            if stats["runs_by_status"]:
                db_table.add_row("", "")  # Spacer
                for status, count in stats["runs_by_status"].items():
                    status_emoji = {
                        "completed": "✅",
                        "running": "🔄",
                        "failed": "❌",
                        "queued": "⏳",
                        "cancelled": "⏹️"
                    }.get(status, "📋")
                    db_table.add_row(f"{status_emoji} {status.title()}", str(count))

            console.print(
                Panel.fit(
                    db_table,
                    title="🗄️ Database Statistics",
                    box=box.ROUNDED
                )
            )
        except Exception as e:
            console.print(f"⚠️ Database error: {e}", style="yellow")

    # API status
    console.print("\n🔗 [bold]API Connectivity[/bold]")
    try:
        with FuzzForgeClient(base_url=config.get_api_url(), timeout=10.0) as client:
            api_status = client.get_api_status()
            workflows = client.list_workflows()

            api_table = Table(show_header=False, box=box.SIMPLE)
            api_table.add_column("Property", style="bold cyan")
            api_table.add_column("Value")

            api_table.add_row("Status", "✅ Connected")
            api_table.add_row("Service", f"{api_status.name} v{api_status.version}")
            api_table.add_row("Workflows", str(len(workflows)))

            console.print(
                Panel.fit(
                    api_table,
                    title="✅ API Status",
                    box=box.ROUNDED
                )
            )

            # Show available workflows
            if workflows:
                workflow_table = Table(box=box.SIMPLE_HEAD)
                workflow_table.add_column("Name", style="bold")
                workflow_table.add_column("Version", justify="center")
                workflow_table.add_column("Description")

                for workflow in workflows[:10]:  # Limit to first 10
                    workflow_table.add_row(
                        workflow.name,
                        workflow.version,
                        workflow.description[:60] + "..." if len(workflow.description) > 60 else workflow.description
                    )

                if len(workflows) > 10:
                    workflow_table.add_row("...", "...", f"and {len(workflows) - 10} more workflows")

                console.print(
                    Panel.fit(
                        workflow_table,
                        title=f"🔧 Available Workflows ({len(workflows)})",
                        box=box.ROUNDED
                    )
                )

    except Exception as e:
        console.print(
            Panel.fit(
                f"❌ Failed to connect to API\n\n"
                f"Error: {str(e)}\n\n"
                f"API URL: {config.get_api_url()}\n\n"
                "Check that the FuzzForge API is running and accessible.",
                title="❌ API Connection Failed",
                box=box.ROUNDED
            )
        )
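A standalone sketch of the duration formatting used when a run completes (dropping microseconds by splitting `str(timedelta)` on the decimal point, as the status table does). `format_duration` is an illustrative helper, not a function in the codebase:

```python
from datetime import timedelta


def format_duration(delta: timedelta) -> str:
    """Render a timedelta as H:MM:SS, discarding the microsecond suffix."""
    # str(timedelta) yields e.g. "0:03:42.123456"; keep only the part
    # before the decimal point, matching the status-table display.
    return str(delta).split(".")[0]


print(format_duration(timedelta(minutes=3, seconds=42, microseconds=123456)))
# → 0:03:42
```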
641  cli/src/fuzzforge_cli/commands/workflow_exec.py  Normal file
@@ -0,0 +1,641 @@
"""
Workflow execution and management commands.
Replaces the old 'runs' terminology with cleaner workflow-centric commands.
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.

import time
from datetime import datetime
from pathlib import Path
from typing import Any, Dict, List, Optional

import typer
from fuzzforge_sdk import FuzzForgeClient, WorkflowSubmission
from rich import box
from rich.console import Console
from rich.panel import Panel
from rich.prompt import Confirm, Prompt
from rich.table import Table

from ..config import FuzzForgeConfig, get_project_config
from ..constants import (
    DEFAULT_VOLUME_MODE,
    MAX_RETRIES,
    MAX_RUN_ID_DISPLAY_LENGTH,
    POLL_INTERVAL,
    PROGRESS_STEP_DELAYS,
    RETRY_DELAY,
    STATUS_EMOJIS,
)
from ..database import RunRecord, ensure_project_db, get_project_db
from ..exceptions import (
    DatabaseError,
    ValidationError,
    handle_error,
    require_project,
    retry_on_network_error,
    safe_json_load,
)
from ..progress import step_progress
from ..validation import (
    validate_parameters,
    validate_run_id,
    validate_target_path,
    validate_timeout,
    validate_volume_mode,
    validate_workflow_name,
)

console = Console()
app = typer.Typer()


@retry_on_network_error(max_retries=MAX_RETRIES, delay=RETRY_DELAY)
def get_client() -> FuzzForgeClient:
    """Get configured FuzzForge client with retry on network errors"""
    config = get_project_config() or FuzzForgeConfig()
    return FuzzForgeClient(base_url=config.get_api_url(), timeout=config.get_timeout())


def status_emoji(status: str) -> str:
    """Get emoji for execution status"""
    return STATUS_EMOJIS.get(status.lower(), STATUS_EMOJIS["unknown"])


def parse_inline_parameters(params: List[str]) -> Dict[str, Any]:
    """Parse inline key=value parameters using improved validation"""
    return validate_parameters(params)


def execute_workflow_submission(
    client: FuzzForgeClient,
    workflow: str,
    target_path: str,
    parameters: Dict[str, Any],
    volume_mode: str,
    timeout: Optional[int],
    interactive: bool,
) -> Any:
    """Handle the workflow submission process"""
    # Get workflow metadata for parameter validation
    console.print(f"🔧 Getting workflow information for: {workflow}")
    workflow_meta = client.get_workflow_metadata(workflow)
    param_response = client.get_workflow_parameters(workflow)

    # Interactive parameter input
    if interactive and workflow_meta.parameters.get("properties"):
        properties = workflow_meta.parameters.get("properties", {})
        required_params = set(workflow_meta.parameters.get("required", []))
        defaults = param_response.defaults

        missing_required = required_params - set(parameters.keys())

        if missing_required:
            console.print(
                f"\n📝 [bold]Missing required parameters:[/bold] {', '.join(missing_required)}"
            )
            console.print("Please provide values:\n")

            for param_name in missing_required:
                param_schema = properties.get(param_name, {})
                description = param_schema.get("description", "")
                param_type = param_schema.get("type", "string")

                prompt_text = f"{param_name}"
                if description:
                    prompt_text += f" ({description})"
                prompt_text += f" [{param_type}]"

                while True:
                    user_input = Prompt.ask(prompt_text, console=console)

                    try:
                        if param_type == "integer":
                            parameters[param_name] = int(user_input)
                        elif param_type == "number":
                            parameters[param_name] = float(user_input)
                        elif param_type == "boolean":
                            parameters[param_name] = user_input.lower() in (
                                "true",
                                "yes",
                                "1",
                                "on",
                            )
                        elif param_type == "array":
                            parameters[param_name] = [
                                item.strip()
                                for item in user_input.split(",")
                                if item.strip()
                            ]
                        else:
                            parameters[param_name] = user_input
                        break
                    except ValueError as e:
                        console.print(f"❌ Invalid {param_type}: {e}", style="red")

    # Validate volume mode
    validate_volume_mode(volume_mode)
    if volume_mode not in workflow_meta.supported_volume_modes:
        raise ValidationError(
            "volume mode",
            volume_mode,
            f"one of: {', '.join(workflow_meta.supported_volume_modes)}",
        )

    # Create submission
    submission = WorkflowSubmission(
        target_path=target_path,
        volume_mode=volume_mode,
        parameters=parameters,
        timeout=timeout,
    )

    # Show submission summary
    console.print("\n🎯 [bold]Executing workflow:[/bold]")
    console.print(f"  Workflow: {workflow}")
    console.print(f"  Target: {target_path}")
    console.print(f"  Volume Mode: {volume_mode}")
    if parameters:
        console.print(f"  Parameters: {len(parameters)} provided")
    if timeout:
        console.print(f"  Timeout: {timeout}s")

    # Only ask for confirmation in interactive mode
    if interactive:
        if not Confirm.ask("\nExecute workflow?", default=True, console=console):
            console.print("❌ Execution cancelled", style="yellow")
            raise typer.Exit(0)
    else:
        console.print("\n🚀 Executing workflow...")

    # Submit the workflow with enhanced progress
    console.print(f"\n🚀 Executing workflow: [bold yellow]{workflow}[/bold yellow]")

    steps = [
        "Validating workflow configuration",
        "Connecting to FuzzForge API",
        "Uploading parameters and settings",
        "Creating workflow deployment",
        "Initializing execution environment",
    ]

    with step_progress(steps, f"Executing {workflow}") as progress:
        progress.next_step()  # Validating
        time.sleep(PROGRESS_STEP_DELAYS["validating"])

        progress.next_step()  # Connecting
        time.sleep(PROGRESS_STEP_DELAYS["connecting"])

        progress.next_step()  # Uploading
        response = client.submit_workflow(workflow, submission)
        time.sleep(PROGRESS_STEP_DELAYS["uploading"])

        progress.next_step()  # Creating deployment
        time.sleep(PROGRESS_STEP_DELAYS["creating"])

        progress.next_step()  # Initializing
        time.sleep(PROGRESS_STEP_DELAYS["initializing"])

        progress.complete("Workflow started successfully!")

    return response


# Main workflow execution command (replaces 'runs submit')
@app.command(
    name="exec", hidden=True
)  # Hidden because it will be called from main workflow command
def execute_workflow(
    workflow: str = typer.Argument(..., help="Workflow name to execute"),
    target_path: str = typer.Argument(..., help="Path to analyze"),
    params: List[str] = typer.Argument(
        default=None, help="Parameters as key=value pairs"
    ),
    param_file: Optional[str] = typer.Option(
        None, "--param-file", "-f", help="JSON file containing workflow parameters"
    ),
    volume_mode: str = typer.Option(
        DEFAULT_VOLUME_MODE,
        "--volume-mode",
        "-v",
        help="Volume mount mode: ro (read-only) or rw (read-write)",
    ),
    timeout: Optional[int] = typer.Option(
        None, "--timeout", "-t", help="Execution timeout in seconds"
    ),
    interactive: bool = typer.Option(
        True,
        "--interactive/--no-interactive",
        "-i/-n",
        help="Interactive parameter input for missing required parameters",
    ),
    wait: bool = typer.Option(
        False, "--wait", "-w", help="Wait for execution to complete"
    ),
):
    """
    🚀 Execute a workflow on a target

    Use --wait to wait for completion without live dashboard.
    """
    try:
        # Validate inputs
        validate_workflow_name(workflow)
        target_path_obj = validate_target_path(target_path, must_exist=True)
        target_path = str(target_path_obj.absolute())
        validate_timeout(timeout)

        # Ensure we're in a project directory
        require_project()
    except Exception as e:
        handle_error(e, "validating inputs")

    # Parse parameters
    parameters = {}

    # Load from param file
    if param_file:
        try:
            file_params = safe_json_load(param_file)
            if isinstance(file_params, dict):
                parameters.update(file_params)
            else:
                raise ValidationError("parameter file", param_file, "a JSON object")
        except Exception as e:
            handle_error(e, "loading parameter file")

    # Parse inline parameters
    if params:
        try:
            inline_params = parse_inline_parameters(params)
            parameters.update(inline_params)
        except Exception as e:
            handle_error(e, "parsing parameters")

    try:
        with get_client() as client:
            response = execute_workflow_submission(
                client,
                workflow,
                target_path,
                parameters,
                volume_mode,
                timeout,
                interactive,
            )

            console.print("✅ Workflow execution started!", style="green")
            console.print(f"  Execution ID: [bold cyan]{response.run_id}[/bold cyan]")
            console.print(
                f"  Status: {status_emoji(response.status)} {response.status}"
            )

            # Save to database
            try:
                db = ensure_project_db()
                run_record = RunRecord(
                    run_id=response.run_id,
                    workflow=workflow,
                    status=response.status,
                    target_path=target_path,
                    parameters=parameters,
                    created_at=datetime.now(),
                )
                db.save_run(run_record)
            except Exception as e:
                # Don't fail the whole operation if database save fails
                console.print(
                    f"⚠️ Failed to save execution to database: {e}", style="yellow"
                )

            console.print(
                f"💡 Check status: [bold cyan]fuzzforge workflow status {response.run_id}[/bold cyan]"
            )

            # Wait for completion if requested
            if wait:
                console.print("\n⏳ Waiting for execution to complete...")
                try:
                    final_status = client.wait_for_completion(
                        response.run_id, poll_interval=POLL_INTERVAL
                    )

                    # Update database
                    try:
                        db.update_run_status(
                            response.run_id,
                            final_status.status,
                            completed_at=datetime.now()
                            if final_status.is_completed
                            else None,
                        )
                    except Exception as e:
                        console.print(
                            f"⚠️ Failed to update database: {e}", style="yellow"
                        )

                    console.print(
                        f"🏁 Execution completed with status: {status_emoji(final_status.status)} {final_status.status}"
                    )

                    if final_status.is_completed:
                        console.print(
                            f"💡 View findings: [bold cyan]fuzzforge findings {response.run_id}[/bold cyan]"
                        )

                except KeyboardInterrupt:
                    console.print(
                        "\n⏹️ Monitoring cancelled (execution continues in background)",
                        style="yellow",
                    )
                except Exception as e:
                    handle_error(e, "waiting for completion")

    except Exception as e:
        handle_error(e, "executing workflow")


@app.command("status")
def workflow_status(
    execution_id: Optional[str] = typer.Argument(
        None, help="Execution ID to check (defaults to most recent)"
    ),
):
    """
    📊 Check the status of a workflow execution
    """
    try:
        require_project()

        if execution_id:
            validate_run_id(execution_id)

        db = get_project_db()
        if not db:
            raise DatabaseError("get project database", Exception("No database found"))

        # Get execution ID
        if not execution_id:
            recent_runs = db.list_runs(limit=1)
            if not recent_runs:
                console.print(
                    "⚠️ No executions found in project database", style="yellow"
                )
                raise typer.Exit(0)
            execution_id = recent_runs[0].run_id
            console.print(f"🔍 Using most recent execution: {execution_id}")
        else:
            validate_run_id(execution_id)

        # Get status from API
        with get_client() as client:
            status = client.get_run_status(execution_id)

        # Update local database
        try:
            db.update_run_status(
                execution_id,
                status.status,
                completed_at=status.updated_at if status.is_completed else None,
            )
        except Exception as e:
            console.print(f"⚠️ Failed to update database: {e}", style="yellow")

        # Display status
        console.print(f"\n📊 [bold]Execution Status: {execution_id}[/bold]\n")

        status_table = Table(show_header=False, box=box.SIMPLE)
        status_table.add_column("Property", style="bold cyan")
        status_table.add_column("Value")

        status_table.add_row("Execution ID", execution_id)
        status_table.add_row("Workflow", status.workflow)
        status_table.add_row("Status", f"{status_emoji(status.status)} {status.status}")
        status_table.add_row("Created", status.created_at.strftime("%Y-%m-%d %H:%M:%S"))
        status_table.add_row("Updated", status.updated_at.strftime("%Y-%m-%d %H:%M:%S"))

        if status.is_completed:
            duration = status.updated_at - status.created_at
            status_table.add_row(
                "Duration", str(duration).split(".")[0]
            )  # Remove microseconds

        console.print(
            Panel.fit(status_table, title="📊 Status Information", box=box.ROUNDED)
        )

        # Show next steps
        if status.is_completed:
            console.print(
                f"💡 View findings: [bold cyan]fuzzforge finding {execution_id}[/bold cyan]"
            )
        elif status.is_failed:
            console.print(
                f"💡 Check logs: [bold cyan]fuzzforge workflow logs {execution_id}[/bold cyan]"
            )

    except Exception as e:
        handle_error(e, "getting execution status")


@app.command("history")
def workflow_history(
    workflow: Optional[str] = typer.Option(
        None, "--workflow", "-w", help="Filter by workflow name"
    ),
    status: Optional[str] = typer.Option(
        None, "--status", "-s", help="Filter by status"
    ),
    limit: int = typer.Option(
        20, "--limit", "-l", help="Maximum number of executions to show"
    ),
):
    """
    📋 Show workflow execution history
    """
    try:
        require_project()

        if limit <= 0:
            raise ValidationError("limit", limit, "a positive integer")

        db = get_project_db()
        if not db:
            raise DatabaseError("get project database", Exception("No database found"))
        runs = db.list_runs(workflow=workflow, status=status, limit=limit)

        if not runs:
            console.print("⚠️ No executions found matching criteria", style="yellow")
            return

        table = Table(box=box.ROUNDED)
        table.add_column("Execution ID", style="bold cyan")
        table.add_column("Workflow", style="bold")
        table.add_column("Status", justify="center")
        table.add_column("Target", style="dim")
        table.add_column("Created", justify="center")
        table.add_column("Parameters", justify="center", style="dim")

        for run in runs:
            param_count = len(run.parameters) if run.parameters else 0
            param_str = f"{param_count} params" if param_count > 0 else "-"

            table.add_row(
                run.run_id[:12] + "..."
                if len(run.run_id) > MAX_RUN_ID_DISPLAY_LENGTH
                else run.run_id,
                run.workflow,
                f"{status_emoji(run.status)} {run.status}",
                Path(run.target_path).name,
                run.created_at.strftime("%m-%d %H:%M"),
                param_str,
            )

        console.print(f"\n📋 [bold]Workflow Execution History ({len(runs)})[/bold]")
        if workflow:
            console.print(f"  Filtered by workflow: {workflow}")
        if status:
            console.print(f"  Filtered by status: {status}")
        console.print()
        console.print(table)

        console.print(
            "\n💡 Use [bold cyan]fuzzforge workflow status <execution-id>[/bold cyan] for detailed status"
        )

    except Exception as e:
        handle_error(e, "listing execution history")


@app.command("retry")
def retry_workflow(
    execution_id: Optional[str] = typer.Argument(
        None, help="Execution ID to retry (defaults to most recent)"
    ),
    modify_params: bool = typer.Option(
        False,
        "--modify-params",
        "-m",
        help="Interactively modify parameters before retrying",
    ),
):
    """
    🔄 Retry a workflow execution with the same or modified parameters
    """
    try:
        require_project()

        db = get_project_db()
        if not db:
            raise DatabaseError("get project database", Exception("No database found"))

        # Get execution ID if not provided
        if not execution_id:
            recent_runs = db.list_runs(limit=1)
            if not recent_runs:
                console.print("⚠️ No executions found to retry", style="yellow")
                raise typer.Exit(0)
            execution_id = recent_runs[0].run_id
            console.print(f"🔄 Retrying most recent execution: {execution_id}")
        else:
            validate_run_id(execution_id)

        # Get original execution
        original_run = db.get_run(execution_id)
        if not original_run:
            raise ValidationError(
                "execution_id", execution_id, "an existing execution ID in the database"
            )

        console.print(f"🔄 [bold]Retrying workflow:[/bold] {original_run.workflow}")
        console.print(f"  Original Execution ID: {execution_id}")
        console.print(f"  Target: {original_run.target_path}")

        parameters = original_run.parameters.copy()

        # Modify parameters if requested
        if modify_params and parameters:
            console.print("\n📝 [bold]Current parameters:[/bold]")
            for key, value in parameters.items():
                new_value = Prompt.ask(f"{key}", default=str(value), console=console)
                if new_value != str(value):
                    # Try to maintain type
                    try:
                        if isinstance(value, bool):
                            parameters[key] = new_value.lower() in (
                                "true",
                                "yes",
                                "1",
                                "on",
                            )
                        elif isinstance(value, int):
                            parameters[key] = int(new_value)
                        elif isinstance(value, float):
                            parameters[key] = float(new_value)
                        elif isinstance(value, list):
                            parameters[key] = [
                                item.strip()
                                for item in new_value.split(",")
                                if item.strip()
                            ]
                        else:
                            parameters[key] = new_value
                    except ValueError:
                        parameters[key] = new_value

        # Submit new execution
        with get_client() as client:
            submission = WorkflowSubmission(
                target_path=original_run.target_path, parameters=parameters
|
||||
)
|
||||
|
||||
response = client.submit_workflow(original_run.workflow, submission)
|
||||
|
||||
console.print("\n✅ Retry submitted successfully!", style="green")
|
||||
console.print(
|
||||
f" New Execution ID: [bold cyan]{response.run_id}[/bold cyan]"
|
||||
)
|
||||
console.print(
|
||||
f" Status: {status_emoji(response.status)} {response.status}"
|
||||
)
|
||||
|
||||
# Save to database
|
||||
try:
|
||||
run_record = RunRecord(
|
||||
run_id=response.run_id,
|
||||
workflow=original_run.workflow,
|
||||
status=response.status,
|
||||
target_path=original_run.target_path,
|
||||
parameters=parameters,
|
||||
created_at=datetime.now(),
|
||||
metadata={"retry_of": execution_id},
|
||||
)
|
||||
db.save_run(run_record)
|
||||
except Exception as e:
|
||||
console.print(
|
||||
f"⚠️ Failed to save execution to database: {e}", style="yellow"
|
||||
)
|
||||
|
||||
console.print(
|
||||
f"\n💡 Monitor progress: [bold cyan]fuzzforge monitor {response.run_id}[/bold cyan]"
|
||||
)
|
||||
|
||||
except Exception as e:
|
||||
handle_error(e, "retrying workflow")
|
||||
|
||||
|
||||
@app.callback()
|
||||
def workflow_exec_callback():
|
||||
"""
|
||||
🚀 Workflow execution management
|
||||
"""
|
||||
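The retry command's `--modify-params` loop coerces each new string value back to the original parameter's type before resubmitting. A minimal standalone sketch of that coercion rule (the helper name `coerce_like` is illustrative, not part of the codebase):

```python
def coerce_like(original, new_value: str):
    """Coerce prompt input (always a string) back to the type of the
    original parameter value, mirroring the retry command's rules."""
    try:
        # bool must be checked before int: bool is a subclass of int
        if isinstance(original, bool):
            return new_value.lower() in ("true", "yes", "1", "on")
        if isinstance(original, int):
            return int(new_value)
        if isinstance(original, float):
            return float(new_value)
        if isinstance(original, list):
            return [item.strip() for item in new_value.split(",") if item.strip()]
        return new_value
    except ValueError:
        # Same fallback as the CLI: keep the raw string if conversion fails
        return new_value
```

Note the `bool` branch comes first for the same reason it does in the command: `isinstance(True, int)` is true in Python, so checking `int` first would silently turn booleans into integers.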
305  cli/src/fuzzforge_cli/commands/workflows.py  Normal file
@@ -0,0 +1,305 @@
"""
Workflow management commands.
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.


import json
import typer
from rich.console import Console
from rich.table import Table
from rich.panel import Panel
from rich.prompt import Prompt, Confirm
from rich.syntax import Syntax
from rich import box
from typing import Optional, Dict, Any

from ..config import get_project_config, FuzzForgeConfig
from ..fuzzy import enhanced_workflow_not_found_handler
from fuzzforge_sdk import FuzzForgeClient

console = Console()
app = typer.Typer()


def get_client() -> FuzzForgeClient:
    """Get configured FuzzForge client"""
    config = get_project_config() or FuzzForgeConfig()
    return FuzzForgeClient(base_url=config.get_api_url(), timeout=config.get_timeout())


@app.command("list")
def list_workflows():
    """
    📋 List all available security testing workflows
    """
    try:
        with get_client() as client:
            workflows = client.list_workflows()

            if not workflows:
                console.print("❌ No workflows available", style="red")
                return

            table = Table(box=box.ROUNDED)
            table.add_column("Name", style="bold cyan")
            table.add_column("Version", justify="center")
            table.add_column("Description")
            table.add_column("Tags", style="dim")

            for workflow in workflows:
                tags_str = ", ".join(workflow.tags) if workflow.tags else ""
                table.add_row(
                    workflow.name,
                    workflow.version,
                    workflow.description,
                    tags_str,
                )

            console.print(f"\n🔧 [bold]Available Workflows ({len(workflows)})[/bold]\n")
            console.print(table)

            console.print("\n💡 Use [bold cyan]fuzzforge workflows info <name>[/bold cyan] for detailed information")

    except Exception as e:
        console.print(f"❌ Failed to fetch workflows: {e}", style="red")
        raise typer.Exit(1)


@app.command("info")
def workflow_info(
    name: str = typer.Argument(..., help="Workflow name to get information about")
):
    """
    📋 Show detailed information about a specific workflow
    """
    try:
        with get_client() as client:
            workflow = client.get_workflow_metadata(name)

            console.print(f"\n🔧 [bold]Workflow: {workflow.name}[/bold]\n")

            # Basic information
            info_table = Table(show_header=False, box=box.SIMPLE)
            info_table.add_column("Property", style="bold cyan")
            info_table.add_column("Value")

            info_table.add_row("Name", workflow.name)
            info_table.add_row("Version", workflow.version)
            info_table.add_row("Description", workflow.description)
            if workflow.author:
                info_table.add_row("Author", workflow.author)
            if workflow.tags:
                info_table.add_row("Tags", ", ".join(workflow.tags))
            info_table.add_row("Volume Modes", ", ".join(workflow.supported_volume_modes))
            info_table.add_row("Custom Docker", "✅ Yes" if workflow.has_custom_docker else "❌ No")

            console.print(
                Panel.fit(
                    info_table,
                    title="ℹ️ Basic Information",
                    box=box.ROUNDED,
                )
            )

            # Parameters
            if workflow.parameters:
                console.print("\n📝 [bold]Parameters Schema[/bold]")

                param_table = Table(box=box.ROUNDED)
                param_table.add_column("Parameter", style="bold")
                param_table.add_column("Type", style="cyan")
                param_table.add_column("Required", justify="center")
                param_table.add_column("Default")
                param_table.add_column("Description", style="dim")

                # Extract parameter information from JSON schema
                properties = workflow.parameters.get("properties", {})
                required_params = set(workflow.parameters.get("required", []))
                defaults = workflow.default_parameters

                for param_name, param_schema in properties.items():
                    param_type = param_schema.get("type", "unknown")
                    is_required = "✅" if param_name in required_params else "❌"
                    default_val = str(defaults.get(param_name, "")) if param_name in defaults else ""
                    description = param_schema.get("description", "")

                    # Handle array types
                    if param_type == "array":
                        items_type = param_schema.get("items", {}).get("type", "unknown")
                        param_type = f"array[{items_type}]"

                    param_table.add_row(
                        param_name,
                        param_type,
                        is_required,
                        default_val[:30] + "..." if len(default_val) > 30 else default_val,
                        description[:50] + "..." if len(description) > 50 else description,
                    )

                console.print(param_table)

            # Required modules
            if workflow.required_modules:
                console.print(f"\n🔧 [bold]Required Modules:[/bold] {', '.join(workflow.required_modules)}")

            console.print(f"\n💡 Use [bold cyan]fuzzforge workflows parameters {name}[/bold cyan] for interactive parameter builder")

    except Exception as e:
        error_message = str(e)
        if "not found" in error_message.lower() or "404" in error_message:
            # Try fuzzy matching for workflow name
            enhanced_workflow_not_found_handler(name)
        else:
            console.print(f"❌ Failed to get workflow info: {e}", style="red")
            raise typer.Exit(1)


@app.command("parameters")
def workflow_parameters(
    name: str = typer.Argument(..., help="Workflow name"),
    output_file: Optional[str] = typer.Option(
        None, "--output", "-o",
        help="Save parameters to JSON file"
    ),
    interactive: bool = typer.Option(
        True, "--interactive/--no-interactive", "-i/-n",
        help="Interactive parameter builder"
    ),
):
    """
    📝 Interactive parameter builder for workflows
    """
    try:
        with get_client() as client:
            workflow = client.get_workflow_metadata(name)
            param_response = client.get_workflow_parameters(name)

            console.print(f"\n📝 [bold]Parameter Builder: {name}[/bold]\n")

            if not workflow.parameters.get("properties"):
                console.print("ℹ️ This workflow has no configurable parameters")
                return

            parameters = {}
            properties = workflow.parameters.get("properties", {})
            required_params = set(workflow.parameters.get("required", []))
            defaults = param_response.defaults

            if interactive:
                console.print("🔧 Enter parameter values (press Enter for default):\n")

                for param_name, param_schema in properties.items():
                    param_type = param_schema.get("type", "string")
                    description = param_schema.get("description", "")
                    is_required = param_name in required_params
                    default_value = defaults.get(param_name)

                    # Build prompt
                    prompt_text = f"{param_name}"
                    if description:
                        prompt_text += f" ({description})"
                    if param_type:
                        prompt_text += f" [{param_type}]"
                    if is_required:
                        prompt_text += " [bold red]*required*[/bold red]"

                    # Get user input
                    while True:
                        if default_value is not None:
                            user_input = Prompt.ask(
                                prompt_text,
                                default=str(default_value),
                                console=console,
                            )
                        else:
                            user_input = Prompt.ask(
                                prompt_text,
                                console=console,
                            )

                        # Validate and convert input
                        if user_input.strip() == "" and not is_required:
                            break

                        if user_input.strip() == "" and is_required:
                            console.print("❌ This parameter is required", style="red")
                            continue

                        try:
                            # Type conversion
                            if param_type == "integer":
                                parameters[param_name] = int(user_input)
                            elif param_type == "number":
                                parameters[param_name] = float(user_input)
                            elif param_type == "boolean":
                                parameters[param_name] = user_input.lower() in ("true", "yes", "1", "on")
                            elif param_type == "array":
                                # Simple comma-separated array
                                parameters[param_name] = [item.strip() for item in user_input.split(",") if item.strip()]
                            else:
                                parameters[param_name] = user_input

                            break

                        except ValueError as e:
                            console.print(f"❌ Invalid {param_type}: {e}", style="red")

                # Show summary
                console.print("\n📋 [bold]Parameter Summary:[/bold]")
                summary_table = Table(show_header=False, box=box.SIMPLE)
                summary_table.add_column("Parameter", style="cyan")
                summary_table.add_column("Value", style="white")

                for key, value in parameters.items():
                    summary_table.add_row(key, str(value))

                console.print(summary_table)

            else:
                # Non-interactive mode - show schema
                console.print("📋 Parameter Schema:")
                schema_json = json.dumps(workflow.parameters, indent=2)
                console.print(Syntax(schema_json, "json", theme="monokai"))

                if defaults:
                    console.print("\n📋 Default Values:")
                    defaults_json = json.dumps(defaults, indent=2)
                    console.print(Syntax(defaults_json, "json", theme="monokai"))

            # Save to file if requested
            if output_file:
                if parameters or not interactive:
                    data_to_save = parameters if interactive else {"schema": workflow.parameters, "defaults": defaults}
                    with open(output_file, "w") as f:
                        json.dump(data_to_save, f, indent=2)
                    console.print(f"\n💾 Parameters saved to: {output_file}")
                else:
                    console.print("\n❌ No parameters to save", style="red")

    except Exception as e:
        console.print(f"❌ Failed to build parameters: {e}", style="red")
        raise typer.Exit(1)


@app.callback(invoke_without_command=True)
def workflows_callback(ctx: typer.Context):
    """
    🔧 Manage security testing workflows
    """
    # Check if a subcommand is being invoked
    if ctx.invoked_subcommand is not None:
        # Let the subcommand handle it
        return

    # Default to list when no subcommand provided
    list_workflows()
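The interactive builder above converts raw prompt strings according to the JSON-schema `"type"` field. The conversion rules can be isolated into a small pure function (the name `convert_input` is illustrative, not from the codebase):

```python
def convert_input(user_input: str, param_type: str):
    """Convert raw prompt input according to a JSON-schema "type" field,
    using the same conversion rules as the interactive builder."""
    if param_type == "integer":
        return int(user_input)          # may raise ValueError, caught by the caller
    if param_type == "number":
        return float(user_input)
    if param_type == "boolean":
        return user_input.lower() in ("true", "yes", "1", "on")
    if param_type == "array":
        # Simple comma-separated list; blank items are dropped
        return [item.strip() for item in user_input.split(",") if item.strip()]
    # "string" and unknown types pass through unchanged
    return user_input
```

As in the command, `ValueError` from `int`/`float` is left for the caller to handle, which is what lets the prompt loop re-ask on invalid input.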
190  cli/src/fuzzforge_cli/completion.py  Normal file
@@ -0,0 +1,190 @@
"""
Shell auto-completion support for FuzzForge CLI.

Provides intelligent tab completion for commands, workflows, run IDs, and parameters.
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.


import typer
from typing import List, Optional
from pathlib import Path

from .config import get_project_config, FuzzForgeConfig
from .database import get_project_db
from fuzzforge_sdk import FuzzForgeClient


def complete_workflow_names(incomplete: str) -> List[str]:
    """Auto-complete workflow names from the API."""
    try:
        config = get_project_config() or FuzzForgeConfig()
        with FuzzForgeClient(base_url=config.get_api_url(), timeout=5.0) as client:
            workflows = client.list_workflows()
            workflow_names = [w.name for w in workflows]
            return [name for name in workflow_names if name.startswith(incomplete)]
    except Exception:
        # Fallback to common workflow names if API is unavailable
        common_workflows = [
            "security_assessment",
            "language_fuzzing",
            "infrastructure_scan",
            "static_analysis_scan",
            "penetration_testing_scan",
            "secret_detection_scan",
        ]
        return [name for name in common_workflows if name.startswith(incomplete)]


def complete_run_ids(incomplete: str) -> List[str]:
    """Auto-complete run IDs from local database."""
    try:
        db = get_project_db()
        if db:
            runs = db.get_recent_runs(limit=50)  # Get recent runs for completion
            run_ids = [run.run_id for run in runs]
            return [run_id for run_id in run_ids if run_id.startswith(incomplete)]
    except Exception:
        pass
    return []


def complete_target_paths(incomplete: str) -> List[str]:
    """Auto-complete file/directory paths."""
    try:
        # Convert incomplete path to Path object
        path = Path(incomplete) if incomplete else Path.cwd()

        if path.is_dir():
            # Complete directory contents
            try:
                entries = []
                for entry in path.iterdir():
                    entry_str = str(entry)
                    if entry.is_dir():
                        entry_str += "/"
                    entries.append(entry_str)
                return entries
            except PermissionError:
                return []
        else:
            # Complete parent directory contents that match the incomplete name
            parent = path.parent
            name = path.name
            try:
                entries = []
                for entry in parent.iterdir():
                    if entry.name.startswith(name):
                        entry_str = str(entry)
                        if entry.is_dir():
                            entry_str += "/"
                        entries.append(entry_str)
                return entries
            except (PermissionError, FileNotFoundError):
                return []
    except Exception:
        return []


def complete_volume_modes(incomplete: str) -> List[str]:
    """Auto-complete volume mount modes."""
    modes = ["ro", "rw"]
    return [mode for mode in modes if mode.startswith(incomplete)]


def complete_export_formats(incomplete: str) -> List[str]:
    """Auto-complete export formats."""
    formats = ["json", "csv", "html", "sarif"]
    return [fmt for fmt in formats if fmt.startswith(incomplete)]


def complete_severity_levels(incomplete: str) -> List[str]:
    """Auto-complete severity levels."""
    severities = ["critical", "high", "medium", "low", "info"]
    return [sev for sev in severities if sev.startswith(incomplete)]


def complete_workflow_tags(incomplete: str) -> List[str]:
    """Auto-complete workflow tags."""
    try:
        config = get_project_config() or FuzzForgeConfig()
        with FuzzForgeClient(base_url=config.get_api_url(), timeout=5.0) as client:
            workflows = client.list_workflows()
            all_tags = set()
            for w in workflows:
                if w.tags:
                    all_tags.update(w.tags)
            return [tag for tag in sorted(all_tags) if tag.startswith(incomplete)]
    except Exception:
        # Fallback tags
        common_tags = [
            "security", "fuzzing", "static-analysis", "infrastructure",
            "secrets", "containers", "vulnerabilities", "pentest",
        ]
        return [tag for tag in common_tags if tag.startswith(incomplete)]


def complete_config_keys(incomplete: str) -> List[str]:
    """Auto-complete configuration keys."""
    config_keys = [
        "api_url",
        "api_timeout",
        "default_workflow",
        "default_volume_mode",
        "project_name",
        "data_retention_days",
        "auto_save_findings",
        "notification_webhook",
    ]
    return [key for key in config_keys if key.startswith(incomplete)]


# Completion callbacks for Typer
WorkflowNameComplete = typer.Option(
    autocompletion=complete_workflow_names,
    help="Workflow name (tab completion available)",
)

RunIdComplete = typer.Option(
    autocompletion=complete_run_ids,
    help="Run ID (tab completion available)",
)

TargetPathComplete = typer.Argument(
    autocompletion=complete_target_paths,
    help="Target path (tab completion available)",
)

VolumeModeComplete = typer.Option(
    autocompletion=complete_volume_modes,
    help="Volume mode: ro or rw (tab completion available)",
)

ExportFormatComplete = typer.Option(
    autocompletion=complete_export_formats,
    help="Export format (tab completion available)",
)

SeverityComplete = typer.Option(
    autocompletion=complete_severity_levels,
    help="Severity level (tab completion available)",
)

WorkflowTagComplete = typer.Option(
    autocompletion=complete_workflow_tags,
    help="Workflow tag (tab completion available)",
)

ConfigKeyComplete = typer.Option(
    autocompletion=complete_config_keys,
    help="Configuration key (tab completion available)",
)
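Every completer in this module reduces to the same pattern: build (or fetch) a candidate list, then keep only the entries matching the partial word the shell passes in. A minimal sketch of that shared filter (the name `filter_prefix` is illustrative):

```python
def filter_prefix(candidates, incomplete):
    """Keep only the candidates that start with the shell's partial word."""
    return [c for c in candidates if c.startswith(incomplete)]


severities = ["critical", "high", "medium", "low", "info"]
print(filter_prefix(severities, ""))   # empty prefix matches everything
print(filter_prefix(severities, "c"))
```

An empty `incomplete` string matches every candidate, which is why pressing Tab with no text typed lists all options.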
420  cli/src/fuzzforge_cli/config.py  Normal file
@@ -0,0 +1,420 @@
"""
Configuration management for FuzzForge CLI.

Extends project configuration with Cognee integration metadata
and provides helpers for AI modules.
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.


from __future__ import annotations

import hashlib
import os
from pathlib import Path
from typing import Any, Dict, Optional

try:  # Optional dependency; fall back if not installed
    from dotenv import load_dotenv
except ImportError:  # pragma: no cover - optional dependency
    load_dotenv = None

import yaml
from pydantic import BaseModel, Field


def _generate_project_id(project_dir: Path, project_name: str) -> str:
    """Generate a deterministic project identifier based on path and name."""
    resolved_path = str(project_dir.resolve())
    hash_input = f"{resolved_path}:{project_name}".encode()
    return hashlib.sha256(hash_input).hexdigest()[:16]

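The `_generate_project_id` helper makes project identity a pure function of the resolved path plus the project name, so the same project always hashes to the same 16-hex-character id across runs. A standalone sketch of the same scheme (the public name `generate_project_id` is illustrative):

```python
import hashlib
from pathlib import Path


def generate_project_id(project_dir: Path, project_name: str) -> str:
    """Hash the resolved path plus the project name and keep the first
    16 hex characters, matching the _generate_project_id scheme above."""
    resolved = str(project_dir.resolve())
    return hashlib.sha256(f"{resolved}:{project_name}".encode()).hexdigest()[:16]


pid = generate_project_id(Path("."), "demo")
assert pid == generate_project_id(Path("."), "demo")  # deterministic
```

Because the id is derived rather than random, re-running `ff init` in the same directory reuses the same Cognee data directories instead of creating new ones.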
class ProjectConfig(BaseModel):
|
||||
"""Project configuration model."""
|
||||
|
||||
name: str = "fuzzforge-project"
|
||||
api_url: str = "http://localhost:8000"
|
||||
default_timeout: int = 3600
|
||||
default_workflow: Optional[str] = None
|
||||
id: Optional[str] = None
|
||||
tenant_id: Optional[str] = None
|
||||
|
||||
|
||||
class RetentionConfig(BaseModel):
|
||||
"""Data retention configuration."""
|
||||
|
||||
max_runs: int = 100
|
||||
keep_findings_days: int = 90
|
||||
|
||||
|
||||
class PreferencesConfig(BaseModel):
|
||||
"""User preferences."""
|
||||
|
||||
auto_save_findings: bool = True
|
||||
show_progress_bars: bool = True
|
||||
table_style: str = "rich"
|
||||
color_output: bool = True
|
||||
|
||||
|
||||
class CogneeConfig(BaseModel):
|
||||
"""Cognee integration metadata."""
|
||||
|
||||
enabled: bool = True
|
||||
graph_database_provider: str = "kuzu"
|
||||
data_directory: Optional[str] = None
|
||||
system_directory: Optional[str] = None
|
||||
backend_access_control: bool = True
|
||||
project_id: Optional[str] = None
|
||||
tenant_id: Optional[str] = None
|
||||
|
||||
|
||||
class FuzzForgeConfig(BaseModel):
|
||||
"""Complete FuzzForge CLI configuration."""
|
||||
|
||||
project: ProjectConfig = Field(default_factory=ProjectConfig)
|
||||
retention: RetentionConfig = Field(default_factory=RetentionConfig)
|
||||
preferences: PreferencesConfig = Field(default_factory=PreferencesConfig)
|
||||
cognee: CogneeConfig = Field(default_factory=CogneeConfig)
|
||||
|
||||
@classmethod
|
||||
def from_file(cls, config_path: Path) -> "FuzzForgeConfig":
|
||||
"""Load configuration from YAML file."""
|
||||
if not config_path.exists():
|
||||
return cls()
|
||||
|
||||
try:
|
||||
with open(config_path, "r", encoding="utf-8") as fh:
|
||||
data = yaml.safe_load(fh) or {}
|
||||
return cls(**data)
|
||||
except Exception as exc: # pragma: no cover - defensive fallback
|
||||
print(f"Warning: Failed to load config from {config_path}: {exc}")
|
||||
return cls()
|
||||
|
||||
def save_to_file(self, config_path: Path) -> None:
|
||||
"""Save configuration to YAML file."""
|
||||
config_path.parent.mkdir(parents=True, exist_ok=True)
|
||||
with open(config_path, "w", encoding="utf-8") as fh:
|
||||
yaml.dump(
|
||||
self.model_dump(),
|
||||
fh,
|
||||
default_flow_style=False,
|
||||
sort_keys=False,
|
||||
)
|
||||
|
||||
# ------------------------------------------------------------------
|
||||
# Convenience helpers used by CLI and AI modules
|
||||
# ------------------------------------------------------------------
|
||||
def ensure_project_metadata(self, project_dir: Path) -> bool:
|
||||
"""Ensure project id/tenant metadata is populated."""
|
||||
changed = False
|
||||
project = self.project
|
||||
if not project.id:
|
||||
project.id = _generate_project_id(project_dir, project.name)
|
||||
changed = True
|
||||
if not project.tenant_id:
|
||||
project.tenant_id = f"fuzzforge_project_{project.id}"
|
||||
changed = True
|
||||
return changed
|
||||
|
||||
def ensure_cognee_defaults(self, project_dir: Path) -> bool:
|
||||
"""Ensure Cognee configuration and directories exist."""
|
||||
self.ensure_project_metadata(project_dir)
|
||||
changed = False
|
||||
|
||||
cognee = self.cognee
|
||||
if not cognee.project_id:
|
||||
cognee.project_id = self.project.id
|
||||
changed = True
|
||||
if not cognee.tenant_id:
|
||||
cognee.tenant_id = self.project.tenant_id
|
||||
changed = True
|
||||
|
||||
base_dir = project_dir / ".fuzzforge" / "cognee" / f"project_{self.project.id}"
|
||||
data_dir = base_dir / "data"
|
||||
system_dir = base_dir / "system"
|
||||
|
||||
for path in (
|
||||
base_dir,
|
||||
data_dir,
|
||||
system_dir,
|
||||
system_dir / "kuzu_db",
|
||||
system_dir / "lancedb",
|
||||
):
|
||||
if not path.exists():
|
||||
path.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
if cognee.data_directory != str(data_dir):
|
||||
cognee.data_directory = str(data_dir)
|
||||
changed = True
|
||||
if cognee.system_directory != str(system_dir):
|
||||
cognee.system_directory = str(system_dir)
|
||||
changed = True
|
||||
|
||||
return changed
|
||||
|
||||
def get_api_url(self) -> str:
|
||||
"""Get API URL with environment variable override."""
|
||||
return os.getenv("FUZZFORGE_API_URL", self.project.api_url)
|
||||
|
||||
def get_timeout(self) -> int:
|
||||
"""Get timeout with environment variable override."""
|
||||
env_timeout = os.getenv("FUZZFORGE_TIMEOUT")
|
||||
if env_timeout and env_timeout.isdigit():
|
||||
return int(env_timeout)
|
||||
return self.project.default_timeout
|
||||
|
||||
def get_project_context(self, project_dir: Path) -> Dict[str, str]:
|
||||
"""Return project metadata for AI integrations."""
|
||||
self.ensure_cognee_defaults(project_dir)
|
||||
return {
|
||||
"project_id": self.project.id or "unknown_project",
|
||||
"project_name": self.project.name,
|
||||
"tenant_id": self.project.tenant_id or "fuzzforge_tenant",
|
||||
"data_directory": self.cognee.data_directory,
|
||||
"system_directory": self.cognee.system_directory,
|
||||
}
|
||||
|
||||
def get_cognee_config(self, project_dir: Path) -> Dict[str, Any]:
|
||||
"""Expose Cognee configuration as a plain dictionary."""
|
||||
self.ensure_cognee_defaults(project_dir)
|
||||
return self.cognee.model_dump()
|
||||
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# Project-level helpers used across the CLI
|
||||
# ----------------------------------------------------------------------
|
||||
|
||||
def _get_project_paths(project_dir: Path) -> Dict[str, Path]:
|
||||
config_dir = project_dir / ".fuzzforge"
|
||||
return {
|
||||
"config_dir": config_dir,
|
||||
"config_path": config_dir / "config.yaml",
|
||||
}
|
||||
|
||||
|
||||
def get_project_config(project_dir: Optional[Path] = None) -> Optional[FuzzForgeConfig]:
|
||||
"""Get configuration for the current project."""
|
||||
project_dir = Path(project_dir or Path.cwd())
|
||||
paths = _get_project_paths(project_dir)
|
||||
config_path = paths["config_path"]
|
||||
|
||||
if not config_path.exists():
|
||||
return None
|
||||
|
||||
config = FuzzForgeConfig.from_file(config_path)
|
||||
if config.ensure_cognee_defaults(project_dir):
|
||||
config.save_to_file(config_path)
|
||||
return config
|
||||
|
||||
|
||||
def ensure_project_config(
|
||||
project_dir: Optional[Path] = None,
|
||||
project_name: Optional[str] = None,
|
||||
api_url: Optional[str] = None,
|
||||
) -> FuzzForgeConfig:
|
||||
"""Ensure project configuration exists, creating defaults if needed."""
|
||||
project_dir = Path(project_dir or Path.cwd())
|
||||
paths = _get_project_paths(project_dir)
|
||||
config_dir = paths["config_dir"]
|
||||
config_path = paths["config_path"]
|
||||
|
||||
config_dir.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
if config_path.exists():
|
||||
config = FuzzForgeConfig.from_file(config_path)
|
||||
else:
|
||||
config = FuzzForgeConfig()
|
||||
|
||||
if project_name:
|
||||
config.project.name = project_name
|
||||
if api_url:
|
||||
config.project.api_url = api_url
|
||||
|
||||
if config.ensure_cognee_defaults(project_dir):
|
||||
config.save_to_file(config_path)
|
||||
else:
|
||||
# Still ensure latest values persisted (e.g., updated name/url)
|
||||
config.save_to_file(config_path)
|
||||
|
||||
return config
|
||||
|
||||
|
||||
def get_global_config() -> FuzzForgeConfig:
|
||||
"""Get global user configuration."""
|
||||
home = Path.home()
|
||||
global_config_dir = home / ".config" / "fuzzforge"
|
||||
global_config_path = global_config_dir / "config.yaml"
|
||||
|
||||
if global_config_path.exists():
|
||||
return FuzzForgeConfig.from_file(global_config_path)
|
||||
|
||||
return FuzzForgeConfig()
|
||||
|
||||
|
||||
def save_global_config(config: FuzzForgeConfig) -> None:
|
||||
"""Save global user configuration."""
|
||||
home = Path.home()
|
||||
global_config_dir = home / ".config" / "fuzzforge"
|
||||
global_config_path = global_config_dir / "config.yaml"
|
||||
config.save_to_file(global_config_path)
|
||||
|
||||
|
||||
# ----------------------------------------------------------------------
# Compatibility layer for AI modules
# ----------------------------------------------------------------------

class ProjectConfigManager:
    """Lightweight wrapper mimicking the legacy Config class used by the AI module."""

    def __init__(self, project_dir: Optional[Path] = None):
        self.project_dir = Path(project_dir or Path.cwd())
        paths = _get_project_paths(self.project_dir)
        self.config_path = paths["config_dir"]
        self.file_path = paths["config_path"]
        self._config = get_project_config(self.project_dir)
        if self._config is None:
            raise FileNotFoundError(
                f"FuzzForge project not initialized in {self.project_dir}. Run 'ff init'."
            )

    # Legacy API ------------------------------------------------------
    def is_initialized(self) -> bool:
        return self.file_path.exists()

    def get_project_context(self) -> Dict[str, str]:
        return self._config.get_project_context(self.project_dir)

    def get_cognee_config(self) -> Dict[str, Any]:
        return self._config.get_cognee_config(self.project_dir)

    def setup_cognee_environment(self) -> None:
        cognee = self.get_cognee_config()
        if not cognee.get("enabled", True):
            return

        # Load project-specific environment overrides from .fuzzforge/.env if available
        env_file = self.project_dir / ".fuzzforge" / ".env"
        if env_file.exists():
            if load_dotenv:
                load_dotenv(env_file, override=False)
            else:
                try:
                    for line in env_file.read_text(encoding="utf-8").splitlines():
                        stripped = line.strip()
                        if not stripped or stripped.startswith("#"):
                            continue
                        if "=" not in stripped:
                            continue
                        key, value = stripped.split("=", 1)
                        os.environ.setdefault(key.strip(), value.strip())
                except Exception:  # pragma: no cover - best effort fallback
                    pass

        backend_access = "true" if cognee.get("backend_access_control", True) else "false"
        os.environ["ENABLE_BACKEND_ACCESS_CONTROL"] = backend_access
        os.environ["GRAPH_DATABASE_PROVIDER"] = cognee.get("graph_database_provider", "kuzu")

        data_dir = cognee.get("data_directory")
        system_dir = cognee.get("system_directory")
        tenant_id = cognee.get("tenant_id", "fuzzforge_tenant")

        if data_dir:
            os.environ["COGNEE_DATA_ROOT"] = data_dir
        if system_dir:
            os.environ["COGNEE_SYSTEM_ROOT"] = system_dir
        os.environ["COGNEE_USER_ID"] = tenant_id
        os.environ["COGNEE_TENANT_ID"] = tenant_id

        # Configure LLM provider defaults for Cognee. Values prefixed with COGNEE_
        # take precedence so users can segregate credentials.
        def _env(*names: str, default: str | None = None) -> str | None:
            for name in names:
                value = os.getenv(name)
                if value:
                    return value
            return default

        provider = _env(
            "LLM_COGNEE_PROVIDER",
            "COGNEE_LLM_PROVIDER",
            "LLM_PROVIDER",
            default="openai",
        )
        model = _env(
            "LLM_COGNEE_MODEL",
            "COGNEE_LLM_MODEL",
            "LLM_MODEL",
            "LITELLM_MODEL",
            default="gpt-4o-mini",
        )
        api_key = _env(
            "LLM_COGNEE_API_KEY",
            "COGNEE_LLM_API_KEY",
            "LLM_API_KEY",
            "OPENAI_API_KEY",
        )
        endpoint = _env("LLM_COGNEE_ENDPOINT", "COGNEE_LLM_ENDPOINT", "LLM_ENDPOINT")
        api_version = _env(
            "LLM_COGNEE_API_VERSION",
            "COGNEE_LLM_API_VERSION",
            "LLM_API_VERSION",
        )
        max_tokens = _env(
            "LLM_COGNEE_MAX_TOKENS",
            "COGNEE_LLM_MAX_TOKENS",
            "LLM_MAX_TOKENS",
        )

        if provider:
            os.environ["LLM_PROVIDER"] = provider
        if model:
            os.environ["LLM_MODEL"] = model
            # Maintain backwards compatibility with components expecting LITELLM_MODEL
            os.environ.setdefault("LITELLM_MODEL", model)
        if api_key:
            os.environ["LLM_API_KEY"] = api_key
            # Provide OPENAI_API_KEY fallback when using OpenAI-compatible providers
            if provider and provider.lower() in {"openai", "azure_openai", "custom"}:
                os.environ.setdefault("OPENAI_API_KEY", api_key)
        if endpoint:
            os.environ["LLM_ENDPOINT"] = endpoint
        if api_version:
            os.environ["LLM_API_VERSION"] = api_version
        if max_tokens:
            os.environ["LLM_MAX_TOKENS"] = str(max_tokens)

        # Provide a default MCP endpoint for local FuzzForge backend access when unset
        if not os.getenv("FUZZFORGE_MCP_URL"):
            os.environ["FUZZFORGE_MCP_URL"] = os.getenv(
                "FUZZFORGE_DEFAULT_MCP_URL",
                "http://localhost:8010/mcp",
            )

    def refresh(self) -> None:
        """Reload configuration from disk."""
        self._config = get_project_config(self.project_dir)
        if self._config is None:
            raise FileNotFoundError(
                f"FuzzForge project not initialized in {self.project_dir}. Run 'ff init'."
            )

    # Convenience accessors ------------------------------------------
    @property
    def fuzzforge_dir(self) -> Path:
        return self.config_path

    def get_api_url(self) -> str:
        return self._config.get_api_url()

    def get_timeout(self) -> int:
        return self._config.get_timeout()
73
cli/src/fuzzforge_cli/constants.py
Normal file
@@ -0,0 +1,73 @@
"""
Constants for FuzzForge CLI.
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.


# Database constants
DEFAULT_DB_TIMEOUT = 30.0
DEFAULT_CLEANUP_DAYS = 90
STATS_SAMPLE_SIZE = 100

# Network constants
DEFAULT_API_TIMEOUT = 30.0
MAX_RETRIES = 3
RETRY_DELAY = 1.0
POLL_INTERVAL = 5.0

# Display constants
MAX_RUN_ID_DISPLAY_LENGTH = 15
MAX_DESCRIPTION_LENGTH = 50
MAX_DEFAULT_VALUE_LENGTH = 30

# Progress constants
PROGRESS_STEP_DELAYS = {
    "validating": 0.3,
    "connecting": 0.2,
    "uploading": 0.2,
    "creating": 0.3,
    "initializing": 0.2
}

# Status emojis
STATUS_EMOJIS = {
    "completed": "✅",
    "running": "🔄",
    "failed": "❌",
    "queued": "⏳",
    "cancelled": "⏹️",
    "pending": "📋",
    "unknown": "❓"
}

# Severity styles for Rich
SEVERITY_STYLES = {
    "error": "bold red",
    "warning": "bold yellow",
    "note": "bold blue",
    "info": "bold cyan"
}

# Default volume modes
DEFAULT_VOLUME_MODE = "ro"
SUPPORTED_VOLUME_MODES = ["ro", "rw"]

# Default export formats
DEFAULT_EXPORT_FORMAT = "sarif"
SUPPORTED_EXPORT_FORMATS = ["sarif", "json", "csv"]

# Default configuration
DEFAULT_CONFIG = {
    "api_url": "http://localhost:8000",
    "timeout": DEFAULT_API_TIMEOUT,
    "max_retries": MAX_RETRIES,
}
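To show how the display constants above might fit together, here is a hedged sketch that truncates fields to their display limits and prefixes a status emoji. `truncate` and `format_run_row` are illustrative helpers, not functions from the FuzzForge CLI; the constants are restated locally so the block runs on its own.

```python
MAX_RUN_ID_DISPLAY_LENGTH = 15
MAX_DESCRIPTION_LENGTH = 50
STATUS_EMOJIS = {"completed": "✅", "running": "🔄", "unknown": "❓"}


def truncate(text: str, limit: int) -> str:
    """Trim text to at most `limit` characters, marking the cut with an ellipsis."""
    return text if len(text) <= limit else text[: limit - 1] + "…"


def format_run_row(run_id: str, status: str, description: str) -> str:
    """Render one run as a single display line; unknown statuses get a fallback emoji."""
    emoji = STATUS_EMOJIS.get(status, STATUS_EMOJIS["unknown"])
    return (
        f"{emoji} {truncate(run_id, MAX_RUN_ID_DISPLAY_LENGTH)}  "
        f"{truncate(description, MAX_DESCRIPTION_LENGTH)}"
    )


row = format_run_row("a1b2c3d4e5f6a7b8c9", "completed", "Fuzzing run for libpng")
assert row.startswith("✅ ")
assert len(truncate("a1b2c3d4e5f6a7b8c9", MAX_RUN_ID_DISPLAY_LENGTH)) == 15
```

Capping each field before rendering keeps table rows at a fixed width, which is why the limits live in a shared constants module rather than in each command.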