feat: Add Python fuzzing vertical with Atheris integration

This commit implements a complete Python fuzzing workflow using Atheris:

## Python Worker (workers/python/)
- Dockerfile with Python 3.11, Atheris, and build tools
- Generic worker.py for dynamic workflow discovery
- requirements.txt with temporalio, boto3, atheris dependencies
- Added to docker-compose.temporal.yaml with dedicated cache volume

## AtherisFuzzer Module (backend/toolbox/modules/fuzzer/)
- Reusable module extending BaseModule
- Auto-discovers fuzz targets (fuzz_*.py, *_fuzz.py, fuzz_target.py)
- Recursive search to find targets in nested directories
- Dynamically loads TestOneInput() function
- Configurable max_iterations and timeout
- Real-time stats callback support for live monitoring
- Returns findings as ModuleFinding objects
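
The discovery and loading steps listed above could look roughly like the sketch below. It is not the module's actual code: `discover_fuzz_targets` and `load_test_one_input` are hypothetical names, and only the file patterns and the `TestOneInput` entry point come from this commit.

```python
import importlib.util
from pathlib import Path
from typing import Callable, List

# File patterns taken from the commit message; everything else is illustrative.
FUZZ_TARGET_PATTERNS = ("fuzz_*.py", "*_fuzz.py", "fuzz_target.py")


def discover_fuzz_targets(root: Path) -> List[Path]:
    """Recursively collect files whose names match a fuzz-target pattern."""
    found = set()  # a set, because fuzz_target.py matches two patterns
    for pattern in FUZZ_TARGET_PATTERNS:
        found.update(p for p in root.rglob(pattern) if p.is_file())
    return sorted(found)


def load_test_one_input(target: Path) -> Callable[[bytes], object]:
    """Import a target file by path and return its TestOneInput function."""
    spec = importlib.util.spec_from_file_location(target.stem, target)
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot load {target}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    func = getattr(module, "TestOneInput", None)
    if not callable(func):
        raise AttributeError(f"{target} does not define TestOneInput()")
    return func
```

Loading by file path rather than by module name avoids having to put every nested directory on `sys.path` just to find the harness.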

## Atheris Fuzzing Workflow (backend/toolbox/workflows/atheris_fuzzing/)
- Temporal workflow for orchestrating fuzzing
- Downloads user code from MinIO
- Executes AtherisFuzzer module
- Uploads results to MinIO
- Cleans up cache after execution
- metadata.yaml with vertical: python for routing
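
The ordering of these steps, with cleanup guaranteed even when fuzzing fails, can be sketched in plain Python. This is only an illustration of the control flow: in the real workflow these steps would be Temporal activities, and `run_fuzzing_pipeline` and its callable parameters are hypothetical stand-ins.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Dict, List


def run_fuzzing_pipeline(
    download: Callable[[Path], Path],
    fuzz: Callable[[Path], List[Dict]],
    upload: Callable[[List[Dict]], str],
) -> str:
    """Run download -> fuzz -> upload, always removing the local cache."""
    cache_dir = Path(tempfile.mkdtemp(prefix="fuzz-cache-"))
    try:
        workspace = download(cache_dir)  # fetch and extract the user code
        findings = fuzz(workspace)       # run the fuzzer module
        return upload(findings)          # store the results, return a reference
    finally:
        # Cleanup runs whether the steps above succeeded or raised.
        shutil.rmtree(cache_dir, ignore_errors=True)
```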

## Test Project (test_projects/python_fuzz_waterfall/)
- Demonstrates stateful waterfall vulnerability
- main.py with check_secret() that leaks progress
- fuzz_target.py with Atheris TestOneInput() harness
- Complete README with usage instructions

## Backend Fixes
- Fixed parameter merging in REST API endpoints (workflows.py)
- Changed workflow parameter passing from positional args to kwargs (manager.py)
- Default parameters now properly merged with user parameters
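
A minimal sketch of the merge described above, assuming defaults and user parameters are plain dictionaries. `merge_parameters` and `run_workflow` are hypothetical names, not functions from workflows.py or manager.py, and the default values are illustrative.

```python
from typing import Any, Dict, Optional


def merge_parameters(defaults: Dict[str, Any],
                     user: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Return defaults overlaid with user-supplied values (user wins)."""
    merged = dict(defaults)
    merged.update(user or {})
    return merged


def run_workflow(target_id: str, max_iterations: int = 100000,
                 timeout_seconds: int = 300) -> Dict[str, Any]:
    """Stand-in for a workflow entry point that takes keyword parameters."""
    return {
        "target_id": target_id,
        "max_iterations": max_iterations,
        "timeout_seconds": timeout_seconds,
    }


params = merge_parameters(
    {"max_iterations": 100000, "timeout_seconds": 300},
    {"timeout_seconds": 60},
)
# Passing by keyword binds each value to the right parameter,
# independent of the order in which the dictionary was built.
result = run_workflow("demo-target", **params)
```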

## Testing
- Worker discovered AtherisFuzzingWorkflow
- Workflow executed end-to-end successfully
- Fuzz target auto-discovered in nested directories
- Atheris ran 100,000 iterations
- Results uploaded and cache cleaned
Tanguy Duhamel
2025-10-02 11:06:34 +02:00
parent 0680f14df6
commit fe50d4ef72
16 changed files with 1668 additions and 13 deletions
@@ -0,0 +1,18 @@
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
*.egg-info/
dist/
build/
# FuzzForge
.fuzzforge/
# Atheris fuzzing artifacts
corpus/
crashes/
*.profraw
*.profdata
@@ -0,0 +1,137 @@
# Python Fuzzing Test - Waterfall Vulnerability
This project demonstrates a **stateful vulnerability** that Atheris can discover through fuzzing.
## Vulnerability Description
The `check_secret()` function in `main.py` validates input against the secret string "FUZZINGLABS" one character per call: each call compares only the byte at index `progress`. This creates a **waterfall vulnerability** where:
1. State leaks through the global `progress` variable
2. Each correct character advances the progress counter, and a wrong character resets it to 0
3. Once all 11 characters have been matched in order, the function raises `SystemError`
This pattern is analogous to:
- Timing attacks on password checkers
- Protocol state machines with sequential validation
- Multi-step authentication flows
## Files
- `main.py` - Main application with vulnerable `check_secret()` function
- `fuzz_target.py` - Atheris fuzzing harness (contains `TestOneInput()`)
- `README.md` - This file
## How to Fuzz
### Using FuzzForge CLI
```bash
# Initialize FuzzForge in this directory
cd test_projects/python_fuzz_waterfall/
ff init
# Run fuzzing workflow (uploads code to MinIO)
ff workflow run atheris_fuzzing .
# The workflow will:
# 1. Upload this directory to MinIO
# 2. Worker downloads and extracts the code
# 3. Worker discovers fuzz_target.py (has TestOneInput)
# 4. Worker runs Atheris fuzzing
# 5. Reports real-time stats every 5 seconds
# 6. Finds crash when "FUZZINGLABS" is discovered
```
### Using FuzzForge SDK
```python
from fuzzforge_sdk import FuzzForgeClient
from pathlib import Path

client = FuzzForgeClient(base_url="http://localhost:8000")

# Upload and run fuzzing
response = client.submit_workflow_with_upload(
    workflow_name="atheris_fuzzing",
    target_path=Path("./"),
    parameters={
        "max_iterations": 100000,
        "timeout_seconds": 300
    }
)
print(f"Workflow started: {response.run_id}")

# Wait for completion
final_status = client.wait_for_completion(response.run_id)
findings = client.get_run_findings(response.run_id)
for finding in findings:
    print(f"Crash: {finding.title}")
    print(f"Input: {finding.metadata.get('crash_input_hex')}")
```
### Standalone (Without FuzzForge)
```bash
# Install Atheris
pip install atheris
# Run fuzzing directly
python fuzz_target.py
```
## Expected Behavior
When fuzzing:
1. **Initial phase**: Random exploration, progress = 0
2. **Discovery phase**: Atheris finds 'F' (first char), progress = 1
3. **Incremental progress**: Finds 'U', then 'Z', etc.
4. **Crash**: When full "FUZZINGLABS" discovered, crashes with:
```
SystemError: SECRET COMPROMISED: FUZZINGLABS
```
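
The incremental discovery above is what makes the crash cheap to reach: each call reveals whether one more byte matched, so a search needs at most 256 guesses per position instead of guessing all 11 bytes at once. The sketch below illustrates this with a simplified, self-contained oracle. `make_oracle` and `recover_secret` are hypothetical helpers, not part of this project, and the oracle returns the match result instead of raising `SystemError`.

```python
from typing import Callable, Tuple

SECRET = b"FUZZINGLABS"


def make_oracle(secret: bytes) -> Callable[[bytes], bool]:
    """Build a stateful checker that, like check_secret(), tests one byte per call."""
    state = {"progress": 0}

    def oracle(data: bytes) -> bool:
        i = state["progress"]
        if i < len(secret) and len(data) > i and data[i] == secret[i]:
            state["progress"] = i + 1
            return True
        state["progress"] = 0  # simplified: any miss resets progress
        return False

    return oracle


def recover_secret(oracle: Callable[[bytes], bool], length: int) -> Tuple[bytes, int]:
    """Recover the secret byte by byte, counting oracle calls."""
    known = b""
    calls = 0
    while len(known) < length:
        for guess in range(256):
            calls += 1
            if oracle(known + bytes([guess])):
                known += bytes([guess])
                break
            # A wrong guess reset the oracle, so replay the known prefix
            # (one byte per call) to climb back to the same position.
            for _ in range(len(known)):
                oracle(known)
                calls += 1
        else:
            raise ValueError("no byte value matched at this position")
    return known, calls


recovered, calls = recover_secret(make_oracle(SECRET), len(SECRET))
print(recovered, calls, 256 ** len(SECRET))
```

The final `print` contrasts the number of oracle calls actually used with the size of the full 11-byte search space.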
## Monitoring
Watch real-time fuzzing stats:
```bash
docker logs fuzzforge-worker-python -f | grep LIVE_STATS
```
Output example:
```
INFO - LIVE_STATS - executions=1523 execs_per_sec=1523.0 crashes=0
INFO - LIVE_STATS - executions=7842 execs_per_sec=2104.2 crashes=0
INFO - LIVE_STATS - executions=15234 execs_per_sec=2167.0 crashes=1 ← Crash found!
```
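
For scripted monitoring, lines in the format shown above can be parsed with a small helper. This is a sketch that assumes the log format matches the sample output exactly; `parse_live_stats` is a hypothetical name, not part of FuzzForge.

```python
import re
from typing import Dict, Optional

# Matches key=value pairs such as executions=1523 or execs_per_sec=2104.2
_PAIR = re.compile(r"(\w+)=([0-9]+(?:\.[0-9]+)?)")


def parse_live_stats(line: str) -> Optional[Dict[str, float]]:
    """Return the numeric fields of a LIVE_STATS log line, or None for other lines."""
    if "LIVE_STATS" not in line:
        return None
    return {key: float(value) for key, value in _PAIR.findall(line)}
```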
## Vulnerability Details
**CVE**: N/A (demonstration vulnerability)
**CWE**: CWE-208 (Observable Timing Discrepancy)
**Severity**: Critical (in real systems)
**Fix**: Remove state-based checking or implement constant-time comparison:
```python
def check_secret_safe(input_data: bytes) -> bool:
    """Constant-time comparison"""
    import hmac
    return hmac.compare_digest(input_data, SECRET.encode())
```
## Adjusting Difficulty
If fuzzing finds the crash too quickly, extend the secret:
```python
# In main.py, change:
SECRET = "FUZZINGLABSSECURITYTESTING" # 26 characters instead of 11
```
## License
MIT License - This is a demonstration project for educational purposes.
@@ -0,0 +1,59 @@
"""
Atheris fuzzing target for the waterfall vulnerability.
This file is automatically discovered by FuzzForge's AtherisFuzzer module.
The fuzzer looks for files named: fuzz_*.py, *_fuzz.py, or fuzz_target.py
"""
import sys

import atheris

# Import the vulnerable function
from main import check_secret


def TestOneInput(data):
    """
    Atheris fuzzing entry point.

    This function is called by Atheris for each fuzzing iteration.
    The fuzzer will try to find inputs that cause crashes.

    Args:
        data: Bytes to test (generated by Atheris)

    The waterfall vulnerability means:
    - Random inputs will mostly fail (progress = 0)
    - Atheris will discover inputs that make progress
    - Eventually Atheris will find the complete secret "FUZZINGLABS"
    - When found, check_secret() will crash with SystemError
    """
    try:
        check_secret(bytes(data))
    except SystemError:
        # Let Atheris detect the crash
        # This is the vulnerability we're trying to find!
        raise


if __name__ == "__main__":
    """
    Standalone fuzzing mode.
    Run directly: python fuzz_target.py
    """
    print("=" * 60)
    print("Atheris Fuzzing - Waterfall Vulnerability")
    print("=" * 60)
    print("Fuzzing will try to discover the secret string...")
    print("Watch for progress indicators: [DEBUG] Progress: X/11")
    print()
    print("Press Ctrl+C to stop fuzzing")
    print("=" * 60)
    print()

    # Setup Atheris with command-line args
    atheris.Setup(sys.argv, TestOneInput)

    # Start fuzzing
    atheris.Fuzz()
@@ -0,0 +1,96 @@
"""
Example application with a stateful vulnerability.
This simulates a password checking system that leaks state information
through a global progress variable - a classic waterfall vulnerability.
"""
# Global state - simulates session state
progress = 0
SECRET = "FUZZINGLABS"  # 11 characters


def check_secret(input_data: bytes) -> bool:
    """
    Vulnerable function: checks secret character by character.

    This is a waterfall vulnerability - state leaks through the progress variable.

    Real-world analogy:
    - Timing attacks on password checkers
    - Protocol state machines with sequential validation
    - Multi-step authentication flows

    Args:
        input_data: Input bytes to check

    Returns:
        True if progress was made, False otherwise

    Raises:
        SystemError: When complete secret is discovered (vulnerability trigger)
    """
    global progress

    if len(input_data) > progress:
        if input_data[progress] == ord(SECRET[progress]):
            progress += 1

            # Progress indicator (useful for monitoring during fuzzing)
            if progress % 2 == 0:  # Every 2 characters
                print(f"[DEBUG] Progress: {progress}/{len(SECRET)} characters matched")

            # VULNERABILITY: Crashes when complete secret found
            if progress == len(SECRET):
                raise SystemError(f"SECRET COMPROMISED: {SECRET}")

            return True
        else:
            # Wrong character - reset progress
            progress = 0
            return False

    return False


def reset_state():
    """Reset the global state (useful for testing)"""
    global progress
    progress = 0
if __name__ == "__main__":
    """Example usage showing the vulnerability"""
    print("=" * 60)
    print("Waterfall Vulnerability Demonstration")
    print("=" * 60)
    print(f"Secret: {SECRET}")
    print(f"Secret length: {len(SECRET)} characters")
    print()

    # Test inputs showing progressive discovery
    test_inputs = [
        b"F",            # First char correct
        b"FU",           # First two chars correct
        b"FUZ",          # First three chars correct
        b"WRONG",        # Wrong - resets progress
        b"FUZZINGLABS",  # Complete secret - triggers crash!
    ]

    for test in test_inputs:
        reset_state()  # Start fresh for each test
        print(f"Testing input: {test.decode(errors='ignore')!r}")
        try:
            # check_secret() compares only one byte per call, so call it
            # once per input byte to walk through the whole input.
            result = False
            for _ in range(len(test)):
                result = check_secret(test)
            print(f"  Result: {result}, Progress: {progress}/{len(SECRET)}")
        except SystemError as e:
            print(f"  💥 CRASH: {e}")
        print()

    print("=" * 60)
    print("To fuzz this vulnerability with FuzzForge:")
    print("  ff init")
    print("  ff workflow run atheris_fuzzing .")
    print("=" * 60)