mirror of
https://github.com/FuzzingLabs/fuzzforge_ai.git
synced 2026-07-01 08:55:32 +02:00
Initial commit
@@ -0,0 +1,8 @@
{
  "label": "Reference",
  "position": 5,
  "link": {
    "type": "generated-index",
    "description": "Reference pages that are information-oriented."
  }
}
@@ -0,0 +1,305 @@
# FuzzForge AI Reference: CLI, Environment, and API

This reference covers the commands, built-in function tools, environment variables, and API endpoints of the FuzzForge AI system. Use it as a quick lookup for syntax, options, and integration details.

---

## CLI Commands Reference

| Command | Description | Example |
|---------|-------------|---------|
| `/register <url>` | Register an A2A agent | `/register http://localhost:10201` |
| `/unregister <name>` | Remove a registered agent | `/unregister CalculatorAgent` |
| `/list` | Show all registered agents | `/list` |
| `/memory [action]` | Knowledge graph operations | `/memory search security` |
| `/recall <query>` | Search conversation history | `/recall past calculations` |
| `/artifacts [id]` | List or view artifacts | `/artifacts artifact_abc123` |
| `/tasks [id]` | Show task status | `/tasks task_001` |
| `/skills` | Display FuzzForge skills | `/skills` |
| `/sessions` | List active sessions | `/sessions` |
| `/sendfile <agent> <path>` | Send file to agent | `/sendfile Analyzer ./code.py` |
| `/clear` | Clear the screen | `/clear` |
| `/help` | Show help | `/help` |
| `/quit` | Exit the CLI | `/quit` |

---

## Built-in Function Tools

### Knowledge Management

```python
search_project_knowledge(query, dataset, search_type)
list_project_knowledge()
ingest_to_dataset(content, dataset)
```

### File Operations

```python
list_project_files(path, pattern)
read_project_file(file_path, max_lines)
search_project_files(search_pattern, file_pattern, path)
```

### Agent Management

```python
get_agent_capabilities(agent_name)
send_file_to_agent(agent_name, file_path, note)
```

### FuzzForge Platform

```python
list_fuzzforge_workflows()
submit_security_scan_mcp(workflow_name, target_path, parameters)
get_comprehensive_scan_summary(run_id)
get_fuzzforge_run_status(run_id)
get_fuzzforge_summary(run_id)
get_fuzzforge_findings(run_id)
```

### Task Management

```python
create_task_list(tasks)
update_task_status(task_list_id, task_id, status)
get_task_list(task_list_id)
```

---

## Environment Variables

Set these in `.fuzzforge/.env` to configure your FuzzForge AI instance.

### Model Configuration

```env
LITELLM_MODEL=gpt-4o-mini        # Any LiteLLM-supported model
OPENAI_API_KEY=sk-...            # API key for model provider
ANTHROPIC_API_KEY=sk-ant-...     # For Claude models
GEMINI_API_KEY=...               # For Gemini models
```

### Memory & Persistence

```env
SESSION_PERSISTENCE=sqlite       # sqlite|inmemory
SESSION_DB_PATH=./fuzzforge_sessions.db
MEMORY_SERVICE=inmemory          # inmemory|vertexai
```

### Server & Communication

```env
FUZZFORGE_PORT=10100             # A2A server port
ARTIFACT_STORAGE=inmemory        # inmemory|gcs
GCS_ARTIFACT_BUCKET=artifacts    # For GCS storage
```

### Debug & Observability

```env
FUZZFORGE_DEBUG=1                # Enable debug logging
AGENTOPS_API_KEY=...             # Optional observability
```

### Platform Integration

```env
FUZZFORGE_MCP_URL=http://localhost:8010/mcp
```

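To sanity-check a `.env` file in the format shown above, a small parser can be handy. The sketch below is illustrative only: `parse_env_text` is a hypothetical helper, not part of FuzzForge, and it assumes that a trailing ` # ...` is a comment rather than part of the value. How FuzzForge itself loads the file is not specified here.

```python
from typing import Dict


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks, comment lines, and inline comments."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Assumption: whitespace followed by '#' starts an inline comment.
        value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value
    return values


sample = """
SESSION_PERSISTENCE=sqlite       # sqlite|inmemory
SESSION_DB_PATH=./fuzzforge_sessions.db
FUZZFORGE_PORT=10100             # A2A server port
"""

config = parse_env_text(sample)
port = int(config.get("FUZZFORGE_PORT", "10100"))
```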

---

## MCP (Model Context Protocol) Integration

FuzzForge supports the Model Context Protocol (MCP), which lets LLM clients and AI assistants interact directly with the security testing platform. All FastAPI endpoints are exposed as MCP-compatible tools, so any MCP-aware client can drive security automation.

### MCP Endpoints

- **HTTP MCP endpoint:** `http://localhost:8010/mcp`
- **SSE (Server-Sent Events):** `http://localhost:8010/mcp/sse`
- **Base API:** `http://localhost:8000`

### MCP Tools

- `submit_security_scan_mcp`: Submit security scanning workflows
- `get_comprehensive_scan_summary`: Get detailed scan analysis with recommendations

### FastAPI Endpoints (now MCP tools)

- `GET /`: API status
- `GET /workflows/`: List available workflows
- `POST /workflows/{workflow_name}/submit`: Submit security scans
- `GET /runs/{run_id}/status`: Check scan status
- `GET /runs/{run_id}/findings`: Get scan results
- `GET /fuzzing/{run_id}/stats`: Fuzzing statistics

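The FastAPI endpoints listed above can also be called directly over HTTP. The following is a minimal sketch using only the Python standard library. The helper names (`build_submit_request`, `build_status_request`, `send_json`) are hypothetical, the request body fields mirror the `curl` example in the troubleshooting section of this page, and the shape of the JSON responses is not documented here, so the code does not assume any particular response fields.

```python
import json
import urllib.request
from typing import Any, Dict, Optional

BASE_API = "http://localhost:8000"  # the "Base API" address from this page


def build_submit_request(
    workflow_name: str,
    target_path: str,
    volume_mode: str = "ro",
    parameters: Optional[Dict[str, Any]] = None,
    base_url: str = BASE_API,
) -> urllib.request.Request:
    """Build (but do not send) the POST request that submits a workflow run."""
    payload: Dict[str, Any] = {"target_path": target_path, "volume_mode": volume_mode}
    if parameters:
        payload["parameters"] = parameters
    return urllib.request.Request(
        url=f"{base_url}/workflows/{workflow_name}/submit",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_status_request(run_id: str, base_url: str = BASE_API) -> urllib.request.Request:
    """Build the GET request that checks the status of a run."""
    return urllib.request.Request(url=f"{base_url}/runs/{run_id}/status", method="GET")


def send_json(request: urllib.request.Request, timeout: float = 30.0) -> Any:
    """Send a request and decode the JSON body. Requires a running backend."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


submit = build_submit_request("infrastructure_scan", "/path/to/your/project")
status = build_status_request("your-run-id-here")
```

With the backend running, `send_json(submit)` would perform the submission; inspect the returned JSON to find the run identifier before polling the status endpoint.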
### Usage Example: Submit a Security Scan via MCP

```json
{
  "tool": "submit_security_scan_mcp",
  "parameters": {
    "workflow_name": "infrastructure_scan",
    "target_path": "/path/to/your/project",
    "volume_mode": "ro",
    "parameters": {
      "checkov_config": {
        "severity": ["HIGH", "MEDIUM", "LOW"]
      },
      "hadolint_config": {
        "severity": ["error", "warning", "info", "style"]
      }
    }
  }
}
```

### Usage Example: Get a Comprehensive Scan Summary

```json
{
  "tool": "get_comprehensive_scan_summary",
  "parameters": {
    "run_id": "your-run-id-here"
  }
}
```

### Available Workflows

1. **infrastructure_scan**: Docker/Kubernetes/Terraform security analysis
2. **static_analysis_scan**: Code vulnerability detection
3. **secret_detection_scan**: Credential and secret scanning
4. **penetration_testing_scan**: Network and web app testing
5. **security_assessment**: Comprehensive security evaluation

### MCP Client Configuration Example

```json
{
  "mcpServers": {
    "fuzzforge": {
      "command": "curl",
      "args": ["-X", "POST", "http://localhost:8010/mcp"],
      "env": {}
    }
  }
}
```

### Troubleshooting MCP

- **MCP Connection Failed:**
  Check backend status:
  `docker compose ps fuzzforge-backend`
  `curl http://localhost:8000/health`

- **Workflows Not Found:**
  `curl http://localhost:8000/workflows/`

- **Scan Submission Errors:**
  `curl -X POST http://localhost:8000/workflows/infrastructure_scan/submit -H "Content-Type: application/json" -d '{"target_path": "/your/path", "volume_mode": "ro"}'`

- **General Support:**
  - Check Docker Compose logs: `docker compose logs fuzzforge-backend`
  - Verify MCP endpoint: `curl http://localhost:8010/mcp`
  - Test FastAPI endpoints directly before using MCP

For more, see the [How-To: MCP Integration](../how-to/mcp-integration.md).

---

## API Endpoints

When running as an A2A server (`python -m fuzzforge_ai --port 10100`):

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/.well-known/agent-card.json` | GET | Agent capabilities |
| `/` | POST | A2A message processing |
| `/artifacts/{artifact_id}` | GET | Artifact file serving |
| `/health` | GET | Health check |

### Example: Agent Card Format

```json
{
  "name": "FuzzForge",
  "description": "Multi-agent orchestrator with memory and security tools",
  "version": "1.0.0",
  "url": "http://localhost:10100",
  "protocolVersion": "0.3.0",
  "preferredTransport": "JSONRPC",
  "defaultInputModes": ["text/plain", "application/json"],
  "defaultOutputModes": ["text/plain", "application/json"],
  "capabilities": {
    "streaming": false,
    "pushNotifications": true,
    "multiTurn": true,
    "contextRetention": true
  },
  "skills": [
    {
      "id": "orchestration",
      "name": "Agent Orchestration",
      "description": "Route requests to appropriate agents",
      "tags": ["orchestration", "routing"]
    }
  ]
}
```

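A client will usually inspect a few fields of the agent card before sending messages. The sketch below shows one way to do that with the standard library. `summarize_agent_card` is a hypothetical helper and the field selection is an assumption about what a client cares about; the input is a trimmed copy of the card shown above.

```python
import json
from typing import Any, Dict


def summarize_agent_card(card: Dict[str, Any]) -> Dict[str, Any]:
    """Extract the name, URL, streaming flag, and skill ids from an agent card."""
    capabilities = card.get("capabilities", {})
    return {
        "name": card.get("name", ""),
        "url": card.get("url", ""),
        "streaming": bool(capabilities.get("streaming", False)),
        "skill_ids": [skill.get("id", "") for skill in card.get("skills", [])],
    }


card_text = """
{
  "name": "FuzzForge",
  "url": "http://localhost:10100",
  "capabilities": {"streaming": false, "pushNotifications": true},
  "skills": [{"id": "orchestration", "name": "Agent Orchestration"}]
}
"""

summary = summarize_agent_card(json.loads(card_text))
```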
### Example: A2A Message Format

```json
{
  "id": "msg_001",
  "method": "agent.invoke",
  "params": {
    "message": {
      "role": "user",
      "parts": [
        {
          "type": "text",
          "content": "Calculate factorial of 10"
        }
      ]
    },
    "context": {
      "sessionId": "session_abc123",
      "conversationId": "conv_001"
    }
  }
}
```

---

## Project Structure Reference

```
project_root/
├── .fuzzforge/                # Project-local config
│   ├── .env                   # Environment variables
│   ├── config.json            # Project configuration
│   ├── agents.yaml            # Registered agents
│   ├── sessions.db            # Session storage
│   ├── artifacts/             # Local artifact cache
│   └── data/                  # Knowledge graphs
└── your_project_files...
```

### Agent Registry Example (`agents.yaml`)

```yaml
registered_agents:
  - name: CalculatorAgent
    url: http://localhost:10201
    description: Mathematical calculations
  - name: SecurityAnalyzer
    url: http://localhost:10202
    description: Code security analysis
```

---

## Quick Troubleshooting

- **Agent Registration Fails:** Check that the agent is running and reachable at its URL.
- **Memory Not Persisting:** Ensure `SESSION_PERSISTENCE=sqlite` is set and that `SESSION_DB_PATH` points to a writable location.
- **Files Not Found:** Use paths relative to the project root.
- **Model API Errors:** Verify the API key and the model name.

@@ -0,0 +1,796 @@
# Common Patterns Cookbook 👨‍🍳

A collection of proven patterns and recipes for FuzzForge modules and workflows. Copy, paste, and adapt these examples to build your own security tools quickly!

## Module Patterns

### File Processing Patterns

#### Pattern 1: Selective File Scanner

```python
# Standard-library imports used by the patterns in this cookbook.
# BaseModule, ModuleResult, and ModuleFinding come from the FuzzForge
# module SDK (import path not shown here).
import logging
from pathlib import Path
from typing import Any, Dict, List

logger = logging.getLogger(__name__)


class SelectiveScanner(BaseModule):
    """Scan only specific file types with size limits"""

    SUPPORTED_EXTENSIONS = {'.py', '.js', '.java', '.cpp', '.c', '.go', '.rs'}
    DEFAULT_MAX_SIZE = 5 * 1024 * 1024  # 5MB

    async def execute(self, config: Dict[str, Any], workspace: Path) -> ModuleResult:
        max_size = config.get('max_file_size', self.DEFAULT_MAX_SIZE)
        extensions = set(config.get('extensions', self.SUPPORTED_EXTENSIONS))

        findings = []
        processed_files = 0

        for file_path in workspace.rglob('*'):
            if (file_path.is_file() and
                    file_path.suffix.lower() in extensions and
                    file_path.stat().st_size <= max_size):

                try:
                    result = await self._process_file(file_path, workspace)
                    findings.extend(result)
                    processed_files += 1
                except Exception as e:
                    # Log error but continue processing
                    logger.warning(f"Failed to process {file_path}: {e}")

        return self.create_result(
            findings=findings,
            summary={'files_processed': processed_files}
        )
```

#### Pattern 2: Content-Based File Analysis

```python
class ContentAnalyzer(BaseModule):
    """Analyze file content with encoding detection"""

    async def execute(self, config: Dict[str, Any], workspace: Path) -> ModuleResult:
        findings = []

        for file_path in workspace.rglob('*'):
            if file_path.is_file():
                content = await self._safe_read_file(file_path)
                if content:
                    analysis_result = await self._analyze_content(content, file_path, workspace)
                    findings.extend(analysis_result)

        return self.create_result(findings=findings)

    async def _safe_read_file(self, file_path: Path) -> str:
        """Safely read file with encoding detection"""
        try:
            # Try UTF-8 first
            return file_path.read_text(encoding='utf-8')
        except UnicodeDecodeError:
            try:
                # Fall back to latin-1 for binary-like files
                return file_path.read_text(encoding='latin-1', errors='ignore')
            except Exception:
                return ""

    async def _analyze_content(self, content: str, file_path: Path, workspace: Path) -> List[ModuleFinding]:
        """Override this method in your specific analyzer"""
        # Example: Find TODO comments
        findings = []
        lines = content.split('\n')

        for i, line in enumerate(lines, 1):
            if 'TODO' in line.upper():
                findings.append(self.create_finding(
                    title="TODO comment found",
                    description=f"TODO comment: {line.strip()}",
                    severity="info",
                    category="code_quality",
                    file_path=str(file_path.relative_to(workspace)),
                    line_start=i,
                    code_snippet=line.strip()
                ))

        return findings
```

#### Pattern 3: Directory Structure Analysis

```python
class StructureAnalyzer(BaseModule):
    """Analyze project directory structure"""

    IMPORTANT_FILES = {
        'README.md': 'documentation',
        'LICENSE': 'legal',
        '.gitignore': 'vcs',
        'requirements.txt': 'dependencies',
        'package.json': 'dependencies',
        'Dockerfile': 'deployment'
    }

    async def execute(self, config: Dict[str, Any], workspace: Path) -> ModuleResult:
        findings = []
        structure_analysis = {
            'total_directories': 0,
            'max_depth': 0,
            'important_files_found': [],
            'important_files_missing': []
        }

        # Analyze directory structure
        for item in workspace.rglob('*'):
            if item.is_dir():
                structure_analysis['total_directories'] += 1
                depth = len(item.relative_to(workspace).parts)
                structure_analysis['max_depth'] = max(structure_analysis['max_depth'], depth)

        # Check for important files
        for filename, category in self.IMPORTANT_FILES.items():
            file_path = workspace / filename
            if file_path.exists():
                structure_analysis['important_files_found'].append(filename)
            else:
                structure_analysis['important_files_missing'].append(filename)
                findings.append(self.create_finding(
                    title=f"Missing {category} file",
                    description=f"Recommended file '{filename}' not found",
                    severity="info",
                    category=category,
                    metadata={'file_type': category, 'recommended_file': filename}
                ))

        return self.create_result(
            findings=findings,
            summary=structure_analysis
        )
```

### Configuration Patterns

#### Pattern 1: Schema-Based Configuration

```python
from enum import Enum
from typing import Any, Dict, List

from pydantic import BaseModel, Field, validator

class SeverityLevel(str, Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    CRITICAL = "critical"

class ModuleConfig(BaseModel):
    """Type-safe configuration with validation"""
    severity_threshold: SeverityLevel = SeverityLevel.MEDIUM
    max_file_size_mb: int = Field(default=10, gt=0, le=100)
    include_patterns: List[str] = Field(default=['**/*.py', '**/*.js'])
    exclude_patterns: List[str] = Field(default=['**/node_modules/**', '**/.git/**'])
    timeout_seconds: int = Field(default=300, gt=0, le=3600)

    # Pydantic v1-style validator; Pydantic v2 replaces it with `field_validator`
    @validator('include_patterns')
    def validate_patterns(cls, v):
        if not v:
            raise ValueError('At least one include pattern required')
        return v

class ConfigurableModule(BaseModule):
    def validate_config(self, config: Dict[str, Any]) -> bool:
        try:
            ModuleConfig(**config)
            return True
        except Exception:
            return False

    async def execute(self, config: Dict[str, Any], workspace: Path) -> ModuleResult:
        # Get validated configuration
        validated_config = ModuleConfig(**config)

        # Use type-safe configuration
        max_size = validated_config.max_file_size_mb * 1024 * 1024
        severity = validated_config.severity_threshold
        # ... rest of implementation
```

#### Pattern 2: Configuration Templates

```python
class TemplateBasedModule(BaseModule):
    """Module with configuration templates"""

    TEMPLATES = {
        'quick': {
            'max_file_size_mb': 5,
            'timeout_seconds': 60,
            'severity_threshold': 'medium'
        },
        'thorough': {
            'max_file_size_mb': 50,
            'timeout_seconds': 1800,
            'severity_threshold': 'low'
        },
        'critical_only': {
            'max_file_size_mb': 100,
            'timeout_seconds': 3600,
            'severity_threshold': 'critical'
        }
    }

    async def execute(self, config: Dict[str, Any], workspace: Path) -> ModuleResult:
        # Load template if specified
        template_name = config.get('template')
        if template_name and template_name in self.TEMPLATES:
            base_config = self.TEMPLATES[template_name].copy()
            base_config.update(config)  # Override template with specific config
            config = base_config

        # Continue with normal execution
        return await self._execute_with_config(config, workspace)
```

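The merge order in the template pattern decides which value wins: the template supplies defaults and the caller's config overrides them. A minimal standalone sketch of that precedence follows; `resolve_config` is a hypothetical helper written for illustration, and only the `quick` template from the block above is reproduced.

```python
from typing import Any, Dict

TEMPLATES: Dict[str, Dict[str, Any]] = {
    'quick': {
        'max_file_size_mb': 5,
        'timeout_seconds': 60,
        'severity_threshold': 'medium',
    },
}


def resolve_config(config: Dict[str, Any], templates: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Start from the named template (if any), then apply explicit overrides."""
    base = dict(templates.get(config.get('template'), {}))
    base.update(config)  # explicit keys win over template defaults
    return base


resolved = resolve_config({'template': 'quick', 'timeout_seconds': 120}, TEMPLATES)
```

Note that the `template` key itself survives the merge, so downstream code should tolerate or drop it.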
### Error Handling Recipes

#### Pattern 1: Graceful Degradation

```python
class ResilientModule(BaseModule):
    """Module that handles errors gracefully"""

    async def execute(self, config: Dict[str, Any], workspace: Path) -> ModuleResult:
        findings = []
        errors = []
        processed_files = 0

        for file_path in workspace.rglob('*'):
            if file_path.is_file():
                try:
                    result = await self._analyze_file(file_path, workspace, config)
                    findings.extend(result)
                    processed_files += 1
                except PermissionError as e:
                    errors.append({
                        'file': str(file_path.relative_to(workspace)),
                        'error': 'Permission denied',
                        'type': 'permission_error'
                    })
                except UnicodeDecodeError as e:
                    errors.append({
                        'file': str(file_path.relative_to(workspace)),
                        'error': 'Encoding error',
                        'type': 'encoding_error'
                    })
                except Exception as e:
                    errors.append({
                        'file': str(file_path.relative_to(workspace)),
                        'error': str(e),
                        'type': 'analysis_error'
                    })

        # Determine overall status
        total_files = processed_files + len(errors)
        if len(errors) > total_files * 0.5:  # More than 50% failed
            status = "partial"
        else:
            status = "success"

        return self.create_result(
            findings=findings,
            status=status,
            summary={
                'files_processed': processed_files,
                'files_failed': len(errors),
                'error_rate': len(errors) / total_files if total_files > 0 else 0
            },
            metadata={'errors': errors}
        )
```

#### Pattern 2: Circuit Breaker

```python
import time

class CircuitBreakerModule(BaseModule):
    """Module with circuit breaker for expensive operations"""

    def __init__(self):
        super().__init__()
        self.failure_count = 0
        self.last_failure_time = 0
        self.circuit_open = False
        self.failure_threshold = 5
        self.recovery_timeout = 60  # seconds

    async def execute(self, config: Dict[str, Any], workspace: Path) -> ModuleResult:
        findings = []

        for file_path in workspace.rglob('*'):
            if file_path.is_file():
                if self._is_circuit_open():
                    # Circuit is open, skip expensive operations
                    findings.append(self.create_finding(
                        title="Analysis skipped",
                        description="Circuit breaker is open due to previous failures",
                        severity="info",
                        category="system",
                        file_path=str(file_path.relative_to(workspace))
                    ))
                    continue

                try:
                    result = await self._expensive_analysis(file_path, workspace)
                    findings.extend(result)
                    self._on_success()
                except Exception as e:
                    self._on_failure()
                    logger.warning(f"Analysis failed for {file_path}: {e}")

        return self.create_result(findings=findings)

    def _is_circuit_open(self) -> bool:
        if not self.circuit_open:
            return False

        # Check if recovery timeout has passed
        if time.time() - self.last_failure_time > self.recovery_timeout:
            self.circuit_open = False
            self.failure_count = 0
            return False

        return True

    def _on_failure(self):
        self.failure_count += 1
        self.last_failure_time = time.time()

        if self.failure_count >= self.failure_threshold:
            self.circuit_open = True

    def _on_success(self):
        # A success clears the failure streak, so only consecutive
        # failures can trip the breaker
        self.circuit_open = False
        self.failure_count = 0
```

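The breaker logic can be exercised without any FuzzForge classes. The sketch below is a standalone version with an injectable clock so the recovery timeout can be simulated. The `CircuitBreaker` class and its method names are hypothetical, and it counts consecutive failures, which is one common policy rather than the only one.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Open after N consecutive failures; allow calls again after a timeout."""

    def __init__(
        self,
        failure_threshold: int = 5,
        recovery_timeout: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def allow(self) -> bool:
        """Return True if the protected operation may run now."""
        if self._opened_at is None:
            return True
        if self._clock() - self._opened_at > self.recovery_timeout:
            # Timeout elapsed: close the circuit and start a fresh streak.
            self._opened_at = None
            self._failures = 0
            return True
        return False

    def record_success(self) -> None:
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()


# Drive the breaker with a fake clock instead of sleeping.
now = [0.0]
breaker = CircuitBreaker(failure_threshold=2, recovery_timeout=10.0, clock=lambda: now[0])

breaker.record_failure()
still_closed = breaker.allow()   # one failure is below the threshold
breaker.record_failure()
tripped = not breaker.allow()    # second consecutive failure opens the circuit
now[0] = 11.0
recovered = breaker.allow()      # recovery timeout has elapsed
```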
### Performance Patterns

#### Pattern 1: Batch Processing

```python
import asyncio
from typing import List, AsyncGenerator

class BatchProcessor(BaseModule):
    """Process files in batches to control memory usage"""

    async def execute(self, config: Dict[str, Any], workspace: Path) -> ModuleResult:
        batch_size = config.get('batch_size', 10)
        findings = []

        async for batch_findings in self._process_in_batches(workspace, batch_size, config):
            findings.extend(batch_findings)

        return self.create_result(findings=findings)

    async def _process_in_batches(
        self,
        workspace: Path,
        batch_size: int,
        config: Dict[str, Any]
    ) -> AsyncGenerator[List[ModuleFinding], None]:
        """Process files in batches"""
        files = [f for f in workspace.rglob('*') if f.is_file()]

        for i in range(0, len(files), batch_size):
            batch = files[i:i + batch_size]
            batch_findings = []

            for file_path in batch:
                try:
                    result = await self._analyze_file(file_path, workspace, config)
                    batch_findings.extend(result)
                except Exception as e:
                    logger.warning(f"Failed to process {file_path}: {e}")

            yield batch_findings
```

#### Pattern 2: Concurrent Processing with Limits

```python
class ConcurrentProcessor(BaseModule):
    """Process files concurrently with semaphore limits"""

    async def execute(self, config: Dict[str, Any], workspace: Path) -> ModuleResult:
        max_concurrent = config.get('max_concurrent', 5)
        semaphore = asyncio.Semaphore(max_concurrent)

        files = [f for f in workspace.rglob('*') if f.is_file()]

        # Process files concurrently
        tasks = [
            self._process_file_with_semaphore(file_path, workspace, config, semaphore)
            for file_path in files
        ]

        results = await asyncio.gather(*tasks, return_exceptions=True)

        # Collect findings and handle exceptions
        findings = []
        for result in results:
            if isinstance(result, list):
                findings.extend(result)
            elif isinstance(result, Exception):
                logger.warning(f"Processing failed: {result}")

        return self.create_result(findings=findings)

    async def _process_file_with_semaphore(
        self,
        file_path: Path,
        workspace: Path,
        config: Dict[str, Any],
        semaphore: asyncio.Semaphore
    ) -> List[ModuleFinding]:
        """Process a single file with semaphore protection"""
        async with semaphore:
            return await self._analyze_file(file_path, workspace, config)
```

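To see the semaphore limit in action without a FuzzForge module, the following self-contained sketch runs dummy jobs and records the highest number that were active at once. `run_limited` is a hypothetical helper, and `asyncio.sleep` stands in for real file analysis.

```python
import asyncio


async def run_limited(job_count: int, max_concurrent: int) -> int:
    """Run dummy jobs under a semaphore and return the peak concurrency seen."""
    semaphore = asyncio.Semaphore(max_concurrent)
    active = 0
    peak = 0

    async def job() -> None:
        nonlocal active, peak
        async with semaphore:
            active += 1
            peak = max(peak, active)
            await asyncio.sleep(0.01)  # stand-in for real file analysis
            active -= 1

    await asyncio.gather(*(job() for _ in range(job_count)))
    return peak


peak = asyncio.run(run_limited(job_count=20, max_concurrent=5))
```

All 20 jobs are scheduled up front, but at most `max_concurrent` of them are inside the `async with` block at any moment, which is the same guarantee the pattern above relies on.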
## ⚡ Workflow Patterns

### Sequential Processing

```python
@flow(name="sequential_analysis")
async def sequential_workflow(target_path: str, **kwargs) -> Dict[str, Any]:
    """Execute analysis steps in sequence"""
    workspace = Path(target_path)

    # Step 1: File discovery
    scanner_config = kwargs.get('scanner_config', {})
    scan_results = await file_scan_task(workspace, scanner_config)

    # Step 2: Analysis (depends on scan results)
    analyzer_config = {
        **kwargs.get('analyzer_config', {}),
        'discovered_files': scan_results.get('summary', {}).get('total_files', 0)
    }
    analysis_results = await analysis_task(scan_results, workspace, analyzer_config)

    # Step 3: Report generation (depends on analysis)
    reporter_config = kwargs.get('reporter_config', {})
    final_report = await report_task(analysis_results, workspace, reporter_config)

    return final_report
```

### Parallel Execution

```python
@flow(name="parallel_analysis")
async def parallel_workflow(target_path: str, **kwargs) -> Dict[str, Any]:
    """Execute independent analyses in parallel"""
    workspace = Path(target_path)

    # Submit parallel tasks
    static_future = static_analysis_task.submit(workspace, kwargs.get('static_config', {}))
    secret_future = secret_detection_task.submit(workspace, kwargs.get('secret_config', {}))
    license_future = license_check_task.submit(workspace, kwargs.get('license_config', {}))

    # Wait for all to complete
    static_results = await static_future.result()
    secret_results = await secret_future.result()
    license_results = await license_future.result()

    # Combine results
    combined_report = await combine_results_task(
        [static_results, secret_results, license_results],
        workspace,
        kwargs.get('reporter_config', {})
    )

    return combined_report
```

### Conditional Logic

```python
@flow(name="conditional_analysis")
async def conditional_workflow(target_path: str, **kwargs) -> Dict[str, Any]:
    """Execute workflow with conditional branches"""
    workspace = Path(target_path)

    # Initial assessment
    assessment = await quick_assessment_task(workspace)

    # Branch based on project type
    if assessment.get('project_type') == 'web_application':
        # Web app specific analysis
        web_results = await web_security_task(workspace, kwargs.get('web_config', {}))
        final_results = web_results

    elif assessment.get('project_type') == 'library':
        # Library specific analysis
        lib_results = await library_analysis_task(workspace, kwargs.get('lib_config', {}))
        final_results = lib_results

    else:
        # Generic analysis
        generic_results = await generic_analysis_task(workspace, kwargs.get('generic_config', {}))
        final_results = generic_results

    # Optional deep analysis for high-risk projects
    if assessment.get('risk_level', 'low') in ['high', 'critical']:
        deep_results = await deep_analysis_task(workspace, kwargs.get('deep_config', {}))
        final_results = await merge_results_task(final_results, deep_results)

    return final_results
```

### Data Transformation

```python
@task(name="filter_and_transform")
async def filter_transform_task(
    raw_results: Dict[str, Any],
    config: Dict[str, Any]
) -> Dict[str, Any]:
    """Filter and transform findings based on criteria"""

    findings = raw_results.get('findings', [])

    # Filter by severity
    min_severity = config.get('min_severity', 'low')
    severity_order = {'info': 0, 'low': 1, 'medium': 2, 'high': 3, 'critical': 4}
    min_level = severity_order.get(min_severity, 0)

    filtered_findings = [
        f for f in findings
        if severity_order.get(f.get('severity', 'info'), 0) >= min_level
    ]

    # Group by category
    categorized = {}
    for finding in filtered_findings:
        category = finding.get('category', 'other')
        if category not in categorized:
            categorized[category] = []
        categorized[category].append(finding)

    # Transform findings (add risk scores, priorities, etc.)
    enriched_findings = []
    for finding in filtered_findings:
        enriched_finding = {
            **finding,
            'risk_score': calculate_risk_score(finding),
            'priority': determine_priority(finding),
            'remediation_effort': estimate_effort(finding)
        }
        enriched_findings.append(enriched_finding)

    return {
        'findings': enriched_findings,
        'summary': {
            'total_findings': len(enriched_findings),
            'by_category': {k: len(v) for k, v in categorized.items()},
            'by_severity': {
                severity: len([f for f in enriched_findings if f.get('severity') == severity])
                for severity in ['info', 'low', 'medium', 'high', 'critical']
            }
        }
    }
```

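The transformation task above calls `calculate_risk_score`, `determine_priority`, and `estimate_effort` without defining them. The sketch below shows one possible shape for these helpers. The scoring scale, the priority labels, and the effort heuristic are all assumptions made for illustration, not FuzzForge behavior; adapt them to your own triage policy.

```python
from typing import Any, Dict

# Same ordering as `severity_order` in the task above.
SEVERITY_WEIGHT = {'info': 0, 'low': 1, 'medium': 2, 'high': 3, 'critical': 4}


def calculate_risk_score(finding: Dict[str, Any]) -> int:
    """Map severity onto a 0-100 scale (assumed linear: 25 points per level)."""
    return SEVERITY_WEIGHT.get(finding.get('severity', 'info'), 0) * 25


def determine_priority(finding: Dict[str, Any]) -> str:
    """Bucket the risk score into hypothetical P1/P2/P3 labels."""
    score = calculate_risk_score(finding)
    if score >= 75:
        return 'P1'
    if score >= 50:
        return 'P2'
    return 'P3'


def estimate_effort(finding: Dict[str, Any]) -> str:
    """Assume a finding pinned to a specific line is a small, local fix."""
    return 'small' if finding.get('line_start') is not None else 'unknown'


example = {'severity': 'high', 'category': 'injection', 'line_start': 42}
enriched = {
    **example,
    'risk_score': calculate_risk_score(example),
    'priority': determine_priority(example),
    'remediation_effort': estimate_effort(example),
}
```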
## 🧪 Testing Patterns

### Pattern 1: Comprehensive Module Testing

```python
import pytest
import tempfile
from pathlib import Path
from unittest.mock import patch, AsyncMock

class TestMyModule:

    @pytest.fixture
    def temp_workspace(self):
        with tempfile.TemporaryDirectory() as temp_dir:
            workspace = Path(temp_dir)
            # Create test files
            (workspace / 'test.py').write_text('print("hello")')
            (workspace / 'config.json').write_text('{"key": "value"}')
            yield workspace

    @pytest.fixture
    def module(self):
        return MyModule()

    @pytest.fixture
    def base_config(self):
        return {
            'max_file_size_mb': 10,
            'severity_threshold': 'medium',
            'timeout_seconds': 60
        }

    @pytest.mark.asyncio
    async def test_execute_success(self, module, temp_workspace, base_config):
        result = await module.execute(base_config, temp_workspace)

        assert result.status == "success"
        assert isinstance(result.findings, list)
        assert isinstance(result.summary, dict)
        assert 'total_files' in result.summary

    @pytest.mark.asyncio
    async def test_execute_empty_workspace(self, module, base_config):
        with tempfile.TemporaryDirectory() as empty_dir:
            result = await module.execute(base_config, Path(empty_dir))

            assert result.summary['total_files'] == 0
            assert len(result.findings) == 0

    @pytest.mark.asyncio
    async def test_config_validation(self, module):
        assert module.validate_config({'max_file_size_mb': 10})
        assert not module.validate_config({'max_file_size_mb': -1})
        assert not module.validate_config({'max_file_size_mb': 'invalid'})

    @pytest.mark.asyncio
    async def test_error_handling(self, module, temp_workspace, base_config):
        # Use the fixture workspace so there are known files to fail on
        with patch.object(module, '_analyze_file', side_effect=Exception("Test error")):
            result = await module.execute(base_config, temp_workspace)

            # Should handle errors gracefully
            assert 'errors' in result.metadata
            assert len(result.metadata['errors']) > 0

    # Async tests need the asyncio marker even when parametrized
    @pytest.mark.asyncio
    @pytest.mark.parametrize("severity,expected", [
        ('low', ['low', 'medium', 'high', 'critical']),
        ('medium', ['medium', 'high', 'critical']),
        ('high', ['high', 'critical']),
        ('critical', ['critical'])
    ])
    async def test_severity_filtering(self, module, temp_workspace, severity, expected):
        config = {'severity_threshold': severity}
        result = await module.execute(config, temp_workspace)

        found_severities = {f.severity for f in result.findings}
        assert found_severities.issubset(set(expected))
```

## 🔧 Utility Functions
|
||||
|
||||
### File Type Detection
|
||||
|
||||
```python
from pathlib import Path


def detect_file_type(file_path: Path) -> str:
    """Detect file type from extension and content"""

    # Extension-based detection
    extension_map = {
        '.py': 'python',
        '.js': 'javascript',
        '.ts': 'typescript',
        '.java': 'java',
        '.cpp': 'cpp',
        '.c': 'c',
        '.go': 'go',
        '.rs': 'rust',
        '.json': 'json',
        '.yaml': 'yaml',
        '.yml': 'yaml',
        '.xml': 'xml',
        '.html': 'html',
        '.css': 'css',
        '.md': 'markdown',
        '.txt': 'text'
    }

    file_type = extension_map.get(file_path.suffix.lower())
    if file_type:
        return file_type

    # Content-based detection for files with no (or an unrecognized) extension
    try:
        with open(file_path, 'r', encoding='utf-8', errors='ignore') as f:
            first_line = f.readline().strip()

        # Shebang line: infer the interpreter
        if first_line.startswith('#!'):
            if 'python' in first_line:
                return 'python'
            elif 'bash' in first_line or 'sh' in first_line:
                return 'shell'
            elif 'node' in first_line:
                return 'javascript'

        if first_line.startswith('<?xml'):
            return 'xml'

        if first_line.startswith('<!DOCTYPE html') or first_line.startswith('<html'):
            return 'html'

    except Exception:
        # Unreadable files fall through to 'unknown'
        pass

    return 'unknown'
```
|
||||
|
||||
### Risk Scoring
|
||||
|
||||
```python
from typing import Any, Dict


def calculate_risk_score(finding: Dict[str, Any]) -> int:
    """Calculate numeric risk score for a finding"""

    base_scores = {
        'critical': 100,
        'high': 75,
        'medium': 50,
        'low': 25,
        'info': 10
    }

    # Unknown or missing severities are scored like 'info'
    severity = finding.get('severity', 'info')
    base_score = base_scores.get(severity, 10)

    # Adjust based on category
    category_multipliers = {
        'security': 1.0,
        'vulnerability': 1.0,
        'credential': 1.2,
        'injection': 1.1,
        'authentication': 1.1,
        'authorization': 1.1,
        'code_quality': 0.8,
        'performance': 0.7,
        'documentation': 0.5
    }

    # Categories not listed above use a 0.9 multiplier
    category = finding.get('category', 'other')
    multiplier = category_multipliers.get(category, 0.9)

    # Adjust based on file location
    file_path = finding.get('file_path', '')
    if any(sensitive in file_path.lower() for sensitive in ['config', 'secret', 'password', 'key']):
        multiplier *= 1.2

    # The result is truncated to an integer and can exceed 100
    return int(base_score * multiplier)
```
|
||||
|
||||
### Finding Deduplication
|
||||
|
||||
```python
from typing import Any, Dict, List, Tuple


def deduplicate_findings(findings: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Remove duplicate findings based on title, file, line, and category"""

    # Maps each unique key to the first finding seen with that key
    seen: Dict[Tuple[Any, ...], Dict[str, Any]] = {}
    deduplicated = []

    for finding in findings:
        # Create unique key
        key = (
            finding.get('title', ''),
            finding.get('file_path', ''),
            finding.get('line_start', 0),
            finding.get('category', '')
        )

        existing = seen.get(key)
        if existing is None:
            seen[key] = finding
            deduplicated.append(finding)
        else:
            # Update metadata on the finding with the same full key
            # (matching on title and file alone could hit a different line)
            metadata = existing.setdefault('metadata', {})
            metadata['duplicate_count'] = metadata.get('duplicate_count', 1) + 1

    return deduplicated
```
|
||||
|
||||
---
|
||||
|
||||
**🎯 Next Steps**: Use these patterns as building blocks for your own modules and workflows. Mix and match patterns to create powerful security analysis tools!
|
||||
@@ -0,0 +1,69 @@
|
||||
# Contributing
|
||||
|
||||
Contributing is much appreciated.
|
||||
|
||||
## How to contribute
|
||||
|
||||
### Development environment setup
|
||||
|
||||
We recommend using the excellent [Based Pyright](https://docs.basedpyright.com/latest/) LSP, a fork of [pyright](https://github.com/microsoft/pyright) with various type-checking improvements, Pylance features, and more. It is available in all major editors (VSCode, Vim, Emacs, Zed).
|
||||
|
||||
To work on the project, you will need to install `uv`. Check the [installation instructions](https://docs.astral.sh/uv/getting-started/installation/) for your platform.
|
||||
|
||||
We also recommend using Just to manage your development environment. Just is a command runner, similar to Make, but with a simpler syntax and more features. It is available on all major platforms. Check the [installation instructions](https://just.systems/man/en/) for your platform. We have wrapped a number of useful commands in the `Justfile` at the root of the repository. You can see the available commands by running `just`.
|
||||
|
||||
### Code conventions
|
||||
|
||||
We try to follow the [Python Style Guide](https://www.python.org/dev/peps/pep-0008/) and the [Google Python Style Guide](https://github.com/google/styleguide/blob/gh-pages/pyguide.md) for Python code. We use [Ruff](https://docs.astral.sh/ruff/) as both linter and formatter to keep the code consistent with these guides.
|
||||
|
||||
### Git usage
|
||||
|
||||
We use the [Conventional Commits 1.0.0](https://www.conventionalcommits.org/en/v1.0.0/) specification to format our commits.
|
||||
|
||||
Our workflow uses the following branch types and branch names:
|
||||
|
||||
- main : `main` - for production code
|
||||
- hotfix : `hotfix/<hotfix name>` - for urgent fixes to the production code
|
||||
- develop : `dev` - for development
|
||||
- feature : `feat/<feature name>` - for new features
|
||||
- continuous integration : `ci/<ci name>` - for continuous integration related changes
|
||||
- documentation : `docs/<documentation name>` - for documentation changes
|
||||
- fix : `fix/<bug name>` - for bug fixes
|
||||
- chore : `chore/<chore name>` - for changes that do not modify src or test files
|
||||
- refactor : `refactor/<refactor name>` - for code refactoring
|
||||
- perf : `perf/<performance name>` - for performance improvements
|
||||
- test : `test/<test name>` - for adding or modifying tests
|
||||
- build : `build/<build name>` - for build-related changes
|
||||
- revert : `revert/<revert name>` - for reverting changes
|
||||
- style : `style/<style name>` - for style-related changes
|
||||
|
||||

|
||||
|
||||
In addition to the branching names, only `dev` and `hotfix` branches are allowed to be merged into `main`. All other branches must be merged into `dev` first.
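
The merge rule can be expressed as a small check. The sketch below is illustrative only: `allowed_targets` is a hypothetical helper, not part of the repository or its CI, and it encodes just the rules stated in this section (`dev` and `hotfix/*` may target `main`, hotfixes also go to `dev`, everything else goes to `dev`).

```python
from typing import List


def allowed_targets(branch: str) -> List[str]:
    """Return the branches a given source branch may be merged into (hypothetical helper)."""
    if branch == "main":
        # The production branch is never merged into another branch in this sketch
        return []
    if branch == "dev":
        return ["main"]
    if branch.startswith("hotfix/"):
        # Hotfixes go to production and are merged back into development
        return ["main", "dev"]
    # feat/, fix/, docs/, chore/, ... must all go through dev first
    return ["dev"]


for name in ["feat/login", "hotfix/crash-on-start", "dev"]:
    print(name, "->", allowed_targets(name))
```

A check like this could run in a pull-request job to reject, for example, a `feat/*` branch opened directly against `main`.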
|
||||
|
||||
The `dev` branch is the main development branch, and all new features and bug fixes should be merged into it. The `main` branch is the production branch, and only stable code should be merged into it.
|
||||
|
||||
The `hotfix` branch is used for urgent fixes to the production code, and should be merged into both `main` and `dev` branches.
|
||||
|
||||
This workflow is derived from the [Git Flow](https://nvie.com/posts/a-successful-git-branching-model/) workflow, with some modifications to fit our needs.
|
||||
|
||||
!!! note
|
||||
    Following these conventions allows the CI to automatically apply the correct labels to pull requests and commits. These labels can be used to generate the changelog and release notes, but mainly they make reviews easier.
|
||||
|
||||
### Testing
|
||||
|
||||
We use [pytest](https://docs.pytest.org/en/latest/) for unit and integration testing, and [PyTestArch](https://pypi.org/project/PyTestArch/) for architectural rules. The tests are located in the `tests` directory.
|
||||
|
||||
A test is required for every new feature and bug fix. The tests should be located in the `tests` directory of the corresponding module.
|
||||
The tests should be run before merging any changes into the `dev` or `main` branches.
|
||||
|
||||
### Continuous integration
|
||||
|
||||
We use [GitHub Actions](https://docs.github.com/en/actions) for continuous integration. The CI workflow is located in the `.github/workflows` directory.
|
||||
The CI workflow is triggered on every push to the `dev` and `main` branches, and on every pull request to the `dev` and `main` branches.
|
||||
|
||||
The CI workflow runs the tests and linter, and builds the documentation. The CI workflow is required to pass before merging any changes into the `dev` or `main` branches.
|
||||
|
||||
### Bug report
|
||||
|
||||
To-do
|
||||
@@ -0,0 +1,64 @@
|
||||
# {Title of solution to solve the problem}
|
||||
|
||||
## Context and problem statement
|
||||
|
||||
{Describe the context and problem in free form, using two to three sentences or in the form of an illustrative story.
|
||||
You may want to articulate the problem in form of a question and add links to collaboration boards or issue management systems.}
|
||||
|
||||
<!-- This is an optional element. Feel free to remove. -->
|
||||
## Decision Drivers
|
||||
|
||||
* {decision driver, e.g., a force, facing concern, ...}
|
||||
* ...
|
||||
|
||||
## Considered Options
|
||||
|
||||
* [{title of option}](#{title of option})
|
||||
* ...
|
||||
|
||||
## Decision Outcome
|
||||
|
||||
Chosen option: "{title of chosen option}", because
|
||||
{justification. e.g., only option, which meets k.o. criterion decision driver | which resolves force {force} | … | comes out best (see below)}.
|
||||
|
||||
<!-- This is an optional element. Feel free to remove. -->
|
||||
## Decision Revisit
|
||||
|
||||
Last revisit: {information about the last revisit e.g. never | {date} by {author}}
|
||||
|
||||
<!-- This is an optional element. Feel free to remove. -->
|
||||
### Consequences
|
||||
|
||||
* Good, because {positive consequence, e.g., improvement of one or more desired qualities, …}
|
||||
* Bad, because {negative consequence, e.g., compromising one or more desired qualities, …}
|
||||
* … <!-- numbers of consequences can vary -->
|
||||
|
||||
<!-- This is an optional element. Feel free to remove. -->
|
||||
## Validation
|
||||
|
||||
{describe how the implementation of/compliance with the ADR is validated. E.g., by a review or an ArchUnit test}
|
||||
|
||||
<!-- This is an optional element. Feel free to remove. -->
|
||||
## Pros and Cons of the Options
|
||||
|
||||
<!-- This is a repeated element, one per option. Use when necessary. -->
|
||||
### {title of option}
|
||||
|
||||
<!-- This is an optional element. Feel free to remove. -->
|
||||
{example | description | pointer to more information | …}
|
||||
|
||||
* Good, because {argument a}
|
||||
* Good, because {argument b}
|
||||
<!-- use "neutral" if the given argument weighs neither for good nor bad -->
|
||||
* Neutral, because {argument c}
|
||||
* Bad, because {argument d}
|
||||
* ... <!-- numbers of pros and cons can vary -->
|
||||
|
||||
<!-- This is an optional element. Feel free to remove. -->
|
||||
## More Information
|
||||
|
||||
{Provide additional evidence/confidence for the decision outcome here and/or
|
||||
document the team agreement on the decision and/or
|
||||
define when and how this decision should be realized and if/when it should be re-visited and/or
|
||||
how the decision is validated.
|
||||
Links to other decisions and resources might appear here as well.}
|
||||
@@ -0,0 +1,17 @@
|
||||
# Diataxis documentation
|
||||
|
||||
This project uses the [Diátaxis](https://diataxis.fr) technical documentation framework.
|
||||
There are 4 main parts:
|
||||
|
||||
1. [Getting started (tutorials)](https://diataxis.fr/tutorials): learning-oriented
|
||||
- Pages that contain tutorials needed to get people up and running, for instance [Getting Started](/intro.md).
|
||||
2. [Concepts (explanation)](https://diataxis.fr/explanation): understanding-oriented
|
||||
- Pages explaining concepts that are relevant to the domain, for instance [Working with documentation](../concept/working-with-documentation.md).
|
||||
3. [How-to guides](https://diataxis.fr/how-to-guides): goal-oriented
|
||||
- Pages that contain step-by-step guides for a specific goal, for instance [How-to: start the local documentation server](#).
|
||||
4. [Reference](https://diataxis.fr/reference): information-oriented
|
||||
- Pages that contain reference information, for instance about [Diataxis documentation](../reference/diataxis-documentation.md).
|
||||
|
||||
## Working with Diátaxis documentation
|
||||
|
||||
See [working with Diátaxis documentation](../concept/working-with-documentation.md).
|
||||
@@ -0,0 +1,17 @@
|
||||
# {Title}
|
||||
|
||||
**Status**: active
|
||||
|
||||
## Description
|
||||
|
||||
{Provide a detailed description of the issue, include things such as how to identify this particular issue}
|
||||
|
||||
<!-- This is an optional element. Feel free to remove. -->
|
||||
## Impact
|
||||
|
||||
{Describe the impact of the issue on users or the system.}
|
||||
|
||||
<!-- This is an optional element. Feel free to remove. -->
|
||||
## Workaround
|
||||
|
||||
{If available, describe any possible workarounds to mitigate the issue until it is resolved.}
|
||||
@@ -0,0 +1,257 @@
|
||||
# Static Analysis Workflow Reference
|
||||
|
||||
The Static Analysis workflow in FuzzForge helps you find vulnerabilities, code quality issues, and compliance problems—before they reach production. This workflow uses multiple Static Application Security Testing (SAST) tools to analyze your source code without executing it, providing fast, actionable feedback in a standardized format.
|
||||
|
||||
---
|
||||
|
||||
## What Does This Workflow Do?
|
||||
|
||||
- **Workflow ID:** `static_analysis_scan`
|
||||
- **Primary Tools:** Semgrep (multi-language), Bandit (Python)
|
||||
- **Supported Languages:** Python, JavaScript, Java, Go, C/C++, PHP, Ruby, and more
|
||||
- **Typical Duration:** 1–5 minutes (varies by codebase size)
|
||||
- **Output Format:** SARIF 2.1.0 (industry standard)
|
||||
|
||||
---
|
||||
|
||||
## How Does It Work?
|
||||
|
||||
The workflow orchestrates multiple SAST tools in a containerized environment:
|
||||
|
||||
- **Semgrep:** Pattern-based static analysis for 30+ languages, with rule sets for OWASP Top 10, CWE Top 25, and more.
|
||||
- **Bandit:** Python-specific security scanner, focused on issues like hardcoded secrets, injection, and unsafe code patterns.
|
||||
|
||||
Each tool runs independently, and their findings are merged and normalized into a single SARIF report.
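
As a rough picture of the merge step, the sketch below concatenates the `runs` arrays of several SARIF 2.1.0 logs into one log. `merge_sarif_logs` is a hypothetical helper written for illustration; the actual workflow may normalize or deduplicate findings in ways not shown here.

```python
from typing import Any, Dict, List

SARIF_VERSION = "2.1.0"


def merge_sarif_logs(logs: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Combine SARIF logs by concatenating their runs (hypothetical helper)."""
    merged_runs: List[Dict[str, Any]] = []
    for log in logs:
        # Each SARIF log carries one or more runs, typically one per tool
        merged_runs.extend(log.get("runs", []))
    return {"version": SARIF_VERSION, "runs": merged_runs}


# Minimal stand-ins for the per-tool reports
semgrep_log = {
    "version": "2.1.0",
    "runs": [{"tool": {"driver": {"name": "semgrep"}}, "results": []}],
}
bandit_log = {
    "version": "2.1.0",
    "runs": [{"tool": {"driver": {"name": "bandit"}}, "results": []}],
}

merged = merge_sarif_logs([semgrep_log, bandit_log])
tool_names = [run["tool"]["driver"]["name"] for run in merged["runs"]]
print(tool_names)
```

Keeping one run per tool preserves which scanner produced each result, which matches the per-tool counts shown in the summary output.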
|
||||
|
||||
---
|
||||
|
||||
## How to Use the Static Analysis Workflow
|
||||
|
||||
### Basic Usage
|
||||
|
||||
**CLI:**
|
||||
```bash
|
||||
fuzzforge runs submit static_analysis_scan /path/to/your/project
|
||||
```
|
||||
|
||||
**API:**
|
||||
```bash
|
||||
curl -X POST "http://localhost:8000/workflows/static_analysis_scan/submit" \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{"target_path": "/path/to/your/project"}'
|
||||
```
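
The same request can be built from Python with only the standard library. This is a minimal sketch that mirrors the `curl` call above; `build_submit_request` is a hypothetical helper, and the response format is not shown because it is not documented in this section.

```python
import json
import urllib.request


def build_submit_request(base_url: str, workflow: str, target_path: str) -> urllib.request.Request:
    """Build the POST request for submitting a workflow run (hypothetical helper)."""
    payload = json.dumps({"target_path": target_path}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/workflows/{workflow}/submit",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_submit_request(
    "http://localhost:8000", "static_analysis_scan", "/path/to/your/project"
)
print(request.get_method(), request.full_url)

# To actually send it (requires a running FuzzForge backend):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```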
|
||||
|
||||
### Advanced Configuration
|
||||
|
||||
You can fine-tune the workflow by passing parameters for each tool:
|
||||
|
||||
**CLI:**
|
||||
```bash
|
||||
fuzzforge runs submit static_analysis_scan /path/to/project \
|
||||
--parameters '{
|
||||
"semgrep_config": {
|
||||
"rules": ["p/security-audit", "owasp-top-ten"],
|
||||
"severity": ["ERROR", "WARNING"],
|
||||
"exclude_patterns": ["test/*", "vendor/*", "node_modules/*"]
|
||||
},
|
||||
"bandit_config": {
|
||||
"confidence": "MEDIUM",
|
||||
"severity": "MEDIUM",
|
||||
"exclude_dirs": ["tests", "migrations"]
|
||||
}
|
||||
}'
|
||||
```
|
||||
|
||||
**API:**
|
||||
```json
|
||||
{
|
||||
"target_path": "/path/to/project",
|
||||
"parameters": {
|
||||
"semgrep_config": {
|
||||
"rules": ["p/security-audit"],
|
||||
"languages": ["python", "javascript"],
|
||||
"severity": ["ERROR", "WARNING"],
|
||||
"exclude_patterns": ["*.test.js", "test_*.py", "vendor/*"]
|
||||
},
|
||||
"bandit_config": {
|
||||
"confidence": "MEDIUM",
|
||||
"severity": "LOW",
|
||||
"tests": ["B201", "B301"],
|
||||
"exclude_dirs": ["tests", ".git"]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Configuration Reference
|
||||
|
||||
### Semgrep Parameters
|
||||
|
||||
| Parameter          | Type    | Default                        | Description                                   |
|--------------------|---------|--------------------------------|-----------------------------------------------|
| `rules`            | array   | `"auto"`                       | Rule sets to use (e.g., `"p/security-audit"`) |
| `languages`        | array   | `null`                         | Languages to analyze                          |
| `severity`         | array   | `["ERROR", "WARNING", "INFO"]` | Severities to include                         |
| `exclude_patterns` | array   | `[]`                           | File patterns to exclude                      |
| `include_patterns` | array   | `[]`                           | File patterns to include                      |
| `max_target_bytes` | integer | `1000000`                      | Max file size to analyze (bytes)              |
| `timeout`          | integer | `300`                          | Tool timeout (seconds)                        |
|
||||
|
||||
### Bandit Parameters
|
||||
|
||||
| Parameter       | Type    | Default             | Description                                        |
|-----------------|---------|---------------------|----------------------------------------------------|
| `confidence`    | string  | `"LOW"`             | Minimum confidence (`"LOW"`, `"MEDIUM"`, `"HIGH"`) |
| `severity`      | string  | `"LOW"`             | Minimum severity (`"LOW"`, `"MEDIUM"`, `"HIGH"`)   |
| `tests`         | array   | `null`              | Specific test IDs to run                           |
| `exclude_dirs`  | array   | `["tests", ".git"]` | Directories to exclude                             |
| `aggregate`     | string  | `"file"`            | Aggregation mode (`"file"`, `"vuln"`)              |
| `context_lines` | integer | `3`                 | Context lines around findings                      |
|
||||
|
||||
---
|
||||
|
||||
## What Can It Detect?
|
||||
|
||||
### Vulnerability Categories
|
||||
|
||||
- **OWASP Top 10:** Broken Access Control, Injection, Security Misconfiguration, etc.
|
||||
- **CWE Top 25:** SQL Injection, XSS, Command Injection, Information Exposure, etc.
|
||||
- **Language-Specific:** Python (unsafe eval, Django/Flask issues), JavaScript (XSS, prototype pollution), Java (deserialization), Go (race conditions), C/C++ (buffer overflows).
|
||||
|
||||
### Example Detections
|
||||
|
||||
**SQL Injection (Python)**
|
||||
```python
|
||||
query = f"SELECT * FROM users WHERE id = {user_id}" # CWE-89
|
||||
```
|
||||
*Recommendation: Use parameterized queries.*
|
||||
|
||||
**Command Injection (Python)**
|
||||
```python
|
||||
os.system(f"cp {filename} backup/") # CWE-78
|
||||
```
|
||||
*Recommendation: Use subprocess with argument arrays.*
|
||||
|
||||
**XSS (JavaScript)**
|
||||
```javascript
|
||||
element.innerHTML = userInput; // CWE-79
|
||||
```
|
||||
*Recommendation: Use textContent or sanitize input.*
|
||||
|
||||
---
|
||||
|
||||
## Output Format
|
||||
|
||||
All results are returned in SARIF 2.1.0 format, which is supported by many IDEs and security tools.
|
||||
|
||||
**Summary Example:**
|
||||
```json
|
||||
{
|
||||
"workflow": "static_analysis_scan",
|
||||
"status": "completed",
|
||||
"total_findings": 18,
|
||||
"severity_counts": {
|
||||
"critical": 0,
|
||||
"high": 6,
|
||||
"medium": 5,
|
||||
"low": 7
|
||||
},
|
||||
"tool_counts": {
|
||||
"semgrep": 12,
|
||||
"bandit": 6
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Finding Example:**
|
||||
```json
|
||||
{
|
||||
"ruleId": "bandit.B608",
|
||||
"level": "error",
|
||||
"message": {
|
||||
"text": "Possible SQL injection vector through string-based query construction"
|
||||
},
|
||||
"locations": [
|
||||
{
|
||||
"physicalLocation": {
|
||||
"artifactLocation": {
|
||||
"uri": "src/database.py"
|
||||
},
|
||||
"region": {
|
||||
"startLine": 42,
|
||||
"startColumn": 15,
|
||||
"endLine": 42,
|
||||
"endColumn": 65
|
||||
}
|
||||
}
|
||||
}
|
||||
],
|
||||
"properties": {
|
||||
"severity": "high",
|
||||
"category": "sql_injection",
|
||||
"cwe": "CWE-89",
|
||||
"confidence": "high",
|
||||
"tool": "bandit"
|
||||
}
|
||||
}
|
||||
```
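
To show how a consumer might read such a finding, the sketch below turns one SARIF result into a single log line. `format_finding` is a hypothetical helper; it assumes the first entry in `locations` is the primary one and that the `properties.severity` field is present as in the example above.

```python
from typing import Any, Dict


def format_finding(result: Dict[str, Any]) -> str:
    """Render a SARIF result as 'file:line [severity] rule: message' (hypothetical helper)."""
    location = result["locations"][0]["physicalLocation"]
    uri = location["artifactLocation"]["uri"]
    line = location["region"]["startLine"]
    # 'properties' is a free-form bag, so fall back if severity is missing
    severity = result.get("properties", {}).get("severity", "unknown")
    return f"{uri}:{line} [{severity}] {result['ruleId']}: {result['message']['text']}"


finding = {
    "ruleId": "bandit.B608",
    "level": "error",
    "message": {
        "text": "Possible SQL injection vector through string-based query construction"
    },
    "locations": [
        {
            "physicalLocation": {
                "artifactLocation": {"uri": "src/database.py"},
                "region": {"startLine": 42, "startColumn": 15, "endLine": 42, "endColumn": 65},
            }
        }
    ],
    "properties": {"severity": "high", "cwe": "CWE-89", "tool": "bandit"},
}

print(format_finding(finding))
```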
|
||||
|
||||
---
|
||||
|
||||
## Performance Tips
|
||||
|
||||
- For large codebases, increase `max_target_bytes` and `timeout` as needed.
|
||||
- Exclude large generated or dependency directories (`vendor/`, `node_modules/`, `dist/`).
|
||||
- Run focused scans on changed files for faster CI/CD feedback.
|
||||
|
||||
---
|
||||
|
||||
## Integration Examples
|
||||
|
||||
### GitHub Actions
|
||||
|
||||
```yaml
- name: Run Static Analysis
  run: |
    curl -X POST "${{ secrets.FUZZFORGE_URL }}/workflows/static_analysis_scan/submit" \
      -H "Content-Type: application/json" \
      -d '{
        "target_path": "${{ github.workspace }}",
        "parameters": {
          "semgrep_config": {"severity": ["ERROR", "WARNING"]},
          "bandit_config": {"confidence": "MEDIUM"}
        }
      }'
```
|
||||
|
||||
### Pre-commit Hook
|
||||
|
||||
```bash
fuzzforge runs submit static_analysis_scan . --wait --json > /tmp/analysis.json
HIGH_ISSUES=$(jq '.sarif.severity_counts.high // 0' /tmp/analysis.json)
if [ "$HIGH_ISSUES" -gt 0 ]; then
  echo "❌ Found $HIGH_ISSUES high-severity security issues. Commit blocked."
  exit 1
fi
```
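
If `jq` is not available, the same gate can be sketched in Python. This assumes the JSON output has the shape the hook above relies on (a `sarif.severity_counts.high` field); `count_high_issues` is a hypothetical helper and the sample report is made up for illustration.

```python
import json
from typing import Any, Dict


def count_high_issues(report: Dict[str, Any]) -> int:
    """Return the number of high-severity findings, or 0 if the field is absent (hypothetical helper)."""
    counts = report.get("sarif", {}).get("severity_counts", {})
    # Mirrors jq's `// 0`: a missing or null value counts as zero
    return int(counts.get("high") or 0)


# Sample report for illustration; a real hook would use json.load() on /tmp/analysis.json
sample_report = json.loads(
    '{"sarif": {"severity_counts": {"critical": 0, "high": 6, "medium": 5, "low": 7}}}'
)

high_issues = count_high_issues(sample_report)
if high_issues > 0:
    print(f"Found {high_issues} high-severity security issues. Commit blocked.")
```

In a real hook the script would finish with `sys.exit(1)` when `high_issues` is greater than zero so that the commit is rejected.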
|
||||
|
||||
---
|
||||
|
||||
## Best Practices
|
||||
|
||||
- **Target the right code:** Focus on your main source directories, not dependencies or build artifacts.
|
||||
- **Start broad, then refine:** Use default rule sets first, then add exclusions or custom rules as needed.
|
||||
- **Triage findings:** Address high-severity issues first, and document false positives for future runs.
|
||||
- **Monitor trends:** Track your security posture over time to measure improvement.
|
||||
- **Optimize for speed:** Use file size limits and timeouts for very large projects.
|
||||
|
||||
---
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
- **No Python files found:** Bandit reports zero findings if your project contains no Python code. This is expected.
|
||||
- **High memory usage:** Exclude large files and directories, or increase Docker memory limits.
|
||||
- **Slow scans:** Use inclusion/exclusion patterns and increase timeouts for big repos.
|
||||
- **Workflow errors:** See the [Troubleshooting Guide](/how-to/troubleshooting.md) for help with registry, Docker, or workflow issues.
|
||||