CI/CD Integration with Ephemeral Deployment Model (#14)

* feat: Complete migration from Prefect to Temporal

BREAKING CHANGE: Replaces Prefect workflow orchestration with Temporal

## Major Changes
- Replace Prefect with Temporal for workflow orchestration
- Implement vertical worker architecture (rust, android)
- Replace Docker registry with MinIO for unified storage
- Refactor activities to be co-located with workflows
- Update all API endpoints for Temporal compatibility

## Infrastructure
- New: docker-compose.temporal.yaml (Temporal + MinIO + workers)
- New: workers/ directory with rust and android vertical workers
- New: backend/src/temporal/ (manager, discovery)
- New: backend/src/storage/ (S3-cached storage with MinIO)
- New: backend/toolbox/common/ (shared storage activities)
- Deleted: docker-compose.yaml (old Prefect setup)
- Deleted: backend/src/core/prefect_manager.py
- Deleted: backend/src/services/prefect_stats_monitor.py
- Deleted: Docker registry and insecure-registries requirement

## Workflows
- Migrated: security_assessment workflow to Temporal
- New: rust_test workflow (example/test workflow)
- Deleted: secret_detection_scan (Prefect-based, to be reimplemented)
- Activities now co-located with workflows for independent testing

## API Changes
- Updated: backend/src/api/workflows.py (Temporal submission)
- Updated: backend/src/api/runs.py (Temporal status/results)
- Updated: backend/src/main.py (727 lines, TemporalManager integration)
- Updated: All 16 MCP tools to use TemporalManager

## Testing
- All services healthy (Temporal, PostgreSQL, MinIO, workers, backend)
- All API endpoints functional
- End-to-end workflow test passed (72 findings from vulnerable_app)
- MinIO storage integration working (target upload/download, results)
- Worker activity discovery working (6 activities registered)
- Tarball extraction working
- SARIF report generation working

## Documentation
- ARCHITECTURE.md: Complete Temporal architecture documentation
- QUICKSTART_TEMPORAL.md: Getting started guide
- MIGRATION_DECISION.md: Why we chose Temporal over Prefect
- IMPLEMENTATION_STATUS.md: Migration progress tracking
- workers/README.md: Worker development guide

## Dependencies
- Added: temporalio>=1.6.0
- Added: boto3>=1.34.0 (MinIO S3 client)
- Removed: prefect>=3.4.18

* feat: Add Python fuzzing vertical with Atheris integration

This commit implements a complete Python fuzzing workflow using Atheris:

## Python Worker (workers/python/)
- Dockerfile with Python 3.11, Atheris, and build tools
- Generic worker.py for dynamic workflow discovery
- requirements.txt with temporalio, boto3, atheris dependencies
- Added to docker-compose.temporal.yaml with dedicated cache volume

## AtherisFuzzer Module (backend/toolbox/modules/fuzzer/)
- Reusable module extending BaseModule
- Auto-discovers fuzz targets (fuzz_*.py, *_fuzz.py, fuzz_target.py)
- Recursive search to find targets in nested directories
- Dynamically loads TestOneInput() function
- Configurable max_iterations and timeout
- Real-time stats callback support for live monitoring
- Returns findings as ModuleFinding objects
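
The target auto-discovery described above can be sketched roughly as follows. This is an illustration only, not the module's actual code: `discover_fuzz_targets` and `FUZZ_TARGET_PATTERNS` are hypothetical names, and only the three filename patterns and the recursive search come from the list above.

```python
from pathlib import Path

# Filename patterns taken from the commit message.
FUZZ_TARGET_PATTERNS = ("fuzz_*.py", "*_fuzz.py", "fuzz_target.py")


def discover_fuzz_targets(root: Path) -> list[Path]:
    """Recursively collect files matching any fuzz-target pattern.

    Hypothetical helper: the real AtherisFuzzer may order or filter
    candidates differently.
    """
    found: set[Path] = set()
    for pattern in FUZZ_TARGET_PATTERNS:
        # rglob walks nested directories, so targets below the
        # project root are picked up as well.
        found.update(p for p in root.rglob(pattern) if p.is_file())
    # A set removes duplicates (fuzz_target.py matches two patterns);
    # sorting keeps the result deterministic.
    return sorted(found)
```

Collecting into a set first matters because `fuzz_target.py` also matches `fuzz_*.py` and would otherwise be reported twice.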

## Atheris Fuzzing Workflow (backend/toolbox/workflows/atheris_fuzzing/)
- Temporal workflow for orchestrating fuzzing
- Downloads user code from MinIO
- Executes AtherisFuzzer module
- Uploads results to MinIO
- Cleans up cache after execution
- metadata.yaml with vertical: python for routing

## Test Project (test_projects/python_fuzz_waterfall/)
- Demonstrates stateful waterfall vulnerability
- main.py with check_secret() that leaks progress
- fuzz_target.py with Atheris TestOneInput() harness
- Complete README with usage instructions

## Backend Fixes
- Fixed parameter merging in REST API endpoints (workflows.py)
- Changed workflow parameter passing from positional args to kwargs (manager.py)
- Default parameters now properly merged with user parameters

## Testing
- Worker discovered AtherisFuzzingWorkflow
- Workflow executed end-to-end successfully
- Fuzz target auto-discovered in nested directories
- Atheris ran 100,000 iterations
- Results uploaded and cache cleaned

* chore: Complete Temporal migration with updated CLI/SDK/docs

This commit includes all remaining Temporal migration changes:

## CLI Updates (cli/)
- Updated workflow execution commands for Temporal
- Enhanced error handling and exceptions
- Updated dependencies in uv.lock

## SDK Updates (sdk/)
- Client methods updated for Temporal workflows
- Updated models for new workflow execution
- Updated dependencies in uv.lock

## Documentation Updates (docs/)
- Architecture documentation for Temporal
- Workflow concept documentation
- Resource management documentation (new)
- Debugging guide (new)
- Updated tutorials and how-to guides
- Troubleshooting updates

## README Updates
- Main README with Temporal instructions
- Backend README
- CLI README
- SDK README

## Other
- Updated IMPLEMENTATION_STATUS.md
- Removed old vulnerable_app.tar.gz

These changes complete the Temporal migration and ensure the
CLI/SDK work correctly with the new backend.

* fix: Use positional args instead of kwargs for Temporal workflows

The Temporal Python SDK's start_workflow() method doesn't accept
a 'kwargs' parameter. Workflows must receive parameters as positional
arguments via the 'args' parameter.

Changed to:
  args=workflow_args  # Positional arguments

This fixes the error:
  TypeError: Client.start_workflow() got an unexpected keyword argument 'kwargs'

Workflows now correctly receive parameters in order:
- security_assessment: [target_id, scanner_config, analyzer_config, reporter_config]
- atheris_fuzzing: [target_id, target_file, max_iterations, timeout_seconds]
- rust_test: [target_id, test_message]
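
A minimal sketch of how a parameter dictionary might be flattened into the ordered list that `args` expects. `build_workflow_args` and the explicit `parameter_order` argument are hypothetical; this commit message does not show how manager.py derives the order.

```python
from typing import Any


def build_workflow_args(
    target_id: str,
    params: dict[str, Any],
    parameter_order: list[str],
) -> list[Any]:
    """Flatten named parameters into a positional list (hypothetical helper).

    target_id always comes first, matching the orders listed above.
    A parameter that was not supplied becomes None so that later
    positions do not shift.
    """
    args: list[Any] = [target_id]
    for name in parameter_order:
        args.append(params.get(name))
    return args


# The resulting list would then be passed through the `args` keyword,
# roughly like this (the id and task_queue values are placeholders):
#   await client.start_workflow(
#       "AtherisFuzzingWorkflow", args=workflow_args,
#       id="run-123", task_queue="python-queue",
#   )
```

Padding with `None` is one possible design choice; it assumes each workflow's `run` method declares defaults for its optional parameters.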

* fix: Filter metadata-only parameters from workflow arguments

SecurityAssessmentWorkflow was receiving 7 arguments instead of 2-5.
The issue was that target_path and volume_mode from default_parameters
were being passed to the workflow, when they should only be used by
the system for configuration.

Now filters out metadata-only parameters (target_path, volume_mode)
before passing arguments to workflow execution.
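
The filtering step could look roughly like the sketch below. The two parameter names come from this commit message; `METADATA_ONLY_PARAMETERS` and `filter_workflow_parameters` are hypothetical names used for illustration.

```python
from typing import Any

# Parameters consumed by the system for configuration and never
# forwarded to the workflow itself.
METADATA_ONLY_PARAMETERS = frozenset({"target_path", "volume_mode"})


def filter_workflow_parameters(params: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of params without the metadata-only keys."""
    return {
        name: value
        for name, value in params.items()
        if name not in METADATA_ONLY_PARAMETERS
    }
```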

* refactor: Remove Prefect leftovers and volume mounting legacy

Complete cleanup of Prefect migration artifacts:

Backend:
- Delete registry.py and workflow_discovery.py (Prefect-specific files)
- Remove Docker validation from setup.py (no longer needed)
- Remove ResourceLimits and VolumeMount models
- Remove target_path and volume_mode from WorkflowSubmission
- Remove supported_volume_modes from API and discovery
- Clean up metadata.yaml files (remove volume/path fields)
- Simplify parameter filtering in manager.py

SDK:
- Remove volume_mode parameter from client methods
- Remove ResourceLimits and VolumeMount models
- Remove Prefect error patterns from docker_logs.py
- Clean up WorkflowSubmission and WorkflowMetadata models

CLI:
- Remove Volume Modes display from workflow info

All removed features are Prefect-specific or Docker volume mounting
artifacts. Temporal workflows use MinIO storage exclusively.

* feat: Add comprehensive test suite and benchmark infrastructure

- Add 68 unit tests for fuzzer, scanner, and analyzer modules
- Implement pytest-based test infrastructure with fixtures
- Add 6 performance benchmarks with category-specific thresholds
- Configure GitHub Actions for automated testing and benchmarking
- Add test and benchmark documentation

Test coverage:
- AtherisFuzzer: 8 tests
- CargoFuzzer: 14 tests
- FileScanner: 22 tests
- SecurityAnalyzer: 24 tests

All tests passing (68/68)
All benchmarks passing (6/6)

* fix: Resolve all ruff linting violations across codebase

Fixed 27 ruff violations in 12 files:
- Removed unused imports (Depends, Dict, Any, Optional, etc.)
- Fixed undefined workflow_info variable in workflows.py
- Removed dead code with undefined variables in atheris_fuzzer.py
- Changed f-string to regular string where no placeholders used

All files now pass ruff checks for CI/CD compliance.

* fix: Configure CI for unit tests only

- Renamed docker-compose.temporal.yaml → docker-compose.yml for CI compatibility
- Commented out integration-tests job (no integration tests yet)
- Updated test-summary to only depend on lint and unit-tests

CI will now run successfully with 68 unit tests. Integration tests can be added later.

* feat: Add CI/CD integration with ephemeral deployment model

Implements comprehensive CI/CD support for FuzzForge with on-demand worker management:

**Worker Management (v0.7.0)**
- Add WorkerManager for automatic worker lifecycle control
- Auto-start workers from stopped state when workflows execute
- Auto-stop workers after workflow completion
- Health checks and startup timeout handling (90s default)
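
As an illustration of the startup timeout handling, a health check can be polled until it passes or a deadline expires. This is a sketch under assumptions: `wait_until_healthy` and its parameters are hypothetical and not the WorkerManager API; only the 90 second default comes from the list above.

```python
import time
from typing import Callable


def wait_until_healthy(
    is_healthy: Callable[[], bool],
    timeout: float = 90.0,
    interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll is_healthy() until it returns True or the timeout expires.

    Returns True when the worker became healthy in time, False otherwise.
    The sleep function is injectable so the loop can be tested quickly.
    """
    # A monotonic clock is unaffected by wall-clock adjustments.
    deadline = time.monotonic() + timeout
    while True:
        if is_healthy():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)
```

Checking health before checking the deadline means a worker that is already running returns immediately, even with a timeout of zero.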

**CI/CD Features**
- `--fail-on` flag: Fail builds based on SARIF severity levels (error/warning/note/info)
- `--export-sarif` flag: Export findings in SARIF 2.1.0 format
- `--auto-start`/`--auto-stop` flags: Control worker lifecycle
- Exit code propagation: Returns 1 on blocking findings, 0 on success
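
A rough sketch of how severity gating could map SARIF results to an exit code. It assumes `--fail-on` is treated as a set of exact levels to block on, which the commit message does not confirm; the function names are hypothetical. Results without a `level` are counted as `warning`, a simplification of the SARIF 2.1.0 default.

```python
from typing import Any


def count_blocking_findings(sarif: dict[str, Any], fail_on: set[str]) -> int:
    """Count SARIF results whose level is listed in fail_on."""
    blocking = 0
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # Simplification: a missing level is treated as "warning".
            if result.get("level", "warning") in fail_on:
                blocking += 1
    return blocking


def exit_code_for(sarif: dict[str, Any], fail_on: set[str]) -> int:
    """Return 1 when any blocking finding exists, otherwise 0."""
    return 1 if count_blocking_findings(sarif, fail_on) > 0 else 0
```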

**Exit Code Fix**
- Add `except typer.Exit: raise` handlers at 3 critical locations
- Move worker cleanup to finally block for guaranteed execution
- Exit codes now propagate correctly even when build fails
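
The pattern behind this fix can be sketched as below. To keep the example runnable without Typer installed, `Exit` is a stand-in class that mimics `typer.Exit` (an ordinary exception carrying an `exit_code`); `execute`, `workflow_step`, and `stop_workers` are hypothetical names, and the exit code 2 in the broad handler is chosen only to make the difference visible.

```python
from typing import Callable


class Exit(RuntimeError):
    """Stand-in for typer.Exit, used only so this sketch is self-contained."""

    def __init__(self, code: int = 0) -> None:
        super().__init__(code)
        self.exit_code = code


def execute(workflow_step: Callable[[], None], stop_workers: Callable[[], None]) -> None:
    try:
        workflow_step()
    except Exit:
        # Re-raise first: Exit is an ordinary exception, so the broad
        # handler below would otherwise replace the intended exit code.
        raise
    except Exception as exc:
        print(f"unexpected error: {exc}")
        raise Exit(code=2)
    finally:
        # Runs on success, on Exit, and on unexpected errors alike.
        stop_workers()
```

Placing the cleanup in `finally` means it still runs when the Exit is re-raised, so workers are stopped even when the build fails.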

**CI Scripts & Examples**
- ci-start.sh: Start FuzzForge services with health checks
- ci-stop.sh: Clean shutdown with volume preservation option
- GitHub Actions workflow example (security-scan.yml)
- GitLab CI pipeline example (.gitlab-ci.example.yml)
- docker-compose.ci.yml: CI-optimized compose file with profiles

**OSS-Fuzz Integration**
- New ossfuzz_campaign workflow for running OSS-Fuzz projects
- OSS-Fuzz worker with Docker-in-Docker support
- Configurable campaign duration and project selection

**Documentation**
- Comprehensive CI/CD integration guide (docs/how-to/cicd-integration.md)
- Updated architecture docs with worker lifecycle details
- Updated workspace isolation documentation
- CLI README with worker management examples

**SDK Enhancements**
- Add get_workflow_worker_info() endpoint
- Worker vertical metadata in workflow responses

**Testing**
- All workflows tested: security_assessment, atheris_fuzzing, secret_detection, cargo_fuzzing
- All monitoring commands tested: stats, crashes, status, finding
- Full CI pipeline simulation verified
- Exit codes verified for success/failure scenarios

Ephemeral CI/CD model: ~3-4GB RAM, ~60-90s startup, runs entirely in CI containers.

* fix: Resolve ruff linting violations in CI/CD code

- Remove unused variables (run_id, defaults, result)
- Remove unused imports
- Fix f-string without placeholders

All CI/CD integration files now pass ruff checks.
tduhamel42
2025-10-14 10:13:45 +02:00
committed by GitHub
parent 987c49569c
commit 60ca088ecf
167 changed files with 26101 additions and 5703 deletions
+4 -4
@@ -14,8 +14,8 @@ API endpoints for fuzzing workflow management and real-time monitoring
 # Additional attribution and requirements are provided in the NOTICE file.
 import logging
-from typing import List, Dict, Any
-from fastapi import APIRouter, HTTPException, Depends, WebSocket, WebSocketDisconnect
+from typing import List, Dict
+from fastapi import APIRouter, HTTPException, WebSocket, WebSocketDisconnect
 from fastapi.responses import StreamingResponse
 import asyncio
 import json
@@ -25,7 +25,6 @@ from src.models.findings import (
     FuzzingStats,
     CrashReport
 )
-from src.core.workflow_discovery import WorkflowDiscovery
 logger = logging.getLogger(__name__)
@@ -126,12 +125,13 @@ async def update_fuzzing_stats(run_id: str, stats: FuzzingStats):
     # Debug: log reception for live instrumentation
     try:
         logger.info(
-            "Received fuzzing stats update: run_id=%s exec=%s eps=%.2f crashes=%s corpus=%s elapsed=%ss",
+            "Received fuzzing stats update: run_id=%s exec=%s eps=%.2f crashes=%s corpus=%s coverage=%s elapsed=%ss",
             run_id,
             stats.executions,
             stats.executions_per_sec,
             stats.crashes,
             stats.corpus_size,
+            stats.coverage,
             stats.elapsed_time,
         )
     except Exception:
+49 -56
@@ -14,7 +14,6 @@ API endpoints for workflow run management and findings retrieval
 # Additional attribution and requirements are provided in the NOTICE file.
 import logging
-from typing import Dict, Any
 from fastapi import APIRouter, HTTPException, Depends
 from src.models.findings import WorkflowFindings, WorkflowStatus
@@ -24,22 +23,22 @@ logger = logging.getLogger(__name__)
 router = APIRouter(prefix="/runs", tags=["runs"])
-def get_prefect_manager():
-    """Dependency to get the Prefect manager instance"""
-    from src.main import prefect_mgr
-    return prefect_mgr
+def get_temporal_manager():
+    """Dependency to get the Temporal manager instance"""
+    from src.main import temporal_mgr
+    return temporal_mgr
 @router.get("/{run_id}/status", response_model=WorkflowStatus)
 async def get_run_status(
     run_id: str,
-    prefect_mgr=Depends(get_prefect_manager)
+    temporal_mgr=Depends(get_temporal_manager)
 ) -> WorkflowStatus:
     """
     Get the current status of a workflow run.
     Args:
-        run_id: The flow run ID
+        run_id: The workflow run ID
     Returns:
         Status information including state, timestamps, and completion flags
@@ -48,25 +47,23 @@ async def get_run_status(
         HTTPException: 404 if run not found
     """
     try:
-        status = await prefect_mgr.get_flow_run_status(run_id)
+        status = await temporal_mgr.get_workflow_status(run_id)
-        # Find workflow name from deployment
-        workflow_name = "unknown"
-        workflow_deployment_id = status.get("workflow", "")
-        for name, deployment_id in prefect_mgr.deployments.items():
-            if str(deployment_id) == str(workflow_deployment_id):
-                workflow_name = name
-                break
+        # Map Temporal status to response format
+        workflow_status = status.get("status", "UNKNOWN")
+        is_completed = workflow_status in ["COMPLETED", "FAILED", "CANCELLED"]
+        is_failed = workflow_status == "FAILED"
+        is_running = workflow_status == "RUNNING"
         return WorkflowStatus(
-            run_id=status["run_id"],
-            workflow=workflow_name,
-            status=status["status"],
-            is_completed=status["is_completed"],
-            is_failed=status["is_failed"],
-            is_running=status["is_running"],
-            created_at=status["created_at"],
-            updated_at=status["updated_at"]
+            run_id=run_id,
+            workflow="unknown",  # Temporal doesn't track workflow name in status
+            status=workflow_status,
+            is_completed=is_completed,
+            is_failed=is_failed,
+            is_running=is_running,
+            created_at=status.get("start_time"),
+            updated_at=status.get("close_time") or status.get("execution_time")
         )
except Exception as e:
@@ -80,13 +77,13 @@ async def get_run_status(
 @router.get("/{run_id}/findings", response_model=WorkflowFindings)
 async def get_run_findings(
     run_id: str,
-    prefect_mgr=Depends(get_prefect_manager)
+    temporal_mgr=Depends(get_temporal_manager)
 ) -> WorkflowFindings:
     """
     Get the findings from a completed workflow run.
     Args:
-        run_id: The flow run ID
+        run_id: The workflow run ID
     Returns:
         SARIF-formatted findings from the workflow execution
@@ -96,50 +93,46 @@ async def get_run_findings(
"""
     try:
         # Get run status first
-        status = await prefect_mgr.get_flow_run_status(run_id)
+        status = await temporal_mgr.get_workflow_status(run_id)
+        workflow_status = status.get("status", "UNKNOWN")
-        if not status["is_completed"]:
-            if status["is_running"]:
+        if workflow_status not in ["COMPLETED", "FAILED", "CANCELLED"]:
+            if workflow_status == "RUNNING":
                 raise HTTPException(
                     status_code=400,
-                    detail=f"Run {run_id} is still running. Current status: {status['status']}"
-                )
-            elif status["is_failed"]:
-                raise HTTPException(
-                    status_code=400,
-                    detail=f"Run {run_id} failed. Status: {status['status']}"
+                    detail=f"Run {run_id} is still running. Current status: {workflow_status}"
                 )
             else:
                 raise HTTPException(
                     status_code=400,
-                    detail=f"Run {run_id} not completed. Status: {status['status']}"
+                    detail=f"Run {run_id} not completed. Status: {workflow_status}"
                 )
-        # Get the findings
-        findings = await prefect_mgr.get_flow_run_findings(run_id)
+        if workflow_status == "FAILED":
+            raise HTTPException(
+                status_code=400,
+                detail=f"Run {run_id} failed. Status: {workflow_status}"
+            )
-        # Find workflow name
-        workflow_name = "unknown"
-        workflow_deployment_id = status.get("workflow", "")
-        for name, deployment_id in prefect_mgr.deployments.items():
-            if str(deployment_id) == str(workflow_deployment_id):
-                workflow_name = name
-                break
+        # Get the workflow result
+        result = await temporal_mgr.get_workflow_result(run_id)
-        # Get workflow version if available
+        # Extract SARIF from result (handle None for backwards compatibility)
+        if isinstance(result, dict):
+            sarif = result.get("sarif") or {}
+        else:
+            sarif = {}
+        # Metadata
         metadata = {
-            "completion_time": status["updated_at"],
+            "completion_time": status.get("close_time"),
             "workflow_version": "unknown"
         }
-        if workflow_name in prefect_mgr.workflows:
-            workflow_info = prefect_mgr.workflows[workflow_name]
-            metadata["workflow_version"] = workflow_info.metadata.get("version", "unknown")
         return WorkflowFindings(
-            workflow=workflow_name,
+            workflow="unknown",
             run_id=run_id,
-            sarif=findings,
+            sarif=sarif,
             metadata=metadata
         )
@@ -157,7 +150,7 @@ async def get_run_findings(
 async def get_workflow_findings(
     workflow_name: str,
     run_id: str,
-    prefect_mgr=Depends(get_prefect_manager)
+    temporal_mgr=Depends(get_temporal_manager)
 ) -> WorkflowFindings:
     """
     Get findings for a specific workflow run.
@@ -166,7 +159,7 @@ async def get_workflow_findings(
     Args:
         workflow_name: Name of the workflow
-        run_id: The flow run ID
+        run_id: The workflow run ID
     Returns:
         SARIF-formatted findings from the workflow execution
@@ -174,11 +167,11 @@ async def get_workflow_findings(
     Raises:
         HTTPException: 404 if workflow or run not found, 400 if run not completed
     """
-    if workflow_name not in prefect_mgr.workflows:
+    if workflow_name not in temporal_mgr.workflows:
         raise HTTPException(
             status_code=404,
             detail=f"Workflow not found: {workflow_name}"
         )
     # Delegate to the main findings endpoint
-    return await get_run_findings(run_id, prefect_mgr)
+    return await get_run_findings(run_id, temporal_mgr)
+307 -59
@@ -15,8 +15,9 @@ API endpoints for workflow management with enhanced error handling
import logging
import traceback
import tempfile
from typing import List, Dict, Any, Optional
from fastapi import APIRouter, HTTPException, Depends
from fastapi import APIRouter, HTTPException, Depends, UploadFile, File, Form
from pathlib import Path
from src.models.findings import (
@@ -25,10 +26,20 @@ from src.models.findings import (
WorkflowListItem,
RunSubmissionResponse
)
from src.core.workflow_discovery import WorkflowDiscovery
from src.temporal.discovery import WorkflowDiscovery
logger = logging.getLogger(__name__)
# Configuration for file uploads
MAX_UPLOAD_SIZE = 10 * 1024 * 1024 * 1024 # 10 GB
ALLOWED_CONTENT_TYPES = [
"application/gzip",
"application/x-gzip",
"application/x-tar",
"application/x-compressed-tar",
"application/octet-stream", # Generic binary
]
router = APIRouter(prefix="/workflows", tags=["workflows"])
@@ -68,15 +79,15 @@ def create_structured_error_response(
return error_response
def get_prefect_manager():
"""Dependency to get the Prefect manager instance"""
from src.main import prefect_mgr
return prefect_mgr
def get_temporal_manager():
"""Dependency to get the Temporal manager instance"""
from src.main import temporal_mgr
return temporal_mgr
@router.get("/", response_model=List[WorkflowListItem])
async def list_workflows(
prefect_mgr=Depends(get_prefect_manager)
temporal_mgr=Depends(get_temporal_manager)
) -> List[WorkflowListItem]:
"""
List all discovered workflows with their metadata.
@@ -85,7 +96,7 @@ async def list_workflows(
author, and tags.
"""
workflows = []
for name, info in prefect_mgr.workflows.items():
for name, info in temporal_mgr.workflows.items():
workflows.append(WorkflowListItem(
name=name,
version=info.metadata.get("version", "0.6.0"),
@@ -111,7 +122,7 @@ async def get_metadata_schema() -> Dict[str, Any]:
@router.get("/{workflow_name}/metadata", response_model=WorkflowMetadata)
async def get_workflow_metadata(
workflow_name: str,
prefect_mgr=Depends(get_prefect_manager)
temporal_mgr=Depends(get_temporal_manager)
) -> WorkflowMetadata:
"""
Get complete metadata for a specific workflow.
@@ -126,8 +137,8 @@ async def get_workflow_metadata(
Raises:
HTTPException: 404 if workflow not found
"""
if workflow_name not in prefect_mgr.workflows:
available_workflows = list(prefect_mgr.workflows.keys())
if workflow_name not in temporal_mgr.workflows:
available_workflows = list(temporal_mgr.workflows.keys())
error_response = create_structured_error_response(
error_type="WorkflowNotFound",
message=f"Workflow '{workflow_name}' not found",
@@ -143,7 +154,7 @@ async def get_workflow_metadata(
detail=error_response
)
info = prefect_mgr.workflows[workflow_name]
info = temporal_mgr.workflows[workflow_name]
metadata = info.metadata
return WorkflowMetadata(
@@ -154,9 +165,7 @@ async def get_workflow_metadata(
tags=metadata.get("tags", []),
parameters=metadata.get("parameters", {}),
default_parameters=metadata.get("default_parameters", {}),
required_modules=metadata.get("required_modules", []),
supported_volume_modes=metadata.get("supported_volume_modes", ["ro", "rw"]),
has_custom_docker=info.has_docker
required_modules=metadata.get("required_modules", [])
)
@@ -164,14 +173,14 @@ async def get_workflow_metadata(
async def submit_workflow(
workflow_name: str,
submission: WorkflowSubmission,
prefect_mgr=Depends(get_prefect_manager)
temporal_mgr=Depends(get_temporal_manager)
) -> RunSubmissionResponse:
"""
Submit a workflow for execution with volume mounting.
Submit a workflow for execution.
Args:
workflow_name: Name of the workflow to execute
submission: Submission parameters including target path and volume mode
submission: Submission parameters including target path and parameters
Returns:
Run submission response with run_id and initial status
@@ -179,8 +188,8 @@ async def submit_workflow(
Raises:
HTTPException: 404 if workflow not found, 400 for invalid parameters
"""
if workflow_name not in prefect_mgr.workflows:
available_workflows = list(prefect_mgr.workflows.keys())
if workflow_name not in temporal_mgr.workflows:
available_workflows = list(temporal_mgr.workflows.keys())
error_response = create_structured_error_response(
error_type="WorkflowNotFound",
message=f"Workflow '{workflow_name}' not found",
@@ -197,31 +206,36 @@ async def submit_workflow(
)
try:
# Convert ResourceLimits to dict if provided
resource_limits_dict = None
if submission.resource_limits:
resource_limits_dict = {
"cpu_limit": submission.resource_limits.cpu_limit,
"memory_limit": submission.resource_limits.memory_limit,
"cpu_request": submission.resource_limits.cpu_request,
"memory_request": submission.resource_limits.memory_request
}
# Upload target file to MinIO and get target_id
target_path = Path(submission.target_path)
if not target_path.exists():
raise ValueError(f"Target path does not exist: {submission.target_path}")
# Submit the workflow with enhanced parameters
flow_run = await prefect_mgr.submit_workflow(
workflow_name=workflow_name,
target_path=submission.target_path,
volume_mode=submission.volume_mode,
parameters=submission.parameters,
resource_limits=resource_limits_dict,
additional_volumes=submission.additional_volumes,
timeout=submission.timeout
# Upload target (using anonymous user for now)
target_id = await temporal_mgr.upload_target(
file_path=target_path,
user_id="api-user",
metadata={"workflow": workflow_name}
)
run_id = str(flow_run.id)
# Merge default parameters with user parameters
workflow_info = temporal_mgr.workflows[workflow_name]
metadata = workflow_info.metadata or {}
defaults = metadata.get("default_parameters", {})
user_params = submission.parameters or {}
workflow_params = {**defaults, **user_params}
# Start workflow execution
handle = await temporal_mgr.run_workflow(
workflow_name=workflow_name,
target_id=target_id,
workflow_params=workflow_params
)
run_id = handle.id
# Initialize fuzzing tracking if this looks like a fuzzing workflow
workflow_info = prefect_mgr.workflows.get(workflow_name, {})
workflow_info = temporal_mgr.workflows.get(workflow_name, {})
workflow_tags = workflow_info.metadata.get("tags", []) if hasattr(workflow_info, 'metadata') else []
if "fuzzing" in workflow_tags or "fuzz" in workflow_name.lower():
from src.api.fuzzing import initialize_fuzzing_tracking
@@ -229,7 +243,7 @@ async def submit_workflow(
return RunSubmissionResponse(
run_id=run_id,
status=flow_run.state.name if flow_run.state else "PENDING",
status="RUNNING",
workflow=workflow_name,
message=f"Workflow '{workflow_name}' submitted successfully"
)
@@ -261,17 +275,13 @@ async def submit_workflow(
error_type = "WorkflowSubmissionError"
# Detect specific error patterns
if "deployment" in error_message.lower():
error_type = "DeploymentError"
deployment_info = {
"status": "failed",
"error": error_message
}
if "workflow" in error_message.lower() and "not found" in error_message.lower():
error_type = "WorkflowError"
suggestions.extend([
"Check if Prefect server is running and accessible",
"Verify Docker is running and has sufficient resources",
"Check container image availability",
"Ensure volume paths exist and are accessible"
"Check if Temporal server is running and accessible",
"Verify workflow workers are running",
"Check if workflow is registered with correct vertical",
"Ensure Docker is running and has sufficient resources"
])
elif "volume" in error_message.lower() or "mount" in error_message.lower():
@@ -324,25 +334,200 @@ async def submit_workflow(
)
@router.get("/{workflow_name}/parameters")
async def get_workflow_parameters(
@router.post("/{workflow_name}/upload-and-submit", response_model=RunSubmissionResponse)
async def upload_and_submit_workflow(
workflow_name: str,
prefect_mgr=Depends(get_prefect_manager)
file: UploadFile = File(..., description="Target file or tarball to analyze"),
parameters: Optional[str] = Form(None, description="JSON-encoded workflow parameters"),
timeout: Optional[int] = Form(None, description="Timeout in seconds"),
temporal_mgr=Depends(get_temporal_manager)
) -> RunSubmissionResponse:
"""
Upload a target file/tarball and submit workflow for execution.
This endpoint accepts multipart/form-data uploads and is the recommended
way to submit workflows from remote CLI clients.
Args:
workflow_name: Name of the workflow to execute
file: Target file or tarball (compressed directory)
parameters: JSON string of workflow parameters (optional)
timeout: Execution timeout in seconds (optional)
Returns:
Run submission response with run_id and initial status
Raises:
HTTPException: 404 if workflow not found, 400 for invalid parameters,
413 if file too large
"""
if workflow_name not in temporal_mgr.workflows:
available_workflows = list(temporal_mgr.workflows.keys())
error_response = create_structured_error_response(
error_type="WorkflowNotFound",
message=f"Workflow '{workflow_name}' not found",
workflow_name=workflow_name,
suggestions=[
f"Available workflows: {', '.join(available_workflows)}",
"Use GET /workflows/ to see all available workflows"
]
)
raise HTTPException(status_code=404, detail=error_response)
temp_file_path = None
try:
# Validate file size
file_size = 0
chunk_size = 1024 * 1024 # 1MB chunks
# Create temporary file
temp_fd, temp_file_path = tempfile.mkstemp(suffix=".tar.gz")
logger.info(f"Receiving file upload for workflow '{workflow_name}': {file.filename}")
# Stream file to disk
with open(temp_fd, 'wb') as temp_file:
while True:
chunk = await file.read(chunk_size)
if not chunk:
break
file_size += len(chunk)
# Check size limit
if file_size > MAX_UPLOAD_SIZE:
raise HTTPException(
status_code=413,
detail=create_structured_error_response(
error_type="FileTooLarge",
message=f"File size exceeds maximum allowed size of {MAX_UPLOAD_SIZE / (1024**3):.1f} GB",
workflow_name=workflow_name,
suggestions=[
"Reduce the size of your target directory",
"Exclude unnecessary files (build artifacts, dependencies, etc.)",
"Consider splitting into smaller analysis targets"
]
)
)
temp_file.write(chunk)
logger.info(f"Received file: {file_size / (1024**2):.2f} MB")
# Parse parameters
workflow_params = {}
if parameters:
try:
import json
workflow_params = json.loads(parameters)
if not isinstance(workflow_params, dict):
raise ValueError("Parameters must be a JSON object")
except (json.JSONDecodeError, ValueError) as e:
raise HTTPException(
status_code=400,
detail=create_structured_error_response(
error_type="InvalidParameters",
message=f"Invalid parameters JSON: {e}",
workflow_name=workflow_name,
suggestions=["Ensure parameters is valid JSON object"]
)
)
# Upload to MinIO
target_id = await temporal_mgr.upload_target(
file_path=Path(temp_file_path),
user_id="api-user",
metadata={
"workflow": workflow_name,
"original_filename": file.filename,
"upload_method": "multipart"
}
)
logger.info(f"Uploaded to MinIO with target_id: {target_id}")
# Merge default parameters with user parameters
workflow_info = temporal_mgr.workflows.get(workflow_name)
metadata = workflow_info.metadata or {}
defaults = metadata.get("default_parameters", {})
workflow_params = {**defaults, **workflow_params}
# Start workflow execution
handle = await temporal_mgr.run_workflow(
workflow_name=workflow_name,
target_id=target_id,
workflow_params=workflow_params
)
run_id = handle.id
# Initialize fuzzing tracking if needed
workflow_info = temporal_mgr.workflows.get(workflow_name, {})
workflow_tags = workflow_info.metadata.get("tags", []) if hasattr(workflow_info, 'metadata') else []
if "fuzzing" in workflow_tags or "fuzz" in workflow_name.lower():
from src.api.fuzzing import initialize_fuzzing_tracking
initialize_fuzzing_tracking(run_id, workflow_name)
return RunSubmissionResponse(
run_id=run_id,
status="RUNNING",
workflow=workflow_name,
message=f"Workflow '{workflow_name}' submitted successfully with uploaded target"
)
except HTTPException:
raise
except Exception as e:
logger.error(f"Failed to upload and submit workflow '{workflow_name}': {e}")
logger.error(f"Traceback: {traceback.format_exc()}")
error_response = create_structured_error_response(
error_type="WorkflowSubmissionError",
message=f"Failed to process upload and submit workflow: {str(e)}",
workflow_name=workflow_name,
suggestions=[
"Check if the uploaded file is a valid tarball",
"Verify MinIO storage is accessible",
"Check backend logs for detailed error information",
"Ensure Temporal workers are running"
]
)
raise HTTPException(status_code=500, detail=error_response)
finally:
# Cleanup temporary file
if temp_file_path and Path(temp_file_path).exists():
try:
Path(temp_file_path).unlink()
logger.debug(f"Cleaned up temp file: {temp_file_path}")
except Exception as e:
logger.warning(f"Failed to cleanup temp file {temp_file_path}: {e}")
@router.get("/{workflow_name}/worker-info")
async def get_workflow_worker_info(
workflow_name: str,
temporal_mgr=Depends(get_temporal_manager)
) -> Dict[str, Any]:
"""
Get the parameters schema for a workflow.
Get worker information for a workflow.
Returns details about which worker is required to execute this workflow,
including container name, task queue, and vertical.
Args:
workflow_name: Name of the workflow
Returns:
Parameters schema with types, descriptions, and defaults
Worker information including vertical, container name, and task queue
Raises:
HTTPException: 404 if workflow not found
"""
if workflow_name not in prefect_mgr.workflows:
available_workflows = list(prefect_mgr.workflows.keys())
if workflow_name not in temporal_mgr.workflows:
available_workflows = list(temporal_mgr.workflows.keys())
error_response = create_structured_error_response(
error_type="WorkflowNotFound",
message=f"Workflow '{workflow_name}' not found",
@@ -357,7 +542,70 @@ async def get_workflow_parameters(
detail=error_response
)
info = prefect_mgr.workflows[workflow_name]
info = temporal_mgr.workflows[workflow_name]
metadata = info.metadata
# Extract vertical from metadata
vertical = metadata.get("vertical")
if not vertical:
error_response = create_structured_error_response(
error_type="MissingVertical",
message=f"Workflow '{workflow_name}' does not specify a vertical in metadata",
workflow_name=workflow_name,
suggestions=[
"Check workflow metadata.yaml for 'vertical' field",
"Contact workflow author for support"
]
)
raise HTTPException(
status_code=500,
detail=error_response
)
return {
"workflow": workflow_name,
"vertical": vertical,
"worker_container": f"fuzzforge-worker-{vertical}",
"task_queue": f"{vertical}-queue",
"required": True
}
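The naming convention used in the response above can be looked at in isolation. The following is a minimal sketch that assumes only the two f-string patterns visible in the handler; `worker_info_for` is a hypothetical helper name for illustration and is not part of the FuzzForge codebase.

```python
from typing import Any, Dict


def worker_info_for(workflow_name: str, vertical: str) -> Dict[str, Any]:
    """Hypothetical helper mirroring the dict returned by the worker-info endpoint."""
    if not vertical:
        # The endpoint reports this case as a 500 "MissingVertical" error;
        # a plain ValueError stands in for it in this sketch.
        raise ValueError(f"Workflow '{workflow_name}' does not specify a vertical")
    return {
        "workflow": workflow_name,
        "vertical": vertical,
        # Container and queue names are both derived from the vertical.
        "worker_container": f"fuzzforge-worker-{vertical}",
        "task_queue": f"{vertical}-queue",
        "required": True,
    }
```

Because both names are derived from the `vertical` field alone, a client could use this response to decide which worker container to start before submitting a run.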
@router.get("/{workflow_name}/parameters")
async def get_workflow_parameters(
workflow_name: str,
temporal_mgr=Depends(get_temporal_manager)
) -> Dict[str, Any]:
"""
Get the parameters schema for a workflow.
Args:
workflow_name: Name of the workflow
Returns:
Parameters schema with types, descriptions, and defaults
Raises:
HTTPException: 404 if workflow not found
"""
if workflow_name not in temporal_mgr.workflows:
available_workflows = list(temporal_mgr.workflows.keys())
error_response = create_structured_error_response(
error_type="WorkflowNotFound",
message=f"Workflow '{workflow_name}' not found",
workflow_name=workflow_name,
suggestions=[
f"Available workflows: {', '.join(available_workflows)}",
"Use GET /workflows/ to see all available workflows"
]
)
raise HTTPException(
status_code=404,
detail=error_response
)
info = temporal_mgr.workflows[workflow_name]
metadata = info.metadata
# Return parameters with enhanced schema information
-770
@@ -1,770 +0,0 @@
"""
Prefect Manager - Core orchestration for workflow deployment and execution
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import logging
import os
import platform
import re
from pathlib import Path
from typing import Dict, Optional, Any
from prefect import get_client
from prefect.docker import DockerImage
from prefect.client.schemas import FlowRun
from src.core.workflow_discovery import WorkflowDiscovery, WorkflowInfo
logger = logging.getLogger(__name__)
def get_registry_url(context: str = "default") -> str:
"""
Get the container registry URL to use for a given operation context.
Goals:
- Work reliably across Linux and macOS Docker Desktop
- Prefer in-network service discovery when running inside containers
- Allow full override via env vars from docker-compose
Env overrides:
- FUZZFORGE_REGISTRY_PUSH_URL: used for image builds/pushes
- FUZZFORGE_REGISTRY_PULL_URL: used for workers to pull images
"""
# Normalize context
ctx = (context or "default").lower()
# Always honor explicit overrides first
if ctx in ("push", "build"):
push_url = os.getenv("FUZZFORGE_REGISTRY_PUSH_URL")
if push_url:
logger.debug("Using FUZZFORGE_REGISTRY_PUSH_URL: %s", push_url)
return push_url
# Default to host-published registry for Docker daemon operations
return "localhost:5001"
if ctx == "pull":
pull_url = os.getenv("FUZZFORGE_REGISTRY_PULL_URL")
if pull_url:
logger.debug("Using FUZZFORGE_REGISTRY_PULL_URL: %s", pull_url)
return pull_url
# Prefect worker pulls via host Docker daemon as well
return "localhost:5001"
# Default/fallback
return os.getenv("FUZZFORGE_REGISTRY_PULL_URL", os.getenv("FUZZFORGE_REGISTRY_PUSH_URL", "localhost:5001"))
def _compose_project_name(default: str = "fuzzforge") -> str:
"""Return the docker-compose project name used for network/volume naming.
Always returns 'fuzzforge' regardless of environment variables.
"""
return "fuzzforge"
class PrefectManager:
"""
Manages Prefect deployments and flow runs for discovered workflows.
This class handles:
- Workflow discovery and registration
- Docker image building through Prefect
- Deployment creation and management
- Flow run submission with volume mounting
- Findings retrieval from completed runs
"""
def __init__(self, workflows_dir: Path = None):
"""
Initialize the Prefect manager.
Args:
workflows_dir: Path to the workflows directory (default: toolbox/workflows)
"""
if workflows_dir is None:
workflows_dir = Path("toolbox/workflows")
self.discovery = WorkflowDiscovery(workflows_dir)
self.workflows: Dict[str, WorkflowInfo] = {}
self.deployments: Dict[str, str] = {} # workflow_name -> deployment_id
# Security: Define allowed and forbidden paths for host mounting
self.allowed_base_paths = [
"/tmp",
"/home",
"/Users", # macOS users
"/opt",
"/var/tmp",
"/workspace", # Common container workspace
"/app" # Container application directory (for test projects)
]
self.forbidden_paths = [
"/etc",
"/root",
"/var/run",
"/sys",
"/proc",
"/dev",
"/boot",
"/var/lib/docker", # Critical Docker data
"/var/log", # System logs
"/usr/bin", # System binaries
"/usr/sbin",
"/sbin",
"/bin"
]
@staticmethod
def _parse_memory_to_bytes(memory_str: str) -> int:
"""
Parse memory string (like '512Mi', '1Gi') to bytes.
Args:
memory_str: Memory string with unit suffix
Returns:
Memory in bytes
Raises:
ValueError: If format is invalid
"""
if not memory_str:
return 0
match = re.match(r'^(\d+(?:\.\d+)?)\s*([GMK]i?)$', memory_str.strip())
if not match:
raise ValueError(f"Invalid memory format: {memory_str}. Expected format like '512Mi', '1Gi'")
value, unit = match.groups()
value = float(value)
# Convert to bytes based on unit (binary units: Ki, Mi, Gi)
if unit in ['K', 'Ki']:
multiplier = 1024
elif unit in ['M', 'Mi']:
multiplier = 1024 * 1024
elif unit in ['G', 'Gi']:
multiplier = 1024 * 1024 * 1024
else:
raise ValueError(f"Unsupported memory unit: {unit}")
return int(value * multiplier)
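For reference, the memory parsing above can be exercised as a standalone function. This is a sketch under the same rules as the removed method (the regex accepts `K`, `M`, `G` with an optional `i`, and every unit is treated as binary); the module-level name `parse_memory_to_bytes` and the `_UNITS` table are illustrative, not taken from the codebase.

```python
import re

# Binary multipliers; 'M' and 'Mi' are treated identically, as in the method above.
_UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_memory_to_bytes(memory_str: str) -> int:
    """Parse strings such as '512Mi' or '1.5G' into a byte count."""
    if not memory_str:
        return 0
    match = re.match(r"^(\d+(?:\.\d+)?)\s*([GMK])i?$", memory_str.strip())
    if not match:
        raise ValueError(
            f"Invalid memory format: {memory_str}. Expected format like '512Mi', '1Gi'"
        )
    value, unit = match.groups()
    return int(float(value) * _UNITS[unit])
```

Note that a suffix such as `MB` is rejected rather than interpreted as a decimal unit, which matches the behaviour of the original regex.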
@staticmethod
def _parse_cpu_to_millicores(cpu_str: str) -> int:
"""
Parse CPU string (like '500m', '1', '2.5') to millicores.
Args:
cpu_str: CPU string
Returns:
CPU in millicores (1 core = 1000 millicores)
Raises:
ValueError: If format is invalid
"""
if not cpu_str:
return 0
cpu_str = cpu_str.strip()
# Handle millicores format (e.g., '500m')
if cpu_str.endswith('m'):
try:
return int(cpu_str[:-1])
except ValueError:
raise ValueError(f"Invalid CPU format: {cpu_str}")
# Handle core format (e.g., '1', '2.5')
try:
cores = float(cpu_str)
return int(cores * 1000) # Convert to millicores
except ValueError:
raise ValueError(f"Invalid CPU format: {cpu_str}")
def _extract_resource_requirements(self, workflow_info: WorkflowInfo) -> Dict[str, str]:
"""
Extract resource requirements from workflow metadata.
Args:
workflow_info: Workflow information with metadata
Returns:
Dictionary with resource requirements in Docker format
"""
metadata = workflow_info.metadata
requirements = metadata.get("requirements", {})
resources = requirements.get("resources", {})
resource_config = {}
# Extract memory requirement
memory = resources.get("memory")
if memory:
try:
# Validate memory format and store original string for Docker
self._parse_memory_to_bytes(memory)
resource_config["memory"] = memory
except ValueError as e:
logger.warning(f"Invalid memory requirement in {workflow_info.name}: {e}")
# Extract CPU requirement
cpu = resources.get("cpu")
if cpu:
try:
# Validate CPU format and store original string for Docker
self._parse_cpu_to_millicores(cpu)
resource_config["cpus"] = cpu
except ValueError as e:
logger.warning(f"Invalid CPU requirement in {workflow_info.name}: {e}")
# Extract timeout
timeout = resources.get("timeout")
if timeout and isinstance(timeout, int):
resource_config["timeout"] = str(timeout)
return resource_config
async def initialize(self):
"""
Initialize the manager by discovering and deploying all workflows.
This method:
1. Discovers all valid workflows in the workflows directory
2. Validates their metadata
3. Deploys each workflow to Prefect with Docker images
"""
try:
# Discover workflows
self.workflows = await self.discovery.discover_workflows()
if not self.workflows:
logger.warning("No workflows discovered")
return
logger.info(f"Discovered {len(self.workflows)} workflows: {list(self.workflows.keys())}")
# Deploy each workflow
for name, info in self.workflows.items():
try:
await self._deploy_workflow(name, info)
except Exception as e:
logger.error(f"Failed to deploy workflow '{name}': {e}")
except Exception as e:
logger.error(f"Failed to initialize Prefect manager: {e}")
raise
async def _deploy_workflow(self, name: str, info: WorkflowInfo):
"""
Deploy a single workflow to Prefect with Docker image.
Args:
name: Workflow name
info: Workflow information including metadata and paths
"""
logger.info(f"Deploying workflow '{name}'...")
# Get the flow function from registry
flow_func = self.discovery.get_flow_function(name)
if not flow_func:
logger.error(
f"Failed to get flow function for '{name}' from registry. "
f"Ensure the workflow is properly registered in toolbox/workflows/registry.py"
)
return
# Use the mandatory Dockerfile with absolute paths for Docker Compose
# Get absolute paths for build context and dockerfile
toolbox_path = info.path.parent.parent.resolve()
dockerfile_abs_path = info.dockerfile.resolve()
# Calculate relative dockerfile path from toolbox context
try:
dockerfile_rel_path = dockerfile_abs_path.relative_to(toolbox_path)
except ValueError:
# If relative path fails, use the workflow-specific path
dockerfile_rel_path = Path("workflows") / name / "Dockerfile"
# Determine deployment strategy based on Dockerfile presence
base_image = "prefecthq/prefect:3-python3.11"
has_custom_dockerfile = info.has_docker and info.dockerfile.exists()
logger.info(f"=== DEPLOYMENT DEBUG for '{name}' ===")
logger.info(f"info.has_docker: {info.has_docker}")
logger.info(f"info.dockerfile: {info.dockerfile}")
logger.info(f"info.dockerfile.exists(): {info.dockerfile.exists()}")
logger.info(f"has_custom_dockerfile: {has_custom_dockerfile}")
logger.info(f"toolbox_path: {toolbox_path}")
logger.info(f"dockerfile_rel_path: {dockerfile_rel_path}")
if has_custom_dockerfile:
logger.info(f"Workflow '{name}' has custom Dockerfile - building custom image")
# Decide whether to use registry or keep images local to host engine
import os
# Default to using the local registry; set FUZZFORGE_USE_REGISTRY=false to bypass (not recommended)
use_registry = os.getenv("FUZZFORGE_USE_REGISTRY", "true").lower() == "true"
if use_registry:
registry_url = get_registry_url(context="push")
image_spec = DockerImage(
name=f"{registry_url}/fuzzforge/{name}",
tag="latest",
dockerfile=str(dockerfile_rel_path),
context=str(toolbox_path)
)
deploy_image = f"{registry_url}/fuzzforge/{name}:latest"
build_custom = True
push_custom = True
logger.info(f"Using registry: {registry_url} for '{name}'")
else:
# Single-host mode: build into host engine cache; no push required
image_spec = DockerImage(
name=f"fuzzforge/{name}",
tag="latest",
dockerfile=str(dockerfile_rel_path),
context=str(toolbox_path)
)
deploy_image = f"fuzzforge/{name}:latest"
build_custom = True
push_custom = False
logger.info("Using single-host image (no registry push): %s", deploy_image)
else:
logger.info(f"Workflow '{name}' using base image - no custom dependencies needed")
deploy_image = base_image
build_custom = False
push_custom = False
# Pre-validate registry connectivity when pushing
if push_custom:
try:
from .setup import validate_registry_connectivity
await validate_registry_connectivity(registry_url)
logger.info(f"Registry connectivity validated for {registry_url}")
except Exception as e:
logger.error(f"Registry connectivity validation failed for {registry_url}: {e}")
raise RuntimeError(f"Cannot deploy workflow '{name}': Registry {registry_url} is not accessible. {e}")
# Deploy the workflow
try:
# Ensure any previous deployment is removed so job variables are updated
try:
async with get_client() as client:
existing = await client.read_deployment_by_name(
f"{name}/{name}-deployment"
)
if existing:
logger.info(f"Removing existing deployment for '{name}' to refresh settings...")
await client.delete_deployment(existing.id)
except Exception:
# If not found or deletion fails, continue with deployment
pass
# Extract resource requirements from metadata
workflow_resource_requirements = self._extract_resource_requirements(info)
logger.info(f"Workflow '{name}' resource requirements: {workflow_resource_requirements}")
# Build job variables with resource requirements
job_variables = {
"image": deploy_image, # Use the worker-accessible registry name
"volumes": [], # Populated at run submission with toolbox mount
"env": {
"PYTHONPATH": "/opt/prefect/toolbox:/opt/prefect",
"WORKFLOW_NAME": name
}
}
# Add resource requirements to job variables if present
if workflow_resource_requirements:
job_variables["resources"] = workflow_resource_requirements
# Prepare deployment parameters
deploy_params = {
"name": f"{name}-deployment",
"work_pool_name": "docker-pool",
"image": image_spec if has_custom_dockerfile else deploy_image,
"push": push_custom,
"build": build_custom,
"job_variables": job_variables
}
deployment = await flow_func.deploy(**deploy_params)
self.deployments[name] = str(deployment.id) if hasattr(deployment, 'id') else name
logger.info(f"Successfully deployed workflow '{name}'")
except Exception as e:
# Enhanced error reporting with more context
import traceback
logger.error(f"Failed to deploy workflow '{name}': {e}")
logger.error(f"Deployment traceback: {traceback.format_exc()}")
# Try to capture Docker-specific context
error_context = {
"workflow_name": name,
"has_dockerfile": has_custom_dockerfile,
"image_name": deploy_image if 'deploy_image' in locals() else "unknown",
"registry_url": registry_url if 'registry_url' in locals() else "unknown",
"error_type": type(e).__name__,
"error_message": str(e)
}
# Check for specific error patterns with detailed categorization
error_msg_lower = str(e).lower()
if "registry" in error_msg_lower and ("no such host" in error_msg_lower or "connection" in error_msg_lower):
error_context["category"] = "registry_connectivity_error"
error_context["solution"] = f"Cannot reach registry at {error_context['registry_url']}. Check Docker network and registry service."
elif "docker" in error_msg_lower:
error_context["category"] = "docker_error"
if "build" in error_msg_lower:
error_context["subcategory"] = "image_build_failed"
error_context["solution"] = "Check Dockerfile syntax and dependencies."
elif "pull" in error_msg_lower:
error_context["subcategory"] = "image_pull_failed"
error_context["solution"] = "Check if image exists in registry and network connectivity."
elif "push" in error_msg_lower:
error_context["subcategory"] = "image_push_failed"
error_context["solution"] = f"Check registry connectivity and push permissions to {error_context['registry_url']}."
elif "registry" in error_msg_lower:
error_context["category"] = "registry_error"
error_context["solution"] = "Check registry configuration and accessibility."
elif "prefect" in error_msg_lower:
error_context["category"] = "prefect_error"
error_context["solution"] = "Check Prefect server connectivity and deployment configuration."
else:
error_context["category"] = "unknown_deployment_error"
error_context["solution"] = "Check logs for more specific error details."
logger.error(f"Deployment error context: {error_context}")
# Raise enhanced exception with context
enhanced_error = Exception(f"Deployment failed for workflow '{name}': {str(e)} | Context: {error_context}")
enhanced_error.original_error = e
enhanced_error.context = error_context
raise enhanced_error
async def submit_workflow(
self,
workflow_name: str,
target_path: str,
volume_mode: str = "ro",
parameters: Dict[str, Any] = None,
resource_limits: Dict[str, str] = None,
additional_volumes: list = None,
timeout: int = None
) -> FlowRun:
"""
Submit a workflow for execution with volume mounting.
Args:
workflow_name: Name of the workflow to execute
target_path: Host path to mount as volume
volume_mode: Volume mount mode ("ro" for read-only, "rw" for read-write)
parameters: Workflow-specific parameters
resource_limits: CPU/memory limits for container
additional_volumes: List of additional volume mounts
timeout: Timeout in seconds
Returns:
FlowRun object with run information
Raises:
ValueError: If workflow not found or volume mode not supported
"""
if workflow_name not in self.workflows:
raise ValueError(f"Unknown workflow: {workflow_name}")
# Validate volume mode
workflow_info = self.workflows[workflow_name]
supported_modes = workflow_info.metadata.get("supported_volume_modes", ["ro", "rw"])
if volume_mode not in supported_modes:
raise ValueError(
f"Workflow '{workflow_name}' doesn't support volume mode '{volume_mode}'. "
f"Supported modes: {supported_modes}"
)
# Validate target path with security checks
self._validate_target_path(target_path)
# Validate additional volumes if provided
if additional_volumes:
for volume in additional_volumes:
self._validate_target_path(volume.host_path)
async with get_client() as client:
# Get the deployment, auto-redeploy once if missing
try:
deployment = await client.read_deployment_by_name(
f"{workflow_name}/{workflow_name}-deployment"
)
except Exception as e:
import traceback
logger.error(f"Failed to find deployment for workflow '{workflow_name}': {e}")
logger.error(f"Deployment lookup traceback: {traceback.format_exc()}")
# Attempt a one-time auto-deploy to recover from startup races
try:
logger.info(f"Auto-deploying missing workflow '{workflow_name}' and retrying...")
await self._deploy_workflow(workflow_name, workflow_info)
deployment = await client.read_deployment_by_name(
f"{workflow_name}/{workflow_name}-deployment"
)
except Exception as redeploy_exc:
# Enhanced error with context
error_context = {
"workflow_name": workflow_name,
"error_type": type(e).__name__,
"error_message": str(e),
"redeploy_error": str(redeploy_exc),
"available_deployments": list(self.deployments.keys()),
}
enhanced_error = ValueError(
f"Deployment not found and redeploy failed for workflow '{workflow_name}': {e} | Context: {error_context}"
)
enhanced_error.context = error_context
raise enhanced_error
# Determine the Docker Compose network name and volume names
# Hardcoded to 'fuzzforge' to avoid directory name dependencies
import os
compose_project = "fuzzforge"
docker_network = "fuzzforge_default"
# Build volume mounts
# Add toolbox volume mount for workflow code access
backend_toolbox_path = "/app/toolbox" # Path in backend container
# Hardcoded volume names
prefect_storage_volume = "fuzzforge_prefect_storage"
toolbox_code_volume = "fuzzforge_toolbox_code"
volumes = [
f"{target_path}:/workspace:{volume_mode}",
f"{prefect_storage_volume}:/prefect-storage", # Shared storage for results
f"{toolbox_code_volume}:/opt/prefect/toolbox:ro" # Mount workflow code
]
# Add additional volumes if provided
if additional_volumes:
for volume in additional_volumes:
volume_spec = f"{volume.host_path}:{volume.container_path}:{volume.mode}"
volumes.append(volume_spec)
# Build environment variables
env_vars = {
"PREFECT_API_URL": "http://prefect-server:4200/api", # Use internal network hostname
"PREFECT_LOGGING_LEVEL": "INFO",
"PREFECT_LOCAL_STORAGE_PATH": "/prefect-storage", # Use shared storage
"PREFECT_RESULTS_PERSIST_BY_DEFAULT": "true", # Enable result persistence
"PREFECT_DEFAULT_RESULT_STORAGE_BLOCK": "local-file-system/fuzzforge-results", # Use our storage block
"WORKSPACE_PATH": "/workspace",
"VOLUME_MODE": volume_mode,
"WORKFLOW_NAME": workflow_name
}
# Add additional volume paths to environment for easy access
if additional_volumes:
for i, volume in enumerate(additional_volumes):
env_vars[f"ADDITIONAL_VOLUME_{i}_PATH"] = volume.container_path
# Determine which image to use based on workflow configuration
workflow_info = self.workflows[workflow_name]
has_custom_dockerfile = workflow_info.has_docker and workflow_info.dockerfile.exists()
# Use pull context for worker to pull from registry
registry_url = get_registry_url(context="pull")
workflow_image = f"{registry_url}/fuzzforge/{workflow_name}:latest" if has_custom_dockerfile else "prefecthq/prefect:3-python3.11"
logger.debug(f"Worker will pull image: {workflow_image} (Registry: {registry_url})")
# Configure job variables with volume mounting and network access
job_variables = {
# Use custom image if available, otherwise base Prefect image
"image": workflow_image,
"volumes": volumes,
"networks": [docker_network], # Connect to Docker Compose network
"env": {
**env_vars,
"PYTHONPATH": "/opt/prefect/toolbox:/opt/prefect/toolbox/workflows",
"WORKFLOW_NAME": workflow_name
}
}
# Apply resource requirements from workflow metadata and user overrides
workflow_resource_requirements = self._extract_resource_requirements(workflow_info)
final_resource_config = {}
# Start with workflow requirements as base
if workflow_resource_requirements:
final_resource_config.update(workflow_resource_requirements)
# Apply user-provided resource limits (overrides workflow defaults)
if resource_limits:
user_resource_config = {}
if resource_limits.get("cpu_limit"):
user_resource_config["cpus"] = resource_limits["cpu_limit"]
if resource_limits.get("memory_limit"):
user_resource_config["memory"] = resource_limits["memory_limit"]
# Note: cpu_request and memory_request are not directly supported by Docker
# but could be used for Kubernetes in the future
# User overrides take precedence
final_resource_config.update(user_resource_config)
# Apply final resource configuration
if final_resource_config:
job_variables["resources"] = final_resource_config
logger.info(f"Applied resource limits: {final_resource_config}")
# Merge parameters with defaults from metadata
default_params = workflow_info.metadata.get("default_parameters", {})
final_params = {**default_params, **(parameters or {})}
# Set flow parameters that match the flow signature
final_params["target_path"] = "/workspace" # Container path where volume is mounted
final_params["volume_mode"] = volume_mode
# Create and submit the flow run
# Pass job_variables to ensure network, volumes, and environment are configured
logger.info(f"Submitting flow with job_variables: {job_variables}")
logger.info(f"Submitting flow with parameters: {final_params}")
# Prepare flow run creation parameters
flow_run_params = {
"deployment_id": deployment.id,
"parameters": final_params,
"job_variables": job_variables
}
# Note: Timeout is handled through workflow-level configuration
# Additional timeout configuration can be added to deployment metadata if needed
flow_run = await client.create_flow_run_from_deployment(**flow_run_params)
logger.info(
f"Submitted workflow '{workflow_name}' with run_id: {flow_run.id}, "
f"target: {target_path}, mode: {volume_mode}"
)
return flow_run
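The resource-override step inside `submit_workflow` above follows a simple precedence rule: workflow metadata supplies the base values and user-supplied limits replace them key by key. A minimal sketch of that rule, assuming the same key names (`cpu_limit`, `memory_limit`, `cpus`, `memory`) and using a hypothetical function name `merge_resource_config`:

```python
from typing import Dict, Optional


def merge_resource_config(
    workflow_requirements: Optional[Dict[str, str]],
    resource_limits: Optional[Dict[str, str]],
) -> Dict[str, str]:
    """Combine workflow defaults with user overrides (user values win)."""
    # Start from the workflow's own requirements.
    final_config = dict(workflow_requirements or {})
    limits = resource_limits or {}
    # Map the user-facing limit names onto the Docker-style keys.
    if limits.get("cpu_limit"):
        final_config["cpus"] = limits["cpu_limit"]
    if limits.get("memory_limit"):
        final_config["memory"] = limits["memory_limit"]
    return final_config
```

Keys that the user does not override, such as `timeout`, pass through unchanged from the workflow metadata.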
async def get_flow_run_findings(self, run_id: str) -> Dict[str, Any]:
"""
Retrieve findings from a completed flow run.
Args:
run_id: The flow run ID
Returns:
Dictionary containing SARIF-formatted findings
Raises:
ValueError: If run not completed or not found
"""
async with get_client() as client:
flow_run = await client.read_flow_run(run_id)
if not flow_run.state.is_completed():
raise ValueError(
f"Flow run {run_id} not completed. Current status: {flow_run.state.name}"
)
# Get the findings from the flow run result
try:
findings = await flow_run.state.result()
return findings
except Exception as e:
logger.error(f"Failed to retrieve findings for run {run_id}: {e}")
raise ValueError(f"Failed to retrieve findings: {e}")
async def get_flow_run_status(self, run_id: str) -> Dict[str, Any]:
"""
Get the current status of a flow run.
Args:
run_id: The flow run ID
Returns:
Dictionary with status information
"""
async with get_client() as client:
flow_run = await client.read_flow_run(run_id)
return {
"run_id": str(flow_run.id),
"workflow": flow_run.deployment_id,
"status": flow_run.state.name,
"is_completed": flow_run.state.is_completed(),
"is_failed": flow_run.state.is_failed(),
"is_running": flow_run.state.is_running(),
"created_at": flow_run.created,
"updated_at": flow_run.updated
}
def _validate_target_path(self, target_path: str) -> None:
"""
Validate target path for security before mounting as volume.
Args:
target_path: Host path to validate
Raises:
ValueError: If path is not allowed for security reasons
"""
target = Path(target_path)
# Path must be absolute
if not target.is_absolute():
raise ValueError(f"Target path must be absolute: {target_path}")
# Resolve path to handle symlinks and relative components
try:
resolved_path = target.resolve()
except (OSError, RuntimeError) as e:
raise ValueError(f"Cannot resolve target path: {target_path} - {e}")
resolved_str = str(resolved_path)
# Check against forbidden paths first (more restrictive)
for forbidden in self.forbidden_paths:
if resolved_str.startswith(forbidden):
raise ValueError(
f"Access denied: Path '{target_path}' resolves to forbidden directory '{forbidden}'. "
f"This path contains sensitive system files and cannot be mounted."
)
# Check if path starts with any allowed base path
path_allowed = False
for allowed in self.allowed_base_paths:
if resolved_str.startswith(allowed):
path_allowed = True
break
if not path_allowed:
allowed_list = ", ".join(self.allowed_base_paths)
raise ValueError(
f"Access denied: Path '{target_path}' is not in allowed directories. "
f"Allowed base paths: {allowed_list}"
)
# Additional security checks
if resolved_str == "/":
raise ValueError("Cannot mount root filesystem")
# Warn if path doesn't exist (but don't block - it might be created later)
if not resolved_path.exists():
logger.warning(f"Target path does not exist: {target_path}")
logger.info(f"Path validation passed for: {target_path} -> {resolved_str}")
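One property of the check above is worth noting: `str.startswith` compares characters, not path components, so a path such as `/home2/user` also matches the allowed base `/home`, and `/etcetera` matches the forbidden base `/etc`. The sketch below shows a component-wise alternative. It is not the removed implementation; `is_within` and `check_resolved_path` are hypothetical names, and the input is assumed to be an already resolved absolute path.

```python
from pathlib import PurePosixPath
from typing import Iterable


def is_within(path: PurePosixPath, base: str) -> bool:
    """Return True if `path` equals `base` or lies beneath it (component-wise)."""
    base_path = PurePosixPath(base)
    return path == base_path or base_path in path.parents


def check_resolved_path(
    resolved: PurePosixPath,
    allowed: Iterable[str],
    forbidden: Iterable[str],
) -> None:
    """Raise ValueError unless `resolved` is under an allowed base and no forbidden one."""
    # Forbidden bases are checked first, as in the method above.
    for base in forbidden:
        if is_within(resolved, base):
            raise ValueError(
                f"Access denied: '{resolved}' is under forbidden directory '{base}'"
            )
    if not any(is_within(resolved, base) for base in allowed):
        raise ValueError(f"Access denied: '{resolved}' is not in allowed directories")
```

With this comparison, `/home/user/project` is accepted for the base `/home` while `/home2/user` is not.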
+10 -367
@@ -1,5 +1,5 @@
"""
Setup utilities for Prefect infrastructure
Setup utilities for FuzzForge infrastructure
"""
# Copyright (c) 2025 FuzzingLabs
@@ -14,364 +14,21 @@ Setup utilities for Prefect infrastructure
# Additional attribution and requirements are provided in the NOTICE file.
import logging
from prefect import get_client
from prefect.client.schemas.actions import WorkPoolCreate
from prefect.client.schemas.objects import WorkPool
from .prefect_manager import get_registry_url
logger = logging.getLogger(__name__)
async def setup_docker_pool():
"""
Create or update the Docker work pool for container execution.
This work pool is configured to:
- Connect to the local Docker daemon
- Support volume mounting at runtime
- Clean up containers after execution
- Use bridge networking by default
"""
import os
async with get_client() as client:
pool_name = "docker-pool"
# Add force recreation flag for debugging fresh install issues
force_recreate = os.getenv('FORCE_RECREATE_WORK_POOL', 'false').lower() == 'true'
debug_setup = os.getenv('DEBUG_WORK_POOL_SETUP', 'false').lower() == 'true'
if force_recreate:
logger.warning(f"FORCE_RECREATE_WORK_POOL=true - Will recreate work pool regardless of existing configuration")
if debug_setup:
logger.warning(f"DEBUG_WORK_POOL_SETUP=true - Enhanced logging enabled")
# Temporarily set logging level to DEBUG for this function
original_level = logger.level
logger.setLevel(logging.DEBUG)
try:
# Check if pool already exists and supports custom images
existing_pools = await client.read_work_pools()
existing_pool = None
for pool in existing_pools:
if pool.name == pool_name:
existing_pool = pool
break
if existing_pool and not force_recreate:
logger.info(f"Found existing work pool '{pool_name}' - validating configuration...")
# Check if the existing pool has the correct configuration
base_template = existing_pool.base_job_template or {}
logger.debug(f"Base template keys: {list(base_template.keys())}")
job_config = base_template.get("job_configuration", {})
logger.debug(f"Job config keys: {list(job_config.keys())}")
image_config = job_config.get("image", "")
has_image_variable = "{{ image }}" in str(image_config)
logger.debug(f"Image config: '{image_config}' -> has_image_variable: {has_image_variable}")
# Check if volume defaults include toolbox mount
variables = base_template.get("variables", {})
properties = variables.get("properties", {})
volume_config = properties.get("volumes", {})
volume_defaults = volume_config.get("default", [])
has_toolbox_volume = any("toolbox_code" in str(vol) for vol in volume_defaults) if volume_defaults else False
logger.debug(f"Volume defaults: {volume_defaults}")
logger.debug(f"Has toolbox volume: {has_toolbox_volume}")
# Check if environment defaults include required settings
env_config = properties.get("env", {})
env_defaults = env_config.get("default", {})
has_api_url = "PREFECT_API_URL" in env_defaults
has_storage_path = "PREFECT_LOCAL_STORAGE_PATH" in env_defaults
has_results_persist = "PREFECT_RESULTS_PERSIST_BY_DEFAULT" in env_defaults
has_required_env = has_api_url and has_storage_path and has_results_persist
logger.debug(f"Environment defaults: {env_defaults}")
logger.debug(f"Has API URL: {has_api_url}, Has storage path: {has_storage_path}, Has results persist: {has_results_persist}")
logger.debug(f"Has required env: {has_required_env}")
# Log the full validation result
logger.info(f"Work pool validation - Image: {has_image_variable}, Toolbox: {has_toolbox_volume}, Environment: {has_required_env}")
if has_image_variable and has_toolbox_volume and has_required_env:
logger.info(f"Docker work pool '{pool_name}' already exists with correct configuration")
return
else:
reasons = []
if not has_image_variable:
reasons.append("missing image template")
if not has_toolbox_volume:
reasons.append("missing toolbox volume mount")
if not has_required_env:
if not has_api_url:
reasons.append("missing PREFECT_API_URL")
if not has_storage_path:
reasons.append("missing PREFECT_LOCAL_STORAGE_PATH")
if not has_results_persist:
reasons.append("missing PREFECT_RESULTS_PERSIST_BY_DEFAULT")
logger.warning(f"Docker work pool '{pool_name}' exists but lacks: {', '.join(reasons)}. Recreating...")
# Delete the old pool and recreate it
try:
await client.delete_work_pool(pool_name)
logger.info(f"Deleted old work pool '{pool_name}'")
except Exception as e:
logger.warning(f"Failed to delete old work pool: {e}")
elif force_recreate and existing_pool:
logger.warning(f"Force recreation enabled - deleting existing work pool '{pool_name}'")
try:
await client.delete_work_pool(pool_name)
logger.info(f"Deleted existing work pool for force recreation")
except Exception as e:
logger.warning(f"Failed to delete work pool for force recreation: {e}")
logger.info(f"Creating Docker work pool '{pool_name}' with custom image support...")
# Create the work pool with proper Docker configuration
work_pool = WorkPoolCreate(
name=pool_name,
type="docker",
description="Docker work pool for FuzzForge workflows with custom image support",
base_job_template={
"job_configuration": {
"image": "{{ image }}", # Template variable for custom images
"volumes": "{{ volumes }}", # List of volume mounts
"env": "{{ env }}", # Environment variables
"networks": "{{ networks }}", # Docker networks
"stream_output": True,
"auto_remove": True,
"privileged": False,
"network_mode": None, # Use networks instead
"labels": {},
"command": None # Let the image's CMD/ENTRYPOINT run
},
"variables": {
"type": "object",
"properties": {
"image": {
"type": "string",
"title": "Docker Image",
"default": "prefecthq/prefect:3-python3.11",
"description": "Docker image for the flow run"
},
"volumes": {
"type": "array",
"title": "Volume Mounts",
"default": [
"fuzzforge_prefect_storage:/prefect-storage",
"fuzzforge_toolbox_code:/opt/prefect/toolbox:ro"
],
"description": "Volume mounts in format 'host:container:mode'",
"items": {
"type": "string"
}
},
"networks": {
"type": "array",
"title": "Docker Networks",
"default": ["fuzzforge_default"],
"description": "Docker networks to connect container to",
"items": {
"type": "string"
}
},
"env": {
"type": "object",
"title": "Environment Variables",
"default": {
"PREFECT_API_URL": "http://prefect-server:4200/api",
"PREFECT_LOCAL_STORAGE_PATH": "/prefect-storage",
"PREFECT_RESULTS_PERSIST_BY_DEFAULT": "true"
},
"description": "Environment variables for the container",
"additionalProperties": {
"type": "string"
}
}
}
}
}
)
await client.create_work_pool(work_pool)
logger.info(f"Created Docker work pool '{pool_name}'")
except Exception as e:
logger.error(f"Failed to setup Docker work pool: {e}")
raise
finally:
# Restore original logging level if debug mode was enabled
if debug_setup and 'original_level' in locals():
logger.setLevel(original_level)
def get_actual_compose_project_name():
"""
Return the hardcoded compose project name for FuzzForge.
Always returns 'fuzzforge' as per system requirements.
"""
logger.info("Using hardcoded compose project name: fuzzforge")
return "fuzzforge"
async def setup_result_storage():
"""
Create or update Prefect result storage block for findings persistence.
Setup result storage (MinIO).
This sets up a LocalFileSystem storage block pointing to the shared
/prefect-storage volume for result persistence.
MinIO is used for both target upload and result storage.
This is a placeholder for any MinIO-specific setup if needed.
"""
from prefect.filesystems import LocalFileSystem
storage_name = "fuzzforge-results"
try:
# Create the storage block, overwrite if it exists
logger.info(f"Setting up storage block '{storage_name}'...")
storage = LocalFileSystem(basepath="/prefect-storage")
block_doc_id = await storage.save(name=storage_name, overwrite=True)
logger.info(f"Storage block '{storage_name}' configured successfully")
return str(block_doc_id)
except Exception as e:
logger.error(f"Failed to setup result storage: {e}")
# Don't raise the exception - continue without storage block
logger.warning("Continuing without result storage block - findings may not persist")
return None
async def validate_docker_connection():
"""
Validate that Docker is accessible and running.
Note: In containerized deployments with Docker socket proxy,
the backend doesn't need direct Docker access.
Raises:
RuntimeError: If Docker is not accessible
"""
import os
# Skip Docker validation if running in container without socket access
if os.path.exists("/.dockerenv") and not os.path.exists("/var/run/docker.sock"):
logger.info("Running in container without Docker socket - skipping Docker validation")
return
try:
import docker
client = docker.from_env()
client.ping()
logger.info("Docker connection validated")
except Exception as e:
logger.error(f"Docker is not accessible: {e}")
raise RuntimeError(
"Docker is not running or not accessible. "
"Please ensure Docker is installed and running."
)
async def validate_registry_connectivity(registry_url: str = None):
"""
Validate that the Docker registry is accessible.
Args:
registry_url: URL of the Docker registry to validate (auto-detected if None)
Raises:
RuntimeError: If registry is not accessible
"""
import os
# Resolve a reachable test URL from within this process
if registry_url is None:
# If not specified, prefer internal service name in containers, host port on host
if os.path.exists('/.dockerenv'):
registry_url = "registry:5000"
else:
registry_url = "localhost:5001"
# If we're running inside a container and asked to probe localhost:PORT,
# the probe would hit the container, not the host. Use host.docker.internal instead.
try:
host_part, port_part = registry_url.split(":", 1)
except ValueError:
host_part, port_part = registry_url, "80"
if os.path.exists('/.dockerenv') and host_part in ("localhost", "127.0.0.1"):
test_host = "host.docker.internal"
else:
test_host = host_part
test_url = f"http://{test_host}:{port_part}/v2/"
import aiohttp
import asyncio
logger.info(f"Validating registry connectivity to {registry_url}...")
try:
async with aiohttp.ClientSession(timeout=aiohttp.ClientTimeout(total=10)) as session:
async with session.get(test_url) as response:
if response.status == 200:
logger.info(f"Registry at {registry_url} is accessible (tested via {test_host})")
return
else:
raise RuntimeError(f"Registry returned status {response.status}")
except asyncio.TimeoutError:
raise RuntimeError(f"Registry at {registry_url} is not responding (timeout)")
except aiohttp.ClientError as e:
raise RuntimeError(f"Registry at {registry_url} is not accessible: {e}")
except Exception as e:
raise RuntimeError(f"Failed to validate registry connectivity: {e}")
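The host-resolution step of `validate_registry_connectivity` can be isolated into a pure function, which makes it testable without Docker or a network. The following is a minimal sketch, not the project's code: the helper name `resolve_probe_url` and the `in_container` flag (standing in for the `/.dockerenv` check) are assumptions made for illustration.

```python
def resolve_probe_url(registry_url: str, in_container: bool) -> str:
    """Build the /v2/ probe URL for a 'host[:port]' registry address.

    Hypothetical helper; `in_container` replaces the
    os.path.exists('/.dockerenv') check used in the function above.
    """
    # partition() never raises, so a missing port falls back to "80"
    host, _, port = registry_url.partition(":")
    port = port or "80"
    # Inside a container, localhost refers to the container itself,
    # so the probe is redirected to the Docker host.
    if in_container and host in ("localhost", "127.0.0.1"):
        host = "host.docker.internal"
    return f"http://{host}:{port}/v2/"
```

Like the original, this sketch expects a bare `host[:port]` address; a value that already carries a scheme such as `http://` would be split incorrectly.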
async def validate_docker_network(network_name: str):
"""
Validate that the specified Docker network exists.
Args:
network_name: Name of the Docker network to validate
Raises:
RuntimeError: If network doesn't exist
"""
import os
# Skip network validation if running in container without Docker socket
if os.path.exists("/.dockerenv") and not os.path.exists("/var/run/docker.sock"):
logger.info("Running in container without Docker socket - skipping network validation")
return
try:
import docker
client = docker.from_env()
# List all networks
networks = client.networks.list(names=[network_name])
if not networks:
# Try to find networks with similar names
all_networks = client.networks.list()
similar_networks = [n.name for n in all_networks if "fuzzforge" in n.name.lower()]
error_msg = f"Docker network '{network_name}' not found."
if similar_networks:
error_msg += f" Available networks: {similar_networks}"
else:
error_msg += " Please ensure Docker Compose is running."
raise RuntimeError(error_msg)
logger.info(f"Docker network '{network_name}' validated")
except Exception as e:
if isinstance(e, RuntimeError):
raise
logger.error(f"Network validation failed: {e}")
raise RuntimeError(f"Failed to validate Docker network: {e}")
logger.info("Result storage (MinIO) configured")
# MinIO is configured via environment variables in docker-compose
# No additional setup needed here
return True
async def validate_infrastructure():
@@ -382,21 +39,7 @@ async def validate_infrastructure():
"""
logger.info("Validating infrastructure...")
# Validate Docker connection
await validate_docker_connection()
# Validate registry connectivity for custom image building
await validate_registry_connectivity()
# Validate network (hardcoded to avoid directory name dependencies)
import os
compose_project = "fuzzforge"
docker_network = "fuzzforge_default"
try:
await validate_docker_network(docker_network)
except RuntimeError as e:
logger.warning(f"Network validation failed: {e}")
logger.warning("Workflows may not be able to connect to Prefect services")
# Setup storage (MinIO)
await setup_result_storage()
logger.info("Infrastructure validation completed")
-459
@@ -1,459 +0,0 @@
"""
Workflow Discovery - Registry-based discovery and loading of workflows
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import logging
import yaml
from pathlib import Path
from typing import Dict, Optional, Any, Callable
from pydantic import BaseModel, Field, ConfigDict
logger = logging.getLogger(__name__)
class WorkflowInfo(BaseModel):
"""Information about a discovered workflow"""
name: str = Field(..., description="Workflow name")
path: Path = Field(..., description="Path to workflow directory")
workflow_file: Path = Field(..., description="Path to workflow.py file")
dockerfile: Path = Field(..., description="Path to Dockerfile")
has_docker: bool = Field(..., description="Whether workflow has custom Dockerfile")
metadata: Dict[str, Any] = Field(..., description="Workflow metadata from YAML")
flow_function_name: str = Field(default="main_flow", description="Name of the flow function")
model_config = ConfigDict(arbitrary_types_allowed=True)
class WorkflowDiscovery:
"""
Discovers workflows from the filesystem and validates them against the registry.
This system:
1. Scans for workflows with metadata.yaml files
2. Cross-references them with the manual registry
3. Provides registry-based flow functions for deployment
Workflows must have:
- workflow.py: Contains the Prefect flow
- metadata.yaml: Mandatory metadata file
- Entry in toolbox/workflows/registry.py: Manual registration
- Dockerfile: Mandatory container definition (workflows without one are skipped)
- requirements.txt (optional): Python dependencies
"""
def __init__(self, workflows_dir: Path):
"""
Initialize workflow discovery.
Args:
workflows_dir: Path to the workflows directory
"""
self.workflows_dir = workflows_dir
if not self.workflows_dir.exists():
self.workflows_dir.mkdir(parents=True, exist_ok=True)
logger.info(f"Created workflows directory: {self.workflows_dir}")
# Import registry - this validates it on import
try:
from toolbox.workflows.registry import WORKFLOW_REGISTRY, list_registered_workflows
self.registry = WORKFLOW_REGISTRY
logger.info(f"Loaded workflow registry with {len(self.registry)} registered workflows")
except ImportError as e:
logger.error(f"Failed to import workflow registry: {e}")
self.registry = {}
except Exception as e:
logger.error(f"Registry validation failed: {e}")
self.registry = {}
# Cache for discovered workflows
self._workflow_cache: Optional[Dict[str, WorkflowInfo]] = None
self._cache_timestamp: Optional[float] = None
self._cache_ttl = 60.0 # Cache TTL in seconds
async def discover_workflows(self) -> Dict[str, WorkflowInfo]:
"""
Discover workflows by cross-referencing filesystem with registry.
Uses caching to avoid frequent filesystem scans.
Returns:
Dictionary mapping workflow names to their information
"""
# Check cache validity
import time
current_time = time.time()
if (self._workflow_cache is not None and
self._cache_timestamp is not None and
(current_time - self._cache_timestamp) < self._cache_ttl):
# Return cached results
logger.debug(f"Returning cached workflow discovery ({len(self._workflow_cache)} workflows)")
return self._workflow_cache
workflows = {}
discovered_dirs = set()
registry_names = set(self.registry.keys())
if not self.workflows_dir.exists():
logger.warning(f"Workflows directory does not exist: {self.workflows_dir}")
return workflows
# Recursively scan all directories and subdirectories
await self._scan_directory_recursive(self.workflows_dir, workflows, discovered_dirs)
# Check for registry entries without corresponding directories
missing_dirs = registry_names - discovered_dirs
if missing_dirs:
logger.warning(
f"Registry contains workflows without filesystem directories: {missing_dirs}. "
f"These workflows cannot be deployed."
)
logger.info(
f"Discovery complete: {len(workflows)} workflows ready for deployment, "
f"{len(missing_dirs)} registry entries missing directories, "
f"{len(discovered_dirs - registry_names)} filesystem workflows not registered"
)
# Update cache
self._workflow_cache = workflows
self._cache_timestamp = current_time
return workflows
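The caching in `discover_workflows` follows a common time-to-live pattern: reuse the last result while it is younger than the TTL, otherwise recompute and restamp. A minimal standalone sketch of that pattern is shown here. The class name `TTLCache` and the injectable `clock` are hypothetical, and `time.monotonic` is swapped in for the `time.time()` call used above so that wall-clock adjustments cannot extend or shorten the TTL.

```python
import time
from typing import Callable, Dict, Optional


class TTLCache:
    """Caches one computed value for `ttl` seconds (illustrative only)."""

    def __init__(self, ttl: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl
        self._clock = clock  # injectable so tests can control time
        self._value: Optional[Dict[str, str]] = None
        self._stamp: Optional[float] = None

    def get(self, compute: Callable[[], Dict[str, str]]) -> Dict[str, str]:
        now = self._clock()
        fresh = self._stamp is not None and (now - self._stamp) < self._ttl
        if self._value is not None and fresh:
            return self._value  # still within the TTL window
        # Expired or never filled: recompute and restamp
        self._value = compute()
        self._stamp = now
        return self._value

    def invalidate(self) -> None:
        """Drop the cached value so the next get() recomputes."""
        self._value = None
        self._stamp = None
```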
async def _scan_directory_recursive(self, directory: Path, workflows: Dict[str, WorkflowInfo], discovered_dirs: set):
"""
Recursively scan directory for workflows.
Args:
directory: Directory to scan
workflows: Dictionary to populate with discovered workflows
discovered_dirs: Set to track discovered workflow names
"""
for item in directory.iterdir():
if not item.is_dir():
continue
if item.name.startswith(('_', '.')):
continue # Skip hidden or private directories
# Check if this directory contains workflow files (workflow.py and metadata.yaml)
workflow_file = item / "workflow.py"
metadata_file = item / "metadata.yaml"
if workflow_file.exists() and metadata_file.exists():
# This is a workflow directory
workflow_name = item.name
discovered_dirs.add(workflow_name)
# Only process workflows that are in the registry
if workflow_name not in self.registry:
logger.warning(
f"Workflow '{workflow_name}' found in filesystem but not in registry. "
f"Add it to toolbox/workflows/registry.py to enable deployment."
)
continue
try:
workflow_info = await self._load_workflow(item)
if workflow_info:
workflows[workflow_info.name] = workflow_info
logger.info(f"Discovered and registered workflow: {workflow_info.name}")
except Exception as e:
logger.error(f"Failed to load workflow from {item}: {e}")
else:
# This is a category directory, recurse into it
await self._scan_directory_recursive(item, workflows, discovered_dirs)
async def _load_workflow(self, workflow_dir: Path) -> Optional[WorkflowInfo]:
"""
Load and validate a single workflow.
Args:
workflow_dir: Path to the workflow directory
Returns:
WorkflowInfo if valid, None otherwise
"""
workflow_name = workflow_dir.name
# Check for mandatory files
workflow_file = workflow_dir / "workflow.py"
metadata_file = workflow_dir / "metadata.yaml"
if not workflow_file.exists():
logger.warning(f"Workflow {workflow_name} missing workflow.py")
return None
if not metadata_file.exists():
logger.error(f"Workflow {workflow_name} missing mandatory metadata.yaml")
return None
# Load and validate metadata
try:
metadata = self._load_metadata(metadata_file)
if not self._validate_metadata(metadata, workflow_name):
return None
except Exception as e:
logger.error(f"Failed to load metadata for {workflow_name}: {e}")
return None
# Check for mandatory Dockerfile
dockerfile = workflow_dir / "Dockerfile"
if not dockerfile.exists():
logger.error(f"Workflow {workflow_name} missing mandatory Dockerfile")
return None
has_docker = True # Always True since Dockerfile is mandatory
# Get flow function name from metadata or use default
flow_function_name = metadata.get("flow_function", "main_flow")
return WorkflowInfo(
name=workflow_name,
path=workflow_dir,
workflow_file=workflow_file,
dockerfile=dockerfile,
has_docker=has_docker,
metadata=metadata,
flow_function_name=flow_function_name
)
def _load_metadata(self, metadata_file: Path) -> Dict[str, Any]:
"""
Load metadata from YAML file.
Args:
metadata_file: Path to metadata.yaml
Returns:
Dictionary containing metadata
"""
with open(metadata_file, 'r') as f:
metadata = yaml.safe_load(f)
if metadata is None:
raise ValueError("Empty metadata file")
return metadata
def _validate_metadata(self, metadata: Dict[str, Any], workflow_name: str) -> bool:
"""
Validate that metadata contains all required fields.
Args:
metadata: Metadata dictionary
workflow_name: Name of the workflow for logging
Returns:
True if valid, False otherwise
"""
required_fields = ["name", "version", "description", "author", "category", "parameters", "requirements"]
missing_fields = [field for field in required_fields if field not in metadata]
if missing_fields:
logger.error(
f"Workflow {workflow_name} metadata missing required fields: {missing_fields}"
)
return False
# Validate version format (semantic versioning)
version = metadata.get("version", "")
if not self._is_valid_version(version):
logger.error(f"Workflow {workflow_name} has invalid version format: {version}")
return False
# Validate parameters structure
parameters = metadata.get("parameters", {})
if not isinstance(parameters, dict):
logger.error(f"Workflow {workflow_name} parameters must be a dictionary")
return False
return True
def _is_valid_version(self, version: str) -> bool:
"""
Check if version follows semantic versioning (x.y.z).
Args:
version: Version string
Returns:
True if valid semantic version
"""
try:
parts = version.split('.')
if len(parts) != 3:
return False
for part in parts:
int(part) # Check if each part is a number
return True
except (ValueError, AttributeError):
return False
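Note that `int(part)` accepts more than plain digits: a leading sign, surrounding whitespace and underscore separators all parse, so a string such as `-1.2.3` passes the check above even though the schema pattern `^\d+\.\d+\.\d+$` would reject it. A sketch of a check aligned with that pattern follows; the function name `is_strict_version` is hypothetical.

```python
import re

# ASCII digits only, mirroring the x.y.z pattern in the metadata schema
_VERSION_RE = re.compile(r"\d+\.\d+\.\d+", re.ASCII)


def is_strict_version(version: object) -> bool:
    """Return True only for strings of the exact form 'x.y.z'."""
    # fullmatch() anchors both ends, so a trailing newline is rejected too
    return isinstance(version, str) and _VERSION_RE.fullmatch(version) is not None
```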
def invalidate_cache(self) -> None:
"""
Invalidate the workflow discovery cache.
Useful when workflows are added or modified.
"""
self._workflow_cache = None
self._cache_timestamp = None
logger.debug("Workflow discovery cache invalidated")
def get_flow_function(self, workflow_name: str) -> Optional[Callable]:
"""
Get the flow function from the registry.
Args:
workflow_name: Name of the workflow
Returns:
The flow function if found in registry, None otherwise
"""
if workflow_name not in self.registry:
logger.error(
f"Workflow '{workflow_name}' not found in registry. "
f"Available workflows: {list(self.registry.keys())}"
)
return None
try:
from toolbox.workflows.registry import get_workflow_flow
flow_func = get_workflow_flow(workflow_name)
logger.debug(f"Retrieved flow function for '{workflow_name}' from registry")
return flow_func
except Exception as e:
logger.error(f"Failed to get flow function for '{workflow_name}': {e}")
return None
def get_registry_info(self, workflow_name: str) -> Optional[Dict[str, Any]]:
"""
Get registry information for a workflow.
Args:
workflow_name: Name of the workflow
Returns:
Registry information if found, None otherwise
"""
if workflow_name not in self.registry:
return None
try:
from toolbox.workflows.registry import get_workflow_info
return get_workflow_info(workflow_name)
except Exception as e:
logger.error(f"Failed to get registry info for '{workflow_name}': {e}")
return None
@staticmethod
def get_metadata_schema() -> Dict[str, Any]:
"""
Get the JSON schema for workflow metadata.
Returns:
JSON schema dictionary
"""
return {
"type": "object",
"required": ["name", "version", "description", "author", "category", "parameters", "requirements"],
"properties": {
"name": {
"type": "string",
"description": "Workflow name"
},
"version": {
"type": "string",
"pattern": "^\\d+\\.\\d+\\.\\d+$",
"description": "Semantic version (x.y.z)"
},
"description": {
"type": "string",
"description": "Workflow description"
},
"author": {
"type": "string",
"description": "Workflow author"
},
"category": {
"type": "string",
"enum": ["comprehensive", "specialized", "fuzzing", "focused"],
"description": "Workflow category"
},
"tags": {
"type": "array",
"items": {"type": "string"},
"description": "Workflow tags for categorization"
},
"requirements": {
"type": "object",
"required": ["tools", "resources"],
"properties": {
"tools": {
"type": "array",
"items": {"type": "string"},
"description": "Required security tools"
},
"resources": {
"type": "object",
"required": ["memory", "cpu", "timeout"],
"properties": {
"memory": {
"type": "string",
"pattern": "^\\d+[GMK]i$",
"description": "Memory limit (e.g., 1Gi, 512Mi)"
},
"cpu": {
"type": "string",
"pattern": "^\\d+m?$",
"description": "CPU limit (e.g., 1000m, 2)"
},
"timeout": {
"type": "integer",
"minimum": 60,
"maximum": 7200,
"description": "Workflow timeout in seconds"
}
}
}
}
},
"parameters": {
"type": "object",
"description": "Workflow parameters schema"
},
"default_parameters": {
"type": "object",
"description": "Default parameter values"
},
"required_modules": {
"type": "array",
"items": {"type": "string"},
"description": "Required module names"
},
"supported_volume_modes": {
"type": "array",
"items": {"enum": ["ro", "rw"]},
"default": ["ro", "rw"],
"description": "Supported volume mount modes"
},
"flow_function": {
"type": "string",
"default": "main_flow",
"description": "Name of the flow function in workflow.py"
}
}
}
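The `memory` and `cpu` patterns in the schema above only constrain the string shape. As a rough sketch of how such values could be turned into numbers, the helpers below parse them with the same patterns. Both function names are hypothetical, and reading the suffixes as binary units (`Ki`, `Mi`, `Gi`) and millicores (`m`) is an assumption based on the examples in the schema descriptions, not something the source states.

```python
import re

_MEMORY_RE = re.compile(r"(\d+)([GMK])i", re.ASCII)
_CPU_RE = re.compile(r"(\d+)(m?)", re.ASCII)
# Assumed binary multipliers for the Ki / Mi / Gi suffixes
_UNIT_BYTES = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def memory_to_bytes(value: str) -> int:
    """Convert a string such as '512Mi' to a byte count."""
    match = _MEMORY_RE.fullmatch(value)
    if match is None:
        raise ValueError(f"invalid memory limit: {value!r}")
    amount, unit = match.groups()
    return int(amount) * _UNIT_BYTES[unit]


def cpu_to_millicores(value: str) -> int:
    """Convert '1000m' or '2' to millicores (whole cores count as 1000)."""
    match = _CPU_RE.fullmatch(value)
    if match is None:
        raise ValueError(f"invalid cpu limit: {value!r}")
    amount, suffix = match.groups()
    return int(amount) if suffix else int(amount) * 1000
```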
+171 -310
@@ -12,7 +12,6 @@
import asyncio
import logging
import os
from uuid import UUID
from contextlib import AsyncExitStack, asynccontextmanager, suppress
from typing import Any, Dict, Optional, List
@@ -23,31 +22,20 @@ from starlette.routing import Mount
from fastmcp.server.http import create_sse_app
from src.core.prefect_manager import PrefectManager
from src.core.setup import setup_docker_pool, setup_result_storage, validate_infrastructure
from src.core.workflow_discovery import WorkflowDiscovery
from src.temporal.manager import TemporalManager
from src.core.setup import setup_result_storage, validate_infrastructure
from src.api import workflows, runs, fuzzing
from src.services.prefect_stats_monitor import prefect_stats_monitor
from fastmcp import FastMCP
from prefect.client.orchestration import get_client
from prefect.client.schemas.filters import (
FlowRunFilter,
FlowRunFilterDeploymentId,
FlowRunFilterState,
FlowRunFilterStateType,
)
from prefect.client.schemas.sorting import FlowRunSort
from prefect.states import StateType
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
prefect_mgr = PrefectManager()
temporal_mgr = TemporalManager()
class PrefectBootstrapState:
"""Tracks Prefect initialization progress for API and MCP consumers."""
class TemporalBootstrapState:
"""Tracks Temporal initialization progress for API and MCP consumers."""
def __init__(self) -> None:
self.ready: bool = False
@@ -64,19 +52,19 @@ class PrefectBootstrapState:
}
prefect_bootstrap_state = PrefectBootstrapState()
temporal_bootstrap_state = TemporalBootstrapState()
# Configure retry strategy for bootstrapping Prefect + infrastructure
# Configure retry strategy for bootstrapping Temporal + infrastructure
STARTUP_RETRY_SECONDS = max(1, int(os.getenv("FUZZFORGE_STARTUP_RETRY_SECONDS", "5")))
STARTUP_RETRY_MAX_SECONDS = max(
STARTUP_RETRY_SECONDS,
int(os.getenv("FUZZFORGE_STARTUP_RETRY_MAX_SECONDS", "60")),
)
prefect_bootstrap_task: Optional[asyncio.Task] = None
temporal_bootstrap_task: Optional[asyncio.Task] = None
# ---------------------------------------------------------------------------
# FastAPI application (REST API remains unchanged)
# FastAPI application (REST API)
# ---------------------------------------------------------------------------
app = FastAPI(
@@ -90,20 +78,19 @@ app.include_router(runs.router)
app.include_router(fuzzing.router)
def get_prefect_status() -> Dict[str, Any]:
"""Return a snapshot of Prefect bootstrap state for diagnostics."""
status = prefect_bootstrap_state.as_dict()
status["workflows_loaded"] = len(prefect_mgr.workflows)
status["deployments_tracked"] = len(prefect_mgr.deployments)
def get_temporal_status() -> Dict[str, Any]:
"""Return a snapshot of Temporal bootstrap state for diagnostics."""
status = temporal_bootstrap_state.as_dict()
status["workflows_loaded"] = len(temporal_mgr.workflows)
status["bootstrap_task_running"] = (
prefect_bootstrap_task is not None and not prefect_bootstrap_task.done()
temporal_bootstrap_task is not None and not temporal_bootstrap_task.done()
)
return status
def _prefect_not_ready_status() -> Optional[Dict[str, Any]]:
"""Return status details if Prefect is not ready yet."""
status = get_prefect_status()
def _temporal_not_ready_status() -> Optional[Dict[str, Any]]:
"""Return status details if Temporal is not ready yet."""
status = get_temporal_status()
if status.get("ready"):
return None
return status
@@ -111,19 +98,19 @@ def _prefect_not_ready_status() -> Optional[Dict[str, Any]]:
@app.get("/")
async def root() -> Dict[str, Any]:
status = get_prefect_status()
status = get_temporal_status()
return {
"name": "FuzzForge API",
"version": "0.6.0",
"status": "ready" if status.get("ready") else "initializing",
"workflows_loaded": status.get("workflows_loaded", 0),
"prefect": status,
"temporal": status,
}
@app.get("/health")
async def health() -> Dict[str, str]:
status = get_prefect_status()
status = get_temporal_status()
health_status = "healthy" if status.get("ready") else "initializing"
return {"status": health_status}
@@ -165,65 +152,61 @@ _fastapi_mcp_imported = False
mcp = FastMCP(name="FuzzForge MCP")
async def _bootstrap_prefect_with_retries() -> None:
"""Initialize Prefect infrastructure with exponential backoff retries."""
async def _bootstrap_temporal_with_retries() -> None:
"""Initialize Temporal infrastructure with exponential backoff retries."""
attempt = 0
while True:
attempt += 1
prefect_bootstrap_state.task_running = True
prefect_bootstrap_state.status = "starting"
prefect_bootstrap_state.ready = False
prefect_bootstrap_state.last_error = None
temporal_bootstrap_state.task_running = True
temporal_bootstrap_state.status = "starting"
temporal_bootstrap_state.ready = False
temporal_bootstrap_state.last_error = None
try:
logger.info("Bootstrapping Prefect infrastructure...")
logger.info("Bootstrapping Temporal infrastructure...")
await validate_infrastructure()
await setup_docker_pool()
await setup_result_storage()
await prefect_mgr.initialize()
await prefect_stats_monitor.start_monitoring()
await temporal_mgr.initialize()
prefect_bootstrap_state.ready = True
prefect_bootstrap_state.status = "ready"
prefect_bootstrap_state.task_running = False
logger.info("Prefect infrastructure ready")
temporal_bootstrap_state.ready = True
temporal_bootstrap_state.status = "ready"
temporal_bootstrap_state.task_running = False
logger.info("Temporal infrastructure ready")
return
except asyncio.CancelledError:
prefect_bootstrap_state.status = "cancelled"
prefect_bootstrap_state.task_running = False
logger.info("Prefect bootstrap task cancelled")
temporal_bootstrap_state.status = "cancelled"
temporal_bootstrap_state.task_running = False
logger.info("Temporal bootstrap task cancelled")
raise
except Exception as exc: # pragma: no cover - defensive logging on infra startup
logger.exception("Prefect bootstrap failed")
prefect_bootstrap_state.ready = False
prefect_bootstrap_state.status = "error"
prefect_bootstrap_state.last_error = str(exc)
logger.exception("Temporal bootstrap failed")
temporal_bootstrap_state.ready = False
temporal_bootstrap_state.status = "error"
temporal_bootstrap_state.last_error = str(exc)
# Ensure partial initialization does not leave stale state behind
prefect_mgr.workflows.clear()
prefect_mgr.deployments.clear()
await prefect_stats_monitor.stop_monitoring()
temporal_mgr.workflows.clear()
wait_time = min(
STARTUP_RETRY_SECONDS * (2 ** (attempt - 1)),
STARTUP_RETRY_MAX_SECONDS,
)
logger.info("Retrying Prefect bootstrap in %s second(s)", wait_time)
logger.info("Retrying Temporal bootstrap in %s second(s)", wait_time)
try:
await asyncio.sleep(wait_time)
except asyncio.CancelledError:
prefect_bootstrap_state.status = "cancelled"
prefect_bootstrap_state.task_running = False
temporal_bootstrap_state.status = "cancelled"
temporal_bootstrap_state.task_running = False
raise
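The retry delay in the bootstrap loop doubles on each attempt and is capped at `STARTUP_RETRY_MAX_SECONDS`. The sketch below reproduces only that arithmetic so the resulting schedule can be inspected; the function name `backoff_schedule` is hypothetical, and the values 5 and 60 are the defaults read from the environment above.

```python
from typing import List


def backoff_schedule(base: int, cap: int, attempts: int) -> List[int]:
    """Delays (in seconds) before each retry, using the same formula as the loop."""
    return [
        min(base * (2 ** (attempt - 1)), cap)
        for attempt in range(1, attempts + 1)
    ]


# With the default settings the delay reaches the cap on the fifth attempt
print(backoff_schedule(base=5, cap=60, attempts=6))  # [5, 10, 20, 40, 60, 60]
```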
def _lookup_workflow(workflow_name: str):
info = prefect_mgr.workflows.get(workflow_name)
info = temporal_mgr.workflows.get(workflow_name)
if not info:
return None
metadata = info.metadata
@@ -248,24 +231,23 @@ def _lookup_workflow(workflow_name: str):
"required_modules": metadata.get("required_modules", []),
"supported_volume_modes": supported_modes,
"default_target_path": default_target_path,
"default_volume_mode": default_volume_mode,
"has_custom_docker": bool(info.has_docker),
"default_volume_mode": default_volume_mode
}
@mcp.tool
async def list_workflows_mcp() -> Dict[str, Any]:
"""List all discovered workflows and their metadata summary."""
not_ready = _prefect_not_ready_status()
not_ready = _temporal_not_ready_status()
if not_ready:
return {
"workflows": [],
"prefect": not_ready,
"message": "Prefect infrastructure is still initializing",
"temporal": not_ready,
"message": "Temporal infrastructure is still initializing",
}
workflows_summary = []
for name, info in prefect_mgr.workflows.items():
for name, info in temporal_mgr.workflows.items():
metadata = info.metadata
defaults = metadata.get("default_parameters", {})
workflows_summary.append({
@@ -279,20 +261,19 @@ async def list_workflows_mcp() -> Dict[str, Any]:
or defaults.get("volume_mode")
or "ro",
"default_target_path": metadata.get("default_target_path")
or defaults.get("target_path"),
"has_custom_docker": bool(info.has_docker),
or defaults.get("target_path")
})
return {"workflows": workflows_summary, "prefect": get_prefect_status()}
return {"workflows": workflows_summary, "temporal": get_temporal_status()}
@mcp.tool
async def get_workflow_metadata_mcp(workflow_name: str) -> Dict[str, Any]:
"""Fetch detailed metadata for a workflow."""
not_ready = _prefect_not_ready_status()
not_ready = _temporal_not_ready_status()
if not_ready:
return {
"error": "Prefect infrastructure not ready",
"prefect": not_ready,
"error": "Temporal infrastructure not ready",
"temporal": not_ready,
}
data = _lookup_workflow(workflow_name)
@@ -304,11 +285,11 @@ async def get_workflow_metadata_mcp(workflow_name: str) -> Dict[str, Any]:
@mcp.tool
async def get_workflow_parameters_mcp(workflow_name: str) -> Dict[str, Any]:
"""Return the parameter schema and defaults for a workflow."""
not_ready = _prefect_not_ready_status()
not_ready = _temporal_not_ready_status()
if not_ready:
return {
"error": "Prefect infrastructure not ready",
"prefect": not_ready,
"error": "Temporal infrastructure not ready",
"temporal": not_ready,
}
data = _lookup_workflow(workflow_name)
@@ -323,72 +304,41 @@ async def get_workflow_parameters_mcp(workflow_name: str) -> Dict[str, Any]:
@mcp.tool
async def get_workflow_metadata_schema_mcp() -> Dict[str, Any]:
"""Return the JSON schema describing workflow metadata files."""
from src.temporal.discovery import WorkflowDiscovery
return WorkflowDiscovery.get_metadata_schema()
@mcp.tool
async def submit_security_scan_mcp(
workflow_name: str,
target_path: str | None = None,
volume_mode: str | None = None,
target_id: str,
parameters: Dict[str, Any] | None = None,
) -> Dict[str, Any] | Dict[str, str]:
"""Submit a Prefect workflow via MCP."""
"""Submit a Temporal workflow via MCP."""
try:
not_ready = _prefect_not_ready_status()
not_ready = _temporal_not_ready_status()
if not_ready:
return {
"error": "Prefect infrastructure not ready",
"prefect": not_ready,
"error": "Temporal infrastructure not ready",
"temporal": not_ready,
}
workflow_info = prefect_mgr.workflows.get(workflow_name)
workflow_info = temporal_mgr.workflows.get(workflow_name)
if not workflow_info:
return {"error": f"Workflow '{workflow_name}' not found"}
metadata = workflow_info.metadata or {}
defaults = metadata.get("default_parameters", {})
resolved_target_path = target_path or metadata.get("default_target_path") or defaults.get("target_path")
if not resolved_target_path:
return {
"error": (
"target_path is required and no default_target_path is defined in metadata"
),
"metadata": {
"workflow": workflow_name,
"default_target_path": metadata.get("default_target_path"),
},
}
requested_volume_mode = volume_mode or metadata.get("default_volume_mode") or defaults.get("volume_mode")
if not requested_volume_mode:
requested_volume_mode = "ro"
normalised_volume_mode = (
str(requested_volume_mode).strip().lower().replace("-", "_")
)
if normalised_volume_mode in {"read_only", "readonly", "ro"}:
normalised_volume_mode = "ro"
elif normalised_volume_mode in {"read_write", "readwrite", "rw"}:
normalised_volume_mode = "rw"
else:
supported_modes = metadata.get("supported_volume_modes", ["ro", "rw"])
if isinstance(supported_modes, list) and normalised_volume_mode in supported_modes:
pass
else:
normalised_volume_mode = "ro"
parameters = parameters or {}
cleaned_parameters: Dict[str, Any] = {**defaults, **parameters}
# Ensure *_config structures default to dicts so Prefect validation passes.
# Ensure *_config structures default to dicts
for key, value in list(cleaned_parameters.items()):
if isinstance(key, str) and key.endswith("_config") and value is None:
cleaned_parameters[key] = {}
# Some workflows expect configuration dictionaries even when omitted.
# Some workflows expect configuration dictionaries even when omitted
parameter_definitions = (
metadata.get("parameters", {}).get("properties", {})
if isinstance(metadata.get("parameters"), dict)
@@ -403,20 +353,19 @@ async def submit_security_scan_mcp(
elif cleaned_parameters[key] is None:
cleaned_parameters[key] = {}
flow_run = await prefect_mgr.submit_workflow(
# Start workflow
handle = await temporal_mgr.run_workflow(
workflow_name=workflow_name,
target_path=resolved_target_path,
volume_mode=normalised_volume_mode,
parameters=cleaned_parameters,
target_id=target_id,
workflow_params=cleaned_parameters,
)
return {
"run_id": str(flow_run.id),
"status": flow_run.state.name if flow_run.state else "PENDING",
"run_id": handle.id,
"status": "RUNNING",
"workflow": workflow_name,
"message": f"Workflow '{workflow_name}' submitted successfully",
"target_path": resolved_target_path,
"volume_mode": normalised_volume_mode,
"target_id": target_id,
"parameters": cleaned_parameters,
"mcp_enabled": True,
}
@@ -427,43 +376,38 @@ async def submit_security_scan_mcp(
@mcp.tool
async def get_comprehensive_scan_summary(run_id: str) -> Dict[str, Any] | Dict[str, str]:
"""Return a summary for the given flow run via MCP."""
"""Return a summary for the given workflow run via MCP."""
try:
not_ready = _prefect_not_ready_status()
not_ready = _temporal_not_ready_status()
if not_ready:
return {
"error": "Prefect infrastructure not ready",
"prefect": not_ready,
"error": "Temporal infrastructure not ready",
"temporal": not_ready,
}
status = await prefect_mgr.get_flow_run_status(run_id)
findings = await prefect_mgr.get_flow_run_findings(run_id)
workflow_name = "unknown"
deployment_id = status.get("workflow", "")
for name, deployment in prefect_mgr.deployments.items():
if str(deployment) == str(deployment_id):
workflow_name = name
break
status = await temporal_mgr.get_workflow_status(run_id)
# Try to get result if completed
total_findings = 0
severity_summary = {"critical": 0, "high": 0, "medium": 0, "low": 0, "info": 0}
if findings and "sarif" in findings:
sarif = findings["sarif"]
if isinstance(sarif, dict):
total_findings = sarif.get("total_findings", 0)
if status.get("status") == "COMPLETED":
try:
result = await temporal_mgr.get_workflow_result(run_id)
if isinstance(result, dict):
summary = result.get("summary", {})
total_findings = summary.get("total_findings", 0)
except Exception as e:
logger.debug(f"Could not retrieve result for {run_id}: {e}")
return {
"run_id": run_id,
"workflow": workflow_name,
"workflow": "unknown", # Temporal doesn't track workflow name in status
"status": status.get("status", "unknown"),
"is_completed": status.get("is_completed", False),
"is_completed": status.get("status") == "COMPLETED",
"total_findings": total_findings,
"severity_summary": severity_summary,
"scan_duration": status.get("updated_at", "")
if status.get("is_completed")
else "In progress",
"scan_duration": status.get("close_time") or "In progress",
"recommendations": (
[
"Review high and critical severity findings first",
@@ -482,32 +426,26 @@ async def get_comprehensive_scan_summary(run_id: str) -> Dict[str, Any] | Dict[s
@mcp.tool
async def get_run_status_mcp(run_id: str) -> Dict[str, Any]:
"""Return current status information for a Prefect run."""
"""Return current status information for a Temporal run."""
try:
not_ready = _prefect_not_ready_status()
not_ready = _temporal_not_ready_status()
if not_ready:
return {
"error": "Prefect infrastructure not ready",
"prefect": not_ready,
"error": "Temporal infrastructure not ready",
"temporal": not_ready,
}
status = await prefect_mgr.get_flow_run_status(run_id)
workflow_name = "unknown"
deployment_id = status.get("workflow", "")
for name, deployment in prefect_mgr.deployments.items():
if str(deployment) == str(deployment_id):
workflow_name = name
break
status = await temporal_mgr.get_workflow_status(run_id)
return {
"run_id": status["run_id"],
"workflow": workflow_name,
"run_id": run_id,
"workflow": "unknown",
"status": status["status"],
"is_completed": status["is_completed"],
"is_failed": status["is_failed"],
"is_running": status["is_running"],
"created_at": status["created_at"],
"updated_at": status["updated_at"],
"is_completed": status["status"] in ["COMPLETED", "FAILED", "CANCELLED"],
"is_failed": status["status"] == "FAILED",
"is_running": status["status"] == "RUNNING",
"created_at": status.get("start_time"),
"updated_at": status.get("close_time") or status.get("execution_time"),
}
except Exception as exc:
logger.exception("MCP run status failed")
@@ -518,38 +456,30 @@ async def get_run_status_mcp(run_id: str) -> Dict[str, Any]:
async def get_run_findings_mcp(run_id: str) -> Dict[str, Any]:
"""Return SARIF findings for a completed run."""
try:
not_ready = _prefect_not_ready_status()
not_ready = _temporal_not_ready_status()
if not_ready:
return {
"error": "Prefect infrastructure not ready",
"prefect": not_ready,
"error": "Temporal infrastructure not ready",
"temporal": not_ready,
}
status = await prefect_mgr.get_flow_run_status(run_id)
if not status.get("is_completed"):
status = await temporal_mgr.get_workflow_status(run_id)
if status.get("status") != "COMPLETED":
return {"error": f"Run {run_id} not completed. Status: {status.get('status')}"}
findings = await prefect_mgr.get_flow_run_findings(run_id)
workflow_name = "unknown"
deployment_id = status.get("workflow", "")
for name, deployment in prefect_mgr.deployments.items():
if str(deployment) == str(deployment_id):
workflow_name = name
break
result = await temporal_mgr.get_workflow_result(run_id)
metadata = {
"completion_time": status.get("updated_at"),
"completion_time": status.get("close_time"),
"workflow_version": "unknown",
}
info = prefect_mgr.workflows.get(workflow_name)
if info:
metadata["workflow_version"] = info.metadata.get("version", "unknown")
sarif = result.get("sarif", {}) if isinstance(result, dict) else {}
return {
"workflow": workflow_name,
"workflow": "unknown",
"run_id": run_id,
"sarif": findings,
"sarif": sarif,
"metadata": metadata,
}
except Exception as exc:
@@ -561,16 +491,15 @@ async def get_run_findings_mcp(run_id: str) -> Dict[str, Any]:
async def list_recent_runs_mcp(
limit: int = 10,
workflow_name: str | None = None,
states: List[str] | None = None,
) -> Dict[str, Any]:
"""List recent Prefect runs with optional workflow/state filters."""
"""List recent Temporal runs with optional workflow filter."""
not_ready = _prefect_not_ready_status()
not_ready = _temporal_not_ready_status()
if not_ready:
return {
"runs": [],
"prefect": not_ready,
"message": "Prefect infrastructure is still initializing",
"temporal": not_ready,
"message": "Temporal infrastructure is still initializing",
}
try:
@@ -579,116 +508,49 @@ async def list_recent_runs_mcp(
limit_value = 10
limit_value = max(1, min(limit_value, 100))
deployment_map = {
str(deployment_id): workflow
for workflow, deployment_id in prefect_mgr.deployments.items()
}
try:
# Build filter query
filter_query = None
if workflow_name:
workflow_info = temporal_mgr.workflows.get(workflow_name)
if workflow_info:
filter_query = f'WorkflowType="{workflow_info.workflow_type}"'
deployment_filter_value = None
if workflow_name:
deployment_id = prefect_mgr.deployments.get(workflow_name)
if not deployment_id:
return {
"runs": [],
"prefect": get_prefect_status(),
"error": f"Workflow '{workflow_name}' has no registered deployment",
}
try:
deployment_filter_value = UUID(str(deployment_id))
except ValueError:
return {
"runs": [],
"prefect": get_prefect_status(),
"error": (
f"Deployment id '{deployment_id}' for workflow '{workflow_name}' is invalid"
),
}
workflows = await temporal_mgr.list_workflows(filter_query, limit_value)
desired_state_types: List[StateType] = []
if states:
for raw_state in states:
if not raw_state:
continue
normalised = raw_state.strip().upper()
if normalised == "ALL":
desired_state_types = []
break
try:
desired_state_types.append(StateType[normalised])
except KeyError:
continue
if not desired_state_types:
desired_state_types = [
StateType.RUNNING,
StateType.COMPLETED,
StateType.FAILED,
StateType.CANCELLED,
]
results: List[Dict[str, Any]] = []
for wf in workflows:
results.append({
"run_id": wf["workflow_id"],
"workflow": workflow_name or "unknown",
"state": wf["status"],
"state_type": wf["status"],
"is_completed": wf["status"] in ["COMPLETED", "FAILED", "CANCELLED"],
"is_running": wf["status"] == "RUNNING",
"is_failed": wf["status"] == "FAILED",
"created_at": wf.get("start_time"),
"updated_at": wf.get("close_time"),
})
flow_filter = FlowRunFilter()
if desired_state_types:
flow_filter.state = FlowRunFilterState(
type=FlowRunFilterStateType(any_=desired_state_types)
)
if deployment_filter_value:
flow_filter.deployment_id = FlowRunFilterDeploymentId(
any_=[deployment_filter_value]
)
return {"runs": results, "temporal": get_temporal_status()}
async with get_client() as client:
flow_runs = await client.read_flow_runs(
limit=limit_value,
flow_run_filter=flow_filter,
sort=FlowRunSort.START_TIME_DESC,
)
results: List[Dict[str, Any]] = []
for flow_run in flow_runs:
deployment_id = getattr(flow_run, "deployment_id", None)
workflow = deployment_map.get(str(deployment_id), "unknown")
state = getattr(flow_run, "state", None)
state_name = getattr(state, "name", None) if state else None
state_type = getattr(state, "type", None) if state else None
results.append(
{
"run_id": str(flow_run.id),
"workflow": workflow,
"deployment_id": str(deployment_id) if deployment_id else None,
"state": state_name or (state_type.name if state_type else None),
"state_type": state_type.name if state_type else None,
"is_completed": bool(getattr(state, "is_completed", lambda: False)()),
"is_running": bool(getattr(state, "is_running", lambda: False)()),
"is_failed": bool(getattr(state, "is_failed", lambda: False)()),
"created_at": getattr(flow_run, "created", None),
"updated_at": getattr(flow_run, "updated", None),
"expected_start_time": getattr(flow_run, "expected_start_time", None),
"start_time": getattr(flow_run, "start_time", None),
}
)
# Normalise datetimes to ISO 8601 strings for serialization
for entry in results:
for key in ("created_at", "updated_at", "expected_start_time", "start_time"):
value = entry.get(key)
if value is None:
continue
try:
entry[key] = value.isoformat()
except AttributeError:
entry[key] = str(value)
return {"runs": results, "prefect": get_prefect_status()}
except Exception as exc:
logger.exception("Failed to list runs")
return {
"runs": [],
"temporal": get_temporal_status(),
"error": str(exc)
}
@mcp.tool
async def get_fuzzing_stats_mcp(run_id: str) -> Dict[str, Any]:
"""Return fuzzing statistics for a run if available."""
not_ready = _prefect_not_ready_status()
not_ready = _temporal_not_ready_status()
if not_ready:
return {
"error": "Prefect infrastructure not ready",
"prefect": not_ready,
"error": "Temporal infrastructure not ready",
"temporal": not_ready,
}
stats = fuzzing.fuzzing_stats.get(run_id)
@@ -708,11 +570,11 @@ async def get_fuzzing_stats_mcp(run_id: str) -> Dict[str, Any]:
@mcp.tool
async def get_fuzzing_crash_reports_mcp(run_id: str) -> Dict[str, Any]:
"""Return crash reports collected for a fuzzing run."""
not_ready = _prefect_not_ready_status()
not_ready = _temporal_not_ready_status()
if not_ready:
return {
"error": "Prefect infrastructure not ready",
"prefect": not_ready,
"error": "Temporal infrastructure not ready",
"temporal": not_ready,
}
reports = fuzzing.crash_reports.get(run_id)
@@ -725,11 +587,11 @@ async def get_fuzzing_crash_reports_mcp(run_id: str) -> Dict[str, Any]:
async def get_backend_status_mcp() -> Dict[str, Any]:
"""Expose backend readiness, workflows, and registered MCP tools."""
status = get_prefect_status()
response: Dict[str, Any] = {"prefect": status}
status = get_temporal_status()
response: Dict[str, Any] = {"temporal": status}
if status.get("ready"):
response["workflows"] = list(prefect_mgr.workflows.keys())
response["workflows"] = list(temporal_mgr.workflows.keys())
try:
tools = await mcp._tool_manager.list_tools()
@@ -775,12 +637,12 @@ def create_mcp_transport_app() -> Starlette:
# ---------------------------------------------------------------------------
# Combined lifespan: Prefect init + dedicated MCP transports
# Combined lifespan: Temporal init + dedicated MCP transports
# ---------------------------------------------------------------------------
@asynccontextmanager
async def combined_lifespan(app: FastAPI):
global prefect_bootstrap_task, _fastapi_mcp_imported
global temporal_bootstrap_task, _fastapi_mcp_imported
logger.info("Starting FuzzForge backend...")
@@ -793,12 +655,12 @@ async def combined_lifespan(app: FastAPI):
except Exception as exc:
logger.exception("Failed to import FastAPI endpoints into MCP", exc_info=exc)
# Kick off Prefect bootstrap in the background if needed
if prefect_bootstrap_task is None or prefect_bootstrap_task.done():
prefect_bootstrap_task = asyncio.create_task(_bootstrap_prefect_with_retries())
logger.info("Prefect bootstrap task started")
# Kick off Temporal bootstrap in the background if needed
if temporal_bootstrap_task is None or temporal_bootstrap_task.done():
temporal_bootstrap_task = asyncio.create_task(_bootstrap_temporal_with_retries())
logger.info("Temporal bootstrap task started")
else:
logger.info("Prefect bootstrap task already running")
logger.info("Temporal bootstrap task already running")
# Start MCP transports on shared port (HTTP + SSE)
mcp_app = create_mcp_transport_app()
@@ -846,18 +708,17 @@ async def combined_lifespan(app: FastAPI):
mcp_server.force_exit = True
await asyncio.gather(mcp_task, return_exceptions=True)
if prefect_bootstrap_task and not prefect_bootstrap_task.done():
prefect_bootstrap_task.cancel()
if temporal_bootstrap_task and not temporal_bootstrap_task.done():
temporal_bootstrap_task.cancel()
with suppress(asyncio.CancelledError):
await prefect_bootstrap_task
prefect_bootstrap_state.task_running = False
if not prefect_bootstrap_state.ready:
prefect_bootstrap_state.status = "stopped"
prefect_bootstrap_state.next_retry_seconds = None
prefect_bootstrap_task = None
await temporal_bootstrap_task
temporal_bootstrap_state.task_running = False
if not temporal_bootstrap_state.ready:
temporal_bootstrap_state.status = "stopped"
temporal_bootstrap_task = None
logger.info("Shutting down Prefect statistics monitor...")
await prefect_stats_monitor.stop_monitoring()
# Close Temporal client
await temporal_mgr.close()
logger.info("Shutting down FuzzForge backend...")
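Note on the status handlers in this file: `get_run_status_mcp` and `list_recent_runs_mcp` each derive `is_completed`, `is_failed` and `is_running` inline from the single Temporal status string, while `get_comprehensive_scan_summary` treats only `COMPLETED` as completed. A minimal sketch of one shared helper follows. `derive_run_flags` and `TERMINAL_STATUSES` are hypothetical names that do not appear in the commit, and the terminal set simply copies the three values used in `get_run_status_mcp`.

```python
from typing import Dict

# Terminal states copied from get_run_status_mcp in this diff.
# Assumption: Temporal has further close statuses (for example TERMINATED
# and TIMED_OUT) that this sketch, like the diff, does not handle.
TERMINAL_STATUSES = frozenset({"COMPLETED", "FAILED", "CANCELLED"})


def derive_run_flags(status: str) -> Dict[str, bool]:
    """Map a status string to the boolean flags returned by the MCP tools."""
    normalised = status.strip().upper()
    return {
        "is_completed": normalised in TERMINAL_STATUSES,
        "is_failed": normalised == "FAILED",
        "is_running": normalised == "RUNNING",
    }
```

Calling such a helper from every handler would keep the three flags consistent across tools; whether `is_completed` should include failed and cancelled runs is a decision for the author.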
+7 -65
@@ -13,10 +13,9 @@ Models for workflow findings and submissions
#
# Additional attribution and requirements are provided in the NOTICE file.
from pydantic import BaseModel, Field, field_validator
from pydantic import BaseModel, Field
from typing import Dict, Any, Optional, Literal, List
from datetime import datetime
from pathlib import Path
class WorkflowFindings(BaseModel):
@@ -27,47 +26,13 @@ class WorkflowFindings(BaseModel):
metadata: Dict[str, Any] = Field(default_factory=dict, description="Additional metadata")
class ResourceLimits(BaseModel):
"""Resource limits for workflow execution"""
cpu_limit: Optional[str] = Field(None, description="CPU limit (e.g., '2' for 2 cores, '500m' for 0.5 cores)")
memory_limit: Optional[str] = Field(None, description="Memory limit (e.g., '1Gi', '512Mi')")
cpu_request: Optional[str] = Field(None, description="CPU request (guaranteed)")
memory_request: Optional[str] = Field(None, description="Memory request (guaranteed)")
class VolumeMount(BaseModel):
"""Volume mount specification"""
host_path: str = Field(..., description="Host path to mount")
container_path: str = Field(..., description="Container path for mount")
mode: Literal["ro", "rw"] = Field(default="ro", description="Mount mode")
@field_validator("host_path")
@classmethod
def validate_host_path(cls, v):
"""Validate that the host path is absolute (existence checked at runtime)"""
path = Path(v)
if not path.is_absolute():
raise ValueError(f"Host path must be absolute: {v}")
# Note: Path existence is validated at workflow runtime
# We can't validate existence here as this runs inside Docker container
return str(path)
@field_validator("container_path")
@classmethod
def validate_container_path(cls, v):
"""Validate that the container path is absolute"""
if not v.startswith('/'):
raise ValueError(f"Container path must be absolute: {v}")
return v
class WorkflowSubmission(BaseModel):
"""Submit a workflow with configurable settings"""
target_path: str = Field(..., description="Absolute path to analyze")
volume_mode: Literal["ro", "rw"] = Field(
default="ro",
description="Volume mount mode: read-only (ro) or read-write (rw)"
)
"""
Submit a workflow with configurable settings.
Note: This model is deprecated in favor of the /upload-and-submit endpoint
which handles file uploads directly.
"""
parameters: Dict[str, Any] = Field(
default_factory=dict,
description="Workflow-specific parameters"
@@ -78,25 +43,6 @@ class WorkflowSubmission(BaseModel):
ge=1,
le=604800 # Max 7 days to support fuzzing campaigns
)
resource_limits: Optional[ResourceLimits] = Field(
None,
description="Resource limits for workflow container"
)
additional_volumes: List[VolumeMount] = Field(
default_factory=list,
description="Additional volume mounts (e.g., for corpus, output directories)"
)
@field_validator("target_path")
@classmethod
def validate_path(cls, v):
"""Validate that the target path is absolute (existence checked at runtime)"""
path = Path(v)
if not path.is_absolute():
raise ValueError(f"Path must be absolute: {v}")
# Note: Path existence is validated at workflow runtime when volumes are mounted
# We can't validate existence here as this runs inside Docker container
return str(path)
class WorkflowStatus(BaseModel):
@@ -131,10 +77,6 @@ class WorkflowMetadata(BaseModel):
default=["ro", "rw"],
description="Supported volume mount modes"
)
has_custom_docker: bool = Field(
default=False,
description="Whether workflow has custom Dockerfile"
)
class WorkflowListItem(BaseModel):
@@ -1,394 +0,0 @@
"""
Generic Prefect Statistics Monitor Service
This service monitors ALL workflows for structured live data logging and
updates the appropriate statistics APIs. Works with any workflow that follows
the standard LIVE_STATS logging pattern.
"""
# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.
import asyncio
import json
import logging
from datetime import datetime, timedelta, timezone
from typing import Dict, Any, Optional
from prefect.client.orchestration import get_client
from prefect.client.schemas.objects import FlowRun, TaskRun
from src.models.findings import FuzzingStats
from src.api.fuzzing import fuzzing_stats, initialize_fuzzing_tracking, active_connections
logger = logging.getLogger(__name__)
class PrefectStatsMonitor:
"""Monitors Prefect flows and tasks for live statistics from any workflow"""
def __init__(self):
self.monitoring = False
self.monitor_task = None
self.monitored_runs = set()
self.last_log_ts: Dict[str, datetime] = {}
self._client = None
self._client_refresh_time = None
self._client_refresh_interval = 300 # Refresh connection every 5 minutes
async def start_monitoring(self):
"""Start the Prefect statistics monitoring service"""
if self.monitoring:
logger.warning("Prefect stats monitor already running")
return
self.monitoring = True
self.monitor_task = asyncio.create_task(self._monitor_flows())
logger.info("Started Prefect statistics monitor")
async def stop_monitoring(self):
"""Stop the monitoring service"""
self.monitoring = False
if self.monitor_task:
self.monitor_task.cancel()
try:
await self.monitor_task
except asyncio.CancelledError:
pass
logger.info("Stopped Prefect statistics monitor")
async def _get_or_refresh_client(self):
"""Get or refresh Prefect client with connection pooling."""
now = datetime.now(timezone.utc)
if (self._client is None or
self._client_refresh_time is None or
(now - self._client_refresh_time).total_seconds() > self._client_refresh_interval):
if self._client:
try:
await self._client.aclose()
except Exception:
pass
self._client = get_client()
self._client_refresh_time = now
await self._client.__aenter__()
return self._client
async def _monitor_flows(self):
"""Main monitoring loop that watches Prefect flows"""
try:
while self.monitoring:
try:
# Use connection pooling for better performance
client = await self._get_or_refresh_client()
# Get recent flow runs (limit to reduce load)
flow_runs = await client.read_flow_runs(
limit=50,
sort="START_TIME_DESC",
)
# Only consider runs from the last 15 minutes
recent_cutoff = datetime.now(timezone.utc) - timedelta(minutes=15)
for flow_run in flow_runs:
created = getattr(flow_run, "created", None)
if created is None:
continue
try:
# Ensure timezone-aware comparison
if created.tzinfo is None:
created = created.replace(tzinfo=timezone.utc)
if created >= recent_cutoff:
await self._monitor_flow_run(client, flow_run)
except Exception:
# If comparison fails, attempt monitoring anyway
await self._monitor_flow_run(client, flow_run)
await asyncio.sleep(5) # Check every 5 seconds
except Exception as e:
logger.error(f"Error in Prefect monitoring: {e}")
await asyncio.sleep(10)
except asyncio.CancelledError:
logger.info("Prefect monitoring cancelled")
except Exception as e:
logger.error(f"Fatal error in Prefect monitoring: {e}")
finally:
# Clean up client on exit
if self._client:
try:
await self._client.__aexit__(None, None, None)
except Exception:
pass
self._client = None
async def _monitor_flow_run(self, client, flow_run: FlowRun):
"""Monitor a specific flow run for statistics"""
run_id = str(flow_run.id)
workflow_name = flow_run.name or "unknown"
try:
# Initialize tracking if not exists - only for workflows that might have live stats
if run_id not in fuzzing_stats:
initialize_fuzzing_tracking(run_id, workflow_name)
self.monitored_runs.add(run_id)
# Skip corrupted entries (should not happen after startup cleanup, but defensive)
elif not isinstance(fuzzing_stats[run_id], FuzzingStats):
logger.warning(f"Skipping corrupted stats entry for {run_id}, reinitializing")
initialize_fuzzing_tracking(run_id, workflow_name)
self.monitored_runs.add(run_id)
# Get task runs for this flow
task_runs = await client.read_task_runs(
flow_run_filter={"id": {"any_": [flow_run.id]}},
limit=25,
)
# Check all tasks for live statistics logging
for task_run in task_runs:
await self._extract_stats_from_task(client, run_id, task_run, workflow_name)
# Also scan flow-level logs as a fallback
await self._extract_stats_from_flow_logs(client, run_id, flow_run, workflow_name)
except Exception as e:
logger.warning(f"Error monitoring flow run {run_id}: {e}")
async def _extract_stats_from_task(self, client, run_id: str, task_run: TaskRun, workflow_name: str):
"""Extract statistics from any task that logs live stats"""
try:
# Get task run logs
logs = await client.read_logs(
log_filter={
"task_run_id": {"any_": [task_run.id]}
},
limit=100,
sort="TIMESTAMP_ASC"
)
# Parse logs for LIVE_STATS entries (generic pattern for any workflow)
latest_stats = None
for log in logs:
# Prefer structured extra field if present
extra_data = getattr(log, "extra", None) or getattr(log, "extra_fields", None) or None
if isinstance(extra_data, dict):
stat_type = extra_data.get("stats_type")
if stat_type in ["fuzzing_live_update", "scan_progress", "analysis_update", "live_stats"]:
latest_stats = extra_data
continue
# Fallback to parsing from message text
if ("FUZZ_STATS" in log.message or "LIVE_STATS" in log.message):
stats = self._parse_stats_from_log(log.message)
if stats:
latest_stats = stats
# Update statistics if we found any
if latest_stats:
# Calculate elapsed time from task start
elapsed_time = 0
if task_run.start_time:
# Ensure timezone-aware arithmetic
now = datetime.now(timezone.utc)
try:
elapsed_time = int((now - task_run.start_time).total_seconds())
except Exception:
# Fallback to naive UTC if types mismatch
elapsed_time = int((datetime.utcnow() - task_run.start_time.replace(tzinfo=None)).total_seconds())
updated_stats = FuzzingStats(
run_id=run_id,
workflow=workflow_name,
executions=latest_stats.get("executions", 0),
executions_per_sec=latest_stats.get("executions_per_sec", 0.0),
crashes=latest_stats.get("crashes", 0),
unique_crashes=latest_stats.get("unique_crashes", 0),
corpus_size=latest_stats.get("corpus_size", 0),
elapsed_time=elapsed_time
)
# Update the global stats
previous = fuzzing_stats.get(run_id)
fuzzing_stats[run_id] = updated_stats
# Broadcast to any active WebSocket clients for this run
if active_connections.get(run_id):
# Handle both Pydantic objects and plain dicts
if isinstance(updated_stats, dict):
stats_data = updated_stats
elif hasattr(updated_stats, 'model_dump'):
stats_data = updated_stats.model_dump()
elif hasattr(updated_stats, 'dict'):
stats_data = updated_stats.dict()
else:
stats_data = updated_stats.__dict__
message = {
"type": "stats_update",
"data": stats_data,
}
disconnected = []
for ws in active_connections[run_id]:
try:
await ws.send_text(json.dumps(message))
except Exception:
disconnected.append(ws)
# Clean up disconnected sockets
for ws in disconnected:
try:
active_connections[run_id].remove(ws)
except ValueError:
pass
logger.debug(f"Updated Prefect stats for {run_id}: {updated_stats.executions} execs")
except Exception as e:
logger.warning(f"Error extracting stats from task {task_run.id}: {e}")
async def _extract_stats_from_flow_logs(self, client, run_id: str, flow_run: FlowRun, workflow_name: str):
"""Extract statistics by scanning flow-level logs for LIVE/FUZZ stats"""
try:
logs = await client.read_logs(
log_filter={
"flow_run_id": {"any_": [flow_run.id]}
},
limit=200,
sort="TIMESTAMP_ASC"
)
latest_stats = None
last_seen = self.last_log_ts.get(run_id)
max_ts = last_seen
for log in logs:
# Skip logs we've already processed
ts = getattr(log, "timestamp", None)
if last_seen and ts and ts <= last_seen:
continue
if ts and (max_ts is None or ts > max_ts):
max_ts = ts
# Prefer structured extra field if available
extra_data = getattr(log, "extra", None) or getattr(log, "extra_fields", None) or None
if isinstance(extra_data, dict):
stat_type = extra_data.get("stats_type")
if stat_type in ["fuzzing_live_update", "scan_progress", "analysis_update", "live_stats"]:
latest_stats = extra_data
continue
# Fallback to message parse
if ("FUZZ_STATS" in log.message or "LIVE_STATS" in log.message):
stats = self._parse_stats_from_log(log.message)
if stats:
latest_stats = stats
if max_ts:
self.last_log_ts[run_id] = max_ts
if latest_stats:
# Use flow_run timestamps for elapsed time if available
elapsed_time = 0
start_time = getattr(flow_run, "start_time", None) or getattr(flow_run, "start_time", None)
if start_time:
now = datetime.now(timezone.utc)
try:
if start_time.tzinfo is None:
start_time = start_time.replace(tzinfo=timezone.utc)
elapsed_time = int((now - start_time).total_seconds())
except Exception:
elapsed_time = int((datetime.utcnow() - start_time.replace(tzinfo=None)).total_seconds())
updated_stats = FuzzingStats(
run_id=run_id,
workflow=workflow_name,
executions=latest_stats.get("executions", 0),
executions_per_sec=latest_stats.get("executions_per_sec", 0.0),
crashes=latest_stats.get("crashes", 0),
unique_crashes=latest_stats.get("unique_crashes", 0),
corpus_size=latest_stats.get("corpus_size", 0),
elapsed_time=elapsed_time
)
fuzzing_stats[run_id] = updated_stats
# Broadcast if listeners exist
if active_connections.get(run_id):
# Handle both Pydantic objects and plain dicts
if isinstance(updated_stats, dict):
stats_data = updated_stats
elif hasattr(updated_stats, 'model_dump'):
stats_data = updated_stats.model_dump()
elif hasattr(updated_stats, 'dict'):
stats_data = updated_stats.dict()
else:
stats_data = updated_stats.__dict__
message = {
"type": "stats_update",
"data": stats_data,
}
disconnected = []
for ws in active_connections[run_id]:
try:
await ws.send_text(json.dumps(message))
except Exception:
disconnected.append(ws)
for ws in disconnected:
try:
active_connections[run_id].remove(ws)
except ValueError:
pass
except Exception as e:
logger.warning(f"Error extracting stats from flow logs {run_id}: {e}")
def _parse_stats_from_log(self, log_message: str) -> Optional[Dict[str, Any]]:
"""Parse statistics from a log message"""
try:
import re
# Prefer explicit JSON after marker tokens
m = re.search(r'(?:FUZZ_STATS|LIVE_STATS)\s+(\{.*\})', log_message)
if m:
try:
return json.loads(m.group(1))
except Exception:
pass
# Fallback: Extract the extra= dict and coerce to JSON
stats_match = re.search(r'extra=({.*?})', log_message)
if not stats_match:
return None
extra_str = stats_match.group(1)
extra_str = extra_str.replace("'", '"')
extra_str = extra_str.replace('None', 'null')
extra_str = extra_str.replace('True', 'true')
extra_str = extra_str.replace('False', 'false')
stats_data = json.loads(extra_str)
# Support multiple stat types for different workflows
stat_type = stats_data.get("stats_type")
if stat_type in ["fuzzing_live_update", "scan_progress", "analysis_update", "live_stats"]:
return stats_data
except Exception as e:
logger.debug(f"Error parsing log stats: {e}")
return None
# Global instance
prefect_stats_monitor = PrefectStatsMonitor()
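Note on the removed monitor: the marker-based log parsing in `_parse_stats_from_log` does not depend on Prefect and can be exercised on its own. The sketch below covers only the first branch (JSON following a `FUZZ_STATS` or `LIVE_STATS` marker) and omits the `extra=` fallback. `parse_live_stats` is a hypothetical name, not something this commit adds.

```python
import json
import re
from typing import Any, Dict, Optional

# Same marker pattern as the deleted _parse_stats_from_log.
_MARKER_RE = re.compile(r"(?:FUZZ_STATS|LIVE_STATS)\s+(\{.*\})")


def parse_live_stats(log_message: str) -> Optional[Dict[str, Any]]:
    """Return the JSON object that follows a stats marker, or None."""
    match = _MARKER_RE.search(log_message)
    if match is None:
        return None
    try:
        payload = json.loads(match.group(1))
    except json.JSONDecodeError:
        return None
    # Only JSON objects are useful as stats; reject lists, numbers, etc.
    return payload if isinstance(payload, dict) else None
```

Unlike the original, this version catches only `json.JSONDecodeError`, so unrelated errors are not silently swallowed.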
+10
@@ -0,0 +1,10 @@
"""
Storage abstraction layer for FuzzForge.
Provides unified interface for storing and retrieving targets and results.
"""
from .base import StorageBackend
from .s3_cached import S3CachedStorage
__all__ = ["StorageBackend", "S3CachedStorage"]
+153
@@ -0,0 +1,153 @@
"""
Base storage backend interface.
All storage implementations must implement this interface.
"""
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Optional, Dict, Any
class StorageBackend(ABC):
"""
Abstract base class for storage backends.
Implementations handle storage and retrieval of:
- Uploaded targets (code, binaries, etc.)
- Workflow results
- Temporary files
"""
@abstractmethod
async def upload_target(
self,
file_path: Path,
user_id: str,
metadata: Optional[Dict[str, Any]] = None
) -> str:
"""
Upload a target file to storage.
Args:
file_path: Local path to file to upload
user_id: ID of user uploading the file
metadata: Optional metadata to store with file
Returns:
Target ID (unique identifier for retrieval)
Raises:
FileNotFoundError: If file_path doesn't exist
StorageError: If upload fails
"""
pass
@abstractmethod
async def get_target(self, target_id: str) -> Path:
"""
Get target file from storage.
Args:
target_id: Unique identifier from upload_target()
Returns:
Local path to cached file
Raises:
FileNotFoundError: If target doesn't exist
StorageError: If download fails
"""
pass
@abstractmethod
async def delete_target(self, target_id: str) -> None:
"""
Delete target from storage.
Args:
target_id: Unique identifier to delete
Raises:
StorageError: If deletion fails (doesn't raise if not found)
"""
pass
@abstractmethod
async def upload_results(
self,
workflow_id: str,
results: Dict[str, Any],
results_format: str = "json"
) -> str:
"""
Upload workflow results to storage.
Args:
workflow_id: Workflow execution ID
results: Results dictionary
results_format: Format (json, sarif, etc.)
Returns:
URL to uploaded results
Raises:
StorageError: If upload fails
"""
pass
@abstractmethod
async def get_results(self, workflow_id: str) -> Dict[str, Any]:
"""
Get workflow results from storage.
Args:
workflow_id: Workflow execution ID
Returns:
Results dictionary
Raises:
FileNotFoundError: If results don't exist
StorageError: If download fails
"""
pass
@abstractmethod
async def list_targets(
self,
user_id: Optional[str] = None,
limit: int = 100
) -> list[Dict[str, Any]]:
"""
List uploaded targets.
Args:
user_id: Filter by user ID (None = all users)
limit: Maximum number of results
Returns:
List of target metadata dictionaries
Raises:
StorageError: If listing fails
"""
pass
@abstractmethod
async def cleanup_cache(self) -> int:
"""
Clean up local cache (LRU eviction).
Returns:
Number of files removed
Raises:
StorageError: If cleanup fails
"""
pass
class StorageError(Exception):
"""Base exception for storage operations."""
pass
+423
@@ -0,0 +1,423 @@
"""
S3-compatible storage backend with local caching.
Works with MinIO (dev/prod) or AWS S3 (cloud).
"""
import json
import logging
import os
import shutil
from datetime import datetime
from pathlib import Path
from typing import Optional, Dict, Any
from uuid import uuid4
import boto3
from botocore.exceptions import ClientError
from .base import StorageBackend, StorageError
logger = logging.getLogger(__name__)
class S3CachedStorage(StorageBackend):
"""
S3-compatible storage with local caching.
Features:
- Upload targets to S3/MinIO
- Download with local caching (LRU eviction)
- Lifecycle management (auto-cleanup old files)
- Metadata tracking
"""
def __init__(
self,
endpoint_url: Optional[str] = None,
access_key: Optional[str] = None,
secret_key: Optional[str] = None,
bucket: str = "targets",
region: str = "us-east-1",
use_ssl: bool = False,
cache_dir: Optional[Path] = None,
cache_max_size_gb: int = 10
):
"""
Initialize S3 storage backend.
Args:
endpoint_url: S3 endpoint URL (None = S3_ENDPOINT env var, default http://minio:9000)
access_key: S3 access key (None = from env)
secret_key: S3 secret key (None = from env)
bucket: S3 bucket name
region: AWS region
use_ssl: Use HTTPS
cache_dir: Local cache directory
cache_max_size_gb: Maximum cache size in GB
"""
# Use environment variables as defaults
self.endpoint_url = endpoint_url or os.getenv('S3_ENDPOINT', 'http://minio:9000')
self.access_key = access_key or os.getenv('S3_ACCESS_KEY', 'fuzzforge')
self.secret_key = secret_key or os.getenv('S3_SECRET_KEY', 'fuzzforge123')
self.bucket = bucket or os.getenv('S3_BUCKET', 'targets')
self.region = region or os.getenv('S3_REGION', 'us-east-1')
self.use_ssl = use_ssl or os.getenv('S3_USE_SSL', 'false').lower() == 'true'
# Cache configuration
self.cache_dir = cache_dir or Path(os.getenv('CACHE_DIR', '/tmp/fuzzforge-cache'))
self.cache_max_size = cache_max_size_gb * (1024 ** 3) # Convert to bytes
# Ensure cache directory exists
self.cache_dir.mkdir(parents=True, exist_ok=True)
# Initialize S3 client
try:
self.s3_client = boto3.client(
's3',
endpoint_url=self.endpoint_url,
aws_access_key_id=self.access_key,
aws_secret_access_key=self.secret_key,
region_name=self.region,
use_ssl=self.use_ssl
)
logger.info(f"Initialized S3 storage: {self.endpoint_url}/{self.bucket}")
except Exception as e:
logger.error(f"Failed to initialize S3 client: {e}")
raise StorageError(f"S3 initialization failed: {e}")
async def upload_target(
self,
file_path: Path,
user_id: str,
metadata: Optional[Dict[str, Any]] = None
) -> str:
"""Upload target file to S3/MinIO."""
if not file_path.exists():
raise FileNotFoundError(f"File not found: {file_path}")
# Generate unique target ID
target_id = str(uuid4())
# Prepare metadata
upload_metadata = {
'user_id': user_id,
'uploaded_at': datetime.now().isoformat(),
'filename': file_path.name,
'size': str(file_path.stat().st_size)
}
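if metadata:
# S3 user metadata values must be strings, so coerce caller-supplied values
upload_metadata.update({k: str(v) for k, v in metadata.items()})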
# Upload to S3
s3_key = f'{target_id}/target'
try:
logger.info(f"Uploading target to s3://{self.bucket}/{s3_key}")
self.s3_client.upload_file(
str(file_path),
self.bucket,
s3_key,
ExtraArgs={
'Metadata': upload_metadata
}
)
file_size_mb = file_path.stat().st_size / (1024 * 1024)
logger.info(
f"✓ Uploaded target {target_id} "
f"({file_path.name}, {file_size_mb:.2f} MB)"
)
return target_id
except ClientError as e:
logger.error(f"S3 upload failed: {e}", exc_info=True)
raise StorageError(f"Failed to upload target: {e}")
except Exception as e:
logger.error(f"Upload failed: {e}", exc_info=True)
raise StorageError(f"Upload error: {e}")
async def get_target(self, target_id: str) -> Path:
"""Get target from cache or download from S3/MinIO."""
# Check cache first
cache_path = self.cache_dir / target_id
cached_file = cache_path / "target"
if cached_file.exists():
# Update access time for LRU
cached_file.touch()
logger.info(f"Cache HIT: {target_id}")
return cached_file
# Cache miss - download from S3
logger.info(f"Cache MISS: {target_id}, downloading from S3...")
try:
# Create cache directory
cache_path.mkdir(parents=True, exist_ok=True)
# Download from S3
s3_key = f'{target_id}/target'
logger.info(f"Downloading s3://{self.bucket}/{s3_key}")
self.s3_client.download_file(
self.bucket,
s3_key,
str(cached_file)
)
# Verify download
if not cached_file.exists():
raise StorageError(f"Downloaded file not found: {cached_file}")
file_size_mb = cached_file.stat().st_size / (1024 * 1024)
logger.info(f"✓ Downloaded target {target_id} ({file_size_mb:.2f} MB)")
return cached_file
except ClientError as e:
error_code = e.response.get('Error', {}).get('Code')
if error_code in ['404', 'NoSuchKey']:
logger.error(f"Target not found: {target_id}")
raise FileNotFoundError(f"Target {target_id} not found in storage")
else:
logger.error(f"S3 download failed: {e}", exc_info=True)
raise StorageError(f"Download failed: {e}")
except Exception as e:
logger.error(f"Download error: {e}", exc_info=True)
# Cleanup partial download
if cache_path.exists():
shutil.rmtree(cache_path, ignore_errors=True)
raise StorageError(f"Download error: {e}")
async def delete_target(self, target_id: str) -> None:
"""Delete target from S3/MinIO."""
try:
s3_key = f'{target_id}/target'
logger.info(f"Deleting s3://{self.bucket}/{s3_key}")
self.s3_client.delete_object(
Bucket=self.bucket,
Key=s3_key
)
# Also delete from cache if present
cache_path = self.cache_dir / target_id
if cache_path.exists():
shutil.rmtree(cache_path, ignore_errors=True)
logger.info(f"✓ Deleted target {target_id} from S3 and cache")
else:
logger.info(f"✓ Deleted target {target_id} from S3")
except ClientError as e:
logger.error(f"S3 delete failed: {e}", exc_info=True)
# Don't raise error if object doesn't exist
if e.response.get('Error', {}).get('Code') not in ['404', 'NoSuchKey']:
raise StorageError(f"Delete failed: {e}")
except Exception as e:
logger.error(f"Delete error: {e}", exc_info=True)
raise StorageError(f"Delete error: {e}")
async def upload_results(
self,
workflow_id: str,
results: Dict[str, Any],
results_format: str = "json"
) -> str:
"""Upload workflow results to S3/MinIO."""
try:
# Prepare results content
if results_format == "json":
content = json.dumps(results, indent=2).encode('utf-8')
content_type = 'application/json'
file_ext = 'json'
elif results_format == "sarif":
content = json.dumps(results, indent=2).encode('utf-8')
content_type = 'application/sarif+json'
file_ext = 'sarif'
else:
content = json.dumps(results, indent=2).encode('utf-8')
content_type = 'application/json'
file_ext = 'json'
# Upload to results bucket
results_bucket = 'results'
s3_key = f'{workflow_id}/results.{file_ext}'
logger.info(f"Uploading results to s3://{results_bucket}/{s3_key}")
self.s3_client.put_object(
Bucket=results_bucket,
Key=s3_key,
Body=content,
ContentType=content_type,
Metadata={
'workflow_id': workflow_id,
'format': results_format,
'uploaded_at': datetime.now().isoformat()
}
)
# Construct URL
results_url = f"{self.endpoint_url}/{results_bucket}/{s3_key}"
logger.info(f"✓ Uploaded results: {results_url}")
return results_url
except Exception as e:
logger.error(f"Results upload failed: {e}", exc_info=True)
raise StorageError(f"Results upload failed: {e}")
async def get_results(self, workflow_id: str) -> Dict[str, Any]:
"""Get workflow results from S3/MinIO."""
try:
results_bucket = 'results'
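# NOTE: only results.json is looked up; results uploaded with
# results_format="sarif" are stored as results.sarif and are not found here.
s3_key = f'{workflow_id}/results.json'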
logger.info(f"Downloading results from s3://{results_bucket}/{s3_key}")
response = self.s3_client.get_object(
Bucket=results_bucket,
Key=s3_key
)
content = response['Body'].read().decode('utf-8')
results = json.loads(content)
logger.info(f"✓ Downloaded results for workflow {workflow_id}")
return results
except ClientError as e:
error_code = e.response.get('Error', {}).get('Code')
if error_code in ['404', 'NoSuchKey']:
logger.error(f"Results not found: {workflow_id}")
raise FileNotFoundError(f"Results for workflow {workflow_id} not found")
else:
logger.error(f"Results download failed: {e}", exc_info=True)
raise StorageError(f"Results download failed: {e}")
except Exception as e:
logger.error(f"Results download error: {e}", exc_info=True)
raise StorageError(f"Results download error: {e}")
async def list_targets(
self,
user_id: Optional[str] = None,
limit: int = 100
) -> list[Dict[str, Any]]:
"""List uploaded targets."""
try:
targets = []
paginator = self.s3_client.get_paginator('list_objects_v2')
for page in paginator.paginate(Bucket=self.bucket, PaginationConfig={'MaxItems': limit}):
for obj in page.get('Contents', []):
# Get object metadata
try:
metadata_response = self.s3_client.head_object(
Bucket=self.bucket,
Key=obj['Key']
)
metadata = metadata_response.get('Metadata', {})
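# Filter by user_id if specified. The filter runs after MaxItems is applied,
# so fewer than `limit` targets may be returned for a given user.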
if user_id and metadata.get('user_id') != user_id:
continue
targets.append({
'target_id': obj['Key'].split('/')[0],
'key': obj['Key'],
'size': obj['Size'],
'last_modified': obj['LastModified'].isoformat(),
'metadata': metadata
})
except Exception as e:
logger.warning(f"Failed to get metadata for {obj['Key']}: {e}")
continue
logger.info(f"Listed {len(targets)} targets (user_id={user_id})")
return targets
except Exception as e:
logger.error(f"List targets failed: {e}", exc_info=True)
raise StorageError(f"List targets failed: {e}")
async def cleanup_cache(self) -> int:
"""Clean up local cache using LRU eviction."""
try:
cache_files = []
total_size = 0
# Gather all cached files with metadata
for cache_file in self.cache_dir.rglob('*'):
if cache_file.is_file():
try:
stat = cache_file.stat()
cache_files.append({
'path': cache_file,
'size': stat.st_size,
'atime': stat.st_atime # Last access time
})
total_size += stat.st_size
except Exception as e:
logger.warning(f"Failed to stat {cache_file}: {e}")
continue
# Check if cleanup is needed
if total_size <= self.cache_max_size:
logger.info(
f"Cache size OK: {total_size / (1024**3):.2f} GB / "
f"{self.cache_max_size / (1024**3):.2f} GB"
)
return 0
# Sort by access time (oldest first)
cache_files.sort(key=lambda x: x['atime'])
# Remove files until under limit
removed_count = 0
for file_info in cache_files:
if total_size <= self.cache_max_size:
break
try:
file_info['path'].unlink()
total_size -= file_info['size']
removed_count += 1
logger.debug(f"Evicted from cache: {file_info['path']}")
except Exception as e:
logger.warning(f"Failed to delete {file_info['path']}: {e}")
continue
logger.info(
f"✓ Cache cleanup: removed {removed_count} files, "
f"new size: {total_size / (1024**3):.2f} GB"
)
return removed_count
except Exception as e:
logger.error(f"Cache cleanup failed: {e}", exc_info=True)
raise StorageError(f"Cache cleanup failed: {e}")
def get_cache_stats(self) -> Dict[str, Any]:
"""Get cache statistics."""
try:
total_size = 0
file_count = 0
for cache_file in self.cache_dir.rglob('*'):
if cache_file.is_file():
total_size += cache_file.stat().st_size
file_count += 1
return {
'total_size_bytes': total_size,
'total_size_gb': total_size / (1024 ** 3),
'file_count': file_count,
'max_size_gb': self.cache_max_size / (1024 ** 3),
'usage_percent': (total_size / self.cache_max_size) * 100
}
except Exception as e:
logger.error(f"Failed to get cache stats: {e}")
return {'error': str(e)}
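The eviction strategy in `cleanup_cache()` (collect file sizes and access times, sort oldest first, delete until the total fits) can be tried in isolation. The sketch below is a synchronous approximation under stated assumptions, not the class's actual method: `evict_lru` and `demo` are hypothetical names, and the demo sets access times explicitly with `os.utime` so the ordering does not depend on filesystem `atime` behaviour.

```python
import os
import tempfile
from pathlib import Path


def evict_lru(cache_dir: Path, max_bytes: int) -> int:
    """Delete least-recently-accessed files until the cache fits in max_bytes."""
    entries = []
    total = 0
    for path in cache_dir.rglob("*"):
        if path.is_file():
            st = path.stat()
            entries.append((st.st_atime, st.st_size, path))
            total += st.st_size

    removed = 0
    # Oldest access time first, mirroring the sort in cleanup_cache().
    for _atime, size, path in sorted(entries, key=lambda e: e[0]):
        if total <= max_bytes:
            break
        path.unlink()
        total -= size
        removed += 1
    return removed


def demo() -> tuple[int, list[str]]:
    with tempfile.TemporaryDirectory() as tmp:
        cache = Path(tmp)
        for i, name in enumerate(["old", "mid", "new"]):
            target = cache / name / "target"
            target.parent.mkdir()
            target.write_bytes(b"x" * 100)
            # Explicit (atime, mtime) so eviction order is deterministic.
            os.utime(target, (1_000 + i, 1_000 + i))
        removed = evict_lru(cache, max_bytes=150)
        survivors = sorted(p.parent.name for p in cache.rglob("target"))
        return removed, survivors
```

With three 100-byte files and a 150-byte limit, the two oldest files are removed. As in the original method, the now-empty per-target directories are left in place.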
@@ -0,0 +1,10 @@
"""
Temporal integration for FuzzForge.
Handles workflow execution, monitoring, and management.
"""
from .manager import TemporalManager
from .discovery import WorkflowDiscovery
__all__ = ["TemporalManager", "WorkflowDiscovery"]
@@ -0,0 +1,257 @@
"""
Workflow Discovery for Temporal
Discovers workflows from the toolbox/workflows directory
and provides metadata about available workflows.
"""
import logging
import yaml
from pathlib import Path
from typing import Dict, Any
from pydantic import BaseModel, Field, ConfigDict
logger = logging.getLogger(__name__)
class WorkflowInfo(BaseModel):
"""Information about a discovered workflow"""
name: str = Field(..., description="Workflow name")
path: Path = Field(..., description="Path to workflow directory")
workflow_file: Path = Field(..., description="Path to workflow.py file")
metadata: Dict[str, Any] = Field(..., description="Workflow metadata from YAML")
workflow_type: str = Field(..., description="Workflow class name")
vertical: str = Field(..., description="Vertical (worker type) for this workflow")
model_config = ConfigDict(arbitrary_types_allowed=True)
class WorkflowDiscovery:
"""
Discovers workflows from the filesystem.
Scans toolbox/workflows/ for directories containing:
- metadata.yaml (required)
- workflow.py (required)
Each workflow declares its vertical (rust, android, web, etc.)
which determines which worker pool will execute it.
"""
def __init__(self, workflows_dir: Path):
"""
Initialize workflow discovery.
Args:
workflows_dir: Path to the workflows directory
"""
self.workflows_dir = workflows_dir
if not self.workflows_dir.exists():
self.workflows_dir.mkdir(parents=True, exist_ok=True)
logger.info(f"Created workflows directory: {self.workflows_dir}")
async def discover_workflows(self) -> Dict[str, WorkflowInfo]:
"""
Discover workflows by scanning the workflows directory.
Returns:
Dictionary mapping workflow names to their information
"""
workflows = {}
logger.info(f"Scanning for workflows in: {self.workflows_dir}")
for workflow_dir in self.workflows_dir.iterdir():
if not workflow_dir.is_dir():
continue
# Skip special directories
if workflow_dir.name.startswith('.') or workflow_dir.name == '__pycache__':
continue
metadata_file = workflow_dir / "metadata.yaml"
if not metadata_file.exists():
logger.debug(f"No metadata.yaml in {workflow_dir.name}, skipping")
continue
workflow_file = workflow_dir / "workflow.py"
if not workflow_file.exists():
logger.warning(
f"Workflow {workflow_dir.name} has metadata but no workflow.py, skipping"
)
continue
try:
# Parse metadata
with open(metadata_file) as f:
# safe_load returns None for an empty file; fall back to an empty dict
metadata = yaml.safe_load(f) or {}
# Validate required fields
if 'name' not in metadata:
logger.warning(f"Workflow {workflow_dir.name} metadata missing 'name' field, using directory name")
metadata['name'] = workflow_dir.name
if 'vertical' not in metadata:
logger.warning(
f"Workflow {workflow_dir.name} metadata missing 'vertical' field, skipping"
)
continue
# Infer workflow class name from metadata or use convention
workflow_type = metadata.get('workflow_class')
if not workflow_type:
# Convention: convert snake_case to PascalCase + Workflow
# e.g., rust_test -> RustTestWorkflow
parts = workflow_dir.name.split('_')
workflow_type = ''.join(part.capitalize() for part in parts) + 'Workflow'
# Create workflow info
info = WorkflowInfo(
name=metadata['name'],
path=workflow_dir,
workflow_file=workflow_file,
metadata=metadata,
workflow_type=workflow_type,
vertical=metadata['vertical']
)
workflows[info.name] = info
logger.info(
f"✓ Discovered workflow: {info.name} "
f"(vertical: {info.vertical}, class: {info.workflow_type})"
)
except Exception as e:
logger.error(
f"Error discovering workflow {workflow_dir.name}: {e}",
exc_info=True
)
continue
logger.info(f"Discovered {len(workflows)} workflows")
return workflows
def get_workflows_by_vertical(
self,
workflows: Dict[str, WorkflowInfo],
vertical: str
) -> Dict[str, WorkflowInfo]:
"""
Filter workflows by vertical.
Args:
workflows: All discovered workflows
vertical: Vertical name to filter by
Returns:
Filtered workflows dictionary
"""
return {
name: info
for name, info in workflows.items()
if info.vertical == vertical
}
def get_available_verticals(self, workflows: Dict[str, WorkflowInfo]) -> list[str]:
"""
Get list of all verticals from discovered workflows.
Args:
workflows: All discovered workflows
Returns:
List of unique vertical names
"""
# Sorted so the result is deterministic across calls
return sorted(set(info.vertical for info in workflows.values()))
@staticmethod
def get_metadata_schema() -> Dict[str, Any]:
"""
Get the JSON schema for workflow metadata.
Returns:
JSON schema dictionary
"""
return {
"type": "object",
"required": ["name", "version", "description", "author", "vertical", "parameters"],
"properties": {
"name": {
"type": "string",
"description": "Workflow name"
},
"version": {
"type": "string",
"pattern": "^\\d+\\.\\d+\\.\\d+$",
"description": "Semantic version (x.y.z)"
},
"vertical": {
"type": "string",
"description": "Vertical worker type (rust, android, web, etc.)"
},
"description": {
"type": "string",
"description": "Workflow description"
},
"author": {
"type": "string",
"description": "Workflow author"
},
"category": {
"type": "string",
"enum": ["comprehensive", "specialized", "fuzzing", "focused"],
"description": "Workflow category"
},
"tags": {
"type": "array",
"items": {"type": "string"},
"description": "Workflow tags for categorization"
},
"requirements": {
"type": "object",
"required": ["tools", "resources"],
"properties": {
"tools": {
"type": "array",
"items": {"type": "string"},
"description": "Required security tools"
},
"resources": {
"type": "object",
"required": ["memory", "cpu", "timeout"],
"properties": {
"memory": {
"type": "string",
"pattern": "^\\d+[GMK]i$",
"description": "Memory limit (e.g., 1Gi, 512Mi)"
},
"cpu": {
"type": "string",
"pattern": "^\\d+m?$",
"description": "CPU limit (e.g., 1000m, 2)"
},
"timeout": {
"type": "integer",
"minimum": 60,
"maximum": 7200,
"description": "Workflow timeout in seconds"
}
}
}
}
},
"parameters": {
"type": "object",
"description": "Workflow parameters schema"
},
"default_parameters": {
"type": "object",
"description": "Default parameter values"
},
"required_modules": {
"type": "array",
"items": {"type": "string"},
"description": "Required module names"
}
}
}
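Two rules from the discovery module are easy to check by hand: the fallback class-name convention (`rust_test` becomes `RustTestWorkflow`) and the resource patterns in the metadata schema. The sketch below restates them as standalone functions. `infer_workflow_class` and `check_resources` are hypothetical helpers written for illustration; the regular expressions and the 60 to 7200 second timeout range are copied from the schema above, but the module itself does not appear to validate metadata against that schema during discovery.

```python
import re
from typing import Any, Dict, List

# Patterns copied from get_metadata_schema().
MEMORY_RE = re.compile(r"^\d+[GMK]i$")
CPU_RE = re.compile(r"^\d+m?$")


def infer_workflow_class(dir_name: str) -> str:
    """snake_case directory name -> PascalCase + 'Workflow'."""
    return "".join(part.capitalize() for part in dir_name.split("_")) + "Workflow"


def check_resources(resources: Dict[str, Any]) -> List[str]:
    """Return the names of resource fields that violate the schema rules."""
    problems = []
    if not MEMORY_RE.match(str(resources.get("memory", ""))):
        problems.append("memory")
    if not CPU_RE.match(str(resources.get("cpu", ""))):
        problems.append("cpu")
    timeout = resources.get("timeout")
    if not isinstance(timeout, int) or not 60 <= timeout <= 7200:
        problems.append("timeout")
    return problems
```

Note that `str.capitalize()` lowercases the rest of each part, so a directory named `my_API` would map to `MyApiWorkflow`; setting `workflow_class` in `metadata.yaml` avoids relying on the convention.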
@@ -0,0 +1,371 @@
"""
Temporal Manager - Workflow execution and management
Handles:
- Workflow discovery from toolbox
- Workflow execution (submit to Temporal)
- Status monitoring
- Results retrieval
"""
import logging
import os
from pathlib import Path
from typing import Dict, Optional, Any
from uuid import uuid4
from temporalio.client import Client, WorkflowHandle
from temporalio.common import RetryPolicy
from datetime import timedelta
from .discovery import WorkflowDiscovery, WorkflowInfo
from src.storage import S3CachedStorage
logger = logging.getLogger(__name__)
class TemporalManager:
"""
Manages Temporal workflow execution for FuzzForge.
This class:
- Discovers available workflows from toolbox
- Submits workflow executions to Temporal
- Monitors workflow status
- Retrieves workflow results
"""
def __init__(
self,
workflows_dir: Optional[Path] = None,
temporal_address: Optional[str] = None,
temporal_namespace: str = "default",
storage: Optional[S3CachedStorage] = None
):
"""
Initialize Temporal manager.
Args:
workflows_dir: Path to workflows directory (default: toolbox/workflows)
temporal_address: Temporal server address (default: from env or localhost:7233)
temporal_namespace: Temporal namespace
storage: Storage backend for file uploads (default: S3CachedStorage)
"""
if workflows_dir is None:
workflows_dir = Path("toolbox/workflows")
self.temporal_address = temporal_address or os.getenv(
'TEMPORAL_ADDRESS',
'localhost:7233'
)
self.temporal_namespace = temporal_namespace
self.discovery = WorkflowDiscovery(workflows_dir)
self.workflows: Dict[str, WorkflowInfo] = {}
self.client: Optional[Client] = None
# Initialize storage backend
self.storage = storage or S3CachedStorage()
logger.info(
f"TemporalManager initialized: {self.temporal_address} "
f"(namespace: {self.temporal_namespace})"
)
async def initialize(self):
"""Initialize the manager by discovering workflows and connecting to Temporal."""
try:
# Discover workflows
self.workflows = await self.discovery.discover_workflows()
if not self.workflows:
logger.warning("No workflows discovered")
else:
logger.info(
f"Discovered {len(self.workflows)} workflows: "
f"{list(self.workflows.keys())}"
)
# Connect to Temporal
self.client = await Client.connect(
self.temporal_address,
namespace=self.temporal_namespace
)
logger.info(f"✓ Connected to Temporal: {self.temporal_address}")
except Exception as e:
logger.error(f"Failed to initialize Temporal manager: {e}", exc_info=True)
raise
async def close(self):
"""Close Temporal client connection."""
if self.client:
# The Temporal Python SDK client does not need an explicit close
pass
async def get_workflows(self) -> Dict[str, WorkflowInfo]:
"""
Get all discovered workflows.
Returns:
Dictionary mapping workflow names to their info
"""
return self.workflows
async def get_workflow(self, name: str) -> Optional[WorkflowInfo]:
"""
Get workflow info by name.
Args:
name: Workflow name
Returns:
WorkflowInfo or None if not found
"""
return self.workflows.get(name)
async def upload_target(
self,
file_path: Path,
user_id: str,
metadata: Optional[Dict[str, Any]] = None
) -> str:
"""
Upload target file to storage.
Args:
file_path: Local path to file
user_id: User ID
metadata: Optional metadata
Returns:
Target ID for use in workflow execution
"""
target_id = await self.storage.upload_target(file_path, user_id, metadata)
logger.info(f"Uploaded target: {target_id}")
return target_id
async def run_workflow(
self,
workflow_name: str,
target_id: str,
workflow_params: Optional[Dict[str, Any]] = None,
workflow_id: Optional[str] = None
) -> WorkflowHandle:
"""
Execute a workflow.
Args:
workflow_name: Name of workflow to execute
target_id: Target ID (from upload_target)
workflow_params: Additional workflow parameters
workflow_id: Optional workflow ID (generated if not provided)
Returns:
WorkflowHandle for monitoring/results
Raises:
ValueError: If workflow not found or client not initialized
"""
if not self.client:
raise ValueError("Temporal client not initialized. Call initialize() first.")
# Get workflow info
workflow_info = self.workflows.get(workflow_name)
if not workflow_info:
raise ValueError(f"Workflow not found: {workflow_name}")
# Generate workflow ID if not provided
if not workflow_id:
workflow_id = f"{workflow_name}-{str(uuid4())[:8]}"
# Prepare workflow input arguments
workflow_params = workflow_params or {}
# Build args list: [target_id, ...workflow_params values]
# The workflow parameters are passed as individual positional args
workflow_args = [target_id]
# Parameters are appended in dict insertion order, so callers must supply
# workflow_params in the same order as the workflow's positional parameters.
# For security_assessment: scanner_config, analyzer_config, reporter_config
# For atheris_fuzzing: target_file, max_iterations, timeout_seconds
if workflow_params:
workflow_args.extend(workflow_params.values())
# Determine task queue from workflow vertical
vertical = workflow_info.metadata.get("vertical", "default")
task_queue = f"{vertical}-queue"
logger.info(
f"Starting workflow: {workflow_name} "
f"(id={workflow_id}, queue={task_queue}, target={target_id})"
)
try:
# Start workflow execution with positional arguments
handle = await self.client.start_workflow(
workflow=workflow_info.workflow_type, # Workflow class name
args=workflow_args, # Positional arguments
id=workflow_id,
task_queue=task_queue,
retry_policy=RetryPolicy(
initial_interval=timedelta(seconds=1),
maximum_interval=timedelta(minutes=1),
maximum_attempts=3
)
)
logger.info(f"✓ Workflow started: {workflow_id}")
return handle
except Exception as e:
logger.error(f"Failed to start workflow {workflow_name}: {e}", exc_info=True)
raise
async def get_workflow_status(self, workflow_id: str) -> Dict[str, Any]:
"""
Get workflow execution status.
Args:
workflow_id: Workflow execution ID
Returns:
Status dictionary with workflow state
Raises:
ValueError: If client not initialized or workflow not found
"""
if not self.client:
raise ValueError("Temporal client not initialized")
try:
# Get workflow handle
handle = self.client.get_workflow_handle(workflow_id)
# Describe the execution (does not wait for the workflow result)
description = await handle.describe()
status = {
"workflow_id": workflow_id,
"status": description.status.name,
"start_time": description.start_time.isoformat() if description.start_time else None,
"execution_time": description.execution_time.isoformat() if description.execution_time else None,
"close_time": description.close_time.isoformat() if description.close_time else None,
"task_queue": description.task_queue,
}
logger.info(f"Workflow {workflow_id} status: {status['status']}")
return status
except Exception as e:
logger.error(f"Failed to get workflow status: {e}", exc_info=True)
raise
async def get_workflow_result(
self,
workflow_id: str,
timeout: Optional[timedelta] = None
) -> Any:
"""
Get workflow execution result (blocking).
Args:
workflow_id: Workflow execution ID
timeout: Maximum time to wait for result
Returns:
Workflow result
Raises:
ValueError: If client not initialized
TimeoutError: If timeout exceeded
"""
if not self.client:
raise ValueError("Temporal client not initialized")
try:
handle = self.client.get_workflow_handle(workflow_id)
logger.info(f"Waiting for workflow result: {workflow_id}")
# Wait for workflow to complete and get result
if timeout:
# Use asyncio timeout if provided
import asyncio
result = await asyncio.wait_for(handle.result(), timeout=timeout.total_seconds())
else:
result = await handle.result()
logger.info(f"✓ Workflow {workflow_id} completed")
return result
except Exception as e:
logger.error(f"Failed to get workflow result: {e}", exc_info=True)
raise
async def cancel_workflow(self, workflow_id: str) -> None:
"""
Cancel a running workflow.
Args:
workflow_id: Workflow execution ID
Raises:
ValueError: If client not initialized
"""
if not self.client:
raise ValueError("Temporal client not initialized")
try:
handle = self.client.get_workflow_handle(workflow_id)
await handle.cancel()
logger.info(f"✓ Workflow cancelled: {workflow_id}")
except Exception as e:
logger.error(f"Failed to cancel workflow: {e}", exc_info=True)
raise
async def list_workflows(
self,
filter_query: Optional[str] = None,
limit: int = 100
) -> list[Dict[str, Any]]:
"""
List workflow executions.
Args:
filter_query: Optional Temporal list filter query
limit: Maximum number of results
Returns:
List of workflow execution info
Raises:
ValueError: If client not initialized
"""
if not self.client:
raise ValueError("Temporal client not initialized")
try:
workflows = []
# Use Temporal's list API
async for workflow in self.client.list_workflows(filter_query):
workflows.append({
"workflow_id": workflow.id,
"workflow_type": workflow.workflow_type,
"status": workflow.status.name,
"start_time": workflow.start_time.isoformat() if workflow.start_time else None,
"close_time": workflow.close_time.isoformat() if workflow.close_time else None,
"task_queue": workflow.task_queue,
})
if len(workflows) >= limit:
break
logger.info(f"Listed {len(workflows)} workflows")
return workflows
except Exception as e:
logger.error(f"Failed to list workflows: {e}", exc_info=True)
raise
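`run_workflow()` turns `workflow_params` into positional arguments by taking the dict values in insertion order, and derives the task queue as `<vertical>-queue`. The sketch below isolates both steps to show why parameter order matters. `build_workflow_args` and `task_queue_for` restate the logic above; `ordered_params` is a hypothetical helper, not something this change provides, that reorders a dict to match a known parameter list.

```python
from typing import Any, Dict, List, Optional, Sequence


def build_workflow_args(
    target_id: str, params: Optional[Dict[str, Any]] = None
) -> List[Any]:
    """target_id first, then parameter values in dict insertion order."""
    args: List[Any] = [target_id]
    if params:
        args.extend(params.values())
    return args


def ordered_params(params: Dict[str, Any], order: Sequence[str]) -> Dict[str, Any]:
    """Hypothetical helper: rebuild the dict in the workflow's parameter order."""
    return {name: params[name] for name in order if name in params}


def task_queue_for(vertical: str) -> str:
    """Task queue naming used when starting a workflow."""
    return f"{vertical}-queue"


# Example: the caller built the dict in the "wrong" order.
params = {"analyzer_config": {"a": 1}, "scanner_config": {"s": 2}}
signature_order = ["scanner_config", "analyzer_config", "reporter_config"]

unordered_args = build_workflow_args("t1", params)
ordered_args = build_workflow_args("t1", ordered_params(params, signature_order))
```

Here `unordered_args` would hand the analyzer config to the scanner slot, while `ordered_args` matches the order listed for `security_assessment`. Skipped parameters still shift later ones left, so a workflow with optional middle parameters would need explicit defaults.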