mirror of
https://github.com/FuzzingLabs/fuzzforge_ai.git
synced 2026-05-21 16:16:49 +02:00
CI/CD Integration with Ephemeral Deployment Model (#14)
* feat: Complete migration from Prefect to Temporal

  BREAKING CHANGE: Replaces Prefect workflow orchestration with Temporal

  ## Major Changes
  - Replace Prefect with Temporal for workflow orchestration
  - Implement vertical worker architecture (rust, android)
  - Replace Docker registry with MinIO for unified storage
  - Refactor activities to be co-located with workflows
  - Update all API endpoints for Temporal compatibility

  ## Infrastructure
  - New: docker-compose.temporal.yaml (Temporal + MinIO + workers)
  - New: workers/ directory with rust and android vertical workers
  - New: backend/src/temporal/ (manager, discovery)
  - New: backend/src/storage/ (S3-cached storage with MinIO)
  - New: backend/toolbox/common/ (shared storage activities)
  - Deleted: docker-compose.yaml (old Prefect setup)
  - Deleted: backend/src/core/prefect_manager.py
  - Deleted: backend/src/services/prefect_stats_monitor.py
  - Deleted: Docker registry and insecure-registries requirement

  ## Workflows
  - Migrated: security_assessment workflow to Temporal
  - New: rust_test workflow (example/test workflow)
  - Deleted: secret_detection_scan (Prefect-based, to be reimplemented)
  - Activities now co-located with workflows for independent testing

  ## API Changes
  - Updated: backend/src/api/workflows.py (Temporal submission)
  - Updated: backend/src/api/runs.py (Temporal status/results)
  - Updated: backend/src/main.py (727 lines, TemporalManager integration)
  - Updated: All 16 MCP tools to use TemporalManager

  ## Testing
  - ✅ All services healthy (Temporal, PostgreSQL, MinIO, workers, backend)
  - ✅ All API endpoints functional
  - ✅ End-to-end workflow test passed (72 findings from vulnerable_app)
  - ✅ MinIO storage integration working (target upload/download, results)
  - ✅ Worker activity discovery working (6 activities registered)
  - ✅ Tarball extraction working
  - ✅ SARIF report generation working

  ## Documentation
  - ARCHITECTURE.md: Complete Temporal architecture documentation
  - QUICKSTART_TEMPORAL.md: Getting started guide
  - MIGRATION_DECISION.md: Why we chose Temporal over Prefect
  - IMPLEMENTATION_STATUS.md: Migration progress tracking
  - workers/README.md: Worker development guide

  ## Dependencies
  - Added: temporalio>=1.6.0
  - Added: boto3>=1.34.0 (MinIO S3 client)
  - Removed: prefect>=3.4.18

* feat: Add Python fuzzing vertical with Atheris integration

  This commit implements a complete Python fuzzing workflow using Atheris:

  ## Python Worker (workers/python/)
  - Dockerfile with Python 3.11, Atheris, and build tools
  - Generic worker.py for dynamic workflow discovery
  - requirements.txt with temporalio, boto3, atheris dependencies
  - Added to docker-compose.temporal.yaml with dedicated cache volume

  ## AtherisFuzzer Module (backend/toolbox/modules/fuzzer/)
  - Reusable module extending BaseModule
  - Auto-discovers fuzz targets (fuzz_*.py, *_fuzz.py, fuzz_target.py)
  - Recursive search to find targets in nested directories
  - Dynamically loads TestOneInput() function
  - Configurable max_iterations and timeout
  - Real-time stats callback support for live monitoring
  - Returns findings as ModuleFinding objects

  ## Atheris Fuzzing Workflow (backend/toolbox/workflows/atheris_fuzzing/)
  - Temporal workflow for orchestrating fuzzing
  - Downloads user code from MinIO
  - Executes AtherisFuzzer module
  - Uploads results to MinIO
  - Cleans up cache after execution
  - metadata.yaml with vertical: python for routing

  ## Test Project (test_projects/python_fuzz_waterfall/)
  - Demonstrates stateful waterfall vulnerability
  - main.py with check_secret() that leaks progress
  - fuzz_target.py with Atheris TestOneInput() harness
  - Complete README with usage instructions

  ## Backend Fixes
  - Fixed parameter merging in REST API endpoints (workflows.py)
  - Changed workflow parameter passing from positional args to kwargs (manager.py)
  - Default parameters now properly merged with user parameters

  ## Testing
  - ✅ Worker discovered AtherisFuzzingWorkflow
  - ✅ Workflow executed end-to-end successfully
  - ✅ Fuzz target auto-discovered in nested directories
  - ✅ Atheris ran 100,000 iterations
  - ✅ Results uploaded and cache cleaned

* chore: Complete Temporal migration with updated CLI/SDK/docs

  This commit includes all remaining Temporal migration changes:

  ## CLI Updates (cli/)
  - Updated workflow execution commands for Temporal
  - Enhanced error handling and exceptions
  - Updated dependencies in uv.lock

  ## SDK Updates (sdk/)
  - Client methods updated for Temporal workflows
  - Updated models for new workflow execution
  - Updated dependencies in uv.lock

  ## Documentation Updates (docs/)
  - Architecture documentation for Temporal
  - Workflow concept documentation
  - Resource management documentation (new)
  - Debugging guide (new)
  - Updated tutorials and how-to guides
  - Troubleshooting updates

  ## README Updates
  - Main README with Temporal instructions
  - Backend README
  - CLI README
  - SDK README

  ## Other
  - Updated IMPLEMENTATION_STATUS.md
  - Removed old vulnerable_app.tar.gz

  These changes complete the Temporal migration and ensure the CLI/SDK work correctly with the new backend.

* fix: Use positional args instead of kwargs for Temporal workflows

  The Temporal Python SDK's start_workflow() method doesn't accept a 'kwargs' parameter. Workflows must receive parameters as positional arguments via the 'args' parameter.

  Changed from: args=workflow_args # Positional arguments

  This fixes the error: TypeError: Client.start_workflow() got an unexpected keyword argument 'kwargs'

  Workflows now correctly receive parameters in order:
  - security_assessment: [target_id, scanner_config, analyzer_config, reporter_config]
  - atheris_fuzzing: [target_id, target_file, max_iterations, timeout_seconds]
  - rust_test: [target_id, test_message]

* fix: Filter metadata-only parameters from workflow arguments

  SecurityAssessmentWorkflow was receiving 7 arguments instead of 2-5. The issue was that target_path and volume_mode from default_parameters were being passed to the workflow, when they should only be used by the system for configuration.

  Now filters out metadata-only parameters (target_path, volume_mode) before passing arguments to workflow execution.

* refactor: Remove Prefect leftovers and volume mounting legacy

  Complete cleanup of Prefect migration artifacts:

  Backend:
  - Delete registry.py and workflow_discovery.py (Prefect-specific files)
  - Remove Docker validation from setup.py (no longer needed)
  - Remove ResourceLimits and VolumeMount models
  - Remove target_path and volume_mode from WorkflowSubmission
  - Remove supported_volume_modes from API and discovery
  - Clean up metadata.yaml files (remove volume/path fields)
  - Simplify parameter filtering in manager.py

  SDK:
  - Remove volume_mode parameter from client methods
  - Remove ResourceLimits and VolumeMount models
  - Remove Prefect error patterns from docker_logs.py
  - Clean up WorkflowSubmission and WorkflowMetadata models

  CLI:
  - Remove Volume Modes display from workflow info

  All removed features are Prefect-specific or Docker volume mounting artifacts. Temporal workflows use MinIO storage exclusively.

* feat: Add comprehensive test suite and benchmark infrastructure

  - Add 68 unit tests for fuzzer, scanner, and analyzer modules
  - Implement pytest-based test infrastructure with fixtures
  - Add 6 performance benchmarks with category-specific thresholds
  - Configure GitHub Actions for automated testing and benchmarking
  - Add test and benchmark documentation

  Test coverage:
  - AtherisFuzzer: 8 tests
  - CargoFuzzer: 14 tests
  - FileScanner: 22 tests
  - SecurityAnalyzer: 24 tests

  All tests passing (68/68)
  All benchmarks passing (6/6)

* fix: Resolve all ruff linting violations across codebase

  Fixed 27 ruff violations in 12 files:
  - Removed unused imports (Depends, Dict, Any, Optional, etc.)
  - Fixed undefined workflow_info variable in workflows.py
  - Removed dead code with undefined variables in atheris_fuzzer.py
  - Changed f-string to regular string where no placeholders used

  All files now pass ruff checks for CI/CD compliance.

* fix: Configure CI for unit tests only

  - Renamed docker-compose.temporal.yaml → docker-compose.yml for CI compatibility
  - Commented out integration-tests job (no integration tests yet)
  - Updated test-summary to only depend on lint and unit-tests

  CI will now run successfully with 68 unit tests. Integration tests can be added later.

* feat: Add CI/CD integration with ephemeral deployment model

  Implements comprehensive CI/CD support for FuzzForge with on-demand worker management:

  **Worker Management (v0.7.0)**
  - Add WorkerManager for automatic worker lifecycle control
  - Auto-start workers from stopped state when workflows execute
  - Auto-stop workers after workflow completion
  - Health checks and startup timeout handling (90s default)

  **CI/CD Features**
  - `--fail-on` flag: Fail builds based on SARIF severity levels (error/warning/note/info)
  - `--export-sarif` flag: Export findings in SARIF 2.1.0 format
  - `--auto-start`/`--auto-stop` flags: Control worker lifecycle
  - Exit code propagation: Returns 1 on blocking findings, 0 on success

  **Exit Code Fix**
  - Add `except typer.Exit: raise` handlers at 3 critical locations
  - Move worker cleanup to finally block for guaranteed execution
  - Exit codes now propagate correctly even when build fails

  **CI Scripts & Examples**
  - ci-start.sh: Start FuzzForge services with health checks
  - ci-stop.sh: Clean shutdown with volume preservation option
  - GitHub Actions workflow example (security-scan.yml)
  - GitLab CI pipeline example (.gitlab-ci.example.yml)
  - docker-compose.ci.yml: CI-optimized compose file with profiles

  **OSS-Fuzz Integration**
  - New ossfuzz_campaign workflow for running OSS-Fuzz projects
  - OSS-Fuzz worker with Docker-in-Docker support
  - Configurable campaign duration and project selection

  **Documentation**
  - Comprehensive CI/CD integration guide (docs/how-to/cicd-integration.md)
  - Updated architecture docs with worker lifecycle details
  - Updated workspace isolation documentation
  - CLI README with worker management examples

  **SDK Enhancements**
  - Add get_workflow_worker_info() endpoint
  - Worker vertical metadata in workflow responses

  **Testing**
  - All workflows tested: security_assessment, atheris_fuzzing, secret_detection, cargo_fuzzing
  - All monitoring commands tested: stats, crashes, status, finding
  - Full CI pipeline simulation verified
  - Exit codes verified for success/failure scenarios

  Ephemeral CI/CD model: ~3-4GB RAM, ~60-90s startup, runs entirely in CI containers.

* fix: Resolve ruff linting violations in CI/CD code

  - Remove unused variables (run_id, defaults, result)
  - Remove unused imports
  - Fix f-string without placeholders

  All CI/CD integration files now pass ruff checks.
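The `--fail-on` behaviour described in the commit message (return 1 on blocking findings, 0 on success) can be pictured with a short sketch. This is not the FuzzForge CLI code: the function name `exit_code_for`, the severity ranking, the handling of `info` as the lowest rank, and the default of `warning` for results without a level are all assumptions made for illustration. It only reads `runs[].results[].level` from a SARIF-shaped dictionary.

```python
from typing import Any, Dict

# Assumed ordering, lowest to highest. "info" is accepted by the flag per the
# commit message; treating it as the lowest rank is an assumption.
SEVERITY_RANK = {"info": 0, "note": 1, "warning": 2, "error": 3}


def exit_code_for(sarif: Dict[str, Any], fail_on: str) -> int:
    """Return 1 if any result is at or above `fail_on`, otherwise 0."""
    threshold = SEVERITY_RANK[fail_on]
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # Assumption: a result without an explicit level counts as "warning".
            level = result.get("level", "warning")
            # Unknown levels fall back to the lowest rank.
            if SEVERITY_RANK.get(level, 0) >= threshold:
                return 1
    return 0


if __name__ == "__main__":
    report = {"version": "2.1.0", "runs": [{"results": [{"level": "warning"}]}]}
    print(exit_code_for(report, "error"))    # no error-level finding present
    print(exit_code_for(report, "warning"))  # the warning meets the threshold
```

A CI step would typically pass the returned value to the process exit call, so the build fails exactly when a finding reaches the configured level.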
@@ -14,8 +14,8 @@ API endpoints for fuzzing workflow management and real-time monitoring
 # Additional attribution and requirements are provided in the NOTICE file.

 import logging
-from typing import List, Dict, Any
-from fastapi import APIRouter, HTTPException, Depends, WebSocket, WebSocketDisconnect
+from typing import List, Dict
+from fastapi import APIRouter, HTTPException, WebSocket, WebSocketDisconnect
 from fastapi.responses import StreamingResponse
 import asyncio
 import json
@@ -25,7 +25,6 @@ from src.models.findings import (
     FuzzingStats,
     CrashReport
 )
-from src.core.workflow_discovery import WorkflowDiscovery

 logger = logging.getLogger(__name__)

@@ -126,12 +125,13 @@ async def update_fuzzing_stats(run_id: str, stats: FuzzingStats):
     # Debug: log reception for live instrumentation
     try:
         logger.info(
-            "Received fuzzing stats update: run_id=%s exec=%s eps=%.2f crashes=%s corpus=%s elapsed=%ss",
+            "Received fuzzing stats update: run_id=%s exec=%s eps=%.2f crashes=%s corpus=%s coverage=%s elapsed=%ss",
             run_id,
             stats.executions,
             stats.executions_per_sec,
             stats.crashes,
             stats.corpus_size,
+            stats.coverage,
             stats.elapsed_time,
         )
     except Exception:
+49
-56
@@ -14,7 +14,6 @@ API endpoints for workflow run management and findings retrieval
 # Additional attribution and requirements are provided in the NOTICE file.

 import logging
-from typing import Dict, Any
 from fastapi import APIRouter, HTTPException, Depends

 from src.models.findings import WorkflowFindings, WorkflowStatus
@@ -24,22 +23,22 @@ logger = logging.getLogger(__name__)
 router = APIRouter(prefix="/runs", tags=["runs"])


-def get_prefect_manager():
-    """Dependency to get the Prefect manager instance"""
-    from src.main import prefect_mgr
-    return prefect_mgr
+def get_temporal_manager():
+    """Dependency to get the Temporal manager instance"""
+    from src.main import temporal_mgr
+    return temporal_mgr


 @router.get("/{run_id}/status", response_model=WorkflowStatus)
 async def get_run_status(
     run_id: str,
-    prefect_mgr=Depends(get_prefect_manager)
+    temporal_mgr=Depends(get_temporal_manager)
 ) -> WorkflowStatus:
     """
     Get the current status of a workflow run.

     Args:
-        run_id: The flow run ID
+        run_id: The workflow run ID

     Returns:
         Status information including state, timestamps, and completion flags
@@ -48,25 +47,23 @@ async def get_run_status(
         HTTPException: 404 if run not found
     """
     try:
-        status = await prefect_mgr.get_flow_run_status(run_id)
+        status = await temporal_mgr.get_workflow_status(run_id)

-        # Find workflow name from deployment
-        workflow_name = "unknown"
-        workflow_deployment_id = status.get("workflow", "")
-        for name, deployment_id in prefect_mgr.deployments.items():
-            if str(deployment_id) == str(workflow_deployment_id):
-                workflow_name = name
-                break
+        # Map Temporal status to response format
+        workflow_status = status.get("status", "UNKNOWN")
+        is_completed = workflow_status in ["COMPLETED", "FAILED", "CANCELLED"]
+        is_failed = workflow_status == "FAILED"
+        is_running = workflow_status == "RUNNING"

         return WorkflowStatus(
-            run_id=status["run_id"],
-            workflow=workflow_name,
-            status=status["status"],
-            is_completed=status["is_completed"],
-            is_failed=status["is_failed"],
-            is_running=status["is_running"],
-            created_at=status["created_at"],
-            updated_at=status["updated_at"]
+            run_id=run_id,
+            workflow="unknown",  # Temporal doesn't track workflow name in status
+            status=workflow_status,
+            is_completed=is_completed,
+            is_failed=is_failed,
+            is_running=is_running,
+            created_at=status.get("start_time"),
+            updated_at=status.get("close_time") or status.get("execution_time")
         )

     except Exception as e:
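The status hunk above derives three booleans from a single status string. A minimal sketch of that mapping in isolation follows; the names `TERMINAL_STATES` and `completion_flags` are hypothetical and do not appear in the commit, and which strings `get_workflow_status` actually returns is not visible in this diff.

```python
from typing import Dict

# Terminal states as listed in the hunk above.
TERMINAL_STATES = {"COMPLETED", "FAILED", "CANCELLED"}


def completion_flags(workflow_status: str) -> Dict[str, bool]:
    """Derive the WorkflowStatus booleans from one status string."""
    return {
        "is_completed": workflow_status in TERMINAL_STATES,
        "is_failed": workflow_status == "FAILED",
        "is_running": workflow_status == "RUNNING",
    }


if __name__ == "__main__":
    for state in ("RUNNING", "FAILED", "UNKNOWN"):
        print(state, completion_flags(state))
```

One consequence of this design is that any string outside the listed values (for example a timed-out or terminated run, if the manager reports one) yields all three flags as `False`, so callers would see it as neither running nor completed.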
@@ -80,13 +77,13 @@ async def get_run_status(
 @router.get("/{run_id}/findings", response_model=WorkflowFindings)
 async def get_run_findings(
     run_id: str,
-    prefect_mgr=Depends(get_prefect_manager)
+    temporal_mgr=Depends(get_temporal_manager)
 ) -> WorkflowFindings:
     """
     Get the findings from a completed workflow run.

     Args:
-        run_id: The flow run ID
+        run_id: The workflow run ID

     Returns:
         SARIF-formatted findings from the workflow execution
@@ -96,50 +93,46 @@ async def get_run_findings(
     """
     try:
         # Get run status first
-        status = await prefect_mgr.get_flow_run_status(run_id)
+        status = await temporal_mgr.get_workflow_status(run_id)
+        workflow_status = status.get("status", "UNKNOWN")

-        if not status["is_completed"]:
-            if status["is_running"]:
+        if workflow_status not in ["COMPLETED", "FAILED", "CANCELLED"]:
+            if workflow_status == "RUNNING":
                 raise HTTPException(
                     status_code=400,
-                    detail=f"Run {run_id} is still running. Current status: {status['status']}"
-                )
-            elif status["is_failed"]:
-                raise HTTPException(
-                    status_code=400,
-                    detail=f"Run {run_id} failed. Status: {status['status']}"
+                    detail=f"Run {run_id} is still running. Current status: {workflow_status}"
                 )
             else:
                 raise HTTPException(
                     status_code=400,
-                    detail=f"Run {run_id} not completed. Status: {status['status']}"
+                    detail=f"Run {run_id} not completed. Status: {workflow_status}"
                 )

-        # Get the findings
-        findings = await prefect_mgr.get_flow_run_findings(run_id)
+        if workflow_status == "FAILED":
+            raise HTTPException(
+                status_code=400,
+                detail=f"Run {run_id} failed. Status: {workflow_status}"
+            )

-        # Find workflow name
-        workflow_name = "unknown"
-        workflow_deployment_id = status.get("workflow", "")
-        for name, deployment_id in prefect_mgr.deployments.items():
-            if str(deployment_id) == str(workflow_deployment_id):
-                workflow_name = name
-                break
+        # Get the workflow result
+        result = await temporal_mgr.get_workflow_result(run_id)

-        # Get workflow version if available
+        # Extract SARIF from result (handle None for backwards compatibility)
+        if isinstance(result, dict):
+            sarif = result.get("sarif") or {}
+        else:
+            sarif = {}
+
+        # Metadata
         metadata = {
-            "completion_time": status["updated_at"],
+            "completion_time": status.get("close_time"),
             "workflow_version": "unknown"
         }

-        if workflow_name in prefect_mgr.workflows:
-            workflow_info = prefect_mgr.workflows[workflow_name]
-            metadata["workflow_version"] = workflow_info.metadata.get("version", "unknown")
-
         return WorkflowFindings(
-            workflow=workflow_name,
+            workflow="unknown",
             run_id=run_id,
-            sarif=findings,
+            sarif=sarif,
             metadata=metadata
         )

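The findings hunk above tolerates a workflow result that is `None`, not a dictionary, or a dictionary whose `sarif` value is `None`. The same defensive extraction can be sketched as a standalone helper; `extract_sarif` is a hypothetical name used only for this illustration.

```python
from typing import Any, Dict


def extract_sarif(result: Any) -> Dict[str, Any]:
    """Return the SARIF payload from a workflow result, or {} if absent."""
    if isinstance(result, dict):
        # `or {}` also covers an explicit {"sarif": None}.
        return result.get("sarif") or {}
    # None, strings, lists and other non-dict results carry no SARIF here.
    return {}


if __name__ == "__main__":
    print(extract_sarif(None))
    print(extract_sarif({"sarif": {"version": "2.1.0", "runs": []}}))
```

Returning an empty dictionary keeps the response model populated, at the cost of making "no findings" and "no SARIF produced" indistinguishable to the caller.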
@@ -157,7 +150,7 @@ async def get_run_findings(
 async def get_workflow_findings(
     workflow_name: str,
     run_id: str,
-    prefect_mgr=Depends(get_prefect_manager)
+    temporal_mgr=Depends(get_temporal_manager)
 ) -> WorkflowFindings:
     """
     Get findings for a specific workflow run.
@@ -166,7 +159,7 @@ async def get_workflow_findings(

     Args:
         workflow_name: Name of the workflow
-        run_id: The flow run ID
+        run_id: The workflow run ID

     Returns:
         SARIF-formatted findings from the workflow execution
@@ -174,11 +167,11 @@ async def get_workflow_findings(
     Raises:
         HTTPException: 404 if workflow or run not found, 400 if run not completed
     """
-    if workflow_name not in prefect_mgr.workflows:
+    if workflow_name not in temporal_mgr.workflows:
         raise HTTPException(
             status_code=404,
             detail=f"Workflow not found: {workflow_name}"
         )

     # Delegate to the main findings endpoint
-    return await get_run_findings(run_id, prefect_mgr)
+    return await get_run_findings(run_id, temporal_mgr)
+307
-59
@@ -15,8 +15,9 @@ API endpoints for workflow management with enhanced error handling

 import logging
 import traceback
+import tempfile
 from typing import List, Dict, Any, Optional
-from fastapi import APIRouter, HTTPException, Depends
+from fastapi import APIRouter, HTTPException, Depends, UploadFile, File, Form
 from pathlib import Path

 from src.models.findings import (
@@ -25,10 +26,20 @@ from src.models.findings import (
     WorkflowListItem,
     RunSubmissionResponse
 )
-from src.core.workflow_discovery import WorkflowDiscovery
+from src.temporal.discovery import WorkflowDiscovery

 logger = logging.getLogger(__name__)

+# Configuration for file uploads
+MAX_UPLOAD_SIZE = 10 * 1024 * 1024 * 1024  # 10 GB
+ALLOWED_CONTENT_TYPES = [
+    "application/gzip",
+    "application/x-gzip",
+    "application/x-tar",
+    "application/x-compressed-tar",
+    "application/octet-stream",  # Generic binary
+]
+
 router = APIRouter(prefix="/workflows", tags=["workflows"])


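The `MAX_UPLOAD_SIZE` constant introduced above only helps if the limit is checked while the upload is being written, before the whole body has landed on disk. The sketch below shows that pattern with a plain synchronous file-like object so it can run without FastAPI; `stream_to_temp_file` and `UploadTooLarge` are hypothetical names, and removing the partial file on failure is a choice made for this example rather than something this hunk shows.

```python
import io
import os
import tempfile
from typing import BinaryIO


class UploadTooLarge(Exception):
    """Raised when the stream exceeds the configured limit (hypothetical)."""


def stream_to_temp_file(source: BinaryIO, max_bytes: int,
                        chunk_size: int = 1024 * 1024) -> str:
    """Copy `source` to a temp file in chunks, aborting past `max_bytes`."""
    fd, path = tempfile.mkstemp(suffix=".tar.gz")
    written = 0
    try:
        with os.fdopen(fd, "wb") as out:
            while True:
                chunk = source.read(chunk_size)
                if not chunk:
                    break
                written += len(chunk)
                # Check before writing so the file never exceeds the limit.
                if written > max_bytes:
                    raise UploadTooLarge(f"upload exceeds {max_bytes} bytes")
                out.write(chunk)
    except BaseException:
        # Do not leave a partial file behind on any failure.
        os.unlink(path)
        raise
    return path


if __name__ == "__main__":
    saved = stream_to_temp_file(io.BytesIO(b"demo" * 4), max_bytes=1024)
    print(os.path.getsize(saved))
    os.unlink(saved)
```

In an async endpoint the `source.read` call would be awaited instead, and the exception would be translated into an HTTP 413 response.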
@@ -68,15 +79,15 @@ def create_structured_error_response(
     return error_response


-def get_prefect_manager():
-    """Dependency to get the Prefect manager instance"""
-    from src.main import prefect_mgr
-    return prefect_mgr
+def get_temporal_manager():
+    """Dependency to get the Temporal manager instance"""
+    from src.main import temporal_mgr
+    return temporal_mgr


 @router.get("/", response_model=List[WorkflowListItem])
 async def list_workflows(
-    prefect_mgr=Depends(get_prefect_manager)
+    temporal_mgr=Depends(get_temporal_manager)
 ) -> List[WorkflowListItem]:
     """
     List all discovered workflows with their metadata.
@@ -85,7 +96,7 @@ async def list_workflows(
     author, and tags.
     """
     workflows = []
-    for name, info in prefect_mgr.workflows.items():
+    for name, info in temporal_mgr.workflows.items():
         workflows.append(WorkflowListItem(
             name=name,
             version=info.metadata.get("version", "0.6.0"),
@@ -111,7 +122,7 @@ async def get_metadata_schema() -> Dict[str, Any]:
 @router.get("/{workflow_name}/metadata", response_model=WorkflowMetadata)
 async def get_workflow_metadata(
     workflow_name: str,
-    prefect_mgr=Depends(get_prefect_manager)
+    temporal_mgr=Depends(get_temporal_manager)
 ) -> WorkflowMetadata:
     """
     Get complete metadata for a specific workflow.
@@ -126,8 +137,8 @@ async def get_workflow_metadata(
     Raises:
         HTTPException: 404 if workflow not found
     """
-    if workflow_name not in prefect_mgr.workflows:
-        available_workflows = list(prefect_mgr.workflows.keys())
+    if workflow_name not in temporal_mgr.workflows:
+        available_workflows = list(temporal_mgr.workflows.keys())
         error_response = create_structured_error_response(
             error_type="WorkflowNotFound",
             message=f"Workflow '{workflow_name}' not found",
@@ -143,7 +154,7 @@ async def get_workflow_metadata(
             detail=error_response
         )

-    info = prefect_mgr.workflows[workflow_name]
+    info = temporal_mgr.workflows[workflow_name]
     metadata = info.metadata

     return WorkflowMetadata(
@@ -154,9 +165,7 @@ async def get_workflow_metadata(
         tags=metadata.get("tags", []),
         parameters=metadata.get("parameters", {}),
         default_parameters=metadata.get("default_parameters", {}),
-        required_modules=metadata.get("required_modules", []),
-        supported_volume_modes=metadata.get("supported_volume_modes", ["ro", "rw"]),
-        has_custom_docker=info.has_docker
+        required_modules=metadata.get("required_modules", [])
     )


@@ -164,14 +173,14 @@ async def get_workflow_metadata(
 async def submit_workflow(
     workflow_name: str,
     submission: WorkflowSubmission,
-    prefect_mgr=Depends(get_prefect_manager)
+    temporal_mgr=Depends(get_temporal_manager)
 ) -> RunSubmissionResponse:
     """
-    Submit a workflow for execution with volume mounting.
+    Submit a workflow for execution.

     Args:
         workflow_name: Name of the workflow to execute
-        submission: Submission parameters including target path and volume mode
+        submission: Submission parameters including target path and parameters

     Returns:
         Run submission response with run_id and initial status
@@ -179,8 +188,8 @@ async def submit_workflow(
     Raises:
         HTTPException: 404 if workflow not found, 400 for invalid parameters
     """
-    if workflow_name not in prefect_mgr.workflows:
-        available_workflows = list(prefect_mgr.workflows.keys())
+    if workflow_name not in temporal_mgr.workflows:
+        available_workflows = list(temporal_mgr.workflows.keys())
         error_response = create_structured_error_response(
             error_type="WorkflowNotFound",
             message=f"Workflow '{workflow_name}' not found",
@@ -197,31 +206,36 @@ async def submit_workflow(
         )

     try:
-        # Convert ResourceLimits to dict if provided
-        resource_limits_dict = None
-        if submission.resource_limits:
-            resource_limits_dict = {
-                "cpu_limit": submission.resource_limits.cpu_limit,
-                "memory_limit": submission.resource_limits.memory_limit,
-                "cpu_request": submission.resource_limits.cpu_request,
-                "memory_request": submission.resource_limits.memory_request
-            }
+        # Upload target file to MinIO and get target_id
+        target_path = Path(submission.target_path)
+        if not target_path.exists():
+            raise ValueError(f"Target path does not exist: {submission.target_path}")

-        # Submit the workflow with enhanced parameters
-        flow_run = await prefect_mgr.submit_workflow(
-            workflow_name=workflow_name,
-            target_path=submission.target_path,
-            volume_mode=submission.volume_mode,
-            parameters=submission.parameters,
-            resource_limits=resource_limits_dict,
-            additional_volumes=submission.additional_volumes,
-            timeout=submission.timeout
+        # Upload target (using anonymous user for now)
+        target_id = await temporal_mgr.upload_target(
+            file_path=target_path,
+            user_id="api-user",
+            metadata={"workflow": workflow_name}
         )

-        run_id = str(flow_run.id)
+        # Merge default parameters with user parameters
+        workflow_info = temporal_mgr.workflows[workflow_name]
+        metadata = workflow_info.metadata or {}
+        defaults = metadata.get("default_parameters", {})
+        user_params = submission.parameters or {}
+        workflow_params = {**defaults, **user_params}
+
+        # Start workflow execution
+        handle = await temporal_mgr.run_workflow(
+            workflow_name=workflow_name,
+            target_id=target_id,
+            workflow_params=workflow_params
+        )
+
+        run_id = handle.id

         # Initialize fuzzing tracking if this looks like a fuzzing workflow
-        workflow_info = prefect_mgr.workflows.get(workflow_name, {})
+        workflow_info = temporal_mgr.workflows.get(workflow_name, {})
         workflow_tags = workflow_info.metadata.get("tags", []) if hasattr(workflow_info, 'metadata') else []
         if "fuzzing" in workflow_tags or "fuzz" in workflow_name.lower():
             from src.api.fuzzing import initialize_fuzzing_tracking
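The submission hunk above merges metadata defaults with user parameters using `{**defaults, **user_params}`, and the commit message notes that workflows receive their parameters as an ordered positional list with `target_path` and `volume_mode` filtered out. A sketch of how those three steps could fit together follows. It is an illustration only: `build_workflow_args` and its `param_order` argument are hypothetical, and how the real manager determines argument order is not shown in this diff.

```python
from typing import Any, Dict, List, Optional

# Parameters the commit message describes as metadata-only.
METADATA_ONLY = {"target_path", "volume_mode"}


def build_workflow_args(target_id: str,
                        defaults: Dict[str, Any],
                        user_params: Optional[Dict[str, Any]],
                        param_order: List[str]) -> List[Any]:
    """Merge, filter, and order parameters into a positional argument list."""
    # Later keys win, so user values override defaults.
    merged = {**defaults, **(user_params or {})}
    # Drop parameters that configure the system rather than the workflow.
    merged = {k: v for k, v in merged.items() if k not in METADATA_ONLY}
    # Missing parameters become None so positions stay aligned.
    return [target_id] + [merged.get(name) for name in param_order]


if __name__ == "__main__":
    args = build_workflow_args(
        "target-123",
        {"max_iterations": 100000, "timeout_seconds": 300, "volume_mode": "ro"},
        {"max_iterations": 500},
        ["target_file", "max_iterations", "timeout_seconds"],
    )
    print(args)
```

The resulting list is the kind of value that would be handed to a positional `args=` parameter, matching the ordering the commit message gives for `atheris_fuzzing`.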
@@ -229,7 +243,7 @@ async def submit_workflow(

         return RunSubmissionResponse(
             run_id=run_id,
-            status=flow_run.state.name if flow_run.state else "PENDING",
+            status="RUNNING",
             workflow=workflow_name,
             message=f"Workflow '{workflow_name}' submitted successfully"
         )
@@ -261,17 +275,13 @@ async def submit_workflow(
         error_type = "WorkflowSubmissionError"

         # Detect specific error patterns
-        if "deployment" in error_message.lower():
-            error_type = "DeploymentError"
-            deployment_info = {
-                "status": "failed",
-                "error": error_message
-            }
+        if "workflow" in error_message.lower() and "not found" in error_message.lower():
+            error_type = "WorkflowError"
             suggestions.extend([
-                "Check if Prefect server is running and accessible",
-                "Verify Docker is running and has sufficient resources",
-                "Check container image availability",
-                "Ensure volume paths exist and are accessible"
+                "Check if Temporal server is running and accessible",
+                "Verify workflow workers are running",
+                "Check if workflow is registered with correct vertical",
+                "Ensure Docker is running and has sufficient resources"
             ])

         elif "volume" in error_message.lower() or "mount" in error_message.lower():
@@ -324,25 +334,200 @@ async def submit_workflow(
|
||||
)
|
||||
|
||||
|
||||
@router.get("/{workflow_name}/parameters")
|
||||
async def get_workflow_parameters(
|
||||
@router.post("/{workflow_name}/upload-and-submit", response_model=RunSubmissionResponse)
|
||||
async def upload_and_submit_workflow(
|
||||
workflow_name: str,
|
||||
prefect_mgr=Depends(get_prefect_manager)
|
||||
file: UploadFile = File(..., description="Target file or tarball to analyze"),
|
||||
parameters: Optional[str] = Form(None, description="JSON-encoded workflow parameters"),
|
||||
timeout: Optional[int] = Form(None, description="Timeout in seconds"),
|
||||
temporal_mgr=Depends(get_temporal_manager)
|
||||
) -> RunSubmissionResponse:
|
||||
"""
|
||||
Upload a target file/tarball and submit workflow for execution.
|
||||
|
||||
This endpoint accepts multipart/form-data uploads and is the recommended
|
||||
way to submit workflows from remote CLI clients.
|
||||
|
||||
Args:
|
||||
workflow_name: Name of the workflow to execute
|
||||
file: Target file or tarball (compressed directory)
|
||||
parameters: JSON string of workflow parameters (optional)
|
||||
timeout: Execution timeout in seconds (optional)
|
||||
|
||||
Returns:
|
||||
Run submission response with run_id and initial status
|
||||
|
||||
Raises:
|
||||
HTTPException: 404 if workflow not found, 400 for invalid parameters,
|
||||
413 if file too large
|
||||
"""
|
||||
if workflow_name not in temporal_mgr.workflows:
|
||||
available_workflows = list(temporal_mgr.workflows.keys())
|
||||
error_response = create_structured_error_response(
|
||||
error_type="WorkflowNotFound",
|
||||
message=f"Workflow '{workflow_name}' not found",
|
||||
workflow_name=workflow_name,
|
||||
suggestions=[
|
||||
f"Available workflows: {', '.join(available_workflows)}",
|
||||
"Use GET /workflows/ to see all available workflows"
|
||||
]
|
||||
)
|
||||
raise HTTPException(status_code=404, detail=error_response)
|
||||
|
||||
temp_file_path = None
|
||||
|
||||
try:
|
||||
# Validate file size
|
||||
file_size = 0
|
||||
chunk_size = 1024 * 1024 # 1MB chunks
|
||||
|
||||
# Create temporary file
|
||||
temp_fd, temp_file_path = tempfile.mkstemp(suffix=".tar.gz")
|
||||
|
||||
logger.info(f"Receiving file upload for workflow '{workflow_name}': {file.filename}")
|
||||
|
||||
# Stream file to disk
|
||||
with open(temp_fd, 'wb') as temp_file:
|
||||
while True:
|
||||
chunk = await file.read(chunk_size)
|
||||
if not chunk:
|
||||
break
|
||||
|
||||
file_size += len(chunk)
|
||||
|
||||
# Check size limit
|
||||
if file_size > MAX_UPLOAD_SIZE:
|
||||
raise HTTPException(
|
||||
status_code=413,
|
||||
detail=create_structured_error_response(
|
||||
error_type="FileTooLarge",
|
||||
message=f"File size exceeds maximum allowed size of {MAX_UPLOAD_SIZE / (1024**3):.1f} GB",
|
||||
workflow_name=workflow_name,
|
||||
suggestions=[
|
||||
"Reduce the size of your target directory",
|
||||
"Exclude unnecessary files (build artifacts, dependencies, etc.)",
|
||||
"Consider splitting into smaller analysis targets"
|
||||
]
|
||||
)
|
||||
)
|
||||
|
||||
temp_file.write(chunk)
|
||||
|
||||
logger.info(f"Received file: {file_size / (1024**2):.2f} MB")
|
||||
|
||||
# Parse parameters
|
||||
workflow_params = {}
|
||||
if parameters:
|
||||
try:
|
||||
import json
|
||||
workflow_params = json.loads(parameters)
|
||||
if not isinstance(workflow_params, dict):
|
||||
raise ValueError("Parameters must be a JSON object")
|
||||
except (json.JSONDecodeError, ValueError) as e:
|
||||
raise HTTPException(
|
||||
status_code=400,
|
||||
detail=create_structured_error_response(
|
||||
error_type="InvalidParameters",
|
||||
message=f"Invalid parameters JSON: {e}",
|
||||
workflow_name=workflow_name,
|
||||
suggestions=["Ensure parameters is a valid JSON object"]
|
||||
)
|
||||
)
|
||||
|
||||
# Upload to MinIO
|
||||
target_id = await temporal_mgr.upload_target(
|
||||
file_path=Path(temp_file_path),
|
||||
user_id="api-user",
|
||||
metadata={
|
||||
"workflow": workflow_name,
|
||||
"original_filename": file.filename,
|
||||
"upload_method": "multipart"
|
||||
}
|
||||
)
|
||||
|
||||
logger.info(f"Uploaded to MinIO with target_id: {target_id}")
|
||||
|
||||
# Merge default parameters with user parameters
|
||||
workflow_info = temporal_mgr.workflows.get(workflow_name)
|
||||
metadata = workflow_info.metadata or {}
|
||||
defaults = metadata.get("default_parameters", {})
|
||||
workflow_params = {**defaults, **workflow_params}
|
||||
|
||||
# Start workflow execution
|
||||
handle = await temporal_mgr.run_workflow(
|
||||
workflow_name=workflow_name,
|
||||
target_id=target_id,
|
||||
workflow_params=workflow_params
|
||||
)
|
||||
|
||||
run_id = handle.id
|
||||
|
||||
# Initialize fuzzing tracking if needed
|
||||
workflow_info = temporal_mgr.workflows.get(workflow_name, {})
|
||||
workflow_tags = (getattr(workflow_info, "metadata", None) or {}).get("tags", [])
|
||||
if "fuzzing" in workflow_tags or "fuzz" in workflow_name.lower():
|
||||
from src.api.fuzzing import initialize_fuzzing_tracking
|
||||
initialize_fuzzing_tracking(run_id, workflow_name)
|
||||
|
||||
return RunSubmissionResponse(
|
||||
run_id=run_id,
|
||||
status="RUNNING",
|
||||
workflow=workflow_name,
|
||||
message=f"Workflow '{workflow_name}' submitted successfully with uploaded target"
|
||||
)
|
||||
|
||||
except HTTPException:
|
||||
raise
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to upload and submit workflow '{workflow_name}': {e}")
|
||||
logger.error(f"Traceback: {traceback.format_exc()}")
|
||||
|
||||
error_response = create_structured_error_response(
|
||||
error_type="WorkflowSubmissionError",
|
||||
message=f"Failed to process upload and submit workflow: {str(e)}",
|
||||
workflow_name=workflow_name,
|
||||
suggestions=[
|
||||
"Check if the uploaded file is a valid tarball",
|
||||
"Verify MinIO storage is accessible",
|
||||
"Check backend logs for detailed error information",
|
||||
"Ensure Temporal workers are running"
|
||||
]
|
||||
)
|
||||
|
||||
raise HTTPException(status_code=500, detail=error_response)
|
||||
|
||||
finally:
|
||||
# Cleanup temporary file
|
||||
if temp_file_path and Path(temp_file_path).exists():
|
||||
try:
|
||||
Path(temp_file_path).unlink()
|
||||
logger.debug(f"Cleaned up temp file: {temp_file_path}")
|
||||
except Exception as e:
|
||||
logger.warning(f"Failed to cleanup temp file {temp_file_path}: {e}")
|
||||
|
||||
|
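The upload handler above streams the request body to a temporary file in 1 MB chunks and aborts as soon as the running total exceeds the limit. A minimal, framework-free sketch of that pattern follows. The names `stream_to_tempfile` and `UploadTooLarge` and the 10 MiB cap are hypothetical, chosen for illustration; the real `MAX_UPLOAD_SIZE` value is not shown in this diff.

```python
import io
import os
import tempfile

MAX_UPLOAD_SIZE = 10 * 1024 * 1024  # hypothetical cap for this sketch only
CHUNK_SIZE = 1024 * 1024            # 1 MiB chunks, as in the handler above


class UploadTooLarge(Exception):
    """Hypothetical error type; the handler raises an HTTP 413 instead."""


def stream_to_tempfile(stream, max_size=MAX_UPLOAD_SIZE, chunk_size=CHUNK_SIZE):
    """Copy a binary stream to a temp file, aborting once max_size is exceeded.

    Returns (path, bytes_written). The caller owns the file and must delete it.
    """
    fd, path = tempfile.mkstemp(suffix=".tar.gz")
    written = 0
    try:
        with open(fd, "wb") as out:  # open() takes ownership of the descriptor
            while True:
                chunk = stream.read(chunk_size)
                if not chunk:
                    break
                written += len(chunk)
                # Check before writing so an oversized chunk never hits disk
                if written > max_size:
                    raise UploadTooLarge(f"upload exceeds {max_size} bytes")
                out.write(chunk)
    except BaseException:
        os.unlink(path)  # do not leave a partial upload behind
        raise
    return path, written


if __name__ == "__main__":
    path, size = stream_to_tempfile(io.BytesIO(b"x" * 2048))
    print(size)  # 2048
    os.unlink(path)
```

Checking the limit inside the read loop keeps memory use bounded by the chunk size and avoids trusting a client-supplied content length.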
||||
@router.get("/{workflow_name}/worker-info")
|
||||
async def get_workflow_worker_info(
|
||||
workflow_name: str,
|
||||
temporal_mgr=Depends(get_temporal_manager)
|
||||
) -> Dict[str, Any]:
|
||||
"""
|
||||
Get worker information for a workflow.
|
||||
|
||||
Returns details about which worker is required to execute this workflow,
|
||||
including container name, task queue, and vertical.
|
||||
|
||||
Args:
|
||||
workflow_name: Name of the workflow
|
||||
|
||||
Returns:
|
||||
Worker information including vertical, container name, and task queue
|
||||
|
||||
Raises:
|
||||
HTTPException: 404 if workflow not found, 500 if its metadata has no vertical
|
||||
"""
|
||||
if workflow_name not in temporal_mgr.workflows:
|
||||
available_workflows = list(temporal_mgr.workflows.keys())
|
||||
error_response = create_structured_error_response(
|
||||
error_type="WorkflowNotFound",
|
||||
message=f"Workflow '{workflow_name}' not found",
|
||||
@@ -357,7 +542,70 @@ async def get_workflow_parameters(
|
||||
detail=error_response
|
||||
)
|
||||
|
||||
info = temporal_mgr.workflows[workflow_name]
|
||||
metadata = info.metadata
|
||||
|
||||
# Extract vertical from metadata
|
||||
vertical = metadata.get("vertical")
|
||||
|
||||
if not vertical:
|
||||
error_response = create_structured_error_response(
|
||||
error_type="MissingVertical",
|
||||
message=f"Workflow '{workflow_name}' does not specify a vertical in metadata",
|
||||
workflow_name=workflow_name,
|
||||
suggestions=[
|
||||
"Check workflow metadata.yaml for 'vertical' field",
|
||||
"Contact workflow author for support"
|
||||
]
|
||||
)
|
||||
raise HTTPException(
|
||||
status_code=500,
|
||||
detail=error_response
|
||||
)
|
||||
|
||||
return {
|
||||
"workflow": workflow_name,
|
||||
"vertical": vertical,
|
||||
"worker_container": f"fuzzforge-worker-{vertical}",
|
||||
"task_queue": f"{vertical}-queue",
|
||||
"required": True
|
||||
}
|
||||
|
||||
|
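The worker-info endpoint above derives the container name and task queue purely from the workflow's `vertical` metadata field. A small sketch of that mapping, written as a plain function so it can be exercised without the API; the function name `worker_info` and the use of `ValueError` (instead of the HTTP 500 the endpoint returns) are assumptions made for illustration.

```python
from typing import Any, Dict


def worker_info(workflow_name: str, metadata: Dict[str, Any]) -> Dict[str, Any]:
    """Build the worker description for a workflow from its metadata."""
    vertical = metadata.get("vertical")
    if not vertical:
        # The endpoint reports this as a "MissingVertical" structured error
        raise ValueError(
            f"Workflow '{workflow_name}' does not specify a vertical in metadata"
        )
    return {
        "workflow": workflow_name,
        "vertical": vertical,
        # Naming conventions taken from the endpoint above
        "worker_container": f"fuzzforge-worker-{vertical}",
        "task_queue": f"{vertical}-queue",
        "required": True,
    }


if __name__ == "__main__":
    info = worker_info("rust_test", {"vertical": "rust"})
    print(info["worker_container"])  # fuzzforge-worker-rust
    print(info["task_queue"])        # rust-queue
```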
||||
@router.get("/{workflow_name}/parameters")
|
||||
async def get_workflow_parameters(
|
||||
workflow_name: str,
|
||||
temporal_mgr=Depends(get_temporal_manager)
|
||||
) -> Dict[str, Any]:
|
||||
"""
|
||||
Get the parameters schema for a workflow.
|
||||
|
||||
Args:
|
||||
workflow_name: Name of the workflow
|
||||
|
||||
Returns:
|
||||
Parameters schema with types, descriptions, and defaults
|
||||
|
||||
Raises:
|
||||
HTTPException: 404 if workflow not found
|
||||
"""
|
||||
if workflow_name not in temporal_mgr.workflows:
|
||||
available_workflows = list(temporal_mgr.workflows.keys())
|
||||
error_response = create_structured_error_response(
|
||||
error_type="WorkflowNotFound",
|
||||
message=f"Workflow '{workflow_name}' not found",
|
||||
workflow_name=workflow_name,
|
||||
suggestions=[
|
||||
f"Available workflows: {', '.join(available_workflows)}",
|
||||
"Use GET /workflows/ to see all available workflows"
|
||||
]
|
||||
)
|
||||
raise HTTPException(
|
||||
status_code=404,
|
||||
detail=error_response
|
||||
)
|
||||
|
||||
info = temporal_mgr.workflows[workflow_name]
|
||||
metadata = info.metadata
|
||||
|
||||
# Return parameters with enhanced schema information
|
||||
|
||||
@@ -1,770 +0,0 @@
|
||||
"""
|
||||
Prefect Manager - Core orchestration for workflow deployment and execution
|
||||
"""
|
||||
|
||||
# Copyright (c) 2025 FuzzingLabs
|
||||
#
|
||||
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
|
||||
# at the root of this repository for details.
|
||||
#
|
||||
# After the Change Date (four years from publication), this version of the
|
||||
# Licensed Work will be made available under the Apache License, Version 2.0.
|
||||
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Additional attribution and requirements are provided in the NOTICE file.
|
||||
|
||||
import logging
|
||||
import os
|
||||
import platform
|
||||
import re
|
||||
from pathlib import Path
|
||||
from typing import Dict, Optional, Any
|
||||
from prefect import get_client
|
||||
from prefect.docker import DockerImage
|
||||
from prefect.client.schemas import FlowRun
|
||||
|
||||
from src.core.workflow_discovery import WorkflowDiscovery, WorkflowInfo
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
def get_registry_url(context: str = "default") -> str:
|
||||
"""
|
||||
Get the container registry URL to use for a given operation context.
|
||||
|
||||
Goals:
|
||||
- Work reliably across Linux and macOS Docker Desktop
|
||||
- Prefer in-network service discovery when running inside containers
|
||||
- Allow full override via env vars from docker-compose
|
||||
|
||||
Env overrides:
|
||||
- FUZZFORGE_REGISTRY_PUSH_URL: used for image builds/pushes
|
||||
- FUZZFORGE_REGISTRY_PULL_URL: used for workers to pull images
|
||||
"""
|
||||
# Normalize context
|
||||
ctx = (context or "default").lower()
|
||||
|
||||
# Always honor explicit overrides first
|
||||
if ctx in ("push", "build"):
|
||||
push_url = os.getenv("FUZZFORGE_REGISTRY_PUSH_URL")
|
||||
if push_url:
|
||||
logger.debug("Using FUZZFORGE_REGISTRY_PUSH_URL: %s", push_url)
|
||||
return push_url
|
||||
# Default to host-published registry for Docker daemon operations
|
||||
return "localhost:5001"
|
||||
|
||||
if ctx == "pull":
|
||||
pull_url = os.getenv("FUZZFORGE_REGISTRY_PULL_URL")
|
||||
if pull_url:
|
||||
logger.debug("Using FUZZFORGE_REGISTRY_PULL_URL: %s", pull_url)
|
||||
return pull_url
|
||||
# Prefect worker pulls via host Docker daemon as well
|
||||
return "localhost:5001"
|
||||
|
||||
# Default/fallback
|
||||
return os.getenv("FUZZFORGE_REGISTRY_PULL_URL", os.getenv("FUZZFORGE_REGISTRY_PUSH_URL", "localhost:5001"))
|
||||
|
||||
|
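The resolution order in `get_registry_url` above (explicit env override first, then the host-published default) can be checked in isolation. The sketch below mirrors that logic but takes the environment as a parameter so it can be tested without mutating `os.environ`; the name `resolve_registry_url`, the `env` parameter, and the `registry:5000` value are hypothetical.

```python
import os
from typing import Mapping, Optional

DEFAULT_REGISTRY = "localhost:5001"  # default used by the function above


def resolve_registry_url(
    context: str = "default", env: Optional[Mapping[str, str]] = None
) -> str:
    """Pick a registry URL for a push/build, pull, or default context."""
    env = os.environ if env is None else env
    ctx = (context or "default").lower()

    if ctx in ("push", "build"):
        # An unset or empty override falls back to the default
        return env.get("FUZZFORGE_REGISTRY_PUSH_URL") or DEFAULT_REGISTRY
    if ctx == "pull":
        return env.get("FUZZFORGE_REGISTRY_PULL_URL") or DEFAULT_REGISTRY

    # Any other context: prefer the pull URL, then the push URL, then the default
    return env.get(
        "FUZZFORGE_REGISTRY_PULL_URL",
        env.get("FUZZFORGE_REGISTRY_PUSH_URL", DEFAULT_REGISTRY),
    )


if __name__ == "__main__":
    env = {"FUZZFORGE_REGISTRY_PULL_URL": "registry:5000"}
    print(resolve_registry_url("pull", env))  # registry:5000
    print(resolve_registry_url("push", env))  # localhost:5001
```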
||||
def _compose_project_name(default: str = "fuzzforge") -> str:
|
||||
"""Return the docker-compose project name used for network/volume naming.
|
||||
|
||||
Always returns 'fuzzforge' regardless of environment variables.
|
||||
"""
|
||||
return "fuzzforge"
|
||||
|
||||
|
||||
class PrefectManager:
|
||||
"""
|
||||
Manages Prefect deployments and flow runs for discovered workflows.
|
||||
|
||||
This class handles:
|
||||
- Workflow discovery and registration
|
||||
- Docker image building through Prefect
|
||||
- Deployment creation and management
|
||||
- Flow run submission with volume mounting
|
||||
- Findings retrieval from completed runs
|
||||
"""
|
||||
|
||||
def __init__(self, workflows_dir: Optional[Path] = None):
|
||||
"""
|
||||
Initialize the Prefect manager.
|
||||
|
||||
Args:
|
||||
workflows_dir: Path to the workflows directory (default: toolbox/workflows)
|
||||
"""
|
||||
if workflows_dir is None:
|
||||
workflows_dir = Path("toolbox/workflows")
|
||||
|
||||
self.discovery = WorkflowDiscovery(workflows_dir)
|
||||
self.workflows: Dict[str, WorkflowInfo] = {}
|
||||
self.deployments: Dict[str, str] = {} # workflow_name -> deployment_id
|
||||
|
||||
# Security: Define allowed and forbidden paths for host mounting
|
||||
self.allowed_base_paths = [
|
||||
"/tmp",
|
||||
"/home",
|
||||
"/Users", # macOS users
|
||||
"/opt",
|
||||
"/var/tmp",
|
||||
"/workspace", # Common container workspace
|
||||
"/app" # Container application directory (for test projects)
|
||||
]
|
||||
|
||||
self.forbidden_paths = [
|
||||
"/etc",
|
||||
"/root",
|
||||
"/var/run",
|
||||
"/sys",
|
||||
"/proc",
|
||||
"/dev",
|
||||
"/boot",
|
||||
"/var/lib/docker", # Critical Docker data
|
||||
"/var/log", # System logs
|
||||
"/usr/bin", # System binaries
|
||||
"/usr/sbin",
|
||||
"/sbin",
|
||||
"/bin"
|
||||
]
|
||||
|
||||
@staticmethod
|
||||
def _parse_memory_to_bytes(memory_str: str) -> int:
|
||||
"""
|
||||
Parse memory string (like '512Mi', '1Gi') to bytes.
|
||||
|
||||
Args:
|
||||
memory_str: Memory string with unit suffix
|
||||
|
||||
Returns:
|
||||
Memory in bytes
|
||||
|
||||
Raises:
|
||||
ValueError: If format is invalid
|
||||
"""
|
||||
if not memory_str:
|
||||
return 0
|
||||
|
||||
match = re.match(r'^(\d+(?:\.\d+)?)\s*([GMK]i?)$', memory_str.strip())
|
||||
if not match:
|
||||
raise ValueError(f"Invalid memory format: {memory_str}. Expected format like '512Mi', '1Gi'")
|
||||
|
||||
value, unit = match.groups()
|
||||
value = float(value)
|
||||
|
||||
# Convert to bytes; bare K/M/G are treated as binary units, same as Ki/Mi/Gi
|
||||
if unit in ['K', 'Ki']:
|
||||
multiplier = 1024
|
||||
elif unit in ['M', 'Mi']:
|
||||
multiplier = 1024 * 1024
|
||||
elif unit in ['G', 'Gi']:
|
||||
multiplier = 1024 * 1024 * 1024
|
||||
else:
|
||||
raise ValueError(f"Unsupported memory unit: {unit}")
|
||||
|
||||
return int(value * multiplier)
|
||||
|
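For reference, the memory parser above can be expressed as a standalone function with a lookup table in place of the `if`/`elif` chain. This is a sketch with the same regular expression and the same binary multipliers; the module-level names are hypothetical.

```python
import re

# Bare K/M/G are treated as binary units, matching the method above
_UNIT_MULTIPLIERS = {
    "K": 1024, "Ki": 1024,
    "M": 1024 ** 2, "Mi": 1024 ** 2,
    "G": 1024 ** 3, "Gi": 1024 ** 3,
}
_MEMORY_RE = re.compile(r"^(\d+(?:\.\d+)?)\s*([GMK]i?)$")


def parse_memory_to_bytes(memory_str: str) -> int:
    """Parse strings such as '512Mi' or '1Gi' into a byte count."""
    if not memory_str:
        return 0
    match = _MEMORY_RE.match(memory_str.strip())
    if not match:
        raise ValueError(
            f"Invalid memory format: {memory_str}. Expected format like '512Mi', '1Gi'"
        )
    value, unit = match.groups()
    return int(float(value) * _UNIT_MULTIPLIERS[unit])


if __name__ == "__main__":
    print(parse_memory_to_bytes("512Mi"))  # 536870912
    print(parse_memory_to_bytes("1Gi"))    # 1073741824
```

Note that suffixes such as `MB` or lowercase `mi` do not match the pattern and raise `ValueError`.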
||||
@staticmethod
|
||||
def _parse_cpu_to_millicores(cpu_str: str) -> int:
|
||||
"""
|
||||
Parse CPU string (like '500m', '1', '2.5') to millicores.
|
||||
|
||||
Args:
|
||||
cpu_str: CPU string
|
||||
|
||||
Returns:
|
||||
CPU in millicores (1 core = 1000 millicores)
|
||||
|
||||
Raises:
|
||||
ValueError: If format is invalid
|
||||
"""
|
||||
if not cpu_str:
|
||||
return 0
|
||||
|
||||
cpu_str = cpu_str.strip()
|
||||
|
||||
# Handle millicores format (e.g., '500m')
|
||||
if cpu_str.endswith('m'):
|
||||
try:
|
||||
return int(cpu_str[:-1])
|
||||
except ValueError:
|
||||
raise ValueError(f"Invalid CPU format: {cpu_str}")
|
||||
|
||||
# Handle core format (e.g., '1', '2.5')
|
||||
try:
|
||||
cores = float(cpu_str)
|
||||
return int(cores * 1000) # Convert to millicores
|
||||
except ValueError:
|
||||
raise ValueError(f"Invalid CPU format: {cpu_str}")
|
||||
|
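A standalone sketch of the CPU parser above, with the two `try` blocks folded into one. The behavior is intended to match the method: a trailing `m` means the value is already in millicores, otherwise the value is a core count multiplied by 1000. The `from None` on the re-raise is an addition of this sketch, used to hide the inner conversion error.

```python
def parse_cpu_to_millicores(cpu_str: str) -> int:
    """Parse CPU strings such as '500m', '1', or '2.5' into millicores."""
    if not cpu_str:
        return 0
    cpu_str = cpu_str.strip()
    try:
        if cpu_str.endswith("m"):
            # Millicore form, e.g. '500m'
            return int(cpu_str[:-1])
        # Core form, e.g. '1' or '2.5'; 1 core = 1000 millicores
        return int(float(cpu_str) * 1000)
    except ValueError:
        raise ValueError(f"Invalid CPU format: {cpu_str}") from None


if __name__ == "__main__":
    print(parse_cpu_to_millicores("500m"))  # 500
    print(parse_cpu_to_millicores("2.5"))   # 2500
```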
||||
def _extract_resource_requirements(self, workflow_info: WorkflowInfo) -> Dict[str, str]:
|
||||
"""
|
||||
Extract resource requirements from workflow metadata.
|
||||
|
||||
Args:
|
||||
workflow_info: Workflow information with metadata
|
||||
|
||||
Returns:
|
||||
Dictionary with resource requirements in Docker format
|
||||
"""
|
||||
metadata = workflow_info.metadata
|
||||
requirements = metadata.get("requirements", {})
|
||||
resources = requirements.get("resources", {})
|
||||
|
||||
resource_config = {}
|
||||
|
||||
# Extract memory requirement
|
||||
memory = resources.get("memory")
|
||||
if memory:
|
||||
try:
|
||||
# Validate memory format and store original string for Docker
|
||||
self._parse_memory_to_bytes(memory)
|
||||
resource_config["memory"] = memory
|
||||
except ValueError as e:
|
||||
logger.warning(f"Invalid memory requirement in {workflow_info.name}: {e}")
|
||||
|
||||
# Extract CPU requirement
|
||||
cpu = resources.get("cpu")
|
||||
if cpu:
|
||||
try:
|
||||
# Validate CPU format and store original string for Docker
|
||||
self._parse_cpu_to_millicores(cpu)
|
||||
resource_config["cpus"] = cpu
|
||||
except ValueError as e:
|
||||
logger.warning(f"Invalid CPU requirement in {workflow_info.name}: {e}")
|
||||
|
||||
# Extract timeout
|
||||
timeout = resources.get("timeout")
|
||||
if timeout and isinstance(timeout, int):
|
||||
resource_config["timeout"] = str(timeout)
|
||||
|
||||
return resource_config
|
||||
|
||||
async def initialize(self):
|
||||
"""
|
||||
Initialize the manager by discovering and deploying all workflows.
|
||||
|
||||
This method:
|
||||
1. Discovers all valid workflows in the workflows directory
|
||||
2. Validates their metadata
|
||||
3. Deploys each workflow to Prefect with Docker images
|
||||
"""
|
||||
try:
|
||||
# Discover workflows
|
||||
self.workflows = await self.discovery.discover_workflows()
|
||||
|
||||
if not self.workflows:
|
||||
logger.warning("No workflows discovered")
|
||||
return
|
||||
|
||||
logger.info(f"Discovered {len(self.workflows)} workflows: {list(self.workflows.keys())}")
|
||||
|
||||
# Deploy each workflow
|
||||
for name, info in self.workflows.items():
|
||||
try:
|
||||
await self._deploy_workflow(name, info)
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to deploy workflow '{name}': {e}")
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to initialize Prefect manager: {e}")
|
||||
raise
|
||||
|
||||
async def _deploy_workflow(self, name: str, info: WorkflowInfo):
|
||||
"""
|
||||
Deploy a single workflow to Prefect with Docker image.
|
||||
|
||||
Args:
|
||||
name: Workflow name
|
||||
info: Workflow information including metadata and paths
|
||||
"""
|
||||
logger.info(f"Deploying workflow '{name}'...")
|
||||
|
||||
# Get the flow function from registry
|
||||
flow_func = self.discovery.get_flow_function(name)
|
||||
if not flow_func:
|
||||
logger.error(
|
||||
f"Failed to get flow function for '{name}' from registry. "
|
||||
f"Ensure the workflow is properly registered in toolbox/workflows/registry.py"
|
||||
)
|
||||
return
|
||||
|
||||
# Use the mandatory Dockerfile with absolute paths for Docker Compose
|
||||
# Get absolute paths for build context and dockerfile
|
||||
toolbox_path = info.path.parent.parent.resolve()
|
||||
dockerfile_abs_path = info.dockerfile.resolve()
|
||||
|
||||
# Calculate relative dockerfile path from toolbox context
|
||||
try:
|
||||
dockerfile_rel_path = dockerfile_abs_path.relative_to(toolbox_path)
|
||||
except ValueError:
|
||||
# If relative path fails, use the workflow-specific path
|
||||
dockerfile_rel_path = Path("workflows") / name / "Dockerfile"
|
||||
|
||||
# Determine deployment strategy based on Dockerfile presence
|
||||
base_image = "prefecthq/prefect:3-python3.11"
|
||||
has_custom_dockerfile = info.has_docker and info.dockerfile.exists()
|
||||
|
||||
logger.info(f"=== DEPLOYMENT DEBUG for '{name}' ===")
|
||||
logger.info(f"info.has_docker: {info.has_docker}")
|
||||
logger.info(f"info.dockerfile: {info.dockerfile}")
|
||||
logger.info(f"info.dockerfile.exists(): {info.dockerfile.exists()}")
|
||||
logger.info(f"has_custom_dockerfile: {has_custom_dockerfile}")
|
||||
logger.info(f"toolbox_path: {toolbox_path}")
|
||||
logger.info(f"dockerfile_rel_path: {dockerfile_rel_path}")
|
||||
|
||||
if has_custom_dockerfile:
|
||||
logger.info(f"Workflow '{name}' has custom Dockerfile - building custom image")
|
||||
# Decide whether to use registry or keep images local to host engine
|
||||
import os
|
||||
# Default to using the local registry; set FUZZFORGE_USE_REGISTRY=false to bypass (not recommended)
|
||||
use_registry = os.getenv("FUZZFORGE_USE_REGISTRY", "true").lower() == "true"
|
||||
|
||||
if use_registry:
|
||||
registry_url = get_registry_url(context="push")
|
||||
image_spec = DockerImage(
|
||||
name=f"{registry_url}/fuzzforge/{name}",
|
||||
tag="latest",
|
||||
dockerfile=str(dockerfile_rel_path),
|
||||
context=str(toolbox_path)
|
||||
)
|
||||
deploy_image = f"{registry_url}/fuzzforge/{name}:latest"
|
||||
build_custom = True
|
||||
push_custom = True
|
||||
logger.info(f"Using registry: {registry_url} for '{name}'")
|
||||
else:
|
||||
# Single-host mode: build into host engine cache; no push required
|
||||
image_spec = DockerImage(
|
||||
name=f"fuzzforge/{name}",
|
||||
tag="latest",
|
||||
dockerfile=str(dockerfile_rel_path),
|
||||
context=str(toolbox_path)
|
||||
)
|
||||
deploy_image = f"fuzzforge/{name}:latest"
|
||||
build_custom = True
|
||||
push_custom = False
|
||||
logger.info("Using single-host image (no registry push): %s", deploy_image)
|
||||
else:
|
||||
logger.info(f"Workflow '{name}' using base image - no custom dependencies needed")
|
||||
deploy_image = base_image
|
||||
build_custom = False
|
||||
push_custom = False
|
||||
|
||||
# Pre-validate registry connectivity when pushing
|
||||
if push_custom:
|
||||
try:
|
||||
from .setup import validate_registry_connectivity
|
||||
await validate_registry_connectivity(registry_url)
|
||||
logger.info(f"Registry connectivity validated for {registry_url}")
|
||||
except Exception as e:
|
||||
logger.error(f"Registry connectivity validation failed for {registry_url}: {e}")
|
||||
raise RuntimeError(f"Cannot deploy workflow '{name}': Registry {registry_url} is not accessible. {e}")
|
||||
|
||||
# Deploy the workflow
|
||||
try:
|
||||
# Ensure any previous deployment is removed so job variables are updated
|
||||
try:
|
||||
async with get_client() as client:
|
||||
existing = await client.read_deployment_by_name(
|
||||
f"{name}/{name}-deployment"
|
||||
)
|
||||
if existing:
|
||||
logger.info(f"Removing existing deployment for '{name}' to refresh settings...")
|
||||
await client.delete_deployment(existing.id)
|
||||
except Exception:
|
||||
# If not found or deletion fails, continue with deployment
|
||||
pass
|
||||
|
||||
# Extract resource requirements from metadata
|
||||
workflow_resource_requirements = self._extract_resource_requirements(info)
|
||||
logger.info(f"Workflow '{name}' resource requirements: {workflow_resource_requirements}")
|
||||
|
||||
# Build job variables with resource requirements
|
||||
job_variables = {
|
||||
"image": deploy_image, # Use the worker-accessible registry name
|
||||
"volumes": [], # Populated at run submission with toolbox mount
|
||||
"env": {
|
||||
"PYTHONPATH": "/opt/prefect/toolbox:/opt/prefect",
|
||||
"WORKFLOW_NAME": name
|
||||
}
|
||||
}
|
||||
|
||||
# Add resource requirements to job variables if present
|
||||
if workflow_resource_requirements:
|
||||
job_variables["resources"] = workflow_resource_requirements
|
||||
|
||||
# Prepare deployment parameters
|
||||
deploy_params = {
|
||||
"name": f"{name}-deployment",
|
||||
"work_pool_name": "docker-pool",
|
||||
"image": image_spec if has_custom_dockerfile else deploy_image,
|
||||
"push": push_custom,
|
||||
"build": build_custom,
|
||||
"job_variables": job_variables
|
||||
}
|
||||
|
||||
deployment = await flow_func.deploy(**deploy_params)
|
||||
|
||||
self.deployments[name] = str(deployment.id) if hasattr(deployment, 'id') else name
|
||||
logger.info(f"Successfully deployed workflow '{name}'")
|
||||
|
||||
except Exception as e:
|
||||
# Enhanced error reporting with more context
|
||||
import traceback
|
||||
logger.error(f"Failed to deploy workflow '{name}': {e}")
|
||||
logger.error(f"Deployment traceback: {traceback.format_exc()}")
|
||||
|
||||
# Try to capture Docker-specific context
|
||||
error_context = {
|
||||
"workflow_name": name,
|
||||
"has_dockerfile": has_custom_dockerfile,
|
||||
"image_name": deploy_image if 'deploy_image' in locals() else "unknown",
|
||||
"registry_url": registry_url if 'registry_url' in locals() else "unknown",
|
||||
"error_type": type(e).__name__,
|
||||
"error_message": str(e)
|
||||
}
|
||||
|
||||
# Check for specific error patterns with detailed categorization
|
||||
error_msg_lower = str(e).lower()
|
||||
if "registry" in error_msg_lower and ("no such host" in error_msg_lower or "connection" in error_msg_lower):
|
||||
error_context["category"] = "registry_connectivity_error"
|
||||
error_context["solution"] = f"Cannot reach registry at {error_context['registry_url']}. Check Docker network and registry service."
|
||||
elif "docker" in error_msg_lower:
|
||||
error_context["category"] = "docker_error"
|
||||
if "build" in error_msg_lower:
|
||||
error_context["subcategory"] = "image_build_failed"
|
||||
error_context["solution"] = "Check Dockerfile syntax and dependencies."
|
||||
elif "pull" in error_msg_lower:
|
||||
error_context["subcategory"] = "image_pull_failed"
|
||||
error_context["solution"] = "Check if image exists in registry and network connectivity."
|
||||
elif "push" in error_msg_lower:
|
||||
error_context["subcategory"] = "image_push_failed"
|
||||
error_context["solution"] = f"Check registry connectivity and push permissions to {error_context['registry_url']}."
|
||||
elif "registry" in error_msg_lower:
|
||||
error_context["category"] = "registry_error"
|
||||
error_context["solution"] = "Check registry configuration and accessibility."
|
||||
elif "prefect" in error_msg_lower:
|
||||
error_context["category"] = "prefect_error"
|
||||
error_context["solution"] = "Check Prefect server connectivity and deployment configuration."
|
||||
else:
|
||||
error_context["category"] = "unknown_deployment_error"
|
||||
error_context["solution"] = "Check logs for more specific error details."
|
||||
|
||||
logger.error(f"Deployment error context: {error_context}")
|
||||
|
||||
# Raise enhanced exception with context
|
||||
enhanced_error = Exception(f"Deployment failed for workflow '{name}': {str(e)} | Context: {error_context}")
|
||||
enhanced_error.original_error = e
|
||||
enhanced_error.context = error_context
|
||||
raise enhanced_error
|
||||
|
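The error handling at the end of `_deploy_workflow` above classifies a failure by looking for keywords in the lowercased exception message. The same decision order can be sketched as a pure function, which makes the precedence easier to test; the name `categorize_deployment_error` is hypothetical and the solution strings are copied from the code above.

```python
from typing import Dict


def categorize_deployment_error(message: str, registry_url: str = "unknown") -> Dict[str, str]:
    """Map an error message to a category and suggested fix by keyword matching."""
    msg = message.lower()
    result: Dict[str, str] = {}

    if "registry" in msg and ("no such host" in msg or "connection" in msg):
        result["category"] = "registry_connectivity_error"
        result["solution"] = (
            f"Cannot reach registry at {registry_url}. "
            "Check Docker network and registry service."
        )
    elif "docker" in msg:
        result["category"] = "docker_error"
        # Subcategories are checked in the same order as the original code
        if "build" in msg:
            result["subcategory"] = "image_build_failed"
            result["solution"] = "Check Dockerfile syntax and dependencies."
        elif "pull" in msg:
            result["subcategory"] = "image_pull_failed"
            result["solution"] = "Check if image exists in registry and network connectivity."
        elif "push" in msg:
            result["subcategory"] = "image_push_failed"
            result["solution"] = (
                f"Check registry connectivity and push permissions to {registry_url}."
            )
    elif "registry" in msg:
        result["category"] = "registry_error"
        result["solution"] = "Check registry configuration and accessibility."
    elif "prefect" in msg:
        result["category"] = "prefect_error"
        result["solution"] = "Check Prefect server connectivity and deployment configuration."
    else:
        result["category"] = "unknown_deployment_error"
        result["solution"] = "Check logs for more specific error details."
    return result


if __name__ == "__main__":
    print(categorize_deployment_error("Docker build failed")["subcategory"])  # image_build_failed
```

Because matching is by substring, a Docker error that mentions none of `build`, `pull`, or `push` gets a category but no solution, which is also how the original branch behaves.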
||||
async def submit_workflow(
|
||||
self,
|
||||
workflow_name: str,
|
||||
target_path: str,
|
||||
volume_mode: str = "ro",
|
||||
parameters: Dict[str, Any] = None,
|
||||
resource_limits: Dict[str, str] = None,
|
||||
additional_volumes: list = None,
|
||||
timeout: int = None
|
||||
) -> FlowRun:
|
||||
"""
|
||||
Submit a workflow for execution with volume mounting.
|
||||
|
||||
Args:
|
||||
workflow_name: Name of the workflow to execute
|
||||
target_path: Host path to mount as volume
|
||||
volume_mode: Volume mount mode ("ro" for read-only, "rw" for read-write)
|
||||
parameters: Workflow-specific parameters
|
||||
resource_limits: CPU/memory limits for container
|
||||
additional_volumes: List of additional volume mounts
|
||||
timeout: Timeout in seconds
|
||||
|
||||
Returns:
|
||||
FlowRun object with run information
|
||||
|
||||
Raises:
|
||||
ValueError: If workflow not found or volume mode not supported
|
||||
"""
|
||||
if workflow_name not in self.workflows:
|
||||
raise ValueError(f"Unknown workflow: {workflow_name}")
|
||||
|
||||
# Validate volume mode
|
||||
workflow_info = self.workflows[workflow_name]
|
||||
supported_modes = workflow_info.metadata.get("supported_volume_modes", ["ro", "rw"])
|
||||
|
||||
if volume_mode not in supported_modes:
|
||||
raise ValueError(
|
||||
f"Workflow '{workflow_name}' doesn't support volume mode '{volume_mode}'. "
|
||||
f"Supported modes: {supported_modes}"
|
||||
)
|
||||
|
||||
# Validate target path with security checks
|
||||
self._validate_target_path(target_path)
|
||||
|
||||
# Validate additional volumes if provided
|
||||
if additional_volumes:
|
||||
for volume in additional_volumes:
|
||||
self._validate_target_path(volume.host_path)
|
||||
|
||||
async with get_client() as client:
|
||||
# Get the deployment, auto-redeploy once if missing
|
||||
try:
|
||||
deployment = await client.read_deployment_by_name(
|
||||
f"{workflow_name}/{workflow_name}-deployment"
|
||||
)
|
||||
except Exception as e:
|
||||
import traceback
|
||||
logger.error(f"Failed to find deployment for workflow '{workflow_name}': {e}")
|
||||
logger.error(f"Deployment lookup traceback: {traceback.format_exc()}")
|
||||
|
||||
# Attempt a one-time auto-deploy to recover from startup races
|
||||
try:
|
||||
logger.info(f"Auto-deploying missing workflow '{workflow_name}' and retrying...")
|
||||
await self._deploy_workflow(workflow_name, workflow_info)
|
||||
deployment = await client.read_deployment_by_name(
|
||||
f"{workflow_name}/{workflow_name}-deployment"
|
||||
)
|
||||
except Exception as redeploy_exc:
|
||||
# Enhanced error with context
|
||||
error_context = {
|
||||
"workflow_name": workflow_name,
|
||||
"error_type": type(e).__name__,
|
||||
"error_message": str(e),
|
||||
"redeploy_error": str(redeploy_exc),
|
||||
"available_deployments": list(self.deployments.keys()),
|
||||
}
|
||||
enhanced_error = ValueError(
|
||||
f"Deployment not found and redeploy failed for workflow '{workflow_name}': {e} | Context: {error_context}"
|
||||
)
|
||||
enhanced_error.context = error_context
|
||||
raise enhanced_error
|
||||
|
||||
# Determine the Docker Compose network name and volume names
|
||||
# Hardcoded to 'fuzzforge' to avoid directory name dependencies
|
||||
import os
|
||||
compose_project = "fuzzforge"
|
||||
docker_network = "fuzzforge_default"
|
||||
|
||||
# Build volume mounts
|
||||
# Add toolbox volume mount for workflow code access
|
||||
backend_toolbox_path = "/app/toolbox" # Path in backend container
|
||||
|
||||
# Hardcoded volume names
|
||||
prefect_storage_volume = "fuzzforge_prefect_storage"
|
||||
toolbox_code_volume = "fuzzforge_toolbox_code"
|
||||
|
||||
volumes = [
|
||||
f"{target_path}:/workspace:{volume_mode}",
|
||||
f"{prefect_storage_volume}:/prefect-storage", # Shared storage for results
|
||||
f"{toolbox_code_volume}:/opt/prefect/toolbox:ro" # Mount workflow code
|
||||
]
|
||||
|
||||
# Add additional volumes if provided
|
||||
if additional_volumes:
|
||||
for volume in additional_volumes:
|
||||
volume_spec = f"{volume.host_path}:{volume.container_path}:{volume.mode}"
|
||||
volumes.append(volume_spec)
|
||||
|
||||
# Build environment variables
|
||||
env_vars = {
|
||||
"PREFECT_API_URL": "http://prefect-server:4200/api", # Use internal network hostname
|
||||
"PREFECT_LOGGING_LEVEL": "INFO",
|
||||
"PREFECT_LOCAL_STORAGE_PATH": "/prefect-storage", # Use shared storage
|
||||
"PREFECT_RESULTS_PERSIST_BY_DEFAULT": "true", # Enable result persistence
|
||||
"PREFECT_DEFAULT_RESULT_STORAGE_BLOCK": "local-file-system/fuzzforge-results", # Use our storage block
|
||||
"WORKSPACE_PATH": "/workspace",
|
||||
"VOLUME_MODE": volume_mode,
|
||||
"WORKFLOW_NAME": workflow_name
|
||||
}
|
||||
|
||||
# Add additional volume paths to environment for easy access
|
||||
if additional_volumes:
|
||||
for i, volume in enumerate(additional_volumes):
|
||||
env_vars[f"ADDITIONAL_VOLUME_{i}_PATH"] = volume.container_path
|
||||
|
||||
# Determine which image to use based on workflow configuration
|
||||
workflow_info = self.workflows[workflow_name]
|
||||
has_custom_dockerfile = workflow_info.has_docker and workflow_info.dockerfile.exists()
|
||||
# Use pull context for worker to pull from registry
|
||||
registry_url = get_registry_url(context="pull")
|
||||
workflow_image = f"{registry_url}/fuzzforge/{workflow_name}:latest" if has_custom_dockerfile else "prefecthq/prefect:3-python3.11"
|
||||
logger.debug(f"Worker will pull image: {workflow_image} (Registry: {registry_url})")
|
||||
|
||||
# Configure job variables with volume mounting and network access
|
||||
job_variables = {
|
||||
# Use custom image if available, otherwise base Prefect image
|
||||
"image": workflow_image,
|
||||
"volumes": volumes,
|
||||
"networks": [docker_network], # Connect to Docker Compose network
|
||||
"env": {
|
||||
**env_vars,
|
||||
"PYTHONPATH": "/opt/prefect/toolbox:/opt/prefect/toolbox/workflows",
|
||||
"WORKFLOW_NAME": workflow_name
|
||||
}
|
||||
}
|
||||
|
||||
# Apply resource requirements from workflow metadata and user overrides
|
||||
workflow_resource_requirements = self._extract_resource_requirements(workflow_info)
|
||||
final_resource_config = {}
|
||||
|
||||
# Start with workflow requirements as base
|
||||
if workflow_resource_requirements:
|
||||
final_resource_config.update(workflow_resource_requirements)
|
||||
|
||||
# Apply user-provided resource limits (overrides workflow defaults)
|
||||
if resource_limits:
|
||||
user_resource_config = {}
|
||||
if resource_limits.get("cpu_limit"):
|
||||
user_resource_config["cpus"] = resource_limits["cpu_limit"]
|
||||
if resource_limits.get("memory_limit"):
|
||||
user_resource_config["memory"] = resource_limits["memory_limit"]
|
||||
# Note: cpu_request and memory_request are not directly supported by Docker
|
||||
# but could be used for Kubernetes in the future
|
||||
|
||||
# User overrides take precedence
|
||||
final_resource_config.update(user_resource_config)
|
||||
|
||||
# Apply final resource configuration
|
||||
if final_resource_config:
|
||||
job_variables["resources"] = final_resource_config
|
||||
logger.info(f"Applied resource limits: {final_resource_config}")
|
||||
|
||||
# Merge parameters with defaults from metadata
|
||||
default_params = workflow_info.metadata.get("default_parameters", {})
|
||||
final_params = {**default_params, **(parameters or {})}
|
||||
|
||||
# Set flow parameters that match the flow signature
|
||||
final_params["target_path"] = "/workspace" # Container path where volume is mounted
|
||||
final_params["volume_mode"] = volume_mode
|
||||
|
||||
# Create and submit the flow run
|
||||
# Pass job_variables to ensure network, volumes, and environment are configured
|
||||
logger.info(f"Submitting flow with job_variables: {job_variables}")
|
||||
logger.info(f"Submitting flow with parameters: {final_params}")
|
||||
|
||||
# Prepare flow run creation parameters
|
||||
flow_run_params = {
|
||||
"deployment_id": deployment.id,
|
||||
"parameters": final_params,
|
||||
"job_variables": job_variables
|
||||
}
|
||||
|
||||
# Note: Timeout is handled through workflow-level configuration
|
||||
# Additional timeout configuration can be added to deployment metadata if needed
|
||||
|
||||
flow_run = await client.create_flow_run_from_deployment(**flow_run_params)
|
||||
|
||||
logger.info(
|
||||
f"Submitted workflow '{workflow_name}' with run_id: {flow_run.id}, "
|
||||
f"target: {target_path}, mode: {volume_mode}"
|
||||
)
|
||||
|
||||
return flow_run
|
||||
|
||||
    async def get_flow_run_findings(self, run_id: str) -> Dict[str, Any]:
        """
        Retrieve findings from a completed flow run.

        Args:
            run_id: The flow run ID

        Returns:
            Dictionary containing SARIF-formatted findings

        Raises:
            ValueError: If run not completed or not found
        """
        async with get_client() as client:
            flow_run = await client.read_flow_run(run_id)

            if not flow_run.state.is_completed():
                raise ValueError(
                    f"Flow run {run_id} not completed. Current status: {flow_run.state.name}"
                )

            # Get the findings from the flow run result
            try:
                findings = await flow_run.state.result()
                return findings
            except Exception as e:
                logger.error(f"Failed to retrieve findings for run {run_id}: {e}")
                raise ValueError(f"Failed to retrieve findings: {e}")

    async def get_flow_run_status(self, run_id: str) -> Dict[str, Any]:
        """
        Get the current status of a flow run.

        Args:
            run_id: The flow run ID

        Returns:
            Dictionary with status information
        """
        async with get_client() as client:
            flow_run = await client.read_flow_run(run_id)

            return {
                "run_id": str(flow_run.id),
                "workflow": flow_run.deployment_id,
                "status": flow_run.state.name,
                "is_completed": flow_run.state.is_completed(),
                "is_failed": flow_run.state.is_failed(),
                "is_running": flow_run.state.is_running(),
                "created_at": flow_run.created,
                "updated_at": flow_run.updated
            }

    def _validate_target_path(self, target_path: str) -> None:
        """
        Validate target path for security before mounting as volume.

        Args:
            target_path: Host path to validate

        Raises:
            ValueError: If path is not allowed for security reasons
        """
        target = Path(target_path)

        # Path must be absolute
        if not target.is_absolute():
            raise ValueError(f"Target path must be absolute: {target_path}")

        # Resolve path to handle symlinks and relative components
        try:
            resolved_path = target.resolve()
        except (OSError, RuntimeError) as e:
            raise ValueError(f"Cannot resolve target path: {target_path} - {e}")

        resolved_str = str(resolved_path)

        # Check against forbidden paths first (more restrictive)
        for forbidden in self.forbidden_paths:
            if resolved_str.startswith(forbidden):
                raise ValueError(
                    f"Access denied: Path '{target_path}' resolves to forbidden directory '{forbidden}'. "
                    f"This path contains sensitive system files and cannot be mounted."
                )

        # Check if path starts with any allowed base path
        path_allowed = False
        for allowed in self.allowed_base_paths:
            if resolved_str.startswith(allowed):
                path_allowed = True
                break

        if not path_allowed:
            allowed_list = ", ".join(self.allowed_base_paths)
            raise ValueError(
                f"Access denied: Path '{target_path}' is not in allowed directories. "
                f"Allowed base paths: {allowed_list}"
            )

        # Additional security checks
        if resolved_str == "/":
            raise ValueError("Cannot mount root filesystem")

        # Warn if path doesn't exist (but don't block - it might be created later)
        if not resolved_path.exists():
            logger.warning(f"Target path does not exist: {target_path}")

        logger.info(f"Path validation passed for: {target_path} -> {resolved_str}")
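Review note on `_validate_target_path`: the allow and deny checks compare resolved paths with `str.startswith`, so an allowed base such as `/home/user` would also accept `/home/user2`. The sketch below shows a component-wise comparison instead. It assumes Python 3.9+ for `PurePath.is_relative_to`; the helper name `is_within` and the sample paths are hypothetical and not part of this change set. It works on already-resolved paths, so symlink resolution would still happen first, as in the method above.

```python
from pathlib import PurePosixPath
from typing import Iterable


def is_within(path: str, bases: Iterable[str]) -> bool:
    """Return True if `path` equals or lies under one of `bases`.

    Hypothetical helper: compares whole path components rather than
    string prefixes, so "/home/user2" is not treated as inside "/home/user".
    """
    candidate = PurePosixPath(path)
    return any(candidate.is_relative_to(PurePosixPath(base)) for base in bases)


if __name__ == "__main__":
    print(is_within("/home/user/project", ["/home/user"]))   # child of the base
    print(is_within("/home/user2/project", ["/home/user"]))  # sibling directory
    print("/home/user2/project".startswith("/home/user"))    # string-prefix check
```

The same helper could serve both the forbidden-path and the allowed-path loops, since both ask the same containment question.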
|
||||
+10
-367
@@ -1,5 +1,5 @@
|
||||
"""
|
||||
Setup utilities for Prefect infrastructure
|
||||
Setup utilities for FuzzForge infrastructure
|
||||
"""
|
||||
|
||||
# Copyright (c) 2025 FuzzingLabs
|
||||
@@ -14,364 +14,21 @@ Setup utilities for Prefect infrastructure
|
||||
# Additional attribution and requirements are provided in the NOTICE file.
|
||||
|
||||
import logging
|
||||
from prefect import get_client
|
||||
from prefect.client.schemas.actions import WorkPoolCreate
|
||||
from prefect.client.schemas.objects import WorkPool
|
||||
from .prefect_manager import get_registry_url
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
async def setup_docker_pool():
|
||||
"""
|
||||
Create or update the Docker work pool for container execution.
|
||||
|
||||
This work pool is configured to:
|
||||
- Connect to the local Docker daemon
|
||||
- Support volume mounting at runtime
|
||||
- Clean up containers after execution
|
||||
- Use bridge networking by default
|
||||
"""
|
||||
import os
|
||||
|
||||
async with get_client() as client:
|
||||
pool_name = "docker-pool"
|
||||
|
||||
# Add force recreation flag for debugging fresh install issues
|
||||
force_recreate = os.getenv('FORCE_RECREATE_WORK_POOL', 'false').lower() == 'true'
|
||||
debug_setup = os.getenv('DEBUG_WORK_POOL_SETUP', 'false').lower() == 'true'
|
||||
|
||||
if force_recreate:
|
||||
logger.warning(f"FORCE_RECREATE_WORK_POOL=true - Will recreate work pool regardless of existing configuration")
|
||||
if debug_setup:
|
||||
logger.warning(f"DEBUG_WORK_POOL_SETUP=true - Enhanced logging enabled")
|
||||
# Temporarily set logging level to DEBUG for this function
|
||||
original_level = logger.level
|
||||
logger.setLevel(logging.DEBUG)
|
||||
|
||||
try:
|
||||
# Check if pool already exists and supports custom images
|
||||
existing_pools = await client.read_work_pools()
|
||||
existing_pool = None
|
||||
for pool in existing_pools:
|
||||
if pool.name == pool_name:
|
||||
existing_pool = pool
|
||||
break
|
||||
|
||||
if existing_pool and not force_recreate:
|
||||
logger.info(f"Found existing work pool '{pool_name}' - validating configuration...")
|
||||
|
||||
# Check if the existing pool has the correct configuration
|
||||
base_template = existing_pool.base_job_template or {}
|
||||
logger.debug(f"Base template keys: {list(base_template.keys())}")
|
||||
|
||||
job_config = base_template.get("job_configuration", {})
|
||||
logger.debug(f"Job config keys: {list(job_config.keys())}")
|
||||
|
||||
image_config = job_config.get("image", "")
|
||||
has_image_variable = "{{ image }}" in str(image_config)
|
||||
logger.debug(f"Image config: '{image_config}' -> has_image_variable: {has_image_variable}")
|
||||
|
||||
# Check if volume defaults include toolbox mount
|
||||
variables = base_template.get("variables", {})
|
||||
properties = variables.get("properties", {})
|
||||
volume_config = properties.get("volumes", {})
|
||||
volume_defaults = volume_config.get("default", [])
|
||||
has_toolbox_volume = any("toolbox_code" in str(vol) for vol in volume_defaults) if volume_defaults else False
|
||||
logger.debug(f"Volume defaults: {volume_defaults}")
|
||||
logger.debug(f"Has toolbox volume: {has_toolbox_volume}")
|
||||
|
||||
# Check if environment defaults include required settings
|
||||
env_config = properties.get("env", {})
|
||||
env_defaults = env_config.get("default", {})
|
||||
has_api_url = "PREFECT_API_URL" in env_defaults
|
||||
has_storage_path = "PREFECT_LOCAL_STORAGE_PATH" in env_defaults
|
||||
has_results_persist = "PREFECT_RESULTS_PERSIST_BY_DEFAULT" in env_defaults
|
||||
has_required_env = has_api_url and has_storage_path and has_results_persist
|
||||
logger.debug(f"Environment defaults: {env_defaults}")
|
||||
logger.debug(f"Has API URL: {has_api_url}, Has storage path: {has_storage_path}, Has results persist: {has_results_persist}")
|
||||
logger.debug(f"Has required env: {has_required_env}")
|
||||
|
||||
# Log the full validation result
|
||||
logger.info(f"Work pool validation - Image: {has_image_variable}, Toolbox: {has_toolbox_volume}, Environment: {has_required_env}")
|
||||
|
||||
if has_image_variable and has_toolbox_volume and has_required_env:
|
||||
logger.info(f"Docker work pool '{pool_name}' already exists with correct configuration")
|
||||
return
|
||||
else:
|
||||
reasons = []
|
||||
if not has_image_variable:
|
||||
reasons.append("missing image template")
|
||||
if not has_toolbox_volume:
|
||||
reasons.append("missing toolbox volume mount")
|
||||
if not has_required_env:
|
||||
if not has_api_url:
|
||||
reasons.append("missing PREFECT_API_URL")
|
||||
if not has_storage_path:
|
||||
reasons.append("missing PREFECT_LOCAL_STORAGE_PATH")
|
||||
if not has_results_persist:
|
||||
reasons.append("missing PREFECT_RESULTS_PERSIST_BY_DEFAULT")
|
||||
|
||||
logger.warning(f"Docker work pool '{pool_name}' exists but lacks: {', '.join(reasons)}. Recreating...")
|
||||
# Delete the old pool and recreate it
|
||||
try:
|
||||
await client.delete_work_pool(pool_name)
|
||||
logger.info(f"Deleted old work pool '{pool_name}'")
|
||||
except Exception as e:
|
||||
logger.warning(f"Failed to delete old work pool: {e}")
|
||||
elif force_recreate and existing_pool:
|
||||
logger.warning(f"Force recreation enabled - deleting existing work pool '{pool_name}'")
|
||||
try:
|
||||
await client.delete_work_pool(pool_name)
|
||||
logger.info(f"Deleted existing work pool for force recreation")
|
||||
except Exception as e:
|
||||
logger.warning(f"Failed to delete work pool for force recreation: {e}")
|
||||
|
||||
logger.info(f"Creating Docker work pool '{pool_name}' with custom image support...")
|
||||
|
||||
# Create the work pool with proper Docker configuration
|
||||
work_pool = WorkPoolCreate(
|
||||
name=pool_name,
|
||||
type="docker",
|
||||
description="Docker work pool for FuzzForge workflows with custom image support",
|
||||
base_job_template={
|
||||
"job_configuration": {
|
||||
"image": "{{ image }}", # Template variable for custom images
|
||||
"volumes": "{{ volumes }}", # List of volume mounts
|
||||
"env": "{{ env }}", # Environment variables
|
||||
"networks": "{{ networks }}", # Docker networks
|
||||
"stream_output": True,
|
||||
"auto_remove": True,
|
||||
"privileged": False,
|
||||
"network_mode": None, # Use networks instead
|
||||
"labels": {},
|
||||
"command": None # Let the image's CMD/ENTRYPOINT run
|
||||
},
|
||||
"variables": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"image": {
|
||||
"type": "string",
|
||||
"title": "Docker Image",
|
||||
"default": "prefecthq/prefect:3-python3.11",
|
||||
"description": "Docker image for the flow run"
|
||||
},
|
||||
"volumes": {
|
||||
"type": "array",
|
||||
"title": "Volume Mounts",
|
||||
"default": [
|
||||
"fuzzforge_prefect_storage:/prefect-storage",
|
||||
"fuzzforge_toolbox_code:/opt/prefect/toolbox:ro"
|
||||
],
|
||||
"description": "Volume mounts in format 'host:container:mode'",
|
||||
"items": {
|
||||
"type": "string"
|
||||
}
|
||||
},
|
||||
"networks": {
|
||||
"type": "array",
|
||||
"title": "Docker Networks",
|
||||
"default": ["fuzzforge_default"],
|
||||
"description": "Docker networks to connect container to",
|
||||
"items": {
|
||||
"type": "string"
|
||||
}
|
||||
},
|
||||
"env": {
|
||||
"type": "object",
|
||||
"title": "Environment Variables",
|
||||
"default": {
|
||||
"PREFECT_API_URL": "http://prefect-server:4200/api",
|
||||
"PREFECT_LOCAL_STORAGE_PATH": "/prefect-storage",
|
||||
"PREFECT_RESULTS_PERSIST_BY_DEFAULT": "true"
|
||||
},
|
||||
"description": "Environment variables for the container",
|
||||
"additionalProperties": {
|
||||
"type": "string"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
)
|
||||
|
||||
await client.create_work_pool(work_pool)
|
||||
logger.info(f"Created Docker work pool '{pool_name}'")
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to setup Docker work pool: {e}")
|
||||
raise
|
||||
finally:
|
||||
# Restore original logging level if debug mode was enabled
|
||||
if debug_setup and 'original_level' in locals():
|
||||
logger.setLevel(original_level)
|
||||
|
||||
|
||||
def get_actual_compose_project_name():
|
||||
"""
|
||||
Return the hardcoded compose project name for FuzzForge.
|
||||
|
||||
Always returns 'fuzzforge' as per system requirements.
|
||||
"""
|
||||
logger.info("Using hardcoded compose project name: fuzzforge")
|
||||
return "fuzzforge"
|
||||
|
||||
|
||||
async def setup_result_storage():
|
||||
"""
|
||||
Create or update Prefect result storage block for findings persistence.
|
||||
Setup result storage (MinIO).
|
||||
|
||||
This sets up a LocalFileSystem storage block pointing to the shared
|
||||
/prefect-storage volume for result persistence.
|
||||
MinIO is used for both target upload and result storage.
|
||||
This is a placeholder for any MinIO-specific setup if needed.
|
||||
"""
|
||||
from prefect.filesystems import LocalFileSystem
|
||||
|
||||
storage_name = "fuzzforge-results"
|
||||
|
||||
try:
|
||||
# Create the storage block, overwrite if it exists
|
||||
logger.info(f"Setting up storage block '{storage_name}'...")
|
||||
storage = LocalFileSystem(basepath="/prefect-storage")
|
||||
|
||||
block_doc_id = await storage.save(name=storage_name, overwrite=True)
|
||||
logger.info(f"Storage block '{storage_name}' configured successfully")
|
||||
return str(block_doc_id)
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to setup result storage: {e}")
|
||||
# Don't raise the exception - continue without storage block
|
||||
logger.warning("Continuing without result storage block - findings may not persist")
|
||||
return None
|
||||
|
||||
|
||||
async def validate_docker_connection():
|
||||
"""
|
||||
Validate that Docker is accessible and running.
|
||||
|
||||
Note: In containerized deployments with Docker socket proxy,
|
||||
the backend doesn't need direct Docker access.
|
||||
|
||||
Raises:
|
||||
RuntimeError: If Docker is not accessible
|
||||
"""
|
||||
import os
|
||||
|
||||
# Skip Docker validation if running in container without socket access
|
||||
if os.path.exists("/.dockerenv") and not os.path.exists("/var/run/docker.sock"):
|
||||
logger.info("Running in container without Docker socket - skipping Docker validation")
|
||||
return
|
||||
|
||||
try:
|
||||
import docker
|
||||
client = docker.from_env()
|
||||
client.ping()
|
||||
logger.info("Docker connection validated")
|
||||
except Exception as e:
|
||||
logger.error(f"Docker is not accessible: {e}")
|
||||
raise RuntimeError(
|
||||
"Docker is not running or not accessible. "
|
||||
"Please ensure Docker is installed and running."
|
||||
)
|
||||
|
||||
|
||||
async def validate_registry_connectivity(registry_url: str = None):
|
||||
"""
|
||||
Validate that the Docker registry is accessible.
|
||||
|
||||
Args:
|
||||
registry_url: URL of the Docker registry to validate (auto-detected if None)
|
||||
|
||||
Raises:
|
||||
RuntimeError: If registry is not accessible
|
||||
"""
|
||||
# Resolve a reachable test URL from within this process
|
||||
if registry_url is None:
|
||||
# If not specified, prefer internal service name in containers, host port on host
|
||||
import os
|
||||
if os.path.exists('/.dockerenv'):
|
||||
registry_url = "registry:5000"
|
||||
else:
|
||||
registry_url = "localhost:5001"
|
||||
|
||||
# If we're running inside a container and asked to probe localhost:PORT,
|
||||
# the probe would hit the container, not the host. Use host.docker.internal instead.
|
||||
import os
|
||||
try:
|
||||
host_part, port_part = registry_url.split(":", 1)
|
||||
except ValueError:
|
||||
host_part, port_part = registry_url, "80"
|
||||
|
||||
if os.path.exists('/.dockerenv') and host_part in ("localhost", "127.0.0.1"):
|
||||
test_host = "host.docker.internal"
|
||||
else:
|
||||
test_host = host_part
|
||||
test_url = f"http://{test_host}:{port_part}/v2/"
|
||||
|
||||
import aiohttp
|
||||
import asyncio
|
||||
|
||||
logger.info(f"Validating registry connectivity to {registry_url}...")
|
||||
|
||||
try:
|
||||
async with aiohttp.ClientSession(timeout=aiohttp.ClientTimeout(total=10)) as session:
|
||||
async with session.get(test_url) as response:
|
||||
if response.status == 200:
|
||||
logger.info(f"Registry at {registry_url} is accessible (tested via {test_host})")
|
||||
return
|
||||
else:
|
||||
raise RuntimeError(f"Registry returned status {response.status}")
|
||||
except asyncio.TimeoutError:
|
||||
raise RuntimeError(f"Registry at {registry_url} is not responding (timeout)")
|
||||
except aiohttp.ClientError as e:
|
||||
raise RuntimeError(f"Registry at {registry_url} is not accessible: {e}")
|
||||
except Exception as e:
|
||||
raise RuntimeError(f"Failed to validate registry connectivity: {e}")
|
||||
|
||||
|
||||
async def validate_docker_network(network_name: str):
|
||||
"""
|
||||
Validate that the specified Docker network exists.
|
||||
|
||||
Args:
|
||||
network_name: Name of the Docker network to validate
|
||||
|
||||
Raises:
|
||||
RuntimeError: If network doesn't exist
|
||||
"""
|
||||
import os
|
||||
|
||||
# Skip network validation if running in container without Docker socket
|
||||
if os.path.exists("/.dockerenv") and not os.path.exists("/var/run/docker.sock"):
|
||||
logger.info("Running in container without Docker socket - skipping network validation")
|
||||
return
|
||||
|
||||
try:
|
||||
import docker
|
||||
client = docker.from_env()
|
||||
|
||||
# List all networks
|
||||
networks = client.networks.list(names=[network_name])
|
||||
|
||||
if not networks:
|
||||
# Try to find networks with similar names
|
||||
all_networks = client.networks.list()
|
||||
similar_networks = [n.name for n in all_networks if "fuzzforge" in n.name.lower()]
|
||||
|
||||
error_msg = f"Docker network '{network_name}' not found."
|
||||
if similar_networks:
|
||||
error_msg += f" Available networks: {similar_networks}"
|
||||
else:
|
||||
error_msg += " Please ensure Docker Compose is running."
|
||||
|
||||
raise RuntimeError(error_msg)
|
||||
|
||||
logger.info(f"Docker network '{network_name}' validated")
|
||||
|
||||
except Exception as e:
|
||||
if isinstance(e, RuntimeError):
|
||||
raise
|
||||
logger.error(f"Network validation failed: {e}")
|
||||
raise RuntimeError(f"Failed to validate Docker network: {e}")
|
||||
logger.info("Result storage (MinIO) configured")
|
||||
# MinIO is configured via environment variables in docker-compose
|
||||
# No additional setup needed here
|
||||
return True
|
||||
|
||||
|
||||
async def validate_infrastructure():
|
||||
@@ -382,21 +39,7 @@ async def validate_infrastructure():
|
||||
"""
|
||||
logger.info("Validating infrastructure...")
|
||||
|
||||
# Validate Docker connection
|
||||
await validate_docker_connection()
|
||||
|
||||
# Validate registry connectivity for custom image building
|
||||
await validate_registry_connectivity()
|
||||
|
||||
# Validate network (hardcoded to avoid directory name dependencies)
|
||||
import os
|
||||
compose_project = "fuzzforge"
|
||||
docker_network = "fuzzforge_default"
|
||||
|
||||
try:
|
||||
await validate_docker_network(docker_network)
|
||||
except RuntimeError as e:
|
||||
logger.warning(f"Network validation failed: {e}")
|
||||
logger.warning("Workflows may not be able to connect to Prefect services")
|
||||
# Setup storage (MinIO)
|
||||
await setup_result_storage()
|
||||
|
||||
logger.info("Infrastructure validation completed")
|
||||
|
||||
@@ -1,459 +0,0 @@
|
||||
"""
Workflow Discovery - Registry-based discovery and loading of workflows
"""

# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.

import logging
import yaml
from pathlib import Path
from typing import Dict, Optional, Any, Callable
from pydantic import BaseModel, Field, ConfigDict

logger = logging.getLogger(__name__)


class WorkflowInfo(BaseModel):
    """Information about a discovered workflow"""
    name: str = Field(..., description="Workflow name")
    path: Path = Field(..., description="Path to workflow directory")
    workflow_file: Path = Field(..., description="Path to workflow.py file")
    dockerfile: Path = Field(..., description="Path to Dockerfile")
    has_docker: bool = Field(..., description="Whether workflow has custom Dockerfile")
    metadata: Dict[str, Any] = Field(..., description="Workflow metadata from YAML")
    flow_function_name: str = Field(default="main_flow", description="Name of the flow function")

    model_config = ConfigDict(arbitrary_types_allowed=True)


class WorkflowDiscovery:
    """
    Discovers workflows from the filesystem and validates them against the registry.

    This system:
    1. Scans for workflows with metadata.yaml files
    2. Cross-references them with the manual registry
    3. Provides registry-based flow functions for deployment

    Workflows must have:
    - workflow.py: Contains the Prefect flow
    - metadata.yaml: Mandatory metadata file
    - Entry in toolbox/workflows/registry.py: Manual registration
    - Dockerfile (optional): Custom container definition
    - requirements.txt (optional): Python dependencies
    """

    def __init__(self, workflows_dir: Path):
        """
        Initialize workflow discovery.

        Args:
            workflows_dir: Path to the workflows directory
        """
        self.workflows_dir = workflows_dir
        if not self.workflows_dir.exists():
            self.workflows_dir.mkdir(parents=True, exist_ok=True)
            logger.info(f"Created workflows directory: {self.workflows_dir}")

        # Import registry - this validates it on import
        try:
            from toolbox.workflows.registry import WORKFLOW_REGISTRY, list_registered_workflows
            self.registry = WORKFLOW_REGISTRY
            logger.info(f"Loaded workflow registry with {len(self.registry)} registered workflows")
        except ImportError as e:
            logger.error(f"Failed to import workflow registry: {e}")
            self.registry = {}
        except Exception as e:
            logger.error(f"Registry validation failed: {e}")
            self.registry = {}

        # Cache for discovered workflows
        self._workflow_cache: Optional[Dict[str, WorkflowInfo]] = None
        self._cache_timestamp: Optional[float] = None
        self._cache_ttl = 60.0  # Cache TTL in seconds
|
||||
|
||||
    async def discover_workflows(self) -> Dict[str, WorkflowInfo]:
        """
        Discover workflows by cross-referencing filesystem with registry.
        Uses caching to avoid frequent filesystem scans.

        Returns:
            Dictionary mapping workflow names to their information
        """
        # Check cache validity
        import time
        current_time = time.time()

        if (self._workflow_cache is not None and
            self._cache_timestamp is not None and
            (current_time - self._cache_timestamp) < self._cache_ttl):
            # Return cached results
            logger.debug(f"Returning cached workflow discovery ({len(self._workflow_cache)} workflows)")
            return self._workflow_cache
        workflows = {}
        discovered_dirs = set()
        registry_names = set(self.registry.keys())

        if not self.workflows_dir.exists():
            logger.warning(f"Workflows directory does not exist: {self.workflows_dir}")
            return workflows

        # Recursively scan all directories and subdirectories
        await self._scan_directory_recursive(self.workflows_dir, workflows, discovered_dirs)

        # Check for registry entries without corresponding directories
        missing_dirs = registry_names - discovered_dirs
        if missing_dirs:
            logger.warning(
                f"Registry contains workflows without filesystem directories: {missing_dirs}. "
                f"These workflows cannot be deployed."
            )

        logger.info(
            f"Discovery complete: {len(workflows)} workflows ready for deployment, "
            f"{len(missing_dirs)} registry entries missing directories, "
            f"{len(discovered_dirs - registry_names)} filesystem workflows not registered"
        )

        # Update cache
        self._workflow_cache = workflows
        self._cache_timestamp = current_time

        return workflows
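Review note on `discover_workflows`: the caching is a time-to-live check wrapped around an expensive scan. The sketch below isolates that pattern. `TTLCache`, `loader` and `clock` are hypothetical names, not part of this change set, and the sketch defaults to `time.monotonic` rather than the `time.time()` used above, so that wall-clock adjustments cannot stretch or shorten the TTL.

```python
import time
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class TTLCache(Generic[T]):
    """Cache a single value and reload it once it is `ttl` seconds old."""

    def __init__(
        self,
        loader: Callable[[], T],
        ttl: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader  # expensive function, e.g. a filesystem scan
        self._ttl = ttl
        self._clock = clock    # injectable so tests can control time
        self._value: Optional[T] = None
        self._stamp: Optional[float] = None

    def get(self) -> T:
        now = self._clock()
        if self._stamp is not None and (now - self._stamp) < self._ttl:
            return self._value  # entry is still fresh
        self._value = self._loader()
        self._stamp = now
        return self._value

    def invalidate(self) -> None:
        """Drop the cached value so the next get() reloads it."""
        self._value = None
        self._stamp = None
```

Tracking freshness through the timestamp alone means a loader that legitimately returns `None` or an empty dict is still cached.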
|
||||
|
||||
    async def _scan_directory_recursive(self, directory: Path, workflows: Dict[str, WorkflowInfo], discovered_dirs: set):
        """
        Recursively scan directory for workflows.

        Args:
            directory: Directory to scan
            workflows: Dictionary to populate with discovered workflows
            discovered_dirs: Set to track discovered workflow names
        """
        for item in directory.iterdir():
            if not item.is_dir():
                continue

            if item.name.startswith('_') or item.name.startswith('.'):
                continue  # Skip hidden or private directories

            # Check if this directory contains workflow files (workflow.py and metadata.yaml)
            workflow_file = item / "workflow.py"
            metadata_file = item / "metadata.yaml"

            if workflow_file.exists() and metadata_file.exists():
                # This is a workflow directory
                workflow_name = item.name
                discovered_dirs.add(workflow_name)

                # Only process workflows that are in the registry
                if workflow_name not in self.registry:
                    logger.warning(
                        f"Workflow '{workflow_name}' found in filesystem but not in registry. "
                        f"Add it to toolbox/workflows/registry.py to enable deployment."
                    )
                    continue

                try:
                    workflow_info = await self._load_workflow(item)
                    if workflow_info:
                        workflows[workflow_info.name] = workflow_info
                        logger.info(f"Discovered and registered workflow: {workflow_info.name}")
                except Exception as e:
                    logger.error(f"Failed to load workflow from {item}: {e}")
            else:
                # This is a category directory, recurse into it
                await self._scan_directory_recursive(item, workflows, discovered_dirs)
|
||||
|
||||
    async def _load_workflow(self, workflow_dir: Path) -> Optional[WorkflowInfo]:
        """
        Load and validate a single workflow.

        Args:
            workflow_dir: Path to the workflow directory

        Returns:
            WorkflowInfo if valid, None otherwise
        """
        workflow_name = workflow_dir.name

        # Check for mandatory files
        workflow_file = workflow_dir / "workflow.py"
        metadata_file = workflow_dir / "metadata.yaml"

        if not workflow_file.exists():
            logger.warning(f"Workflow {workflow_name} missing workflow.py")
            return None

        if not metadata_file.exists():
            logger.error(f"Workflow {workflow_name} missing mandatory metadata.yaml")
            return None

        # Load and validate metadata
        try:
            metadata = self._load_metadata(metadata_file)
            if not self._validate_metadata(metadata, workflow_name):
                return None
        except Exception as e:
            logger.error(f"Failed to load metadata for {workflow_name}: {e}")
            return None

        # Check for mandatory Dockerfile
        dockerfile = workflow_dir / "Dockerfile"
        if not dockerfile.exists():
            logger.error(f"Workflow {workflow_name} missing mandatory Dockerfile")
            return None

        has_docker = True  # Always True since Dockerfile is mandatory

        # Get flow function name from metadata or use default
        flow_function_name = metadata.get("flow_function", "main_flow")

        return WorkflowInfo(
            name=workflow_name,
            path=workflow_dir,
            workflow_file=workflow_file,
            dockerfile=dockerfile,
            has_docker=has_docker,
            metadata=metadata,
            flow_function_name=flow_function_name
        )

    def _load_metadata(self, metadata_file: Path) -> Dict[str, Any]:
        """
        Load metadata from YAML file.

        Args:
            metadata_file: Path to metadata.yaml

        Returns:
            Dictionary containing metadata
        """
        with open(metadata_file, 'r') as f:
            metadata = yaml.safe_load(f)

        if metadata is None:
            raise ValueError("Empty metadata file")

        return metadata
|
||||
|
||||
    def _validate_metadata(self, metadata: Dict[str, Any], workflow_name: str) -> bool:
        """
        Validate that metadata contains all required fields.

        Args:
            metadata: Metadata dictionary
            workflow_name: Name of the workflow for logging

        Returns:
            True if valid, False otherwise
        """
        required_fields = ["name", "version", "description", "author", "category", "parameters", "requirements"]

        missing_fields = []
        for field in required_fields:
            if field not in metadata:
                missing_fields.append(field)

        if missing_fields:
            logger.error(
                f"Workflow {workflow_name} metadata missing required fields: {missing_fields}"
            )
            return False

        # Validate version format (semantic versioning)
        version = metadata.get("version", "")
        if not self._is_valid_version(version):
            logger.error(f"Workflow {workflow_name} has invalid version format: {version}")
            return False

        # Validate parameters structure
        parameters = metadata.get("parameters", {})
        if not isinstance(parameters, dict):
            logger.error(f"Workflow {workflow_name} parameters must be a dictionary")
            return False

        return True

    def _is_valid_version(self, version: str) -> bool:
        """
        Check if version follows semantic versioning (x.y.z).

        Args:
            version: Version string

        Returns:
            True if valid semantic version
        """
        try:
            parts = version.split('.')
            if len(parts) != 3:
                return False
            for part in parts:
                int(part)  # Check if each part is a number
            return True
        except (ValueError, AttributeError):
            return False
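Review note on `_is_valid_version`: `int()` accepts more than plain digits (a sign, surrounding whitespace, underscore separators), so the check above is looser than the `^\d+\.\d+\.\d+$` pattern declared in `get_metadata_schema`. The sketch below compares the two behaviours. `is_strict_version` and `is_lenient_version` are hypothetical names, not part of this change set; the lenient function reproduces the logic above for comparison only.

```python
import re

# ASCII digits only; fullmatch() anchors the pattern at both ends.
_VERSION_RE = re.compile(r"[0-9]+\.[0-9]+\.[0-9]+")


def is_strict_version(version: object) -> bool:
    """Return True only for strings of the exact form x.y.z."""
    return isinstance(version, str) and _VERSION_RE.fullmatch(version) is not None


def is_lenient_version(version: str) -> bool:
    """The int()-based check, reproduced here for comparison."""
    try:
        parts = version.split(".")
        if len(parts) != 3:
            return False
        for part in parts:
            int(part)  # accepts "-2", " 1" and "1_0" as well as "12"
        return True
    except (ValueError, AttributeError):
        return False


if __name__ == "__main__":
    for candidate in ["1.2.3", "1.-2.3", " 1.2.3", "1_0.2.3", "1.2"]:
        print(repr(candidate), is_lenient_version(candidate), is_strict_version(candidate))
```

Deriving both the runtime check and the schema from one pattern would keep them from drifting apart.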
|
||||
|
||||
    def invalidate_cache(self) -> None:
        """
        Invalidate the workflow discovery cache.
        Useful when workflows are added or modified.
        """
        self._workflow_cache = None
        self._cache_timestamp = None
        logger.debug("Workflow discovery cache invalidated")

    def get_flow_function(self, workflow_name: str) -> Optional[Callable]:
        """
        Get the flow function from the registry.

        Args:
            workflow_name: Name of the workflow

        Returns:
            The flow function if found in registry, None otherwise
        """
        if workflow_name not in self.registry:
            logger.error(
                f"Workflow '{workflow_name}' not found in registry. "
                f"Available workflows: {list(self.registry.keys())}"
            )
            return None

        try:
            from toolbox.workflows.registry import get_workflow_flow
            flow_func = get_workflow_flow(workflow_name)
            logger.debug(f"Retrieved flow function for '{workflow_name}' from registry")
            return flow_func
        except Exception as e:
            logger.error(f"Failed to get flow function for '{workflow_name}': {e}")
            return None

    def get_registry_info(self, workflow_name: str) -> Optional[Dict[str, Any]]:
        """
        Get registry information for a workflow.

        Args:
            workflow_name: Name of the workflow

        Returns:
            Registry information if found, None otherwise
        """
        if workflow_name not in self.registry:
            return None

        try:
            from toolbox.workflows.registry import get_workflow_info
            return get_workflow_info(workflow_name)
        except Exception as e:
            logger.error(f"Failed to get registry info for '{workflow_name}': {e}")
            return None
|
||||
|
||||
@staticmethod
|
||||
def get_metadata_schema() -> Dict[str, Any]:
|
||||
"""
|
||||
Get the JSON schema for workflow metadata.
|
||||
|
||||
Returns:
|
||||
JSON schema dictionary
|
||||
"""
|
||||
return {
|
||||
"type": "object",
|
||||
"required": ["name", "version", "description", "author", "category", "parameters", "requirements"],
|
||||
"properties": {
|
||||
"name": {
|
||||
"type": "string",
|
||||
"description": "Workflow name"
|
||||
},
|
||||
"version": {
|
||||
"type": "string",
|
||||
"pattern": "^\\d+\\.\\d+\\.\\d+$",
|
||||
"description": "Semantic version (x.y.z)"
|
||||
},
|
||||
"description": {
|
||||
"type": "string",
|
||||
"description": "Workflow description"
|
||||
},
|
||||
"author": {
|
||||
"type": "string",
|
||||
"description": "Workflow author"
|
||||
},
|
||||
"category": {
|
||||
"type": "string",
|
||||
"enum": ["comprehensive", "specialized", "fuzzing", "focused"],
|
||||
"description": "Workflow category"
|
||||
},
|
||||
"tags": {
|
||||
"type": "array",
|
||||
"items": {"type": "string"},
|
||||
"description": "Workflow tags for categorization"
|
||||
},
|
||||
"requirements": {
|
||||
"type": "object",
|
||||
"required": ["tools", "resources"],
|
||||
"properties": {
|
||||
"tools": {
|
||||
"type": "array",
|
||||
"items": {"type": "string"},
|
||||
"description": "Required security tools"
|
||||
},
|
||||
"resources": {
|
||||
"type": "object",
|
||||
"required": ["memory", "cpu", "timeout"],
|
||||
"properties": {
|
||||
"memory": {
|
||||
"type": "string",
|
||||
"pattern": "^\\d+[GMK]i$",
|
||||
"description": "Memory limit (e.g., 1Gi, 512Mi)"
|
||||
},
|
||||
"cpu": {
|
||||
"type": "string",
|
||||
"pattern": "^\\d+m?$",
|
||||
"description": "CPU limit (e.g., 1000m, 2)"
|
||||
},
|
||||
"timeout": {
|
||||
"type": "integer",
|
||||
"minimum": 60,
|
||||
"maximum": 7200,
|
||||
"description": "Workflow timeout in seconds"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
},
|
||||
"parameters": {
|
||||
"type": "object",
|
||||
"description": "Workflow parameters schema"
|
||||
},
|
||||
"default_parameters": {
|
||||
"type": "object",
|
||||
"description": "Default parameter values"
|
||||
},
|
||||
"required_modules": {
|
||||
"type": "array",
|
||||
"items": {"type": "string"},
|
||||
"description": "Required module names"
|
||||
},
|
||||
"supported_volume_modes": {
|
||||
"type": "array",
|
||||
"items": {"enum": ["ro", "rw"]},
|
||||
"default": ["ro", "rw"],
|
||||
"description": "Supported volume mount modes"
|
||||
},
|
||||
"flow_function": {
|
||||
"type": "string",
|
||||
"default": "main_flow",
|
||||
"description": "Name of the flow function in workflow.py"
|
||||
}
|
||||
}
|
||||
}
|
||||
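Review note: the `int()`-based check in `_is_valid_version` is looser than the `version` pattern in the schema. For example, `int("-2")` succeeds, so `1.-2.3` passes the method but fails `^\d+\.\d+\.\d+$`. The sketch below shows one way to check a version and a `resources` block against the same patterns using only the standard library. The names `is_schema_version` and `check_resources` are hypothetical helpers for illustration, not part of the codebase.

```python
import re

# Patterns copied from the metadata schema; re.ASCII restricts \d to 0-9,
# and fullmatch() avoids the trailing-newline quirk of "$".
VERSION_RE = re.compile(r"\d+\.\d+\.\d+", re.ASCII)
MEMORY_RE = re.compile(r"\d+[GMK]i", re.ASCII)
CPU_RE = re.compile(r"\d+m?", re.ASCII)


def is_schema_version(version: object) -> bool:
    """True only for strings of the form x.y.z with unsigned integer parts."""
    return isinstance(version, str) and VERSION_RE.fullmatch(version) is not None


def check_resources(resources: dict) -> list:
    """Return a list of problems found; an empty list means the block is valid."""
    problems = []

    memory = resources.get("memory")
    if not isinstance(memory, str) or MEMORY_RE.fullmatch(memory) is None:
        problems.append(f"memory must look like 1Gi or 512Mi, got {memory!r}")

    cpu = resources.get("cpu")
    if not isinstance(cpu, str) or CPU_RE.fullmatch(cpu) is None:
        problems.append(f"cpu must look like 1000m or 2, got {cpu!r}")

    timeout = resources.get("timeout")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(timeout, bool) or not isinstance(timeout, int) or not 60 <= timeout <= 7200:
        problems.append(f"timeout must be an integer in [60, 7200], got {timeout!r}")

    return problems
```

A full validator would normally hand the whole schema dictionary to a JSON Schema library. The hand-rolled version here only illustrates the three `resources` constraints and the version pattern.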
+171
-310
@@ -12,7 +12,6 @@
|
||||
import asyncio
|
||||
import logging
|
||||
import os
|
||||
from uuid import UUID
|
||||
from contextlib import AsyncExitStack, asynccontextmanager, suppress
|
||||
from typing import Any, Dict, Optional, List
|
||||
|
||||
@@ -23,31 +22,20 @@ from starlette.routing import Mount
|
||||
|
||||
from fastmcp.server.http import create_sse_app
|
||||
|
||||
from src.core.prefect_manager import PrefectManager
|
||||
from src.core.setup import setup_docker_pool, setup_result_storage, validate_infrastructure
|
||||
from src.core.workflow_discovery import WorkflowDiscovery
|
||||
from src.temporal.manager import TemporalManager
|
||||
from src.core.setup import setup_result_storage, validate_infrastructure
|
||||
from src.api import workflows, runs, fuzzing
|
||||
from src.services.prefect_stats_monitor import prefect_stats_monitor
|
||||
|
||||
from fastmcp import FastMCP
|
||||
from prefect.client.orchestration import get_client
|
||||
from prefect.client.schemas.filters import (
|
||||
FlowRunFilter,
|
||||
FlowRunFilterDeploymentId,
|
||||
FlowRunFilterState,
|
||||
FlowRunFilterStateType,
|
||||
)
|
||||
from prefect.client.schemas.sorting import FlowRunSort
|
||||
from prefect.states import StateType
|
||||
|
||||
logging.basicConfig(level=logging.INFO)
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
prefect_mgr = PrefectManager()
|
||||
temporal_mgr = TemporalManager()
|
||||
|
||||
|
||||
class PrefectBootstrapState:
|
||||
"""Tracks Prefect initialization progress for API and MCP consumers."""
|
||||
class TemporalBootstrapState:
|
||||
"""Tracks Temporal initialization progress for API and MCP consumers."""
|
||||
|
||||
def __init__(self) -> None:
|
||||
self.ready: bool = False
|
||||
@@ -64,19 +52,19 @@ class PrefectBootstrapState:
|
||||
}
|
||||
|
||||
|
||||
prefect_bootstrap_state = PrefectBootstrapState()
|
||||
temporal_bootstrap_state = TemporalBootstrapState()
|
||||
|
||||
# Configure retry strategy for bootstrapping Prefect + infrastructure
|
||||
# Configure retry strategy for bootstrapping Temporal + infrastructure
|
||||
STARTUP_RETRY_SECONDS = max(1, int(os.getenv("FUZZFORGE_STARTUP_RETRY_SECONDS", "5")))
|
||||
STARTUP_RETRY_MAX_SECONDS = max(
|
||||
STARTUP_RETRY_SECONDS,
|
||||
int(os.getenv("FUZZFORGE_STARTUP_RETRY_MAX_SECONDS", "60")),
|
||||
)
|
||||
|
||||
prefect_bootstrap_task: Optional[asyncio.Task] = None
|
||||
temporal_bootstrap_task: Optional[asyncio.Task] = None
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# FastAPI application (REST API remains unchanged)
|
||||
# FastAPI application (REST API)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
app = FastAPI(
|
||||
@@ -90,20 +78,19 @@ app.include_router(runs.router)
|
||||
app.include_router(fuzzing.router)
|
||||
|
||||
|
||||
def get_prefect_status() -> Dict[str, Any]:
|
||||
"""Return a snapshot of Prefect bootstrap state for diagnostics."""
|
||||
status = prefect_bootstrap_state.as_dict()
|
||||
status["workflows_loaded"] = len(prefect_mgr.workflows)
|
||||
status["deployments_tracked"] = len(prefect_mgr.deployments)
|
||||
def get_temporal_status() -> Dict[str, Any]:
|
||||
"""Return a snapshot of Temporal bootstrap state for diagnostics."""
|
||||
status = temporal_bootstrap_state.as_dict()
|
||||
status["workflows_loaded"] = len(temporal_mgr.workflows)
|
||||
status["bootstrap_task_running"] = (
|
||||
prefect_bootstrap_task is not None and not prefect_bootstrap_task.done()
|
||||
temporal_bootstrap_task is not None and not temporal_bootstrap_task.done()
|
||||
)
|
||||
return status
|
||||
|
||||
|
||||
def _prefect_not_ready_status() -> Optional[Dict[str, Any]]:
|
||||
"""Return status details if Prefect is not ready yet."""
|
||||
status = get_prefect_status()
|
||||
def _temporal_not_ready_status() -> Optional[Dict[str, Any]]:
|
||||
"""Return status details if Temporal is not ready yet."""
|
||||
status = get_temporal_status()
|
||||
if status.get("ready"):
|
||||
return None
|
||||
return status
|
||||
@@ -111,19 +98,19 @@ def _prefect_not_ready_status() -> Optional[Dict[str, Any]]:
|
||||
|
||||
@app.get("/")
|
||||
async def root() -> Dict[str, Any]:
|
||||
status = get_prefect_status()
|
||||
status = get_temporal_status()
|
||||
return {
|
||||
"name": "FuzzForge API",
|
||||
"version": "0.6.0",
|
||||
"status": "ready" if status.get("ready") else "initializing",
|
||||
"workflows_loaded": status.get("workflows_loaded", 0),
|
||||
"prefect": status,
|
||||
"temporal": status,
|
||||
}
|
||||
|
||||
|
||||
@app.get("/health")
|
||||
async def health() -> Dict[str, str]:
|
||||
status = get_prefect_status()
|
||||
status = get_temporal_status()
|
||||
health_status = "healthy" if status.get("ready") else "initializing"
|
||||
return {"status": health_status}
|
||||
|
||||
@@ -165,65 +152,61 @@ _fastapi_mcp_imported = False
|
||||
mcp = FastMCP(name="FuzzForge MCP")
|
||||
|
||||
|
||||
async def _bootstrap_prefect_with_retries() -> None:
|
||||
"""Initialize Prefect infrastructure with exponential backoff retries."""
|
||||
async def _bootstrap_temporal_with_retries() -> None:
|
||||
"""Initialize Temporal infrastructure with exponential backoff retries."""
|
||||
|
||||
attempt = 0
|
||||
|
||||
while True:
|
||||
attempt += 1
|
||||
prefect_bootstrap_state.task_running = True
|
||||
prefect_bootstrap_state.status = "starting"
|
||||
prefect_bootstrap_state.ready = False
|
||||
prefect_bootstrap_state.last_error = None
|
||||
temporal_bootstrap_state.task_running = True
|
||||
temporal_bootstrap_state.status = "starting"
|
||||
temporal_bootstrap_state.ready = False
|
||||
temporal_bootstrap_state.last_error = None
|
||||
|
||||
try:
|
||||
logger.info("Bootstrapping Prefect infrastructure...")
|
||||
logger.info("Bootstrapping Temporal infrastructure...")
|
||||
await validate_infrastructure()
|
||||
await setup_docker_pool()
|
||||
await setup_result_storage()
|
||||
await prefect_mgr.initialize()
|
||||
await prefect_stats_monitor.start_monitoring()
|
||||
await temporal_mgr.initialize()
|
||||
|
||||
prefect_bootstrap_state.ready = True
|
||||
prefect_bootstrap_state.status = "ready"
|
||||
prefect_bootstrap_state.task_running = False
|
||||
logger.info("Prefect infrastructure ready")
|
||||
temporal_bootstrap_state.ready = True
|
||||
temporal_bootstrap_state.status = "ready"
|
||||
temporal_bootstrap_state.task_running = False
|
||||
logger.info("Temporal infrastructure ready")
|
||||
return
|
||||
|
||||
except asyncio.CancelledError:
|
||||
prefect_bootstrap_state.status = "cancelled"
|
||||
prefect_bootstrap_state.task_running = False
|
||||
logger.info("Prefect bootstrap task cancelled")
|
||||
temporal_bootstrap_state.status = "cancelled"
|
||||
temporal_bootstrap_state.task_running = False
|
||||
logger.info("Temporal bootstrap task cancelled")
|
||||
raise
|
||||
|
||||
except Exception as exc: # pragma: no cover - defensive logging on infra startup
|
||||
logger.exception("Prefect bootstrap failed")
|
||||
prefect_bootstrap_state.ready = False
|
||||
prefect_bootstrap_state.status = "error"
|
||||
prefect_bootstrap_state.last_error = str(exc)
|
||||
logger.exception("Temporal bootstrap failed")
|
||||
temporal_bootstrap_state.ready = False
|
||||
temporal_bootstrap_state.status = "error"
|
||||
temporal_bootstrap_state.last_error = str(exc)
|
||||
|
||||
# Ensure partial initialization does not leave stale state behind
|
||||
prefect_mgr.workflows.clear()
|
||||
prefect_mgr.deployments.clear()
|
||||
await prefect_stats_monitor.stop_monitoring()
|
||||
temporal_mgr.workflows.clear()
|
||||
|
||||
wait_time = min(
|
||||
STARTUP_RETRY_SECONDS * (2 ** (attempt - 1)),
|
||||
STARTUP_RETRY_MAX_SECONDS,
|
||||
)
|
||||
logger.info("Retrying Prefect bootstrap in %s second(s)", wait_time)
|
||||
logger.info("Retrying Temporal bootstrap in %s second(s)", wait_time)
|
||||
|
||||
try:
|
||||
await asyncio.sleep(wait_time)
|
||||
except asyncio.CancelledError:
|
||||
prefect_bootstrap_state.status = "cancelled"
|
||||
prefect_bootstrap_state.task_running = False
|
||||
temporal_bootstrap_state.status = "cancelled"
|
||||
temporal_bootstrap_state.task_running = False
|
||||
raise
|
||||
|
||||
|
||||
def _lookup_workflow(workflow_name: str):
|
||||
info = prefect_mgr.workflows.get(workflow_name)
|
||||
info = temporal_mgr.workflows.get(workflow_name)
|
||||
if not info:
|
||||
return None
|
||||
metadata = info.metadata
|
||||
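Review note: the retry loop in `_bootstrap_temporal_with_retries` doubles the wait after each failed attempt and caps it at `STARTUP_RETRY_MAX_SECONDS`. The sketch below reproduces only that arithmetic so the resulting schedule is easy to see. `backoff_schedule` is a hypothetical helper, and the values 5 and 60 are the defaults read from the environment in the diff.

```python
def backoff_schedule(base_seconds: int, max_seconds: int, attempts: int) -> list:
    """Wait time after each failed attempt, using the same formula as the bootstrap loop."""
    return [
        min(base_seconds * (2 ** (attempt - 1)), max_seconds)
        for attempt in range(1, attempts + 1)
    ]


if __name__ == "__main__":
    # Defaults: FUZZFORGE_STARTUP_RETRY_SECONDS=5, FUZZFORGE_STARTUP_RETRY_MAX_SECONDS=60
    print(backoff_schedule(5, 60, 6))  # [5, 10, 20, 40, 60, 60]
```

The formula adds no jitter, so several backend replicas started at the same moment would retry in lockstep. Whether that matters depends on the deployment.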
@@ -248,24 +231,23 @@ def _lookup_workflow(workflow_name: str):
|
||||
"required_modules": metadata.get("required_modules", []),
|
||||
"supported_volume_modes": supported_modes,
|
||||
"default_target_path": default_target_path,
|
||||
"default_volume_mode": default_volume_mode,
|
||||
"has_custom_docker": bool(info.has_docker),
|
||||
"default_volume_mode": default_volume_mode
|
||||
}
|
||||
|
||||
|
||||
@mcp.tool
|
||||
async def list_workflows_mcp() -> Dict[str, Any]:
|
||||
"""List all discovered workflows and their metadata summary."""
|
||||
not_ready = _prefect_not_ready_status()
|
||||
not_ready = _temporal_not_ready_status()
|
||||
if not_ready:
|
||||
return {
|
||||
"workflows": [],
|
||||
"prefect": not_ready,
|
||||
"message": "Prefect infrastructure is still initializing",
|
||||
"temporal": not_ready,
|
||||
"message": "Temporal infrastructure is still initializing",
|
||||
}
|
||||
|
||||
workflows_summary = []
|
||||
for name, info in prefect_mgr.workflows.items():
|
||||
for name, info in temporal_mgr.workflows.items():
|
||||
metadata = info.metadata
|
||||
defaults = metadata.get("default_parameters", {})
|
||||
workflows_summary.append({
|
||||
@@ -279,20 +261,19 @@ async def list_workflows_mcp() -> Dict[str, Any]:
|
||||
or defaults.get("volume_mode")
|
||||
or "ro",
|
||||
"default_target_path": metadata.get("default_target_path")
|
||||
or defaults.get("target_path"),
|
||||
"has_custom_docker": bool(info.has_docker),
|
||||
or defaults.get("target_path")
|
||||
})
|
||||
return {"workflows": workflows_summary, "prefect": get_prefect_status()}
|
||||
return {"workflows": workflows_summary, "temporal": get_temporal_status()}
|
||||
|
||||
|
||||
@mcp.tool
|
||||
async def get_workflow_metadata_mcp(workflow_name: str) -> Dict[str, Any]:
|
||||
"""Fetch detailed metadata for a workflow."""
|
||||
not_ready = _prefect_not_ready_status()
|
||||
not_ready = _temporal_not_ready_status()
|
||||
if not_ready:
|
||||
return {
|
||||
"error": "Prefect infrastructure not ready",
|
||||
"prefect": not_ready,
|
||||
"error": "Temporal infrastructure not ready",
|
||||
"temporal": not_ready,
|
||||
}
|
||||
|
||||
data = _lookup_workflow(workflow_name)
|
||||
@@ -304,11 +285,11 @@ async def get_workflow_metadata_mcp(workflow_name: str) -> Dict[str, Any]:
|
||||
@mcp.tool
|
||||
async def get_workflow_parameters_mcp(workflow_name: str) -> Dict[str, Any]:
|
||||
"""Return the parameter schema and defaults for a workflow."""
|
||||
not_ready = _prefect_not_ready_status()
|
||||
not_ready = _temporal_not_ready_status()
|
||||
if not_ready:
|
||||
return {
|
||||
"error": "Prefect infrastructure not ready",
|
||||
"prefect": not_ready,
|
||||
"error": "Temporal infrastructure not ready",
|
||||
"temporal": not_ready,
|
||||
}
|
||||
|
||||
data = _lookup_workflow(workflow_name)
|
||||
@@ -323,72 +304,41 @@ async def get_workflow_parameters_mcp(workflow_name: str) -> Dict[str, Any]:
|
||||
@mcp.tool
|
||||
async def get_workflow_metadata_schema_mcp() -> Dict[str, Any]:
|
||||
"""Return the JSON schema describing workflow metadata files."""
|
||||
from src.temporal.discovery import WorkflowDiscovery
|
||||
return WorkflowDiscovery.get_metadata_schema()
|
||||
|
||||
|
||||
@mcp.tool
|
||||
async def submit_security_scan_mcp(
|
||||
workflow_name: str,
|
||||
target_path: str | None = None,
|
||||
volume_mode: str | None = None,
|
||||
target_id: str,
|
||||
parameters: Dict[str, Any] | None = None,
|
||||
) -> Dict[str, Any] | Dict[str, str]:
|
||||
"""Submit a Prefect workflow via MCP."""
|
||||
"""Submit a Temporal workflow via MCP."""
|
||||
try:
|
||||
not_ready = _prefect_not_ready_status()
|
||||
not_ready = _temporal_not_ready_status()
|
||||
if not_ready:
|
||||
return {
|
||||
"error": "Prefect infrastructure not ready",
|
||||
"prefect": not_ready,
|
||||
"error": "Temporal infrastructure not ready",
|
||||
"temporal": not_ready,
|
||||
}
|
||||
|
||||
workflow_info = prefect_mgr.workflows.get(workflow_name)
|
||||
workflow_info = temporal_mgr.workflows.get(workflow_name)
|
||||
if not workflow_info:
|
||||
return {"error": f"Workflow '{workflow_name}' not found"}
|
||||
|
||||
metadata = workflow_info.metadata or {}
|
||||
defaults = metadata.get("default_parameters", {})
|
||||
|
||||
resolved_target_path = target_path or metadata.get("default_target_path") or defaults.get("target_path")
|
||||
if not resolved_target_path:
|
||||
return {
|
||||
"error": (
|
||||
"target_path is required and no default_target_path is defined in metadata"
|
||||
),
|
||||
"metadata": {
|
||||
"workflow": workflow_name,
|
||||
"default_target_path": metadata.get("default_target_path"),
|
||||
},
|
||||
}
|
||||
|
||||
requested_volume_mode = volume_mode or metadata.get("default_volume_mode") or defaults.get("volume_mode")
|
||||
if not requested_volume_mode:
|
||||
requested_volume_mode = "ro"
|
||||
|
||||
normalised_volume_mode = (
|
||||
str(requested_volume_mode).strip().lower().replace("-", "_")
|
||||
)
|
||||
if normalised_volume_mode in {"read_only", "readonly", "ro"}:
|
||||
normalised_volume_mode = "ro"
|
||||
elif normalised_volume_mode in {"read_write", "readwrite", "rw"}:
|
||||
normalised_volume_mode = "rw"
|
||||
else:
|
||||
supported_modes = metadata.get("supported_volume_modes", ["ro", "rw"])
|
||||
if isinstance(supported_modes, list) and normalised_volume_mode in supported_modes:
|
||||
pass
|
||||
else:
|
||||
normalised_volume_mode = "ro"
|
||||
|
||||
parameters = parameters or {}
|
||||
|
||||
cleaned_parameters: Dict[str, Any] = {**defaults, **parameters}
|
||||
|
||||
# Ensure *_config structures default to dicts so Prefect validation passes.
|
||||
# Ensure *_config structures default to dicts
|
||||
for key, value in list(cleaned_parameters.items()):
|
||||
if isinstance(key, str) and key.endswith("_config") and value is None:
|
||||
cleaned_parameters[key] = {}
|
||||
|
||||
# Some workflows expect configuration dictionaries even when omitted.
|
||||
# Some workflows expect configuration dictionaries even when omitted
|
||||
parameter_definitions = (
|
||||
metadata.get("parameters", {}).get("properties", {})
|
||||
if isinstance(metadata.get("parameters"), dict)
|
||||
@@ -403,20 +353,19 @@ async def submit_security_scan_mcp(
|
||||
elif cleaned_parameters[key] is None:
|
||||
cleaned_parameters[key] = {}
|
||||
|
||||
flow_run = await prefect_mgr.submit_workflow(
|
||||
# Start workflow
|
||||
handle = await temporal_mgr.run_workflow(
|
||||
workflow_name=workflow_name,
|
||||
target_path=resolved_target_path,
|
||||
volume_mode=normalised_volume_mode,
|
||||
parameters=cleaned_parameters,
|
||||
target_id=target_id,
|
||||
workflow_params=cleaned_parameters,
|
||||
)
|
||||
|
||||
return {
|
||||
"run_id": str(flow_run.id),
|
||||
"status": flow_run.state.name if flow_run.state else "PENDING",
|
||||
"run_id": handle.id,
|
||||
"status": "RUNNING",
|
||||
"workflow": workflow_name,
|
||||
"message": f"Workflow '{workflow_name}' submitted successfully",
|
||||
"target_path": resolved_target_path,
|
||||
"volume_mode": normalised_volume_mode,
|
||||
"target_id": target_id,
|
||||
"parameters": cleaned_parameters,
|
||||
"mcp_enabled": True,
|
||||
}
|
||||
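Review note: the parameter clean-up in `submit_security_scan_mcp` merges metadata defaults with caller-supplied values, then replaces any `*_config` entry that is still `None` with an empty dict. The sketch below isolates those two steps. `merge_parameters` is a hypothetical name, and the second pass driven by `parameter_definitions` is omitted because the diff only shows part of it.

```python
from typing import Any, Dict, Optional


def merge_parameters(
    defaults: Dict[str, Any],
    overrides: Optional[Dict[str, Any]],
) -> Dict[str, Any]:
    """Merge defaults with overrides; caller-supplied values win on key collisions."""
    merged: Dict[str, Any] = {**defaults, **(overrides or {})}

    # Ensure *_config structures default to dicts, as in the original loop.
    for key, value in list(merged.items()):
        if isinstance(key, str) and key.endswith("_config") and value is None:
            merged[key] = {}

    return merged
```

One consequence: a caller who passes `None` for a `*_config` key replaces the default config with `{}` and does not fall back to the default.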
@@ -427,43 +376,38 @@ async def submit_security_scan_mcp(
|
||||
|
||||
@mcp.tool
|
||||
async def get_comprehensive_scan_summary(run_id: str) -> Dict[str, Any] | Dict[str, str]:
|
||||
"""Return a summary for the given flow run via MCP."""
|
||||
"""Return a summary for the given workflow run via MCP."""
|
||||
try:
|
||||
not_ready = _prefect_not_ready_status()
|
||||
not_ready = _temporal_not_ready_status()
|
||||
if not_ready:
|
||||
return {
|
||||
"error": "Prefect infrastructure not ready",
|
||||
"prefect": not_ready,
|
||||
"error": "Temporal infrastructure not ready",
|
||||
"temporal": not_ready,
|
||||
}
|
||||
|
||||
status = await prefect_mgr.get_flow_run_status(run_id)
|
||||
findings = await prefect_mgr.get_flow_run_findings(run_id)
|
||||
|
||||
workflow_name = "unknown"
|
||||
deployment_id = status.get("workflow", "")
|
||||
for name, deployment in prefect_mgr.deployments.items():
|
||||
if str(deployment) == str(deployment_id):
|
||||
workflow_name = name
|
||||
break
|
||||
status = await temporal_mgr.get_workflow_status(run_id)
|
||||
|
||||
# Try to get result if completed
|
||||
total_findings = 0
|
||||
severity_summary = {"critical": 0, "high": 0, "medium": 0, "low": 0, "info": 0}
|
||||
|
||||
if findings and "sarif" in findings:
|
||||
sarif = findings["sarif"]
|
||||
if isinstance(sarif, dict):
|
||||
total_findings = sarif.get("total_findings", 0)
|
||||
if status.get("status") == "COMPLETED":
|
||||
try:
|
||||
result = await temporal_mgr.get_workflow_result(run_id)
|
||||
if isinstance(result, dict):
|
||||
summary = result.get("summary", {})
|
||||
total_findings = summary.get("total_findings", 0)
|
||||
except Exception as e:
|
||||
logger.debug(f"Could not retrieve result for {run_id}: {e}")
|
||||
|
||||
return {
|
||||
"run_id": run_id,
|
||||
"workflow": workflow_name,
|
||||
"workflow": "unknown", # Temporal doesn't track workflow name in status
|
||||
"status": status.get("status", "unknown"),
|
||||
"is_completed": status.get("is_completed", False),
|
||||
"is_completed": status.get("status") == "COMPLETED",
|
||||
"total_findings": total_findings,
|
||||
"severity_summary": severity_summary,
|
||||
"scan_duration": status.get("updated_at", "")
|
||||
if status.get("is_completed")
|
||||
else "In progress",
|
||||
"scan_duration": status.get("close_time", "In progress"),
|
||||
"recommendations": (
|
||||
[
|
||||
"Review high and critical severity findings first",
|
||||
@@ -482,32 +426,26 @@ async def get_comprehensive_scan_summary(run_id: str) -> Dict[str, Any] | Dict[s
|
||||
|
||||
@mcp.tool
|
||||
async def get_run_status_mcp(run_id: str) -> Dict[str, Any]:
|
||||
"""Return current status information for a Prefect run."""
|
||||
"""Return current status information for a Temporal run."""
|
||||
try:
|
||||
not_ready = _prefect_not_ready_status()
|
||||
not_ready = _temporal_not_ready_status()
|
||||
if not_ready:
|
||||
return {
|
||||
"error": "Prefect infrastructure not ready",
|
||||
"prefect": not_ready,
|
||||
"error": "Temporal infrastructure not ready",
|
||||
"temporal": not_ready,
|
||||
}
|
||||
|
||||
status = await prefect_mgr.get_flow_run_status(run_id)
|
||||
workflow_name = "unknown"
|
||||
deployment_id = status.get("workflow", "")
|
||||
for name, deployment in prefect_mgr.deployments.items():
|
||||
if str(deployment) == str(deployment_id):
|
||||
workflow_name = name
|
||||
break
|
||||
status = await temporal_mgr.get_workflow_status(run_id)
|
||||
|
||||
return {
|
||||
"run_id": status["run_id"],
|
||||
"workflow": workflow_name,
|
||||
"run_id": run_id,
|
||||
"workflow": "unknown",
|
||||
"status": status["status"],
|
||||
"is_completed": status["is_completed"],
|
||||
"is_failed": status["is_failed"],
|
||||
"is_running": status["is_running"],
|
||||
"created_at": status["created_at"],
|
||||
"updated_at": status["updated_at"],
|
||||
"is_completed": status["status"] in ["COMPLETED", "FAILED", "CANCELLED"],
|
||||
"is_failed": status["status"] == "FAILED",
|
||||
"is_running": status["status"] == "RUNNING",
|
||||
"created_at": status.get("start_time"),
|
||||
"updated_at": status.get("close_time") or status.get("execution_time"),
|
||||
}
|
||||
except Exception as exc:
|
||||
logger.exception("MCP run status failed")
|
||||
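Review note: `get_run_status_mcp` treats `COMPLETED`, `FAILED`, and `CANCELLED` as completed, while `get_comprehensive_scan_summary` sets `is_completed` only when the status is exactly `COMPLETED`. A single shared helper would keep the two tools consistent. The sketch below assumes the status strings shown in the diff; `status_flags` is a hypothetical name, and the actual values returned by `temporal_mgr.get_workflow_status` are not visible here.

```python
from typing import Dict

# Terminal states as listed in get_run_status_mcp.
TERMINAL_STATUSES = frozenset({"COMPLETED", "FAILED", "CANCELLED"})


def status_flags(status: str) -> Dict[str, bool]:
    """Derive the boolean flags exposed by the MCP tools from one status string."""
    return {
        "is_completed": status in TERMINAL_STATUSES,
        "is_failed": status == "FAILED",
        "is_running": status == "RUNNING",
    }
```

If the manager can report other terminal states, they would need to be added to `TERMINAL_STATUSES`. Otherwise such runs would show as neither running nor completed.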
@@ -518,38 +456,30 @@ async def get_run_status_mcp(run_id: str) -> Dict[str, Any]:
|
||||
async def get_run_findings_mcp(run_id: str) -> Dict[str, Any]:
|
||||
"""Return SARIF findings for a completed run."""
|
||||
try:
|
||||
not_ready = _prefect_not_ready_status()
|
||||
not_ready = _temporal_not_ready_status()
|
||||
if not_ready:
|
||||
return {
|
||||
"error": "Prefect infrastructure not ready",
|
||||
"prefect": not_ready,
|
||||
"error": "Temporal infrastructure not ready",
|
||||
"temporal": not_ready,
|
||||
}
|
||||
|
||||
status = await prefect_mgr.get_flow_run_status(run_id)
|
||||
if not status.get("is_completed"):
|
||||
status = await temporal_mgr.get_workflow_status(run_id)
|
||||
if status.get("status") != "COMPLETED":
|
||||
return {"error": f"Run {run_id} not completed. Status: {status.get('status')}"}
|
||||
|
||||
findings = await prefect_mgr.get_flow_run_findings(run_id)
|
||||
|
||||
workflow_name = "unknown"
|
||||
deployment_id = status.get("workflow", "")
|
||||
for name, deployment in prefect_mgr.deployments.items():
|
||||
if str(deployment) == str(deployment_id):
|
||||
workflow_name = name
|
||||
break
|
||||
result = await temporal_mgr.get_workflow_result(run_id)
|
||||
|
||||
metadata = {
|
||||
"completion_time": status.get("updated_at"),
|
||||
"completion_time": status.get("close_time"),
|
||||
"workflow_version": "unknown",
|
||||
}
|
||||
info = prefect_mgr.workflows.get(workflow_name)
|
||||
if info:
|
||||
metadata["workflow_version"] = info.metadata.get("version", "unknown")
|
||||
|
||||
sarif = result.get("sarif", {}) if isinstance(result, dict) else {}
|
||||
|
||||
return {
|
||||
"workflow": workflow_name,
|
||||
"workflow": "unknown",
|
||||
"run_id": run_id,
|
||||
"sarif": findings,
|
||||
"sarif": sarif,
|
||||
"metadata": metadata,
|
||||
}
|
||||
except Exception as exc:
|
||||
@@ -561,16 +491,15 @@ async def get_run_findings_mcp(run_id: str) -> Dict[str, Any]:
|
||||
async def list_recent_runs_mcp(
|
||||
limit: int = 10,
|
||||
workflow_name: str | None = None,
|
||||
states: List[str] | None = None,
|
||||
) -> Dict[str, Any]:
|
||||
"""List recent Prefect runs with optional workflow/state filters."""
|
||||
"""List recent Temporal runs with optional workflow filter."""
|
||||
|
||||
not_ready = _prefect_not_ready_status()
|
||||
not_ready = _temporal_not_ready_status()
|
||||
if not_ready:
|
||||
return {
|
||||
"runs": [],
|
||||
"prefect": not_ready,
|
||||
"message": "Prefect infrastructure is still initializing",
|
||||
"temporal": not_ready,
|
||||
"message": "Temporal infrastructure is still initializing",
|
||||
}
|
||||
|
||||
try:
|
||||
@@ -579,116 +508,49 @@ async def list_recent_runs_mcp(
|
||||
limit_value = 10
|
||||
limit_value = max(1, min(limit_value, 100))
|
||||
|
||||
deployment_map = {
|
||||
str(deployment_id): workflow
|
||||
for workflow, deployment_id in prefect_mgr.deployments.items()
|
||||
}
|
||||
try:
|
||||
# Build filter query
|
||||
filter_query = None
|
||||
if workflow_name:
|
||||
workflow_info = temporal_mgr.workflows.get(workflow_name)
|
||||
if workflow_info:
|
||||
filter_query = f'WorkflowType="{workflow_info.workflow_type}"'
|
||||
|
||||
deployment_filter_value = None
|
||||
if workflow_name:
|
||||
deployment_id = prefect_mgr.deployments.get(workflow_name)
|
||||
if not deployment_id:
|
||||
return {
|
||||
"runs": [],
|
||||
"prefect": get_prefect_status(),
|
||||
"error": f"Workflow '{workflow_name}' has no registered deployment",
|
||||
}
|
||||
try:
|
||||
deployment_filter_value = UUID(str(deployment_id))
|
||||
except ValueError:
|
||||
return {
|
||||
"runs": [],
|
||||
"prefect": get_prefect_status(),
|
||||
"error": (
|
||||
f"Deployment id '{deployment_id}' for workflow '{workflow_name}' is invalid"
|
||||
),
|
||||
}
|
||||
workflows = await temporal_mgr.list_workflows(filter_query, limit_value)
|
||||
|
||||
desired_state_types: List[StateType] = []
|
||||
if states:
|
||||
for raw_state in states:
|
||||
if not raw_state:
|
||||
continue
|
||||
normalised = raw_state.strip().upper()
|
||||
if normalised == "ALL":
|
||||
desired_state_types = []
|
||||
break
|
||||
try:
|
||||
desired_state_types.append(StateType[normalised])
|
||||
except KeyError:
|
||||
continue
|
||||
if not desired_state_types:
|
||||
desired_state_types = [
|
||||
StateType.RUNNING,
|
||||
StateType.COMPLETED,
|
||||
StateType.FAILED,
|
||||
StateType.CANCELLED,
|
||||
]
|
||||
results: List[Dict[str, Any]] = []
|
||||
for wf in workflows:
|
||||
results.append({
|
||||
"run_id": wf["workflow_id"],
|
||||
"workflow": workflow_name or "unknown",
|
||||
"state": wf["status"],
|
||||
"state_type": wf["status"],
|
||||
"is_completed": wf["status"] in ["COMPLETED", "FAILED", "CANCELLED"],
|
||||
"is_running": wf["status"] == "RUNNING",
|
||||
"is_failed": wf["status"] == "FAILED",
|
||||
"created_at": wf.get("start_time"),
|
||||
"updated_at": wf.get("close_time"),
|
||||
})
|
||||
|
||||
flow_filter = FlowRunFilter()
|
||||
if desired_state_types:
|
||||
flow_filter.state = FlowRunFilterState(
|
||||
type=FlowRunFilterStateType(any_=desired_state_types)
|
||||
)
|
||||
if deployment_filter_value:
|
||||
flow_filter.deployment_id = FlowRunFilterDeploymentId(
|
||||
any_=[deployment_filter_value]
|
||||
)
|
||||
return {"runs": results, "temporal": get_temporal_status()}
|
||||
|
||||
async with get_client() as client:
|
||||
flow_runs = await client.read_flow_runs(
|
||||
limit=limit_value,
|
||||
flow_run_filter=flow_filter,
|
||||
sort=FlowRunSort.START_TIME_DESC,
|
||||
)
|
||||
|
||||
results: List[Dict[str, Any]] = []
|
||||
for flow_run in flow_runs:
|
||||
deployment_id = getattr(flow_run, "deployment_id", None)
|
||||
workflow = deployment_map.get(str(deployment_id), "unknown")
|
||||
state = getattr(flow_run, "state", None)
|
||||
state_name = getattr(state, "name", None) if state else None
|
||||
state_type = getattr(state, "type", None) if state else None
|
||||
|
||||
results.append(
|
||||
{
|
||||
"run_id": str(flow_run.id),
|
||||
"workflow": workflow,
|
||||
"deployment_id": str(deployment_id) if deployment_id else None,
|
||||
"state": state_name or (state_type.name if state_type else None),
|
||||
"state_type": state_type.name if state_type else None,
|
||||
"is_completed": bool(getattr(state, "is_completed", lambda: False)()),
|
||||
"is_running": bool(getattr(state, "is_running", lambda: False)()),
|
||||
"is_failed": bool(getattr(state, "is_failed", lambda: False)()),
|
||||
"created_at": getattr(flow_run, "created", None),
|
||||
"updated_at": getattr(flow_run, "updated", None),
|
||||
"expected_start_time": getattr(flow_run, "expected_start_time", None),
|
||||
"start_time": getattr(flow_run, "start_time", None),
|
||||
}
|
||||
)
|
||||
|
||||
# Normalise datetimes to ISO 8601 strings for serialization
|
||||
for entry in results:
|
||||
for key in ("created_at", "updated_at", "expected_start_time", "start_time"):
|
||||
value = entry.get(key)
|
||||
if value is None:
|
||||
continue
|
||||
try:
|
||||
entry[key] = value.isoformat()
|
||||
except AttributeError:
|
||||
entry[key] = str(value)
|
||||
|
||||
return {"runs": results, "prefect": get_prefect_status()}
|
||||
except Exception as exc:
|
||||
logger.exception("Failed to list runs")
|
||||
return {
|
||||
"runs": [],
|
||||
"temporal": get_temporal_status(),
|
||||
"error": str(exc)
|
||||
}
|
||||
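Review note: the new `list_recent_runs_mcp` clamps `limit` to the range 1 to 100 and builds an optional `WorkflowType` filter before calling `temporal_mgr.list_workflows`. The sketch below restates those two steps as standalone functions. Both function names are hypothetical, and the fallback to 10 on an unparsable limit is an assumption inferred from the `limit_value = 10` line, since the surrounding `try` block is not shown in the diff.

```python
from typing import Any, Optional


def clamp_limit(raw: Any, default: int = 10, lowest: int = 1, highest: int = 100) -> int:
    """Coerce a caller-supplied limit to an int within [lowest, highest]."""
    try:
        value = int(raw)
    except (TypeError, ValueError):
        value = default  # assumed fallback, see the note above
    return max(lowest, min(value, highest))


def build_filter_query(workflow_type: Optional[str]) -> Optional[str]:
    """Return the filter string used in the diff, or None when no filter applies."""
    if not workflow_type:
        return None
    return f'WorkflowType="{workflow_type}"'
```

Note that in the diff an unknown `workflow_name` leaves `filter_query` as `None`, so the tool returns unfiltered runs labelled with the requested name and reports no error.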
|
||||
|
||||
@mcp.tool
|
||||
async def get_fuzzing_stats_mcp(run_id: str) -> Dict[str, Any]:
|
||||
"""Return fuzzing statistics for a run if available."""
|
||||
not_ready = _prefect_not_ready_status()
|
||||
not_ready = _temporal_not_ready_status()
|
||||
if not_ready:
|
||||
return {
|
||||
"error": "Prefect infrastructure not ready",
|
||||
"prefect": not_ready,
|
||||
"error": "Temporal infrastructure not ready",
|
||||
"temporal": not_ready,
|
||||
}
|
||||
|
||||
stats = fuzzing.fuzzing_stats.get(run_id)
|
||||
@@ -708,11 +570,11 @@ async def get_fuzzing_stats_mcp(run_id: str) -> Dict[str, Any]:
|
||||
@mcp.tool
|
||||
async def get_fuzzing_crash_reports_mcp(run_id: str) -> Dict[str, Any]:
|
||||
"""Return crash reports collected for a fuzzing run."""
|
||||
not_ready = _prefect_not_ready_status()
|
||||
not_ready = _temporal_not_ready_status()
|
||||
if not_ready:
|
||||
return {
|
||||
"error": "Prefect infrastructure not ready",
|
||||
"prefect": not_ready,
|
||||
"error": "Temporal infrastructure not ready",
|
||||
"temporal": not_ready,
|
||||
}
|
||||
|
||||
reports = fuzzing.crash_reports.get(run_id)
|
||||
@@ -725,11 +587,11 @@ async def get_fuzzing_crash_reports_mcp(run_id: str) -> Dict[str, Any]:
 async def get_backend_status_mcp() -> Dict[str, Any]:
     """Expose backend readiness, workflows, and registered MCP tools."""

-    status = get_prefect_status()
-    response: Dict[str, Any] = {"prefect": status}
+    status = get_temporal_status()
+    response: Dict[str, Any] = {"temporal": status}

     if status.get("ready"):
-        response["workflows"] = list(prefect_mgr.workflows.keys())
+        response["workflows"] = list(temporal_mgr.workflows.keys())

     try:
         tools = await mcp._tool_manager.list_tools()
@@ -775,12 +637,12 @@ def create_mcp_transport_app() -> Starlette:


 # ---------------------------------------------------------------------------
-# Combined lifespan: Prefect init + dedicated MCP transports
+# Combined lifespan: Temporal init + dedicated MCP transports
 # ---------------------------------------------------------------------------

 @asynccontextmanager
 async def combined_lifespan(app: FastAPI):
-    global prefect_bootstrap_task, _fastapi_mcp_imported
+    global temporal_bootstrap_task, _fastapi_mcp_imported

     logger.info("Starting FuzzForge backend...")

@@ -793,12 +655,12 @@ async def combined_lifespan(app: FastAPI):
         except Exception as exc:
             logger.exception("Failed to import FastAPI endpoints into MCP", exc_info=exc)

-    # Kick off Prefect bootstrap in the background if needed
-    if prefect_bootstrap_task is None or prefect_bootstrap_task.done():
-        prefect_bootstrap_task = asyncio.create_task(_bootstrap_prefect_with_retries())
-        logger.info("Prefect bootstrap task started")
+    # Kick off Temporal bootstrap in the background if needed
+    if temporal_bootstrap_task is None or temporal_bootstrap_task.done():
+        temporal_bootstrap_task = asyncio.create_task(_bootstrap_temporal_with_retries())
+        logger.info("Temporal bootstrap task started")
     else:
-        logger.info("Prefect bootstrap task already running")
+        logger.info("Temporal bootstrap task already running")

     # Start MCP transports on shared port (HTTP + SSE)
     mcp_app = create_mcp_transport_app()
@@ -846,18 +708,17 @@ async def combined_lifespan(app: FastAPI):
         mcp_server.force_exit = True
         await asyncio.gather(mcp_task, return_exceptions=True)

-        if prefect_bootstrap_task and not prefect_bootstrap_task.done():
-            prefect_bootstrap_task.cancel()
+        if temporal_bootstrap_task and not temporal_bootstrap_task.done():
+            temporal_bootstrap_task.cancel()
             with suppress(asyncio.CancelledError):
-                await prefect_bootstrap_task
-        prefect_bootstrap_state.task_running = False
-        if not prefect_bootstrap_state.ready:
-            prefect_bootstrap_state.status = "stopped"
-            prefect_bootstrap_state.next_retry_seconds = None
-        prefect_bootstrap_task = None
+                await temporal_bootstrap_task
+        temporal_bootstrap_state.task_running = False
+        if not temporal_bootstrap_state.ready:
+            temporal_bootstrap_state.status = "stopped"
+        temporal_bootstrap_task = None

-        logger.info("Shutting down Prefect statistics monitor...")
-        await prefect_stats_monitor.stop_monitoring()
+        # Close Temporal client
+        await temporal_mgr.close()
         logger.info("Shutting down FuzzForge backend...")


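The shutdown path in the hunk above cancels the background bootstrap task and then awaits it under `suppress(asyncio.CancelledError)` so the cancellation finishes before the state flags are reset. A minimal, self-contained sketch of that pattern follows. The names `bootstrap_with_retries`, `run_lifespan`, and the `state` dictionary are hypothetical stand-ins for illustration and are not taken from the FuzzForge code base.

```python
import asyncio
from contextlib import suppress
from typing import Any, Dict, List


async def bootstrap_with_retries(attempts: List[str]) -> None:
    """Hypothetical bootstrap coroutine that keeps retrying until cancelled."""
    while True:
        attempts.append("attempt")
        await asyncio.sleep(0.01)


async def run_lifespan() -> Dict[str, Any]:
    """Start a background task, then shut it down the way the hunk above does."""
    attempts: List[str] = []
    state: Dict[str, Any] = {"task_running": True, "ready": False, "status": "starting"}

    # Startup: kick off the bootstrap in the background.
    task = asyncio.create_task(bootstrap_with_retries(attempts))
    await asyncio.sleep(0.03)  # an application would serve requests here

    # Shutdown: cancel, then await so the CancelledError is consumed.
    if not task.done():
        task.cancel()
        with suppress(asyncio.CancelledError):
            await task
    state["task_running"] = False
    if not state["ready"]:
        state["status"] = "stopped"

    state["cancelled"] = task.cancelled()
    state["attempts"] = len(attempts)
    return state


if __name__ == "__main__":
    print(asyncio.run(run_lifespan()))
```

Awaiting the cancelled task matters: without it, the task may still be unwinding when the event loop closes.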
|
||||
@@ -13,10 +13,9 @@ Models for workflow findings and submissions
 #
 # Additional attribution and requirements are provided in the NOTICE file.

-from pydantic import BaseModel, Field, field_validator
+from pydantic import BaseModel, Field
 from typing import Dict, Any, Optional, Literal, List
 from datetime import datetime
-from pathlib import Path


 class WorkflowFindings(BaseModel):
@@ -27,47 +26,13 @@ class WorkflowFindings(BaseModel):
     metadata: Dict[str, Any] = Field(default_factory=dict, description="Additional metadata")


-class ResourceLimits(BaseModel):
-    """Resource limits for workflow execution"""
-    cpu_limit: Optional[str] = Field(None, description="CPU limit (e.g., '2' for 2 cores, '500m' for 0.5 cores)")
-    memory_limit: Optional[str] = Field(None, description="Memory limit (e.g., '1Gi', '512Mi')")
-    cpu_request: Optional[str] = Field(None, description="CPU request (guaranteed)")
-    memory_request: Optional[str] = Field(None, description="Memory request (guaranteed)")
-
-
-class VolumeMount(BaseModel):
-    """Volume mount specification"""
-    host_path: str = Field(..., description="Host path to mount")
-    container_path: str = Field(..., description="Container path for mount")
-    mode: Literal["ro", "rw"] = Field(default="ro", description="Mount mode")
-
-    @field_validator("host_path")
-    @classmethod
-    def validate_host_path(cls, v):
-        """Validate that the host path is absolute (existence checked at runtime)"""
-        path = Path(v)
-        if not path.is_absolute():
-            raise ValueError(f"Host path must be absolute: {v}")
-        # Note: Path existence is validated at workflow runtime
-        # We can't validate existence here as this runs inside Docker container
-        return str(path)
-
-    @field_validator("container_path")
-    @classmethod
-    def validate_container_path(cls, v):
-        """Validate that the container path is absolute"""
-        if not v.startswith('/'):
-            raise ValueError(f"Container path must be absolute: {v}")
-        return v
-
-
 class WorkflowSubmission(BaseModel):
-    """Submit a workflow with configurable settings"""
-    target_path: str = Field(..., description="Absolute path to analyze")
-    volume_mode: Literal["ro", "rw"] = Field(
-        default="ro",
-        description="Volume mount mode: read-only (ro) or read-write (rw)"
-    )
+    """
+    Submit a workflow with configurable settings.
+
+    Note: This model is deprecated in favor of the /upload-and-submit endpoint
+    which handles file uploads directly.
+    """
     parameters: Dict[str, Any] = Field(
         default_factory=dict,
         description="Workflow-specific parameters"
@@ -78,25 +43,6 @@ class WorkflowSubmission(BaseModel):
         ge=1,
         le=604800 # Max 7 days to support fuzzing campaigns
     )
-    resource_limits: Optional[ResourceLimits] = Field(
-        None,
-        description="Resource limits for workflow container"
-    )
-    additional_volumes: List[VolumeMount] = Field(
-        default_factory=list,
-        description="Additional volume mounts (e.g., for corpus, output directories)"
-    )
-
-    @field_validator("target_path")
-    @classmethod
-    def validate_path(cls, v):
-        """Validate that the target path is absolute (existence checked at runtime)"""
-        path = Path(v)
-        if not path.is_absolute():
-            raise ValueError(f"Path must be absolute: {v}")
-        # Note: Path existence is validated at workflow runtime when volumes are mounted
-        # We can't validate existence here as this runs inside Docker container
-        return str(path)


 class WorkflowStatus(BaseModel):
@@ -131,10 +77,6 @@ class WorkflowMetadata(BaseModel):
         default=["ro", "rw"],
         description="Supported volume mount modes"
     )
-    has_custom_docker: bool = Field(
-        default=False,
-        description="Whether workflow has custom Dockerfile"
-    )


 class WorkflowListItem(BaseModel):
|
||||
@@ -1,394 +0,0 @@
-"""
-Generic Prefect Statistics Monitor Service
-
-This service monitors ALL workflows for structured live data logging and
-updates the appropriate statistics APIs. Works with any workflow that follows
-the standard LIVE_STATS logging pattern.
-"""
-# Copyright (c) 2025 FuzzingLabs
-#
-# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
-# at the root of this repository for details.
-#
-# After the Change Date (four years from publication), this version of the
-# Licensed Work will be made available under the Apache License, Version 2.0.
-# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
-#
-# Additional attribution and requirements are provided in the NOTICE file.
-
-
-import asyncio
-import json
-import logging
-from datetime import datetime, timedelta, timezone
-from typing import Dict, Any, Optional
-from prefect.client.orchestration import get_client
-from prefect.client.schemas.objects import FlowRun, TaskRun
-from src.models.findings import FuzzingStats
-from src.api.fuzzing import fuzzing_stats, initialize_fuzzing_tracking, active_connections
-
-logger = logging.getLogger(__name__)
-
-
-class PrefectStatsMonitor:
-    """Monitors Prefect flows and tasks for live statistics from any workflow"""
-
-    def __init__(self):
-        self.monitoring = False
-        self.monitor_task = None
-        self.monitored_runs = set()
-        self.last_log_ts: Dict[str, datetime] = {}
-        self._client = None
-        self._client_refresh_time = None
-        self._client_refresh_interval = 300 # Refresh connection every 5 minutes
-
-    async def start_monitoring(self):
-        """Start the Prefect statistics monitoring service"""
-        if self.monitoring:
-            logger.warning("Prefect stats monitor already running")
-            return
-
-        self.monitoring = True
-        self.monitor_task = asyncio.create_task(self._monitor_flows())
-        logger.info("Started Prefect statistics monitor")
-
-    async def stop_monitoring(self):
-        """Stop the monitoring service"""
-        self.monitoring = False
-        if self.monitor_task:
-            self.monitor_task.cancel()
-            try:
-                await self.monitor_task
-            except asyncio.CancelledError:
-                pass
-        logger.info("Stopped Prefect statistics monitor")
-
-    async def _get_or_refresh_client(self):
-        """Get or refresh Prefect client with connection pooling."""
-        now = datetime.now(timezone.utc)
-
-        if (self._client is None or
-            self._client_refresh_time is None or
-            (now - self._client_refresh_time).total_seconds() > self._client_refresh_interval):
-
-            if self._client:
-                try:
-                    await self._client.aclose()
-                except Exception:
-                    pass
-
-            self._client = get_client()
-            self._client_refresh_time = now
-            await self._client.__aenter__()
-
-        return self._client
-
-    async def _monitor_flows(self):
-        """Main monitoring loop that watches Prefect flows"""
-        try:
-            while self.monitoring:
-                try:
-                    # Use connection pooling for better performance
-                    client = await self._get_or_refresh_client()
-
-                    # Get recent flow runs (limit to reduce load)
-                    flow_runs = await client.read_flow_runs(
-                        limit=50,
-                        sort="START_TIME_DESC",
-                    )
-
-                    # Only consider runs from the last 15 minutes
-                    recent_cutoff = datetime.now(timezone.utc) - timedelta(minutes=15)
-                    for flow_run in flow_runs:
-                        created = getattr(flow_run, "created", None)
-                        if created is None:
-                            continue
-                        try:
-                            # Ensure timezone-aware comparison
-                            if created.tzinfo is None:
-                                created = created.replace(tzinfo=timezone.utc)
-                            if created >= recent_cutoff:
-                                await self._monitor_flow_run(client, flow_run)
-                        except Exception:
-                            # If comparison fails, attempt monitoring anyway
-                            await self._monitor_flow_run(client, flow_run)
-
-                    await asyncio.sleep(5) # Check every 5 seconds
-
-                except Exception as e:
-                    logger.error(f"Error in Prefect monitoring: {e}")
-                    await asyncio.sleep(10)
-
-        except asyncio.CancelledError:
-            logger.info("Prefect monitoring cancelled")
-        except Exception as e:
-            logger.error(f"Fatal error in Prefect monitoring: {e}")
-        finally:
-            # Clean up client on exit
-            if self._client:
-                try:
-                    await self._client.__aexit__(None, None, None)
-                except Exception:
-                    pass
-                self._client = None
-
-    async def _monitor_flow_run(self, client, flow_run: FlowRun):
-        """Monitor a specific flow run for statistics"""
-        run_id = str(flow_run.id)
-        workflow_name = flow_run.name or "unknown"
-
-        try:
-            # Initialize tracking if not exists - only for workflows that might have live stats
-            if run_id not in fuzzing_stats:
-                initialize_fuzzing_tracking(run_id, workflow_name)
-                self.monitored_runs.add(run_id)
-
-            # Skip corrupted entries (should not happen after startup cleanup, but defensive)
-            elif not isinstance(fuzzing_stats[run_id], FuzzingStats):
-                logger.warning(f"Skipping corrupted stats entry for {run_id}, reinitializing")
-                initialize_fuzzing_tracking(run_id, workflow_name)
-                self.monitored_runs.add(run_id)
-
-            # Get task runs for this flow
-            task_runs = await client.read_task_runs(
-                flow_run_filter={"id": {"any_": [flow_run.id]}},
-                limit=25,
-            )
-
-            # Check all tasks for live statistics logging
-            for task_run in task_runs:
-                await self._extract_stats_from_task(client, run_id, task_run, workflow_name)
-
-            # Also scan flow-level logs as a fallback
-            await self._extract_stats_from_flow_logs(client, run_id, flow_run, workflow_name)
-
-        except Exception as e:
-            logger.warning(f"Error monitoring flow run {run_id}: {e}")
-
-    async def _extract_stats_from_task(self, client, run_id: str, task_run: TaskRun, workflow_name: str):
-        """Extract statistics from any task that logs live stats"""
-        try:
-            # Get task run logs
-            logs = await client.read_logs(
-                log_filter={
-                    "task_run_id": {"any_": [task_run.id]}
-                },
-                limit=100,
-                sort="TIMESTAMP_ASC"
-            )
-
-            # Parse logs for LIVE_STATS entries (generic pattern for any workflow)
-            latest_stats = None
-            for log in logs:
-                # Prefer structured extra field if present
-                extra_data = getattr(log, "extra", None) or getattr(log, "extra_fields", None) or None
-                if isinstance(extra_data, dict):
-                    stat_type = extra_data.get("stats_type")
-                    if stat_type in ["fuzzing_live_update", "scan_progress", "analysis_update", "live_stats"]:
-                        latest_stats = extra_data
-                        continue
-
-                # Fallback to parsing from message text
-                if ("FUZZ_STATS" in log.message or "LIVE_STATS" in log.message):
-                    stats = self._parse_stats_from_log(log.message)
-                    if stats:
-                        latest_stats = stats
-
-            # Update statistics if we found any
-            if latest_stats:
-                # Calculate elapsed time from task start
-                elapsed_time = 0
-                if task_run.start_time:
-                    # Ensure timezone-aware arithmetic
-                    now = datetime.now(timezone.utc)
-                    try:
-                        elapsed_time = int((now - task_run.start_time).total_seconds())
-                    except Exception:
-                        # Fallback to naive UTC if types mismatch
-                        elapsed_time = int((datetime.utcnow() - task_run.start_time.replace(tzinfo=None)).total_seconds())
-
-                updated_stats = FuzzingStats(
-                    run_id=run_id,
-                    workflow=workflow_name,
-                    executions=latest_stats.get("executions", 0),
-                    executions_per_sec=latest_stats.get("executions_per_sec", 0.0),
-                    crashes=latest_stats.get("crashes", 0),
-                    unique_crashes=latest_stats.get("unique_crashes", 0),
-                    corpus_size=latest_stats.get("corpus_size", 0),
-                    elapsed_time=elapsed_time
-                )
-
-                # Update the global stats
-                previous = fuzzing_stats.get(run_id)
-                fuzzing_stats[run_id] = updated_stats
-
-                # Broadcast to any active WebSocket clients for this run
-                if active_connections.get(run_id):
-                    # Handle both Pydantic objects and plain dicts
-                    if isinstance(updated_stats, dict):
-                        stats_data = updated_stats
-                    elif hasattr(updated_stats, 'model_dump'):
-                        stats_data = updated_stats.model_dump()
-                    elif hasattr(updated_stats, 'dict'):
-                        stats_data = updated_stats.dict()
-                    else:
-                        stats_data = updated_stats.__dict__
-
-                    message = {
-                        "type": "stats_update",
-                        "data": stats_data,
-                    }
-                    disconnected = []
-                    for ws in active_connections[run_id]:
-                        try:
-                            await ws.send_text(json.dumps(message))
-                        except Exception:
-                            disconnected.append(ws)
-                    # Clean up disconnected sockets
-                    for ws in disconnected:
-                        try:
-                            active_connections[run_id].remove(ws)
-                        except ValueError:
-                            pass
-
-                logger.debug(f"Updated Prefect stats for {run_id}: {updated_stats.executions} execs")
-
-        except Exception as e:
-            logger.warning(f"Error extracting stats from task {task_run.id}: {e}")
-
-    async def _extract_stats_from_flow_logs(self, client, run_id: str, flow_run: FlowRun, workflow_name: str):
-        """Extract statistics by scanning flow-level logs for LIVE/FUZZ stats"""
-        try:
-            logs = await client.read_logs(
-                log_filter={
-                    "flow_run_id": {"any_": [flow_run.id]}
-                },
-                limit=200,
-                sort="TIMESTAMP_ASC"
-            )
-
-            latest_stats = None
-            last_seen = self.last_log_ts.get(run_id)
-            max_ts = last_seen
-
-            for log in logs:
-                # Skip logs we've already processed
-                ts = getattr(log, "timestamp", None)
-                if last_seen and ts and ts <= last_seen:
-                    continue
-                if ts and (max_ts is None or ts > max_ts):
-                    max_ts = ts
-
-                # Prefer structured extra field if available
-                extra_data = getattr(log, "extra", None) or getattr(log, "extra_fields", None) or None
-                if isinstance(extra_data, dict):
-                    stat_type = extra_data.get("stats_type")
-                    if stat_type in ["fuzzing_live_update", "scan_progress", "analysis_update", "live_stats"]:
-                        latest_stats = extra_data
-                        continue
-
-                # Fallback to message parse
-                if ("FUZZ_STATS" in log.message or "LIVE_STATS" in log.message):
-                    stats = self._parse_stats_from_log(log.message)
-                    if stats:
-                        latest_stats = stats
-
-            if max_ts:
-                self.last_log_ts[run_id] = max_ts
-
-            if latest_stats:
-                # Use flow_run timestamps for elapsed time if available
-                elapsed_time = 0
-                start_time = getattr(flow_run, "start_time", None) or getattr(flow_run, "start_time", None)
-                if start_time:
-                    now = datetime.now(timezone.utc)
-                    try:
-                        if start_time.tzinfo is None:
-                            start_time = start_time.replace(tzinfo=timezone.utc)
-                        elapsed_time = int((now - start_time).total_seconds())
-                    except Exception:
-                        elapsed_time = int((datetime.utcnow() - start_time.replace(tzinfo=None)).total_seconds())
-
-                updated_stats = FuzzingStats(
-                    run_id=run_id,
-                    workflow=workflow_name,
-                    executions=latest_stats.get("executions", 0),
-                    executions_per_sec=latest_stats.get("executions_per_sec", 0.0),
-                    crashes=latest_stats.get("crashes", 0),
-                    unique_crashes=latest_stats.get("unique_crashes", 0),
-                    corpus_size=latest_stats.get("corpus_size", 0),
-                    elapsed_time=elapsed_time
-                )
-
-                fuzzing_stats[run_id] = updated_stats
-
-                # Broadcast if listeners exist
-                if active_connections.get(run_id):
-                    # Handle both Pydantic objects and plain dicts
-                    if isinstance(updated_stats, dict):
-                        stats_data = updated_stats
-                    elif hasattr(updated_stats, 'model_dump'):
-                        stats_data = updated_stats.model_dump()
-                    elif hasattr(updated_stats, 'dict'):
-                        stats_data = updated_stats.dict()
-                    else:
-                        stats_data = updated_stats.__dict__
-
-                    message = {
-                        "type": "stats_update",
-                        "data": stats_data,
-                    }
-                    disconnected = []
-                    for ws in active_connections[run_id]:
-                        try:
-                            await ws.send_text(json.dumps(message))
-                        except Exception:
-                            disconnected.append(ws)
-                    for ws in disconnected:
-                        try:
-                            active_connections[run_id].remove(ws)
-                        except ValueError:
-                            pass
-
-        except Exception as e:
-            logger.warning(f"Error extracting stats from flow logs {run_id}: {e}")
-
-    def _parse_stats_from_log(self, log_message: str) -> Optional[Dict[str, Any]]:
-        """Parse statistics from a log message"""
-        try:
-            import re
-
-            # Prefer explicit JSON after marker tokens
-            m = re.search(r'(?:FUZZ_STATS|LIVE_STATS)\s+(\{.*\})', log_message)
-            if m:
-                try:
-                    return json.loads(m.group(1))
-                except Exception:
-                    pass
-
-            # Fallback: Extract the extra= dict and coerce to JSON
-            stats_match = re.search(r'extra=({.*?})', log_message)
-            if not stats_match:
-                return None
-
-            extra_str = stats_match.group(1)
-            extra_str = extra_str.replace("'", '"')
-            extra_str = extra_str.replace('None', 'null')
-            extra_str = extra_str.replace('True', 'true')
-            extra_str = extra_str.replace('False', 'false')
-
-            stats_data = json.loads(extra_str)
-
-            # Support multiple stat types for different workflows
-            stat_type = stats_data.get("stats_type")
-            if stat_type in ["fuzzing_live_update", "scan_progress", "analysis_update", "live_stats"]:
-                return stats_data
-
-        except Exception as e:
-            logger.debug(f"Error parsing log stats: {e}")
-
-        return None
-
-
-# Global instance
-prefect_stats_monitor = PrefectStatsMonitor()
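The removed monitor depended on workflows logging a `LIVE_STATS` or `FUZZ_STATS` marker followed by a JSON object. As a rough illustration of that logging pattern, the sketch below parses such a line using the same marker regular expression as the deleted `_parse_stats_from_log`. The function name `parse_live_stats` is hypothetical, and the `extra=` fallback from the original is left out on purpose.

```python
import json
import re
from typing import Any, Dict, Optional

# Same marker pattern as the deleted _parse_stats_from_log: a marker token,
# whitespace, then a JSON object running to the last closing brace.
STATS_MARKER = re.compile(r"(?:FUZZ_STATS|LIVE_STATS)\s+(\{.*\})")


def parse_live_stats(message: str) -> Optional[Dict[str, Any]]:
    """Return the JSON payload that follows a stats marker, or None."""
    match = STATS_MARKER.search(message)
    if not match:
        return None
    try:
        payload = json.loads(match.group(1))
    except json.JSONDecodeError:
        return None
    # Only dictionaries are meaningful as stats payloads.
    return payload if isinstance(payload, dict) else None


if __name__ == "__main__":
    line = 'INFO LIVE_STATS {"executions": 1200, "crashes": 3, "corpus_size": 41}'
    print(parse_live_stats(line))
```

Emitting the payload as strict JSON after the marker avoids the quote and `None`/`True`/`False` rewriting that the original fallback path needed.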
@@ -0,0 +1,10 @@
+"""
+Storage abstraction layer for FuzzForge.
+
+Provides unified interface for storing and retrieving targets and results.
+"""
+
+from .base import StorageBackend
+from .s3_cached import S3CachedStorage
+
+__all__ = ["StorageBackend", "S3CachedStorage"]
@@ -0,0 +1,153 @@
+"""
+Base storage backend interface.
+
+All storage implementations must implement this interface.
+"""
+
+from abc import ABC, abstractmethod
+from pathlib import Path
+from typing import Optional, Dict, Any
+
+
+class StorageBackend(ABC):
+    """
+    Abstract base class for storage backends.
+
+    Implementations handle storage and retrieval of:
+    - Uploaded targets (code, binaries, etc.)
+    - Workflow results
+    - Temporary files
+    """
+
+    @abstractmethod
+    async def upload_target(
+        self,
+        file_path: Path,
+        user_id: str,
+        metadata: Optional[Dict[str, Any]] = None
+    ) -> str:
+        """
+        Upload a target file to storage.
+
+        Args:
+            file_path: Local path to file to upload
+            user_id: ID of user uploading the file
+            metadata: Optional metadata to store with file
+
+        Returns:
+            Target ID (unique identifier for retrieval)
+
+        Raises:
+            FileNotFoundError: If file_path doesn't exist
+            StorageError: If upload fails
+        """
+        pass
+
+    @abstractmethod
+    async def get_target(self, target_id: str) -> Path:
+        """
+        Get target file from storage.
+
+        Args:
+            target_id: Unique identifier from upload_target()
+
+        Returns:
+            Local path to cached file
+
+        Raises:
+            FileNotFoundError: If target doesn't exist
+            StorageError: If download fails
+        """
+        pass
+
+    @abstractmethod
+    async def delete_target(self, target_id: str) -> None:
+        """
+        Delete target from storage.
+
+        Args:
+            target_id: Unique identifier to delete
+
+        Raises:
+            StorageError: If deletion fails (doesn't raise if not found)
+        """
+        pass
+
+    @abstractmethod
+    async def upload_results(
+        self,
+        workflow_id: str,
+        results: Dict[str, Any],
+        results_format: str = "json"
+    ) -> str:
+        """
+        Upload workflow results to storage.
+
+        Args:
+            workflow_id: Workflow execution ID
+            results: Results dictionary
+            results_format: Format (json, sarif, etc.)
+
+        Returns:
+            URL to uploaded results
+
+        Raises:
+            StorageError: If upload fails
+        """
+        pass
+
+    @abstractmethod
+    async def get_results(self, workflow_id: str) -> Dict[str, Any]:
+        """
+        Get workflow results from storage.
+
+        Args:
+            workflow_id: Workflow execution ID
+
+        Returns:
+            Results dictionary
+
+        Raises:
+            FileNotFoundError: If results don't exist
+            StorageError: If download fails
+        """
+        pass
+
+    @abstractmethod
+    async def list_targets(
+        self,
+        user_id: Optional[str] = None,
+        limit: int = 100
+    ) -> list[Dict[str, Any]]:
+        """
+        List uploaded targets.
+
+        Args:
+            user_id: Filter by user ID (None = all users)
+            limit: Maximum number of results
+
+        Returns:
+            List of target metadata dictionaries
+
+        Raises:
+            StorageError: If listing fails
+        """
+        pass
+
+    @abstractmethod
+    async def cleanup_cache(self) -> int:
+        """
+        Clean up local cache (LRU eviction).
+
+        Returns:
+            Number of files removed
+
+        Raises:
+            StorageError: If cleanup fails
+        """
+        pass
+
+
+class StorageError(Exception):
+    """Base exception for storage operations."""
+    pass
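The `cleanup_cache` contract above only promises "LRU eviction" and a count of removed files; the concrete implementation is not visible in this part of the diff. The sketch below shows one way such an eviction could work, assuming that "least recently used" is approximated by file modification time (which a `touch()` on a cache hit would refresh). The names `evict_lru` and `demo` are hypothetical and are not part of the `StorageBackend` interface.

```python
import os
import tempfile
from pathlib import Path


def evict_lru(cache_dir: Path, max_bytes: int) -> int:
    """Delete the least recently touched files until the cache fits max_bytes.

    Returns the number of files removed. Empty directories are left in place.
    """
    entries = []
    for path in cache_dir.rglob("*"):
        if path.is_file():
            info = path.stat()
            entries.append((info.st_mtime, info.st_size, path))
    entries.sort(key=lambda entry: entry[0])  # oldest modification time first

    total = sum(size for _, size, _ in entries)
    removed = 0
    for _, size, path in entries:
        if total <= max_bytes:
            break
        path.unlink()
        total -= size
        removed += 1
    return removed


def demo() -> int:
    """Build a throwaway cache of three 10-byte entries and trim it to 20 bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for age, name in enumerate(["oldest", "middle", "newest"]):
            entry = root / name / "target"
            entry.parent.mkdir()
            entry.write_bytes(b"x" * 10)
            os.utime(entry, (1000 + age, 1000 + age))  # fake access/modify times
        return evict_lru(root, max_bytes=20)


if __name__ == "__main__":
    print(demo())
```

Sorting by modification time keeps the sketch portable, since access times are often not updated on filesystems mounted with `noatime`.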
@@ -0,0 +1,423 @@
+"""
+S3-compatible storage backend with local caching.
+
+Works with MinIO (dev/prod) or AWS S3 (cloud).
+"""
+
+import json
+import logging
+import os
+import shutil
+from datetime import datetime
+from pathlib import Path
+from typing import Optional, Dict, Any
+from uuid import uuid4
+
+import boto3
+from botocore.exceptions import ClientError
+
+from .base import StorageBackend, StorageError
+
+logger = logging.getLogger(__name__)
+
+
+class S3CachedStorage(StorageBackend):
+    """
+    S3-compatible storage with local caching.
+
+    Features:
+    - Upload targets to S3/MinIO
+    - Download with local caching (LRU eviction)
+    - Lifecycle management (auto-cleanup old files)
+    - Metadata tracking
+    """
+
+    def __init__(
+        self,
+        endpoint_url: Optional[str] = None,
+        access_key: Optional[str] = None,
+        secret_key: Optional[str] = None,
+        bucket: str = "targets",
+        region: str = "us-east-1",
+        use_ssl: bool = False,
+        cache_dir: Optional[Path] = None,
+        cache_max_size_gb: int = 10
+    ):
+        """
+        Initialize S3 storage backend.
+
+        Args:
+            endpoint_url: S3 endpoint (None = AWS S3, or MinIO URL)
+            access_key: S3 access key (None = from env)
+            secret_key: S3 secret key (None = from env)
+            bucket: S3 bucket name
+            region: AWS region
+            use_ssl: Use HTTPS
+            cache_dir: Local cache directory
+            cache_max_size_gb: Maximum cache size in GB
+        """
+        # Use environment variables as defaults
+        self.endpoint_url = endpoint_url or os.getenv('S3_ENDPOINT', 'http://minio:9000')
+        self.access_key = access_key or os.getenv('S3_ACCESS_KEY', 'fuzzforge')
+        self.secret_key = secret_key or os.getenv('S3_SECRET_KEY', 'fuzzforge123')
+        self.bucket = bucket or os.getenv('S3_BUCKET', 'targets')
+        self.region = region or os.getenv('S3_REGION', 'us-east-1')
+        self.use_ssl = use_ssl or os.getenv('S3_USE_SSL', 'false').lower() == 'true'
+
+        # Cache configuration
+        self.cache_dir = cache_dir or Path(os.getenv('CACHE_DIR', '/tmp/fuzzforge-cache'))
+        self.cache_max_size = cache_max_size_gb * (1024 ** 3) # Convert to bytes
+
+        # Ensure cache directory exists
+        self.cache_dir.mkdir(parents=True, exist_ok=True)
+
+        # Initialize S3 client
+        try:
+            self.s3_client = boto3.client(
+                's3',
+                endpoint_url=self.endpoint_url,
+                aws_access_key_id=self.access_key,
+                aws_secret_access_key=self.secret_key,
+                region_name=self.region,
+                use_ssl=self.use_ssl
+            )
+            logger.info(f"Initialized S3 storage: {self.endpoint_url}/{self.bucket}")
+        except Exception as e:
+            logger.error(f"Failed to initialize S3 client: {e}")
+            raise StorageError(f"S3 initialization failed: {e}")
+
+    async def upload_target(
+        self,
+        file_path: Path,
+        user_id: str,
+        metadata: Optional[Dict[str, Any]] = None
+    ) -> str:
+        """Upload target file to S3/MinIO."""
+        if not file_path.exists():
+            raise FileNotFoundError(f"File not found: {file_path}")
+
+        # Generate unique target ID
+        target_id = str(uuid4())
+
+        # Prepare metadata
+        upload_metadata = {
+            'user_id': user_id,
+            'uploaded_at': datetime.now().isoformat(),
+            'filename': file_path.name,
+            'size': str(file_path.stat().st_size)
+        }
+        if metadata:
+            upload_metadata.update(metadata)
+
+        # Upload to S3
+        s3_key = f'{target_id}/target'
+        try:
+            logger.info(f"Uploading target to s3://{self.bucket}/{s3_key}")
+
+            self.s3_client.upload_file(
+                str(file_path),
+                self.bucket,
+                s3_key,
+                ExtraArgs={
+                    'Metadata': upload_metadata
+                }
+            )
+
+            file_size_mb = file_path.stat().st_size / (1024 * 1024)
+            logger.info(
+                f"✓ Uploaded target {target_id} "
+                f"({file_path.name}, {file_size_mb:.2f} MB)"
+            )
+
+            return target_id
+
+        except ClientError as e:
+            logger.error(f"S3 upload failed: {e}", exc_info=True)
+            raise StorageError(f"Failed to upload target: {e}")
+        except Exception as e:
+            logger.error(f"Upload failed: {e}", exc_info=True)
+            raise StorageError(f"Upload error: {e}")
+
+    async def get_target(self, target_id: str) -> Path:
+        """Get target from cache or download from S3/MinIO."""
+        # Check cache first
+        cache_path = self.cache_dir / target_id
+        cached_file = cache_path / "target"
+
+        if cached_file.exists():
+            # Update access time for LRU
+            cached_file.touch()
+            logger.info(f"Cache HIT: {target_id}")
+            return cached_file
+
+        # Cache miss - download from S3
+        logger.info(f"Cache MISS: {target_id}, downloading from S3...")
+
+        try:
+            # Create cache directory
+            cache_path.mkdir(parents=True, exist_ok=True)
+
+            # Download from S3
+            s3_key = f'{target_id}/target'
+            logger.info(f"Downloading s3://{self.bucket}/{s3_key}")
+
+            self.s3_client.download_file(
+                self.bucket,
+                s3_key,
+                str(cached_file)
+            )
+
+            # Verify download
+            if not cached_file.exists():
+                raise StorageError(f"Downloaded file not found: {cached_file}")
+
+            file_size_mb = cached_file.stat().st_size / (1024 * 1024)
+            logger.info(f"✓ Downloaded target {target_id} ({file_size_mb:.2f} MB)")
+
+            return cached_file
+
+        except ClientError as e:
+            error_code = e.response.get('Error', {}).get('Code')
+            if error_code in ['404', 'NoSuchKey']:
+                logger.error(f"Target not found: {target_id}")
+                raise FileNotFoundError(f"Target {target_id} not found in storage")
+            else:
+                logger.error(f"S3 download failed: {e}", exc_info=True)
+                raise StorageError(f"Download failed: {e}")
+        except Exception as e:
+            logger.error(f"Download error: {e}", exc_info=True)
+            # Cleanup partial download
+            if cache_path.exists():
+                shutil.rmtree(cache_path, ignore_errors=True)
+            raise StorageError(f"Download error: {e}")
+
async def delete_target(self, target_id: str) -> None:
|
||||
"""Delete target from S3/MinIO."""
|
||||
try:
|
||||
s3_key = f'{target_id}/target'
|
||||
logger.info(f"Deleting s3://{self.bucket}/{s3_key}")
|
||||
|
||||
self.s3_client.delete_object(
|
||||
Bucket=self.bucket,
|
||||
Key=s3_key
|
||||
)
|
||||
|
||||
# Also delete from cache if present
|
||||
cache_path = self.cache_dir / target_id
|
||||
if cache_path.exists():
|
||||
shutil.rmtree(cache_path, ignore_errors=True)
|
||||
logger.info(f"✓ Deleted target {target_id} from S3 and cache")
|
||||
else:
|
||||
logger.info(f"✓ Deleted target {target_id} from S3")
|
||||
|
||||
except ClientError as e:
|
||||
logger.error(f"S3 delete failed: {e}", exc_info=True)
|
||||
# Don't raise error if object doesn't exist
|
||||
if e.response.get('Error', {}).get('Code') not in ['404', 'NoSuchKey']:
|
||||
raise StorageError(f"Delete failed: {e}")
|
||||
except Exception as e:
|
||||
logger.error(f"Delete error: {e}", exc_info=True)
|
||||
raise StorageError(f"Delete error: {e}")
|
||||
|
||||
async def upload_results(
|
||||
self,
|
||||
workflow_id: str,
|
||||
results: Dict[str, Any],
|
||||
results_format: str = "json"
|
||||
) -> str:
|
||||
"""Upload workflow results to S3/MinIO."""
|
||||
try:
|
||||
# Prepare results content
|
||||
if results_format == "json":
|
||||
content = json.dumps(results, indent=2).encode('utf-8')
|
||||
content_type = 'application/json'
|
||||
file_ext = 'json'
|
||||
elif results_format == "sarif":
|
||||
content = json.dumps(results, indent=2).encode('utf-8')
|
||||
content_type = 'application/sarif+json'
|
||||
file_ext = 'sarif'
|
||||
else:
|
||||
content = json.dumps(results, indent=2).encode('utf-8')
|
||||
content_type = 'application/json'
|
||||
file_ext = 'json'
|
||||
|
||||
# Upload to results bucket
|
||||
results_bucket = 'results'
|
||||
s3_key = f'{workflow_id}/results.{file_ext}'
|
||||
|
||||
logger.info(f"Uploading results to s3://{results_bucket}/{s3_key}")
|
||||
|
||||
self.s3_client.put_object(
|
||||
Bucket=results_bucket,
|
||||
Key=s3_key,
|
||||
Body=content,
|
||||
ContentType=content_type,
|
||||
Metadata={
|
||||
'workflow_id': workflow_id,
|
||||
'format': results_format,
|
||||
'uploaded_at': datetime.now().isoformat()
|
||||
}
|
||||
)
|
||||
|
||||
# Construct URL
|
||||
results_url = f"{self.endpoint_url}/{results_bucket}/{s3_key}"
|
||||
logger.info(f"✓ Uploaded results: {results_url}")
|
||||
|
||||
return results_url
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Results upload failed: {e}", exc_info=True)
|
||||
raise StorageError(f"Results upload failed: {e}")
|
||||
|
||||
async def get_results(self, workflow_id: str) -> Dict[str, Any]:
|
||||
"""Get workflow results from S3/MinIO."""
|
||||
try:
|
||||
results_bucket = 'results'
|
||||
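s3_key = f'{workflow_id}/results.json'  # NOTE: results uploaded with results_format="sarif" are stored as results.sarif and are not looked up here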
|
||||
|
||||
logger.info(f"Downloading results from s3://{results_bucket}/{s3_key}")
|
||||
|
||||
response = self.s3_client.get_object(
|
||||
Bucket=results_bucket,
|
||||
Key=s3_key
|
||||
)
|
||||
|
||||
content = response['Body'].read().decode('utf-8')
|
||||
results = json.loads(content)
|
||||
|
||||
logger.info(f"✓ Downloaded results for workflow {workflow_id}")
|
||||
return results
|
||||
|
||||
except ClientError as e:
|
||||
error_code = e.response.get('Error', {}).get('Code')
|
||||
if error_code in ['404', 'NoSuchKey']:
|
||||
logger.error(f"Results not found: {workflow_id}")
|
||||
raise FileNotFoundError(f"Results for workflow {workflow_id} not found")
|
||||
else:
|
||||
logger.error(f"Results download failed: {e}", exc_info=True)
|
||||
raise StorageError(f"Results download failed: {e}")
|
||||
except Exception as e:
|
||||
logger.error(f"Results download error: {e}", exc_info=True)
|
||||
raise StorageError(f"Results download error: {e}")
|
||||
|
||||
async def list_targets(
|
||||
self,
|
||||
user_id: Optional[str] = None,
|
||||
limit: int = 100
|
||||
) -> list[Dict[str, Any]]:
|
||||
"""List uploaded targets."""
|
||||
try:
|
||||
targets = []
|
||||
paginator = self.s3_client.get_paginator('list_objects_v2')
|
||||
|
||||
for page in paginator.paginate(Bucket=self.bucket, PaginationConfig={'MaxItems': limit}):
|
||||
for obj in page.get('Contents', []):
|
||||
# Get object metadata
|
||||
try:
|
||||
metadata_response = self.s3_client.head_object(
|
||||
Bucket=self.bucket,
|
||||
Key=obj['Key']
|
||||
)
|
||||
metadata = metadata_response.get('Metadata', {})
|
||||
|
||||
# Filter by user_id if specified
|
||||
if user_id and metadata.get('user_id') != user_id:
|
||||
continue
|
||||
|
||||
targets.append({
|
||||
'target_id': obj['Key'].split('/')[0],
|
||||
'key': obj['Key'],
|
||||
'size': obj['Size'],
|
||||
'last_modified': obj['LastModified'].isoformat(),
|
||||
'metadata': metadata
|
||||
})
|
||||
|
||||
except Exception as e:
|
||||
logger.warning(f"Failed to get metadata for {obj['Key']}: {e}")
|
||||
continue
|
||||
|
||||
logger.info(f"Listed {len(targets)} targets (user_id={user_id})")
|
||||
return targets
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"List targets failed: {e}", exc_info=True)
|
||||
raise StorageError(f"List targets failed: {e}")
|
||||
|
||||
async def cleanup_cache(self) -> int:
|
||||
"""Clean up local cache using LRU eviction."""
|
||||
try:
|
||||
cache_files = []
|
||||
total_size = 0
|
||||
|
||||
# Gather all cached files with metadata
|
||||
for cache_file in self.cache_dir.rglob('*'):
|
||||
if cache_file.is_file():
|
||||
try:
|
||||
stat = cache_file.stat()
|
||||
cache_files.append({
|
||||
'path': cache_file,
|
||||
'size': stat.st_size,
|
||||
'atime': stat.st_atime # Last access time
|
||||
})
|
||||
total_size += stat.st_size
|
||||
except Exception as e:
|
||||
logger.warning(f"Failed to stat {cache_file}: {e}")
|
||||
continue
|
||||
|
||||
# Check if cleanup is needed
|
||||
if total_size <= self.cache_max_size:
|
||||
logger.info(
|
||||
f"Cache size OK: {total_size / (1024**3):.2f} GB / "
|
||||
f"{self.cache_max_size / (1024**3):.2f} GB"
|
||||
)
|
||||
return 0
|
||||
|
||||
# Sort by access time (oldest first)
|
||||
cache_files.sort(key=lambda x: x['atime'])
|
||||
|
||||
# Remove files until under limit
|
||||
removed_count = 0
|
||||
for file_info in cache_files:
|
||||
if total_size <= self.cache_max_size:
|
||||
break
|
||||
|
||||
try:
|
||||
file_info['path'].unlink()
|
||||
total_size -= file_info['size']
|
||||
removed_count += 1
|
||||
logger.debug(f"Evicted from cache: {file_info['path']}")
|
||||
except Exception as e:
|
||||
logger.warning(f"Failed to delete {file_info['path']}: {e}")
|
||||
continue
|
||||
|
||||
logger.info(
|
||||
f"✓ Cache cleanup: removed {removed_count} files, "
|
||||
f"new size: {total_size / (1024**3):.2f} GB"
|
||||
)
|
||||
return removed_count
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Cache cleanup failed: {e}", exc_info=True)
|
||||
raise StorageError(f"Cache cleanup failed: {e}")
|
||||
|
||||
def get_cache_stats(self) -> Dict[str, Any]:
|
||||
"""Get cache statistics."""
|
||||
try:
|
||||
total_size = 0
|
||||
file_count = 0
|
||||
|
||||
for cache_file in self.cache_dir.rglob('*'):
|
||||
if cache_file.is_file():
|
||||
total_size += cache_file.stat().st_size
|
||||
file_count += 1
|
||||
|
||||
return {
|
||||
'total_size_bytes': total_size,
|
||||
'total_size_gb': total_size / (1024 ** 3),
|
||||
'file_count': file_count,
|
||||
'max_size_gb': self.cache_max_size / (1024 ** 3),
|
||||
'usage_percent': (total_size / self.cache_max_size) * 100 if self.cache_max_size else 0.0
|
||||
}
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to get cache stats: {e}")
|
||||
return {'error': str(e)}
|
||||
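The LRU eviction performed by `cleanup_cache` above can be sketched in isolation. This is a minimal illustration, not the project's implementation: the function name `evict_lru` and its `max_bytes` parameter are hypothetical, and it assumes access times on the cache filesystem are meaningful (the code above refreshes them with `touch()` on every cache hit).

```python
import os
import tempfile
from pathlib import Path


def evict_lru(cache_dir: Path, max_bytes: int) -> int:
    """Delete least-recently-accessed files until the cache fits in max_bytes.

    Hypothetical helper for illustration; returns the number of files removed.
    """
    entries = []
    total = 0
    for path in cache_dir.rglob("*"):
        if path.is_file():
            st = path.stat()
            entries.append((st.st_atime, st.st_size, path))
            total += st.st_size

    removed = 0
    # Oldest access time first, so the least recently used file is evicted first.
    for _atime, size, path in sorted(entries, key=lambda entry: entry[0]):
        if total <= max_bytes:
            break
        path.unlink()
        total -= size
        removed += 1
    return removed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for i, name in enumerate(["old", "mid", "new"]):
            target = root / name / "target"
            target.parent.mkdir()
            target.write_bytes(b"x" * 100)
            # Set explicit access/modify times so the eviction order is deterministic.
            os.utime(target, (1_000 + i, 1_000 + i))
        print(evict_lru(root, max_bytes=150))
        print([p.parent.name for p in root.rglob("target")])
```

As in the method above, only files are removed; the emptied per-target directories are left in place, which is harmless because a cache hit is decided by the existence of the `target` file itself.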
@@ -0,0 +1,10 @@
|
||||
"""
|
||||
Temporal integration for FuzzForge.
|
||||
|
||||
Handles workflow execution, monitoring, and management.
|
||||
"""
|
||||
|
||||
from .manager import TemporalManager
|
||||
from .discovery import WorkflowDiscovery
|
||||
|
||||
__all__ = ["TemporalManager", "WorkflowDiscovery"]
|
||||
@@ -0,0 +1,257 @@
|
||||
"""
|
||||
Workflow Discovery for Temporal
|
||||
|
||||
Discovers workflows from the toolbox/workflows directory
|
||||
and provides metadata about available workflows.
|
||||
"""
|
||||
|
||||
import logging
|
||||
import yaml
|
||||
from pathlib import Path
|
||||
from typing import Dict, Any
|
||||
from pydantic import BaseModel, Field, ConfigDict
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class WorkflowInfo(BaseModel):
|
||||
"""Information about a discovered workflow"""
|
||||
name: str = Field(..., description="Workflow name")
|
||||
path: Path = Field(..., description="Path to workflow directory")
|
||||
workflow_file: Path = Field(..., description="Path to workflow.py file")
|
||||
metadata: Dict[str, Any] = Field(..., description="Workflow metadata from YAML")
|
||||
workflow_type: str = Field(..., description="Workflow class name")
|
||||
vertical: str = Field(..., description="Vertical (worker type) for this workflow")
|
||||
|
||||
model_config = ConfigDict(arbitrary_types_allowed=True)
|
||||
|
||||
|
||||
class WorkflowDiscovery:
|
||||
"""
|
||||
Discovers workflows from the filesystem.
|
||||
|
||||
Scans toolbox/workflows/ for directories containing:
|
||||
- metadata.yaml (required)
|
||||
- workflow.py (required)
|
||||
|
||||
Each workflow declares its vertical (rust, android, web, etc.)
|
||||
which determines which worker pool will execute it.
|
||||
"""
|
||||
|
||||
def __init__(self, workflows_dir: Path):
|
||||
"""
|
||||
Initialize workflow discovery.
|
||||
|
||||
Args:
|
||||
workflows_dir: Path to the workflows directory
|
||||
"""
|
||||
self.workflows_dir = workflows_dir
|
||||
if not self.workflows_dir.exists():
|
||||
self.workflows_dir.mkdir(parents=True, exist_ok=True)
|
||||
logger.info(f"Created workflows directory: {self.workflows_dir}")
|
||||
|
||||
async def discover_workflows(self) -> Dict[str, WorkflowInfo]:
|
||||
"""
|
||||
Discover workflows by scanning the workflows directory.
|
||||
|
||||
Returns:
|
||||
Dictionary mapping workflow names to their information
|
||||
"""
|
||||
workflows = {}
|
||||
|
||||
logger.info(f"Scanning for workflows in: {self.workflows_dir}")
|
||||
|
||||
for workflow_dir in self.workflows_dir.iterdir():
|
||||
if not workflow_dir.is_dir():
|
||||
continue
|
||||
|
||||
# Skip special directories
|
||||
if workflow_dir.name.startswith('.') or workflow_dir.name == '__pycache__':
|
||||
continue
|
||||
|
||||
metadata_file = workflow_dir / "metadata.yaml"
|
||||
if not metadata_file.exists():
|
||||
logger.debug(f"No metadata.yaml in {workflow_dir.name}, skipping")
|
||||
continue
|
||||
|
||||
workflow_file = workflow_dir / "workflow.py"
|
||||
if not workflow_file.exists():
|
||||
logger.warning(
|
||||
f"Workflow {workflow_dir.name} has metadata but no workflow.py, skipping"
|
||||
)
|
||||
continue
|
||||
|
||||
try:
|
||||
# Parse metadata
|
||||
with open(metadata_file) as f:
|
||||
metadata = yaml.safe_load(f) or {}  # an empty YAML file parses to None
|
||||
|
||||
# Validate required fields
|
||||
if 'name' not in metadata:
|
||||
logger.warning(f"Workflow {workflow_dir.name} metadata missing 'name' field")
|
||||
metadata['name'] = workflow_dir.name
|
||||
|
||||
if 'vertical' not in metadata:
|
||||
logger.warning(
|
||||
f"Workflow {workflow_dir.name} metadata missing 'vertical' field"
|
||||
)
|
||||
continue
|
||||
|
||||
# Infer workflow class name from metadata or use convention
|
||||
workflow_type = metadata.get('workflow_class')
|
||||
if not workflow_type:
|
||||
# Convention: convert snake_case to PascalCase + Workflow
|
||||
# e.g., rust_test -> RustTestWorkflow
|
||||
parts = workflow_dir.name.split('_')
|
||||
workflow_type = ''.join(part.capitalize() for part in parts) + 'Workflow'
|
||||
|
||||
# Create workflow info
|
||||
info = WorkflowInfo(
|
||||
name=metadata['name'],
|
||||
path=workflow_dir,
|
||||
workflow_file=workflow_file,
|
||||
metadata=metadata,
|
||||
workflow_type=workflow_type,
|
||||
vertical=metadata['vertical']
|
||||
)
|
||||
|
||||
workflows[info.name] = info
|
||||
logger.info(
|
||||
f"✓ Discovered workflow: {info.name} "
|
||||
f"(vertical: {info.vertical}, class: {info.workflow_type})"
|
||||
)
|
||||
|
||||
except Exception as e:
|
||||
logger.error(
|
||||
f"Error discovering workflow {workflow_dir.name}: {e}",
|
||||
exc_info=True
|
||||
)
|
||||
continue
|
||||
|
||||
logger.info(f"Discovered {len(workflows)} workflows")
|
||||
return workflows
|
||||
|
||||
def get_workflows_by_vertical(
|
||||
self,
|
||||
workflows: Dict[str, WorkflowInfo],
|
||||
vertical: str
|
||||
) -> Dict[str, WorkflowInfo]:
|
||||
"""
|
||||
Filter workflows by vertical.
|
||||
|
||||
Args:
|
||||
workflows: All discovered workflows
|
||||
vertical: Vertical name to filter by
|
||||
|
||||
Returns:
|
||||
Filtered workflows dictionary
|
||||
"""
|
||||
return {
|
||||
name: info
|
||||
for name, info in workflows.items()
|
||||
if info.vertical == vertical
|
||||
}
|
||||
|
||||
def get_available_verticals(self, workflows: Dict[str, WorkflowInfo]) -> list[str]:
|
||||
"""
|
||||
Get list of all verticals from discovered workflows.
|
||||
|
||||
Args:
|
||||
workflows: All discovered workflows
|
||||
|
||||
Returns:
|
||||
List of unique vertical names
|
||||
"""
|
||||
return sorted({info.vertical for info in workflows.values()})
|
||||
|
||||
@staticmethod
|
||||
def get_metadata_schema() -> Dict[str, Any]:
|
||||
"""
|
||||
Get the JSON schema for workflow metadata.
|
||||
|
||||
Returns:
|
||||
JSON schema dictionary
|
||||
"""
|
||||
return {
|
||||
"type": "object",
|
||||
"required": ["name", "version", "description", "author", "vertical", "parameters"],
|
||||
"properties": {
|
||||
"name": {
|
||||
"type": "string",
|
||||
"description": "Workflow name"
|
||||
},
|
||||
"version": {
|
||||
"type": "string",
|
||||
"pattern": "^\\d+\\.\\d+\\.\\d+$",
|
||||
"description": "Semantic version (x.y.z)"
|
||||
},
|
||||
"vertical": {
|
||||
"type": "string",
|
||||
"description": "Vertical worker type (rust, android, web, etc.)"
|
||||
},
|
||||
"description": {
|
||||
"type": "string",
|
||||
"description": "Workflow description"
|
||||
},
|
||||
"author": {
|
||||
"type": "string",
|
||||
"description": "Workflow author"
|
||||
},
|
||||
"category": {
|
||||
"type": "string",
|
||||
"enum": ["comprehensive", "specialized", "fuzzing", "focused"],
|
||||
"description": "Workflow category"
|
||||
},
|
||||
"tags": {
|
||||
"type": "array",
|
||||
"items": {"type": "string"},
|
||||
"description": "Workflow tags for categorization"
|
||||
},
|
||||
"requirements": {
|
||||
"type": "object",
|
||||
"required": ["tools", "resources"],
|
||||
"properties": {
|
||||
"tools": {
|
||||
"type": "array",
|
||||
"items": {"type": "string"},
|
||||
"description": "Required security tools"
|
||||
},
|
||||
"resources": {
|
||||
"type": "object",
|
||||
"required": ["memory", "cpu", "timeout"],
|
||||
"properties": {
|
||||
"memory": {
|
||||
"type": "string",
|
||||
"pattern": "^\\d+[GMK]i$",
|
||||
"description": "Memory limit (e.g., 1Gi, 512Mi)"
|
||||
},
|
||||
"cpu": {
|
||||
"type": "string",
|
||||
"pattern": "^\\d+m?$",
|
||||
"description": "CPU limit (e.g., 1000m, 2)"
|
||||
},
|
||||
"timeout": {
|
||||
"type": "integer",
|
||||
"minimum": 60,
|
||||
"maximum": 7200,
|
||||
"description": "Workflow timeout in seconds"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
},
|
||||
"parameters": {
|
||||
"type": "object",
|
||||
"description": "Workflow parameters schema"
|
||||
},
|
||||
"default_parameters": {
|
||||
"type": "object",
|
||||
"description": "Default parameter values"
|
||||
},
|
||||
"required_modules": {
|
||||
"type": "array",
|
||||
"items": {"type": "string"},
|
||||
"description": "Required module names"
|
||||
}
|
||||
}
|
||||
}
|
||||
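`discover_workflows` above only checks for `name` and `vertical`; the fuller schema returned by `get_metadata_schema` is not enforced in this file. The sketch below shows how a few of that schema's constraints, together with the class-name convention, could be checked using only the standard library. The helper names `default_workflow_class` and `check_metadata` are hypothetical, and only a subset of the schema is covered.

```python
import re
from typing import Any, Dict, List

# Required fields and patterns copied from the schema above.
REQUIRED_FIELDS = ("name", "version", "description", "author", "vertical", "parameters")
VERSION_RE = re.compile(r"^\d+\.\d+\.\d+$")
MEMORY_RE = re.compile(r"^\d+[GMK]i$")
CPU_RE = re.compile(r"^\d+m?$")


def default_workflow_class(dir_name: str) -> str:
    """Apply the snake_case -> PascalCase + 'Workflow' convention."""
    return "".join(part.capitalize() for part in dir_name.split("_")) + "Workflow"


def check_metadata(metadata: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means these checks passed."""
    problems = [
        f"missing required field: {field}"
        for field in REQUIRED_FIELDS
        if field not in metadata
    ]

    version = metadata.get("version")
    if version is not None and not (isinstance(version, str) and VERSION_RE.match(version)):
        problems.append(f"version must look like x.y.z, got {version!r}")

    resources = (metadata.get("requirements") or {}).get("resources") or {}
    memory = resources.get("memory")
    if memory is not None and not MEMORY_RE.match(str(memory)):
        problems.append(f"memory must look like 512Mi or 1Gi, got {memory!r}")
    cpu = resources.get("cpu")
    if cpu is not None and not CPU_RE.match(str(cpu)):
        problems.append(f"cpu must look like 1000m or 2, got {cpu!r}")
    timeout = resources.get("timeout")
    if timeout is not None and not (isinstance(timeout, int) and 60 <= timeout <= 7200):
        problems.append(f"timeout must be an integer in [60, 7200], got {timeout!r}")
    return problems


if __name__ == "__main__":
    print(default_workflow_class("rust_test"))
    print(check_metadata({"name": "rust_test", "vertical": "rust", "version": "1.0"}))
```

Note that an unquoted `version: 1.0` in YAML parses as a float rather than a string, which is why the version check tests the type before matching the pattern.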
@@ -0,0 +1,371 @@
|
||||
"""
|
||||
Temporal Manager - Workflow execution and management
|
||||
|
||||
Handles:
|
||||
- Workflow discovery from toolbox
|
||||
- Workflow execution (submit to Temporal)
|
||||
- Status monitoring
|
||||
- Results retrieval
|
||||
"""
|
||||
|
||||
import logging
|
||||
import os
|
||||
from pathlib import Path
|
||||
from typing import Dict, Optional, Any
|
||||
from uuid import uuid4
|
||||
|
||||
from temporalio.client import Client, WorkflowHandle
|
||||
from temporalio.common import RetryPolicy
|
||||
from datetime import timedelta
|
||||
|
||||
from .discovery import WorkflowDiscovery, WorkflowInfo
|
||||
from src.storage import S3CachedStorage
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class TemporalManager:
|
||||
"""
|
||||
Manages Temporal workflow execution for FuzzForge.
|
||||
|
||||
This class:
|
||||
- Discovers available workflows from toolbox
|
||||
- Submits workflow executions to Temporal
|
||||
- Monitors workflow status
|
||||
- Retrieves workflow results
|
||||
"""
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
workflows_dir: Optional[Path] = None,
|
||||
temporal_address: Optional[str] = None,
|
||||
temporal_namespace: str = "default",
|
||||
storage: Optional[S3CachedStorage] = None
|
||||
):
|
||||
"""
|
||||
Initialize Temporal manager.
|
||||
|
||||
Args:
|
||||
workflows_dir: Path to workflows directory (default: toolbox/workflows)
|
||||
temporal_address: Temporal server address (default: from env or localhost:7233)
|
||||
temporal_namespace: Temporal namespace
|
||||
storage: Storage backend for file uploads (default: S3CachedStorage)
|
||||
"""
|
||||
if workflows_dir is None:
|
||||
workflows_dir = Path("toolbox/workflows")
|
||||
|
||||
self.temporal_address = temporal_address or os.getenv(
|
||||
'TEMPORAL_ADDRESS',
|
||||
'localhost:7233'
|
||||
)
|
||||
self.temporal_namespace = temporal_namespace
|
||||
self.discovery = WorkflowDiscovery(workflows_dir)
|
||||
self.workflows: Dict[str, WorkflowInfo] = {}
|
||||
self.client: Optional[Client] = None
|
||||
|
||||
# Initialize storage backend
|
||||
self.storage = storage or S3CachedStorage()
|
||||
|
||||
logger.info(
|
||||
f"TemporalManager initialized: {self.temporal_address} "
|
||||
f"(namespace: {self.temporal_namespace})"
|
||||
)
|
||||
|
||||
async def initialize(self):
|
||||
"""Initialize the manager by discovering workflows and connecting to Temporal."""
|
||||
try:
|
||||
# Discover workflows
|
||||
self.workflows = await self.discovery.discover_workflows()
|
||||
|
||||
if not self.workflows:
|
||||
logger.warning("No workflows discovered")
|
||||
else:
|
||||
logger.info(
|
||||
f"Discovered {len(self.workflows)} workflows: "
|
||||
f"{list(self.workflows.keys())}"
|
||||
)
|
||||
|
||||
# Connect to Temporal
|
||||
self.client = await Client.connect(
|
||||
self.temporal_address,
|
||||
namespace=self.temporal_namespace
|
||||
)
|
||||
logger.info(f"✓ Connected to Temporal: {self.temporal_address}")
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to initialize Temporal manager: {e}", exc_info=True)
|
||||
raise
|
||||
|
||||
async def close(self):
|
||||
"""Close Temporal client connection."""
|
||||
if self.client:
|
||||
# Temporal client doesn't need explicit close in Python SDK
|
||||
pass
|
||||
|
||||
async def get_workflows(self) -> Dict[str, WorkflowInfo]:
|
||||
"""
|
||||
Get all discovered workflows.
|
||||
|
||||
Returns:
|
||||
Dictionary mapping workflow names to their info
|
||||
"""
|
||||
return self.workflows
|
||||
|
||||
async def get_workflow(self, name: str) -> Optional[WorkflowInfo]:
|
||||
"""
|
||||
Get workflow info by name.
|
||||
|
||||
Args:
|
||||
name: Workflow name
|
||||
|
||||
Returns:
|
||||
WorkflowInfo or None if not found
|
||||
"""
|
||||
return self.workflows.get(name)
|
||||
|
||||
async def upload_target(
|
||||
self,
|
||||
file_path: Path,
|
||||
user_id: str,
|
||||
metadata: Optional[Dict[str, Any]] = None
|
||||
) -> str:
|
||||
"""
|
||||
Upload target file to storage.
|
||||
|
||||
Args:
|
||||
file_path: Local path to file
|
||||
user_id: User ID
|
||||
metadata: Optional metadata
|
||||
|
||||
Returns:
|
||||
Target ID for use in workflow execution
|
||||
"""
|
||||
target_id = await self.storage.upload_target(file_path, user_id, metadata)
|
||||
logger.info(f"Uploaded target: {target_id}")
|
||||
return target_id
|
||||
|
||||
async def run_workflow(
|
||||
self,
|
||||
workflow_name: str,
|
||||
target_id: str,
|
||||
workflow_params: Optional[Dict[str, Any]] = None,
|
||||
workflow_id: Optional[str] = None
|
||||
) -> WorkflowHandle:
|
||||
"""
|
||||
Execute a workflow.
|
||||
|
||||
Args:
|
||||
workflow_name: Name of workflow to execute
|
||||
target_id: Target ID (from upload_target)
|
||||
workflow_params: Additional workflow parameters
|
||||
workflow_id: Optional workflow ID (generated if not provided)
|
||||
|
||||
Returns:
|
||||
WorkflowHandle for monitoring/results
|
||||
|
||||
Raises:
|
||||
ValueError: If workflow not found or client not initialized
|
||||
"""
|
||||
if not self.client:
|
||||
raise ValueError("Temporal client not initialized. Call initialize() first.")
|
||||
|
||||
# Get workflow info
|
||||
workflow_info = self.workflows.get(workflow_name)
|
||||
if not workflow_info:
|
||||
raise ValueError(f"Workflow not found: {workflow_name}")
|
||||
|
||||
# Generate workflow ID if not provided
|
||||
if not workflow_id:
|
||||
workflow_id = f"{workflow_name}-{str(uuid4())[:8]}"
|
||||
|
||||
# Prepare workflow input arguments
|
||||
workflow_params = workflow_params or {}
|
||||
|
||||
# Build args list: [target_id, ...workflow_params values]
|
||||
# The workflow parameters are passed as individual positional args
|
||||
workflow_args = [target_id]
|
||||
|
||||
# Parameters are appended in dict insertion order, so callers must supply them in the order of the workflow's signature
|
||||
# For security_assessment: scanner_config, analyzer_config, reporter_config
|
||||
# For atheris_fuzzing: target_file, max_iterations, timeout_seconds
|
||||
if workflow_params:
|
||||
workflow_args.extend(workflow_params.values())
|
||||
|
||||
# Determine task queue from workflow vertical
|
||||
vertical = workflow_info.metadata.get("vertical", "default")
|
||||
task_queue = f"{vertical}-queue"
|
||||
|
||||
logger.info(
|
||||
f"Starting workflow: {workflow_name} "
|
||||
f"(id={workflow_id}, queue={task_queue}, target={target_id})"
|
||||
)
|
||||
|
||||
try:
|
||||
# Start workflow execution with positional arguments
|
||||
handle = await self.client.start_workflow(
|
||||
workflow=workflow_info.workflow_type, # Workflow class name
|
||||
args=workflow_args, # Positional arguments
|
||||
id=workflow_id,
|
||||
task_queue=task_queue,
|
||||
retry_policy=RetryPolicy(
|
||||
initial_interval=timedelta(seconds=1),
|
||||
maximum_interval=timedelta(minutes=1),
|
||||
maximum_attempts=3
|
||||
)
|
||||
)
|
||||
|
||||
logger.info(f"✓ Workflow started: {workflow_id}")
|
||||
return handle
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to start workflow {workflow_name}: {e}", exc_info=True)
|
||||
raise
|
||||
|
||||
async def get_workflow_status(self, workflow_id: str) -> Dict[str, Any]:
|
||||
"""
|
||||
Get workflow execution status.
|
||||
|
||||
Args:
|
||||
workflow_id: Workflow execution ID
|
||||
|
||||
Returns:
|
||||
Status dictionary with workflow state
|
||||
|
||||
Raises:
|
||||
ValueError: If client not initialized (errors raised by Temporal, e.g. for an unknown workflow ID, are logged and re-raised)
|
||||
"""
|
||||
if not self.client:
|
||||
raise ValueError("Temporal client not initialized")
|
||||
|
||||
try:
|
||||
# Get workflow handle
|
||||
handle = self.client.get_workflow_handle(workflow_id)
|
||||
|
||||
# Describe the execution (returns current state without waiting for the result)
|
||||
description = await handle.describe()
|
||||
|
||||
status = {
|
||||
"workflow_id": workflow_id,
|
||||
"status": description.status.name,
|
||||
"start_time": description.start_time.isoformat() if description.start_time else None,
|
||||
"execution_time": description.execution_time.isoformat() if description.execution_time else None,
|
||||
"close_time": description.close_time.isoformat() if description.close_time else None,
|
||||
"task_queue": description.task_queue,
|
||||
}
|
||||
|
||||
logger.info(f"Workflow {workflow_id} status: {status['status']}")
|
||||
return status
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to get workflow status: {e}", exc_info=True)
|
||||
raise
|
||||
|
||||
async def get_workflow_result(
|
||||
self,
|
||||
workflow_id: str,
|
||||
timeout: Optional[timedelta] = None
|
||||
) -> Any:
|
||||
"""
|
||||
Get workflow execution result (blocking).
|
||||
|
||||
Args:
|
||||
workflow_id: Workflow execution ID
|
||||
timeout: Maximum time to wait for result
|
||||
|
||||
Returns:
|
||||
Workflow result
|
||||
|
||||
Raises:
|
||||
ValueError: If client not initialized
|
||||
TimeoutError: If timeout exceeded
|
||||
"""
|
||||
if not self.client:
|
||||
raise ValueError("Temporal client not initialized")
|
||||
|
||||
try:
|
||||
handle = self.client.get_workflow_handle(workflow_id)
|
||||
|
||||
logger.info(f"Waiting for workflow result: {workflow_id}")
|
||||
|
||||
# Wait for workflow to complete and get result
|
||||
if timeout:
|
||||
# Use asyncio timeout if provided
|
||||
import asyncio
|
||||
result = await asyncio.wait_for(handle.result(), timeout=timeout.total_seconds())
|
||||
else:
|
||||
result = await handle.result()
|
||||
|
||||
logger.info(f"✓ Workflow {workflow_id} completed")
|
||||
return result
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to get workflow result: {e}", exc_info=True)
|
||||
raise
|
||||
|
||||
async def cancel_workflow(self, workflow_id: str) -> None:
|
||||
"""
|
||||
Cancel a running workflow.
|
||||
|
||||
Args:
|
||||
workflow_id: Workflow execution ID
|
||||
|
||||
Raises:
|
||||
ValueError: If client not initialized
|
||||
"""
|
||||
if not self.client:
|
||||
raise ValueError("Temporal client not initialized")
|
||||
|
||||
try:
|
||||
handle = self.client.get_workflow_handle(workflow_id)
|
||||
await handle.cancel()
|
||||
|
||||
logger.info(f"✓ Workflow cancelled: {workflow_id}")
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to cancel workflow: {e}", exc_info=True)
|
||||
raise
|
||||
|
||||
async def list_workflows(
|
||||
self,
|
||||
filter_query: Optional[str] = None,
|
||||
limit: int = 100
|
||||
) -> list[Dict[str, Any]]:
|
||||
"""
|
||||
List workflow executions.
|
||||
|
||||
Args:
|
||||
filter_query: Optional Temporal list filter query
|
||||
limit: Maximum number of results
|
||||
|
||||
Returns:
|
||||
List of workflow execution info
|
||||
|
||||
Raises:
|
||||
ValueError: If client not initialized
|
||||
"""
|
||||
if not self.client:
|
||||
raise ValueError("Temporal client not initialized")
|
||||
|
||||
try:
|
||||
workflows = []
|
||||
|
||||
# Use Temporal's list API
|
||||
async for workflow in self.client.list_workflows(filter_query):
|
||||
workflows.append({
|
||||
"workflow_id": workflow.id,
|
||||
"workflow_type": workflow.workflow_type,
|
||||
"status": workflow.status.name,
|
||||
"start_time": workflow.start_time.isoformat() if workflow.start_time else None,
|
||||
"close_time": workflow.close_time.isoformat() if workflow.close_time else None,
|
||||
"task_queue": workflow.task_queue,
|
||||
})
|
||||
|
||||
if len(workflows) >= limit:
|
||||
break
|
||||
|
||||
logger.info(f"Listed {len(workflows)} workflows")
|
||||
return workflows
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to list workflows: {e}", exc_info=True)
|
||||
raise
|
||||
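`run_workflow` above builds its positional arguments from `workflow_params.values()`, so the result depends on the order in which the caller inserted keys into the dict. One way to make the ordering explicit is sketched below. The helper `build_workflow_args` and its `param_order` argument are hypothetical, and passing `None` for an omitted parameter assumes the workflow treats `None` as "use the default"; the parameter names come from the `security_assessment` comment in the code above.

```python
from typing import Any, Dict, List, Sequence


def build_workflow_args(
    target_id: str,
    params: Dict[str, Any],
    param_order: Sequence[str],
) -> List[Any]:
    """Return [target_id, *values], ordered by param_order rather than dict order.

    Hypothetical helper for illustration. Unknown parameter names are rejected;
    parameters missing from params are passed as None (an assumption).
    """
    unknown = set(params) - set(param_order)
    if unknown:
        raise ValueError(f"Unexpected workflow parameters: {sorted(unknown)}")

    args: List[Any] = [target_id]
    for name in param_order:
        args.append(params.get(name))
    return args


if __name__ == "__main__":
    order = ["scanner_config", "analyzer_config", "reporter_config"]
    # Keys are deliberately given out of order to show that dict order is ignored.
    print(build_workflow_args(
        "target-123",
        {"reporter_config": {"format": "sarif"}, "scanner_config": {"depth": 2}},
        order,
    ))
```

The declared order could live in each workflow's `metadata.yaml` alongside its parameter schema, though whether that suits the project is a design decision the source does not settle.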