mirror of
https://github.com/FuzzingLabs/fuzzforge_ai.git
synced 2026-05-21 08:16:58 +02:00
CI/CD Integration with Ephemeral Deployment Model (#14)
* feat: Complete migration from Prefect to Temporal

BREAKING CHANGE: Replaces Prefect workflow orchestration with Temporal

## Major Changes
- Replace Prefect with Temporal for workflow orchestration
- Implement vertical worker architecture (rust, android)
- Replace Docker registry with MinIO for unified storage
- Refactor activities to be co-located with workflows
- Update all API endpoints for Temporal compatibility

## Infrastructure
- New: docker-compose.temporal.yaml (Temporal + MinIO + workers)
- New: workers/ directory with rust and android vertical workers
- New: backend/src/temporal/ (manager, discovery)
- New: backend/src/storage/ (S3-cached storage with MinIO)
- New: backend/toolbox/common/ (shared storage activities)
- Deleted: docker-compose.yaml (old Prefect setup)
- Deleted: backend/src/core/prefect_manager.py
- Deleted: backend/src/services/prefect_stats_monitor.py
- Deleted: Docker registry and insecure-registries requirement

## Workflows
- Migrated: security_assessment workflow to Temporal
- New: rust_test workflow (example/test workflow)
- Deleted: secret_detection_scan (Prefect-based, to be reimplemented)
- Activities now co-located with workflows for independent testing

## API Changes
- Updated: backend/src/api/workflows.py (Temporal submission)
- Updated: backend/src/api/runs.py (Temporal status/results)
- Updated: backend/src/main.py (727 lines, TemporalManager integration)
- Updated: All 16 MCP tools to use TemporalManager

## Testing
- ✅ All services healthy (Temporal, PostgreSQL, MinIO, workers, backend)
- ✅ All API endpoints functional
- ✅ End-to-end workflow test passed (72 findings from vulnerable_app)
- ✅ MinIO storage integration working (target upload/download, results)
- ✅ Worker activity discovery working (6 activities registered)
- ✅ Tarball extraction working
- ✅ SARIF report generation working

## Documentation
- ARCHITECTURE.md: Complete Temporal architecture documentation
- QUICKSTART_TEMPORAL.md: Getting started guide
- MIGRATION_DECISION.md: Why we chose Temporal over Prefect
- IMPLEMENTATION_STATUS.md: Migration progress tracking
- workers/README.md: Worker development guide

## Dependencies
- Added: temporalio>=1.6.0
- Added: boto3>=1.34.0 (MinIO S3 client)
- Removed: prefect>=3.4.18

* feat: Add Python fuzzing vertical with Atheris integration

This commit implements a complete Python fuzzing workflow using Atheris:

## Python Worker (workers/python/)
- Dockerfile with Python 3.11, Atheris, and build tools
- Generic worker.py for dynamic workflow discovery
- requirements.txt with temporalio, boto3, atheris dependencies
- Added to docker-compose.temporal.yaml with dedicated cache volume

## AtherisFuzzer Module (backend/toolbox/modules/fuzzer/)
- Reusable module extending BaseModule
- Auto-discovers fuzz targets (fuzz_*.py, *_fuzz.py, fuzz_target.py)
- Recursive search to find targets in nested directories
- Dynamically loads TestOneInput() function
- Configurable max_iterations and timeout
- Real-time stats callback support for live monitoring
- Returns findings as ModuleFinding objects

## Atheris Fuzzing Workflow (backend/toolbox/workflows/atheris_fuzzing/)
- Temporal workflow for orchestrating fuzzing
- Downloads user code from MinIO
- Executes AtherisFuzzer module
- Uploads results to MinIO
- Cleans up cache after execution
- metadata.yaml with vertical: python for routing

## Test Project (test_projects/python_fuzz_waterfall/)
- Demonstrates stateful waterfall vulnerability
- main.py with check_secret() that leaks progress
- fuzz_target.py with Atheris TestOneInput() harness
- Complete README with usage instructions

## Backend Fixes
- Fixed parameter merging in REST API endpoints (workflows.py)
- Changed workflow parameter passing from positional args to kwargs (manager.py)
- Default parameters now properly merged with user parameters

## Testing
✅ Worker discovered AtherisFuzzingWorkflow
✅ Workflow executed end-to-end successfully
✅ Fuzz target auto-discovered in nested directories
✅ Atheris ran 100,000 iterations
✅ Results uploaded and cache cleaned

* chore: Complete Temporal migration with updated CLI/SDK/docs

This commit includes all remaining Temporal migration changes:

## CLI Updates (cli/)
- Updated workflow execution commands for Temporal
- Enhanced error handling and exceptions
- Updated dependencies in uv.lock

## SDK Updates (sdk/)
- Client methods updated for Temporal workflows
- Updated models for new workflow execution
- Updated dependencies in uv.lock

## Documentation Updates (docs/)
- Architecture documentation for Temporal
- Workflow concept documentation
- Resource management documentation (new)
- Debugging guide (new)
- Updated tutorials and how-to guides
- Troubleshooting updates

## README Updates
- Main README with Temporal instructions
- Backend README
- CLI README
- SDK README

## Other
- Updated IMPLEMENTATION_STATUS.md
- Removed old vulnerable_app.tar.gz

These changes complete the Temporal migration and ensure the CLI/SDK work correctly with the new backend.

* fix: Use positional args instead of kwargs for Temporal workflows

The Temporal Python SDK's start_workflow() method doesn't accept a 'kwargs' parameter. Workflows must receive parameters as positional arguments via the 'args' parameter.

Changed from: args=workflow_args # Positional arguments

This fixes the error:
TypeError: Client.start_workflow() got an unexpected keyword argument 'kwargs'

Workflows now correctly receive parameters in order:
- security_assessment: [target_id, scanner_config, analyzer_config, reporter_config]
- atheris_fuzzing: [target_id, target_file, max_iterations, timeout_seconds]
- rust_test: [target_id, test_message]

* fix: Filter metadata-only parameters from workflow arguments

SecurityAssessmentWorkflow was receiving 7 arguments instead of 2-5. The issue was that target_path and volume_mode from default_parameters were being passed to the workflow, when they should only be used by the system for configuration.

Now filters out metadata-only parameters (target_path, volume_mode) before passing arguments to workflow execution.

* refactor: Remove Prefect leftovers and volume mounting legacy

Complete cleanup of Prefect migration artifacts:

Backend:
- Delete registry.py and workflow_discovery.py (Prefect-specific files)
- Remove Docker validation from setup.py (no longer needed)
- Remove ResourceLimits and VolumeMount models
- Remove target_path and volume_mode from WorkflowSubmission
- Remove supported_volume_modes from API and discovery
- Clean up metadata.yaml files (remove volume/path fields)
- Simplify parameter filtering in manager.py

SDK:
- Remove volume_mode parameter from client methods
- Remove ResourceLimits and VolumeMount models
- Remove Prefect error patterns from docker_logs.py
- Clean up WorkflowSubmission and WorkflowMetadata models

CLI:
- Remove Volume Modes display from workflow info

All removed features are Prefect-specific or Docker volume mounting artifacts. Temporal workflows use MinIO storage exclusively.

* feat: Add comprehensive test suite and benchmark infrastructure

- Add 68 unit tests for fuzzer, scanner, and analyzer modules
- Implement pytest-based test infrastructure with fixtures
- Add 6 performance benchmarks with category-specific thresholds
- Configure GitHub Actions for automated testing and benchmarking
- Add test and benchmark documentation

Test coverage:
- AtherisFuzzer: 8 tests
- CargoFuzzer: 14 tests
- FileScanner: 22 tests
- SecurityAnalyzer: 24 tests

All tests passing (68/68)
All benchmarks passing (6/6)

* fix: Resolve all ruff linting violations across codebase

Fixed 27 ruff violations in 12 files:
- Removed unused imports (Depends, Dict, Any, Optional, etc.)
- Fixed undefined workflow_info variable in workflows.py
- Removed dead code with undefined variables in atheris_fuzzer.py
- Changed f-string to regular string where no placeholders used

All files now pass ruff checks for CI/CD compliance.

* fix: Configure CI for unit tests only

- Renamed docker-compose.temporal.yaml → docker-compose.yml for CI compatibility
- Commented out integration-tests job (no integration tests yet)
- Updated test-summary to only depend on lint and unit-tests

CI will now run successfully with 68 unit tests. Integration tests can be added later.

* feat: Add CI/CD integration with ephemeral deployment model

Implements comprehensive CI/CD support for FuzzForge with on-demand worker management:

**Worker Management (v0.7.0)**
- Add WorkerManager for automatic worker lifecycle control
- Auto-start workers from stopped state when workflows execute
- Auto-stop workers after workflow completion
- Health checks and startup timeout handling (90s default)

**CI/CD Features**
- `--fail-on` flag: Fail builds based on SARIF severity levels (error/warning/note/info)
- `--export-sarif` flag: Export findings in SARIF 2.1.0 format
- `--auto-start`/`--auto-stop` flags: Control worker lifecycle
- Exit code propagation: Returns 1 on blocking findings, 0 on success

**Exit Code Fix**
- Add `except typer.Exit: raise` handlers at 3 critical locations
- Move worker cleanup to finally block for guaranteed execution
- Exit codes now propagate correctly even when build fails

**CI Scripts & Examples**
- ci-start.sh: Start FuzzForge services with health checks
- ci-stop.sh: Clean shutdown with volume preservation option
- GitHub Actions workflow example (security-scan.yml)
- GitLab CI pipeline example (.gitlab-ci.example.yml)
- docker-compose.ci.yml: CI-optimized compose file with profiles

**OSS-Fuzz Integration**
- New ossfuzz_campaign workflow for running OSS-Fuzz projects
- OSS-Fuzz worker with Docker-in-Docker support
- Configurable campaign duration and project selection

**Documentation**
- Comprehensive CI/CD integration guide (docs/how-to/cicd-integration.md)
- Updated architecture docs with worker lifecycle details
- Updated workspace isolation documentation
- CLI README with worker management examples

**SDK Enhancements**
- Add get_workflow_worker_info() endpoint
- Worker vertical metadata in workflow responses

**Testing**
- All workflows tested: security_assessment, atheris_fuzzing, secret_detection, cargo_fuzzing
- All monitoring commands tested: stats, crashes, status, finding
- Full CI pipeline simulation verified
- Exit codes verified for success/failure scenarios

Ephemeral CI/CD model: ~3-4GB RAM, ~60-90s startup, runs entirely in CI containers.

* fix: Resolve ruff linting violations in CI/CD code

- Remove unused variables (run_id, defaults, result)
- Remove unused imports
- Fix f-string without placeholders

All CI/CD integration files now pass ruff checks.
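The `--fail-on` and exit-code behavior described in the commit message (return 1 on blocking findings, 0 on success) can be sketched roughly as below. This is a minimal illustration, not FuzzForge's CLI code: the helper names `blocking_findings` and `exit_code`, the sample rule IDs, and the choice to treat `--fail-on` as a set of levels to match are all assumptions. The one detail taken from the SARIF 2.1.0 format itself is that a result without an explicit `level` counts as `warning`.

```python
from typing import Dict, Iterable, List


def blocking_findings(sarif: Dict, fail_on: Iterable[str]) -> List[Dict]:
    """Return the SARIF results whose level is listed in fail_on (hypothetical helper)."""
    levels = {level.lower() for level in fail_on}
    blocking = []
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # SARIF 2.1.0: a result with no explicit level defaults to "warning".
            level = result.get("level", "warning").lower()
            if level in levels:
                blocking.append(result)
    return blocking


def exit_code(sarif: Dict, fail_on: Iterable[str]) -> int:
    """Map findings to a process exit code: 1 if anything blocks, else 0."""
    return 1 if blocking_findings(sarif, fail_on) else 0


# Sample report with made-up rule IDs, for illustration only.
report = {
    "version": "2.1.0",
    "runs": [{
        "results": [
            {"ruleId": "hardcoded-secret", "level": "error"},
            {"ruleId": "weak-hash", "level": "warning"},
            {"ruleId": "todo-comment", "level": "note"},
        ]
    }],
}

if __name__ == "__main__":
    print(exit_code(report, ["error"]))
    print(len(blocking_findings(report, ["error", "warning"])))
```

In a CI job the returned value would be passed to `sys.exit()`, which is what lets the pipeline step fail on blocking findings.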
@@ -1,770 +0,0 @@
"""
Prefect Manager - Core orchestration for workflow deployment and execution
"""

# Copyright (c) 2025 FuzzingLabs
#
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
# at the root of this repository for details.
#
# After the Change Date (four years from publication), this version of the
# Licensed Work will be made available under the Apache License, Version 2.0.
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
#
# Additional attribution and requirements are provided in the NOTICE file.

import logging
import os
import platform
import re
from pathlib import Path
from typing import Dict, Optional, Any
from prefect import get_client
from prefect.docker import DockerImage
from prefect.client.schemas import FlowRun

from src.core.workflow_discovery import WorkflowDiscovery, WorkflowInfo

logger = logging.getLogger(__name__)


def get_registry_url(context: str = "default") -> str:
    """
    Get the container registry URL to use for a given operation context.

    Goals:
    - Work reliably across Linux and macOS Docker Desktop
    - Prefer in-network service discovery when running inside containers
    - Allow full override via env vars from docker-compose

    Env overrides:
    - FUZZFORGE_REGISTRY_PUSH_URL: used for image builds/pushes
    - FUZZFORGE_REGISTRY_PULL_URL: used for workers to pull images
    """
    # Normalize context
    ctx = (context or "default").lower()

    # Always honor explicit overrides first
    if ctx in ("push", "build"):
        push_url = os.getenv("FUZZFORGE_REGISTRY_PUSH_URL")
        if push_url:
            logger.debug("Using FUZZFORGE_REGISTRY_PUSH_URL: %s", push_url)
            return push_url
        # Default to host-published registry for Docker daemon operations
        return "localhost:5001"

    if ctx == "pull":
        pull_url = os.getenv("FUZZFORGE_REGISTRY_PULL_URL")
        if pull_url:
            logger.debug("Using FUZZFORGE_REGISTRY_PULL_URL: %s", pull_url)
            return pull_url
        # Prefect worker pulls via host Docker daemon as well
        return "localhost:5001"

    # Default/fallback
    return os.getenv("FUZZFORGE_REGISTRY_PULL_URL", os.getenv("FUZZFORGE_REGISTRY_PUSH_URL", "localhost:5001"))


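The precedence rules in `get_registry_url` (explicit override for the matching context first, then `localhost:5001`) are easier to see with the environment passed in as a plain mapping. The sketch below is a condensed restatement for illustration, not code from the repository: `resolve_registry_url`, `DEFAULT_REGISTRY`, and the in-network address `registry:5000` are hypothetical names and values.

```python
from typing import Mapping

DEFAULT_REGISTRY = "localhost:5001"  # same default as in the function above


def resolve_registry_url(context: str, env: Mapping[str, str]) -> str:
    """Condensed, side-effect-free variant that reads from `env` instead of os.environ."""
    ctx = (context or "default").lower()
    push = env.get("FUZZFORGE_REGISTRY_PUSH_URL")
    pull = env.get("FUZZFORGE_REGISTRY_PULL_URL")

    if ctx in ("push", "build"):
        return push or DEFAULT_REGISTRY
    if ctx == "pull":
        return pull or DEFAULT_REGISTRY
    # Fallback mirrors the nested os.getenv defaults: PULL, then PUSH, then localhost.
    if pull is not None:
        return pull
    if push is not None:
        return push
    return DEFAULT_REGISTRY


if __name__ == "__main__":
    env = {"FUZZFORGE_REGISTRY_PUSH_URL": "registry:5000"}  # hypothetical override
    for ctx in ("push", "pull", "default"):
        print(ctx, "->", resolve_registry_url(ctx, env))
```

With only the push override set, a `pull` lookup still falls back to `localhost:5001`, while the default context picks up the push override because no pull override exists.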
def _compose_project_name(default: str = "fuzzforge") -> str:
    """Return the docker-compose project name used for network/volume naming.

    Always returns 'fuzzforge' regardless of environment variables.
    """
    return "fuzzforge"


class PrefectManager:
    """
    Manages Prefect deployments and flow runs for discovered workflows.

    This class handles:
    - Workflow discovery and registration
    - Docker image building through Prefect
    - Deployment creation and management
    - Flow run submission with volume mounting
    - Findings retrieval from completed runs
    """

    def __init__(self, workflows_dir: Path = None):
        """
        Initialize the Prefect manager.

        Args:
            workflows_dir: Path to the workflows directory (default: toolbox/workflows)
        """
        if workflows_dir is None:
            workflows_dir = Path("toolbox/workflows")

        self.discovery = WorkflowDiscovery(workflows_dir)
        self.workflows: Dict[str, WorkflowInfo] = {}
        self.deployments: Dict[str, str] = {}  # workflow_name -> deployment_id

        # Security: Define allowed and forbidden paths for host mounting
        self.allowed_base_paths = [
            "/tmp",
            "/home",
            "/Users",  # macOS users
            "/opt",
            "/var/tmp",
            "/workspace",  # Common container workspace
            "/app"  # Container application directory (for test projects)
        ]

        self.forbidden_paths = [
            "/etc",
            "/root",
            "/var/run",
            "/sys",
            "/proc",
            "/dev",
            "/boot",
            "/var/lib/docker",  # Critical Docker data
            "/var/log",  # System logs
            "/usr/bin",  # System binaries
            "/usr/sbin",
            "/sbin",
            "/bin"
        ]

    @staticmethod
    def _parse_memory_to_bytes(memory_str: str) -> int:
        """
        Parse memory string (like '512Mi', '1Gi') to bytes.

        Args:
            memory_str: Memory string with unit suffix

        Returns:
            Memory in bytes

        Raises:
            ValueError: If format is invalid
        """
        if not memory_str:
            return 0

        match = re.match(r'^(\d+(?:\.\d+)?)\s*([GMK]i?)$', memory_str.strip())
        if not match:
            raise ValueError(f"Invalid memory format: {memory_str}. Expected format like '512Mi', '1Gi'")

        value, unit = match.groups()
        value = float(value)

        # Convert to bytes based on unit (binary units: Ki, Mi, Gi)
        if unit in ['K', 'Ki']:
            multiplier = 1024
        elif unit in ['M', 'Mi']:
            multiplier = 1024 * 1024
        elif unit in ['G', 'Gi']:
            multiplier = 1024 * 1024 * 1024
        else:
            raise ValueError(f"Unsupported memory unit: {unit}")

        return int(value * multiplier)

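The unit handling in `_parse_memory_to_bytes` can be exercised in isolation. The following is a standalone sketch of the same parsing rules, written as a module-level function for illustration; the name `parse_memory_to_bytes` and the lookup table `_MULTIPLIERS` are assumptions, not part of the deleted file. Note that the bare suffixes `K`, `M`, and `G` are treated as binary units, exactly like `Ki`, `Mi`, and `Gi`.

```python
import re

# Same pattern as the method above: a number, optional whitespace, then K/M/G with optional "i".
_MEMORY_RE = re.compile(r"^(\d+(?:\.\d+)?)\s*([GMK]i?)$")
_MULTIPLIERS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_memory_to_bytes(memory_str: str) -> int:
    """Convert strings such as '512Mi' or '1.5Gi' to a byte count."""
    if not memory_str:
        return 0
    match = _MEMORY_RE.match(memory_str.strip())
    if not match:
        raise ValueError(f"Invalid memory format: {memory_str}")
    value, unit = match.groups()
    # Only the first letter matters because 'M' and 'Mi' share a multiplier.
    return int(float(value) * _MULTIPLIERS[unit[0]])


if __name__ == "__main__":
    for sample in ("512Mi", "1Gi", "1.5Gi", "64 K"):
        print(sample, "->", parse_memory_to_bytes(sample))
```

Decimal suffixes such as `MB` or lowercase `mi` do not match the pattern and raise `ValueError`, which the caller in `_extract_resource_requirements` turns into a logged warning.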
    @staticmethod
    def _parse_cpu_to_millicores(cpu_str: str) -> int:
        """
        Parse CPU string (like '500m', '1', '2.5') to millicores.

        Args:
            cpu_str: CPU string

        Returns:
            CPU in millicores (1 core = 1000 millicores)

        Raises:
            ValueError: If format is invalid
        """
        if not cpu_str:
            return 0

        cpu_str = cpu_str.strip()

        # Handle millicores format (e.g., '500m')
        if cpu_str.endswith('m'):
            try:
                return int(cpu_str[:-1])
            except ValueError:
                raise ValueError(f"Invalid CPU format: {cpu_str}")

        # Handle core format (e.g., '1', '2.5')
        try:
            cores = float(cpu_str)
            return int(cores * 1000)  # Convert to millicores
        except ValueError:
            raise ValueError(f"Invalid CPU format: {cpu_str}")

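The two accepted CPU notations in `_parse_cpu_to_millicores` (a millicore count with an `m` suffix, or a plain core count) can be illustrated with a standalone sketch. The function name `parse_cpu_to_millicores` is an assumption made for this example; the logic follows the method above.

```python
def parse_cpu_to_millicores(cpu_str: str) -> int:
    """Convert '500m', '1' or '2.5' to millicores (1 core = 1000 millicores)."""
    if not cpu_str:
        return 0
    cpu_str = cpu_str.strip()

    if cpu_str.endswith("m"):
        # Millicore form: the part before 'm' must be a whole number.
        try:
            return int(cpu_str[:-1])
        except ValueError:
            raise ValueError(f"Invalid CPU format: {cpu_str}") from None

    # Core form: fractional cores are allowed; int() truncates toward zero.
    try:
        return int(float(cpu_str) * 1000)
    except ValueError:
        raise ValueError(f"Invalid CPU format: {cpu_str}") from None


if __name__ == "__main__":
    for sample in ("500m", "1", "2.5"):
        print(sample, "->", parse_cpu_to_millicores(sample))
```

A fractional millicore value such as `1.5m` is rejected because `int("1.5")` raises `ValueError`, so only whole millicores are accepted in the suffixed form.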
    def _extract_resource_requirements(self, workflow_info: WorkflowInfo) -> Dict[str, str]:
        """
        Extract resource requirements from workflow metadata.

        Args:
            workflow_info: Workflow information with metadata

        Returns:
            Dictionary with resource requirements in Docker format
        """
        metadata = workflow_info.metadata
        requirements = metadata.get("requirements", {})
        resources = requirements.get("resources", {})

        resource_config = {}

        # Extract memory requirement
        memory = resources.get("memory")
        if memory:
            try:
                # Validate memory format and store original string for Docker
                self._parse_memory_to_bytes(memory)
                resource_config["memory"] = memory
            except ValueError as e:
                logger.warning(f"Invalid memory requirement in {workflow_info.name}: {e}")

        # Extract CPU requirement
        cpu = resources.get("cpu")
        if cpu:
            try:
                # Validate CPU format and store original string for Docker
                self._parse_cpu_to_millicores(cpu)
                resource_config["cpus"] = cpu
            except ValueError as e:
                logger.warning(f"Invalid CPU requirement in {workflow_info.name}: {e}")

        # Extract timeout
        timeout = resources.get("timeout")
        if timeout and isinstance(timeout, int):
            resource_config["timeout"] = str(timeout)

        return resource_config

    async def initialize(self):
        """
        Initialize the manager by discovering and deploying all workflows.

        This method:
        1. Discovers all valid workflows in the workflows directory
        2. Validates their metadata
        3. Deploys each workflow to Prefect with Docker images
        """
        try:
            # Discover workflows
            self.workflows = await self.discovery.discover_workflows()

            if not self.workflows:
                logger.warning("No workflows discovered")
                return

            logger.info(f"Discovered {len(self.workflows)} workflows: {list(self.workflows.keys())}")

            # Deploy each workflow
            for name, info in self.workflows.items():
                try:
                    await self._deploy_workflow(name, info)
                except Exception as e:
                    logger.error(f"Failed to deploy workflow '{name}': {e}")

        except Exception as e:
            logger.error(f"Failed to initialize Prefect manager: {e}")
            raise

    async def _deploy_workflow(self, name: str, info: WorkflowInfo):
        """
        Deploy a single workflow to Prefect with Docker image.

        Args:
            name: Workflow name
            info: Workflow information including metadata and paths
        """
        logger.info(f"Deploying workflow '{name}'...")

        # Get the flow function from registry
        flow_func = self.discovery.get_flow_function(name)
        if not flow_func:
            logger.error(
                f"Failed to get flow function for '{name}' from registry. "
                f"Ensure the workflow is properly registered in toolbox/workflows/registry.py"
            )
            return

        # Use the mandatory Dockerfile with absolute paths for Docker Compose
        # Get absolute paths for build context and dockerfile
        toolbox_path = info.path.parent.parent.resolve()
        dockerfile_abs_path = info.dockerfile.resolve()

        # Calculate relative dockerfile path from toolbox context
        try:
            dockerfile_rel_path = dockerfile_abs_path.relative_to(toolbox_path)
        except ValueError:
            # If relative path fails, use the workflow-specific path
            dockerfile_rel_path = Path("workflows") / name / "Dockerfile"

        # Determine deployment strategy based on Dockerfile presence
        base_image = "prefecthq/prefect:3-python3.11"
        has_custom_dockerfile = info.has_docker and info.dockerfile.exists()

        logger.info(f"=== DEPLOYMENT DEBUG for '{name}' ===")
        logger.info(f"info.has_docker: {info.has_docker}")
        logger.info(f"info.dockerfile: {info.dockerfile}")
        logger.info(f"info.dockerfile.exists(): {info.dockerfile.exists()}")
        logger.info(f"has_custom_dockerfile: {has_custom_dockerfile}")
        logger.info(f"toolbox_path: {toolbox_path}")
        logger.info(f"dockerfile_rel_path: {dockerfile_rel_path}")

        if has_custom_dockerfile:
            logger.info(f"Workflow '{name}' has custom Dockerfile - building custom image")
            # Decide whether to use registry or keep images local to host engine
            import os
            # Default to using the local registry; set FUZZFORGE_USE_REGISTRY=false to bypass (not recommended)
            use_registry = os.getenv("FUZZFORGE_USE_REGISTRY", "true").lower() == "true"

            if use_registry:
                registry_url = get_registry_url(context="push")
                image_spec = DockerImage(
                    name=f"{registry_url}/fuzzforge/{name}",
                    tag="latest",
                    dockerfile=str(dockerfile_rel_path),
                    context=str(toolbox_path)
                )
                deploy_image = f"{registry_url}/fuzzforge/{name}:latest"
                build_custom = True
                push_custom = True
                logger.info(f"Using registry: {registry_url} for '{name}'")
            else:
                # Single-host mode: build into host engine cache; no push required
                image_spec = DockerImage(
                    name=f"fuzzforge/{name}",
                    tag="latest",
                    dockerfile=str(dockerfile_rel_path),
                    context=str(toolbox_path)
                )
                deploy_image = f"fuzzforge/{name}:latest"
                build_custom = True
                push_custom = False
                logger.info("Using single-host image (no registry push): %s", deploy_image)
        else:
            logger.info(f"Workflow '{name}' using base image - no custom dependencies needed")
            deploy_image = base_image
            build_custom = False
            push_custom = False

        # Pre-validate registry connectivity when pushing
        if push_custom:
            try:
                from .setup import validate_registry_connectivity
                await validate_registry_connectivity(registry_url)
                logger.info(f"Registry connectivity validated for {registry_url}")
            except Exception as e:
                logger.error(f"Registry connectivity validation failed for {registry_url}: {e}")
                raise RuntimeError(f"Cannot deploy workflow '{name}': Registry {registry_url} is not accessible. {e}")

        # Deploy the workflow
        try:
            # Ensure any previous deployment is removed so job variables are updated
            try:
                async with get_client() as client:
                    existing = await client.read_deployment_by_name(
                        f"{name}/{name}-deployment"
                    )
                    if existing:
                        logger.info(f"Removing existing deployment for '{name}' to refresh settings...")
                        await client.delete_deployment(existing.id)
            except Exception:
                # If not found or deletion fails, continue with deployment
                pass

            # Extract resource requirements from metadata
            workflow_resource_requirements = self._extract_resource_requirements(info)
            logger.info(f"Workflow '{name}' resource requirements: {workflow_resource_requirements}")

            # Build job variables with resource requirements
            job_variables = {
                "image": deploy_image,  # Use the worker-accessible registry name
                "volumes": [],  # Populated at run submission with toolbox mount
                "env": {
                    "PYTHONPATH": "/opt/prefect/toolbox:/opt/prefect",
                    "WORKFLOW_NAME": name
                }
            }

            # Add resource requirements to job variables if present
            if workflow_resource_requirements:
                job_variables["resources"] = workflow_resource_requirements

            # Prepare deployment parameters
            deploy_params = {
                "name": f"{name}-deployment",
                "work_pool_name": "docker-pool",
                "image": image_spec if has_custom_dockerfile else deploy_image,
                "push": push_custom,
                "build": build_custom,
                "job_variables": job_variables
            }

            deployment = await flow_func.deploy(**deploy_params)

            self.deployments[name] = str(deployment.id) if hasattr(deployment, 'id') else name
            logger.info(f"Successfully deployed workflow '{name}'")

        except Exception as e:
            # Enhanced error reporting with more context
            import traceback
            logger.error(f"Failed to deploy workflow '{name}': {e}")
            logger.error(f"Deployment traceback: {traceback.format_exc()}")

            # Try to capture Docker-specific context
            error_context = {
                "workflow_name": name,
                "has_dockerfile": has_custom_dockerfile,
                "image_name": deploy_image if 'deploy_image' in locals() else "unknown",
                "registry_url": registry_url if 'registry_url' in locals() else "unknown",
                "error_type": type(e).__name__,
                "error_message": str(e)
            }

            # Check for specific error patterns with detailed categorization
            error_msg_lower = str(e).lower()
            if "registry" in error_msg_lower and ("no such host" in error_msg_lower or "connection" in error_msg_lower):
                error_context["category"] = "registry_connectivity_error"
                error_context["solution"] = f"Cannot reach registry at {error_context['registry_url']}. Check Docker network and registry service."
            elif "docker" in error_msg_lower:
                error_context["category"] = "docker_error"
                if "build" in error_msg_lower:
                    error_context["subcategory"] = "image_build_failed"
                    error_context["solution"] = "Check Dockerfile syntax and dependencies."
                elif "pull" in error_msg_lower:
                    error_context["subcategory"] = "image_pull_failed"
                    error_context["solution"] = "Check if image exists in registry and network connectivity."
                elif "push" in error_msg_lower:
                    error_context["subcategory"] = "image_push_failed"
                    error_context["solution"] = f"Check registry connectivity and push permissions to {error_context['registry_url']}."
            elif "registry" in error_msg_lower:
                error_context["category"] = "registry_error"
                error_context["solution"] = "Check registry configuration and accessibility."
            elif "prefect" in error_msg_lower:
                error_context["category"] = "prefect_error"
                error_context["solution"] = "Check Prefect server connectivity and deployment configuration."
            else:
                error_context["category"] = "unknown_deployment_error"
                error_context["solution"] = "Check logs for more specific error details."

            logger.error(f"Deployment error context: {error_context}")

            # Raise enhanced exception with context
            enhanced_error = Exception(f"Deployment failed for workflow '{name}': {str(e)} | Context: {error_context}")
            enhanced_error.original_error = e
            enhanced_error.context = error_context
            raise enhanced_error

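The error categorization at the end of `_deploy_workflow` is order-sensitive: a message is matched against substrings in sequence, and the first matching branch wins. The sketch below isolates that decision chain so the ordering can be checked directly. It is an illustrative restatement under assumptions: the function name `categorize_deployment_error` and the sample messages are hypothetical, and the `solution` hints are left out for brevity.

```python
from typing import Dict


def categorize_deployment_error(message: str) -> Dict[str, str]:
    """Classify a deployment error message by substring, first match wins."""
    msg = message.lower()

    # Connectivity problems are checked before the generic docker/registry branches.
    if "registry" in msg and ("no such host" in msg or "connection" in msg):
        return {"category": "registry_connectivity_error"}

    if "docker" in msg:
        context = {"category": "docker_error"}
        for keyword in ("build", "pull", "push"):
            if keyword in msg:
                context["subcategory"] = f"image_{keyword}_failed"
                break
        return context

    if "registry" in msg:
        return {"category": "registry_error"}
    if "prefect" in msg:
        return {"category": "prefect_error"}
    return {"category": "unknown_deployment_error"}


if __name__ == "__main__":
    samples = [
        "docker push to registry failed: connection refused",
        "Docker build failed: unknown instruction",
        "registry returned 401 unauthorized",
    ]
    for text in samples:
        print(text, "->", categorize_deployment_error(text))
```

Because the connectivity check comes first, a failed Docker push that mentions both the registry and a refused connection is reported as a registry connectivity problem, not as a generic Docker error.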
    async def submit_workflow(
        self,
        workflow_name: str,
        target_path: str,
        volume_mode: str = "ro",
        parameters: Dict[str, Any] = None,
        resource_limits: Dict[str, str] = None,
        additional_volumes: list = None,
        timeout: int = None
    ) -> FlowRun:
        """
        Submit a workflow for execution with volume mounting.

        Args:
            workflow_name: Name of the workflow to execute
            target_path: Host path to mount as volume
            volume_mode: Volume mount mode ("ro" for read-only, "rw" for read-write)
            parameters: Workflow-specific parameters
            resource_limits: CPU/memory limits for container
            additional_volumes: List of additional volume mounts
            timeout: Timeout in seconds

        Returns:
            FlowRun object with run information

        Raises:
            ValueError: If workflow not found or volume mode not supported
        """
        if workflow_name not in self.workflows:
            raise ValueError(f"Unknown workflow: {workflow_name}")

        # Validate volume mode
        workflow_info = self.workflows[workflow_name]
        supported_modes = workflow_info.metadata.get("supported_volume_modes", ["ro", "rw"])

        if volume_mode not in supported_modes:
            raise ValueError(
                f"Workflow '{workflow_name}' doesn't support volume mode '{volume_mode}'. "
                f"Supported modes: {supported_modes}"
            )

        # Validate target path with security checks
        self._validate_target_path(target_path)

        # Validate additional volumes if provided
        if additional_volumes:
            for volume in additional_volumes:
                self._validate_target_path(volume.host_path)

        async with get_client() as client:
            # Get the deployment, auto-redeploy once if missing
            try:
                deployment = await client.read_deployment_by_name(
                    f"{workflow_name}/{workflow_name}-deployment"
                )
            except Exception as e:
                import traceback
                logger.error(f"Failed to find deployment for workflow '{workflow_name}': {e}")
                logger.error(f"Deployment lookup traceback: {traceback.format_exc()}")

                # Attempt a one-time auto-deploy to recover from startup races
                try:
                    logger.info(f"Auto-deploying missing workflow '{workflow_name}' and retrying...")
                    await self._deploy_workflow(workflow_name, workflow_info)
                    deployment = await client.read_deployment_by_name(
                        f"{workflow_name}/{workflow_name}-deployment"
                    )
                except Exception as redeploy_exc:
                    # Enhanced error with context
                    error_context = {
                        "workflow_name": workflow_name,
                        "error_type": type(e).__name__,
                        "error_message": str(e),
                        "redeploy_error": str(redeploy_exc),
                        "available_deployments": list(self.deployments.keys()),
                    }
                    enhanced_error = ValueError(
                        f"Deployment not found and redeploy failed for workflow '{workflow_name}': {e} | Context: {error_context}"
                    )
                    enhanced_error.context = error_context
                    raise enhanced_error

            # Determine the Docker Compose network name and volume names
            # Hardcoded to 'fuzzforge' to avoid directory name dependencies
            import os
            compose_project = "fuzzforge"
            docker_network = "fuzzforge_default"

            # Build volume mounts
            # Add toolbox volume mount for workflow code access
            backend_toolbox_path = "/app/toolbox"  # Path in backend container

            # Hardcoded volume names
            prefect_storage_volume = "fuzzforge_prefect_storage"
            toolbox_code_volume = "fuzzforge_toolbox_code"

            volumes = [
                f"{target_path}:/workspace:{volume_mode}",
                f"{prefect_storage_volume}:/prefect-storage",  # Shared storage for results
                f"{toolbox_code_volume}:/opt/prefect/toolbox:ro"  # Mount workflow code
            ]

            # Add additional volumes if provided
            if additional_volumes:
                for volume in additional_volumes:
                    volume_spec = f"{volume.host_path}:{volume.container_path}:{volume.mode}"
                    volumes.append(volume_spec)

            # Build environment variables
            env_vars = {
                "PREFECT_API_URL": "http://prefect-server:4200/api",  # Use internal network hostname
                "PREFECT_LOGGING_LEVEL": "INFO",
                "PREFECT_LOCAL_STORAGE_PATH": "/prefect-storage",  # Use shared storage
                "PREFECT_RESULTS_PERSIST_BY_DEFAULT": "true",  # Enable result persistence
                "PREFECT_DEFAULT_RESULT_STORAGE_BLOCK": "local-file-system/fuzzforge-results",  # Use our storage block
                "WORKSPACE_PATH": "/workspace",
                "VOLUME_MODE": volume_mode,
                "WORKFLOW_NAME": workflow_name
            }

            # Add additional volume paths to environment for easy access
            if additional_volumes:
                for i, volume in enumerate(additional_volumes):
                    env_vars[f"ADDITIONAL_VOLUME_{i}_PATH"] = volume.container_path

            # Determine which image to use based on workflow configuration
            workflow_info = self.workflows[workflow_name]
            has_custom_dockerfile = workflow_info.has_docker and workflow_info.dockerfile.exists()
            # Use pull context for worker to pull from registry
            registry_url = get_registry_url(context="pull")
            workflow_image = f"{registry_url}/fuzzforge/{workflow_name}:latest" if has_custom_dockerfile else "prefecthq/prefect:3-python3.11"
            logger.debug(f"Worker will pull image: {workflow_image} (Registry: {registry_url})")

            # Configure job variables with volume mounting and network access
            job_variables = {
                # Use custom image if available, otherwise base Prefect image
                "image": workflow_image,
                "volumes": volumes,
                "networks": [docker_network],  # Connect to Docker Compose network
                "env": {
                    **env_vars,
                    "PYTHONPATH": "/opt/prefect/toolbox:/opt/prefect/toolbox/workflows",
                    "WORKFLOW_NAME": workflow_name
|
||||
}
|
||||
}
|
||||
|
||||
# Apply resource requirements from workflow metadata and user overrides
|
||||
workflow_resource_requirements = self._extract_resource_requirements(workflow_info)
|
||||
final_resource_config = {}
|
||||
|
||||
# Start with workflow requirements as base
|
||||
if workflow_resource_requirements:
|
||||
final_resource_config.update(workflow_resource_requirements)
|
||||
|
||||
# Apply user-provided resource limits (overrides workflow defaults)
|
||||
if resource_limits:
|
||||
user_resource_config = {}
|
||||
if resource_limits.get("cpu_limit"):
|
||||
user_resource_config["cpus"] = resource_limits["cpu_limit"]
|
||||
if resource_limits.get("memory_limit"):
|
||||
user_resource_config["memory"] = resource_limits["memory_limit"]
|
||||
# Note: cpu_request and memory_request are not directly supported by Docker
|
||||
# but could be used for Kubernetes in the future
|
||||
|
||||
# User overrides take precedence
|
||||
final_resource_config.update(user_resource_config)
|
||||
|
||||
# Apply final resource configuration
|
||||
if final_resource_config:
|
||||
job_variables["resources"] = final_resource_config
|
||||
logger.info(f"Applied resource limits: {final_resource_config}")
|
||||
|
||||
# Merge parameters with defaults from metadata
|
||||
default_params = workflow_info.metadata.get("default_parameters", {})
|
||||
final_params = {**default_params, **(parameters or {})}
|
||||
|
||||
# Set flow parameters that match the flow signature
|
||||
final_params["target_path"] = "/workspace" # Container path where volume is mounted
|
||||
final_params["volume_mode"] = volume_mode
|
||||
|
||||
# Create and submit the flow run
|
||||
# Pass job_variables to ensure network, volumes, and environment are configured
|
||||
logger.info(f"Submitting flow with job_variables: {job_variables}")
|
||||
logger.info(f"Submitting flow with parameters: {final_params}")
|
||||
|
||||
# Prepare flow run creation parameters
|
||||
flow_run_params = {
|
||||
"deployment_id": deployment.id,
|
||||
"parameters": final_params,
|
||||
"job_variables": job_variables
|
||||
}
|
||||
|
||||
# Note: Timeout is handled through workflow-level configuration
|
||||
# Additional timeout configuration can be added to deployment metadata if needed
|
||||
|
||||
flow_run = await client.create_flow_run_from_deployment(**flow_run_params)
|
||||
|
||||
logger.info(
|
||||
f"Submitted workflow '{workflow_name}' with run_id: {flow_run.id}, "
|
||||
f"target: {target_path}, mode: {volume_mode}"
|
||||
)
|
||||
|
||||
return flow_run
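The precedence applied while building the run configuration above (workflow metadata first, then user-supplied limits; metadata defaults first, then caller parameters, then the fixed container values) can be illustrated in isolation. This is a minimal sketch over plain dictionaries; `merge_resources` and `merge_parameters` are hypothetical helper names introduced for illustration and are not functions from this codebase.

```python
from typing import Any, Dict, Optional


def merge_resources(
    workflow_requirements: Optional[Dict[str, Any]],
    resource_limits: Optional[Dict[str, Any]],
) -> Dict[str, Any]:
    """Hypothetical helper: workflow defaults first, user limits override."""
    merged: Dict[str, Any] = dict(workflow_requirements or {})
    limits = resource_limits or {}
    # Only truthy user values override, mirroring the .get() checks above.
    if limits.get("cpu_limit"):
        merged["cpus"] = limits["cpu_limit"]
    if limits.get("memory_limit"):
        merged["memory"] = limits["memory_limit"]
    return merged


def merge_parameters(
    default_params: Dict[str, Any],
    parameters: Optional[Dict[str, Any]],
    volume_mode: str,
) -> Dict[str, Any]:
    """Hypothetical helper: caller parameters override metadata defaults."""
    final = {**default_params, **(parameters or {})}
    # These two keys are assigned last, so a caller cannot override them.
    final["target_path"] = "/workspace"
    final["volume_mode"] = volume_mode
    return final
```

Because `target_path` and `volume_mode` are written after the merge, a caller-supplied `target_path` is silently replaced by the container mount point.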
|
||||
|
||||
async def get_flow_run_findings(self, run_id: str) -> Dict[str, Any]:
|
||||
"""
|
||||
Retrieve findings from a completed flow run.
|
||||
|
||||
Args:
|
||||
run_id: The flow run ID
|
||||
|
||||
Returns:
|
||||
Dictionary containing SARIF-formatted findings
|
||||
|
||||
Raises:
|
||||
ValueError: If run not completed or not found
|
||||
"""
|
||||
async with get_client() as client:
|
||||
flow_run = await client.read_flow_run(run_id)
|
||||
|
||||
if not flow_run.state.is_completed():
|
||||
raise ValueError(
|
||||
f"Flow run {run_id} not completed. Current status: {flow_run.state.name}"
|
||||
)
|
||||
|
||||
# Get the findings from the flow run result
|
||||
try:
|
||||
findings = await flow_run.state.result()
|
||||
return findings
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to retrieve findings for run {run_id}: {e}")
|
||||
raise ValueError(f"Failed to retrieve findings: {e}")
|
||||
|
||||
async def get_flow_run_status(self, run_id: str) -> Dict[str, Any]:
|
||||
"""
|
||||
Get the current status of a flow run.
|
||||
|
||||
Args:
|
||||
run_id: The flow run ID
|
||||
|
||||
Returns:
|
||||
Dictionary with status information
|
||||
"""
|
||||
async with get_client() as client:
|
||||
flow_run = await client.read_flow_run(run_id)
|
||||
|
||||
return {
|
||||
"run_id": str(flow_run.id),
|
||||
"workflow": flow_run.deployment_id,
|
||||
"status": flow_run.state.name,
|
||||
"is_completed": flow_run.state.is_completed(),
|
||||
"is_failed": flow_run.state.is_failed(),
|
||||
"is_running": flow_run.state.is_running(),
|
||||
"created_at": flow_run.created,
|
||||
"updated_at": flow_run.updated
|
||||
}
|
||||
|
||||
def _validate_target_path(self, target_path: str) -> None:
|
||||
"""
|
||||
Validate target path for security before mounting as volume.
|
||||
|
||||
Args:
|
||||
target_path: Host path to validate
|
||||
|
||||
Raises:
|
||||
ValueError: If path is not allowed for security reasons
|
||||
"""
|
||||
target = Path(target_path)
|
||||
|
||||
# Path must be absolute
|
||||
if not target.is_absolute():
|
||||
raise ValueError(f"Target path must be absolute: {target_path}")
|
||||
|
||||
# Resolve path to handle symlinks and relative components
|
||||
try:
|
||||
resolved_path = target.resolve()
|
||||
except (OSError, RuntimeError) as e:
|
||||
raise ValueError(f"Cannot resolve target path: {target_path} - {e}")
|
||||
|
||||
resolved_str = str(resolved_path)
|
||||
|
||||
# Check against forbidden paths first (more restrictive)
|
||||
for forbidden in self.forbidden_paths:
|
||||
if resolved_str.startswith(forbidden):
|
||||
raise ValueError(
|
||||
f"Access denied: Path '{target_path}' resolves to forbidden directory '{forbidden}'. "
|
||||
f"This path contains sensitive system files and cannot be mounted."
|
||||
)
|
||||
|
||||
# Check if path starts with any allowed base path
|
||||
path_allowed = False
|
||||
for allowed in self.allowed_base_paths:
|
||||
if resolved_str.startswith(allowed):
|
||||
path_allowed = True
|
||||
break
|
||||
|
||||
if not path_allowed:
|
||||
allowed_list = ", ".join(self.allowed_base_paths)
|
||||
raise ValueError(
|
||||
f"Access denied: Path '{target_path}' is not in allowed directories. "
|
||||
f"Allowed base paths: {allowed_list}"
|
||||
)
|
||||
|
||||
# Additional security checks
|
||||
if resolved_str == "/":
|
||||
raise ValueError("Cannot mount root filesystem")
|
||||
|
||||
# Warn if path doesn't exist (but don't block - it might be created later)
|
||||
if not resolved_path.exists():
|
||||
logger.warning(f"Target path does not exist: {target_path}")
|
||||
|
||||
logger.info(f"Path validation passed for: {target_path} -> {resolved_str}")
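One caveat with the prefix checks above: a plain `str.startswith` comparison also matches sibling directories that merely share a prefix (for example `/etcetera` against `/etc`). The following is a minimal sketch of a component-aware comparison using `pathlib`, offered as an illustration and not as the project's implementation. `is_within` is a hypothetical name, and the sketch assumes the path has already been resolved, as the method above does with `Path.resolve()`.

```python
from pathlib import PurePosixPath


def is_within(path: str, base: str) -> bool:
    """Return True if `path` equals `base` or lies underneath it.

    Whole path components are compared, so '/etcetera' is not
    treated as being inside '/etc'.
    """
    p = PurePosixPath(path)
    b = PurePosixPath(base)
    return p == b or b in p.parents
```

The same helper could serve both the forbidden-path and the allowed-base-path loops, since both ask whether one directory contains another.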
|
||||
+10
-367
@@ -1,5 +1,5 @@
|
||||
 """
-Setup utilities for Prefect infrastructure
+Setup utilities for FuzzForge infrastructure
 """
|
||||
|
||||
# Copyright (c) 2025 FuzzingLabs
|
||||
@@ -14,364 +14,21 @@ Setup utilities for Prefect infrastructure
|
||||
# Additional attribution and requirements are provided in the NOTICE file.
|
||||
|
||||
import logging
|
||||
from prefect import get_client
|
||||
from prefect.client.schemas.actions import WorkPoolCreate
|
||||
from prefect.client.schemas.objects import WorkPool
|
||||
from .prefect_manager import get_registry_url
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
async def setup_docker_pool():
|
||||
"""
|
||||
Create or update the Docker work pool for container execution.
|
||||
|
||||
This work pool is configured to:
|
||||
- Connect to the local Docker daemon
|
||||
- Support volume mounting at runtime
|
||||
- Clean up containers after execution
|
||||
- Use bridge networking by default
|
||||
"""
|
||||
import os
|
||||
|
||||
async with get_client() as client:
|
||||
pool_name = "docker-pool"
|
||||
|
||||
# Add force recreation flag for debugging fresh install issues
|
||||
force_recreate = os.getenv('FORCE_RECREATE_WORK_POOL', 'false').lower() == 'true'
|
||||
debug_setup = os.getenv('DEBUG_WORK_POOL_SETUP', 'false').lower() == 'true'
|
||||
|
||||
if force_recreate:
|
||||
logger.warning(f"FORCE_RECREATE_WORK_POOL=true - Will recreate work pool regardless of existing configuration")
|
||||
if debug_setup:
|
||||
logger.warning(f"DEBUG_WORK_POOL_SETUP=true - Enhanced logging enabled")
|
||||
# Temporarily set logging level to DEBUG for this function
|
||||
original_level = logger.level
|
||||
logger.setLevel(logging.DEBUG)
|
||||
|
||||
try:
|
||||
# Check if pool already exists and supports custom images
|
||||
existing_pools = await client.read_work_pools()
|
||||
existing_pool = None
|
||||
for pool in existing_pools:
|
||||
if pool.name == pool_name:
|
||||
existing_pool = pool
|
||||
break
|
||||
|
||||
if existing_pool and not force_recreate:
|
||||
logger.info(f"Found existing work pool '{pool_name}' - validating configuration...")
|
||||
|
||||
# Check if the existing pool has the correct configuration
|
||||
base_template = existing_pool.base_job_template or {}
|
||||
logger.debug(f"Base template keys: {list(base_template.keys())}")
|
||||
|
||||
job_config = base_template.get("job_configuration", {})
|
||||
logger.debug(f"Job config keys: {list(job_config.keys())}")
|
||||
|
||||
image_config = job_config.get("image", "")
|
||||
has_image_variable = "{{ image }}" in str(image_config)
|
||||
logger.debug(f"Image config: '{image_config}' -> has_image_variable: {has_image_variable}")
|
||||
|
||||
# Check if volume defaults include toolbox mount
|
||||
variables = base_template.get("variables", {})
|
||||
properties = variables.get("properties", {})
|
||||
volume_config = properties.get("volumes", {})
|
||||
volume_defaults = volume_config.get("default", [])
|
||||
has_toolbox_volume = any("toolbox_code" in str(vol) for vol in volume_defaults) if volume_defaults else False
|
||||
logger.debug(f"Volume defaults: {volume_defaults}")
|
||||
logger.debug(f"Has toolbox volume: {has_toolbox_volume}")
|
||||
|
||||
# Check if environment defaults include required settings
|
||||
env_config = properties.get("env", {})
|
||||
env_defaults = env_config.get("default", {})
|
||||
has_api_url = "PREFECT_API_URL" in env_defaults
|
||||
has_storage_path = "PREFECT_LOCAL_STORAGE_PATH" in env_defaults
|
||||
has_results_persist = "PREFECT_RESULTS_PERSIST_BY_DEFAULT" in env_defaults
|
||||
has_required_env = has_api_url and has_storage_path and has_results_persist
|
||||
logger.debug(f"Environment defaults: {env_defaults}")
|
||||
logger.debug(f"Has API URL: {has_api_url}, Has storage path: {has_storage_path}, Has results persist: {has_results_persist}")
|
||||
logger.debug(f"Has required env: {has_required_env}")
|
||||
|
||||
# Log the full validation result
|
||||
logger.info(f"Work pool validation - Image: {has_image_variable}, Toolbox: {has_toolbox_volume}, Environment: {has_required_env}")
|
||||
|
||||
if has_image_variable and has_toolbox_volume and has_required_env:
|
||||
logger.info(f"Docker work pool '{pool_name}' already exists with correct configuration")
|
||||
return
|
||||
else:
|
||||
reasons = []
|
||||
if not has_image_variable:
|
||||
reasons.append("missing image template")
|
||||
if not has_toolbox_volume:
|
||||
reasons.append("missing toolbox volume mount")
|
||||
if not has_required_env:
|
||||
if not has_api_url:
|
||||
reasons.append("missing PREFECT_API_URL")
|
||||
if not has_storage_path:
|
||||
reasons.append("missing PREFECT_LOCAL_STORAGE_PATH")
|
||||
if not has_results_persist:
|
||||
reasons.append("missing PREFECT_RESULTS_PERSIST_BY_DEFAULT")
|
||||
|
||||
logger.warning(f"Docker work pool '{pool_name}' exists but lacks: {', '.join(reasons)}. Recreating...")
|
||||
# Delete the old pool and recreate it
|
||||
try:
|
||||
await client.delete_work_pool(pool_name)
|
||||
logger.info(f"Deleted old work pool '{pool_name}'")
|
||||
except Exception as e:
|
||||
logger.warning(f"Failed to delete old work pool: {e}")
|
||||
elif force_recreate and existing_pool:
|
||||
logger.warning(f"Force recreation enabled - deleting existing work pool '{pool_name}'")
|
||||
try:
|
||||
await client.delete_work_pool(pool_name)
|
||||
logger.info(f"Deleted existing work pool for force recreation")
|
||||
except Exception as e:
|
||||
logger.warning(f"Failed to delete work pool for force recreation: {e}")
|
||||
|
||||
logger.info(f"Creating Docker work pool '{pool_name}' with custom image support...")
|
||||
|
||||
# Create the work pool with proper Docker configuration
|
||||
work_pool = WorkPoolCreate(
|
||||
name=pool_name,
|
||||
type="docker",
|
||||
description="Docker work pool for FuzzForge workflows with custom image support",
|
||||
base_job_template={
|
||||
"job_configuration": {
|
||||
"image": "{{ image }}", # Template variable for custom images
|
||||
"volumes": "{{ volumes }}", # List of volume mounts
|
||||
"env": "{{ env }}", # Environment variables
|
||||
"networks": "{{ networks }}", # Docker networks
|
||||
"stream_output": True,
|
||||
"auto_remove": True,
|
||||
"privileged": False,
|
||||
"network_mode": None, # Use networks instead
|
||||
"labels": {},
|
||||
"command": None # Let the image's CMD/ENTRYPOINT run
|
||||
},
|
||||
"variables": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"image": {
|
||||
"type": "string",
|
||||
"title": "Docker Image",
|
||||
"default": "prefecthq/prefect:3-python3.11",
|
||||
"description": "Docker image for the flow run"
|
||||
},
|
||||
"volumes": {
|
||||
"type": "array",
|
||||
"title": "Volume Mounts",
|
||||
"default": [
|
||||
"fuzzforge_prefect_storage:/prefect-storage",
|
||||
"fuzzforge_toolbox_code:/opt/prefect/toolbox:ro"
|
||||
],
|
||||
"description": "Volume mounts in format 'host:container:mode'",
|
||||
"items": {
|
||||
"type": "string"
|
||||
}
|
||||
},
|
||||
"networks": {
|
||||
"type": "array",
|
||||
"title": "Docker Networks",
|
||||
"default": ["fuzzforge_default"],
|
||||
"description": "Docker networks to connect container to",
|
||||
"items": {
|
||||
"type": "string"
|
||||
}
|
||||
},
|
||||
"env": {
|
||||
"type": "object",
|
||||
"title": "Environment Variables",
|
||||
"default": {
|
||||
"PREFECT_API_URL": "http://prefect-server:4200/api",
|
||||
"PREFECT_LOCAL_STORAGE_PATH": "/prefect-storage",
|
||||
"PREFECT_RESULTS_PERSIST_BY_DEFAULT": "true"
|
||||
},
|
||||
"description": "Environment variables for the container",
|
||||
"additionalProperties": {
|
||||
"type": "string"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
)
|
||||
|
||||
await client.create_work_pool(work_pool)
|
||||
logger.info(f"Created Docker work pool '{pool_name}'")
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to setup Docker work pool: {e}")
|
||||
raise
|
||||
finally:
|
||||
# Restore original logging level if debug mode was enabled
|
||||
if debug_setup and 'original_level' in locals():
|
||||
logger.setLevel(original_level)
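The three validation checks that decide whether the existing work pool is kept or recreated can be expressed as a pure function over the template dictionary, which makes them easy to test without a Prefect server. This is a sketch only; `template_problems` and `REQUIRED_ENV` are hypothetical names, and the dictionary layout is assumed to match the `base_job_template` shown above.

```python
from typing import Any, Dict, List

# Environment defaults the function above expects to find in the template.
REQUIRED_ENV = (
    "PREFECT_API_URL",
    "PREFECT_LOCAL_STORAGE_PATH",
    "PREFECT_RESULTS_PERSIST_BY_DEFAULT",
)


def template_problems(base_job_template: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: list the reasons a template would be rejected."""
    reasons: List[str] = []

    # 1. The image must be a template variable so custom images can be used.
    job_config = base_job_template.get("job_configuration", {})
    if "{{ image }}" not in str(job_config.get("image", "")):
        reasons.append("missing image template")

    # 2. The default volumes must include the toolbox code mount.
    properties = base_job_template.get("variables", {}).get("properties", {})
    volume_defaults = properties.get("volumes", {}).get("default", [])
    if not any("toolbox_code" in str(v) for v in volume_defaults):
        reasons.append("missing toolbox volume mount")

    # 3. The default environment must carry the required settings.
    env_defaults = properties.get("env", {}).get("default", {})
    for name in REQUIRED_ENV:
        if name not in env_defaults:
            reasons.append(f"missing {name}")

    return reasons
```

An empty list corresponds to the "already exists with correct configuration" branch; any other result corresponds to the delete-and-recreate branch.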
|
||||
|
||||
|
||||
-def get_actual_compose_project_name():
-    """
-    Return the hardcoded compose project name for FuzzForge.
-
-    Always returns 'fuzzforge' as per system requirements.
-    """
-    logger.info("Using hardcoded compose project name: fuzzforge")
-    return "fuzzforge"
|
||||
|
||||
|
||||
 async def setup_result_storage():
     """
-    Create or update Prefect result storage block for findings persistence.
+    Setup result storage (MinIO).

-    This sets up a LocalFileSystem storage block pointing to the shared
-    /prefect-storage volume for result persistence.
+    MinIO is used for both target upload and result storage.
+    This is a placeholder for any MinIO-specific setup if needed.
     """
|
||||
from prefect.filesystems import LocalFileSystem
|
||||
|
||||
storage_name = "fuzzforge-results"
|
||||
|
||||
try:
|
||||
# Create the storage block, overwrite if it exists
|
||||
logger.info(f"Setting up storage block '{storage_name}'...")
|
||||
storage = LocalFileSystem(basepath="/prefect-storage")
|
||||
|
||||
block_doc_id = await storage.save(name=storage_name, overwrite=True)
|
||||
logger.info(f"Storage block '{storage_name}' configured successfully")
|
||||
return str(block_doc_id)
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to setup result storage: {e}")
|
||||
# Don't raise the exception - continue without storage block
|
||||
logger.warning("Continuing without result storage block - findings may not persist")
|
||||
return None
|
||||
|
||||
|
||||
async def validate_docker_connection():
|
||||
"""
|
||||
Validate that Docker is accessible and running.
|
||||
|
||||
Note: In containerized deployments with Docker socket proxy,
|
||||
the backend doesn't need direct Docker access.
|
||||
|
||||
Raises:
|
||||
RuntimeError: If Docker is not accessible
|
||||
"""
|
||||
import os
|
||||
|
||||
# Skip Docker validation if running in container without socket access
|
||||
if os.path.exists("/.dockerenv") and not os.path.exists("/var/run/docker.sock"):
|
||||
logger.info("Running in container without Docker socket - skipping Docker validation")
|
||||
return
|
||||
|
||||
try:
|
||||
import docker
|
||||
client = docker.from_env()
|
||||
client.ping()
|
||||
logger.info("Docker connection validated")
|
||||
except Exception as e:
|
||||
logger.error(f"Docker is not accessible: {e}")
|
||||
raise RuntimeError(
|
||||
"Docker is not running or not accessible. "
|
||||
"Please ensure Docker is installed and running."
|
||||
)
|
||||
|
||||
|
||||
async def validate_registry_connectivity(registry_url: str = None):
|
||||
"""
|
||||
Validate that the Docker registry is accessible.
|
||||
|
||||
Args:
|
||||
registry_url: URL of the Docker registry to validate (auto-detected if None)
|
||||
|
||||
Raises:
|
||||
RuntimeError: If registry is not accessible
|
||||
"""
|
||||
# Resolve a reachable test URL from within this process
|
||||
if registry_url is None:
|
||||
# If not specified, prefer internal service name in containers, host port on host
|
||||
import os
|
||||
if os.path.exists('/.dockerenv'):
|
||||
registry_url = "registry:5000"
|
||||
else:
|
||||
registry_url = "localhost:5001"
|
||||
|
||||
# If we're running inside a container and asked to probe localhost:PORT,
|
||||
# the probe would hit the container, not the host. Use host.docker.internal instead.
|
||||
import os
|
||||
try:
|
||||
host_part, port_part = registry_url.split(":", 1)
|
||||
except ValueError:
|
||||
host_part, port_part = registry_url, "80"
|
||||
|
||||
if os.path.exists('/.dockerenv') and host_part in ("localhost", "127.0.0.1"):
|
||||
test_host = "host.docker.internal"
|
||||
else:
|
||||
test_host = host_part
|
||||
test_url = f"http://{test_host}:{port_part}/v2/"
|
||||
|
||||
import aiohttp
|
||||
import asyncio
|
||||
|
||||
logger.info(f"Validating registry connectivity to {registry_url}...")
|
||||
|
||||
try:
|
||||
async with aiohttp.ClientSession(timeout=aiohttp.ClientTimeout(total=10)) as session:
|
||||
async with session.get(test_url) as response:
|
||||
if response.status == 200:
|
||||
logger.info(f"Registry at {registry_url} is accessible (tested via {test_host})")
|
||||
return
|
||||
else:
|
||||
raise RuntimeError(f"Registry returned status {response.status}")
|
||||
except asyncio.TimeoutError:
|
||||
raise RuntimeError(f"Registry at {registry_url} is not responding (timeout)")
|
||||
except aiohttp.ClientError as e:
|
||||
raise RuntimeError(f"Registry at {registry_url} is not accessible: {e}")
|
||||
except Exception as e:
|
||||
raise RuntimeError(f"Failed to validate registry connectivity: {e}")
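The host rewriting performed before the probe (a `localhost` registry address is unreachable from inside a container, so `host.docker.internal` is substituted) can be isolated into a small pure function. This is a sketch under the same assumptions as the code above; `registry_probe_url` is a hypothetical name, and the `in_container` flag stands in for the `/.dockerenv` existence check.

```python
def registry_probe_url(registry_url: str, in_container: bool) -> str:
    """Hypothetical helper: build the /v2/ URL used to probe a registry."""
    host, sep, port = registry_url.partition(":")
    if not sep:
        # No explicit port given; fall back to 80 as the code above does.
        port = "80"
    if in_container and host in ("localhost", "127.0.0.1"):
        # Inside a container, localhost is the container itself, not the host.
        host = "host.docker.internal"
    return f"http://{host}:{port}/v2/"
```

Keeping this logic free of filesystem and network access makes the container and host cases testable side by side.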
|
||||
|
||||
|
||||
async def validate_docker_network(network_name: str):
|
||||
"""
|
||||
Validate that the specified Docker network exists.
|
||||
|
||||
Args:
|
||||
network_name: Name of the Docker network to validate
|
||||
|
||||
Raises:
|
||||
RuntimeError: If network doesn't exist
|
||||
"""
|
||||
import os
|
||||
|
||||
# Skip network validation if running in container without Docker socket
|
||||
if os.path.exists("/.dockerenv") and not os.path.exists("/var/run/docker.sock"):
|
||||
logger.info("Running in container without Docker socket - skipping network validation")
|
||||
return
|
||||
|
||||
try:
|
||||
import docker
|
||||
client = docker.from_env()
|
||||
|
||||
# List all networks
|
||||
networks = client.networks.list(names=[network_name])
|
||||
|
||||
if not networks:
|
||||
# Try to find networks with similar names
|
||||
all_networks = client.networks.list()
|
||||
similar_networks = [n.name for n in all_networks if "fuzzforge" in n.name.lower()]
|
||||
|
||||
error_msg = f"Docker network '{network_name}' not found."
|
||||
if similar_networks:
|
||||
error_msg += f" Available networks: {similar_networks}"
|
||||
else:
|
||||
error_msg += " Please ensure Docker Compose is running."
|
||||
|
||||
raise RuntimeError(error_msg)
|
||||
|
||||
logger.info(f"Docker network '{network_name}' validated")
|
||||
|
||||
except Exception as e:
|
||||
if isinstance(e, RuntimeError):
|
||||
raise
|
||||
logger.error(f"Network validation failed: {e}")
|
||||
raise RuntimeError(f"Failed to validate Docker network: {e}")
|
||||
logger.info("Result storage (MinIO) configured")
|
||||
# MinIO is configured via environment variables in docker-compose
|
||||
# No additional setup needed here
|
||||
return True
|
||||
|
||||
|
||||
async def validate_infrastructure():
|
||||
@@ -382,21 +39,7 @@ async def validate_infrastructure():
|
||||
     """
     logger.info("Validating infrastructure...")

-    # Validate Docker connection
-    await validate_docker_connection()
-
-    # Validate registry connectivity for custom image building
-    await validate_registry_connectivity()
-
-    # Validate network (hardcoded to avoid directory name dependencies)
-    import os
-    compose_project = "fuzzforge"
-    docker_network = "fuzzforge_default"
-
-    try:
-        await validate_docker_network(docker_network)
-    except RuntimeError as e:
-        logger.warning(f"Network validation failed: {e}")
-        logger.warning("Workflows may not be able to connect to Prefect services")
+    # Setup storage (MinIO)
+    await setup_result_storage()

     logger.info("Infrastructure validation completed")
|
||||
|
||||
@@ -1,459 +0,0 @@
|
||||
"""
|
||||
Workflow Discovery - Registry-based discovery and loading of workflows
|
||||
"""
|
||||
|
||||
# Copyright (c) 2025 FuzzingLabs
|
||||
#
|
||||
# Licensed under the Business Source License 1.1 (BSL). See the LICENSE file
|
||||
# at the root of this repository for details.
|
||||
#
|
||||
# After the Change Date (four years from publication), this version of the
|
||||
# Licensed Work will be made available under the Apache License, Version 2.0.
|
||||
# See the LICENSE-APACHE file or http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Additional attribution and requirements are provided in the NOTICE file.
|
||||
|
||||
import logging
|
||||
import yaml
|
||||
from pathlib import Path
|
||||
from typing import Dict, Optional, Any, Callable
|
||||
from pydantic import BaseModel, Field, ConfigDict
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
-class WorkflowInfo(BaseModel):
-    """Information about a discovered workflow"""
-    name: str = Field(..., description="Workflow name")
-    path: Path = Field(..., description="Path to workflow directory")
-    workflow_file: Path = Field(..., description="Path to workflow.py file")
-    dockerfile: Path = Field(..., description="Path to Dockerfile")
-    has_docker: bool = Field(..., description="Whether workflow has custom Dockerfile")
-    metadata: Dict[str, Any] = Field(..., description="Workflow metadata from YAML")
-    flow_function_name: str = Field(default="main_flow", description="Name of the flow function")
-
-    model_config = ConfigDict(arbitrary_types_allowed=True)
|
||||
|
||||
|
||||
class WorkflowDiscovery:
|
||||
"""
|
||||
Discovers workflows from the filesystem and validates them against the registry.
|
||||
|
||||
This system:
|
||||
1. Scans for workflows with metadata.yaml files
|
||||
2. Cross-references them with the manual registry
|
||||
3. Provides registry-based flow functions for deployment
|
||||
|
||||
Workflows must have:
|
||||
- workflow.py: Contains the Prefect flow
|
||||
- metadata.yaml: Mandatory metadata file
|
||||
- Entry in toolbox/workflows/registry.py: Manual registration
|
||||
- Dockerfile (optional): Custom container definition
|
||||
- requirements.txt (optional): Python dependencies
|
||||
"""
|
||||
|
||||
def __init__(self, workflows_dir: Path):
|
||||
"""
|
||||
Initialize workflow discovery.
|
||||
|
||||
Args:
|
||||
workflows_dir: Path to the workflows directory
|
||||
"""
|
||||
self.workflows_dir = workflows_dir
|
||||
if not self.workflows_dir.exists():
|
||||
self.workflows_dir.mkdir(parents=True, exist_ok=True)
|
||||
logger.info(f"Created workflows directory: {self.workflows_dir}")
|
||||
|
||||
# Import registry - this validates it on import
|
||||
try:
|
||||
from toolbox.workflows.registry import WORKFLOW_REGISTRY, list_registered_workflows
|
||||
self.registry = WORKFLOW_REGISTRY
|
||||
logger.info(f"Loaded workflow registry with {len(self.registry)} registered workflows")
|
||||
except ImportError as e:
|
||||
logger.error(f"Failed to import workflow registry: {e}")
|
||||
self.registry = {}
|
||||
except Exception as e:
|
||||
logger.error(f"Registry validation failed: {e}")
|
||||
self.registry = {}
|
||||
|
||||
# Cache for discovered workflows
|
||||
self._workflow_cache: Optional[Dict[str, WorkflowInfo]] = None
|
||||
self._cache_timestamp: Optional[float] = None
|
||||
self._cache_ttl = 60.0 # Cache TTL in seconds
|
||||
|
||||
async def discover_workflows(self) -> Dict[str, WorkflowInfo]:
|
||||
"""
|
||||
Discover workflows by cross-referencing filesystem with registry.
|
||||
Uses caching to avoid frequent filesystem scans.
|
||||
|
||||
Returns:
|
||||
Dictionary mapping workflow names to their information
|
||||
"""
|
||||
# Check cache validity
|
||||
import time
|
||||
current_time = time.time()
|
||||
|
||||
if (self._workflow_cache is not None and
|
||||
self._cache_timestamp is not None and
|
||||
(current_time - self._cache_timestamp) < self._cache_ttl):
|
||||
# Return cached results
|
||||
logger.debug(f"Returning cached workflow discovery ({len(self._workflow_cache)} workflows)")
|
||||
return self._workflow_cache
|
||||
workflows = {}
|
||||
discovered_dirs = set()
|
||||
registry_names = set(self.registry.keys())
|
||||
|
||||
if not self.workflows_dir.exists():
|
||||
logger.warning(f"Workflows directory does not exist: {self.workflows_dir}")
|
||||
return workflows
|
||||
|
||||
# Recursively scan all directories and subdirectories
|
||||
await self._scan_directory_recursive(self.workflows_dir, workflows, discovered_dirs)
|
||||
|
||||
# Check for registry entries without corresponding directories
|
||||
missing_dirs = registry_names - discovered_dirs
|
||||
if missing_dirs:
|
||||
logger.warning(
|
||||
f"Registry contains workflows without filesystem directories: {missing_dirs}. "
|
||||
f"These workflows cannot be deployed."
|
||||
)
|
||||
|
||||
logger.info(
|
||||
f"Discovery complete: {len(workflows)} workflows ready for deployment, "
|
||||
f"{len(missing_dirs)} registry entries missing directories, "
|
||||
f"{len(discovered_dirs - registry_names)} filesystem workflows not registered"
|
||||
)
|
||||
|
||||
# Update cache
|
||||
self._workflow_cache = workflows
|
||||
self._cache_timestamp = current_time
|
||||
|
||||
return workflows
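The time-based cache used by `discover_workflows` follows a common pattern: keep the last result together with its timestamp and recompute only once the TTL has elapsed. This is a minimal standalone sketch; `TTLCache` is a hypothetical class, and the injectable `clock` parameter is an addition made here so the expiry can be tested without sleeping.

```python
import time
from typing import Callable, Dict, Optional


class TTLCache:
    """Hypothetical single-value cache that expires after `ttl` seconds."""

    def __init__(self, ttl: float = 60.0, clock: Callable[[], float] = time.time):
        self._ttl = ttl
        self._clock = clock
        self._value: Optional[Dict[str, str]] = None
        self._timestamp: Optional[float] = None

    def get(self, compute: Callable[[], Dict[str, str]]) -> Dict[str, str]:
        now = self._clock()
        fresh = (
            self._value is not None
            and self._timestamp is not None
            and (now - self._timestamp) < self._ttl
        )
        if fresh:
            # Still within the TTL: reuse the cached result.
            return self._value
        # Expired or never filled: recompute and remember when.
        self._value = compute()
        self._timestamp = now
        return self._value
```

Passing `time.monotonic` as the clock would make the expiry insensitive to wall-clock adjustments; the code above uses `time.time()`.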
|
||||
|
||||
async def _scan_directory_recursive(self, directory: Path, workflows: Dict[str, WorkflowInfo], discovered_dirs: set):
|
||||
"""
|
||||
Recursively scan directory for workflows.
|
||||
|
||||
Args:
|
||||
directory: Directory to scan
|
||||
workflows: Dictionary to populate with discovered workflows
|
||||
discovered_dirs: Set to track discovered workflow names
|
||||
"""
|
||||
for item in directory.iterdir():
|
||||
if not item.is_dir():
|
||||
continue
|
||||
|
||||
if item.name.startswith('_') or item.name.startswith('.'):
|
||||
continue # Skip hidden or private directories
|
||||
|
||||
# Check if this directory contains workflow files (workflow.py and metadata.yaml)
|
||||
workflow_file = item / "workflow.py"
|
||||
metadata_file = item / "metadata.yaml"
|
||||
|
||||
if workflow_file.exists() and metadata_file.exists():
|
||||
# This is a workflow directory
|
||||
workflow_name = item.name
|
||||
discovered_dirs.add(workflow_name)
|
||||
|
||||
# Only process workflows that are in the registry
|
||||
if workflow_name not in self.registry:
|
||||
logger.warning(
|
||||
f"Workflow '{workflow_name}' found in filesystem but not in registry. "
|
||||
f"Add it to toolbox/workflows/registry.py to enable deployment."
|
||||
)
|
||||
continue
|
||||
|
||||
try:
|
||||
workflow_info = await self._load_workflow(item)
|
||||
if workflow_info:
|
||||
workflows[workflow_info.name] = workflow_info
|
||||
logger.info(f"Discovered and registered workflow: {workflow_info.name}")
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to load workflow from {item}: {e}")
|
||||
else:
|
||||
# This is a category directory, recurse into it
|
||||
await self._scan_directory_recursive(item, workflows, discovered_dirs)
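The scanning rule above (a directory holding both `workflow.py` and `metadata.yaml` is a workflow; any other visible directory is treated as a category and descended into) can be tried out against a temporary tree. This sketch leaves out the registry cross-check and the loading step; `find_workflow_dirs`, `demo`, and the `static_analysis` category name are hypothetical and used only for illustration.

```python
import tempfile
from pathlib import Path
from typing import List


def find_workflow_dirs(root: Path) -> List[Path]:
    """Hypothetical helper: return directories that look like workflows."""
    found: List[Path] = []
    for item in sorted(root.iterdir()):
        # Skip files and hidden or private directories.
        if not item.is_dir() or item.name.startswith(("_", ".")):
            continue
        if (item / "workflow.py").exists() and (item / "metadata.yaml").exists():
            found.append(item)
        else:
            # Not a workflow itself, so treat it as a category and recurse.
            found.extend(find_workflow_dirs(item))
    return found


def demo() -> List[str]:
    """Build a small tree and return the names of the workflows found."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for parent in ("static_analysis", "_private"):
            wf = root / parent / "security_assessment"
            wf.mkdir(parents=True)
            (wf / "workflow.py").write_text("")
            (wf / "metadata.yaml").write_text("name: security_assessment\n")
        return [p.name for p in find_workflow_dirs(root)]
```

In `demo()`, the copy under `_private` is never reported because the leading underscore stops the descent before its contents are inspected.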
|
||||
|
||||
    async def _load_workflow(self, workflow_dir: Path) -> Optional[WorkflowInfo]:
        """
        Load and validate a single workflow.

        Args:
            workflow_dir: Path to the workflow directory

        Returns:
            WorkflowInfo if valid, None otherwise
        """
        workflow_name = workflow_dir.name

        # Check for mandatory files
        workflow_file = workflow_dir / "workflow.py"
        metadata_file = workflow_dir / "metadata.yaml"

        if not workflow_file.exists():
            logger.warning(f"Workflow {workflow_name} missing workflow.py")
            return None

        if not metadata_file.exists():
            logger.error(f"Workflow {workflow_name} missing mandatory metadata.yaml")
            return None

        # Load and validate metadata
        try:
            metadata = self._load_metadata(metadata_file)
            if not self._validate_metadata(metadata, workflow_name):
                return None
        except Exception as e:
            logger.error(f"Failed to load metadata for {workflow_name}: {e}")
            return None

        # Check for mandatory Dockerfile
        dockerfile = workflow_dir / "Dockerfile"
        if not dockerfile.exists():
            logger.error(f"Workflow {workflow_name} missing mandatory Dockerfile")
            return None

        has_docker = True  # Always True since Dockerfile is mandatory

        # Get flow function name from metadata or use default
        flow_function_name = metadata.get("flow_function", "main_flow")

        return WorkflowInfo(
            name=workflow_name,
            path=workflow_dir,
            workflow_file=workflow_file,
            dockerfile=dockerfile,
            has_docker=has_docker,
            metadata=metadata,
            flow_function_name=flow_function_name
        )

    def _load_metadata(self, metadata_file: Path) -> Dict[str, Any]:
        """
        Load metadata from YAML file.

        Args:
            metadata_file: Path to metadata.yaml

        Returns:
            Dictionary containing metadata

        Raises:
            ValueError: If the file is empty or its top level is not a mapping
        """
        with open(metadata_file, 'r') as f:
            metadata = yaml.safe_load(f)

        if metadata is None:
            raise ValueError("Empty metadata file")

        # yaml.safe_load can also return a list or a scalar; later code
        # (field lookups, metadata.get) assumes a dictionary.
        if not isinstance(metadata, dict):
            raise ValueError("Metadata must be a YAML mapping")

        return metadata

    def _validate_metadata(self, metadata: Dict[str, Any], workflow_name: str) -> bool:
        """
        Validate that metadata contains all required fields.

        Args:
            metadata: Metadata dictionary
            workflow_name: Name of the workflow for logging

        Returns:
            True if valid, False otherwise
        """
        required_fields = ["name", "version", "description", "author", "category", "parameters", "requirements"]

        missing_fields = [field for field in required_fields if field not in metadata]

        if missing_fields:
            logger.error(
                f"Workflow {workflow_name} metadata missing required fields: {missing_fields}"
            )
            return False

        # Validate version format (semantic versioning)
        version = metadata.get("version", "")
        if not self._is_valid_version(version):
            logger.error(f"Workflow {workflow_name} has invalid version format: {version}")
            return False

        # Validate parameters structure
        parameters = metadata.get("parameters", {})
        if not isinstance(parameters, dict):
            logger.error(f"Workflow {workflow_name} parameters must be a dictionary")
            return False

        return True

    def _is_valid_version(self, version: str) -> bool:
        """
        Check if version follows semantic versioning (x.y.z).

        Args:
            version: Version string

        Returns:
            True if valid semantic version
        """
        # A YAML value such as 1.0 is parsed as a float, not a string.
        if not isinstance(version, str):
            return False

        parts = version.split('.')
        if len(parts) != 3:
            return False

        # Require plain ASCII digits in each part. int() alone would also
        # accept values such as "-1", " 1" or "1_0", which the schema
        # pattern for "version" rejects.
        return all(part.isascii() and part.isdigit() for part in parts)

    def invalidate_cache(self) -> None:
        """
        Invalidate the workflow discovery cache.
        Useful when workflows are added or modified.
        """
        self._workflow_cache = None
        self._cache_timestamp = None
        logger.debug("Workflow discovery cache invalidated")

    def get_flow_function(self, workflow_name: str) -> Optional[Callable]:
        """
        Get the flow function from the registry.

        Args:
            workflow_name: Name of the workflow

        Returns:
            The flow function if found in registry, None otherwise
        """
        if workflow_name not in self.registry:
            logger.error(
                f"Workflow '{workflow_name}' not found in registry. "
                f"Available workflows: {list(self.registry.keys())}"
            )
            return None

        try:
            from toolbox.workflows.registry import get_workflow_flow
            flow_func = get_workflow_flow(workflow_name)
            logger.debug(f"Retrieved flow function for '{workflow_name}' from registry")
            return flow_func
        except Exception as e:
            logger.error(f"Failed to get flow function for '{workflow_name}': {e}")
            return None

    def get_registry_info(self, workflow_name: str) -> Optional[Dict[str, Any]]:
        """
        Get registry information for a workflow.

        Args:
            workflow_name: Name of the workflow

        Returns:
            Registry information if found, None otherwise
        """
        if workflow_name not in self.registry:
            return None

        try:
            from toolbox.workflows.registry import get_workflow_info
            return get_workflow_info(workflow_name)
        except Exception as e:
            logger.error(f"Failed to get registry info for '{workflow_name}': {e}")
            return None

    @staticmethod
    def get_metadata_schema() -> Dict[str, Any]:
        """
        Get the JSON schema for workflow metadata.

        Returns:
            JSON schema dictionary
        """
        return {
            "type": "object",
            "required": ["name", "version", "description", "author", "category", "parameters", "requirements"],
            "properties": {
                "name": {
                    "type": "string",
                    "description": "Workflow name"
                },
                "version": {
                    "type": "string",
                    "pattern": "^\\d+\\.\\d+\\.\\d+$",
                    "description": "Semantic version (x.y.z)"
                },
                "description": {
                    "type": "string",
                    "description": "Workflow description"
                },
                "author": {
                    "type": "string",
                    "description": "Workflow author"
                },
                "category": {
                    "type": "string",
                    "enum": ["comprehensive", "specialized", "fuzzing", "focused"],
                    "description": "Workflow category"
                },
                "tags": {
                    "type": "array",
                    "items": {"type": "string"},
                    "description": "Workflow tags for categorization"
                },
                "requirements": {
                    "type": "object",
                    "required": ["tools", "resources"],
                    "properties": {
                        "tools": {
                            "type": "array",
                            "items": {"type": "string"},
                            "description": "Required security tools"
                        },
                        "resources": {
                            "type": "object",
                            "required": ["memory", "cpu", "timeout"],
                            "properties": {
                                "memory": {
                                    "type": "string",
                                    "pattern": "^\\d+[GMK]i$",
                                    "description": "Memory limit (e.g., 1Gi, 512Mi)"
                                },
                                "cpu": {
                                    "type": "string",
                                    "pattern": "^\\d+m?$",
                                    "description": "CPU limit (e.g., 1000m, 2)"
                                },
                                "timeout": {
                                    "type": "integer",
                                    "minimum": 60,
                                    "maximum": 7200,
                                    "description": "Workflow timeout in seconds"
                                }
                            }
                        }
                    }
                },
                "parameters": {
                    "type": "object",
                    "description": "Workflow parameters schema"
                },
                "default_parameters": {
                    "type": "object",
                    "description": "Default parameter values"
                },
                "required_modules": {
                    "type": "array",
                    "items": {"type": "string"},
                    "description": "Required module names"
                },
                "supported_volume_modes": {
                    "type": "array",
                    "items": {"enum": ["ro", "rw"]},
                    "default": ["ro", "rw"],
                    "description": "Supported volume mount modes"
                },
                "flow_function": {
                    "type": "string",
                    "default": "main_flow",
                    "description": "Name of the flow function in workflow.py"
                }
            }
        }
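
The schema above is easier to read next to a concrete metadata document. The following is a minimal sketch, not part of the repository: `check_metadata` and the `sample` values are hypothetical, the constants are copied by hand from `get_metadata_schema()`, and only a subset of the schema is covered (required top-level fields, `version`, `category`, `parameters`, and `requirements.resources`). It uses only the standard library, so it does not reflect how the project itself applies the schema.

```python
import re
from typing import Any, Dict, List

# Values copied by hand from get_metadata_schema(); keep them in sync manually.
REQUIRED_FIELDS = ["name", "version", "description", "author",
                   "category", "parameters", "requirements"]
CATEGORIES = ["comprehensive", "specialized", "fuzzing", "focused"]
VERSION_RE = re.compile(r"\d+\.\d+\.\d+")
MEMORY_RE = re.compile(r"\d+[GMK]i")
CPU_RE = re.compile(r"\d+m?")


def check_metadata(metadata: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: return a list of problems, empty when none are found."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in metadata]
    if problems:
        return problems

    version = metadata["version"]
    if not (isinstance(version, str) and VERSION_RE.fullmatch(version)):
        problems.append(f"invalid version: {version!r}")
    if metadata["category"] not in CATEGORIES:
        problems.append(f"unknown category: {metadata['category']!r}")
    if not isinstance(metadata["parameters"], dict):
        problems.append("parameters must be a mapping")

    requirements = metadata["requirements"]
    resources = requirements.get("resources") if isinstance(requirements, dict) else None
    if not isinstance(resources, dict):
        problems.append("requirements.resources must be a mapping")
        return problems

    memory, cpu, timeout = (resources.get(key) for key in ("memory", "cpu", "timeout"))
    if not (isinstance(memory, str) and MEMORY_RE.fullmatch(memory)):
        problems.append(f"invalid memory: {memory!r}")
    if not (isinstance(cpu, str) and CPU_RE.fullmatch(cpu)):
        problems.append(f"invalid cpu: {cpu!r}")
    if not (isinstance(timeout, int) and 60 <= timeout <= 7200):
        problems.append(f"invalid timeout: {timeout!r}")
    return problems


# Illustrative metadata; every value here is made up for the example.
sample = {
    "name": "example_workflow",
    "version": "1.0.0",
    "description": "Example workflow",
    "author": "Example Author",
    "category": "fuzzing",
    "parameters": {},
    "requirements": {
        "tools": [],
        "resources": {"memory": "512Mi", "cpu": "1000m", "timeout": 600},
    },
}

print(check_metadata(sample))
print(check_metadata(dict(sample, version="1.0")))
```

Returning a list of problems rather than a boolean is a design choice of this sketch: it lets a caller report every issue in one pass, whereas `_validate_metadata` stops at the first failing check.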