mirror of
https://github.com/FuzzingLabs/fuzzforge_ai.git
synced 2026-05-12 19:32:21 +02:00
Initial commit
@@ -0,0 +1,8 @@
{
  "label": "Concept",
  "position": 2,
  "link": {
    "type": "generated-index",
    "description": "Concept pages that are understanding-oriented."
  }
}
@@ -0,0 +1,214 @@
# Architecture

FuzzForge is a distributed, containerized platform for security analysis workflows. Its architecture is designed for scalability, isolation, and reliability, drawing on modern patterns such as microservices and orchestration. This page explains the core architectural concepts behind FuzzForge: what the main components are, how they interact, and why the system is structured this way.

:::warning

FuzzForge’s architecture is evolving. While the long-term goal is a hexagonal architecture, the current implementation is still in transition. Expect changes as the platform matures.

:::

---

## Why This Architecture?

FuzzForge’s architecture is shaped by several key goals:

- **Scalability:** Handle many workflows in parallel, scaling up or down as needed.
- **Isolation:** Run each workflow in its own secure environment, minimizing risk.
- **Reliability:** Ensure that failures in one part of the system don’t bring down the whole platform.
- **Extensibility:** Make it easy to add new workflows, tools, or integrations.

## High-Level System Overview

At a glance, FuzzForge is organized into several layers, each with a clear responsibility:

- **Client Layer:** Where users and external systems interact (CLI, API clients, MCP server).
- **API Layer:** The FastAPI backend, which exposes REST endpoints and manages requests.
- **Orchestration Layer:** Prefect server and workers, which schedule and execute workflows.
- **Execution Layer:** Docker Engine and containers, where workflows actually run.
- **Storage Layer:** PostgreSQL database, Docker volumes, and a result cache for persistence.

Here’s a simplified view of how these layers fit together:
```mermaid
graph TB
    subgraph "Client Layer"
        CLI[CLI Client]
        API_Client[API Client]
        MCP[MCP Server]
    end

    subgraph "API Layer"
        FastAPI[FastAPI Backend]
        Router[Route Handlers]
        Middleware[Middleware Stack]
    end

    subgraph "Orchestration Layer"
        Prefect[Prefect Server]
        Workers[Prefect Workers]
        Scheduler[Workflow Scheduler]
    end

    subgraph "Execution Layer"
        Docker[Docker Engine]
        Containers[Workflow Containers]
        Registry[Docker Registry]
    end

    subgraph "Storage Layer"
        PostgreSQL[PostgreSQL Database]
        Volumes[Docker Volumes]
        Cache[Result Cache]
    end

    CLI --> FastAPI
    API_Client --> FastAPI
    MCP --> FastAPI

    FastAPI --> Router
    Router --> Middleware
    Middleware --> Prefect

    Prefect --> Workers
    Workers --> Scheduler
    Scheduler --> Docker

    Docker --> Containers
    Docker --> Registry
    Containers --> Volumes

    FastAPI --> PostgreSQL
    Workers --> PostgreSQL
    Containers --> Cache
```
## What Are the Main Components?

### API Layer

- **FastAPI Backend:** The main entry point for users and clients. Handles authentication and request validation, and exposes endpoints for workflow management, results, and health checks.
- **Middleware Stack:** Manages API keys, user authentication, CORS, logging, and error handling.

### Orchestration Layer

- **Prefect Server:** Schedules and tracks workflows, backed by PostgreSQL.
- **Prefect Workers:** Execute workflows in Docker containers and can be scaled horizontally.
- **Workflow Scheduler:** Balances load, manages priorities, and enforces resource limits.

### Execution Layer

- **Docker Engine:** Runs workflow containers, enforcing isolation and resource limits.
- **Workflow Containers:** Custom images with security tools, mounting code and results volumes.
- **Docker Registry:** Stores and distributes workflow images.

### Storage Layer

- **PostgreSQL Database:** Stores workflow metadata, state, and results.
- **Docker Volumes:** Persist workflow results and artifacts.
- **Result Cache:** Speeds up access to recent results, with in-memory and disk persistence.

## How Does Data Flow Through the System?

### Submitting a Workflow

1. **User submits a workflow** via CLI or API client.
2. **API validates** the request and creates a deployment in Prefect.
3. **Prefect schedules** the workflow and assigns it to a worker.
4. **Worker launches a container** to run the workflow.
5. **Results are stored** in Docker volumes and the database.
6. **Status updates** flow back through Prefect and the API to the user.
```mermaid
sequenceDiagram
    participant User
    participant API
    participant Prefect
    participant Worker
    participant Container
    participant Storage

    User->>API: Submit workflow
    API->>API: Validate parameters
    API->>Prefect: Create deployment
    Prefect->>Worker: Schedule execution
    Worker->>Container: Create and start
    Container->>Container: Execute security tools
    Container->>Storage: Store SARIF results
    Worker->>Prefect: Update status
    Prefect->>API: Workflow complete
    API->>User: Return results
```
### Retrieving Results

1. **User requests status or results** via the API.
2. **API queries the database** for workflow metadata.
3. **If complete,** results are fetched from storage and returned to the user.
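The retrieval steps above can be sketched as a small polling client. Note that the endpoint paths (`/runs/{id}/status`, `/runs/{id}/results`), the response fields, and the status names are illustrative assumptions for this sketch, not the documented API surface:

```python
"""Sketch of polling the FuzzForge API for workflow results (illustrative endpoints)."""
import json
import time
import urllib.request

API_BASE = "http://localhost:8000"  # assumed backend address

TERMINAL_STATES = {"COMPLETED", "FAILED", "CANCELLED"}


def is_terminal(status: str) -> bool:
    """Return True once a workflow has finished, successfully or not."""
    return status.upper() in TERMINAL_STATES


def poll_results(run_id: str, interval: float = 2.0, timeout: float = 300.0) -> dict:
    """Poll workflow status until terminal, then fetch the stored results."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        with urllib.request.urlopen(f"{API_BASE}/runs/{run_id}/status") as resp:
            status = json.load(resp)["status"]
        if is_terminal(status):
            with urllib.request.urlopen(f"{API_BASE}/runs/{run_id}/results") as resp:
                return json.load(resp)
        time.sleep(interval)
    raise TimeoutError(f"run {run_id} did not finish within {timeout}s")
```

In practice the CLI wraps this loop for you; the sketch only shows the status-then-results shape of step 2 and 3.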
## How Do Services Communicate?

- **Internally:** FastAPI talks to Prefect via REST; Prefect coordinates with workers over HTTP; workers manage containers via the Docker Engine API. All core services use pooled connections to PostgreSQL.
- **Externally:** Users interact via CLI or API clients (HTTP REST). The MCP server can automate workflows via its own protocol.

## How Is Security Enforced?

- **Container Isolation:** Each workflow runs in its own Docker network, as a non-root user, with strict resource limits and only the necessary volumes mounted.
- **Volume Security:** Source code is mounted read-only; results are written to dedicated, temporary volumes.
- **API Security:** All endpoints require API keys, validate inputs, enforce rate limits, and log requests for auditing.

## How Does FuzzForge Scale?

- **Horizontally:** Add more Prefect workers to handle more workflows in parallel. Scale the database with read replicas and connection pooling.
- **Vertically:** Adjust CPU and memory limits for containers and services as needed.

Example Docker Compose scaling:
```yaml
services:
  prefect-worker:
    deploy:
      resources:
        limits:
          memory: 4G
          cpus: '2.0'
        reservations:
          memory: 1G
          cpus: '0.5'
```
## How Is It Deployed?

- **Development:** All services run via Docker Compose: backend, Prefect, workers, database, and registry.
- **Production:** Add load balancers, database clustering, and multiple worker instances for high availability. Health checks, metrics, and centralized logging support monitoring and troubleshooting.

## How Is Configuration Managed?

- **Environment Variables:** Control core settings like database URLs, registry location, and Prefect API endpoints.
- **Service Discovery:** Docker Compose’s internal DNS lets services find each other by name, with consistent port mapping and health check endpoints.

Example configuration:

```bash
COMPOSE_PROJECT_NAME=fuzzforge_alpha
DATABASE_URL=postgresql://postgres:postgres@postgres:5432/fuzzforge
PREFECT_API_URL=http://prefect-server:4200/api
DOCKER_REGISTRY=localhost:5001
DOCKER_INSECURE_REGISTRY=true
```
## How Are Failures Handled?

- **Failure Isolation:** Each service is independent, so failures don’t cascade. Circuit breakers and graceful degradation keep the system stable.
- **Recovery:** Automatic retries with backoff for transient errors, dead letter queues for persistent failures, and workflow state recovery after restarts.
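The retry-with-backoff behaviour can be sketched as a generic helper. The attempt count and delay values here are illustrative, not FuzzForge’s actual settings:

```python
import random
import time


def retry_with_backoff(operation, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Retry a zero-argument callable that may fail transiently.

    After max_attempts failures the last exception propagates; at that
    point a real system might hand the work to a dead letter queue.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            # Exponential backoff with a little jitter: 0.5s, 1s, 2s, ...
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            sleep(delay)
```

The injectable `sleep` parameter keeps the helper testable without real waiting.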
## Implementation Details

- **Tech Stack:** FastAPI (Python async), Prefect 3.x, Docker, Docker Compose, PostgreSQL (asyncpg), and Docker networking.
- **Performance:** Workflows start in 2–5 seconds; results are retrieved quickly thanks to caching and database indexing.
- **Extensibility:** Add new workflows by deploying new Docker images; extend the API with new endpoints; configure storage backends as needed.

---

## In Summary

FuzzForge’s architecture is designed to be robust, scalable, and secure, ready to handle demanding security analysis workflows in a modern, distributed environment. As the platform evolves, expect even more modularity and flexibility, making it easier to adapt to new requirements and technologies.
@@ -0,0 +1,20 @@
# {Concept Title}

{Brief introduction of the concept, including its origin and general purpose.}

## Purpose

- {The primary purpose and its relevance in its field.}

## Common Usage

- {Usage 1}: {Brief description.}
- {Usage 2}: {Brief description.}

## Benefits

- {Key benefit and why it's preferred in certain scenarios.}

## Conclusion

{Summary of its importance and role in its respective field.}
@@ -0,0 +1,217 @@
# Docker Containers in FuzzForge: Concept and Design

Docker containers are at the heart of FuzzForge’s execution model. They provide the isolation, consistency, and flexibility needed to run security workflows reliably, no matter where FuzzForge is deployed. This page explains the core concepts behind container usage in FuzzForge, why containers are used, and how they shape the platform’s behavior.

---

## Why Use Docker Containers?

FuzzForge relies on Docker containers for several key reasons:

- **Isolation:** Each workflow runs in its own container, so tools and processes can’t interfere with each other or the host.
- **Consistency:** The environment inside a container is always the same, regardless of the underlying system.
- **Security:** Containers restrict access to host resources and run as non-root users.
- **Reproducibility:** Results are deterministic, since the environment is controlled and versioned.
- **Scalability:** Containers can be started, stopped, and scaled up or down as needed.

---

## How Does FuzzForge Use Containers?

### The Container Model

Every workflow in FuzzForge is executed inside a Docker container. Here’s what that means in practice:

- **Workflow containers** are built from language-specific base images (like Python or Node.js), with security tools and workflow code pre-installed.
- **Infrastructure containers** (API server, Prefect, database) use official images and are configured for the platform’s needs.

### Container Lifecycle: From Build to Cleanup

The lifecycle of a workflow container looks like this:

1. **Image Build:** A Docker image is built with all required tools and code.
2. **Image Push/Pull:** The image is pushed to (and later pulled from) a local or remote registry.
3. **Container Creation:** The container is created with the right volumes and environment.
4. **Execution:** The workflow runs inside the container.
5. **Result Storage:** Results are written to mounted volumes.
6. **Cleanup:** The container and any temporary data are removed.
```mermaid
graph TB
    Build[Build Image] --> Push[Push to Registry]
    Push --> Pull[Pull Image]
    Pull --> Create[Create Container]
    Create --> Mount[Mount Volumes]
    Mount --> Start[Start Container]
    Start --> Execute[Run Workflow]
    Execute --> Results[Store Results]
    Execute --> Stop[Stop Container]
    Stop --> Cleanup[Cleanup Data]
    Cleanup --> Remove[Remove Container]
```
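The create/mount/execute/cleanup stages of this lifecycle can be sketched as a single `docker run` invocation. The image name, host paths, and limits are illustrative assumptions; in FuzzForge proper the Prefect workers drive this, not a standalone script:

```python
"""Sketch of one workflow-container execution via the Docker CLI (illustrative values)."""
import subprocess


def run_workflow_command(image, code_dir, results_volume):
    """Build the `docker run` argument list for one workflow execution."""
    return [
        "docker", "run",
        "--rm",                                  # cleanup: remove container on exit
        "--user", "1000",                        # non-root execution
        "--memory", "4g", "--cpus", "2.0",       # resource limits
        "-v", f"{code_dir}:/app/target:ro",      # target code, mounted read-only
        "-v", f"{results_volume}:/app/results",  # persistent result volume
        image,
    ]


if __name__ == "__main__":
    cmd = run_workflow_command(
        "localhost:5001/fuzzforge-static-analysis:latest",
        "/host/path/to/code",
        "fuzzforge_results",
    )
    subprocess.run(cmd, check=True)  # covers creation, execution, storage, cleanup
```

The `--rm` flag collapses the stop/cleanup/remove stages into Docker’s own exit handling.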
---

## What’s Inside a Workflow Container?

A typical workflow container is structured like this:

- **Base Image:** Usually a slim language image (e.g., `python:3.11-slim`).
- **System Dependencies:** Installed as needed (e.g., `git`, `curl`).
- **Security Tools:** Pre-installed (e.g., `semgrep`, `bandit`, `safety`).
- **Workflow Code:** Copied into the container.
- **Non-root User:** Created for execution.
- **Entrypoint:** Runs the workflow code.

Example Dockerfile snippet:
```dockerfile
FROM python:3.11-slim
RUN apt-get update && apt-get install -y git curl && rm -rf /var/lib/apt/lists/*
RUN pip install semgrep bandit safety
COPY ./toolbox /app/toolbox
WORKDIR /app
RUN useradd -m -u 1000 fuzzforge
USER fuzzforge
CMD ["python", "-m", "toolbox.main"]
```
---

## How Are Containers Networked and Connected?

- **Docker Compose Network:** All containers are attached to a custom bridge network for internal communication.
- **Internal DNS:** Services communicate using Docker Compose service names.
- **Port Exposure:** Only necessary ports are exposed to the host.
- **Network Isolation:** Workflow containers are isolated from infrastructure containers where possible.

Example network config:
```yaml
networks:
  fuzzforge:
    driver: bridge
    ipam:
      config:
        - subnet: 172.20.0.0/16
```
---

## How Is Data Managed with Volumes?

### Volume Types

- **Target Code Volume:** Mounts the code to be analyzed, read-only, into the container.
- **Result Volume:** Stores workflow results and artifacts; persists after the container exits.
- **Temporary Volumes:** Used for scratch space; destroyed with the container.

Example volume mount:
```yaml
volumes:
  - "/host/path/to/code:/app/target:ro"
  - "fuzzforge_alpha_prefect_storage:/app/prefect"
```
### Volume Security

- **Read-only Mounts:** Prevent workflows from modifying source code.
- **Isolated Results:** Each workflow writes to its own result directory.
- **No Arbitrary Host Access:** Only explicitly mounted paths are accessible.

---

## How Are Images Built and Managed?

- **Automated Builds:** Images are built and pushed to a local registry for development, or to a secure registry for production.
- **Build Optimization:** Use layer caching, multi-stage builds, and minimal base images.
- **Versioning:** Use tags (`latest`, semantic versions, or SHA digests) to track images.

Example build and push:
```bash
docker build -t localhost:5001/fuzzforge-static-analysis:latest .
docker push localhost:5001/fuzzforge-static-analysis:latest
```
---

## How Are Resources Controlled?

- **Memory and CPU Limits:** Set per container to prevent resource exhaustion.
- **Resource Monitoring:** Use `docker stats` and platform APIs to track usage.
- **Alerts:** Detect and handle out-of-memory or CPU throttling events.

Example resource config:
```yaml
services:
  prefect-worker:
    deploy:
      resources:
        limits:
          memory: 4G
          cpus: '2.0'
        reservations:
          memory: 1G
          cpus: '0.5'
```
---

## How Is Security Enforced?

- **Non-root Execution:** Containers run as a dedicated, non-root user.
- **Capability Restrictions:** Drop unnecessary Linux capabilities.
- **Filesystem Protection:** Use read-only filesystems and tmpfs for temporary data.
- **Network Isolation:** Restrict network access to only what’s needed.
- **No Privileged Mode:** Containers never run with elevated privileges.

Example security options:
```yaml
services:
  prefect-worker:
    security_opt:
      - no-new-privileges:true
    cap_drop:
      - ALL
    cap_add:
      - CHOWN
      - SETGID
      - SETUID
```
---

## How Is Performance Optimized?

- **Image Layering:** Structure Dockerfiles for efficient caching.
- **Dependency Preinstallation:** Reduce startup time by pre-installing dependencies.
- **Warm Containers:** Optionally pre-create containers for faster workflow startup.
- **Horizontal Scaling:** Scale worker containers to handle more workflows in parallel.

---

## How Are Containers Monitored and Debugged?

- **Health Checks:** Each service and workflow container has a health endpoint or check.
- **Logging:** All container logs are collected and can be accessed via `docker logs` or the FuzzForge API.
- **Debug Access:** Use `docker exec` to access running containers for troubleshooting.
- **Resource Stats:** Monitor with `docker stats` or platform dashboards.
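The resource-stats monitoring above can be sketched as a simple out-of-memory alert check. The stats field names (loosely mirroring what `docker stats` reports) and the 90% threshold are assumptions for illustration:

```python
"""Minimal sketch of a memory-pressure alert over per-container stats (illustrative fields)."""


def memory_alerts(stats, threshold=0.9):
    """Return the names of containers using more than `threshold` of their memory limit."""
    alerts = []
    for s in stats:
        usage, limit = s["mem_usage_bytes"], s["mem_limit_bytes"]
        if limit and usage / limit > threshold:
            alerts.append(s["name"])
    return alerts
```

A real deployment would feed this from the Docker API or a metrics pipeline rather than hand-built dictionaries.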
---

## How Does This All Fit Into FuzzForge?

- **Prefect Workers:** Manage the full lifecycle of workflow containers.
- **API Integration:** Exposes container status, logs, and resource metrics.
- **Volume Management:** Ensures results and artifacts are collected and persisted.
- **Security and Resource Controls:** Enforced automatically for every workflow.

---

## In Summary

Docker containers are the foundation of FuzzForge’s execution model. They provide the isolation, security, and reproducibility needed for robust security analysis workflows, while making it easy to scale, monitor, and extend the platform.
@@ -0,0 +1,83 @@
# FuzzForge AI: Conceptual Overview

Welcome to FuzzForge AI, a multi-agent orchestration platform designed to supercharge your intelligent automation, security workflows, and project knowledge management. This document provides a high-level conceptual introduction to what FuzzForge AI is, what problems it solves, and how its architecture enables powerful, context-aware agent collaboration.

---

## What is FuzzForge AI?

FuzzForge AI is a multi-agent orchestration system that implements the A2A (Agent-to-Agent) protocol for intelligent agent routing, persistent memory management, and project-scoped knowledge graphs. Think of it as an intelligent hub that coordinates a team of specialized agents, each with its own skills, while maintaining context and knowledge across sessions and projects.

**Key Goals:**
- Seamlessly route requests to the right agent for the job
- Preserve and leverage project-specific knowledge
- Enable secure, auditable, and extensible automation workflows
- Make multi-agent collaboration as easy as talking to a single assistant

---

## Core Concepts

### 1. **Agent Orchestration**
FuzzForge AI acts as a conductor, automatically routing your requests to the most capable registered agent. Agents can be local or remote, and each advertises its skills and capabilities via the A2A protocol.

### 2. **Memory & Knowledge Management**
The system features a three-layer memory architecture:
- **Session Persistence:** Keeps track of ongoing sessions and conversations.
- **Semantic Memory:** Archives conversations and enables semantic search.
- **Knowledge Graphs:** Maintains structured, project-scoped knowledge for deep context.

### 3. **Artifact System**
Artifacts are files or structured content generated, processed, or shared by agents. The artifact system supports creation, storage, and secure sharing of code, configs, reports, and more, enabling reproducible, auditable workflows.

### 4. **A2A Protocol Compliance**
FuzzForge AI fully implements the A2A (Agent-to-Agent) protocol (spec 0.3.0), ensuring standardized, interoperable communication between agents, whether they're running locally or across the network.

---

## High-Level Architecture

Here's how the main components fit together:
```
FuzzForge AI System
├── CLI Interface (cli.py)
│   ├── Commands & Session Management
│   └── Agent Registry Persistence
├── Agent Core (agent.py)
│   ├── Main Coordinator
│   └── Memory Manager Integration
├── Agent Executor (agent_executor.py)
│   ├── Tool Management & Orchestration
│   ├── ROUTE_TO Pattern Implementation
│   └── Artifact Creation & Management
├── Memory Architecture (Three Layers)
│   ├── Session Persistence
│   ├── Semantic Memory
│   └── Knowledge Graphs
├── A2A Communication Layer
│   ├── Remote Agent Connection
│   ├── Agent Card Management
│   └── Protocol Compliance
└── A2A Server (a2a_server.py)
    ├── HTTP/SSE Server
    ├── Artifact HTTP Serving
    └── Task Store & Queue Management
```
**How it works:**
1. **User Input:** You interact via CLI or API, using natural language or commands.
2. **Agent Routing:** The system decides whether to handle the request itself or route it to a specialist agent.
3. **Tool Execution:** Built-in and agent-provided tools perform operations.
4. **Memory Integration:** Results and context are stored for future use.
5. **Response Generation:** The system returns results, often with artifacts or actionable insights.

---

## Why FuzzForge AI?

- **Extensible:** Easily add new agents, tools, and workflows.
- **Context-Aware:** Remembers project history, conversations, and knowledge.
- **Secure:** Project isolation, input validation, and artifact management.
- **Collaborative:** Enables multi-agent workflows and knowledge sharing.
- **Fun & Productive:** Designed to make automation and security tasks less tedious and more interactive.
@@ -0,0 +1,618 @@
# SARIF Format

FuzzForge uses the Static Analysis Results Interchange Format (SARIF) as the standardized output format for all security analysis results. SARIF provides a consistent, machine-readable format that enables tool interoperability and comprehensive result analysis.

## What is SARIF?

### Overview

SARIF (Static Analysis Results Interchange Format) is an OASIS-approved standard (SARIF 2.1.0) designed to standardize the output of static analysis tools. FuzzForge extends this standard to cover dynamic analysis, secret detection, infrastructure analysis, and fuzzing results.

### Key Benefits

- **Standardization**: Consistent format across all security tools and workflows
- **Interoperability**: Integration with existing security tools and platforms
- **Rich Metadata**: Comprehensive information about findings, tools, and analysis runs
- **Tool Agnostic**: Works with any security tool that produces structured results
- **IDE Integration**: Native support in modern development environments

### SARIF Structure
```json
{
  "version": "2.1.0",
  "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
  "runs": [
    {
      "tool": { /* Tool information */ },
      "invocations": [ /* How the tool was run */ ],
      "artifacts": [ /* Files analyzed */ ],
      "results": [ /* Security findings */ ]
    }
  ]
}
```
## FuzzForge SARIF Implementation

### Run Structure

Each FuzzForge workflow produces a SARIF "run" containing:
```json
{
  "tool": {
    "driver": {
      "name": "FuzzForge",
      "version": "1.0.0",
      "informationUri": "https://github.com/FuzzingLabs/fuzzforge",
      "organization": "FuzzingLabs",
      "rules": [ /* Security rules applied */ ]
    },
    "extensions": [
      {
        "name": "semgrep",
        "version": "1.45.0",
        "rules": [ /* Semgrep-specific rules */ ]
      }
    ]
  },
  "invocations": [
    {
      "executionSuccessful": true,
      "startTimeUtc": "2025-09-25T12:00:00.000Z",
      "endTimeUtc": "2025-09-25T12:05:30.000Z",
      "workingDirectory": {
        "uri": "file:///app/target/"
      },
      "commandLine": "python -m toolbox.workflows.static_analysis",
      "environmentVariables": {
        "WORKFLOW_TYPE": "static_analysis_scan"
      }
    }
  ]
}
```
### Result Structure

Each security finding is represented as a SARIF result:
```json
{
  "ruleId": "semgrep.security.audit.sqli.pg-sqli",
  "ruleIndex": 42,
  "level": "error",
  "message": {
    "text": "Potential SQL injection vulnerability detected"
  },
  "locations": [
    {
      "physicalLocation": {
        "artifactLocation": {
          "uri": "src/database/queries.py",
          "uriBaseId": "SRCROOT"
        },
        "region": {
          "startLine": 156,
          "startColumn": 20,
          "endLine": 156,
          "endColumn": 45,
          "snippet": {
            "text": "cursor.execute(query)"
          }
        }
      }
    }
  ],
  "properties": {
    "tool": "semgrep",
    "confidence": "high",
    "severity": "high",
    "cwe": ["CWE-89"],
    "owasp": ["A03:2021"],
    "references": [
      "https://owasp.org/Top10/A03_2021-Injection/"
    ]
  }
}
```
## Finding Categories and Severity

### Severity Levels

FuzzForge maps tool-specific severity levels to SARIF standard levels:

#### SARIF Level Mapping
- **error**: Critical and High severity findings
- **warning**: Medium severity findings
- **note**: Low severity findings
- **info**: Informational findings
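The level mapping above can be sketched as a small helper. The fallback to `note` for unrecognized severities is an assumption of this sketch, not documented FuzzForge behaviour:

```python
def to_sarif_level(severity: str) -> str:
    """Map a FuzzForge severity string onto the SARIF `level` field."""
    mapping = {
        "critical": "error",
        "high": "error",
        "medium": "warning",
        "low": "note",
        "informational": "info",
    }
    # Unknown severities conservatively downgrade to "note" (assumed default).
    return mapping.get(severity.lower(), "note")
```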
#### Extended Severity Properties
```json
{
  "properties": {
    "severity": "high",          // FuzzForge severity
    "confidence": "medium",      // Tool confidence
    "exploitability": "high",    // Likelihood of exploitation
    "impact": "data_breach"      // Potential impact
  }
}
```
### Vulnerability Classification

#### CWE (Common Weakness Enumeration)
```json
{
  "properties": {
    "cwe": ["CWE-89", "CWE-79"],
    "cwe_category": "Injection"
  }
}
```

#### OWASP Top 10 Mapping
```json
{
  "properties": {
    "owasp": ["A03:2021", "A06:2021"],
    "owasp_category": "Injection"
  }
}
```

#### Tool-Specific Classifications
```json
{
  "properties": {
    "tool_category": "security",
    "rule_type": "semantic_grep",
    "finding_type": "sql_injection"
  }
}
```
## Multi-Tool Result Aggregation

### Tool Extension Model

FuzzForge aggregates results from multiple tools using SARIF's extension model:
```json
{
  "tool": {
    "driver": {
      "name": "FuzzForge",
      "version": "1.0.0"
    },
    "extensions": [
      {
        "name": "semgrep",
        "version": "1.45.0",
        "guid": "semgrep-extension-guid"
      },
      {
        "name": "bandit",
        "version": "1.7.5",
        "guid": "bandit-extension-guid"
      }
    ]
  }
}
```
### Result Correlation

#### Cross-Tool Finding Correlation
```json
{
  "ruleId": "fuzzforge.correlation.sql-injection",
  "level": "error",
  "message": {
    "text": "SQL injection vulnerability confirmed by multiple tools"
  },
  "locations": [ /* Primary location */ ],
  "relatedLocations": [ /* Additional contexts */ ],
  "properties": {
    "correlation_id": "corr-001",
    "confirming_tools": ["semgrep", "bandit"],
    "confidence_score": 0.95,
    "aggregated_severity": "critical"
  }
}
```

#### Finding Relationships
```json
{
  "ruleId": "semgrep.security.audit.xss.direct-use-of-jinja2",
  "properties": {
    "related_findings": [
      {
        "correlation_type": "same_vulnerability_class",
        "related_rule": "bandit.B703",
        "relationship": "confirms"
      },
      {
        "correlation_type": "attack_chain",
        "related_rule": "nuclei.xss.reflected",
        "relationship": "exploits"
      }
    ]
  }
}
```
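A cross-tool correlation like the example above could be computed by grouping results that point at the same file and line. Both the grouping key and the confidence heuristic (each extra confirming tool adds 0.2) are illustrative assumptions of this sketch, not FuzzForge's actual algorithm:

```python
from collections import defaultdict


def correlate_by_location(results):
    """Group SARIF results that share a file and start line; flag multi-tool hits."""
    groups = defaultdict(list)
    for result in results:
        loc = result["locations"][0]["physicalLocation"]
        key = (loc["artifactLocation"]["uri"], loc["region"]["startLine"])
        groups[key].append(result)

    correlations = []
    for (uri, line), found in groups.items():
        tools = sorted({r["properties"]["tool"] for r in found})
        if len(tools) > 1:
            correlations.append({
                "uri": uri,
                "startLine": line,
                "confirming_tools": tools,
                # Assumed heuristic: base 0.6, +0.2 per additional confirming tool.
                "confidence_score": min(1.0, 0.6 + 0.2 * (len(tools) - 1)),
            })
    return correlations
```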
## Workflow-Specific Extensions

### Static Analysis Results
```json
{
  "properties": {
    "analysis_type": "static",
    "language": "python",
    "complexity_score": 3.2,
    "coverage": {
      "lines_analyzed": 15420,
      "functions_analyzed": 892,
      "classes_analyzed": 156
    }
  }
}
```

### Dynamic Analysis Results
```json
{
  "properties": {
    "analysis_type": "dynamic",
    "test_method": "web_application_scan",
    "target_url": "https://example.com",
    "http_method": "POST",
    "request_payload": "user_input=<script>alert(1)</script>",
    "response_code": 200,
    "exploitation_proof": "alert_box_displayed"
  }
}
```

### Secret Detection Results
```json
{
  "properties": {
    "analysis_type": "secret_detection",
    "secret_type": "api_key",
    "entropy_score": 4.2,
    "commit_hash": "abc123def456",
    "commit_date": "2025-09-20T10:30:00Z",
    "author": "developer@example.com",
    "exposure_duration": "30_days"
  }
}
```

### Infrastructure Analysis Results
```json
{
  "properties": {
    "analysis_type": "infrastructure",
    "resource_type": "docker_container",
    "policy_violation": "privileged_container",
    "compliance_framework": ["CIS", "NIST"],
    "remediation_effort": "low",
    "deployment_risk": "high"
  }
}
```

### Fuzzing Results
```json
{
  "properties": {
    "analysis_type": "fuzzing",
    "fuzzer": "afl++",
    "crash_type": "segmentation_fault",
    "crash_address": "0x7fff8b2a1000",
    "exploitability": "likely_exploitable",
    "test_case": "base64:SGVsbG8gV29ybGQ=",
    "coverage_achieved": "85%"
  }
}
```
## SARIF Processing and Analysis

### Result Filtering

#### Severity-Based Filtering
```python
def filter_by_severity(sarif_results, min_severity="warning"):
    """Filter SARIF results by minimum severity level"""
    # SARIF 2.1.0 levels, from least to most severe
    severity_order = {"none": 0, "note": 1, "warning": 2, "error": 3}
    min_level = severity_order.get(min_severity, 1)

    filtered_results = []
    for result in sarif_results["runs"][0]["results"]:
        result_level = severity_order.get(result.get("level", "note"), 1)
        if result_level >= min_level:
            filtered_results.append(result)

    return filtered_results
```

#### Rule-Based Filtering

```python
import re

def filter_by_rules(sarif_results, rule_patterns):
    """Filter results by rule ID patterns"""
    filtered_results = []
    for result in sarif_results["runs"][0]["results"]:
        rule_id = result.get("ruleId", "")
        for pattern in rule_patterns:
            if re.match(pattern, rule_id):
                filtered_results.append(result)
                break

    return filtered_results
```

### Statistical Analysis

#### Severity Distribution
```python
def analyze_severity_distribution(sarif_results):
    """Analyze distribution of findings by severity"""
    distribution = {"error": 0, "warning": 0, "note": 0, "none": 0}

    for result in sarif_results["runs"][0]["results"]:
        level = result.get("level", "note")
        # Use .get() so unexpected level values don't raise KeyError
        distribution[level] = distribution.get(level, 0) + 1

    return distribution
```

#### Tool Coverage Analysis

```python
def analyze_tool_coverage(sarif_results):
    """Analyze which tools contributed findings"""
    tool_stats = {}

    for result in sarif_results["runs"][0]["results"]:
        tool = result.get("properties", {}).get("tool", "unknown")
        if tool not in tool_stats:
            tool_stats[tool] = {
                "count": 0,
                "severities": {"error": 0, "warning": 0, "note": 0, "none": 0},
            }

        tool_stats[tool]["count"] += 1
        level = result.get("level", "note")
        # Guard against level values outside the expected set
        tool_stats[tool]["severities"][level] = (
            tool_stats[tool]["severities"].get(level, 0) + 1
        )

    return tool_stats
```

## SARIF Export and Integration

### Export Formats

#### JSON Export
```python
import json

def export_sarif_json(sarif_results, output_path):
    """Export SARIF results as JSON"""
    with open(output_path, 'w') as f:
        json.dump(sarif_results, f, indent=2, ensure_ascii=False)
```

#### CSV Export for Spreadsheets

```python
import csv

def export_sarif_csv(sarif_results, output_path):
    """Export SARIF results as CSV for spreadsheet analysis"""
    with open(output_path, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(['Rule ID', 'Severity', 'Message', 'File', 'Line', 'Tool'])

        for result in sarif_results["runs"][0]["results"]:
            rule_id = result.get("ruleId", "unknown")
            level = result.get("level", "note")
            message = result.get("message", {}).get("text", "")
            tool = result.get("properties", {}).get("tool", "unknown")

            for location in result.get("locations", []):
                physical_location = location.get("physicalLocation", {})
                file_path = physical_location.get("artifactLocation", {}).get("uri", "")
                line = physical_location.get("region", {}).get("startLine", "")

                writer.writerow([rule_id, level, message, file_path, line, tool])
```

### IDE Integration

#### Visual Studio Code

SARIF files can be opened directly in VS Code with the SARIF extension:

```json
{
  "recommendations": ["ms-sarif.sarif-viewer"],
  "sarif.viewer.connectToGitHub": true,
  "sarif.viewer.showResultsInExplorer": true
}
```

#### GitHub Integration

GitHub automatically processes SARIF files uploaded through Actions:

```yaml
- name: Upload SARIF results
  uses: github/codeql-action/upload-sarif@v2
  with:
    sarif_file: fuzzforge-results.sarif
    category: security-analysis
```

### API Integration

#### SARIF Result Access
```python
# Example: Accessing SARIF results via FuzzForge API
async with FuzzForgeClient() as client:
    result = await client.get_workflow_result(run_id)

    # Access SARIF data
    sarif_data = result["sarif"]
    findings = sarif_data["runs"][0]["results"]

    # Filter critical findings
    critical_findings = [
        f for f in findings
        if f.get("level") == "error" and
        f.get("properties", {}).get("severity") == "critical"
    ]
```

## SARIF Validation and Quality

### Schema Validation
```python
import jsonschema
import requests

def validate_sarif(sarif_data):
    """Validate SARIF data against the official schema"""
    schema_url = "https://json.schemastore.org/sarif-2.1.0.json"
    schema = requests.get(schema_url, timeout=30).json()

    try:
        jsonschema.validate(sarif_data, schema)
        return True, "Valid SARIF 2.1.0 format"
    except jsonschema.ValidationError as e:
        return False, f"SARIF validation error: {e.message}"
```

### Quality Metrics

```python
def calculate_sarif_quality_metrics(sarif_data):
    """Calculate quality metrics for SARIF results"""
    results = sarif_data["runs"][0]["results"]

    metrics = {
        "total_findings": len(results),
        "findings_with_location": len([r for r in results if r.get("locations")]),
        "findings_with_message": len([r for r in results if r.get("message", {}).get("text")]),
        "findings_with_remediation": len([r for r in results if r.get("fixes")]),
        "unique_rules": len(set(r.get("ruleId") for r in results)),
        # calculate_coverage() is assumed to be defined elsewhere in the codebase
        "coverage_percentage": calculate_coverage(sarif_data)
    }

    metrics["quality_score"] = (
        metrics["findings_with_location"] / max(metrics["total_findings"], 1) * 0.3 +
        metrics["findings_with_message"] / max(metrics["total_findings"], 1) * 0.3 +
        metrics["findings_with_remediation"] / max(metrics["total_findings"], 1) * 0.2 +
        min(metrics["coverage_percentage"] / 100, 1.0) * 0.2
    )

    return metrics
```

## Advanced SARIF Features

### Fixes and Remediation
```json
{
  "ruleId": "semgrep.security.audit.sqli.pg-sqli",
  "fixes": [
    {
      "description": {
        "text": "Use parameterized queries to prevent SQL injection"
      },
      "artifactChanges": [
        {
          "artifactLocation": {
            "uri": "src/database/queries.py"
          },
          "replacements": [
            {
              "deletedRegion": {
                "startLine": 156,
                "startColumn": 20,
                "endLine": 156,
                "endColumn": 45
              },
              "insertedContent": {
                "text": "cursor.execute(query, params)"
              }
            }
          ]
        }
      ]
    }
  ]
}
```

### Code Flows for Complex Vulnerabilities

```json
{
  "ruleId": "dataflow.taint.sql-injection",
  "codeFlows": [
    {
      "message": {
        "text": "Tainted data flows from user input to SQL query"
      },
      "threadFlows": [
        {
          "locations": [
            {
              "location": {
                "physicalLocation": {
                  "artifactLocation": {"uri": "src/api/handlers.py"},
                  "region": {"startLine": 45}
                }
              },
              "state": {"source": "user_input"},
              "nestingLevel": 0
            },
            {
              "location": {
                "physicalLocation": {
                  "artifactLocation": {"uri": "src/database/queries.py"},
                  "region": {"startLine": 156}
                }
              },
              "state": {"sink": "sql_query"},
              "nestingLevel": 0
            }
          ]
        }
      ]
    }
  ]
}
```

---

## SARIF Best Practices

### Result Quality
- **Precise Locations**: Always include accurate file paths and line numbers
- **Clear Messages**: Write descriptive, actionable finding messages
- **Remediation Guidance**: Include fix suggestions when possible
- **Severity Consistency**: Use consistent severity mappings across tools

### Performance
- **Efficient Processing**: Process SARIF results efficiently for large result sets
- **Streaming**: Use streaming for very large SARIF files
- **Caching**: Cache processed results for faster repeated access
- **Compression**: Compress SARIF files for storage and transmission

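The compression point can be covered with the standard library alone; SARIF's repetitive JSON typically compresses very well under gzip. A minimal sketch (not the platform's actual storage layer):

```python
import gzip
import json

def write_sarif_gz(sarif_results, output_path):
    """Write SARIF as gzip-compressed JSON for storage or transmission."""
    with gzip.open(output_path, "wt", encoding="utf-8") as f:
        json.dump(sarif_results, f)

def read_sarif_gz(path):
    """Load a gzip-compressed SARIF document back into a dict."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return json.load(f)
```

The round trip is lossless, so downstream consumers can decompress and process results exactly as if they were plain JSON.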
### Integration

- **Tool Interoperability**: Ensure SARIF compatibility with existing tools
- **Standard Compliance**: Follow SARIF 2.1.0 specification precisely
- **Extension Documentation**: Document any custom extensions clearly
- **Version Management**: Handle SARIF schema version differences

---

# Security Analysis in FuzzForge: Concepts and Approach

Security analysis is at the core of FuzzForge’s mission. This page explains the philosophy, methodologies, and integration patterns that shape how FuzzForge discovers vulnerabilities and helps teams secure their software. If you’re curious about what “security analysis” really means in this platform—and why it’s designed this way—read on.

---

## Why Does FuzzForge Approach Security Analysis This Way?

FuzzForge’s security analysis is built on a few guiding principles:

- **Defense in Depth:** No single tool or method catches everything. FuzzForge layers multiple analysis types—static, dynamic, secret detection, infrastructure checks, and fuzzing—to maximize coverage.
- **Tool Diversity:** Different tools find different issues. Running several tools for each analysis type reduces blind spots and increases confidence in results.
- **Standardized Results:** All findings are normalized into SARIF, a widely adopted format. This makes results easy to aggregate, review, and integrate with other tools.
- **Automation and Integration:** Security analysis is only useful if it fits into real-world workflows. FuzzForge is designed for CI/CD, developer feedback, and automated reporting.

---

## What Types of Security Analysis Does FuzzForge Perform?

### Static Analysis

- **What it is:** Examines source code without running it, looking for vulnerabilities, anti-patterns, and risky constructs.
- **How it works:** Parses code, analyzes control and data flow, and matches patterns against known vulnerabilities.
- **Tools:** Semgrep, Bandit, CodeQL, ESLint, and more.
- **Strengths:** Fast, broad coverage, no runtime needed.
- **Limitations:** Can’t see runtime issues, may produce false positives.

### Dynamic Analysis

- **What it is:** Tests running applications to find vulnerabilities that only appear at runtime.
- **How it works:** Deploys the app in a test environment, probes entry points, and observes behavior under attack.
- **Tools:** Nuclei, OWASP ZAP, Nmap, SQLMap.
- **Strengths:** Finds real, exploitable issues; validates actual behavior.
- **Limitations:** Needs a working environment; slower; may not cover all code.

### Secret Detection

- **What it is:** Scans code and configuration for exposed credentials, API keys, and sensitive data.
- **How it works:** Uses pattern matching, entropy analysis, and context checks—sometimes even scanning git history.
- **Tools:** TruffleHog, Gitleaks, detect-secrets, GitGuardian.
- **Strengths:** Fast, critical for preventing leaks.
- **Limitations:** Can’t find encrypted/encoded secrets; needs regular pattern updates.

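The entropy heuristic mentioned above is easy to sketch: random secrets score high in Shannon entropy per character, while natural-language strings score low. This is illustrative only; real scanners combine entropy with pattern and context checks:

```python
import math

def shannon_entropy(s: str) -> float:
    """Bits of entropy per character of s; high values suggest random data."""
    if not s:
        return 0.0
    counts = {ch: s.count(ch) for ch in set(s)}
    entropy = -sum((n / len(s)) * math.log2(n / len(s)) for n in counts.values())
    return entropy if entropy > 0 else 0.0

# A uniformly random hex string approaches 4 bits/char; repeated text scores 0.
print(shannon_entropy("0123456789abcdef"))  # 4.0
print(shannon_entropy("aaaaaaaa"))          # 0.0
```

A scanner would flag tokens above a threshold (the 4.2 `entropy_score` seen in the secret-detection SARIF properties earlier is such a value) and then apply context checks to cut false positives.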
### Infrastructure Analysis

- **What it is:** Analyzes infrastructure-as-code, container configs, and deployment manifests for security misconfigurations.
- **How it works:** Parses config files, applies security policies, checks compliance, and assesses risk.
- **Tools:** Checkov, Hadolint, Kubesec, Terrascan.
- **Strengths:** Prevents misconfigurations before deployment; automates compliance.
- **Limitations:** Can’t see runtime changes; depends on up-to-date policies.

### Fuzzing

- **What it is:** Automatically generates and sends unexpected or random inputs to code, looking for crashes or unexpected behavior.
- **How it works:** Identifies targets, generates inputs, monitors execution, and analyzes crashes.
- **Tools:** AFL++, libFuzzer, Cargo Fuzz, Jazzer.
- **Strengths:** Finds deep, complex bugs; great for memory safety.
- **Limitations:** Resource-intensive; may need manual setup.

### Comprehensive Assessment

- **What it is:** Combines all the above for a holistic view, correlating findings and prioritizing risks.
- **How it works:** Runs multiple analyses, aggregates and correlates results, and generates unified reports.
- **Benefits:** Complete coverage, better context, prioritized remediation, and compliance support.

---

## How Does FuzzForge Integrate and Orchestrate Analysis?

### Workflow Composition

FuzzForge composes analysis workflows by combining different analysis types, each running in its own containerized environment. Inputs (code, configs, parameters) are fed into the appropriate tools, and results are normalized and aggregated.

```mermaid
graph TB
    subgraph "Input"
        Target[Target Codebase]
        Config[Analysis Configuration]
    end

    subgraph "Analysis Workflows"
        Static[Static Analysis]
        Dynamic[Dynamic Analysis]
        Secrets[Secret Detection]
        Infra[Infrastructure Analysis]
        Fuzz[Fuzzing Analysis]
    end

    subgraph "Processing"
        Normalize[Result Normalization]
        Merge[Finding Aggregation]
        Correlate[Cross-Tool Correlation]
    end

    subgraph "Output"
        SARIF[SARIF Results]
        Report[Security Report]
        Metrics[Analysis Metrics]
    end

    Target --> Static
    Target --> Dynamic
    Target --> Secrets
    Target --> Infra
    Target --> Fuzz
    Config --> Static
    Config --> Dynamic
    Config --> Secrets
    Config --> Infra
    Config --> Fuzz

    Static --> Normalize
    Dynamic --> Normalize
    Secrets --> Normalize
    Infra --> Normalize
    Fuzz --> Normalize

    Normalize --> Merge
    Merge --> Correlate
    Correlate --> SARIF
    Correlate --> Report
    Correlate --> Metrics
```

### Orchestration Patterns

- **Parallel Execution:** Tools of the same type (e.g., multiple static analyzers) run in parallel for speed and redundancy.
- **Sequential Execution:** Some analyses depend on previous results (e.g., dynamic analysis using endpoints found by static analysis).
- **Result Normalization:** All findings are converted to SARIF for consistency.
- **Correlation:** Related findings from different tools are grouped and prioritized.

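Normalization and aggregation can be sketched as merging per-tool SARIF documents into a single run. This is a simplified sketch: the aggregate driver name and `properties.tool` tagging are illustrative, and real correlation is richer than exact-match deduplication:

```python
import json

def merge_sarif_runs(sarif_docs):
    """Aggregate results from several per-tool SARIF documents into one run."""
    merged = {
        "version": "2.1.0",
        "runs": [{
            # Hypothetical aggregate driver name, for illustration only
            "tool": {"driver": {"name": "fuzzforge-aggregate"}},
            "results": [],
        }],
    }
    seen = set()
    for doc in sarif_docs:
        for run in doc.get("runs", []):
            tool = run.get("tool", {}).get("driver", {}).get("name", "unknown")
            for result in run.get("results", []):
                # Drop exact duplicates reported by more than one tool
                key = (result.get("ruleId"),
                       json.dumps(result.get("locations", []), sort_keys=True))
                if key in seen:
                    continue
                seen.add(key)
                result.setdefault("properties", {})["tool"] = tool
                merged["runs"][0]["results"].append(result)
    return merged
```

Tagging each result with its originating tool is what makes later per-tool statistics (like the coverage analysis shown earlier) possible after aggregation.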
---

## How Is Quality Ensured?

### Metrics and Measurement

- **Coverage:** How much code, how many rules, and how many vulnerability types are analyzed.
- **Accuracy:** False positive/negative rates, confidence scores, and validation rates.
- **Performance:** Analysis duration, resource usage, and scalability.

### Quality Assurance

- **Cross-Tool Validation:** Findings are confirmed by multiple tools when possible.
- **Manual Review:** High-severity findings can be flagged for expert review.
- **Continuous Improvement:** Tools and rules are updated regularly, and user feedback is incorporated.

---

## How Does Security Analysis Fit Into Development Workflows?

### CI/CD Integration

- **Pre-commit Hooks:** Run security checks before code is committed.
- **Pipeline Integration:** Block deployments if high/critical issues are found.
- **Quality Gates:** Enforce severity thresholds and track trends over time.

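A quality gate of the kind described above boils down to counting findings per SARIF level and failing the pipeline when a threshold is crossed. The thresholds here are an example policy, not FuzzForge defaults:

```python
# Example policy: any error fails the build, ten or more warnings fail it too.
GATE = {"error": 1, "warning": 10}

def enforce_quality_gate(sarif_results, gate=GATE):
    """Return 1 (CI failure) if findings reach the configured thresholds, else 0."""
    counts = {}
    for result in sarif_results["runs"][0]["results"]:
        level = result.get("level", "note")
        counts[level] = counts.get(level, 0) + 1
    for level, limit in gate.items():
        if counts.get(level, 0) >= limit:
            print(f"Quality gate failed: {counts[level]} '{level}' finding(s), limit {limit}")
            return 1
    return 0
```

In a pipeline step this would load the SARIF file and pass the return value to `sys.exit()`, which is what actually blocks the deployment.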
### Developer Experience

- **IDE Integration:** Import SARIF findings into supported IDEs for inline feedback.
- **Real-Time Analysis:** Optionally run background checks during development.
- **Reporting:** Executive dashboards, technical reports, and compliance summaries.

---

## What’s Next for Security Analysis in FuzzForge?

FuzzForge is designed to evolve. Advanced techniques like machine learning for pattern recognition, contextual analysis, and business logic checks are on the roadmap. The goal: keep raising the bar for automated, actionable, and developer-friendly security analysis.

---

## In Summary

FuzzForge’s security analysis is comprehensive, layered, and designed for real-world integration. By combining multiple analysis types, normalizing results, and focusing on automation and developer experience, FuzzForge helps teams find and fix vulnerabilities—before attackers do.

---

# Understanding Workflows in FuzzForge

Workflows are the backbone of FuzzForge’s security analysis platform. If you want to get the most out of FuzzForge, it’s essential to understand what workflows are, how they’re designed, and how they operate from start to finish. This page explains the core concepts, design principles, and execution models behind FuzzForge workflows—so you can use them confidently and effectively.

---

## What Is a Workflow?

A **workflow** in FuzzForge is a containerized process that orchestrates one or more security tools to analyze a target codebase or application. Each workflow is tailored for a specific type of security analysis (like static analysis, secret detection, or fuzzing) and is designed to be:

- **Isolated:** Runs in its own Docker container for security and reproducibility.
- **Integrated:** Can combine multiple tools for comprehensive results.
- **Standardized:** Always produces SARIF-compliant output.
- **Configurable:** Accepts parameters to customize analysis.
- **Scalable:** Can run in parallel and scale horizontally.

---

## How Does a Workflow Operate?

### High-Level Architecture

Here’s how a workflow moves through the FuzzForge system:

```mermaid
graph TB
    User[User/CLI/API] --> API[FuzzForge API]
    API --> Prefect[Prefect Orchestrator]
    Prefect --> Worker[Prefect Worker]
    Worker --> Container[Docker Container]
    Container --> Tools[Security Tools]
    Tools --> Results[SARIF Results]
    Results --> Storage[Persistent Storage]
```

**Key roles:**
- **User/CLI/API:** Submits and manages workflows.
- **FuzzForge API:** Validates, orchestrates, and tracks workflows.
- **Prefect Orchestrator:** Schedules and manages workflow execution.
- **Prefect Worker:** Runs the workflow in a Docker container.
- **Security Tools:** Perform the actual analysis.
- **Persistent Storage:** Stores results and artifacts.

---

## Workflow Lifecycle: From Idea to Results

1. **Design:** Choose tools, define integration logic, set up parameters, and build the Docker image.
2. **Deployment:** Build and push the image, register the workflow, and configure defaults.
3. **Execution:** User submits a workflow; parameters and target are validated; the workflow is scheduled and executed in a container; tools run as designed.
4. **Completion:** Results are collected, normalized, and stored; status is updated; temporary resources are cleaned up; results are made available via API/CLI.

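From the client's point of view, steps 3 and 4 reduce to a submit-and-poll loop. The sketch below keeps the API abstract: `submit`, `status`, and `fetch_results` are caller-supplied placeholders, not the real FuzzForge client:

```python
import time

def run_workflow(submit, status, fetch_results, poll_interval=1.0, timeout=60.0):
    """Generic submit/poll/fetch loop over caller-supplied callables."""
    run_id = submit()
    deadline = time.monotonic() + timeout
    state = "pending"
    while time.monotonic() < deadline:
        state = status(run_id)
        if state in ("completed", "failed"):
            break
        time.sleep(poll_interval)
    else:
        raise TimeoutError(f"run {run_id} did not finish within {timeout}s")
    return fetch_results(run_id) if state == "completed" else None

# Stubbed example: the run reports "running" once, then completes.
states = iter(["running", "completed"])
outcome = run_workflow(
    submit=lambda: "run-42",
    status=lambda rid: next(states),
    fetch_results=lambda rid: {"sarif": {"runs": []}},
    poll_interval=0.01,
)
```

The same loop underlies both interactive use (short poll interval, short timeout) and batch use (long timeout, or replace polling with the asynchronous model described below).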
---

## Types of Workflows

FuzzForge supports several workflow types, each optimized for a specific security need:

- **Static Analysis:** Examines source code without running it (e.g., Semgrep, Bandit).
- **Dynamic Analysis:** Tests running applications for runtime vulnerabilities (e.g., OWASP ZAP, Nuclei).
- **Secret Detection:** Finds exposed credentials and sensitive data (e.g., TruffleHog, Gitleaks).
- **Infrastructure Analysis:** Checks infrastructure-as-code and configs for misconfigurations (e.g., Checkov, Hadolint).
- **Fuzzing:** Generates unexpected inputs to find crashes and edge cases (e.g., AFL++, libFuzzer).
- **Comprehensive Assessment:** Combines multiple analysis types for full coverage.

---

## Workflow Design Principles

- **Tool Agnostic:** Workflows abstract away the specifics of underlying tools, providing a consistent interface.
- **Fail-Safe Execution:** If one tool fails, others continue—partial results are still valuable.
- **Configurable:** Users can adjust parameters to control tool behavior, output, and execution.
- **Resource-Aware:** Workflows specify and respect resource limits (CPU, memory).
- **Standardized Output:** All results are normalized to SARIF for easy integration and reporting.

---

## Execution Models

- **Synchronous:** Wait for the workflow to finish and get results immediately—great for interactive use.
- **Asynchronous:** Submit a workflow and check back later for results—ideal for long-running or batch jobs.
- **Parallel:** Run multiple workflows at once for comprehensive or time-sensitive analysis.

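The parallel model is essentially fan-out/fan-in. With a thread pool it looks like this, where `run_one` stands in for a real workflow submission call:

```python
from concurrent.futures import ThreadPoolExecutor

def run_parallel(workflows, run_one, max_workers=4):
    """Run independent workflows concurrently and collect results by name."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {name: pool.submit(run_one, name) for name in workflows}
        # .result() re-raises any exception from the worker thread
        return {name: fut.result() for name, fut in futures.items()}
```

Since each workflow runs in its own container, the client-side threads spend their time waiting on the API, so a modest pool size covers many concurrent runs.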
---

## Data Flow and Storage

- **Input:** Target code and parameters are validated and mounted as read-only volumes.
- **Processing:** Tools are initialized and run (often in parallel); outputs are collected and normalized.
- **Output:** Results are stored in persistent volumes and indexed for fast retrieval; metadata is saved in the database; intermediate results may be cached for performance.

---

## Error Handling and Recovery

- **Tool-Level:** Timeouts, resource exhaustion, and crashes are handled gracefully; failed tools don’t stop the workflow.
- **Workflow-Level:** Container failures, volume issues, and network problems are detected and reported.
- **Recovery:** Automatic retries for transient errors; partial results are returned when possible; workflows degrade gracefully if some tools are unavailable.

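The automatic-retry behavior for transient errors can be sketched as exponential backoff around any operation. This is the generic pattern, not FuzzForge's internal retry code:

```python
import time

def with_retries(operation, attempts=3, base_delay=0.5,
                 retriable=(TimeoutError, ConnectionError)):
    """Retry transient failures with exponential backoff; re-raise anything else."""
    for attempt in range(attempts):
        try:
            return operation()
        except retriable:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the transient error
            time.sleep(base_delay * (2 ** attempt))
```

Limiting retries to a named set of transient exception types is what keeps permanent failures (bad parameters, missing images) from being retried pointlessly.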
---

## Performance and Optimization

- **Container Efficiency:** Docker images are layered and cached for fast startup; containers may be reused when safe.
- **Parallel Processing:** Independent tools run concurrently to maximize CPU usage and minimize wait times.
- **Caching:** Images, dependencies, and intermediate results are cached to avoid unnecessary recomputation.

---

## Monitoring and Observability

- **Metrics:** Track execution time, resource usage, and success/failure rates.
- **Logging:** Structured logs and tool outputs are captured for debugging and analysis.
- **Real-Time Monitoring:** Live status updates and progress indicators are available via API/WebSocket.

---

## Integration Patterns

- **CI/CD:** Integrate workflows into pipelines to block deployments on critical findings.
- **API:** Programmatically submit and track workflows from your own tools or scripts.
- **Event-Driven:** Use webhooks or event listeners to trigger actions on workflow completion.

---

## In Summary

Workflows in FuzzForge are designed to be robust, flexible, and easy to integrate into your security and development processes. By combining containerization, orchestration, and a standardized interface, FuzzForge workflows help you automate and scale security analysis—so you can focus on fixing issues, not just finding them.

---

# Working with documentation

To update the documentation on any of the sections, just add a new markdown file to the designated subfolder below:

```
├─concepts
├─tutorials
├─how-to
│ └─troubleshooting
└─reference
  ├─architecture
  ├─decisions
  └─faq
```

:::note Templates

Each folder contains templates that can be used as quickstarts. Those are named `<template name>.tpml`.

:::

See [Diataxis documentation](../reference/diataxis-documentation.md) for more information on diátaxis.

## Manage Docs Versions

Docusaurus can manage multiple versions of the docs.

### Create a docs version

Release a version 1.0 of your project:

```bash
npm run docusaurus docs:version 1.0
```

The `docs` folder is copied into `versioned_docs/version-1.0` and `versions.json` is created.

Your docs now have 2 versions:

- `1.0` at `http://localhost:3000/docs/` for the version 1.0 docs
- `current` at `http://localhost:3000/docs/next/` for the **upcoming, unreleased docs**

### Add a Version Dropdown

To navigate seamlessly across versions, add a version dropdown.

Modify the `docusaurus.config.js` file:

```js title="docusaurus.config.js"
export default {
  themeConfig: {
    navbar: {
      items: [
        // highlight-start
        {
          type: 'docsVersionDropdown',
        },
        // highlight-end
      ],
    },
  },
};
```

The docs version dropdown appears in the navbar.

## Update an existing version

It is possible to edit versioned docs in their respective folder:

- `versioned_docs/version-1.0/hello.md` updates `http://localhost:3000/docs/hello`
- `docs/hello.md` updates `http://localhost:3000/docs/next/hello`