mirror of https://github.com/KeygraphHQ/shannon.git synced 2026-02-12 09:12:50 +00:00

Files

Arjun Malleswaran ae4c4ed402 fix: add file_path parameter to save_deliverable for large reports (#123 )

* fix: add file_path parameter to save_deliverable for large reports

Large deliverable reports can exceed output token limits when passed as
inline content. This change allows agents to write reports to disk first
and pass a file_path instead.

Changes:
- Add file_path parameter to save_deliverable MCP tool with path
  traversal protection
- Pass CLAUDE_CODE_MAX_OUTPUT_TOKENS env var to SDK subprocesses
- Fix false positive error detection by extracting only text content
  (not tool_use JSON) when checking for API errors
- Update all prompts to instruct agents to use file_path for large
  reports and stop immediately after completion

* docs: simplify and condense CLAUDE.md

Reduce verbosity while preserving all essential information for AI
assistance. Makes the documentation more scannable and focused.

* feat: add issue number detection to pr command

The /pr command now automatically detects issue numbers from:
1. Explicit arguments (e.g., /pr 123 or /pr 123,456)
2. Branch name patterns (e.g., fix/123-bug, issue-456-feature)

Adds "Closes #X" lines to PR body to auto-close issues on merge.

* chore: remove CLAUDE_CODE_MAX_OUTPUT_TOKENS env var handling

No longer needed with the new Claude Agent SDK version.

* fix: restore max_output_tokens error handling

2026-02-11 13:40:49 -08:00

6.6 KiB

Raw Permalink Blame History

CLAUDE.md

AI-powered penetration testing agent for defensive security analysis. Automates vulnerability assessment by combining reconnaissance tools with AI-powered code analysis.

Commands

Prerequisites: Docker, Anthropic API key in .env

# Setup
cp .env.example .env && edit .env  # Set ANTHROPIC_API_KEY

# Prepare repo (REPO is a folder name inside ./repos/, not an absolute path)
git clone https://github.com/org/repo.git ./repos/my-repo
# or symlink: ln -s /path/to/existing/repo ./repos/my-repo

# Run
./shannon start URL=<url> REPO=my-repo
./shannon start URL=<url> REPO=my-repo CONFIG=./configs/my-config.yaml

# Monitor
./shannon logs                      # Real-time worker logs
./shannon query ID=<workflow-id>    # Query workflow progress
# Temporal Web UI: http://localhost:8233

# Stop
./shannon stop                      # Preserves workflow data
./shannon stop CLEAN=true           # Full cleanup including volumes

# Build
npm run build

Options: CONFIG=<file> (YAML config), OUTPUT=<path> (default: ./audit-logs/), PIPELINE_TESTING=true (minimal prompts, 10s retries), REBUILD=true (force Docker rebuild), ROUTER=true (multi-model routing via claude-code-router)

Architecture

Core Modules

src/session-manager.ts — Agent definitions, execution order, parallel groups
src/ai/claude-executor.ts — Claude Agent SDK integration with retry logic and git checkpoints
src/config-parser.ts — YAML config parsing with JSON Schema validation
src/error-handling.ts — Categorized error types (PentestError, ConfigError, NetworkError) with retry logic
src/tool-checker.ts — Validates external security tool availability before execution
src/queue-validation.ts — Deliverable validation and agent prerequisites

Temporal Orchestration

Durable workflow orchestration with crash recovery, queryable progress, intelligent retry, and parallel execution (5 concurrent agents in vuln/exploit phases).

src/temporal/workflows.ts — Main workflow (pentestPipelineWorkflow)
src/temporal/activities.ts — Activity implementations with heartbeats
src/temporal/worker.ts — Worker entry point
src/temporal/client.ts — CLI client for starting workflows
src/temporal/shared.ts — Types, interfaces, query definitions
src/temporal/query.ts — Query tool for progress inspection

Five-Phase Pipeline

Pre-Recon (pre-recon) — External scans (nmap, subfinder, whatweb) + source code analysis
Recon (recon) — Attack surface mapping from initial findings
Vulnerability Analysis (5 parallel agents) — injection, xss, auth, authz, ssrf
Exploitation (5 parallel agents, conditional) — Exploits confirmed vulnerabilities
Reporting (report) — Executive-level security report

Supporting Systems

Configuration — YAML configs in configs/ with JSON Schema validation (config-schema.json). Supports auth settings, MFA/TOTP, and per-app testing parameters
Prompts — Per-phase templates in prompts/ with variable substitution ({{TARGET_URL}}, {{CONFIG_CONTEXT}}). Shared partials in prompts/shared/ via prompt-manager.ts
SDK Integration — Uses @anthropic-ai/claude-agent-sdk with maxTurns: 10_000 and bypassPermissions mode. Playwright MCP for browser automation, TOTP generation via MCP tool. Login flow template at prompts/shared/login-instructions.txt supports form, SSO, API, and basic auth
Audit System — Crash-safe append-only logging in audit-logs/{hostname}_{sessionId}/. Tracks session metrics, per-agent logs, prompts, and deliverables
Deliverables — Saved to deliverables/ in the target repo via the save_deliverable MCP tool

Development Notes

Adding a New Agent

Define agent in src/session-manager.ts (add to AGENT_QUEUE and parallel group)
Create prompt template in prompts/ (e.g., vuln-newtype.txt)
Add activity function in src/temporal/activities.ts
Register activity in src/temporal/workflows.ts within the appropriate phase

Modifying Prompts

Variable substitution: {{TARGET_URL}}, {{CONFIG_CONTEXT}}, {{LOGIN_INSTRUCTIONS}}
Shared partials in prompts/shared/ included via prompt-manager.ts
Test with PIPELINE_TESTING=true for fast iteration

Key Design Patterns

Configuration-Driven — YAML configs with JSON Schema validation
Progressive Analysis — Each phase builds on previous results
SDK-First — Claude Agent SDK handles autonomous analysis
Modular Error Handling — Categorized errors with automatic retry (3 attempts per agent)

Security

Defensive security tool only. Use only on systems you own or have explicit permission to test.

Code Style Guidelines

Clarity Over Brevity

Optimize for readability, not line count — three clear lines beat one dense expression
Use descriptive names that convey intent
Prefer explicit logic over clever one-liners

Structure

Keep functions focused on a single responsibility
Use early returns and guard clauses instead of deep nesting
Never use nested ternary operators — use if/else or switch
Extract complex conditions into well-named boolean variables

TypeScript Conventions

Use function keyword for top-level functions (not arrow functions)
Explicit return type annotations on exported/top-level functions
Prefer readonly for data that shouldn't be mutated

Avoid

Combining multiple concerns into a single function to "save lines"
Dense callback chains when sequential logic is clearer
Sacrificing readability for DRY — some repetition is fine if clearer
Abstractions for one-time operations

Key Files

Entry Points: src/temporal/workflows.ts, src/temporal/activities.ts, src/temporal/worker.ts, src/temporal/client.ts

Core Logic: src/session-manager.ts, src/ai/claude-executor.ts, src/config-parser.ts, src/audit/

Config: shannon (CLI), docker-compose.yml, configs/, prompts/

Troubleshooting

"Repository not found" — REPO must be a folder name inside ./repos/, not an absolute path. Clone or symlink your repo there first: ln -s /path/to/repo ./repos/my-repo
"Temporal not ready" — Wait for health check or docker compose logs temporal
Worker not processing — Check docker compose ps
Reset state — ./shannon stop CLEAN=true
Local apps unreachable — Use host.docker.internal instead of localhost
Missing tools — Use PIPELINE_TESTING=true to skip nmap/subfinder/whatweb (graceful degradation)
Container permissions — On Linux, may need sudo for docker commands

6.6 KiB Raw Permalink Blame History