Merge pull request #2 from KeygraphHQ/fixing-bugs

Fixing bugs
2026-05-21 08:16:55 +02:00 · 2025-10-23 18:18:21 -07:00
parent 3a8b7ae496 f40f52f118
commit d85b6af5f5
75 changed files with 3275 additions and 1959 deletions
@@ -1,3 +1,4 @@
 node_modules/
 .shannon-store.json
-agent-logs/
+agent-logs/
+audit-logs/
@@ -36,9 +36,7 @@ npm start <WEB_URL> <REPO_PATH> --config <CONFIG_FILE>
 ```

 ### Generate TOTP for Authentication
-```bash
-./login_resources/generate-totp.mjs <TOTP_SECRET>
-```
+TOTP generation is now handled automatically via the `generate_totp` MCP tool during authentication flows.

 ### Development Commands
 ```bash
@@ -154,8 +152,8 @@ The `prompts/` directory contains specialized prompt templates for each testing
 - `exploit-*.txt` - Exploitation attempt prompts
 - `report-executive.txt` - Executive report generation prompts

-### Claude Code SDK Integration
-The agent uses the `@anthropic-ai/claude-code` SDK with maximum autonomy configuration:
+### Claude Agent SDK Integration
+The agent uses the `@anthropic-ai/claude-agent-sdk` package with a maximum-autonomy configuration:
 - `maxTurns: 10_000` - Allows extensive autonomous analysis
 - `permissionMode: 'bypassPermissions'` - Full system access for thorough testing
 - Playwright MCP integration for web browser automation
@@ -163,8 +161,8 @@ The agent uses the `@anthropic-ai/claude-code` SDK with maximum autonomy configu
 - Configuration context injection for authenticated testing

 ### Authentication & Login Resources
- `login_resources/generate-totp.mjs` - TOTP token generation utility
- `login_resources/login_instructions.txt` - Login flow documentation
+- `prompts/shared/login-instructions.txt` - Login flow template for all agents
+- TOTP token generation via MCP `generate_totp` tool
 - Support for multi-factor authentication workflows
 - Configurable authentication mechanisms (form, SSO, API, basic)

@@ -188,17 +186,46 @@ The agent implements a sophisticated checkpoint system using git:
 - Every agent creates a git checkpoint before execution
 - Rollback to any previous agent state using `--rollback-to` or `--rerun`
 - Failed agents don't affect completed work
- Timing and cost data cleaned up during rollbacks
+- Rolled-back agents marked in audit system with status: "rolled-back"
+- Reconciliation automatically syncs Shannon store with audit logs after rollback
 - Fail-fast safety prevents accidental re-execution of completed agents

-### Timing & Performance Monitoring
-The agent includes comprehensive timing instrumentation that tracks:
- Total execution time
- Phase-level timing breakdown
- Individual command execution times
- Claude Code agent processing times
- Cost tracking for AI agent usage
+### Unified Audit & Metrics System
+The agent implements a crash-safe, self-healing audit system (v3.0) with the following guarantees:

+**Architecture:**
+- **audit-logs/**: Centralized metrics and forensic logs (source of truth)
+  - `{hostname}_{sessionId}/session.json` - Comprehensive metrics with attempt-level detail
+  - `{hostname}_{sessionId}/prompts/` - Exact prompts used for reproducibility
+  - `{hostname}_{sessionId}/agents/` - Turn-by-turn execution logs
+- **.shannon-store.json**: Minimal orchestration state (completedAgents, checkpoints)
+
+**Crash Safety:**
+- Append-only logging with immediate flush (survives kill -9)
+- Atomic writes for session.json (no partial writes)
+- Event-based logging (tool_start, tool_end, llm_response) closes data loss windows
+
+**Self-Healing:**
+- Automatic reconciliation before every CLI command
+- Recovers from crashes during rollback
+- Audit logs are source of truth; Shannon store follows
+
+**Forensic Completeness:**
+- All retry attempts logged with errors, costs, durations
+- Rolled-back agents preserved with status: "rolled-back"
+- Partial cost capture for failed attempts
+- Complete event trail for debugging
+
+**Concurrency Safety:**
+- SessionMutex prevents race conditions during parallel agent execution
+- Safe parallel execution of vulnerability and exploitation phases
+
+**Metrics & Reporting:**
+- Export metrics to CSV with `./scripts/export-metrics.js`
+- Phase-level and agent-level timing/cost aggregations
+- Validation results integrated with metrics
+
+For detailed design, see `docs/unified-audit-system-design.md`.
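The "atomic writes for session.json" guarantee above is commonly achieved by writing to a temporary file and renaming it over the target. The following is a minimal sketch under that assumption, not the code in `src/audit/utils.js`; the function name `atomicWriteJson` and the temp-file naming scheme are hypothetical.

```javascript
import { mkdtempSync, openSync, writeSync, fsyncSync, closeSync, renameSync, readFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { basename, dirname, join } from 'node:path';

// Hypothetical helper: write JSON to a sibling temp file, flush it to disk,
// then rename it over the target so readers never observe a partial file.
function atomicWriteJson(filePath, data) {
  const tmpPath = join(dirname(filePath), `.${basename(filePath)}.${process.pid}.${Date.now()}.tmp`);
  const fd = openSync(tmpPath, 'w');
  try {
    writeSync(fd, JSON.stringify(data, null, 2));
    fsyncSync(fd); // force the bytes to disk before the rename
  } finally {
    closeSync(fd);
  }
  renameSync(tmpPath, filePath); // replaces the target in a single step
}

// Usage: two successive writes to a throwaway directory; the last one wins.
const dir = mkdtempSync(join(tmpdir(), 'audit-sketch-'));
const target = join(dir, 'session.json');
atomicWriteJson(target, { status: 'in-progress' });
atomicWriteJson(target, { status: 'rolled-back' });
console.log(JSON.parse(readFileSync(target, 'utf8')).status); // rolled-back
```

The temp file is created next to the target on purpose: a rename is only a single-step replacement when source and destination are on the same filesystem.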

 ## Development Notes

@@ -206,7 +233,7 @@ The agent includes comprehensive timing instrumentation that tracks:
 - **Configuration-Driven Architecture**: YAML configs with JSON Schema validation
 - **Modular Error Handling**: Categorized error types with retry logic
 - **Pure Functions**: Most functionality is implemented as pure functions for testability
- **SDK-First Approach**: Heavy reliance on Claude Code SDK for autonomous AI operations
+- **SDK-First Approach**: Heavy reliance on Claude Agent SDK for autonomous AI operations
 - **Progressive Analysis**: Each phase builds on previous phase results
 - **Local Repository Setup**: Target applications are accessed directly from user-provided local directories

@@ -232,34 +259,58 @@ The tool should only be used on systems you own or have explicit permission to t
 ## File Structure

 ```
-shannon.mjs              # Main orchestration script
-package.json                  # Node.js dependencies
-src/                         # Core modules
-├── config-parser.js         # Configuration handling
-├── error-handling.js        # Error management
-├── tool-checker.js          # Tool validation
-├── session-manager.js       # Session state management
-├── checkpoint-manager.js    # Git-based checkpointing
-├── queue-validation.js      # Deliverable validation
+shannon.mjs                      # Main orchestration script
+package.json                     # Node.js dependencies
+.shannon-store.json              # Orchestration state (minimal)
+src/                             # Core modules
+├── audit/                       # Unified audit system (v3.0)
+│   ├── index.js                 # Public API
+│   ├── audit-session.js         # Main facade (logger + metrics + mutex)
+│   ├── logger.js                # Append-only crash-safe logging
+│   ├── metrics-tracker.js       # Timing, cost, attempt tracking
+│   └── utils.js                 # Path generation, atomic writes
+├── config-parser.js             # Configuration handling
+├── error-handling.js            # Error management
+├── tool-checker.js              # Tool validation
+├── session-manager.js           # Session state + reconciliation
+├── checkpoint-manager.js        # Git-based checkpointing + rollback
+├── queue-validation.js          # Deliverable validation
+├── ai/
+│   └── claude-executor.js       # Claude Agent SDK integration
 └── utils/
-configs/                     # Configuration files
-├── config-schema.json       # JSON Schema validation
-├── example-config.yaml      # Template configuration
-├── juice-shop-config.yaml   # Juice Shop example
-├── keygraph-config.yaml     # Keygraph configuration
-├── chatwoot-config.yaml     # Chatwoot configuration
-├── metabase-config.yaml     # Metabase configuration
-└── cal-com-config.yaml      # Cal.com configuration
-prompts/                     # AI prompt templates
-├── pre-recon-code.txt       # Code analysis
-├── recon.txt               # Reconnaissance  
-├── vuln-*.txt              # Vulnerability assessment
-├── exploit-*.txt           # Exploitation
-└── report-executive.txt    # Executive reporting
-login_resources/            # Authentication utilities
-├── generate-totp.mjs       # TOTP generation
-└── login_instructions.txt  # Login documentation
-deliverables/              # Output directory
+audit-logs/                      # Centralized audit data (v3.0)
+└── {hostname}_{sessionId}/
+    ├── session.json             # Comprehensive metrics
+    ├── prompts/                 # Prompt snapshots
+    │   └── {agent}.md
+    └── agents/                  # Agent execution logs
+        └── {timestamp}_{agent}_attempt-{N}.log
+configs/                         # Configuration files
+├── config-schema.json           # JSON Schema validation
+├── example-config.yaml          # Template configuration
+├── juice-shop-config.yaml       # Juice Shop example
+├── keygraph-config.yaml         # Keygraph configuration
+├── chatwoot-config.yaml         # Chatwoot configuration
+├── metabase-config.yaml         # Metabase configuration
+└── cal-com-config.yaml          # Cal.com configuration
+prompts/                         # AI prompt templates
+├── shared/                      # Shared content for all prompts
+│   ├── _target.txt              # Target URL template
+│   ├── _rules.txt               # Rules template
+│   ├── _vuln-scope.txt          # Vulnerability scope template
+│   ├── _exploit-scope.txt       # Exploitation scope template
+│   └── login-instructions.txt   # Login flow template
+├── pre-recon-code.txt           # Code analysis
+├── recon.txt                    # Reconnaissance
+├── vuln-*.txt                   # Vulnerability assessment
+├── exploit-*.txt                # Exploitation
+└── report-executive.txt         # Executive reporting
+scripts/                         # Utility scripts
+└── export-metrics.js            # Export metrics to CSV
+deliverables/                    # Output directory (in target repo)
+docs/                            # Documentation
+├── unified-audit-system-design.md
+└── migration-guide.md
 ```
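The tree above lists `audit-session.js` as a facade over the logger, metrics tracker, and mutex. To illustrate why a mutex matters when parallel agents update shared session metrics, here is a minimal promise-chain mutex sketch. `KeyedMutex`, `recordCost`, and the `session` object are hypothetical names for illustration and are not Shannon's `SessionMutex` API.

```javascript
// Hypothetical promise-chain mutex keyed by session ID (not Shannon's SessionMutex).
class KeyedMutex {
  #tails = new Map();

  // Resolves to a release function once all earlier holders for `key` have released.
  async lock(key) {
    const previous = this.#tails.get(key) ?? Promise.resolve();
    let release;
    const current = new Promise((resolve) => { release = resolve; });
    this.#tails.set(key, previous.then(() => current));
    await previous;
    return release;
  }
}

const mutex = new KeyedMutex();
const session = { totalCostUsd: 0 };

// Read-modify-write with an async gap in the middle, as a file-backed update would have.
async function recordCost(costUsd) {
  const release = await mutex.lock('session-1');
  try {
    const snapshot = session.totalCostUsd;                   // read
    await new Promise((resolve) => setTimeout(resolve, 5));  // simulated async I/O
    session.totalCostUsd = snapshot + costUsd;               // write
  } finally {
    release();
  }
}

const done = Promise.all([recordCost(1), recordCost(2), recordCost(3)]);
done.then(() => console.log(session.totalCostUsd)); // 6
```

Without the lock, all three calls would read the same starting value and later writes would overwrite earlier ones.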

 ## Troubleshooting
@@ -275,4 +326,12 @@ deliverables/              # Output directory
 Missing tools can be skipped using `--pipeline-testing` mode during development:
 - `nmap` - Network scanning
 - `subfinder` - Subdomain discovery
- `whatweb` - Web technology detection  
+- `whatweb` - Web technology detection
+
+### Diagnostic & Utility Scripts
+```bash
+# Export metrics to CSV
+./scripts/export-metrics.js --session-id <id> --output metrics.csv
+```
+
+Note: To recover from a corrupted state, delete `.shannon-store.json` or edit the JSON files directly.
@@ -68,7 +68,23 @@ RUN apk update && apk add --no-cache \
    nodejs-22 \
    npm \
    python3 \
-    ruby
+    ruby \
+    # Chromium browser and dependencies for Playwright
+    chromium \
+    # Additional libraries Chromium needs
+    nss \
+    freetype \
+    harfbuzz \
+    # X11 libraries for headless browser
+    libx11 \
+    libxcomposite \
+    libxdamage \
+    libxext \
+    libxfixes \
+    libxrandr \
+    mesa-gbm \
+    # Font rendering
+    fontconfig

 # Copy Go binaries from builder
 COPY --from=builder /go/bin/subfinder /usr/local/bin/
@@ -97,7 +113,7 @@ COPY package*.json ./
 # Install Node.js dependencies as root
 RUN npm ci --only=production && \
    npm install -g zx && \
-    npm install -g @anthropic-ai/claude-code && \
+    npm install -g @anthropic-ai/claude-agent-sdk && \
    npm cache clean --force

 # Copy application code
@@ -116,6 +132,9 @@ USER pentest
 # Set environment variables
 ENV NODE_ENV=production
 ENV PATH="/usr/local/bin:$PATH"
+ENV SHANNON_DOCKER=true
+ENV PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD=1
+ENV PLAYWRIGHT_CHROMIUM_EXECUTABLE_PATH=/usr/bin/chromium-browser


 # Set entrypoint
@@ -1,131 +0,0 @@
-#!/usr/bin/env node
-
-import { createHmac } from 'crypto';
-
-/**
- * Standalone TOTP generator that doesn't require external dependencies
- * Based on RFC 6238 (TOTP: Time-Based One-Time Password Algorithm)
- */
-
-function parseArgs() {
-  const args = {};
-  for (let i = 2; i < process.argv.length; i++) {
-    if (process.argv[i] === '--secret' && i + 1 < process.argv.length) {
-      args.secret = process.argv[i + 1];
-      i++; // Skip the next argument since it's the value
-    } else if (process.argv[i] === '--help' || process.argv[i] === '-h') {
-      args.help = true;
-    }
-  }
-  return args;
-}
-
-function showHelp() {
-  console.log(`
-Usage: node generate-totp-standalone.mjs --secret <TOTP_SECRET>
-
-Generate a Time-based One-Time Password (TOTP) from a secret key.
-This standalone version doesn't require external dependencies.
-
-Options:
-  --secret <secret>  The base32-encoded TOTP secret key (required)
-  --help, -h        Show this help message
-
-Examples:
-  node generate-totp-standalone.mjs --secret "JBSWY3DPEHPK3PXP"
-  node generate-totp-standalone.mjs --secret "u4e2ewg3d6w7gya3p7plgkef6zgfzo23"
-
-Output:
-  A 6-digit TOTP code (e.g., 123456)
-`);
-}
-
-// Base32 decoding function
-function base32Decode(encoded) {
-  const alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ234567';
-  const cleanInput = encoded.toUpperCase().replace(/[^A-Z2-7]/g, '');
-  
-  if (cleanInput.length === 0) {
-    return Buffer.alloc(0);
-  }
-  
-  const output = [];
-  let bits = 0;
-  let value = 0;
-  
-  for (const char of cleanInput) {
-    const index = alphabet.indexOf(char);
-    if (index === -1) {
-      throw new Error(`Invalid base32 character: ${char}`);
-    }
-    
-    value = (value << 5) | index;
-    bits += 5;
-    
-    if (bits >= 8) {
-      output.push((value >>> (bits - 8)) & 255);
-      bits -= 8;
-    }
-  }
-  
-  return Buffer.from(output);
-}
-
-// HOTP implementation (RFC 4226)
-function generateHOTP(secret, counter, digits = 6) {
-  const key = base32Decode(secret);
-  
-  // Convert counter to 8-byte buffer (big-endian)
-  const counterBuffer = Buffer.alloc(8);
-  counterBuffer.writeBigUInt64BE(BigInt(counter));
-  
-  // Generate HMAC-SHA1
-  const hmac = createHmac('sha1', key);
-  hmac.update(counterBuffer);
-  const hash = hmac.digest();
-  
-  // Dynamic truncation
-  const offset = hash[hash.length - 1] & 0x0f;
-  const code = (
-    ((hash[offset] & 0x7f) << 24) |
-    ((hash[offset + 1] & 0xff) << 16) |
-    ((hash[offset + 2] & 0xff) << 8) |
-    (hash[offset + 3] & 0xff)
-  );
-  
-  // Generate digits
-  const otp = (code % Math.pow(10, digits)).toString().padStart(digits, '0');
-  return otp;
-}
-
-// TOTP implementation (RFC 6238)
-function generateTOTP(secret, timeStep = 30, digits = 6) {
-  const currentTime = Math.floor(Date.now() / 1000);
-  const counter = Math.floor(currentTime / timeStep);
-  return generateHOTP(secret, counter, digits);
-}
-
-function main() {
-  const args = parseArgs();
-  
-  if (args.help) {
-    showHelp();
-    return;
-  }
-  
-  if (!args.secret) {
-    console.error('Error: --secret parameter is required');
-    console.error('Use --help for usage information');
-    process.exit(1);
-  }
-  
-  try {
-    const totpCode = generateTOTP(args.secret);
-    console.log(totpCode);
-  } catch (error) {
-    console.error(`Error: ${error.message}`);
-    process.exit(1);
-  }
-}
-
-main();
@@ -0,0 +1,254 @@
+{
+  "name": "@shannon/mcp-server",
+  "version": "1.0.0",
+  "lockfileVersion": 3,
+  "requires": true,
+  "packages": {
+    "": {
+      "name": "@shannon/mcp-server",
+      "version": "1.0.0",
+      "dependencies": {
+        "@anthropic-ai/claude-agent-sdk": "^0.1.0",
+        "zod": "^3.22.4"
+      }
+    },
+    "node_modules/@anthropic-ai/claude-agent-sdk": {
+      "version": "0.1.25",
+      "resolved": "https://registry.npmjs.org/@anthropic-ai/claude-agent-sdk/-/claude-agent-sdk-0.1.25.tgz",
+      "integrity": "sha512-qwuydYaA3uamz4ivDzYXfL2PBjGwc0+beeIyo3nvtZQOtFLjH7xPdBK2w3+9KnB3L6V7VooAMdTXPpQyxCwcOg==",
+      "license": "SEE LICENSE IN README.md",
+      "engines": {
+        "node": ">=18.0.0"
+      },
+      "optionalDependencies": {
+        "@img/sharp-darwin-arm64": "^0.33.5",
+        "@img/sharp-darwin-x64": "^0.33.5",
+        "@img/sharp-linux-arm": "^0.33.5",
+        "@img/sharp-linux-arm64": "^0.33.5",
+        "@img/sharp-linux-x64": "^0.33.5",
+        "@img/sharp-win32-x64": "^0.33.5"
+      },
+      "peerDependencies": {
+        "zod": "^3.24.1"
+      }
+    },
+    "node_modules/@img/sharp-darwin-arm64": {
+      "version": "0.33.5",
+      "resolved": "https://registry.npmjs.org/@img/sharp-darwin-arm64/-/sharp-darwin-arm64-0.33.5.tgz",
+      "integrity": "sha512-UT4p+iz/2H4twwAoLCqfA9UH5pI6DggwKEGuaPy7nCVQ8ZsiY5PIcrRvD1DzuY3qYL07NtIQcWnBSY/heikIFQ==",
+      "cpu": [
+        "arm64"
+      ],
+      "license": "Apache-2.0",
+      "optional": true,
+      "os": [
+        "darwin"
+      ],
+      "engines": {
+        "node": "^18.17.0 || ^20.3.0 || >=21.0.0"
+      },
+      "funding": {
+        "url": "https://opencollective.com/libvips"
+      },
+      "optionalDependencies": {
+        "@img/sharp-libvips-darwin-arm64": "1.0.4"
+      }
+    },
+    "node_modules/@img/sharp-darwin-x64": {
+      "version": "0.33.5",
+      "resolved": "https://registry.npmjs.org/@img/sharp-darwin-x64/-/sharp-darwin-x64-0.33.5.tgz",
+      "integrity": "sha512-fyHac4jIc1ANYGRDxtiqelIbdWkIuQaI84Mv45KvGRRxSAa7o7d1ZKAOBaYbnepLC1WqxfpimdeWfvqqSGwR2Q==",
+      "cpu": [
+        "x64"
+      ],
+      "license": "Apache-2.0",
+      "optional": true,
+      "os": [
+        "darwin"
+      ],
+      "engines": {
+        "node": "^18.17.0 || ^20.3.0 || >=21.0.0"
+      },
+      "funding": {
+        "url": "https://opencollective.com/libvips"
+      },
+      "optionalDependencies": {
+        "@img/sharp-libvips-darwin-x64": "1.0.4"
+      }
+    },
+    "node_modules/@img/sharp-libvips-darwin-arm64": {
+      "version": "1.0.4",
+      "resolved": "https://registry.npmjs.org/@img/sharp-libvips-darwin-arm64/-/sharp-libvips-darwin-arm64-1.0.4.tgz",
+      "integrity": "sha512-XblONe153h0O2zuFfTAbQYAX2JhYmDHeWikp1LM9Hul9gVPjFY427k6dFEcOL72O01QxQsWi761svJ/ev9xEDg==",
+      "cpu": [
+        "arm64"
+      ],
+      "license": "LGPL-3.0-or-later",
+      "optional": true,
+      "os": [
+        "darwin"
+      ],
+      "funding": {
+        "url": "https://opencollective.com/libvips"
+      }
+    },
+    "node_modules/@img/sharp-libvips-darwin-x64": {
+      "version": "1.0.4",
+      "resolved": "https://registry.npmjs.org/@img/sharp-libvips-darwin-x64/-/sharp-libvips-darwin-x64-1.0.4.tgz",
+      "integrity": "sha512-xnGR8YuZYfJGmWPvmlunFaWJsb9T/AO2ykoP3Fz/0X5XV2aoYBPkX6xqCQvUTKKiLddarLaxpzNe+b1hjeWHAQ==",
+      "cpu": [
+        "x64"
+      ],
+      "license": "LGPL-3.0-or-later",
+      "optional": true,
+      "os": [
+        "darwin"
+      ],
+      "funding": {
+        "url": "https://opencollective.com/libvips"
+      }
+    },
+    "node_modules/@img/sharp-libvips-linux-arm": {
+      "version": "1.0.5",
+      "resolved": "https://registry.npmjs.org/@img/sharp-libvips-linux-arm/-/sharp-libvips-linux-arm-1.0.5.tgz",
+      "integrity": "sha512-gvcC4ACAOPRNATg/ov8/MnbxFDJqf/pDePbBnuBDcjsI8PssmjoKMAz4LtLaVi+OnSb5FK/yIOamqDwGmXW32g==",
+      "cpu": [
+        "arm"
+      ],
+      "license": "LGPL-3.0-or-later",
+      "optional": true,
+      "os": [
+        "linux"
+      ],
+      "funding": {
+        "url": "https://opencollective.com/libvips"
+      }
+    },
+    "node_modules/@img/sharp-libvips-linux-arm64": {
+      "version": "1.0.4",
+      "resolved": "https://registry.npmjs.org/@img/sharp-libvips-linux-arm64/-/sharp-libvips-linux-arm64-1.0.4.tgz",
+      "integrity": "sha512-9B+taZ8DlyyqzZQnoeIvDVR/2F4EbMepXMc/NdVbkzsJbzkUjhXv/70GQJ7tdLA4YJgNP25zukcxpX2/SueNrA==",
+      "cpu": [
+        "arm64"
+      ],
+      "license": "LGPL-3.0-or-later",
+      "optional": true,
+      "os": [
+        "linux"
+      ],
+      "funding": {
+        "url": "https://opencollective.com/libvips"
+      }
+    },
+    "node_modules/@img/sharp-libvips-linux-x64": {
+      "version": "1.0.4",
+      "resolved": "https://registry.npmjs.org/@img/sharp-libvips-linux-x64/-/sharp-libvips-linux-x64-1.0.4.tgz",
+      "integrity": "sha512-MmWmQ3iPFZr0Iev+BAgVMb3ZyC4KeFc3jFxnNbEPas60e1cIfevbtuyf9nDGIzOaW9PdnDciJm+wFFaTlj5xYw==",
+      "cpu": [
+        "x64"
+      ],
+      "license": "LGPL-3.0-or-later",
+      "optional": true,
+      "os": [
+        "linux"
+      ],
+      "funding": {
+        "url": "https://opencollective.com/libvips"
+      }
+    },
+    "node_modules/@img/sharp-linux-arm": {
+      "version": "0.33.5",
+      "resolved": "https://registry.npmjs.org/@img/sharp-linux-arm/-/sharp-linux-arm-0.33.5.tgz",
+      "integrity": "sha512-JTS1eldqZbJxjvKaAkxhZmBqPRGmxgu+qFKSInv8moZ2AmT5Yib3EQ1c6gp493HvrvV8QgdOXdyaIBrhvFhBMQ==",
+      "cpu": [
+        "arm"
+      ],
+      "license": "Apache-2.0",
+      "optional": true,
+      "os": [
+        "linux"
+      ],
+      "engines": {
+        "node": "^18.17.0 || ^20.3.0 || >=21.0.0"
+      },
+      "funding": {
+        "url": "https://opencollective.com/libvips"
+      },
+      "optionalDependencies": {
+        "@img/sharp-libvips-linux-arm": "1.0.5"
+      }
+    },
+    "node_modules/@img/sharp-linux-arm64": {
+      "version": "0.33.5",
+      "resolved": "https://registry.npmjs.org/@img/sharp-linux-arm64/-/sharp-linux-arm64-0.33.5.tgz",
+      "integrity": "sha512-JMVv+AMRyGOHtO1RFBiJy/MBsgz0x4AWrT6QoEVVTyh1E39TrCUpTRI7mx9VksGX4awWASxqCYLCV4wBZHAYxA==",
+      "cpu": [
+        "arm64"
+      ],
+      "license": "Apache-2.0",
+      "optional": true,
+      "os": [
+        "linux"
+      ],
+      "engines": {
+        "node": "^18.17.0 || ^20.3.0 || >=21.0.0"
+      },
+      "funding": {
+        "url": "https://opencollective.com/libvips"
+      },
+      "optionalDependencies": {
+        "@img/sharp-libvips-linux-arm64": "1.0.4"
+      }
+    },
+    "node_modules/@img/sharp-linux-x64": {
+      "version": "0.33.5",
+      "resolved": "https://registry.npmjs.org/@img/sharp-linux-x64/-/sharp-linux-x64-0.33.5.tgz",
+      "integrity": "sha512-opC+Ok5pRNAzuvq1AG0ar+1owsu842/Ab+4qvU879ippJBHvyY5n2mxF1izXqkPYlGuP/M556uh53jRLJmzTWA==",
+      "cpu": [
+        "x64"
+      ],
+      "license": "Apache-2.0",
+      "optional": true,
+      "os": [
+        "linux"
+      ],
+      "engines": {
+        "node": "^18.17.0 || ^20.3.0 || >=21.0.0"
+      },
+      "funding": {
+        "url": "https://opencollective.com/libvips"
+      },
+      "optionalDependencies": {
+        "@img/sharp-libvips-linux-x64": "1.0.4"
+      }
+    },
+    "node_modules/@img/sharp-win32-x64": {
+      "version": "0.33.5",
+      "resolved": "https://registry.npmjs.org/@img/sharp-win32-x64/-/sharp-win32-x64-0.33.5.tgz",
+      "integrity": "sha512-MpY/o8/8kj+EcnxwvrP4aTJSWw/aZ7JIGR4aBeZkZw5B7/Jn+tY9/VNwtcoGmdT7GfggGIU4kygOMSbYnOrAbg==",
+      "cpu": [
+        "x64"
+      ],
+      "license": "Apache-2.0 AND LGPL-3.0-or-later",
+      "optional": true,
+      "os": [
+        "win32"
+      ],
+      "engines": {
+        "node": "^18.17.0 || ^20.3.0 || >=21.0.0"
+      },
+      "funding": {
+        "url": "https://opencollective.com/libvips"
+      }
+    },
+    "node_modules/zod": {
+      "version": "3.25.76",
+      "resolved": "https://registry.npmjs.org/zod/-/zod-3.25.76.tgz",
+      "integrity": "sha512-gzUt/qt81nXsFGKIFcC3YnfEAx5NkunCfnDlvuBSSFS02bcXu4Lmea0AFIUwbLWxWPx3d9p8S5QoaujKcNQxcQ==",
+      "license": "MIT",
+      "funding": {
+        "url": "https://github.com/sponsors/colinhacks"
+      }
+    }
+  }
+}
@@ -0,0 +1,13 @@
+{
+  "name": "@shannon/mcp-server",
+  "version": "1.0.0",
+  "type": "module",
+  "main": "./src/index.js",
+  "scripts": {
+    "clean": "rm -rf dist"
+  },
+  "dependencies": {
+    "@anthropic-ai/claude-agent-sdk": "^0.1.0",
+    "zod": "^3.22.4"
+  }
+}
@@ -0,0 +1,35 @@
+/**
+ * Shannon Helper MCP Server
+ *
+ * In-process MCP server providing save_deliverable and generate_totp tools
+ * for Shannon penetration testing agents.
+ *
+ * Replaces bash script invocations with native tool access.
+ */
+
+import { createSdkMcpServer } from '@anthropic-ai/claude-agent-sdk';
+import { saveDeliverableTool } from './tools/save-deliverable.js';
+import { generateTotpTool } from './tools/generate-totp.js';
+
+/**
+ * Create Shannon Helper MCP Server with target directory context
+ *
+ * @param {string} targetDir - The target repository directory where deliverables should be saved
+ * @returns {Object} MCP server instance
+ */
+export function createShannonHelperServer(targetDir) {
+  // Store the target directory in a process-wide global so tool handlers can read it
+  // (shared by every server created in this process)
+  global.__SHANNON_TARGET_DIR = targetDir;
+
+  return createSdkMcpServer({
+    name: 'shannon-helper',
+    version: '1.0.0',
+    tools: [saveDeliverableTool, generateTotpTool],
+  });
+}
+
+// Export tools for direct usage if needed
+export { saveDeliverableTool, generateTotpTool };
+
+// Export types for external use
+export * from './types/index.js';
@@ -0,0 +1,137 @@
+/**
+ * generate_totp MCP Tool
+ *
+ * Generates 6-digit TOTP codes for authentication.
+ * Replaces the tools/generate-totp-standalone.mjs script that agents previously invoked through bash.
+ * Based on RFC 6238 (TOTP) and RFC 4226 (HOTP).
+ */
+
+import { tool } from '@anthropic-ai/claude-agent-sdk';
+import { createHmac } from 'crypto';
+import { z } from 'zod';
+import { createToolResult } from '../types/tool-responses.js';
+import { base32Decode, validateTotpSecret } from '../validation/totp-validator.js';
+import { createCryptoError, createGenericError } from '../utils/error-formatter.js';
+
+/**
+ * Input schema for generate_totp tool
+ */
+export const GenerateTotpInputSchema = z.object({
+  secret: z
+    .string()
+    .min(1)
+    .regex(/^[A-Z2-7]+$/i, 'Must be base32-encoded')
+    .describe('Base32-encoded TOTP secret'),
+});
+
+/**
+ * Generate HOTP code (RFC 4226)
+ * Ported from generate-totp-standalone.mjs (lines 74-99)
+ *
+ * @param {string} secret - Base32-encoded secret
+ * @param {number} counter - Counter value
+ * @param {number} [digits=6] - Number of digits in OTP
+ * @returns {string} OTP code
+ */
+function generateHOTP(secret, counter, digits = 6) {
+  const key = base32Decode(secret);
+
+  // Convert counter to 8-byte buffer (big-endian)
+  const counterBuffer = Buffer.alloc(8);
+  counterBuffer.writeBigUInt64BE(BigInt(counter));
+
+  // Generate HMAC-SHA1
+  const hmac = createHmac('sha1', key);
+  hmac.update(counterBuffer);
+  const hash = hmac.digest();
+
+  // Dynamic truncation
+  const offset = hash[hash.length - 1] & 0x0f;
+  const code =
+    ((hash[offset] & 0x7f) << 24) |
+    ((hash[offset + 1] & 0xff) << 16) |
+    ((hash[offset + 2] & 0xff) << 8) |
+    (hash[offset + 3] & 0xff);
+
+  // Generate digits
+  const otp = (code % Math.pow(10, digits)).toString().padStart(digits, '0');
+  return otp;
+}
+
+/**
+ * Generate TOTP code (RFC 6238)
+ * Ported from generate-totp-standalone.mjs (lines 101-106)
+ *
+ * @param {string} secret - Base32-encoded secret
+ * @param {number} [timeStep=30] - Time step in seconds
+ * @param {number} [digits=6] - Number of digits in OTP
+ * @returns {string} OTP code
+ */
+function generateTOTP(secret, timeStep = 30, digits = 6) {
+  const currentTime = Math.floor(Date.now() / 1000);
+  const counter = Math.floor(currentTime / timeStep);
+  return generateHOTP(secret, counter, digits);
+}
+
+/**
+ * Get seconds until TOTP code expires
+ *
+ * @param {number} [timeStep=30] - Time step in seconds
+ * @returns {number} Seconds until expiration
+ */
+function getSecondsUntilExpiration(timeStep = 30) {
+  const currentTime = Math.floor(Date.now() / 1000);
+  return timeStep - (currentTime % timeStep);
+}
+
+/**
+ * generate_totp tool implementation
+ *
+ * @param {Object} args
+ * @param {string} args.secret - Base32-encoded TOTP secret
+ * @returns {Promise<Object>} Tool result
+ */
+export async function generateTotp(args) {
+  try {
+    const { secret } = args;
+
+    // Validate secret (throws on error)
+    validateTotpSecret(secret);
+
+    // Generate TOTP code
+    const totpCode = generateTOTP(secret);
+    const expiresIn = getSecondsUntilExpiration();
+    const timestamp = new Date().toISOString();
+
+    // Success response
+    const successResponse = {
+      status: 'success',
+      message: 'TOTP code generated successfully',
+      totpCode,
+      timestamp,
+      expiresIn,
+    };
+
+    return createToolResult(successResponse);
+  } catch (error) {
+    // Check if it's a validation/crypto error
+    if (error instanceof Error && (error.message.includes('base32') || error.message.includes('TOTP'))) {
+      const errorResponse = createCryptoError(error.message, false);
+      return createToolResult(errorResponse);
+    }
+
+    // Generic error
+    const errorResponse = createGenericError(error, false);
+    return createToolResult(errorResponse);
+  }
+}
+
+/**
+ * Tool definition for MCP server - created using SDK's tool() function
+ */
+export const generateTotpTool = tool(
+  'generate_totp',
+  'Generates 6-digit TOTP code for authentication. Secret must be base32-encoded.',
+  GenerateTotpInputSchema.shape,
+  generateTotp
+);
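Because `generateTOTP` reads `Date.now()` directly, one way to check an implementation like this against the published RFC test vectors is to make the clock a parameter. The sketch below is self-contained and uses its own `base32Decode` (the real one lives in `../validation/totp-validator.js`, which is not shown here); `totpAt` and its `nowMs` parameter are hypothetical names.

```javascript
import { createHmac } from 'node:crypto';

const BASE32_ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ234567';

// Decode an unpadded base32 string (RFC 4648 alphabet) into a Buffer.
function base32Decode(encoded) {
  const bytes = [];
  let bits = 0;
  let value = 0;
  for (const char of encoded.toUpperCase()) {
    const index = BASE32_ALPHABET.indexOf(char);
    if (index === -1) throw new Error(`Invalid base32 character: ${char}`);
    value = (value << 5) | index;
    bits += 5;
    if (bits >= 8) {
      bits -= 8;
      bytes.push((value >>> bits) & 0xff);
      value &= (1 << bits) - 1; // keep only the bits not yet emitted
    }
  }
  return Buffer.from(bytes);
}

// Hypothetical variant of generateTOTP with an injectable clock (milliseconds since epoch).
function totpAt(secret, nowMs, timeStep = 30, digits = 6) {
  const counter = Math.floor(nowMs / 1000 / timeStep);
  const counterBuffer = Buffer.alloc(8);
  counterBuffer.writeBigUInt64BE(BigInt(counter));

  const hash = createHmac('sha1', base32Decode(secret)).update(counterBuffer).digest();

  // Dynamic truncation (RFC 4226, section 5.3)
  const offset = hash[hash.length - 1] & 0x0f;
  const code = hash.readUInt32BE(offset) & 0x7fffffff;
  return (code % 10 ** digits).toString().padStart(digits, '0');
}

// RFC 6238 Appendix B uses the ASCII secret "12345678901234567890" (base32 below).
const rfcSecret = 'GEZDGNBVGY3TQOJQGEZDGNBVGY3TQOJQ';
console.log(totpAt(rfcSecret, 59_000));                   // 287082
console.log(totpAt(rfcSecret, 1_111_111_109_000, 30, 8)); // 07081804
```

The second vector begins with a zero, so it also exercises the `padStart` step.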
@@ -0,0 +1,85 @@
+/**
+ * save_deliverable MCP Tool
+ *
+ * Saves deliverable files with automatic validation.
+ * Replaces the tools/save_deliverable.js script that agents previously invoked through bash.
+ */
+
+import { tool } from '@anthropic-ai/claude-agent-sdk';
+import { z } from 'zod';
+import { DeliverableType, DELIVERABLE_FILENAMES, isQueueType } from '../types/deliverables.js';
+import { createToolResult } from '../types/tool-responses.js';
+import { validateQueueJson } from '../validation/queue-validator.js';
+import { saveDeliverableFile } from '../utils/file-operations.js';
+import { createValidationError, createGenericError } from '../utils/error-formatter.js';
+
+/**
+ * Input schema for save_deliverable tool
+ */
+export const SaveDeliverableInputSchema = z.object({
+  deliverable_type: z.nativeEnum(DeliverableType).describe('Type of deliverable to save'),
+  content: z.string().min(1).describe('File content (markdown for analysis/evidence, JSON for queues)'),
+});
+
+/**
+ * save_deliverable tool implementation
+ *
+ * @param {Object} args
+ * @param {string} args.deliverable_type - Type of deliverable to save
+ * @param {string} args.content - File content
+ * @returns {Promise<Object>} Tool result
+ */
+export async function saveDeliverable(args) {
+  try {
+    const { deliverable_type, content } = args;
+
+    // Validate queue JSON if applicable
+    if (isQueueType(deliverable_type)) {
+      const queueValidation = validateQueueJson(content);
+      if (!queueValidation.valid) {
+        const errorResponse = createValidationError(
+          queueValidation.message,
+          true,
+          {
+            deliverableType: deliverable_type,
+            expectedFormat: '{"vulnerabilities": [...]}',
+          }
+        );
+        return createToolResult(errorResponse);
+      }
+    }
+
+    // Get filename and save file
+    const filename = DELIVERABLE_FILENAMES[deliverable_type];
+    const filepath = saveDeliverableFile(filename, content);
+
+    // Success response
+    const successResponse = {
+      status: 'success',
+      message: `Deliverable saved successfully: ${filename}`,
+      filepath,
+      deliverableType: deliverable_type,
+      validated: isQueueType(deliverable_type),
+    };
+
+    return createToolResult(successResponse);
+  } catch (error) {
+    const errorResponse = createGenericError(
+      error,
+      false,
+      { deliverableType: args.deliverable_type }
+    );
+
+    return createToolResult(errorResponse);
+  }
+}
+
+/**
+ * Tool definition for MCP server - created using SDK's tool() function
+ */
+export const saveDeliverableTool = tool(
+  'save_deliverable',
+  'Saves deliverable files with automatic validation. Queue files must have {"vulnerabilities": [...]} structure.',
+  SaveDeliverableInputSchema.shape,
+  saveDeliverable
+);
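`validateQueueJson` is imported from `../validation/queue-validator.js`, which this part of the diff does not show. Going only by how its result is used above (`valid`, `message`) and the expected format `{"vulnerabilities": [...]}`, a validator of that shape might look roughly like the sketch below. The name `validateQueueJsonSketch` and the exact messages are assumptions.

```javascript
// Hypothetical validator returning { valid, message }, mirroring how
// `queueValidation` is consumed by saveDeliverable.
function validateQueueJsonSketch(content) {
  let parsed;
  try {
    parsed = JSON.parse(content);
  } catch (error) {
    return { valid: false, message: `Invalid JSON: ${error.message}` };
  }

  // The queue must be an object carrying a "vulnerabilities" array.
  if (parsed === null || typeof parsed !== 'object' || !Array.isArray(parsed.vulnerabilities)) {
    return { valid: false, message: 'Queue must be an object with a "vulnerabilities" array' };
  }

  return { valid: true, message: 'OK' };
}

console.log(validateQueueJsonSketch('{"vulnerabilities": []}').valid); // true
console.log(validateQueueJsonSketch('{"findings": []}').valid);        // false
console.log(validateQueueJsonSketch('not json').valid);                // false
```

Returning a result object instead of throwing matches the tool's pattern of reporting failures to the agent as structured, retryable errors.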
@@ -0,0 +1,107 @@
+/**
+ * Deliverable Type Definitions
+ *
+ * Maps deliverable types to their filenames and defines validation requirements.
+ * Must match the exact mappings from tools/save_deliverable.js.
+ */
+
+/**
+ * @typedef {Object} DeliverableType
+ * @property {string} CODE_ANALYSIS
+ * @property {string} RECON
+ * @property {string} INJECTION_ANALYSIS
+ * @property {string} INJECTION_QUEUE
+ * @property {string} XSS_ANALYSIS
+ * @property {string} XSS_QUEUE
+ * @property {string} AUTH_ANALYSIS
+ * @property {string} AUTH_QUEUE
+ * @property {string} AUTHZ_ANALYSIS
+ * @property {string} AUTHZ_QUEUE
+ * @property {string} SSRF_ANALYSIS
+ * @property {string} SSRF_QUEUE
+ * @property {string} INJECTION_EVIDENCE
+ * @property {string} XSS_EVIDENCE
+ * @property {string} AUTH_EVIDENCE
+ * @property {string} AUTHZ_EVIDENCE
+ * @property {string} SSRF_EVIDENCE
+ */
+
+export const DeliverableType = {
+  // Pre-recon agent
+  CODE_ANALYSIS: 'CODE_ANALYSIS',
+
+  // Recon agent
+  RECON: 'RECON',
+
+  // Vulnerability analysis agents
+  INJECTION_ANALYSIS: 'INJECTION_ANALYSIS',
+  INJECTION_QUEUE: 'INJECTION_QUEUE',
+
+  XSS_ANALYSIS: 'XSS_ANALYSIS',
+  XSS_QUEUE: 'XSS_QUEUE',
+
+  AUTH_ANALYSIS: 'AUTH_ANALYSIS',
+  AUTH_QUEUE: 'AUTH_QUEUE',
+
+  AUTHZ_ANALYSIS: 'AUTHZ_ANALYSIS',
+  AUTHZ_QUEUE: 'AUTHZ_QUEUE',
+
+  SSRF_ANALYSIS: 'SSRF_ANALYSIS',
+  SSRF_QUEUE: 'SSRF_QUEUE',
+
+  // Exploitation agents
+  INJECTION_EVIDENCE: 'INJECTION_EVIDENCE',
+  XSS_EVIDENCE: 'XSS_EVIDENCE',
+  AUTH_EVIDENCE: 'AUTH_EVIDENCE',
+  AUTHZ_EVIDENCE: 'AUTHZ_EVIDENCE',
+  SSRF_EVIDENCE: 'SSRF_EVIDENCE',
+};
+
+/**
+ * Hard-coded filename mappings from agent prompts
+ * Must match tools/save_deliverable.js exactly
+ */
+export const DELIVERABLE_FILENAMES = {
+  [DeliverableType.CODE_ANALYSIS]: 'code_analysis_deliverable.md',
+  [DeliverableType.RECON]: 'recon_deliverable.md',
+  [DeliverableType.INJECTION_ANALYSIS]: 'injection_analysis_deliverable.md',
+  [DeliverableType.INJECTION_QUEUE]: 'injection_exploitation_queue.json',
+  [DeliverableType.XSS_ANALYSIS]: 'xss_analysis_deliverable.md',
+  [DeliverableType.XSS_QUEUE]: 'xss_exploitation_queue.json',
+  [DeliverableType.AUTH_ANALYSIS]: 'auth_analysis_deliverable.md',
+  [DeliverableType.AUTH_QUEUE]: 'auth_exploitation_queue.json',
+  [DeliverableType.AUTHZ_ANALYSIS]: 'authz_analysis_deliverable.md',
+  [DeliverableType.AUTHZ_QUEUE]: 'authz_exploitation_queue.json',
+  [DeliverableType.SSRF_ANALYSIS]: 'ssrf_analysis_deliverable.md',
+  [DeliverableType.SSRF_QUEUE]: 'ssrf_exploitation_queue.json',
+  [DeliverableType.INJECTION_EVIDENCE]: 'injection_exploitation_evidence.md',
+  [DeliverableType.XSS_EVIDENCE]: 'xss_exploitation_evidence.md',
+  [DeliverableType.AUTH_EVIDENCE]: 'auth_exploitation_evidence.md',
+  [DeliverableType.AUTHZ_EVIDENCE]: 'authz_exploitation_evidence.md',
+  [DeliverableType.SSRF_EVIDENCE]: 'ssrf_exploitation_evidence.md',
+};
+
+/**
+ * Queue types that require JSON validation
+ */
+export const QUEUE_TYPES = [
+  DeliverableType.INJECTION_QUEUE,
+  DeliverableType.XSS_QUEUE,
+  DeliverableType.AUTH_QUEUE,
+  DeliverableType.AUTHZ_QUEUE,
+  DeliverableType.SSRF_QUEUE,
+];
+
+/**
+ * Type guard to check if a deliverable type is a queue
+ * @param {string} type - Deliverable type to check
+ * @returns {boolean} True if the type is a queue type
+ */
+export function isQueueType(type) {
+  return QUEUE_TYPES.includes(type);
+}
+
+/**
+ * @typedef {Object} VulnerabilityQueue
+ * @property {Array<Object>} vulnerabilities - Array of vulnerability objects
+ */
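A minimal usage sketch of the lookup tables above, assuming a trimmed two-entry copy of `DELIVERABLE_FILENAMES` and a one-entry copy of `QUEUE_TYPES` (the module in the diff defines seventeen types). `describeDeliverable` is a hypothetical helper introduced only for illustration; it is not part of this change.

```javascript
// Trimmed copies of the tables from the diff, restated so the sketch runs standalone.
const DELIVERABLE_FILENAMES = {
  AUTH_QUEUE: 'auth_exploitation_queue.json',
  AUTH_EVIDENCE: 'auth_exploitation_evidence.md',
};
const QUEUE_TYPES = ['AUTH_QUEUE'];

// Same logic as isQueueType in the diff.
function isQueueType(type) {
  return QUEUE_TYPES.includes(type);
}

// Hypothetical helper: resolves a deliverable type to its filename and
// reports whether the queue JSON validation step would apply to it.
function describeDeliverable(type) {
  const filename = DELIVERABLE_FILENAMES[type];
  if (filename === undefined) {
    throw new Error(`Unknown deliverable type: ${type}`);
  }
  return { filename, needsValidation: isQueueType(type) };
}

const queueInfo = describeDeliverable('AUTH_QUEUE');
const evidenceInfo = describeDeliverable('AUTH_EVIDENCE');
```

Only the `*_QUEUE` types go through JSON validation; the analysis and evidence types are Markdown and are saved as given.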
@@ -0,0 +1,6 @@
+/**
+ * Type definitions barrel export
+ */
+
+export * from './deliverables.js';
+export * from './tool-responses.js';
@@ -0,0 +1,58 @@
+/**
+ * Tool Response Type Definitions
+ *
+ * Defines structured response formats for MCP tools to ensure
+ * consistent error handling and success reporting.
+ */
+
+/**
+ * @typedef {Object} ErrorResponse
+ * @property {'error'} status
+ * @property {string} message
+ * @property {string} errorType - ValidationError, FileSystemError, CryptoError, etc.
+ * @property {boolean} retryable
+ * @property {Record<string, unknown>} [context]
+ */
+
+/**
+ * @typedef {Object} SuccessResponse
+ * @property {'success'} status
+ * @property {string} message
+ */
+
+/**
+ * @typedef {Object} SaveDeliverableResponse
+ * @property {'success'} status
+ * @property {string} message
+ * @property {string} filepath
+ * @property {string} deliverableType
+ * @property {boolean} validated - true if queue JSON was validated
+ */
+
+/**
+ * @typedef {Object} GenerateTotpResponse
+ * @property {'success'} status
+ * @property {string} message
+ * @property {string} totpCode
+ * @property {string} timestamp
+ * @property {number} expiresIn - seconds until expiration
+ */
+
+/**
+ * Helper to create tool result from response
+ * MCP tools should return this format
+ *
+ * @param {ErrorResponse | SaveDeliverableResponse | GenerateTotpResponse} response
+ * @returns {{ content: Array<{ type: string; text: string }>; isError: boolean }}
+ */
+export function createToolResult(response) {
+  return {
+    content: [
+      {
+        type: 'text',
+        text: JSON.stringify(response, null, 2),
+      },
+    ],
+    isError: response.status === 'error',
+  };
+}
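A short sketch of how `createToolResult` wraps a response, restated locally without the `export` so it runs standalone. The two sample responses are made-up values that follow the `SuccessResponse` and `ErrorResponse` typedefs above.

```javascript
// Local restatement of createToolResult from the diff.
function createToolResult(response) {
  return {
    content: [{ type: 'text', text: JSON.stringify(response, null, 2) }],
    isError: response.status === 'error',
  };
}

// Illustrative responses only; the field values are invented.
const okResult = createToolResult({ status: 'success', message: 'saved' });
const failedResult = createToolResult({
  status: 'error',
  message: 'Invalid JSON',
  errorType: 'ValidationError',
  retryable: true,
});

// The structured response survives as JSON text inside the content array,
// so a caller can parse it back to inspect fields such as `retryable`.
const parsedFailure = JSON.parse(failedResult.content[0].text);
```

The `isError` flag is derived solely from `status`, so every error helper must set `status: 'error'` for the failure to be visible to the caller.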
@@ -0,0 +1,71 @@
+/**
+ * Error Formatting Utilities
+ *
+ * Helper functions for creating structured error responses.
+ */
+
+/**
+ * @typedef {Object} ErrorResponse
+ * @property {'error'} status
+ * @property {string} message
+ * @property {string} errorType
+ * @property {boolean} retryable
+ * @property {Record<string, unknown>} [context]
+ */
+
+/**
+ * Create a validation error response
+ *
+ * @param {string} message
+ * @param {boolean} [retryable=true]
+ * @param {Record<string, unknown>} [context]
+ * @returns {ErrorResponse}
+ */
+export function createValidationError(message, retryable = true, context) {
+  return {
+    status: 'error',
+    message,
+    errorType: 'ValidationError',
+    retryable,
+    context,
+  };
+}
+
+/**
+ * Create a crypto error response
+ *
+ * @param {string} message
+ * @param {boolean} [retryable=false]
+ * @param {Record<string, unknown>} [context]
+ * @returns {ErrorResponse}
+ */
+export function createCryptoError(message, retryable = false, context) {
+  return {
+    status: 'error',
+    message,
+    errorType: 'CryptoError',
+    retryable,
+    context,
+  };
+}
+
+/**
+ * Create a generic error response
+ *
+ * @param {unknown} error
+ * @param {boolean} [retryable=false]
+ * @param {Record<string, unknown>} [context]
+ * @returns {ErrorResponse}
+ */
+export function createGenericError(error, retryable = false, context) {
+  const message = error instanceof Error ? error.message : String(error);
+  const errorType = error instanceof Error ? error.constructor.name : 'UnknownError';
+
+  return {
+    status: 'error',
+    message,
+    errorType,
+    retryable,
+    context,
+  };
+}
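The following sketch illustrates how `createGenericError` derives `message` and `errorType` from whatever was thrown. The function is restated locally without `export`; the sample inputs are invented for illustration.

```javascript
// Local restatement of createGenericError from the diff.
function createGenericError(error, retryable = false, context) {
  const message = error instanceof Error ? error.message : String(error);
  const errorType = error instanceof Error ? error.constructor.name : 'UnknownError';
  return { status: 'error', message, errorType, retryable, context };
}

// An Error subclass keeps its class name as the errorType.
const fromTypeError = createGenericError(new TypeError('bad input'));

// A thrown non-Error value is stringified and labeled 'UnknownError'.
const fromString = createGenericError('disk full', true, { deliverableType: 'RECON' });
```

Because `errorType` comes from `error.constructor.name`, file system failures typically surface as plain `Error` rather than a dedicated `FileSystemError` unless the caller wraps them first.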
@@ -0,0 +1,35 @@
+/**
+ * File Operations Utilities
+ *
+ * Handles file system operations for deliverable saving.
+ * Ported from tools/save_deliverable.js (lines 117-130).
+ */
+
+import { writeFileSync, mkdirSync } from 'fs';
+import { join } from 'path';
+
+/**
+ * Save deliverable file to deliverables/ directory
+ *
+ * @param {string} filename - Name of the file to save
+ * @param {string} content - Content to write to the file
+ * @returns {string} Full path to the saved file
+ */
+export function saveDeliverableFile(filename, content) {
+  // Use target directory from global context (set by createShannonHelperServer)
+  const targetDir = global.__SHANNON_TARGET_DIR || process.cwd();
+  const deliverablesDir = join(targetDir, 'deliverables');
+  const filepath = join(deliverablesDir, filename);
+
+  // Ensure deliverables directory exists
+  try {
+    mkdirSync(deliverablesDir, { recursive: true });
+  } catch (error) {
+    // With recursive: true an existing directory is not an error; any other failure resurfaces at write time
+  }
+
+  // Write file in one synchronous call (note: writeFileSync is not crash-atomic)
+  writeFileSync(filepath, content, 'utf8');
+
+  return filepath;
+}
@@ -0,0 +1,51 @@
+/**
+ * Queue Validator
+ *
+ * Validates JSON structure for vulnerability queue files.
+ * Ported from tools/save_deliverable.js (lines 56-75).
+ */
+
+/**
+ * @typedef {Object} ValidationResult
+ * @property {boolean} valid
+ * @property {string} [message]
+ * @property {Object} [data]
+ */
+
+/**
+ * Validate JSON structure for queue files
+ * Queue files must have a 'vulnerabilities' array
+ *
+ * @param {string} content - JSON string to validate
+ * @returns {ValidationResult} ValidationResult with valid flag, optional error message, and parsed data
+ */
+export function validateQueueJson(content) {
+  try {
+    const parsed = JSON.parse(content);
+
+    // Queue files must have a 'vulnerabilities' array
+    if (!parsed?.vulnerabilities) {
+      return {
+        valid: false,
+        message: `Invalid queue structure: Missing 'vulnerabilities' property. Expected: {"vulnerabilities": [...]}`,
+      };
+    }
+
+    if (!Array.isArray(parsed.vulnerabilities)) {
+      return {
+        valid: false,
+        message: `Invalid queue structure: 'vulnerabilities' must be an array. Expected: {"vulnerabilities": [...]}`,
+      };
+    }
+
+    return {
+      valid: true,
+      data: parsed,
+    };
+  } catch (error) {
+    return {
+      valid: false,
+      message: `Invalid JSON: ${error instanceof Error ? error.message : String(error)}`,
+    };
+  }
+}
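A condensed sketch of the queue check, assuming a single shortened error message instead of the two distinct messages in the diff. It is a restatement for illustration, not the code added by this change, and it also rejects a top-level `null` explicitly.

```javascript
// Condensed restatement of the queue check; messages are shortened.
function validateQueueJson(content) {
  let parsed;
  try {
    parsed = JSON.parse(content);
  } catch (error) {
    const detail = error instanceof Error ? error.message : String(error);
    return { valid: false, message: `Invalid JSON: ${detail}` };
  }
  // A queue must be an object holding a 'vulnerabilities' array.
  if (parsed === null || typeof parsed !== 'object' || !Array.isArray(parsed.vulnerabilities)) {
    return { valid: false, message: 'Expected: {"vulnerabilities": [...]}' };
  }
  return { valid: true, data: parsed };
}

// Sample inputs (invented) covering each outcome.
const emptyQueue = validateQueueJson('{"vulnerabilities": []}');
const wrongKey = validateQueueJson('{"findings": []}');
const wrongType = validateQueueJson('{"vulnerabilities": {}}');
const notJson = validateQueueJson('not json');
const nullDoc = validateQueueJson('null');
```

An empty `vulnerabilities` array is valid, which lets an analysis agent report that it found nothing to exploit.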
@@ -0,0 +1,71 @@
+/**
+ * TOTP Validator
+ *
+ * Validates TOTP secrets and provides base32 decoding.
+ * Ported from tools/generate-totp-standalone.mjs (lines 43-72).
+ */
+
+/**
+ * Base32 decode function
+ * Ported from generate-totp-standalone.mjs
+ *
+ * @param {string} encoded - Base32 encoded string
+ * @returns {Buffer} Buffer containing decoded bytes
+ */
+export function base32Decode(encoded) {
+  const alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ234567';
+  const cleanInput = encoded.toUpperCase().replace(/[^A-Z2-7]/g, '');
+
+  if (cleanInput.length === 0) {
+    return Buffer.alloc(0);
+  }
+
+  const output = [];
+  let bits = 0;
+  let value = 0;
+
+  for (const char of cleanInput) {
+    const index = alphabet.indexOf(char);
+    if (index === -1) {
+      throw new Error(`Invalid base32 character: ${char}`);
+    }
+
+    value = (value << 5) | index;
+    bits += 5;
+
+    if (bits >= 8) {
+      output.push((value >>> (bits - 8)) & 255);
+      bits -= 8;
+    }
+  }
+
+  return Buffer.from(output);
+}
+
+/**
+ * Validate TOTP secret
+ * Must be base32-encoded string
+ *
+ * @param {string} secret - Secret to validate
+ * @returns {boolean} true if valid, throws Error if invalid
+ */
+export function validateTotpSecret(secret) {
+  if (!secret || secret.length === 0) {
+    throw new Error('TOTP secret cannot be empty');
+  }
+
+  // Check if it's valid base32 (only A-Z and 2-7, case-insensitive), ignoring whitespace and '=' padding
+  const base32Regex = /^[A-Z2-7]+$/i;
+  if (!base32Regex.test(secret.replace(/[\s=]/g, ''))) {
+    throw new Error('TOTP secret must be base32-encoded (characters A-Z and 2-7)');
+  }
+
+  // Try to decode to ensure it's valid
+  try {
+    base32Decode(secret);
+  } catch (error) {
+    throw new Error(`Invalid TOTP secret: ${error instanceof Error ? error.message : String(error)}`);
+  }
+
+  return true;
+}
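A worked example of the decoder, restated locally without `export`, using the RFC 4648 test vector in which `foobar` encodes to `MZXW6YTBOI======`. The variable names below the function are chosen for illustration.

```javascript
// Local restatement of base32Decode from the diff.
function base32Decode(encoded) {
  const alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ234567';
  // Uppercase, then drop padding, whitespace, and any other non-alphabet character.
  const cleanInput = encoded.toUpperCase().replace(/[^A-Z2-7]/g, '');
  if (cleanInput.length === 0) {
    return Buffer.alloc(0);
  }

  const output = [];
  let bits = 0;
  let value = 0;
  for (const char of cleanInput) {
    // Each character contributes 5 bits; emit a byte whenever 8 are buffered.
    value = (value << 5) | alphabet.indexOf(char);
    bits += 5;
    if (bits >= 8) {
      output.push((value >>> (bits - 8)) & 255);
      bits -= 8;
    }
  }
  return Buffer.from(output);
}

const decoded = base32Decode('MZXW6YTBOI======').toString('utf8');
const decodedSpaced = base32Decode('mzxw 6ytb oi').toString('utf8');
// Both decode to 'foobar': padding, spaces, and case are normalized away.
```

Because the decoder strips every character outside the alphabet before decoding, it never rejects input on its own; rejecting malformed secrets is left to `validateTotpSecret`.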
@@ -8,7 +8,7 @@
      "name": "shannon",
      "version": "1.0.0",
      "dependencies": {
-        "@anthropic-ai/claude-code": "^1.0.96",
+        "@anthropic-ai/claude-agent-sdk": "^0.1.0",
        "ajv": "^8.12.0",
        "ajv-formats": "^2.1.1",
        "boxen": "^8.0.1",
@@ -16,20 +16,18 @@
        "figlet": "^1.9.3",
        "gradient-string": "^3.0.0",
        "js-yaml": "^4.1.0",
+        "zod": "^3.22.4",
        "zx": "^8.0.0"
      },
      "bin": {
        "shannon": "shannon.mjs"
      }
    },
-    "node_modules/@anthropic-ai/claude-code": {
-      "version": "1.0.96",
-      "resolved": "https://registry.npmjs.org/@anthropic-ai/claude-code/-/claude-code-1.0.96.tgz",
-      "integrity": "sha512-xnxhYzuh6PYlMcw56REMQiGMW20WaLLOvG8L8TObq70zhNKs3dro7nhYwHRe1c2ubTr20oIJK0aSkyD2BpO8nA==",
+    "node_modules/@anthropic-ai/claude-agent-sdk": {
+      "version": "0.1.25",
+      "resolved": "https://registry.npmjs.org/@anthropic-ai/claude-agent-sdk/-/claude-agent-sdk-0.1.25.tgz",
+      "integrity": "sha512-qwuydYaA3uamz4ivDzYXfL2PBjGwc0+beeIyo3nvtZQOtFLjH7xPdBK2w3+9KnB3L6V7VooAMdTXPpQyxCwcOg==",
      "license": "SEE LICENSE IN README.md",
-      "bin": {
-        "claude": "cli.js"
-      },
      "engines": {
        "node": ">=18.0.0"
      },
@@ -40,6 +38,9 @@
        "@img/sharp-linux-arm64": "^0.33.5",
        "@img/sharp-linux-x64": "^0.33.5",
        "@img/sharp-win32-x64": "^0.33.5"
+      },
+      "peerDependencies": {
+        "zod": "^3.24.1"
      }
    },
    "node_modules/@img/sharp-darwin-arm64": {
@@ -64,6 +65,28 @@
        "@img/sharp-libvips-darwin-arm64": "1.0.4"
      }
    },
+    "node_modules/@img/sharp-darwin-x64": {
+      "version": "0.33.5",
+      "resolved": "https://registry.npmjs.org/@img/sharp-darwin-x64/-/sharp-darwin-x64-0.33.5.tgz",
+      "integrity": "sha512-fyHac4jIc1ANYGRDxtiqelIbdWkIuQaI84Mv45KvGRRxSAa7o7d1ZKAOBaYbnepLC1WqxfpimdeWfvqqSGwR2Q==",
+      "cpu": [
+        "x64"
+      ],
+      "license": "Apache-2.0",
+      "optional": true,
+      "os": [
+        "darwin"
+      ],
+      "engines": {
+        "node": "^18.17.0 || ^20.3.0 || >=21.0.0"
+      },
+      "funding": {
+        "url": "https://opencollective.com/libvips"
+      },
+      "optionalDependencies": {
+        "@img/sharp-libvips-darwin-x64": "1.0.4"
+      }
+    },
    "node_modules/@img/sharp-libvips-darwin-arm64": {
      "version": "1.0.4",
      "resolved": "https://registry.npmjs.org/@img/sharp-libvips-darwin-arm64/-/sharp-libvips-darwin-arm64-1.0.4.tgz",
@@ -80,6 +103,155 @@
        "url": "https://opencollective.com/libvips"
      }
    },
+    "node_modules/@img/sharp-libvips-darwin-x64": {
+      "version": "1.0.4",
+      "resolved": "https://registry.npmjs.org/@img/sharp-libvips-darwin-x64/-/sharp-libvips-darwin-x64-1.0.4.tgz",
+      "integrity": "sha512-xnGR8YuZYfJGmWPvmlunFaWJsb9T/AO2ykoP3Fz/0X5XV2aoYBPkX6xqCQvUTKKiLddarLaxpzNe+b1hjeWHAQ==",
+      "cpu": [
+        "x64"
+      ],
+      "license": "LGPL-3.0-or-later",
+      "optional": true,
+      "os": [
+        "darwin"
+      ],
+      "funding": {
+        "url": "https://opencollective.com/libvips"
+      }
+    },
+    "node_modules/@img/sharp-libvips-linux-arm": {
+      "version": "1.0.5",
+      "resolved": "https://registry.npmjs.org/@img/sharp-libvips-linux-arm/-/sharp-libvips-linux-arm-1.0.5.tgz",
+      "integrity": "sha512-gvcC4ACAOPRNATg/ov8/MnbxFDJqf/pDePbBnuBDcjsI8PssmjoKMAz4LtLaVi+OnSb5FK/yIOamqDwGmXW32g==",
+      "cpu": [
+        "arm"
+      ],
+      "license": "LGPL-3.0-or-later",
+      "optional": true,
+      "os": [
+        "linux"
+      ],
+      "funding": {
+        "url": "https://opencollective.com/libvips"
+      }
+    },
+    "node_modules/@img/sharp-libvips-linux-arm64": {
+      "version": "1.0.4",
+      "resolved": "https://registry.npmjs.org/@img/sharp-libvips-linux-arm64/-/sharp-libvips-linux-arm64-1.0.4.tgz",
+      "integrity": "sha512-9B+taZ8DlyyqzZQnoeIvDVR/2F4EbMepXMc/NdVbkzsJbzkUjhXv/70GQJ7tdLA4YJgNP25zukcxpX2/SueNrA==",
+      "cpu": [
+        "arm64"
+      ],
+      "license": "LGPL-3.0-or-later",
+      "optional": true,
+      "os": [
+        "linux"
+      ],
+      "funding": {
+        "url": "https://opencollective.com/libvips"
+      }
+    },
+    "node_modules/@img/sharp-libvips-linux-x64": {
+      "version": "1.0.4",
+      "resolved": "https://registry.npmjs.org/@img/sharp-libvips-linux-x64/-/sharp-libvips-linux-x64-1.0.4.tgz",
+      "integrity": "sha512-MmWmQ3iPFZr0Iev+BAgVMb3ZyC4KeFc3jFxnNbEPas60e1cIfevbtuyf9nDGIzOaW9PdnDciJm+wFFaTlj5xYw==",
+      "cpu": [
+        "x64"
+      ],
+      "license": "LGPL-3.0-or-later",
+      "optional": true,
+      "os": [
+        "linux"
+      ],
+      "funding": {
+        "url": "https://opencollective.com/libvips"
+      }
+    },
+    "node_modules/@img/sharp-linux-arm": {
+      "version": "0.33.5",
+      "resolved": "https://registry.npmjs.org/@img/sharp-linux-arm/-/sharp-linux-arm-0.33.5.tgz",
+      "integrity": "sha512-JTS1eldqZbJxjvKaAkxhZmBqPRGmxgu+qFKSInv8moZ2AmT5Yib3EQ1c6gp493HvrvV8QgdOXdyaIBrhvFhBMQ==",
+      "cpu": [
+        "arm"
+      ],
+      "license": "Apache-2.0",
+      "optional": true,
+      "os": [
+        "linux"
+      ],
+      "engines": {
+        "node": "^18.17.0 || ^20.3.0 || >=21.0.0"
+      },
+      "funding": {
+        "url": "https://opencollective.com/libvips"
+      },
+      "optionalDependencies": {
+        "@img/sharp-libvips-linux-arm": "1.0.5"
+      }
+    },
+    "node_modules/@img/sharp-linux-arm64": {
+      "version": "0.33.5",
+      "resolved": "https://registry.npmjs.org/@img/sharp-linux-arm64/-/sharp-linux-arm64-0.33.5.tgz",
+      "integrity": "sha512-JMVv+AMRyGOHtO1RFBiJy/MBsgz0x4AWrT6QoEVVTyh1E39TrCUpTRI7mx9VksGX4awWASxqCYLCV4wBZHAYxA==",
+      "cpu": [
+        "arm64"
+      ],
+      "license": "Apache-2.0",
+      "optional": true,
+      "os": [
+        "linux"
+      ],
+      "engines": {
+        "node": "^18.17.0 || ^20.3.0 || >=21.0.0"
+      },
+      "funding": {
+        "url": "https://opencollective.com/libvips"
+      },
+      "optionalDependencies": {
+        "@img/sharp-libvips-linux-arm64": "1.0.4"
+      }
+    },
+    "node_modules/@img/sharp-linux-x64": {
+      "version": "0.33.5",
+      "resolved": "https://registry.npmjs.org/@img/sharp-linux-x64/-/sharp-linux-x64-0.33.5.tgz",
+      "integrity": "sha512-opC+Ok5pRNAzuvq1AG0ar+1owsu842/Ab+4qvU879ippJBHvyY5n2mxF1izXqkPYlGuP/M556uh53jRLJmzTWA==",
+      "cpu": [
+        "x64"
+      ],
+      "license": "Apache-2.0",
+      "optional": true,
+      "os": [
+        "linux"
+      ],
+      "engines": {
+        "node": "^18.17.0 || ^20.3.0 || >=21.0.0"
+      },
+      "funding": {
+        "url": "https://opencollective.com/libvips"
+      },
+      "optionalDependencies": {
+        "@img/sharp-libvips-linux-x64": "1.0.4"
+      }
+    },
+    "node_modules/@img/sharp-win32-x64": {
+      "version": "0.33.5",
+      "resolved": "https://registry.npmjs.org/@img/sharp-win32-x64/-/sharp-win32-x64-0.33.5.tgz",
+      "integrity": "sha512-MpY/o8/8kj+EcnxwvrP4aTJSWw/aZ7JIGR4aBeZkZw5B7/Jn+tY9/VNwtcoGmdT7GfggGIU4kygOMSbYnOrAbg==",
+      "cpu": [
+        "x64"
+      ],
+      "license": "Apache-2.0 AND LGPL-3.0-or-later",
+      "optional": true,
+      "os": [
+        "win32"
+      ],
+      "engines": {
+        "node": "^18.17.0 || ^20.3.0 || >=21.0.0"
+      },
+      "funding": {
+        "url": "https://opencollective.com/libvips"
+      }
+    },
    "node_modules/@types/tinycolor2": {
      "version": "1.4.6",
      "resolved": "https://registry.npmjs.org/@types/tinycolor2/-/tinycolor2-1.4.6.tgz",
@@ -462,6 +634,15 @@
        "url": "https://github.com/chalk/wrap-ansi?sponsor=1"
      }
    },
+    "node_modules/zod": {
+      "version": "3.25.76",
+      "resolved": "https://registry.npmjs.org/zod/-/zod-3.25.76.tgz",
+      "integrity": "sha512-gzUt/qt81nXsFGKIFcC3YnfEAx5NkunCfnDlvuBSSFS02bcXu4Lmea0AFIUwbLWxWPx3d9p8S5QoaujKcNQxcQ==",
+      "license": "MIT",
+      "funding": {
+        "url": "https://github.com/sponsors/colinhacks"
+      }
+    },
    "node_modules/zx": {
      "version": "8.8.1",
      "resolved": "https://registry.npmjs.org/zx/-/zx-8.8.1.tgz",
@@ -7,7 +7,7 @@
    "start": "./shannon.mjs"
  },
  "dependencies": {
-    "@anthropic-ai/claude-code": "^1.0.96",
+    "@anthropic-ai/claude-agent-sdk": "^0.1.0",
    "ajv": "^8.12.0",
    "ajv-formats": "^2.1.1",
    "boxen": "^8.0.1",
@@ -15,6 +15,7 @@
    "figlet": "^1.9.3",
    "gradient-string": "^3.0.0",
    "js-yaml": "^4.1.0",
+    "zod": "^3.22.4",
    "zx": "^8.0.0"
  },
  "bin": {
@@ -8,20 +8,7 @@ Success criterion: A comprehensive evidence report detailing the outcome for **e
 </objective>

 <exploitation_scope>
-**MANDATORY EXPLOITATION CONSTRAINTS:**
- ALL exploitation attempts MUST be performed against {{WEB_URL}} from external network
- DO NOT attempt to exploit vulnerabilities that require:
-  - Internal network access (Tailscale, VPN, etc.)
-  - Direct server access
-  - Database console access
-
-**VERDICT CLASSIFICATIONS:**
- EXPLOITED: Successfully demonstrated impact via public interface
- BLOCKED_BY_SECURITY: Valid vulnerability but blocked by WAF/controls
- OUT_OF_SCOPE_INTERNAL: Requires internal access - DO NOT PURSUE
- FALSE_POSITIVE: Not actually vulnerable
-
-Only EXPLOITED and BLOCKED_BY_SECURITY should be reported as findings.
+@include(shared/_exploit-scope.txt)
 </exploitation_scope>

 <critical>
@@ -107,13 +94,11 @@ Chain exploits together: Use enumerated users in password attacks. Use created w
 If you only confirmed a vulnerability without attempting to weaponize it, you have not completed the exploitation task.

 <target>
-URL: {{WEB_URL}}
+@include(shared/_target.txt)
 </target>

 <rules>
-Rules to Avoid:
-{{RULES_AVOID}}
-
+@include(shared/_rules.txt)
 </rules>

 <login_instructions>
@@ -158,10 +143,15 @@ You are the **Identity Compromise Specialist** - proving tangible impact of brok

 <available_tools>
 - **{{MCP_SERVER}} (Playwright):** Essential for interacting with multi-step authentication flows, injecting stolen session cookies, and verifying account takeover in a real browser context.
- **Bash tool:** Crucial for crafting manual API requests with `curl` to replay tokens, test credential stuffing attacks, and probe for logical flaws.
+- **save_deliverable (MCP Tool):** Saves exploitation evidence files.
+  - **Parameters:**
+    - `deliverable_type`: "AUTH_EVIDENCE" (required)
+    - `content`: Your complete evidence report (markdown) (required)
+  - **Returns:** `{ status: "success", filepath: "..." }` on success or `{ status: "error", message: "..." }` on failure
+- **Bash tool:** Use for creating directories, copying files, and other shell commands as needed.
 - **Task Agent:** Mandatory coder-executor for brute force batches, credential stuffing, token replay automation, and any scripted workflow.
 - **TodoWrite tool:** Essential for managing your exploitation task list. You MUST use this tool to track all tasks, mark progress, and document your exploitation workflow.
- **Read / Write tools:** To document false positives in your tracking file at `workspace/auth_false_positives.md`.
+- **Read tool:** To read false positives from your tracking file at `workspace/auth_false_positives.md`.

 **CRITICAL TASK AGENT WORKFLOW:**
 - Delegate every custom script or multi-step automation to the Task Agent; do not hand-write scripts in Bash.
@@ -322,7 +312,7 @@ When your attack plan is fully explored (all checkboxes are marked `[x]`), your

 **1. Your Specialist Deliverable**

-First, synthesize all evidence into your specialist section at `deliverables/auth_exploitation_evidence.md`. This section will be combined with other specialist reports to form the complete security assessment.
+First, synthesize all evidence into your specialist section and save it using the `save_deliverable` MCP tool with `deliverable_type: "AUTH_EVIDENCE"`. This section will be combined with other specialist reports to form the complete security assessment.

 **Section Ordering & Format Requirements:**
 - Use the exact vulnerability IDs from `deliverables/auth_exploitation_queue.json`
@@ -415,8 +405,8 @@ If [blocker] were bypassed/removed:

 COMPLETION REQUIREMENTS (ALL must be satisfied):
 	1. Plan Completion: ALL tasks in your todo list must be marked as completed using the TodoWrite tool.
-	2. Deliverable Generation: The required file must be successfully created:
-	- deliverables/auth_exploitation_evidence.md (Created)
+	2. Deliverable Generation: The required deliverable must be successfully saved using the `save_deliverable` MCP tool:
+	  - Evidence report: Use `save_deliverable` MCP tool with `deliverable_type: "AUTH_EVIDENCE"` and your evidence report as `content`

 CRITICAL WARNING: Announcing completion before every item in deliverables/auth_exploitation_queue.json has been pursued to a final, evidence-backed conclusion will be considered a mission failure.

@@ -8,20 +8,7 @@ Success criterion: A comprehensive evidence report detailing the outcome for **e
 </objective>

 <exploitation_scope>
-**MANDATORY EXPLOITATION CONSTRAINTS:**
- ALL exploitation attempts MUST be performed against {{WEB_URL}} from external network
- DO NOT attempt to exploit vulnerabilities that require:
-  - Internal network access (Tailscale, VPN, etc.)
-  - Direct server access
-  - Database console access
-
-**VERDICT CLASSIFICATIONS:**
- EXPLOITED: Successfully demonstrated impact via public interface
- BLOCKED_BY_SECURITY: Valid vulnerability but blocked by WAF/controls
- OUT_OF_SCOPE_INTERNAL: Requires internal access - DO NOT PURSUE
- FALSE_POSITIVE: Not actually vulnerable
-
-Only EXPLOITED and BLOCKED_BY_SECURITY should be reported as findings.
+@include(shared/_exploit-scope.txt)
 </exploitation_scope>

 <critical>
@@ -94,13 +81,11 @@ After exhaustive bypass attempts, determine:
 </critical>

 <target>
-URL: {{WEB_URL}}
+@include(shared/_target.txt)
 </target>

 <rules>
-Rules to Avoid:
-{{RULES_AVOID}}
-
+@include(shared/_rules.txt)
 </rules>

 <login_instructions>
@@ -145,10 +130,15 @@ You are the **Privilege Escalation Specialist** - proving tangible impact of bro

 <available_tools>
 - **{{MCP_SERVER}} (Playwright):** Essential for interacting with complex authorization flows, testing role-based access controls in browser contexts, and verifying privilege escalation through UI elements.
- **Bash tool:** Crucial for crafting manual API requests with `curl` to test IDOR vulnerabilities, manipulate user parameters, and probe for access control bypasses.
+- **save_deliverable (MCP Tool):** Saves exploitation evidence files.
+  - **Parameters:**
+    - `deliverable_type`: "AUTHZ_EVIDENCE" (required)
+    - `content`: Your complete evidence report (markdown) (required)
+  - **Returns:** `{ status: "success", filepath: "..." }` on success or `{ status: "error", message: "..." }` on failure
+- **Bash tool:** Use for creating directories, copying files, and other shell commands as needed.
 - **Task Agent:** Mandatory coder-executor for IDOR sweeps, role escalation loops, and workflow bypass automation.
 - **TodoWrite tool:** Essential for managing your exploitation task list. You MUST use this tool to track all tasks, mark progress, and document your exploitation workflow.
- **Read / Write tools:** To document false positives in your tracking file at `workspace/authz_false_positives.md`.
+- **Read tool:** To read false positives from your tracking file at `workspace/authz_false_positives.md`.

 **CRITICAL TASK AGENT WORKFLOW:**
 - Delegate every multi-user iteration, role toggle test, or workflow automation script to the Task Agent—never handcraft these scripts yourself.
@@ -328,7 +318,7 @@ When your attack plan is fully explored (all checkboxes are marked `[x]`), your

 **1. Your Specialist Deliverable**

-First, synthesize all evidence into your specialist section at `deliverables/authz_exploitation_evidence.md`. This section will be combined with other specialist reports to form the complete security assessment and MUST use the following structure:
+First, synthesize all evidence into your specialist section and save it using the `save_deliverable` MCP tool with `deliverable_type: "AUTHZ_EVIDENCE"`. This section will be combined with other specialist reports to form the complete security assessment and MUST use the following structure:

 **Section Ordering & Format Requirements:**
 - Use the exact vulnerability IDs from `deliverables/authz_exploitation_queue.json`
@@ -417,8 +407,8 @@ If [blocker] were bypassed/removed:

 COMPLETION REQUIREMENTS (ALL must be satisfied):
 	1. Plan Completion: ALL tasks in your todo list must be marked as completed using the TodoWrite tool.
-	2. Deliverable Generation: The required file must be successfully created:
-	- deliverables/authz_exploitation_evidence.md (Created)
+	2. Deliverable Generation: The required deliverable must be successfully saved using the `save_deliverable` MCP tool:
+	  - Evidence report: Use `save_deliverable` MCP tool with `deliverable_type: "AUTHZ_EVIDENCE"` and your evidence report as `content`

 CRITICAL WARNING: Announcing completion before every item in deliverables/authz_exploitation_queue.json has been pursued to a final, evidence-backed conclusion will be considered a mission failure.

@@ -8,20 +8,7 @@ Success criterion: A comprehensive evidence report detailing the outcome for **e
 </objective>

 <exploitation_scope>
-**MANDATORY EXPLOITATION CONSTRAINTS:**
- ALL exploitation attempts MUST be performed against {{WEB_URL}} from external network
- DO NOT attempt to exploit vulnerabilities that require:
-  - Internal network access (Tailscale, VPN, etc.)
-  - Direct server access
-  - Database console access
-
-**VERDICT CLASSIFICATIONS:**
- EXPLOITED: Successfully demonstrated impact via public interface
- BLOCKED_BY_SECURITY: Valid vulnerability but blocked by WAF/controls
- OUT_OF_SCOPE_INTERNAL: Requires internal access - DO NOT PURSUE
- FALSE_POSITIVE: Not actually vulnerable
-
-Only EXPLOITED and BLOCKED_BY_SECURITY should be reported as findings.
+@include(shared/_exploit-scope.txt)
 </exploitation_scope>

 <critical>
@@ -71,12 +58,11 @@ Remember: An unproven vulnerability is worse than no finding at all - it wastes


 <target>
-URL: {{WEB_URL}}
+@include(shared/_target.txt)
 </target>

 <rules>
-Rules to Avoid:
-{{RULES_AVOID}}
+@include(shared/_rules.txt)
 </rules>

 <login_instructions>
@@ -126,10 +112,15 @@ You are the **Proof-of-Impact Generator** - converting theoretical injection fla

 <available_tools>
 - **{{MCP_SERVER}} (Playwright):** For testing injection vulnerabilities through browser interactions when needed.
- **Bash tool:** Your primary tool for executing attacks using `curl` and automated tools like `sqlmap`.
+- **save_deliverable (MCP Tool):** Saves exploitation evidence files.
+  - **Parameters:**
+    - `deliverable_type`: "INJECTION_EVIDENCE" (required)
+    - `content`: Your complete evidence report (markdown) (required)
+  - **Returns:** `{ status: "success", filepath: "..." }` on success or `{ status: "error", message: "..." }` on failure
+- **Bash tool:** Use for creating directories, copying files, and other shell commands as needed.
 - **Task Agent:** Mandatory coder-executor for any custom scripting beyond single ad-hoc commands.
 - **TodoWrite tool:** Essential for managing your exploitation task list. You MUST use this tool to track all tasks, mark progress, and document your exploitation workflow.
- **Read / Write tools:** To document false positives in your tracking file at `workspace/injection_false_positives.md`.
+- **Read tool:** To read false positives from your tracking file at `workspace/injection_false_positives.md`.

 **CRITICAL TASK AGENT WORKFLOW:**
 - Task Agent must author and run every custom script, payload loop, or enumeration workflow. Do not craft standalone scripts in Bash or other tools.
@@ -351,7 +342,7 @@ When your attack plan is fully explored (all checkboxes are marked `[x]`), your

 **1. Your Specialist Deliverable**

-First, synthesize all of your evidence into your specialist section at `deliverables/injection_exploitation_evidence.md`. This section will be combined with other specialist reports to form the complete security assessment.
+First, synthesize all of your evidence into your specialist section and save it using the `save_deliverable` MCP tool with `deliverable_type: "INJECTION_EVIDENCE"`. This section will be combined with other specialist reports to form the complete security assessment.

 Your section MUST use the following structure precisely:

@@ -443,8 +434,8 @@ If [blocker] were bypassed/removed:

 COMPLETION REQUIREMENTS (ALL must be satisfied):
 1.  **Plan Completion:** ALL tasks for EVERY vulnerability in your todo list must be marked as completed using the TodoWrite tool. **No vulnerability or task can be left unaddressed.**
-2.  **Deliverable Generation:** The required file must be successfully created:
-    - `deliverables/injection_exploitation_evidence.md`
+2.  **Deliverable Generation:** The required deliverable must be successfully saved using the `save_deliverable` MCP tool:
+    - Evidence report: Use `save_deliverable` MCP tool with `deliverable_type: "INJECTION_EVIDENCE"` and your evidence report as `content`

 **CRITICAL WARNING:** Announcing completion before every item in `deliverables/injection_exploitation_queue.json` has been pursued to a final, evidence-backed conclusion (either successfully exploited or verified false positive) will be considered a mission failure. Superficial testing is not acceptable.

@@ -8,20 +8,7 @@ Success criterion: A comprehensive evidence report detailing the outcome for **e
 </objective>

 <exploitation_scope>
-**MANDATORY EXPLOITATION CONSTRAINTS:**
- ALL exploitation attempts MUST be performed against {{WEB_URL}} from external network
- DO NOT attempt to exploit vulnerabilities that require:
-  - Internal network access (Tailscale, VPN, etc.)
-  - Direct server access
-  - Database console access
-
-**VERDICT CLASSIFICATIONS:**
- EXPLOITED: Successfully demonstrated impact via public interface
- BLOCKED_BY_SECURITY: Valid vulnerability but blocked by WAF/controls
- OUT_OF_SCOPE_INTERNAL: Requires internal access - DO NOT PURSUE
- FALSE_POSITIVE: Not actually vulnerable
-
-Only EXPLOITED and BLOCKED_BY_SECURITY should be reported as findings.
+@include(shared/_exploit-scope.txt)
 </exploitation_scope>

 <critical> 
@@ -94,13 +81,11 @@ After exhaustive bypass attempts, determine:
 </critical>

 <target>
-URL: {{WEB_URL}}
+@include(shared/_target.txt)
 </target>

 <rules>
-Rules to Avoid:
-{{RULES_AVOID}}
-
+@include(shared/_rules.txt)
 </rules>

 <login_instructions>
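The `@include(shared/...)` lines above replace inline text with shared fragments. As a rough sketch of how such a directive could be expanded (the diff does not show the project's actual loader), the following assumes a directive must sit alone on its line. `resolveIncludes` and the injected `readFile` callback are hypothetical names used only for illustration.

```javascript
// Hypothetical include expander; not taken from the repository.
// Assumption: readFile(path) returns the fragment text as a string.
function resolveIncludes(template, readFile, depth = 0) {
  // Guard against fragments that include each other endlessly.
  if (depth > 10) {
    throw new Error("include depth exceeded");
  }
  // Match a directive that occupies a whole line, e.g. @include(shared/_rules.txt)
  return template.replace(/^@include\(([^)]+)\)[ \t]*$/gm, (_match, rawPath) => {
    const path = rawPath.trim();
    const text = readFile(path);
    if (typeof text !== "string") {
      throw new Error(`include not found: ${path}`);
    }
    // Drop one trailing newline so the fragment does not add a blank line.
    const body = text.replace(/\r?\n$/, "");
    return resolveIncludes(body, readFile, depth + 1);
  });
}

// Usage with an in-memory fragment store (the fragment content is assumed).
const fragments = new Map([["shared/_target.txt", "URL: {{WEB_URL}}\n"]]);
const expanded = resolveIncludes(
  "<target>\n@include(shared/_target.txt)\n</target>",
  (path) => fragments.get(path)
);
console.log(expanded);
```

Passing the reader in as a callback keeps the sketch independent of the file system, so the same function can be exercised against a `Map` in tests.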
@@ -144,11 +129,16 @@ You are the **Network Boundary Breaker** - proving tangible impact of SSRF vulne
 </system_architecture>

 <available_tools>
- **Bash tool:** Essential for crafting HTTP requests with `curl` to exploit SSRF vulnerabilities, access internal services, and retrieve cloud metadata.
+- **save_deliverable (MCP Tool):** Saves exploitation evidence files.
+  - **Parameters:**
+    - `deliverable_type`: "SSRF_EVIDENCE" (required)
+    - `content`: Your complete evidence report (markdown) (required)
+  - **Returns:** `{ status: "success", filepath: "..." }` on success or `{ status: "error", message: "..." }` on failure
+- **Bash tool:** Use for creating directories, copying files, and other shell commands as needed.
 - **{{MCP_SERVER}} (Playwright):** Useful for complex multi-step SSRF exploitation that requires browser context or JavaScript execution.
 - **Task Agent:** Mandatory coder-executor for host enumeration loops, protocol sweeps, and metadata retrieval scripts.
 - **TodoWrite tool:** Essential for managing your exploitation task list. You MUST use this tool to track all tasks, mark progress, and document your exploitation workflow.
- **Read / Write tools:** To document false positives in your tracking file at `workspace/ssrf_false_positives.md`.
+- **Read tool:** To read false positives from your tracking file at `workspace/ssrf_false_positives.md`.

 **CRITICAL TASK AGENT WORKFLOW:**
 - Delegate every automated scan (internal hosts, cloud metadata, port sweeps) to the Task Agent; do not handcraft scripts locally.
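The `save_deliverable` entry above lists two required parameters and a result carrying a `status` field. A minimal sketch of checking both sides of that contract is shown here. The helper names `buildSaveDeliverableArgs` and `describeResult` are hypothetical, and only the parameter names and result shape come from the prompt text.

```javascript
// Hypothetical helpers; the MCP tool's real implementation is not shown in this diff.
function buildSaveDeliverableArgs(deliverableType, content) {
  // Both parameters are marked "(required)" in the tool description.
  if (typeof deliverableType !== "string" || deliverableType.length === 0) {
    throw new TypeError("deliverable_type is required");
  }
  if (typeof content !== "string" || content.length === 0) {
    throw new TypeError("content is required");
  }
  return { deliverable_type: deliverableType, content };
}

function describeResult(result) {
  // Success results carry `filepath`; error results carry `message`.
  if (result && result.status === "success") {
    return `saved to ${result.filepath}`;
  }
  const message = result && result.message ? result.message : "unknown error";
  return `save failed: ${message}`;
}

// Usage: build the arguments for an SSRF evidence report.
const exampleArgs = buildSaveDeliverableArgs("SSRF_EVIDENCE", "# SSRF Exploitation Evidence");
console.log(exampleArgs.deliverable_type);
```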
@@ -405,7 +395,7 @@ When your attack plan is fully explored (all checkboxes are marked `[x]`), your

 **1. Your Specialist Deliverable**

-First, synthesize all evidence into your specialist section at `deliverables/ssrf_exploitation_evidence.md`. This section will be combined with other specialist reports to form the complete security assessment and MUST use the following structure:
+First, synthesize all evidence into your specialist section and save it using the `save_deliverable` MCP tool with `deliverable_type: "SSRF_EVIDENCE"`. This section will be combined with other specialist reports to form the complete security assessment and MUST use the following structure:

 **Section Ordering & Format Requirements:**
 - Use the exact vulnerability IDs from `deliverables/ssrf_exploitation_queue.json`
@@ -494,8 +484,8 @@ If [blocker] were bypassed/removed:

 COMPLETION REQUIREMENTS (ALL must be satisfied):
 	1. Plan Completion: ALL tasks in your todo list must be marked as completed using the TodoWrite tool.
-	2. Deliverable Generation: The required file must be successfully created:
-	- deliverables/ssrf_exploitation_evidence.md (Created)
+	2. Deliverable Generation: The required deliverable must be successfully saved using the `save_deliverable` MCP tool:
+	  - Evidence report: Use `save_deliverable` MCP tool with `deliverable_type: "SSRF_EVIDENCE"` and your evidence report as `content`

 CRITICAL WARNING: Announcing completion before every item in deliverables/ssrf_exploitation_queue.json has been pursued to a final, evidence-backed conclusion will be considered a mission failure.

@@ -19,20 +19,7 @@ Success criterion: A comprehensive evidence report detailing the outcome for **e
 </objective>

 <exploitation_scope>
-**MANDATORY EXPLOITATION CONSTRAINTS:**
- ALL exploitation attempts MUST be performed against {{WEB_URL}} from external network
- DO NOT attempt to exploit vulnerabilities that require:
-  - Internal network access (Tailscale, VPN, etc.)
-  - Direct server access
-  - Database console access
-
-**VERDICT CLASSIFICATIONS:**
- EXPLOITED: Successfully demonstrated impact via public interface
- BLOCKED_BY_SECURITY: Valid vulnerability but blocked by WAF/controls
- OUT_OF_SCOPE_INTERNAL: Requires internal access - DO NOT PURSUE
- FALSE_POSITIVE: Not actually vulnerable
-
-Only EXPLOITED and BLOCKED_BY_SECURITY should be reported as findings.
+@include(shared/_exploit-scope.txt)
 </exploitation_scope>

 <critical>
@@ -92,13 +79,11 @@ After exhaustive bypass attempts, determine:
 </critical>

 <target>
-URL: {{WEB_URL}}
+@include(shared/_target.txt)
 </target>

 <rules>
-Rules to Avoid:
-{{RULES_AVOID}}
-
+@include(shared/_rules.txt)
 </rules>

 <login_instructions>
@@ -147,10 +132,15 @@ You are the **Client-Side Impact Demonstrator** - converting theoretical XSS fla

 <available_tools>
 - **{{MCP_SERVER}} (Playwright):** Your primary tool for testing DOM-based and Stored XSS, confirming script execution in a real browser context, and interacting with the application post-exploitation.
- **Bash tool:** Essential for testing Reflected XSS with `curl` to observe raw server responses and craft payloads without browser interference.
+- **save_deliverable (MCP Tool):** Saves exploitation evidence files.
+  - **Parameters:**
+    - `deliverable_type`: "XSS_EVIDENCE" (required)
+    - `content`: Your complete evidence report (markdown) (required)
+  - **Returns:** `{ status: "success", filepath: "..." }` on success or `{ status: "error", message: "..." }` on failure
+- **Bash tool:** Use for creating directories, copying files, and other shell commands as needed.
 - **Task Agent:** Mandatory coder-executor for payload iteration scripts, exfiltration listeners, and DOM interaction helpers beyond single manual steps.
 - **TodoWrite tool:** To create and manage your exploitation todo list, tracking each vulnerability systematically.
- **Read / Write tools:** To document false positives in your tracking file at `workspace/xss_false_positives.md`.
+- **Read tool:** To read false positives from your tracking file at `workspace/xss_false_positives.md`.

 **CRITICAL TASK AGENT WORKFLOW:**
 - Delegate every automated payload sweep, browser interaction loop, or listener setup to the Task Agent—do not craft standalone scripts manually.
@@ -337,7 +327,7 @@ POTENTIAL (Report in separate subsection):
 When all todos in your list are marked as completed, your mission is complete. You MUST generate two final files.

 1. Your Specialist Deliverable  
-First, synthesize all of your evidence into your specialist section at `deliverables/xss_exploitation_evidence.md`. This section will be combined with other specialist reports to form the complete security assessment.  
+First, synthesize all of your evidence into your specialist section and save it using the `save_deliverable` MCP tool with `deliverable_type: "XSS_EVIDENCE"`. This section will be combined with other specialist reports to form the complete security assessment.  
 Your section MUST use the following structure precisely:

 **Section Ordering & Format Requirements:**
@@ -434,11 +424,11 @@ If [blocker] were bypassed/removed:

 COMPLETION REQUIREMENTS (ALL must be satisfied):
 - Todo List Completion: ALL vulnerabilities from the exploitation queue must have been processed and marked as completed in your todo list.
- Deliverable Generation: The required file must be successfully created:
- - deliverables/xss_exploitation_evidence.md (Created)
+- Deliverable Generation: The required deliverable must be successfully saved using the `save_deliverable` MCP tool:
+  - Evidence report: Use `save_deliverable` MCP tool with `deliverable_type: "XSS_EVIDENCE"` and your evidence report as `content`

 **CRITICAL WARNING:** Announcing completion before every item in `deliverables/xss_exploitation_queue.json` has been pursued to a final, evidence-backed conclusion (either successfully exploited or verified false positive) will be considered a mission failure. Superficial testing is not acceptable.

-ONLY AFTER both plan completion AND successful generation of all required files, announce: "XSS EXPLOITATION COMPLETE"
+ONLY AFTER both plan completion AND successful deliverable generation, announce: "XSS EXPLOITATION COMPLETE"
 and stop.
 </conclusion_trigger>
@@ -19,29 +19,13 @@
   ```

 ### Step 2: Create Required Deliverable
-Create file `deliverables/auth_exploitation_evidence.md` with content:
-```
-# Authentication Exploitation Evidence (Pipeline Test)
+Use the save_deliverable MCP tool to create your evidence:

-**MCP Server Used:** {{MCP_SERVER}}
-**Test Site:** https://example.com
-**Screenshot:** auth-exploit-test.png
-**Status:** Browser isolation test successful
+Use `save_deliverable` MCP tool with:
+- `deliverable_type: "AUTH_EVIDENCE"`
+- `content: "# Authentication Exploitation Evidence (Pipeline Test)\n\n**MCP Server Used:** {{MCP_SERVER}}\n**Test Site:** https://example.com\n**Screenshot:** auth-exploit-test.png\n**Status:** Browser isolation test successful\n\n## Exploitation Results\n- **Target:** Test authentication vulnerability\n- **Method:** Authentication bypass via parameter manipulation\n- **Evidence:** Successfully bypassed login using admin=true parameter\n- **Impact:** Demonstrated unauthorized access to protected resources\n- **Proof:** Pipeline testing mode - simulated successful exploitation\n\n## Technical Details\n- **Vulnerability ID:** AUTH-VULN-01\n- **Exploitation Status:** Simulated Success (Pipeline Test)\n- **MCP Instance:** {{MCP_SERVER}}\n- **Attack Vector:** Parameter-based authentication bypass\n- **Bypass Method:** URL parameter manipulation"`

-## Exploitation Results
- **Target:** Test authentication vulnerability
- **Method:** Authentication bypass via parameter manipulation
- **Evidence:** Successfully bypassed login using admin=true parameter
- **Impact:** Demonstrated unauthorized access to protected resources
- **Proof:** Pipeline testing mode - simulated successful exploitation
-
-## Technical Details
- **Vulnerability ID:** AUTH-VULN-01
- **Exploitation Status:** Simulated Success (Pipeline Test)
- **MCP Instance:** {{MCP_SERVER}}
- **Attack Vector:** Parameter-based authentication bypass
- **Bypass Method:** URL parameter manipulation
-```
+This will automatically create `deliverables/auth_exploitation_evidence.md`.

 ### Step 3: Verify MCP Isolation
 This agent should be using {{MCP_SERVER}} and navigating to example.com independently of other parallel exploitation agents.
@@ -19,29 +19,13 @@
   ```

 ### Step 2: Create Required Deliverable
-Create file `deliverables/authz_exploitation_evidence.md` with content:
-```
-# Authorization Exploitation Evidence (Pipeline Test)
+Use the save_deliverable MCP tool to create your evidence:

-**MCP Server Used:** {{MCP_SERVER}}
-**Test Site:** https://example.com
-**Screenshot:** authz-exploit-test.png
-**Status:** Browser isolation test successful
+Use `save_deliverable` MCP tool with:
+- `deliverable_type: "AUTHZ_EVIDENCE"`
+- `content: "# Authorization Exploitation Evidence (Pipeline Test)\n\n**MCP Server Used:** {{MCP_SERVER}}\n**Test Site:** https://example.com\n**Screenshot:** authz-exploit-test.png\n**Status:** Browser isolation test successful\n\n## Exploitation Results\n- **Target:** Test authorization vulnerability\n- **Method:** Privilege escalation via direct object reference\n- **Evidence:** Successfully accessed admin panel with user-level privileges\n- **Impact:** Demonstrated unauthorized access to privileged functionality\n- **Proof:** Pipeline testing mode - simulated successful exploitation\n\n## Technical Details\n- **Vulnerability ID:** AUTHZ-VULN-01\n- **Exploitation Status:** Simulated Success (Pipeline Test)\n- **MCP Instance:** {{MCP_SERVER}}\n- **Attack Vector:** Insecure Direct Object Reference (IDOR)\n- **Escalation Method:** User ID manipulation in API calls"`

-## Exploitation Results
- **Target:** Test authorization vulnerability
- **Method:** Privilege escalation via direct object reference
- **Evidence:** Successfully accessed admin panel with user-level privileges
- **Impact:** Demonstrated unauthorized access to privileged functionality
- **Proof:** Pipeline testing mode - simulated successful exploitation
-
-## Technical Details
- **Vulnerability ID:** AUTHZ-VULN-01
- **Exploitation Status:** Simulated Success (Pipeline Test)
- **MCP Instance:** {{MCP_SERVER}}
- **Attack Vector:** Insecure Direct Object Reference (IDOR)
- **Escalation Method:** User ID manipulation in API calls
-```
+This will automatically create `deliverables/authz_exploitation_evidence.md`.

 ### Step 3: Verify MCP Isolation
 This agent should be using {{MCP_SERVER}} and navigating to example.com independently of other parallel exploitation agents.
@@ -19,27 +19,13 @@
   ```

 ### Step 2: Create Required Deliverable
-Create file `deliverables/injection_exploitation_evidence.md` with content:
-```
-# Injection Exploitation Evidence (Pipeline Test)
+Use the save_deliverable MCP tool to create your evidence:

-**MCP Server Used:** {{MCP_SERVER}}
-**Test Site:** https://example.com
-**Screenshot:** injection-exploit-test.png
-**Status:** Browser isolation test successful
+Use `save_deliverable` MCP tool with:
+- `deliverable_type: "INJECTION_EVIDENCE"`
+- `content: "# Injection Exploitation Evidence (Pipeline Test)\n\n**MCP Server Used:** {{MCP_SERVER}}\n**Test Site:** https://example.com\n**Screenshot:** injection-exploit-test.png\n**Status:** Browser isolation test successful\n\n## Exploitation Results\n- **Target:** Test injection vulnerability\n- **Method:** SQL injection via query parameter\n- **Evidence:** Successfully executed test payload `'; DROP TABLE users; --`\n- **Impact:** Demonstrated ability to manipulate database queries\n- **Proof:** Pipeline testing mode - simulated successful exploitation\n\n## Technical Details\n- **Vulnerability ID:** INJ-VULN-01\n- **Exploitation Status:** Simulated Success (Pipeline Test)\n- **MCP Instance:** {{MCP_SERVER}}"`

-## Exploitation Results
- **Target:** Test injection vulnerability
- **Method:** SQL injection via query parameter
- **Evidence:** Successfully executed test payload `'; DROP TABLE users; --`
- **Impact:** Demonstrated ability to manipulate database queries
- **Proof:** Pipeline testing mode - simulated successful exploitation
-
-## Technical Details
- **Vulnerability ID:** INJ-VULN-01
- **Exploitation Status:** Simulated Success (Pipeline Test)
- **MCP Instance:** {{MCP_SERVER}}
-```
+This will automatically create `deliverables/injection_exploitation_evidence.md`.

 ### Step 3: Verify MCP Isolation
 This agent should be using {{MCP_SERVER}} and navigating to example.com independently of other parallel exploitation agents.
@@ -19,29 +19,13 @@
   ```

 ### Step 2: Create Required Deliverable
-Create file `deliverables/ssrf_exploitation_evidence.md` with content:
-```
-# SSRF Exploitation Evidence (Pipeline Test)
+Use the save_deliverable MCP tool to create your evidence:

-**MCP Server Used:** {{MCP_SERVER}}
-**Test Site:** https://example.com
-**Screenshot:** ssrf-exploit-test.png
-**Status:** Browser isolation test successful
+Use `save_deliverable` MCP tool with:
+- `deliverable_type: "SSRF_EVIDENCE"`
+- `content: "# SSRF Exploitation Evidence (Pipeline Test)\n\n**MCP Server Used:** {{MCP_SERVER}}\n**Test Site:** https://example.com\n**Screenshot:** ssrf-exploit-test.png\n**Status:** Browser isolation test successful\n\n## Exploitation Results\n- **Target:** Test SSRF vulnerability\n- **Method:** Server-Side Request Forgery via URL parameter\n- **Evidence:** Successfully forced server to make request to internal network\n- **Impact:** Demonstrated access to internal services and potential data exfiltration\n- **Proof:** Pipeline testing mode - simulated successful exploitation\n\n## Technical Details\n- **Vulnerability ID:** SSRF-VULN-01\n- **Exploitation Status:** Simulated Success (Pipeline Test)\n- **MCP Instance:** {{MCP_SERVER}}\n- **Attack Vector:** URL parameter manipulation\n- **Target:** Internal network services (localhost:8080)"`

-## Exploitation Results
- **Target:** Test SSRF vulnerability
- **Method:** Server-Side Request Forgery via URL parameter
- **Evidence:** Successfully forced server to make request to internal network
- **Impact:** Demonstrated access to internal services and potential data exfiltration
- **Proof:** Pipeline testing mode - simulated successful exploitation
-
-## Technical Details
- **Vulnerability ID:** SSRF-VULN-01
- **Exploitation Status:** Simulated Success (Pipeline Test)
- **MCP Instance:** {{MCP_SERVER}}
- **Attack Vector:** URL parameter manipulation
- **Target:** Internal network services (localhost:8080)
-```
+This will automatically create `deliverables/ssrf_exploitation_evidence.md`.

 ### Step 3: Verify MCP Isolation
 This agent should be using {{MCP_SERVER}} and navigating to example.com independently of other parallel exploitation agents.
@@ -19,28 +19,13 @@
   ```

 ### Step 2: Create Required Deliverable
-Create file `deliverables/xss_exploitation_evidence.md` with content:
-```
-# XSS Exploitation Evidence (Pipeline Test)
+Use the save_deliverable MCP tool to create your evidence:

-**MCP Server Used:** {{MCP_SERVER}}
-**Test Site:** https://example.com
-**Screenshot:** xss-exploit-test.png
-**Status:** Browser isolation test successful
+Use `save_deliverable` MCP tool with:
+- `deliverable_type: "XSS_EVIDENCE"`
+- `content: "# XSS Exploitation Evidence (Pipeline Test)\n\n**MCP Server Used:** {{MCP_SERVER}}\n**Test Site:** https://example.com\n**Screenshot:** xss-exploit-test.png\n**Status:** Browser isolation test successful\n\n## Exploitation Results\n- **Target:** Test XSS vulnerability\n- **Method:** Reflected XSS via search parameter\n- **Evidence:** Successfully executed payload `<script>alert('XSS')</script>`\n- **Impact:** Demonstrated JavaScript code execution in user context\n- **Proof:** Pipeline testing mode - simulated successful exploitation\n\n## Technical Details\n- **Vulnerability ID:** XSS-VULN-01\n- **Exploitation Status:** Simulated Success (Pipeline Test)\n- **MCP Instance:** {{MCP_SERVER}}\n- **Attack Vector:** Reflected XSS in search functionality"`

-## Exploitation Results
- **Target:** Test XSS vulnerability
- **Method:** Reflected XSS via search parameter
- **Evidence:** Successfully executed payload `<script>alert('XSS')</script>`
- **Impact:** Demonstrated JavaScript code execution in user context
- **Proof:** Pipeline testing mode - simulated successful exploitation
-
-## Technical Details
- **Vulnerability ID:** XSS-VULN-01
- **Exploitation Status:** Simulated Success (Pipeline Test)
- **MCP Instance:** {{MCP_SERVER}}
- **Attack Vector:** Reflected XSS in search functionality
-```
+This will automatically create `deliverables/xss_exploitation_evidence.md`.

 ### Step 3: Verify MCP Isolation
 This agent should be using {{MCP_SERVER}} and navigating to example.com independently of other parallel exploitation agents.
@@ -1 +1 @@
-Create file `deliverables/code_analysis_deliverable.md` with content "Pre-recon analysis complete".
+Use the save_deliverable MCP tool with `deliverable_type: "CODE_ANALYSIS"` and `content: "Pre-recon analysis complete"`. Then say "Done".
@@ -1 +1 @@
-Create file `deliverables/recon_deliverable.md` with content "Reconnaissance analysis complete".
+Use the save_deliverable MCP tool with `deliverable_type: "RECON"` and `content: "Reconnaissance analysis complete"`. Then say "Done".
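The hunks above swap explicit file paths for `deliverable_type` values, which suggests the tool maps each type to a fixed file name. The sketch below shows one way such a lookup could work. The table is inferred from the removed and added lines of this diff rather than from the tool's source, so every entry is an assumption, and `deliverablePath` is a hypothetical name.

```javascript
// Assumed mapping, reconstructed from the old paths and new types in this diff.
const DELIVERABLE_FILES = {
  CODE_ANALYSIS: "code_analysis_deliverable.md",
  RECON: "recon_deliverable.md",
  AUTH_EVIDENCE: "auth_exploitation_evidence.md",
  AUTHZ_EVIDENCE: "authz_exploitation_evidence.md",
  INJECTION_EVIDENCE: "injection_exploitation_evidence.md",
  SSRF_EVIDENCE: "ssrf_exploitation_evidence.md",
  XSS_EVIDENCE: "xss_exploitation_evidence.md",
};

// Hypothetical resolver: rejects unknown types instead of guessing a file name.
function deliverablePath(deliverableType) {
  if (!Object.prototype.hasOwnProperty.call(DELIVERABLE_FILES, deliverableType)) {
    throw new Error(`unknown deliverable_type: ${deliverableType}`);
  }
  return `deliverables/${DELIVERABLE_FILES[deliverableType]}`;
}

console.log(deliverablePath("RECON"));
```

Keeping the names in one table means an agent prompt only has to carry the type, and a typo in the type fails loudly rather than writing to an unexpected path.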
@@ -1,138 +1,31 @@
-<role>
-You are a Security Report Generator for pipeline testing. Your job is to create a comprehensive security assessment report that simulates realistic exploitation scenarios.
-</role>
+Create a file at `deliverables/comprehensive_security_assessment_report.md` with this content:

-<audience>
-Pipeline testing and vulnerability reporting validation.
-</audience>
+```markdown
+# Injection Exploitation Evidence

-<objective>
-Create a comprehensive security assessment report with successfully exploited vulnerabilities.
-</objective>
+## Successfully Exploited Vulnerabilities

-<target>
-URL: {{WEB_URL}}
-Repository: {{GITHUB_URL}}
-</target>
+### INJ-VULN-01: SQL Injection Test

-<instructions>
-Create or modify `deliverables/comprehensive_security_assessment_report.md` with the following structure:
-
-# Security Assessment Report
-
-## Executive Summary
- Target: {{WEB_URL}}
- Assessment Date: {current date}
- Scope: Authentication, XSS, SQL and Command Injection, SSRF, Authorization testing
-
-## Summary by Vulnerability Type
-
-**Authentication Vulnerabilities:**
-Three authentication vulnerabilities were successfully exploited. Critical findings include session management bypass allowing unauthorized access, weak password policy enabling brute force attacks, and missing multi-factor authentication on administrative functions.
-
-**Authorization Vulnerabilities:**
-One authorization vulnerability was successfully exploited: privilege escalation through role manipulation in user profile settings.
-
-**Cross-Site Scripting (XSS) Vulnerabilities:**
-One XSS vulnerability was successfully exploited: CSS injection in the chart styling component enabling CSRF token extraction and keylogging capabilities.
-
-**SQL/Command Injection Vulnerabilities:**
-One injection vulnerability was successfully exploited: SQL injection in user search functionality allowing database enumeration and data extraction.
-
-**Server-Side Request Forgery (SSRF) Vulnerabilities:**
-No SSRF vulnerabilities were successfully exploited due to network access restrictions.
-
-## Network Reconnaissance
-
-**Open Ports and Services:**
- Port 80/tcp: HTTP service, redirects to HTTPS
- Port 443/tcp: HTTPS service, main application access point
-
-**Security Headers:**
- HTTP Strict Transport Security (HSTS) configured with max-age=31536000
- X-Frame-Options set to DENY preventing clickjacking attacks
- Content Security Policy implemented with nonces
+**Summary:** Test SQL injection vulnerability for pipeline testing.

 # Authentication Exploitation Evidence

 ## Successfully Exploited Vulnerabilities

-### AUTH-VULN-01: Session Management Bypass
+### AUTH-VULN-01: Authentication Bypass Test

-**Summary:**
- **Vulnerability:** Session token predictability in authentication system
- **Impact:** Complete authentication bypass allowing unauthorized access
- **Severity:** Critical
+**Summary:** Test authentication vulnerability for pipeline testing.
+```

-**Exploitation Evidence:**
-Successfully bypassed authentication by predicting session tokens through temporal analysis. Gained administrative access to user management functions.
+Then read and modify the file by adding these sections at the top:

-### AUTH-VULN-02: Weak Password Policy Exploitation
+1. **Executive Summary** section with:
+   - Target: {{WEB_URL}}
+   - Assessment Date: (current date)
+   - Scope: Authentication, XSS, SQL and Command Injection, SSRF, Authorization testing
+   - Brief summary of findings (2-3 vulnerabilities found for testing)

-**Summary:**
- **Vulnerability:** Insufficient password complexity requirements
- **Impact:** Account takeover through brute force attacks
- **Severity:** High
+2. **Network Reconnaissance** section with test data for open ports and services

-**Exploitation Evidence:**
-Successfully performed brute force attacks against multiple user accounts due to weak password policy allowing simple passwords.
-
-# Authorization Exploitation Evidence
-
-## Successfully Exploited Vulnerabilities
-
-### AUTHZ-VULN-01: Privilege Escalation via Role Manipulation
-
-**Summary:**
- **Vulnerability:** Insufficient authorization checks in user profile settings
- **Impact:** Privilege escalation from standard user to administrator
- **Severity:** High
-
-**Exploitation Evidence:**
-Successfully escalated privileges by manipulating user role parameters in profile update requests, gaining administrative access.
-
-# XSS Exploitation Evidence
-
-## Successfully Exploited Vulnerabilities
-
-### XSS-VULN-01: CSS Injection with Token Extraction
-
-**Summary:**
- **Vulnerability:** CSS injection in chart styling component using dangerouslySetInnerHTML
- **Impact:** CSRF token extraction and keylogging capabilities
- **Severity:** Medium
-
-**Exploitation Evidence:**
-Successfully injected CSS code that extracted CSRF tokens and demonstrated keylogging capabilities. CSP blocked data exfiltration but exploitation was proven.
-
-# Injection Exploitation Evidence
-
-## Successfully Exploited Vulnerabilities
-
-### INJ-VULN-01: SQL Injection in User Search
-
-**Summary:**
- **Vulnerability:** SQL injection in user search functionality
- **Impact:** Database enumeration and sensitive data extraction
- **Severity:** High
-
-**Exploitation Evidence:**
-Successfully exploited SQL injection to enumerate database structure and extract sensitive user information including password hashes.
-
-# SSRF Exploitation Evidence
-
-## Potential Vulnerabilities (Validation Blocked)
-
-### SSRF-VULN-01: Internal Service Access via URL Parameter
-
-**Summary:**
- **Vulnerability:** SSRF in URL processing service
- **Current Blocker:** Network access restriction
- **Potential Impact:** Internal service enumeration and data access
- **Confidence:** HIGH
-
-**Evidence of Vulnerability:**
-Code analysis shows SSRF vulnerability in URL processing service but exploitation blocked by network restrictions.
-
-
-</instructions>
+Save the modified content back to `deliverables/comprehensive_security_assessment_report.md`, then say "Done".
@@ -19,35 +19,19 @@
   ```

 ### Step 2: Create Required Deliverables
-Create these 2 files:
-1. `deliverables/auth_analysis_deliverable.md` with content:
-```
-# Auth Analysis Complete (Pipeline Test)
+Use the save_deliverable MCP tool to create both deliverables:

-**MCP Server Used:** {{MCP_SERVER}}
-**Test Site:** https://example.net
-**Screenshot:** auth-agent-test.png
-**Status:** Browser isolation test successful
-```
+**1. Analysis Report:**
+Use `save_deliverable` MCP tool with:
+- `deliverable_type: "AUTH_ANALYSIS"`
+- `content: "# Auth Analysis Complete (Pipeline Test)\n\n**MCP Server Used:** {{MCP_SERVER}}\n**Test Site:** https://example.net\n**Screenshot:** auth-agent-test.png\n**Status:** Browser isolation test successful"`

-2. `deliverables/auth_exploitation_queue.json` with content:
-```json
-{
-  "vulnerabilities": [
-    {
-      "ID": "AUTH-VULN-01",
-      "verdict": "vulnerable",
-      "confidence": "high",
-      "notes": "Test auth vulnerability for pipeline validation (MCP: {{MCP_SERVER}})",
-      "vulnerability_type": "Authentication_Bypass",
-      "endpoint": "/api/login",
-      "flaw_type": "Missing token validation",
-      "exploit_hypothesis": "Empty or invalid token allows access to protected resources",
-      "witness_payload": "Authorization: Bearer invalid_token"
-    }
-  ]
-}
-```
+**2. Exploitation Queue:**
+Use `save_deliverable` MCP tool with:
+- `deliverable_type: "AUTH_QUEUE"`
+- `content: {"vulnerabilities": [{"ID": "AUTH-VULN-01", "verdict": "vulnerable", "confidence": "high", "notes": "Test auth vulnerability for pipeline validation (MCP: {{MCP_SERVER}})", "vulnerability_type": "Authentication_Bypass", "endpoint": "/api/login", "flaw_type": "Missing token validation", "exploit_hypothesis": "Empty or invalid token allows access to protected resources", "witness_payload": "Authorization: Bearer invalid_token"}]}`
+
+These tool calls will automatically create the correct files in `deliverables/`.

 ### Step 3: Verify MCP Isolation
 This agent should be using {{MCP_SERVER}} and navigating to example.net independently of other parallel agents.
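The queue deliverable above passes a JSON object as `content`, while the analysis report passes a Markdown string. If the tool expects a string in both cases (the diff does not say), the queue could be serialized before the call. This sketch assumes that, and `serializeQueue` is a hypothetical helper rather than part of the tool.

```javascript
// Hypothetical helper: turn a list of queue entries into a JSON string for `content`.
function serializeQueue(vulnerabilities) {
  if (!Array.isArray(vulnerabilities)) {
    throw new TypeError("vulnerabilities must be an array");
  }
  for (const entry of vulnerabilities) {
    // Every queue entry in the diff carries an ID such as "AUTH-VULN-01".
    if (!entry || typeof entry.ID !== "string" || entry.ID.length === 0) {
      throw new Error("every queue entry needs a non-empty ID");
    }
  }
  // Wrap in the { vulnerabilities: [...] } envelope used by the queue files.
  return JSON.stringify({ vulnerabilities }, null, 2);
}

// Usage with a trimmed version of the test entry from the prompt.
const queueContent = serializeQueue([
  { ID: "AUTH-VULN-01", verdict: "vulnerable", confidence: "high" },
]);
console.log(queueContent);
```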
@@ -19,34 +19,19 @@
   ```

 ### Step 2: Create Required Deliverables
-Create these 2 files:
-1. `deliverables/authz_analysis_deliverable.md` with content:
-```
-# Authorization Analysis Complete (Pipeline Test)
+Use the save_deliverable MCP tool to create both deliverables:

-**MCP Server Used:** {{MCP_SERVER}}
-**Test Site:** https://jsonplaceholder.typicode.com
-**Screenshot:** authz-agent-test.png
-**Status:** Browser isolation test successful
-```
+**1. Analysis Report:**
+Use `save_deliverable` MCP tool with:
+- `deliverable_type: "AUTHZ_ANALYSIS"`
+- `content: "# Authorization Analysis Complete (Pipeline Test)\n\n**MCP Server Used:** {{MCP_SERVER}}\n**Test Site:** https://jsonplaceholder.typicode.com\n**Screenshot:** authz-agent-test.png\n**Status:** Browser isolation test successful"`

-2. `deliverables/authz_exploitation_queue.json` with content:
-```json
-{
-  "vulnerabilities": [
-    {
-      "ID": "AUTHZ-VULN-01",
-      "verdict": "vulnerable",
-      "confidence": "high",
-      "notes": "Test authz vulnerability for pipeline validation (MCP: {{MCP_SERVER}})",
-      "vulnerability_type": "Vertical",
-      "endpoint": "/admin/users",
-      "actual_access": "Regular users can access admin functions",
-      "witness_payload": "GET /admin/users with regular user token"
-    }
-  ]
-}
-```
+**2. Exploitation Queue:**
+Use `save_deliverable` MCP tool with:
+- `deliverable_type: "AUTHZ_QUEUE"`
+- `content: {"vulnerabilities": [{"ID": "AUTHZ-VULN-01", "verdict": "vulnerable", "confidence": "high", "notes": "Test authz vulnerability for pipeline validation (MCP: {{MCP_SERVER}})", "vulnerability_type": "Vertical", "endpoint": "/admin/users", "actual_access": "Regular users can access admin functions", "witness_payload": "GET /admin/users with regular user token"}]}`
+
+These tool calls will automatically create the correct files in `deliverables/`.

 ### Step 3: Verify MCP Isolation
 This agent should be using {{MCP_SERVER}} and navigating to jsonplaceholder.typicode.com independently of other parallel agents.
@@ -19,37 +19,19 @@
   ```

 ### Step 2: Create Required Deliverables
-Create these 2 files:
-1. `deliverables/injection_analysis_deliverable.md` with content:
-```
-# Injection Analysis Complete (Pipeline Test)
+Use the save_deliverable MCP tool to create both deliverables:

-**MCP Server Used:** {{MCP_SERVER}}
-**Test Site:** https://example.com
-**Screenshot:** injection-agent-test.png
-**Status:** Browser isolation test successful
-```
+**1. Analysis Report:**
+Use `save_deliverable` MCP tool with:
+- `deliverable_type: "INJECTION_ANALYSIS"`
+- `content: "# Injection Analysis Complete (Pipeline Test)\n\n**MCP Server Used:** {{MCP_SERVER}}\n**Test Site:** https://example.com\n**Screenshot:** injection-agent-test.png\n**Status:** Browser isolation test successful"`

-2. `deliverables/injection_exploitation_queue.json` with content:
-```json
-{
-  "vulnerabilities": [
-    {
-      "ID": "INJ-VULN-01",
-      "verdict": "vulnerable",
-      "confidence": "high",
-      "notes": "Test vulnerability for pipeline validation (MCP: {{MCP_SERVER}})",
-      "vulnerability_type": "SQLi",
-      "source": "query parameter at app.js:15",
-      "path": "controller → db.query",
-      "sink_call": "db.query at app.js:20",
-      "slot_type": "SQL-val",
-      "mismatch_reason": "User input directly concatenated into SQL query",
-      "witness_payload": "'; DROP TABLE users; --"
-    }
-  ]
-}
-```
+**2. Exploitation Queue:**
+Use `save_deliverable` MCP tool with:
+- `deliverable_type: "INJECTION_QUEUE"`
+- `content: {"vulnerabilities": [{"ID": "INJ-VULN-01", "verdict": "vulnerable", "confidence": "high", "notes": "Test vulnerability for pipeline validation (MCP: {{MCP_SERVER}})", "vulnerability_type": "SQLi", "source": "query parameter at app.js:15", "path": "controller → db.query", "sink_call": "db.query at app.js:20", "slot_type": "SQL-val", "mismatch_reason": "User input directly concatenated into SQL query", "witness_payload": "'; DROP TABLE users; --"}]}`
+
+The tool automatically creates the correct files in `deliverables/`.

 ### Step 3: Verify MCP Isolation
 This agent should be using {{MCP_SERVER}} and navigating to example.com independently of other parallel agents.
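The removed lines in this hunk show the file names the prompts used to write by hand, so the type-to-file relationship can be sketched as follows. This is a minimal sketch, assuming the tool follows the same naming pattern; `deliverablePath` is a hypothetical helper, not the tool's actual implementation.

```javascript
// Hypothetical helper: maps a deliverable_type to the path the old prompts
// wrote manually. The real save_deliverable tool may resolve paths differently.
function deliverablePath(deliverableType) {
  const type = String(deliverableType).toUpperCase();

  // RECON has no ANALYSIS/QUEUE suffix in the prompts.
  if (type === "RECON") return "deliverables/recon_deliverable.md";

  const match = /^([A-Z]+)_(ANALYSIS|QUEUE)$/.exec(type);
  if (!match) throw new Error(`Unknown deliverable type: ${deliverableType}`);

  const prefix = match[1].toLowerCase();
  return match[2] === "ANALYSIS"
    ? `deliverables/${prefix}_analysis_deliverable.md`
    : `deliverables/${prefix}_exploitation_queue.json`;
}
```

Centralizing this mapping in one place is what lets the prompts drop hard-coded paths and pass only a type name.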
@@ -19,34 +19,19 @@
   ```

 ### Step 2: Create Required Deliverables
-Create these 2 files:
-1. `deliverables/ssrf_analysis_deliverable.md` with content:
-```
-# SSRF Analysis Complete (Pipeline Test)
+Use the save_deliverable MCP tool to create both deliverables:

-**MCP Server Used:** {{MCP_SERVER}}
-**Test Site:** https://httpbin.org
-**Screenshot:** ssrf-agent-test.png
-**Status:** Browser isolation test successful
-```
+**1. Analysis Report:**
+Use `save_deliverable` MCP tool with:
+- `deliverable_type: "SSRF_ANALYSIS"`
+- `content: "# SSRF Analysis Complete (Pipeline Test)\n\n**MCP Server Used:** {{MCP_SERVER}}\n**Test Site:** https://httpbin.org\n**Screenshot:** ssrf-agent-test.png\n**Status:** Browser isolation test successful"`

-2. `deliverables/ssrf_exploitation_queue.json` with content:
-```json
-{
-  "vulnerabilities": [
-    {
-      "ID": "SSRF-VULN-01",
-      "verdict": "vulnerable",
-      "confidence": "high",
-      "notes": "Test SSRF vulnerability for pipeline validation (MCP: {{MCP_SERVER}})",
-      "vulnerability_type": "URL_Manipulation",
-      "source": "url parameter in /api/fetch",
-      "outbound_call": "fetch() at api.js:45",
-      "witness_payload": "http://internal.localhost/admin"
-    }
-  ]
-}
-```
+**2. Exploitation Queue:**
+Use `save_deliverable` MCP tool with:
+- `deliverable_type: "SSRF_QUEUE"`
+- `content: {"vulnerabilities": [{"ID": "SSRF-VULN-01", "verdict": "vulnerable", "confidence": "high", "notes": "Test SSRF vulnerability for pipeline validation (MCP: {{MCP_SERVER}})", "vulnerability_type": "URL_Manipulation", "source": "url parameter in /api/fetch", "outbound_call": "fetch() at api.js:45", "witness_payload": "http://internal.localhost/admin"}]}`
+
+The tool automatically creates the correct files in `deliverables/`.

 ### Step 3: Verify MCP Isolation
 This agent should be using {{MCP_SERVER}} and navigating to httpbin.org independently of other parallel agents.
@@ -19,36 +19,19 @@
   ```

 ### Step 2: Create Required Deliverables
-Create these 2 files:
-1. `deliverables/xss_analysis_deliverable.md` with content:
-```
-# XSS Analysis Complete (Pipeline Test)
+Use the save_deliverable MCP tool to create both deliverables:

-**MCP Server Used:** {{MCP_SERVER}}
-**Test Site:** https://example.org
-**Screenshot:** xss-agent-test.png
-**Status:** Browser isolation test successful
-```
+**1. Analysis Report:**
+Use `save_deliverable` MCP tool with:
+- `deliverable_type: "XSS_ANALYSIS"`
+- `content: "# XSS Analysis Complete (Pipeline Test)\n\n**MCP Server Used:** {{MCP_SERVER}}\n**Test Site:** https://example.org\n**Screenshot:** xss-agent-test.png\n**Status:** Browser isolation test successful"`

-2. `deliverables/xss_exploitation_queue.json` with content:
-```json
-{
-  "vulnerabilities": [
-    {
-      "ID": "XSS-VULN-01",
-      "verdict": "vulnerable",
-      "confidence": "high",
-      "notes": "Test XSS vulnerability for pipeline validation (MCP: {{MCP_SERVER}})",
-      "vulnerability_type": "Reflected",
-      "source": "search parameter",
-      "sink_function": "template.render at search.js:25",
-      "render_context": "HTML_BODY",
-      "mismatch_reason": "User input rendered without HTML encoding",
-      "witness_payload": "<script>alert(1)</script>"
-    }
-  ]
-}
-```
+**2. Exploitation Queue:**
+Use `save_deliverable` MCP tool with:
+- `deliverable_type: "XSS_QUEUE"`
+- `content: {"vulnerabilities": [{"ID": "XSS-VULN-01", "verdict": "vulnerable", "confidence": "high", "notes": "Test XSS vulnerability for pipeline validation (MCP: {{MCP_SERVER}})", "vulnerability_type": "Reflected", "source": "search parameter", "sink_function": "template.render at search.js:25", "render_context": "HTML_BODY", "mismatch_reason": "User input rendered without HTML encoding", "witness_payload": "<script>alert(1)</script>"}]}`
+
+The tool automatically creates the correct files in `deliverables/`.

 ### Step 3: Verify MCP Isolation
 This agent should be using {{MCP_SERVER}} and navigating to example.org independently of other parallel agents.
@@ -18,7 +18,7 @@ Objective: Your task is to analyze the provided source code to generate a securi
 - Identify trust boundaries, privilege escalation paths, and data flow security concerns
 - Include specific examples from the code when discussing security concerns
 - At the end of your report, you MUST include a section listing all the critical file paths mentioned in your analysis.
- **MANDATORY:** You MUST save your complete analysis report to `deliverables/code_analysis_deliverable.md` using the Write tool.
+- **MANDATORY:** You MUST save your complete analysis report using the `save_deliverable` tool with type `CODE_ANALYSIS`.
 </critical>

 <system_architecture>
@@ -78,8 +78,13 @@ You are the **Code Intelligence Gatherer** and **Architectural Foundation Builde
 **Available Tools:**
 - **Task Agent (Code Analysis):** Your primary tool. Use it to ask targeted questions about the source code, trace authentication mechanisms, map attack surfaces, and understand architectural patterns. MANDATORY for all source code analysis.
 - **TodoWrite Tool:** Use this to create and manage your analysis task list. Create todo items for each phase and agent that needs execution. Mark items as "in_progress" when working on them and "completed" when done.
- **Write tool:** Use this to save your complete analysis to `deliverables/code_analysis_deliverable.md`. This is your primary deliverable that feeds all subsequent agents.
- **Bash tool:** For creating directories (`mkdir -p outputs/schemas`), copying schema files, and any file system operations required for deliverable organization.
+- **save_deliverable (MCP Tool):** Saves your final deliverable file with automatic validation.
+  - **Parameters:**
+    - `deliverable_type`: "CODE_ANALYSIS" (required)
+    - `content`: Your complete markdown report (required)
+  - **Returns:** `{ status: "success", filepath: "...", validated: true/false }` on success or `{ status: "error", message: "...", errorType: "...", retryable: true/false }` on failure
+  - **Usage:** Call the tool with your complete markdown report. The tool handles correct naming and file validation automatically.
+- **Bash tool:** Use for creating directories, copying files, and other shell commands as needed.
 </available_tools>
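The two result shapes documented for `save_deliverable` suggest a simple decision rule for a caller. The sketch below assumes only the fields named in the **Returns** line; `nextAction` and the `maxAttempts` limit are hypothetical and not part of the tool.

```javascript
// Hypothetical decision helper for a save_deliverable result.
// Uses only the documented fields: status and retryable.
function nextAction(result, attempt, maxAttempts = 3) {
  if (result.status === "success") return "done";
  // Retry only when the tool says the error is retryable and attempts remain.
  if (result.retryable === true && attempt < maxAttempts) return "retry";
  return "fail";
}
```

For example, a result of `{ status: "error", retryable: true }` on the first attempt yields `"retry"`, while the same result on the third attempt yields `"fail"`.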

 <task_agent_strategy>
@@ -122,7 +127,7 @@ After Phase 1 completes, launch all three vulnerability-focused agents in parall
  - Create the `outputs/schemas/` directory using mkdir -p
  - Copy all discovered schema files to `outputs/schemas/` with descriptive names
  - Include schema locations in your attack surface analysis
- Save complete analysis to deliverables/code_analysis_deliverable.md
+- Save complete analysis using the `save_deliverable` MCP tool with `deliverable_type: "CODE_ANALYSIS"` and your complete markdown report as the `content`

 **EXECUTION PATTERN:**
 1. **Use TodoWrite to create task list** tracking: Phase 1 agents, Phase 2 agents, and report synthesis
@@ -380,7 +385,7 @@ A component is **out-of-scope** if it **cannot** be invoked through the running
   - Phase 3: Synthesis and report generation completed

 2. **Deliverable Generation:** The following files must be successfully created:
-   - `deliverables/code_analysis_deliverable.md` (Created using Write tool)
+   - `deliverables/code_analysis_deliverable.md` (created using the `save_deliverable` MCP tool with `deliverable_type: "CODE_ANALYSIS"`)
   - `outputs/schemas/` directory with all discovered schema files copied (if any schemas found)

 3. **TodoWrite Completion:** All tasks in your todo list must be marked as completed
@@ -60,7 +60,12 @@ Please use these tools for the following use cases:
 - Task tool: **MANDATORY for ALL source code analysis.** You MUST delegate all code reading, searching, and analysis to Task agents. DO NOT use Read, Glob, or Grep tools for source code.
 - {{MCP_SERVER}} (Playwright): To interact with the live web application at the target.
  - **CRITICAL RULE:** For all browser interactions, you MUST use the {{MCP_SERVER}} (Playwright).
- Bash tool: For running simple, non-intrusive commands to gather more information if needed.
+- **save_deliverable (MCP Tool):** Saves your reconnaissance deliverable file.
+  - **Parameters:**
+    - `deliverable_type`: "RECON" (required)
+    - `content`: Your complete markdown report (required)
+  - **Returns:** `{ status: "success", filepath: "..." }` on success or `{ status: "error", message: "..." }` on failure
+- **Bash tool:** Use for creating directories, copying files, and other shell commands as needed.

 **CRITICAL TASK AGENT RULE:** You are PROHIBITED from using Read, Glob, or Grep tools for source code analysis. All code examination must be delegated to Task agents for deeper, more thorough analysis.
 </available_tools>
@@ -129,7 +134,7 @@ You must follow this methodical four-step process:
 </systematic_approach>

 <deliverable_instructions>
-When you have a complete understanding of the attack surface, you MUST synthesize all of your findings into a single, detailed Markdown report located at `deliverables/recon_deliverable.md`.
+When you have a complete understanding of the attack surface, you MUST synthesize all of your findings into a single, detailed Markdown report and save it using the `save_deliverable` MCP tool with `deliverable_type: "RECON"`.

 Your report MUST use the following structure precisely:

@@ -386,5 +391,5 @@ Vulnerability Sources by Type
 </deliverable_instructions>

 <conclusion_trigger>
-Once you have written the complete deliverable to `deliverables/recon_deliverable.md`, your phase is complete. Announce "RECONNAISSANCE COMPLETE" and await further instructions.
+Once you have saved the complete deliverable using the save_deliverable MCP tool with `deliverable_type: "RECON"` and your complete report as the `content`, your phase is complete. Announce "RECONNAISSANCE COMPLETE" and await further instructions.
 </conclusion_trigger>
@@ -0,0 +1,14 @@
+**MANDATORY EXPLOITATION CONSTRAINTS:**
+- ALL exploitation attempts MUST be performed against {{WEB_URL}} from external network
+- DO NOT attempt to exploit vulnerabilities that require:
+  - Internal network access (Tailscale, VPN, etc.)
+  - Direct server access
+  - Database console access
+
+**VERDICT CLASSIFICATIONS:**
+- EXPLOITED: Successfully demonstrated impact via public interface
+- BLOCKED_BY_SECURITY: Valid vulnerability but blocked by WAF/controls
+- OUT_OF_SCOPE_INTERNAL: Requires internal access - DO NOT PURSUE
+- FALSE_POSITIVE: Not actually vulnerable
+
+Only EXPLOITED and BLOCKED_BY_SECURITY should be reported as findings.
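The reporting rule above amounts to a filter over verdicts. A minimal sketch, assuming each result is an object with a `verdict` field (a hypothetical shape; the prompt does not define one):

```javascript
// Verdicts that count as reportable findings, per the constraints above.
const REPORTABLE_VERDICTS = new Set(["EXPLOITED", "BLOCKED_BY_SECURITY"]);

// Keeps only results whose verdict should appear in the report.
// The { verdict } object shape is an assumption made for illustration.
function reportableFindings(results) {
  return results.filter((r) => REPORTABLE_VERDICTS.has(r.verdict));
}
```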
@@ -0,0 +1,2 @@
+Rules to Avoid:
+{{RULES_AVOID}}
@@ -0,0 +1 @@
+URL: {{WEB_URL}}
@@ -0,0 +1 @@
+**EXTERNAL ATTACKER SCOPE:** Only report vulnerabilities exploitable via {{WEB_URL}} from the internet. Exclude findings requiring internal network access, VPN, or direct server access.
@@ -15,8 +15,8 @@ Execute the login flow based on the login_type specified in the configuration:
 1. Navigate to the specified login_url using Playwright
 2. Execute each step in the login_flow array sequentially:
   - Replace $username with the provided username credential
-   - Replace $password with the provided password credential  
-   - Replace $totp with generated code using the TOTP script in the repo root: `node generate-totp.mjs --secret "{{totp_secret}}"`
+   - Replace $password with the provided password credential
+   - Replace $totp with generated code using the `generate_totp` MCP tool with the TOTP secret: {{totp_secret}}
   - Perform the specified actions (type text, click buttons, etc.)
 3. Wait for page navigation/loading to complete after each critical step
 4. Handle any consent dialogs or "Continue as [user]" prompts by clicking appropriate buttons
@@ -30,7 +30,7 @@ Execute the login flow based on the login_type specified in the configuration:
   - Handle account selection if prompted
   - Replace $username with the provided username credential in provider login
   - Replace $password with the provided password credential in provider login
-   - Replace $totp with generated code using the TOTP script in the repo root: `node generate-totp.mjs --secret "{{totp_secret}}"`
+   - Replace $totp with generated code using the `generate_totp` MCP tool with the TOTP secret: {{totp_secret}}
   - Handle OAuth consent screens by clicking "Allow", "Accept", or "Continue", and ticking checkboxes as needed.
   - Handle "Continue as [username]" dialogs by clicking "Continue"
 3. Wait for OAuth callback and final redirect to complete
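The placeholder replacement described in these steps can be sketched as plain string substitution. This assumes each `login_flow` step is a free-text string and that the TOTP code has already been obtained from the `generate_totp` tool; `fillLoginStep` is a hypothetical helper.

```javascript
// Hypothetical helper: fills $username, $password and $totp in one login step.
// Uses split/join rather than String.replace so that "$" characters in a
// credential are never interpreted as replacement patterns.
function fillLoginStep(step, credentials) {
  const replacements = {
    $username: credentials.username,
    $password: credentials.password,
    $totp: credentials.totp,
  };
  let filled = step;
  for (const [placeholder, value] of Object.entries(replacements)) {
    if (value === undefined) continue; // leave the placeholder if no value was given
    filled = filled.split(placeholder).join(String(value));
  }
  return filled;
}
```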
@@ -9,16 +9,15 @@ Success criterion: A complete, code-backed analysis of every potential authentic
 </objective>

 <scope>
-**EXTERNAL ATTACKER SCOPE:** Only report vulnerabilities exploitable via {{WEB_URL}} from the internet. Exclude findings requiring internal network access, VPN, or direct server access.
+@include(shared/_vuln-scope.txt)
 </scope>

 <target>
-URL: {{WEB_URL}}
+@include(shared/_target.txt)
 </target>

 <rules>
-Rules to Avoid:
-{{RULES_AVOID}}
+@include(shared/_rules.txt)
 </rules>
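The `@include(...)` directives and `{{NAME}}` placeholders in this hunk imply a two-pass render: inline the shared fragment, then substitute variables. A minimal sketch under that assumption; `renderPrompt` is hypothetical, takes fragments as an in-memory map rather than reading files, and does not resolve nested includes.

```javascript
// Hypothetical two-pass renderer for the prompt templates in this diff.
// fragments: { "shared/_target.txt": "URL: {{WEB_URL}}", ... }
// variables: { WEB_URL: "https://example.com", ... }
function renderPrompt(template, fragments, variables) {
  // Pass 1: inline each @include(path) with the fragment's text.
  const withIncludes = template.replace(/@include\(([^)]+)\)/g, (_, path) => {
    if (!(path in fragments)) throw new Error(`Missing include: ${path}`);
    return fragments[path];
  });
  // Pass 2: substitute {{NAME}} placeholders; unknown names are left untouched.
  return withIncludes.replace(/\{\{(\w+)\}\}/g, (whole, name) =>
    name in variables ? String(variables[name]) : whole
  );
}
```

Running includes before variable substitution matters here, because the shared fragments themselves contain placeholders such as `{{WEB_URL}}`.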

 <login_instructions>
@@ -78,17 +77,21 @@ An **exploitable vulnerability** is a logical flaw in the code that represents a

 **Available Tools:**
 - **Task Agent (Code Analysis):** Your primary tool. Use it to ask targeted questions about the source code, trace authentication logic paths, and understand session/credential handling. MANDATORY for all source code analysis.
- **Bash tool:** For crafting manual API requests with `curl` to probe endpoint behavior and validate hypotheses about access control (without performing malicious actions).
+- **save_deliverable (MCP Tool):** Saves deliverable files with automatic validation.
+  - **Parameters:**
+    - `deliverable_type`: "AUTH_ANALYSIS" or "AUTH_QUEUE" (required)
+    - `content`: Your markdown report or JSON queue (required)
+  - **Returns:** `{ status: "success", filepath: "...", validated: true/false }` on success or `{ status: "error", message: "...", errorType: "...", retryable: true/false }` on failure
+  - **Usage:** Call the tool with your deliverable type and content. Queue files must have `{"vulnerabilities": [...]}` structure and will be validated automatically.
+- **Bash tool:** Use for creating directories, copying files, and other shell commands as needed.
 - **{{MCP_SERVER}} (Playwright):** To interact with the live web application to understand multi-step flows like password reset or registration.
 - **TodoWrite Tool:** Use this to create and manage your analysis task list. Create a todo item for each endpoint/flow that needs analysis. Mark items as "in_progress" when working on them and "completed" when done.
-  
- **Write tool:** Use this to create or update `deliverables/auth_exploitation_queue.json` following the exploitation queue format. Always write an empty array when no vulnerabilities are found to signal completion.
 </available_tools>
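The structural requirement for queue files, `{"vulnerabilities": [...]}`, can be checked with a few lines. This is a sketch of what such validation might look like, not the tool's actual validator; `validateQueue` and its return shape are hypothetical.

```javascript
// Hypothetical structural check for an exploitation queue.
// Accepts either a JSON string or an already-parsed object.
function validateQueue(content) {
  let parsed;
  try {
    parsed = typeof content === "string" ? JSON.parse(content) : content;
  } catch (err) {
    return { valid: false, message: `Invalid JSON: ${err.message}` };
  }
  if (parsed === null || typeof parsed !== "object" || !Array.isArray(parsed.vulnerabilities)) {
    return { valid: false, message: 'Expected {"vulnerabilities": [...]}' };
  }
  return { valid: true, message: "ok" };
}
```

An empty array passes this check, which matches the requirement that a queue is saved even when nothing was found.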

 <data_format_specifications>

 <exploitation_queue_format>
-**Purpose:** Defines the structure for a "exploitation queue" saved to `deliverables/auth_exploitation_queue.json` via the Write tool.
+**Purpose:** Defines the structure for an "exploitation queue" saved via the `save_deliverable` MCP tool with `deliverable_type: "AUTH_QUEUE"`.
 **Structure:** The `vulnerability` JSON object MUST follow this exact format:
 	{
 		"ID": "unique ID for each vulnerability json (AUTH-VULN-XX)",
@@ -179,7 +182,7 @@ Rule: when uncertain, round down (favor Medium/Low) to minimize false positives.
 # Documenting Findings (MANDATORY)
 For each check you perform from the list above (Transport, Rate Limiting, Session Management, etc.), you must make a final **verdict**.

- If the verdict is **`vulnerable`**, you must document the finding using the Write tool to update `deliverables/auth_exploitation_queue.json`.
+- If the verdict is **`vulnerable`**, you must document the finding using the `save_deliverable` MCP tool to update the exploitation queue.
 - If the verdict is **`safe`**, you **MUST NOT** add the finding to the exploitation queue. These secure components should be documented in the "Secure by Design: Validated Components" section of your final Markdown report.
 </methodology>

@@ -197,7 +200,7 @@ For each check you perform from the list above (Transport, Rate Limiting, Sessio
 When you have systematically analyzed all relevant endpoints and logic paths, you MUST generate two final files. Follow these instructions precisely.

 **1. Your Specialist Deliverable**
-First, synthesize all of your findings into a detailed Markdown report at `deliverables/auth_analysis_deliverable.md`.
+First, synthesize all of your findings into a detailed Markdown report and save it using the `save_deliverable` MCP tool with `deliverable_type: "AUTH_ANALYSIS"`.
 Your report MUST use the following structure precisely:

 ---
@@ -206,7 +209,7 @@ Your report MUST use the following structure precisely:
 ## 1. Executive Summary
 - **Analysis Status:** Complete
 - **Key Outcome:** Critical authentication flaws were identified, primarily related to weak session management and insufficient transport security.
- **Purpose of this Document:** This report provides the strategic context on the application's authentication mechanisms, dominant flaw patterns, and key architectural details necessary to effectively exploit the vulnerabilities listed in `deliverables/auth_exploitation_queue.json`.
+- **Purpose of this Document:** This report provides the strategic context on the application's authentication mechanisms, dominant flaw patterns, and key architectural details necessary to effectively exploit the vulnerabilities listed in the exploitation queue.

 ## 2. Dominant Vulnerability Patterns

@@ -237,9 +240,9 @@ These components were analyzed and found to have robust defenses. They are low-p

 <exploitation_queue_requirements>
 **Exploitation Queue (MANDATORY)**
-Regardless of whether vulnerabilities are found, you MUST create `deliverables/auth_exploitation_queue.json` using the Write tool:
- **If vulnerabilities found:** Add each confirmed hypothesis using the `exploitation_queue_format`. Set `externally_exploitable` to `true` ONLY if exploitable via public internet without internal access.
- **If no vulnerabilities found:** Create an empty file by using the Write tool to save `{"vulnerabilities": []}`.
+Regardless of whether vulnerabilities are found, you MUST create the exploitation queue using the save_deliverable MCP tool:
+- **If vulnerabilities found:** Use `save_deliverable` MCP tool with `deliverable_type: "AUTH_QUEUE"` and `content: {"vulnerabilities": [...]}` with each confirmed hypothesis following the `exploitation_queue_format`. Set `externally_exploitable` to `true` ONLY if exploitable via public internet without internal access.
+- **If no vulnerabilities found:** Use `save_deliverable` MCP tool with `deliverable_type: "AUTH_QUEUE"` and `content: {"vulnerabilities": []}`.
 - **QUEUE INCLUSION CRITERIA:** ONLY include vulnerabilities where `externally_exploitable = true`. Exclude any vulnerability requiring internal network access, VPN, or direct server access.
 This file serves as the handoff mechanism and must always be created to signal completion.
 </exploitation_queue_requirements>
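The inclusion criteria above reduce to a filter that always yields a queue object, even an empty one. A minimal sketch, assuming each finding carries `verdict` and `externally_exploitable` fields as the prompts describe; `buildQueue` is a hypothetical helper.

```javascript
// Hypothetical helper: builds the queue content for save_deliverable.
// Only vulnerable, externally exploitable findings are included; the result
// always has a "vulnerabilities" array so an empty queue still signals completion.
function buildQueue(findings) {
  const vulnerabilities = findings.filter(
    (f) => f.verdict === "vulnerable" && f.externally_exploitable === true
  );
  return { vulnerabilities };
}
```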
@@ -248,9 +251,9 @@ This file serves as the handoff mechanism and must always be created to signal c
 **COMPLETION REQUIREMENTS (ALL must be satisfied):**

 1.  **Systematic Analysis:** ALL relevant API endpoints and user-facing features identified in the reconnaissance deliverable must be analyzed for AuthN/AuthZ flaws.
-2.  **Deliverable Generation:** Both files must be successfully created:
-    -   `deliverables/auth_analysis_deliverable.md` (Created)
-    -   `deliverables/auth_exploitation_queue.json` (Created)
+2.  **Deliverable Generation:** Both required deliverables must be successfully saved using the `save_deliverable` MCP tool:
+    - Analysis report: Use `save_deliverable` MCP tool with `deliverable_type: "AUTH_ANALYSIS"` and your report as `content`
+    - Exploitation queue: Use `save_deliverable` MCP tool with `deliverable_type: "AUTH_QUEUE"` and `content: {"vulnerabilities": [...]}`

-**ONLY AFTER** both systematic analysis AND successful file generation, announce "**AUTH ANALYSIS COMPLETE**" and stop.
+**ONLY AFTER** both systematic analysis AND successful deliverable generation, announce "**AUTH ANALYSIS COMPLETE**" and stop.
 </conclusion_trigger>
@@ -8,17 +8,15 @@ Success criterion: A complete, code-backed analysis of every potential authoriza
 </objective>

 <scope>
-**EXTERNAL ATTACKER SCOPE:** Only report vulnerabilities exploitable via {{WEB_URL}} from the internet. Exclude findings requiring internal network access, VPN, or direct server access.
+@include(shared/_vuln-scope.txt)
 </scope>

 <target>
-URL: {{WEB_URL}}
+@include(shared/_target.txt)
 </target>

 <rules>
-Rules to Avoid:
-{{RULES_AVOID}}
-
+@include(shared/_rules.txt)
 </rules>

 <login_instructions>
@@ -82,17 +80,21 @@ An **exploitable vulnerability** is a logical flaw in the code that represents a

 **Available Tools:**
 - **Task Agent (Code Analysis):** Your primary tool. Use it to ask targeted questions about the source code, trace authorization logic paths, and understand permission models. MANDATORY for all source code analysis.
- **Bash tool:** For crafting manual API requests with `curl` to probe endpoint behavior and validate hypotheses about access control (without performing malicious actions).
+- **save_deliverable (MCP Tool):** Saves deliverable files with automatic validation.
+  - **Parameters:**
+    - `deliverable_type`: "AUTHZ_ANALYSIS" or "AUTHZ_QUEUE" (required)
+    - `content`: Your markdown report or JSON queue (required)
+  - **Returns:** `{ status: "success", filepath: "...", validated: true/false }` on success or `{ status: "error", message: "...", errorType: "...", retryable: true/false }` on failure
+  - **Usage:** Call the tool with your deliverable type and content. Queue files must have `{"vulnerabilities": [...]}` structure and will be validated automatically.
+- **Bash tool:** Use for creating directories, copying files, and other shell commands as needed.
 - **{{MCP_SERVER}} (Playwright):** To interact with the live web application to understand multi-step flows and role-based access controls.
 - **TodoWrite Tool:** Use this to create and manage your analysis task list. Create a todo item for each endpoint that needs authorization analysis. Mark items as "in_progress" when working on them and "completed" when done.
-  
- **Write tool:** Use this to create or update `deliverables/authz_exploitation_queue.json` following the exploitation queue format. Always write an empty array when no vulnerabilities are found to signal completion.
 </available_tools>

 <data_format_specifications>

 <exploitation_queue_format>
-**Purpose:** Defines the structure for a "exploitation queue" saved to `deliverables/authz_exploitation_queue.json` via the Write tool.
+**Purpose:** Defines the structure for an "exploitation queue" saved via the `save_deliverable` MCP tool with `deliverable_type: "AUTHZ_QUEUE"`.
 **Structure:** The `vulnerability` JSON object MUST follow this exact format:
 	{
 		"ID": "unique ID for each vulnerability (e.g., AUTHZ-VULN-01)",
@@ -237,7 +239,7 @@ An **exploitable vulnerability** is a logical flaw in the code that represents a

 For each analysis you perform from the lists above, you must make a final **verdict**:

- If the verdict is **`vulnerable`**, you must document the finding using the Write tool to update `deliverables/authz_exploitation_queue.json`.
+- If the verdict is **`vulnerable`**, you must document the finding using the `save_deliverable` MCP tool to update the exploitation queue.
 - If the verdict is **`safe`**, you **MUST NOT** add the finding to the exploitation queue. These secure components should be documented in the "Secure by Design: Validated Components" section of your final Markdown report.
 </methodology>

@@ -275,7 +277,7 @@ When you have systematically analyzed all relevant endpoints and logic paths, yo

 **1. Your Specialist Deliverable**

-First, synthesize all of your findings into a single, detailed Markdown report located at `deliverables/authz_analysis_deliverable.md`. This report is the official record of your work.
+First, synthesize all of your findings into a single, detailed Markdown report and save it using the `save_deliverable` MCP tool with `deliverable_type: "AUTHZ_ANALYSIS"`. This report is the official record of your work.

 Your report MUST use the following structure precisely:

@@ -285,7 +287,7 @@ Your report MUST use the following structure precisely:
 ## 1. Executive Summary

 - **Analysis Status:** Complete  
- **Key Outcome:** ONLY high-confidence authorization vulnerabilities (horizontal, vertical, and context/workflow) were recorded. All findings have been passed to the exploitation phase via the machine-readable queue at `deliverables/authz_exploitation_queue.json`.  
+- **Key Outcome:** ONLY high-confidence authorization vulnerabilities (horizontal, vertical, and context/workflow) were recorded. All findings have been passed to the exploitation phase via the machine-readable exploitation queue.  
 - **Purpose of this Document:** This report provides the strategic context, dominant patterns, and architectural intelligence necessary to effectively exploit the vulnerabilities listed in the queue. It is intended to be read alongside the JSON deliverable.  

 ## 2. Dominant Vulnerability Patterns
@@ -341,9 +343,9 @@ examples:

 <exploitation_queue_requirements>
 **Exploitation Queue (MANDATORY)**
-Regardless of whether vulnerabilities are found, you MUST create `deliverables/authz_exploitation_queue.json` using the Write tool:
- **If vulnerabilities found:** Add each confirmed hypothesis using the `exploitation_queue_format`. Set `externally_exploitable` to `true` ONLY if exploitable via public internet without internal access.
- **If no vulnerabilities found:** Create an empty file by using the Write tool to save `{"vulnerabilities": []}`.
+Regardless of whether vulnerabilities are found, you MUST create the exploitation queue using the save_deliverable MCP tool:
+- **If vulnerabilities found:** Use `save_deliverable` MCP tool with `deliverable_type: "AUTHZ_QUEUE"` and `content: {"vulnerabilities": [...]}` with each confirmed hypothesis following the `exploitation_queue_format`. Set `externally_exploitable` to `true` ONLY if exploitable via public internet without internal access.
+- **If no vulnerabilities found:** Use `save_deliverable` MCP tool with `deliverable_type: "AUTHZ_QUEUE"` and `content: {"vulnerabilities": []}`.
 - **QUEUE INCLUSION CRITERIA:** ONLY include vulnerabilities where `externally_exploitable = true`. Exclude any vulnerability requiring internal network access, VPN, or direct server access.
 This file serves as the handoff mechanism and must always be created to signal completion.
 </exploitation_queue_requirements>
@@ -352,11 +354,11 @@ This file serves as the handoff mechanism and must always be created to signal c
 **COMPLETION REQUIREMENTS (ALL must be satisfied):**

 1. **Todo Completion:** ALL tasks in your TodoWrite list must be marked as "completed"
-2. **Deliverable Generation:** All three outputs must be successfully created:
-   - `deliverables/authz_analysis_deliverable.md`
-   - `deliverables/authz_exploitation_queue.json`
+2. **Deliverable Generation:** Both required deliverables must be successfully saved using the `save_deliverable` MCP tool:
+   - Analysis report: Use `save_deliverable` MCP tool with `deliverable_type: "AUTHZ_ANALYSIS"` and your report as `content`
+   - Exploitation queue: Use `save_deliverable` MCP tool with `deliverable_type: "AUTHZ_QUEUE"` and `content: {"vulnerabilities": [...]}`

-**ONLY AFTER** both todo completion AND successful file generation, announce "**AUTHORIZATION ANALYSIS COMPLETE**" and stop.
+**ONLY AFTER** both todo completion AND successful deliverable generation, announce "**AUTHORIZATION ANALYSIS COMPLETE**" and stop.

 **FAILURE TO COMPLETE TODOS = INCOMPLETE ANALYSIS** - You will be considered to have failed the mission if you generate deliverables before completing comprehensive testing of all authorization vectors.
 </conclusion_trigger>
@@ -11,17 +11,15 @@ Success criterion: A complete source-to-sink trace for every identified vulnerab
 </objective>

 <scope>
-**EXTERNAL ATTACKER SCOPE:** Only report vulnerabilities exploitable via {{WEB_URL}} from the internet. Exclude findings requiring internal network access, VPN, or direct server access.
+@include(shared/_vuln-scope.txt)
 </scope>

 <target>
-URL: {{WEB_URL}}
+@include(shared/_target.txt)
 </target>

 <rules>
-Rules to Avoid:
-{{RULES_AVOID}}
-
+@include(shared/_rules.txt)
 </rules>

 <login_instructions>
@@ -84,16 +82,21 @@ An **exploitable vulnerability** is a confirmed source-to-sink path where the en

 **Available Tools:**
 - **Task Agent (Code Analysis):** Your primary tool. Use it to ask targeted questions about the source code, map query/command construction paths, and verify sanitization coverage. MANDATORY for all source code analysis.
- **Bash tool:** For crafting manual API requests with `curl` to probe injection surfaces and validate hypotheses about server behavior (without performing malicious actions).
+- **save_deliverable (MCP Tool):** Saves deliverable files with automatic validation.
+  - **Parameters:**
+    - `deliverable_type`: "INJECTION_ANALYSIS" or "INJECTION_QUEUE" (required)
+    - `content`: Your markdown report or JSON queue (required)
+  - **Returns:** `{ status: "success", filepath: "...", validated: true/false }` on success or `{ status: "error", message: "...", errorType: "...", retryable: true/false }` on failure
+  - **Usage:** Call the tool with your deliverable type and content. Queue files must have `{"vulnerabilities": [...]}` structure and will be validated automatically.
+- **Bash tool:** Use for creating directories, copying files, and other shell commands as needed.
 - **{{MCP_SERVER}} (Playwright):** To interact with the live web application to understand multi-step flows like password reset or registration.
 - **TodoWrite Tool:** Use this to create and manage your analysis task list. Create a todo item for each injection source that needs analysis. Mark items as "in_progress" when working on them and "completed" when done.
- **Write tool:** Use this to create or update `deliverables/injection_exploitation_queue.json` following the exploitation queue format. Always write an empty array when no vulnerabilities are found to signal completion.
 </available_tools>

 <data_format_specifications>

  <exploitation_queue_format>
-  **Purpose:** Defines the structure for a "exploitation queue" saved to `deliverables/injection_exploitation_queue.json` via the Write tool.
+  **Purpose:** Defines the structure for an "exploitation queue" saved via the `save_deliverable` MCP tool with `deliverable_type: "INJECTION_QUEUE"`.

  **Structure:** The `vulnerability` JSON object MUST follow this exact format:
 		{
@@ -359,10 +362,10 @@ These input vectors were traced and confirmed to have robust, context-appropriat
 <exploitation_queue_requirements>
 **Exploitation Queue (MANDATORY)**

-Regardless of whether vulnerabilities are found, you MUST create `deliverables/injection_exploitation_queue.json` using the Write tool:
+Regardless of whether vulnerabilities are found, you MUST create the exploitation queue using the save_deliverable MCP tool:

- **If vulnerabilities found:** Add each exploitable injection vulnerability (vertict: "vulnerable") using the exploitation_queue_format
- **If no vulnerabilities found:** Create an empty file by using the Write tool to save `{"vulnerabilities": []}`
+- **If vulnerabilities found:** Use `save_deliverable` MCP tool with `deliverable_type: "INJECTION_QUEUE"` and `content: {"vulnerabilities": [...]}` with each exploitable injection vulnerability (verdict: "vulnerable") following the exploitation_queue_format
+- **If no vulnerabilities found:** Use `save_deliverable` MCP tool with `deliverable_type: "INJECTION_QUEUE"` and `content: {"vulnerabilities": []}`

 This file serves as the handoff mechanism to the Exploitation phase and must always be created to signal completion of your analysis.
 </exploitation_queue_requirements>
@@ -371,11 +374,11 @@ This file serves as the handoff mechanism to the Exploitation phase and must alw
 **COMPLETION REQUIREMENTS (ALL must be satisfied):**

 1. **Todo Completion:** ALL tasks in your TodoWrite list must be marked as "completed"
-2. **Deliverable Generation:** Both required files must be successfully created:
-   - `deliverables/injection_analysis_deliverable.md`
-   - `deliverables/injection_exploitation_queue.json`
+2. **Deliverable Generation:** Both required deliverables must be successfully saved using the `save_deliverable` MCP tool:
+   - Analysis report: Use `save_deliverable` MCP tool with `deliverable_type: "INJECTION_ANALYSIS"` and your report as `content`
+   - Exploitation queue: Use `save_deliverable` MCP tool with `deliverable_type: "INJECTION_QUEUE"` and `content: {"vulnerabilities": [...]}`

-**ONLY AFTER** both todo completion AND successful file generation, announce "**INJECTION ANALYSIS COMPLETE**" and stop.
+**ONLY AFTER** both todo completion AND successful deliverable generation, announce "**INJECTION ANALYSIS COMPLETE**" and stop.

 **FAILURE TO COMPLETE TODOS = INCOMPLETE ANALYSIS** - You will be considered to have failed the mission if you generate deliverables before completing comprehensive testing of all input vectors.
 </conclusion_trigger>
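The queue rule in the injection prompt above (always save a `{"vulnerabilities": [...]}` wrapper, even when nothing was found) can be sketched as a small helper. This is a minimal illustration, not code from this commit: `buildQueueContent` is a hypothetical name, the `verdict` field on each finding is assumed from the prompt wording, and the `INJ-VULN-XX` IDs are made up for the example.

```javascript
// Hypothetical helper (not part of this commit): builds the `content` payload
// that would be passed to save_deliverable with deliverable_type "INJECTION_QUEUE".
function buildQueueContent(findings) {
  // Only findings with a "vulnerable" verdict belong in the exploitation queue.
  const vulnerabilities = findings.filter((f) => f.verdict === 'vulnerable');
  // The wrapper object is always returned, so "no findings" becomes an empty array.
  return { vulnerabilities };
}

const emptyQueue = buildQueueContent([]);
const populatedQueue = buildQueueContent([
  { ID: 'INJ-VULN-01', verdict: 'vulnerable' }, // illustrative ID format
  { ID: 'INJ-VULN-02', verdict: 'safe' },
]);

console.log(JSON.stringify(emptyQueue)); // {"vulnerabilities":[]}
console.log(populatedQueue.vulnerabilities.length); // 1
```

Returning the wrapper unconditionally keeps the "no vulnerabilities" case on the same code path as the populated case, which is what lets the saved queue double as a completion signal.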
@@ -8,16 +8,15 @@ Success criterion: A complete source-to-sink trace for every identified SSRF vul
 </objective>

 <scope>
-**EXTERNAL ATTACKER SCOPE:** Only report vulnerabilities exploitable via {{WEB_URL}} from the internet. Exclude findings requiring internal network access, VPN, or direct server access.
+@include(shared/_vuln-scope.txt)
 </scope>

 <target>
-URL: {{WEB_URL}}
+@include(shared/_target.txt)
 </target>

 <rules>
-Rules to Avoid:
-{{RULES_AVOID}}
+@include(shared/_rules.txt)
 </rules>

 <login_instructions>
@@ -77,17 +76,21 @@ An **exploitable vulnerability** is a data flow where user-controlled input infl

 **Available Tools:**
 - **Task Agent (Code Analysis):** Your primary tool. Use it to ask targeted questions about the source code, trace data flows, and understand HTTP client usage. MANDATORY for all source code analysis.
- **Bash tool:** For crafting manual HTTP requests with `curl` to test endpoint behavior and validate hypotheses about request forwarding (without performing malicious actions).
+- **save_deliverable (MCP Tool):** Saves deliverable files with automatic validation.
+  - **Parameters:**
+    - `deliverable_type`: "SSRF_ANALYSIS" or "SSRF_QUEUE" (required)
+    - `content`: Your markdown report or JSON queue (required)
+  - **Returns:** `{ status: "success", filepath: "...", validated: true/false }` on success or `{ status: "error", message: "...", errorType: "...", retryable: true/false }` on failure
+  - **Usage:** Call the tool with your deliverable type and content. Queue files must have `{"vulnerabilities": [...]}` structure and will be validated automatically.
+- **Bash tool:** Use for creating directories, copying files, and other shell commands as needed.
 - **{{MCP_SERVER}} (Playwright):** To interact with the live web application to understand multi-step flows that might involve URL redirection or proxy functionality.
 - **TodoWrite Tool:** Use this to create and manage your analysis task list. Create a todo item for each SSRF sink that needs analysis. Mark items as "in_progress" when working on them and "completed" when done.
-  
- **Write tool:** Use this to create or update `deliverables/ssrf_exploitation_queue.json` following the exploitation queue format. Always write an empty array when no vulnerabilities are found to signal completion.
 </available_tools>

 <data_format_specifications>

 <exploitation_queue_format>
-**Purpose:** Defines the structure for a "exploitation queue" saved to `deliverables/ssrf_exploitation_queue.json` via the Write tool.
+**Purpose:** Defines the structure for an "exploitation queue" saved via the `save_deliverable` MCP tool with `deliverable_type: "SSRF_QUEUE"`.
 **Structure:** The `vulnerability` JSON object MUST follow this exact format:
 	{
 		"ID": "unique ID for each vulnerability json (SSRF-VULN-XX)",
@@ -214,7 +217,7 @@ Confidence levels:
 ### **4) Documentation**

 - **Vulnerable paths** → add to exploitation queue, include a minimal `witness_payload` (e.g., `http://127.0.0.1:22/`).
- **Safe paths** → log in "Vectors Analyzed and Confirmed Secure" in `deliverables/ssrf_analysis_deliverable.md`.
+- **Safe paths** → log in "Vectors Analyzed and Confirmed Secure" in your analysis deliverable.

 # Confidence scoring (analysis phase; applies to all checks above)
 - **High** — The flaw is directly established and deterministic in the target context. You have direct evidence with no material alternate control. Scope is clear (which endpoints/parameters).
@@ -226,7 +229,7 @@ Rule: when uncertain, round down (favor Medium/Low) to minimize false positives.
 # Documenting Findings (MANDATORY)
 For each check you perform from the list above, you must make a final **verdict**.

- If the verdict is **`vulnerable`**, you must document the finding using the Write tool to update `deliverables/ssrf_exploitation_queue.json`.
+- If the verdict is **`vulnerable`**, you must document the finding using the `save_deliverable` MCP tool to update the exploitation queue.
 - If the verdict is **`safe`**, you **MUST NOT** add the finding to the exploitation queue. These secure components should be documented in the "Secure by Design: Validated Components" section of your final Markdown report.
 </methodology>

@@ -244,7 +247,7 @@ For each check you perform from the list above, you must make a final **verdict*
 When you have systematically analyzed all relevant endpoints and request-making functions, you MUST generate two final files. Follow these instructions precisely.

 **1. Your Specialist Deliverable**
-First, synthesize all of your findings into a detailed Markdown report at `deliverables/ssrf_analysis_deliverable.md`.
+First, synthesize all of your findings into a detailed Markdown report and save it using the `save_deliverable` MCP tool with `deliverable_type: "SSRF_ANALYSIS"`.
 Your report MUST use the following structure precisely:

 ---
@@ -253,7 +256,7 @@ Your report MUST use the following structure precisely:
 ## 1. Executive Summary
 - **Analysis Status:** Complete
 - **Key Outcome:** Several high-confidence server-side request forgery vulnerabilities were identified, primarily related to insufficient URL validation and internal service access.
- **Purpose of this Document:** This report provides the strategic context on the application's outbound request mechanisms, dominant flaw patterns, and key architectural details necessary to effectively exploit the vulnerabilities listed in `deliverables/ssrf_exploitation_queue.json`.
+- **Purpose of this Document:** This report provides the strategic context on the application's outbound request mechanisms, dominant flaw patterns, and key architectural details necessary to effectively exploit the vulnerabilities listed in the exploitation queue.

 ## 2. Dominant Vulnerability Patterns

@@ -284,9 +287,9 @@ These components were analyzed and found to have robust defenses. They are low-p

 <exploitation_queue_requirements>
 **Exploitation Queue (MANDATORY)**
-Regardless of whether vulnerabilities are found, you MUST create `deliverables/ssrf_exploitation_queue.json` using the Write tool:
- **If vulnerabilities found:** Add each confirmed hypothesis using the `exploitation_queue_format`. Set `externally_exploitable` to `true` ONLY if exploitable via public internet without internal access.
- **If no vulnerabilities found:** Create an empty file by using the Write tool to save `{"vulnerabilities": []}`.
+Regardless of whether vulnerabilities are found, you MUST create the exploitation queue using the save_deliverable MCP tool:
+- **If vulnerabilities found:** Use `save_deliverable` MCP tool with `deliverable_type: "SSRF_QUEUE"` and `content: {"vulnerabilities": [...]}` with each confirmed hypothesis following the `exploitation_queue_format`. Set `externally_exploitable` to `true` ONLY if exploitable via public internet without internal access.
+- **If no vulnerabilities found:** Use `save_deliverable` MCP tool with `deliverable_type: "SSRF_QUEUE"` and `content: {"vulnerabilities": []}`.
 - **QUEUE INCLUSION CRITERIA:** ONLY include vulnerabilities where `externally_exploitable = true`. Exclude any vulnerability requiring internal network access, VPN, or direct server access.
 This file serves as the handoff mechanism and must always be created to signal completion.
 </exploitation_queue_requirements>
@@ -295,9 +298,9 @@ This file serves as the handoff mechanism and must always be created to signal c
 **COMPLETION REQUIREMENTS (ALL must be satisfied):**

 1.  **Systematic Analysis:** ALL relevant API endpoints and request-making features identified in the reconnaissance deliverable must be analyzed for SSRF vulnerabilities.
-2.  **Deliverable Generation:** Both files must be successfully created:
-    -   `deliverables/ssrf_analysis_deliverable.md` (Created)
-    -   `deliverables/ssrf_exploitation_queue.json` (Created)
+2.  **Deliverable Generation:** Both required deliverables must be successfully saved using the `save_deliverable` MCP tool:
+    - Analysis report: Use `save_deliverable` MCP tool with `deliverable_type: "SSRF_ANALYSIS"` and your report as `content`
+    - Exploitation queue: Use `save_deliverable` MCP tool with `deliverable_type: "SSRF_QUEUE"` and `content: {"vulnerabilities": [...]}`

-**ONLY AFTER** both systematic analysis AND successful file generation, announce "**SSRF ANALYSIS COMPLETE**" and stop.
+**ONLY AFTER** both systematic analysis AND successful deliverable generation, announce "**SSRF ANALYSIS COMPLETE**" and stop.
 </conclusion_trigger>
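The SSRF prompt above states that queue content must have a `{"vulnerabilities": [...]}` structure and is validated automatically. The real validation logic is not shown in this part of the diff, so the following is only a sketch of what such a structural check might look like. `checkQueueStructure` and its return shape are assumptions for illustration, not the tool's actual API.

```javascript
// Hypothetical structural check (the real save_deliverable validation is not shown here).
function checkQueueStructure(content) {
  let parsed;
  try {
    // Accept either a JSON string or an already-parsed object.
    parsed = typeof content === 'string' ? JSON.parse(content) : content;
  } catch (err) {
    return { valid: false, message: `Invalid JSON: ${err.message}` };
  }

  const isObject = parsed !== null && typeof parsed === 'object';
  if (!isObject || !Array.isArray(parsed.vulnerabilities)) {
    return { valid: false, message: 'Expected an object with a "vulnerabilities" array' };
  }

  return { valid: true, count: parsed.vulnerabilities.length };
}

console.log(checkQueueStructure('{"vulnerabilities": []}').valid); // true
console.log(checkQueueStructure('[]').valid); // false
```

A bare array is rejected on purpose: the prompts require the wrapper object even for an empty queue.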
@@ -8,16 +8,15 @@ Success criterion: Live confirmation of XSS execution for every vulnerability th
 </objective>

 <scope>
-**EXTERNAL ATTACKER SCOPE:** Only report vulnerabilities exploitable via {{WEB_URL}} from the internet. Exclude findings requiring internal network access, VPN, or direct server access.
+@include(shared/_vuln-scope.txt)
 </scope>

 <target>
-URL: {{WEB_URL}}
+@include(shared/_target.txt)
 </target>

 <rules>
-Rules to Avoid:
-{{RULES_AVOID}}
+@include(shared/_rules.txt)
 </rules>

 <login_instructions>
@@ -82,15 +81,19 @@ An **exploitable vulnerability** is a confirmed source-to-sink path where the en
 - **Terminal (curl):** MANDATORY for testing HTTP-based XSS vectors and observing raw HTML responses. Use for reflected XSS testing and JSONP injection testing.
 - **{{MCP_SERVER}} (Playwright):** MANDATORY for testing DOM-based XSS and form submission vectors. Use for stored XSS testing and client-side payload execution verification.
 - **TodoWrite Tool:** Use this to create and manage your analysis task list. Create a todo item for each sink you need to analyze.
-
-**Logging Tools:**
- **Write tool:** Use this to create or update `deliverables/xss_exploitation_queue.json` following the exploitation queue format. Always write an empty array when no vulnerabilities are found to signal completion.
+- **save_deliverable (MCP Tool):** Saves deliverable files with automatic validation.
+  - **Parameters:**
+    - `deliverable_type`: "XSS_ANALYSIS" or "XSS_QUEUE" (required)
+    - `content`: Your markdown report or JSON queue (required)
+  - **Returns:** `{ status: "success", filepath: "...", validated: true/false }` on success or `{ status: "error", message: "...", errorType: "...", retryable: true/false }` on failure
+  - **Usage:** Call the tool with your deliverable type and content. Queue files must have `{"vulnerabilities": [...]}` structure and will be validated automatically.
+- **Bash tool:** Use for creating directories, copying files, and other shell commands as needed.
 </available_tools>

 <data_format_specifications>

 <exploitation_queue_format>
-Purpose: Defines the structure for a "exploitation queue" saved to `deliverables/xss_exploitation_queue.json` via the Write tool.
+Purpose: Defines the structure for an "exploitation queue" saved via the `save_deliverable` MCP tool with `deliverable_type: "XSS_QUEUE"`.
 Structure: The vulnerability JSON object MUST follow this exact format:
 	{
 		"ID": "unique ID for each vulnerability json (XSS-VULN-XX)",
@@ -175,8 +178,8 @@ This rulebook is used for the **Early Termination** check in Step 2.
 - Include both safe and vulnerable paths to demonstrate **full coverage**.  
 - Craft a minimal `witness_payload` that proves control over the render context.  
 - For every path analyzed, you must document the outcome. The location of the documentation depends on the verdict:
-		- If the verdict is 'vulnerable', you MUST use the Write tool to save the finding to `deliverables/xss_exploitation_queue.json`, including complete source-to-sink information.
-		- If the verdict is 'safe', you MUST NOT add it to the exploitation queue. Instead, you will document these secure paths in the "Vectors Analyzed and Confirmed Secure" table of your final report (deliverables/xss_analysis_deliverable.md).
+		- If the verdict is 'vulnerable', you MUST use the `save_deliverable` MCP tool to save the finding to the exploitation queue, including complete source-to-sink information.
+		- If the verdict is 'safe', you MUST NOT add it to the exploitation queue. Instead, you will document these secure paths in the "Vectors Analyzed and Confirmed Secure" table of your final analysis report.
 - For vulnerable findings, craft a minimal witness_payload that proves control over the render context.

 ### **8) Score Confidence**
@@ -267,20 +270,16 @@ These input vectors were traced and confirmed to have robust, context-appropriat

 ## Exploitation Queue (MANDATORY)

-Regardless of whether vulnerabilities are found, you MUST create:  
-`deliverables/xss_exploitation_queue.json`  
-using the **Write tool**.
+Regardless of whether vulnerabilities are found, you MUST create the exploitation queue using the save_deliverable MCP tool.

 - **If exploitable vulnerabilities found:**
-  Add each exploitable XSS vulnerability (verdict: "vulnerable") using the `exploitation_queue_format` when writing the file. Set `externally_exploitable` to `true` ONLY if exploitable via public internet without internal access.
+  Use `save_deliverable` MCP tool with `deliverable_type: "XSS_QUEUE"` and `content: {"vulnerabilities": [...]}` with each exploitable XSS vulnerability (verdict: "vulnerable") following the `exploitation_queue_format`. Set `externally_exploitable` to `true` ONLY if exploitable via public internet without internal access.

 - **If no exploitable vulnerabilities found:**
-  Create an empty file by using the Write tool to save the following JSON:
+  Use `save_deliverable` MCP tool with `deliverable_type: "XSS_QUEUE"` and `content: {"vulnerabilities": []}`

-  ```json
-  {"vulnerabilities": []}
-  ```
 - **QUEUE INCLUSION CRITERIA:** ONLY include vulnerabilities where `externally_exploitable = true`. Exclude any vulnerability requiring internal network access, VPN, or direct server access.
+
 This file is the mandatory handoff to the Exploitation phase.
 </exploitation_queue_requirements>

@@ -288,9 +287,9 @@ This file is the mandatory handoff to the Exploitation phase.
 COMPLETION REQUIREMENTS (ALL must be satisfied):

 1. Systematic Analysis: ALL input vectors identified from the reconnaissance deliverable must be analyzed.
-2. Deliverable Generation: Both required files must be successfully created:
-	 - deliverables/xss_analysis_deliverable.md (Created)
-	 - deliverables/xss_exploitation_queue.json (Created)
+2. Deliverable Generation: Both required deliverables must be successfully saved using the `save_deliverable` MCP tool:
+   - Analysis report: Use `save_deliverable` MCP tool with `deliverable_type: "XSS_ANALYSIS"` and your report as `content`
+   - Exploitation queue: Use `save_deliverable` MCP tool with `deliverable_type: "XSS_QUEUE"` and `content: {"vulnerabilities": [...]}`

-ONLY AFTER both systematic analysis AND successful file generation, announce "XSS ANALYSIS COMPLETE" and stop.
+ONLY AFTER both systematic analysis AND successful deliverable generation, announce "XSS ANALYSIS COMPLETE" and stop.
 </conclusion_trigger>
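The queue inclusion criterion in the XSS prompt above (only findings with `externally_exploitable = true` enter the queue) could be applied mechanically before saving. The sketch below assumes findings are plain objects carrying the `externally_exploitable` flag from the exploitation queue format. `partitionByExposure` is a hypothetical name and does not appear in this commit.

```javascript
// Illustrative sketch only: splits findings by the externally_exploitable flag.
function partitionByExposure(findings) {
  const queued = [];
  const excluded = [];
  for (const finding of findings) {
    // Strict comparison: a missing or non-boolean flag is treated as not exploitable.
    if (finding.externally_exploitable === true) {
      queued.push(finding);
    } else {
      excluded.push(finding);
    }
  }
  // `content` matches the wrapper the prompts require for XSS_QUEUE.
  return { content: { vulnerabilities: queued }, excluded };
}

const xssResult = partitionByExposure([
  { ID: 'XSS-VULN-01', externally_exploitable: true },
  { ID: 'XSS-VULN-02', externally_exploitable: false },
  { ID: 'XSS-VULN-03' },
]);

console.log(xssResult.content.vulnerabilities.map((v) => v.ID)); // [ 'XSS-VULN-01' ]
console.log(xssResult.excluded.length); // 2
```

Keeping the excluded findings around, rather than discarding them, leaves them available for the analysis report.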
@@ -0,0 +1,175 @@
+#!/usr/bin/env node
+
+/**
+ * Export Metrics to CSV
+ *
+ * Converts session.json from audit-logs into CSV format for spreadsheet analysis.
+ *
+ * DATA SOURCE:
+ * - Reads from: audit-logs/{hostname}_{sessionId}/session.json
+ * - Source of truth for all metrics, timing, and cost data
+ * - Automatically created by Shannon during agent execution
+ *
+ * CSV OUTPUT:
+ * - One row per agent with: agent, phase, status, attempts, duration_ms, cost_usd
+ * - Perfect for importing into Excel/Google Sheets for analysis
+ *
+ * USE CASES:
+ * - Compare performance across multiple sessions
+ * - Track costs and optimize budget
+ * - Identify slow agents for optimization
+ * - Generate charts and visualizations
+ * - Export data for external reporting tools
+ *
+ * EXAMPLES:
+ * ```bash
+ * # Export to stdout
+ * ./scripts/export-metrics.js --session-id abc123
+ *
+ * # Export to file
+ * ./scripts/export-metrics.js --session-id abc123 --output metrics.csv
+ *
+ * # Find session ID from Shannon store
+ * cat .shannon-store.json | jq '.sessions | keys'
+ * ```
+ *
+ * NOTE: For raw metrics, just read audit-logs/.../session.json directly.
+ * This script only exists to provide a spreadsheet-friendly CSV format.
+ */
+
+import chalk from 'chalk';
+import { fs, path } from 'zx';
+import { getSession } from '../src/session-manager.js';
+import { AuditSession } from '../src/audit/index.js';
+
+// Parse command-line arguments
+function parseArgs() {
+  const args = {
+    sessionId: null,
+    output: null
+  };
+
+  for (let i = 2; i < process.argv.length; i++) {
+    const arg = process.argv[i];
+
+    if (arg === '--session-id' && process.argv[i + 1]) {
+      args.sessionId = process.argv[i + 1];
+      i++;
+    } else if (arg === '--output' && process.argv[i + 1]) {
+      args.output = process.argv[i + 1];
+      i++;
+    } else if (arg === '--help' || arg === '-h') {
+      printUsage();
+      process.exit(0);
+    } else {
+      console.log(chalk.red(`❌ Unknown argument: ${arg}`));
+      printUsage();
+      process.exit(1);
+    }
+  }
+
+  return args;
+}
+
+function printUsage() {
+  console.log(chalk.cyan('\n📊 Export Metrics to CSV'));
+  console.log(chalk.gray('\nUsage: ./scripts/export-metrics.js [options]\n'));
+  console.log(chalk.white('Options:'));
+  console.log(chalk.gray('  --session-id <id>      Session ID to export (required)'));
+  console.log(chalk.gray('  --output <file>        Output CSV file path (default: stdout)'));
+  console.log(chalk.gray('  --help, -h             Show this help\n'));
+  console.log(chalk.white('Examples:'));
+  console.log(chalk.gray('  # Export to stdout'));
+  console.log(chalk.gray('  ./scripts/export-metrics.js --session-id abc123\n'));
+  console.log(chalk.gray('  # Export to file'));
+  console.log(chalk.gray('  ./scripts/export-metrics.js --session-id abc123 --output metrics.csv\n'));
+}
+
+// Export metrics for a session
+async function exportMetrics(sessionId) {
+  const session = await getSession(sessionId);
+  if (!session) {
+    throw new Error(`Session ${sessionId} not found`);
+  }
+
+  const auditSession = new AuditSession(session);
+  await auditSession.initialize();
+  const metrics = await auditSession.getMetrics();
+
+  return exportAsCSV(session, metrics);
+}
+
+// Export as CSV
+function exportAsCSV(session, metrics) {
+  const lines = [];
+
+  // Header
+  lines.push('agent,phase,status,attempts,duration_ms,cost_usd');
+
+  // Phase mapping
+  const phaseMap = {
+    'pre-recon': 'pre-recon',
+    'recon': 'recon',
+    'injection-vuln': 'vulnerability-analysis',
+    'xss-vuln': 'vulnerability-analysis',
+    'auth-vuln': 'vulnerability-analysis',
+    'authz-vuln': 'vulnerability-analysis',
+    'ssrf-vuln': 'vulnerability-analysis',
+    'injection-exploit': 'exploitation',
+    'xss-exploit': 'exploitation',
+    'auth-exploit': 'exploitation',
+    'authz-exploit': 'exploitation',
+    'ssrf-exploit': 'exploitation',
+    'report': 'reporting'
+  };
+
+  // Agent rows
+  for (const [agentName, agentData] of Object.entries(metrics.metrics.agents)) {
+    const phase = phaseMap[agentName] || 'unknown';
+
+    lines.push([
+      agentName,
+      phase,
+      agentData.status,
+      agentData.attempts.length,
+      agentData.final_duration_ms,
+      agentData.total_cost_usd.toFixed(4)
+    ].join(','));
+  }
+
+  return lines.join('\n');
+}
+
+// Main execution
+async function main() {
+  const args = parseArgs();
+
+  if (!args.sessionId) {
+    console.log(chalk.red('❌ Must specify --session-id'));
+    printUsage();
+    process.exit(1);
+  }
+
+  // Status messages go to stderr so that stdout carries only the CSV when piped.
+  console.error(chalk.cyan.bold('\n📊 Exporting Metrics to CSV\n'));
+  console.error(chalk.gray(`Session ID: ${args.sessionId}\n`));
+
+  const output = await exportMetrics(args.sessionId);
+
+  if (args.output) {
+    await fs.writeFile(args.output, output);
+    console.error(chalk.green(`✅ Exported to: ${args.output}`));
+  } else {
+    console.log(output);
+  }
+
+  console.error();
+}
+
+main().catch(error => {
+  console.log(chalk.red.bold(`\n🚨 Fatal error: ${error.message}`));
+  if (process.env.DEBUG) {
+    console.log(chalk.gray(error.stack));
+  }
+  process.exit(1);
+});
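`exportAsCSV` above joins fields with a plain `,`. That works while agent names and statuses contain no commas, quotes, or newlines. If that assumption ever stops holding, fields would need quoting. The sketch below shows one conventional approach (RFC 4180 style quoting). `csvField` and `csvRow` are hypothetical helpers, not part of this commit.

```javascript
// Hypothetical helpers: quote a field only when it contains a delimiter,
// a double quote, or a line break; embedded quotes are doubled.
function csvField(value) {
  const text = value === null || value === undefined ? '' : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

function csvRow(values) {
  return values.map(csvField).join(',');
}

console.log(csvRow(['xss-vuln', 'vulnerability-analysis', 'failed, retried', 2, 1500, '0.1234']));
// xss-vuln,vulnerability-analysis,"failed, retried",2,1500,0.1234
```

Mapping `null` and `undefined` to an empty field also avoids the literal strings "null" or "undefined" appearing in the spreadsheet when a metric is missing.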
@@ -12,8 +12,7 @@ import { createSession, updateSession, getSession, AGENTS } from './src/session-
 import { runPhase, getGitCommitHash } from './src/checkpoint-manager.js';

 // Setup and Deliverables
-import { setupLocalRepo, cleanupMCP } from './src/setup/environment.js';
-import { saveRunMetadata, savePermanentDeliverables } from './src/setup/deliverables.js';
+import { setupLocalRepo } from './src/setup/environment.js';

 // AI and Prompts
 import { runClaudePromptWithRetry } from './src/ai/claude-executor.js';
@@ -24,8 +23,8 @@ import { executePreReconPhase } from './src/phases/pre-recon.js';
 import { assembleFinalReport } from './src/phases/reporting.js';

 // Utils
-import { timingResults, costResults, displayTimingSummary, Timer, formatDuration } from './src/utils/metrics.js';
-import { setupLogging } from './src/utils/logger.js';
+import { timingResults, costResults, displayTimingSummary, Timer } from './src/utils/metrics.js';
+import { formatDuration } from './src/audit/utils.js';

 // CLI
 import { handleDeveloperCommand } from './src/cli/command-handler.js';
@@ -45,21 +44,16 @@ import {
 // Configure zx to disable timeouts (let tools run as long as needed)
 $.timeout = 0;

-// Global cleanup function for logging
-let cleanupLogging = null;
-
 // Setup graceful cleanup on process signals
 process.on('SIGINT', async () => {
  console.log(chalk.yellow('\n⚠️ Received SIGINT, cleaning up...'));
-  await cleanupMCP();
-  if (cleanupLogging) await cleanupLogging();
+
  process.exit(0);
 });

 process.on('SIGTERM', async () => {
  console.log(chalk.yellow('\n⚠️ Received SIGTERM, cleaning up...'));
-  await cleanupMCP();
-  if (cleanupLogging) await cleanupLogging();
+
  process.exit(0);
 });

@@ -137,7 +131,6 @@ async function main(webUrl, repoPath, configPath = null, pipelineTestingMode = f
    console.log(chalk.gray('Use developer commands to run individual agents:'));
    console.log(chalk.gray('  ./shannon.mjs --run-agent pre-recon'));
    console.log(chalk.gray('  ./shannon.mjs --status'));
-    await cleanupMCP();
    process.exit(0);
  }

@@ -178,21 +171,11 @@ async function main(webUrl, repoPath, configPath = null, pipelineTestingMode = f
    );
  }

-  // Save run metadata with error handling
-  try {
-    await saveRunMetadata(sourceDir, webUrl, repoPath);
-  } catch (error) {
-    // Non-critical operation, log warning and continue
-    console.log(chalk.yellow(`⚠️ Failed to save run metadata: ${error.message}`));
-    await logError(error, 'Run metadata saving', sourceDir);
-  }
-
  // Check if we should continue from where session left off
  const nextAgent = getNextAgent(session);
  if (!nextAgent) {
    console.log(chalk.green(`✅ All agents completed! Session is finished.`));
    await displayTimingSummary(timingResults, costResults, session.completedAgents);
-    await cleanupMCP();
    process.exit(0);
  }

@@ -233,7 +216,7 @@ async function main(webUrl, repoPath, configPath = null, pipelineTestingMode = f
      AGENTS['recon'].displayName,
      'recon',  // Agent name for snapshot creation
      chalk.cyan,
-      { webUrl, sessionId: session.id }  // Session metadata for logging
+      { id: session.id, webUrl }  // Session metadata for audit logging (STANDARD: use 'id' field)
    );
    const reconDuration = reconTimer.stop();
    timingResults.phases['recon'] = reconDuration;
@@ -309,7 +292,7 @@ async function main(webUrl, repoPath, configPath = null, pipelineTestingMode = f
      'Executive Summary and Report Cleanup',
      'report',  // Agent name for snapshot creation
      chalk.cyan,
-      { webUrl, sessionId: session.id }  // Session metadata for logging
+      { id: session.id, webUrl }  // Session metadata for audit logging (STANDARD: use 'id' field)
    );

    const reportDuration = reportTimer.stop();
@@ -350,19 +333,6 @@ async function main(webUrl, repoPath, configPath = null, pipelineTestingMode = f
    costBreakdown
  });

-  // Save deliverables to permanent location in Documents
-  const permanentPath = await savePermanentDeliverables(
-    sourceDir, webUrl, repoPath, session, timingBreakdown, costBreakdown
-  );
-  if (permanentPath) {
-    console.log(chalk.green(`📂 Deliverables permanently saved to: ${permanentPath}`));
-  }
-
-  // Keep files for manual review
-  console.log(chalk.blue(`📁 Files preserved for review at: ${sourceDir}`));
-  console.log(chalk.gray(`   Deliverables: ${sourceDir}/deliverables/`));
-  console.log(chalk.gray(`   Source code: ${sourceDir}/`));
-
  // Display comprehensive timing summary
  displayTimingSummary();

@@ -383,7 +353,6 @@ if (args[0] && args[0].includes('shannon.mjs')) {
 // Parse flags and arguments
 let configPath = null;
 let pipelineTestingMode = false;
-let logFilePath = null;
 const nonFlagArgs = [];
 let developerCommand = null;
 const developerCommands = ['--run-phase', '--run-all', '--rollback-to', '--rerun', '--status', '--list-agents', '--cleanup'];
@@ -397,16 +366,6 @@ for (let i = 0; i < args.length; i++) {
      console.log(chalk.red('❌ --config flag requires a file path'));
      process.exit(1);
    }
-  } else if (args[i] === '--log') {
-    // --log can optionally take a file path, otherwise use default
-    if (i + 1 < args.length && !args[i + 1].startsWith('-')) {
-      logFilePath = args[i + 1];
-      i++; // Skip the next argument
-    } else {
-      // Generate default log filename with timestamp
-      const timestamp = new Date().toISOString().replace(/[:.]/g, '-').slice(0, -5);
-      logFilePath = `shannon-${timestamp}.log`;
-    }
  } else if (args[i] === '--pipeline-testing') {
    pipelineTestingMode = true;
  } else if (developerCommands.includes(args[i])) {
@@ -433,25 +392,10 @@ if (args.includes('--help') || args.includes('-h') || args.includes('help')) {
  process.exit(0);
 }

-// Setup logging if --log flag is present
-if (logFilePath) {
-  try {
-    cleanupLogging = await setupLogging(logFilePath);
-    const absoluteLogPath = path.isAbsolute(logFilePath)
-      ? logFilePath
-      : path.join(process.cwd(), logFilePath);
-    console.log(chalk.green(`📝 Logging enabled: ${absoluteLogPath}`));
-  } catch (error) {
-    console.log(chalk.yellow(`⚠️ Failed to setup logging: ${error.message}`));
-    console.log(chalk.gray('Continuing without logging...'));
-  }
-}
-
 // Handle developer commands
 if (developerCommand) {
  await handleDeveloperCommand(developerCommand, nonFlagArgs, pipelineTestingMode, runClaudePromptWithRetry, loadPrompt);
-  await cleanupMCP();
-  if (cleanupLogging) await cleanupLogging();
+
  process.exit(0);
 }

@@ -501,8 +445,7 @@ try {
  const finalReportPath = await main(webUrl, repoPathValidation.path, configPath, pipelineTestingMode);
  console.log(chalk.green.bold('\n📄 FINAL REPORT AVAILABLE:'));
  console.log(chalk.cyan(finalReportPath));
-  await cleanupMCP();
-  if (cleanupLogging) await cleanupLogging();
+
 } catch (error) {
  // Enhanced error boundary with proper logging
  if (error instanceof PentestError) {
@@ -522,7 +465,6 @@ try {
      console.log(chalk.gray(`   Stack: ${error?.stack || 'No stack trace available'}`));
    }
  }
-  await cleanupMCP();
-  if (cleanupLogging) await cleanupLogging();
+
  process.exit(1);
 }
@@ -1,309 +0,0 @@
-import chalk from 'chalk';
-import { path } from 'zx';
-
-export class AgentStatusManager {
-  constructor(options = {}) {
-    this.mode = options.mode || 'parallel'; // 'parallel' or 'single'
-    this.activeStatuses = new Map();
-    this.lastStatusLine = '';
-    this.hiddenOperationCount = 0;
-    this.lastSummaryCount = 0;
-    this.summaryInterval = options.summaryInterval || 10;
-    this.showTodos = options.showTodos !== false;
-
-    // Tools to completely hide in output
-    this.suppressedTools = new Set([
-      'Read', 'Write', 'Edit', 'MultiEdit',
-      'Grep', 'Glob', 'LS'
-    ]);
-
-    // Tools that might be noisy bash commands to hide
-    this.hiddenBashCommands = new Set([
-      'pwd', 'echo', 'ls', 'cd'
-    ]);
-  }
-
-  /**
-   * Update status for an agent based on its current turn data
-   */
-  updateAgentStatus(agentName, turnData) {
-    if (this.mode === 'single') {
-      this.handleSingleAgentOutput(agentName, turnData);
-    } else {
-      const status = this.extractMeaningfulStatus(turnData);
-      if (status) {
-        this.activeStatuses.set(agentName, status);
-        this.redrawStatusLine();
-      }
-    }
-  }
-
-  /**
-   * Handle output for single agent mode with clean formatting
-   */
-  handleSingleAgentOutput(agentName, turnData) {
-    const toolUse = turnData.tool_use;
-    const text = turnData.assistant_text;
-    const turnCount = turnData.turnCount;
-
-    // Check if this is a tool we should hide
-    if (toolUse && this.shouldHideTool(toolUse)) {
-      this.hiddenOperationCount++;
-
-      // Show summary every N hidden operations
-      if (this.hiddenOperationCount - this.lastSummaryCount >= this.summaryInterval) {
-        const operationCount = this.hiddenOperationCount - this.lastSummaryCount;
-        console.log(chalk.gray(`    [${operationCount} file operations...]`));
-        this.lastSummaryCount = this.hiddenOperationCount;
-      }
-      return;
-    }
-
-    // Format and show meaningful tools
-    if (toolUse) {
-      const formatted = this.formatMeaningfulTool(toolUse);
-      if (formatted) {
-        console.log(`🤖 ${formatted}`);
-        return;
-      }
-    }
-
-    // For turns without tool use, just ignore them silently
-    // These are planning/thinking turns that don't need any output
-  }
-
-  /**
-   * Check if a tool should be hidden from output
-   */
-  shouldHideTool(toolUse) {
-    const toolName = toolUse.name;
-
-    // Always hide these tools
-    if (this.suppressedTools.has(toolName)) {
-      return true;
-    }
-
-    // Hide TodoWrite unless we're configured to show todos
-    if (toolName === 'TodoWrite' && !this.showTodos) {
-      return true;
-    }
-
-    // Hide simple bash commands
-    if (toolName === 'Bash') {
-      const command = toolUse.input?.command || '';
-      const simpleCommand = command.split(' ')[0];
-      return this.hiddenBashCommands.has(simpleCommand);
-    }
-
-    return false;
-  }
-
-  /**
-   * Format meaningful tools for single agent display
-   */
-  formatMeaningfulTool(toolUse) {
-    const toolName = toolUse.name;
-    const input = toolUse.input || {};
-
-    switch (toolName) {
-      case 'Task':
-        const description = input.description || 'analysis agent';
-        return `🚀 Launching ${description}`;
-
-      case 'TodoWrite':
-        if (this.showTodos) {
-          return this.formatTodoUpdate(input);
-        }
-        return null;
-
-      case 'WebFetch':
-        const domain = this.extractDomain(input.url || '');
-        return `🌐 Fetching ${domain}`;
-
-      case 'Bash':
-        // Only show meaningful bash commands
-        const command = input.command || '';
-        if (command.includes('nmap') || command.includes('subfinder') || command.includes('whatweb')) {
-          const tool = command.split(' ')[0];
-          return `🔍 Running ${tool}`;
-        }
-        return null;
-
-      // Browser tools (keep existing formatting)
-      default:
-        if (toolName.startsWith('mcp__playwright__browser_')) {
-          return this.extractBrowserAction(toolUse);
-        }
-    }
-
-    return null;
-  }
-
-  /**
-   * Format TodoWrite updates for display
-   */
-  formatTodoUpdate(input) {
-    if (!input.todos || !Array.isArray(input.todos)) {
-      return null;
-    }
-
-    const todos = input.todos;
-    const inProgress = todos.filter(t => t.status === 'in_progress');
-    const completed = todos.filter(t => t.status === 'completed');
-
-    if (completed.length > 0) {
-      const recent = completed[completed.length - 1];
-      return `✅ ${recent.content.slice(0, 50)}${recent.content.length > 50 ? '...' : ''}`;
-    }
-
-    if (inProgress.length > 0) {
-      const current = inProgress[0];
-      return `🔄 ${current.content.slice(0, 50)}${current.content.length > 50 ? '...' : ''}`;
-    }
-
-    return null;
-  }
-
-  /**
-   * Extract meaningful status from turn data, suppressing internal operations
-   */
-  extractMeaningfulStatus(turnData) {
-    // Check for tool use first
-    if (turnData.tool_use?.name) {
-      // Suppress internal operations completely
-      if (this.suppressedTools.has(turnData.tool_use.name)) {
-        return null;
-      }
-
-      // Show browser testing actions
-      if (turnData.tool_use.name.startsWith('mcp__playwright__browser_')) {
-        return this.extractBrowserAction(turnData.tool_use);
-      }
-
-      // Show Task agent launches
-      if (turnData.tool_use.name === 'Task') {
-        const description = turnData.tool_use.input?.description || 'analysis';
-        return `🚀 ${description.slice(0, 40)}`;
-      }
-    }
-
-    // Parse assistant text for progress milestones
-    if (turnData.assistant_text) {
-      return this.extractProgressFromText(turnData.assistant_text);
-    }
-
-    return null; // Suppress everything else
-  }
-
-  /**
-   * Extract browser action details
-   */
-  extractBrowserAction(toolUse) {
-    const actionType = toolUse.name.split('_').pop();
-
-    switch (actionType) {
-      case 'navigate':
-        const url = toolUse.input?.url || '';
-        const domain = this.extractDomain(url);
-        return `🌐 Testing ${domain}`;
-
-      case 'click':
-        const element = toolUse.input?.element || 'element';
-        return `🖱️ Clicking ${element.slice(0, 20)}`;
-
-      case 'fill':
-      case 'form':
-        return `📝 Testing form inputs`;
-
-      case 'snapshot':
-        return `📸 Capturing page state`;
-
-      case 'type':
-        return `⌨️ Testing input fields`;
-
-      default:
-        return `🌐 Browser: ${actionType}`;
-    }
-  }
-
-  /**
-   * Extract meaningful progress from assistant text (single-agent mode only)
-   */
-  extractProgressFromText(text) {
-    // Only extract progress for single agents, not parallel ones
-    if (this.mode !== 'single') {
-      return null;
-    }
-
-    // For single agents, be very conservative about what we show
-    // Most progress should come from tool formatting, not text parsing
-    return null;
-  }
-
-  /**
-   * Extract domain from URL for display
-   */
-  extractDomain(url) {
-    try {
-      const urlObj = new URL(url);
-      return urlObj.hostname || url.slice(0, 30);
-    } catch {
-      return url.slice(0, 30);
-    }
-  }
-
-  /**
-   * Redraw the status line showing all active agents
-   */
-  redrawStatusLine() {
-    // Clear previous line
-    if (this.lastStatusLine) {
-      process.stdout.write('\r' + ' '.repeat(this.lastStatusLine.length) + '\r');
-    }
-
-    // Build new status line
-    const statusEntries = Array.from(this.activeStatuses.entries())
-      .map(([agent, status]) => `[${chalk.cyan(agent)}] ${status}`)
-      .join(' | ');
-
-    if (statusEntries) {
-      process.stdout.write(statusEntries);
-      this.lastStatusLine = statusEntries.replace(/\u001b\[[0-9;]*m/g, ''); // Remove ANSI codes for length calc
-    }
-  }
-
-  /**
-   * Clear status for a specific agent
-   */
-  clearAgentStatus(agentName) {
-    this.activeStatuses.delete(agentName);
-    this.redrawStatusLine();
-  }
-
-  /**
-   * Clear all statuses and finish the status line
-   */
-  finishStatusLine() {
-    if (this.lastStatusLine) {
-      process.stdout.write('\n'); // Move to next line
-      this.lastStatusLine = '';
-      this.activeStatuses.clear();
-    }
-  }
-
-  /**
-   * Parse JSON tool use from message content
-   */
-  parseToolUse(content) {
-    try {
-      // Look for JSON tool use patterns
-      const jsonMatch = content.match(/\{"type":"tool_use".*?\}/s);
-      if (jsonMatch) {
-        return JSON.parse(jsonMatch[0]);
-      }
-    } catch (error) {
-      // Ignore parsing errors
-    }
-    return null;
-  }
-}
@@ -1,15 +1,50 @@
 import { $, fs, path } from 'zx';
 import chalk from 'chalk';
-import { query } from '@anthropic-ai/claude-code';
+import { query } from '@anthropic-ai/claude-agent-sdk';
+import { fileURLToPath } from 'url';
+import { dirname } from 'path';

 import { isRetryableError, getRetryDelay, PentestError } from '../error-handling.js';
 import { ProgressIndicator } from '../progress-indicator.js';
-import { timingResults, costResults, Timer, formatDuration } from '../utils/metrics.js';
+import { timingResults, costResults, Timer } from '../utils/metrics.js';
+import { formatDuration } from '../audit/utils.js';
 import { createGitCheckpoint, commitGitSuccess, rollbackGitWorkspace } from '../utils/git-manager.js';
-import { savePromptSnapshot } from '../prompts/prompt-manager.js';
-import { AGENT_VALIDATORS } from '../constants.js';
+import { AGENT_VALIDATORS, MCP_AGENT_MAPPING } from '../constants.js';
 import { filterJsonToolCalls, getAgentPrefix } from '../utils/output-formatter.js';
 import { generateSessionLogPath } from '../session-manager.js';
+import { AuditSession } from '../audit/index.js';
+import { createShannonHelperServer } from '../../mcp-server/src/index.js';
+
+const __filename = fileURLToPath(import.meta.url);
+const __dirname = dirname(__filename);
+
+/**
+ * Convert agent name to prompt name for MCP_AGENT_MAPPING lookup
+ *
+ * @param {string} agentName - Agent name (e.g., 'xss-vuln', 'injection-exploit')
+ * @returns {string} Prompt name (e.g., 'vuln-xss', 'exploit-injection')
+ */
+function agentNameToPromptName(agentName) {
+  // Special cases
+  if (agentName === 'pre-recon') return 'pre-recon-code';
+  if (agentName === 'report') return 'report-executive';
+  if (agentName === 'recon') return 'recon';
+
+  // Pattern: {type}-vuln → vuln-{type}
+  const vulnMatch = agentName.match(/^(.+)-vuln$/);
+  if (vulnMatch) {
+    return `vuln-${vulnMatch[1]}`;
+  }
+
+  // Pattern: {type}-exploit → exploit-{type}
+  const exploitMatch = agentName.match(/^(.+)-exploit$/);
+  if (exploitMatch) {
+    return `exploit-${exploitMatch[1]}`;
+  }
+
+  // Default: return as-is
+  return agentName;
+}

 // Simplified validation using direct agent name mapping
 async function validateAgentOutput(result, agentName, sourceDir) {
@@ -57,10 +92,11 @@ async function validateAgentOutput(result, agentName, sourceDir) {
 // - Output validation
 // - Prompt snapshotting for debugging
 // - Git checkpoint/rollback safety
-async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context = '', description = 'Claude analysis', colorFn = chalk.cyan, sessionMetadata = null) {
+async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context = '', description = 'Claude analysis', agentName = null, colorFn = chalk.cyan, sessionMetadata = null, auditSession = null, attemptNumber = 1) {
  const timer = new Timer(`agent-${description.toLowerCase().replace(/\s+/g, '-')}`);
  const fullPrompt = context ? `${context}\n\n${prompt}` : prompt;
  let totalCost = 0;
+  let partialCost = 0; // Track partial cost for crash safety

  // Auto-detect execution mode to adjust logging behavior
  const isParallelExecution = description.includes('vuln agent') || description.includes('exploit agent');
@@ -82,39 +118,80 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
    progressIndicator = new ProgressIndicator(`Running ${agentType}...`);
  }

-  // Setup detailed logging for all agents (if session metadata is available)
+  // NOTE: Logging now handled by AuditSession (append-only, crash-safe)
+  // Legacy log path generation kept for compatibility
  let logFilePath = null;
-  let logBuffer = [];
-
-  if (sessionMetadata && sessionMetadata.webUrl && sessionMetadata.sessionId) {
+  if (sessionMetadata && sessionMetadata.webUrl && sessionMetadata.id) {
    const timestamp = new Date().toISOString().replace(/T/, '_').replace(/[:.]/g, '-').slice(0, 19);
    const agentName = description.toLowerCase().replace(/\s+/g, '-');
-
-    // Use session-based folder structure
-    const logDir = generateSessionLogPath(sessionMetadata.webUrl, sessionMetadata.sessionId);
-
-    await fs.ensureDir(logDir);
-    logFilePath = path.join(logDir, `${timestamp}_${agentName}_attempt-1.log`);
-
-    // Initialize log with agent startup info
-    const sessionId = sessionMetadata?.sessionId || path.basename(sourceDir).split('-').pop().substring(0, 8);
-    logBuffer.push(`=== ${description} - Detailed Execution Log ===`);
-    logBuffer.push(`Timestamp: ${new Date().toISOString()}`);
-    logBuffer.push(`Working Directory: ${sourceDir}`);
-    logBuffer.push(`Session ID: ${sessionId}`);
-    logBuffer.push(`Log File: ${logFilePath}`);
-    logBuffer.push(`\n=== Agent Execution Start ===\n`);
+    const logDir = generateSessionLogPath(sessionMetadata.webUrl, sessionMetadata.id);
+    logFilePath = path.join(logDir, `${timestamp}_${agentName}_attempt-${attemptNumber}.log`);
  } else {
    console.log(chalk.blue(`  🤖 Running Claude Code: ${description}...`));
  }

+    // Declared outside the try block so the catch handler can also report the turn count
+  let turnCount = 0;
+
  try {
+    // Create MCP server with target directory context
+    const shannonHelperServer = createShannonHelperServer(sourceDir);
+
+    // Look up agent's assigned Playwright MCP server
+    // Convert agent name (e.g., 'xss-vuln') to prompt name (e.g., 'vuln-xss')
+    let playwrightMcpName = null;
+    if (agentName) {
+      const promptName = agentNameToPromptName(agentName);
+      playwrightMcpName = MCP_AGENT_MAPPING[promptName];
+
+      if (playwrightMcpName) {
+        console.log(chalk.gray(`    🎭 Assigned ${agentName} → ${playwrightMcpName}`));
+      }
+    }
+
+    // Configure MCP servers: shannon-helper (SDK) + playwright-agentN (stdio)
+    const mcpServers = {
+      'shannon-helper': shannonHelperServer,
+    };
+
+    // Add Playwright MCP server if this agent needs browser automation
+    if (playwrightMcpName) {
+      const userDataDir = `/tmp/${playwrightMcpName}`;
+
+      // Detect if running in Docker via explicit environment variable
+      const isDocker = process.env.SHANNON_DOCKER === 'true';
+
+      // Build args array - conditionally add --executable-path for Docker
+      const mcpArgs = [
+        '@playwright/mcp@latest',
+        '--isolated',
+        '--user-data-dir', userDataDir,
+      ];
+
+      // Docker: Use system Chromium; Local: Use Playwright's bundled browsers
+      if (isDocker) {
+        mcpArgs.push('--executable-path', '/usr/bin/chromium-browser');
+        mcpArgs.push('--browser', 'chromium');
+      }
+
+      mcpServers[playwrightMcpName] = {
+        type: 'stdio',
+        command: 'npx',
+        args: mcpArgs,
+        env: {
+          ...process.env,
+          PLAYWRIGHT_HEADLESS: 'true', // Ensure headless mode for security and CI compatibility
+          ...(isDocker && { PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD: '1' }), // Only skip in Docker
+        },
+      };
+    }
+
    const options = {
-      model: 'claude-sonnet-4-20250514', // Use latest Claude 4 Sonnet
+      model: 'claude-sonnet-4-5-20250929', // Use latest Claude 4.5 Sonnet
      maxTurns: 10_000, // Maximum turns for autonomous work
      cwd: sourceDir, // Set working directory using SDK option
      permissionMode: 'bypassPermissions', // Bypass all permission checks for pentesting
-      customSystemPrompt: fullPrompt, // Use system prompt for better security and consistency
+      mcpServers,
    };

    // SDK Options only shown for verbose agents (not clean output)
@@ -124,7 +201,6 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context

    let result = null;
    let messages = [];
-    let turnCount = 0;
    let apiErrorDetected = false;

    // Start progress indicator for clean output agents
@@ -132,9 +208,15 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
      progressIndicator.start();
    }

-    for await (const message of query({ prompt: 'Begin.', options })) {
+
+    let messageCount = 0;
+    try {
+      for await (const message of query({ prompt: fullPrompt, options })) {
+        messageCount++;
+
      if (message.type === "assistant") {
        turnCount++;
+
        const content = Array.isArray(message.message.content)
          ? message.message.content.map(c => c.text || JSON.stringify(c)).join('\n')
          : message.message.content;
@@ -177,9 +259,15 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
          console.log(colorFn(`    ${content}`));
        }

-        // Log full details to file for later review
-        logBuffer.push(`\n🤖 Turn ${turnCount} (${description}):`);
-        logBuffer.push(content);
+        // Log to audit system (crash-safe, append-only)
+        if (auditSession) {
+          await auditSession.logEvent('llm_response', {
+            turn: turnCount,
+            content,
+            timestamp: new Date().toISOString()
+          });
+        }
+
        messages.push(content);

        // Check for API error patterns in assistant message content
@@ -210,6 +298,15 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
        if (message.input && Object.keys(message.input).length > 0) {
          console.log(chalk.gray(`    Input: ${JSON.stringify(message.input, null, 2)}`));
        }
+
+        // Log tool start event
+        if (auditSession) {
+          await auditSession.logEvent('tool_start', {
+            toolName: message.name,
+            parameters: message.input,
+            timestamp: new Date().toISOString()
+          });
+        }
      } else if (message.type === "tool_result") {
        console.log(chalk.green(`    ✅ Tool Result:`));
        if (message.content) {
@@ -221,6 +318,14 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
            console.log(chalk.gray(`    ${resultStr}`));
          }
        }
+
+        // Log tool end event
+        if (auditSession) {
+          await auditSession.logEvent('tool_end', {
+            result: message.content,
+            timestamp: new Date().toISOString()
+          });
+        }
      } else if (message.type === "result") {
        result = message.result;

@@ -273,13 +378,17 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
        costResults.agents[agentKey] = cost;
        costResults.total += cost;

-        // Store cost for return value
+        // Store cost for return value and partial tracking
        totalCost = cost;
+        partialCost = cost;
        break;
      } else {
        // Log any other message types we might not be handling
        console.log(chalk.gray(`    💬 ${message.type}: ${JSON.stringify(message, null, 2)}`));
      }
+      }
+    } catch (queryError) {
+      throw queryError; // Re-throw to outer catch
    }

    const duration = timer.stop();
@@ -292,23 +401,14 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
      console.log(chalk.yellow(`  ⚠️ API Error detected in ${description} - will validate deliverables before failing`));
    }

-    // Finish status line for parallel execution and save detailed log
+    // Finish status line for parallel execution
    if (statusManager) {
      statusManager.clearAgentStatus(description);
      statusManager.finishStatusLine();
    }

-    // Write detailed log to file
-    if (logFilePath && logBuffer.length > 0) {
-        logBuffer.push(`\n=== Agent Execution Complete ===`);
-        logBuffer.push(`Duration: ${formatDuration(duration)}`);
-        logBuffer.push(`Turns: ${turnCount}`);
-        logBuffer.push(`Cost: $${totalCost.toFixed(4)}`);
-        logBuffer.push(`Status: Success`);
-        logBuffer.push(`Completed: ${new Date().toISOString()}`);
-
-        await fs.writeFile(logFilePath, logBuffer.join('\n'));
-    }
+    // NOTE: Log writing now handled by AuditSession (crash-safe, append-only)
+    // Legacy log writing removed - audit system handles this automatically

    // Show completion messages based on agent type
    if (progressIndicator) {
@@ -327,7 +427,15 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
    }

    // Return result with log file path for all agents
-    const returnData = { result, success: true, duration, turns: turnCount, cost: totalCost, apiErrorDetected };
+    const returnData = {
+      result,
+      success: true,
+      duration,
+      turns: turnCount,
+      cost: totalCost,
+      partialCost, // Include partial cost for crash recovery
+      apiErrorDetected
+    };
    if (logFilePath) {
      returnData.logFile = logFilePath;
    }
@@ -344,17 +452,16 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
      statusManager.finishStatusLine();
    }

-    // Write error log to file
-    if (logFilePath && logBuffer.length > 0) {
-        logBuffer.push(`\n=== Agent Execution Failed ===`);
-        logBuffer.push(`Duration: ${formatDuration(duration)}`);
-        logBuffer.push(`Turns: ${turnCount}`);
-        logBuffer.push(`Error: ${error.message}`);
-        logBuffer.push(`Error Type: ${error.constructor.name}`);
-        logBuffer.push(`Status: Failed`);
-        logBuffer.push(`Failed: ${new Date().toISOString()}`);
-
-        await fs.writeFile(logFilePath, logBuffer.join('\n'));
+    // Log error to audit system
+    if (auditSession) {
+      await auditSession.logEvent('error', {
+        message: error.message,
+        errorType: error.constructor.name,
+        stack: error.stack,
+        duration,
+        turns: turnCount,
+        timestamp: new Date().toISOString()
+      });
    }

    // Show error messages based on agent type
@@ -420,6 +527,7 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
      prompt: fullPrompt.slice(0, 100) + '...',
      success: false,
      duration,
+      cost: partialCost, // Include partial cost on error
      retryable: isRetryableError(error)
    };
  }
@@ -432,6 +540,7 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
 // - Prompt snapshotting for debugging and reproducibility
 // - Git checkpoint/rollback safety for workspace protection
 // - Comprehensive error handling and logging
+// - Crash-safe audit logging via AuditSession
 export async function runClaudePromptWithRetry(prompt, sourceDir, allowedTools = 'Read', context = '', description = 'Claude analysis', agentName = null, colorFn = chalk.cyan, sessionMetadata = null) {
  const maxRetries = 3;
  let lastError;
@@ -439,22 +548,25 @@ export async function runClaudePromptWithRetry(prompt, sourceDir, allowedTools =

  console.log(chalk.cyan(`🚀 Starting ${description} with ${maxRetries} max attempts`));

-  // Save prompt snapshot before execution starts (for debugging failed runs)
-  let snapshotSaved = false;
+  // Initialize audit session (crash-safe logging)
+  let auditSession = null;
+  if (sessionMetadata && agentName) {
+    auditSession = new AuditSession(sessionMetadata);
+    await auditSession.initialize();
+  }

  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    // Create checkpoint before each attempt
    await createGitCheckpoint(sourceDir, description, attempt);

-    // Save snapshot on first attempt only (before any execution)
-    if (!snapshotSaved && agentName) {
+    // Start agent tracking in audit system (saves prompt snapshot automatically)
+    if (auditSession) {
      const fullPrompt = retryContext ? `${retryContext}\n\n${prompt}` : prompt;
-      await savePromptSnapshot(sourceDir, agentName, fullPrompt);
-      snapshotSaved = true;
+      await auditSession.startAgent(agentName, fullPrompt, attempt);
    }

    try {
-      const result = await runClaudePrompt(prompt, sourceDir, allowedTools, retryContext, description, colorFn, sessionMetadata);
+      const result = await runClaudePrompt(prompt, sourceDir, allowedTools, retryContext, description, agentName, colorFn, sessionMetadata, auditSession, attempt);

      // Validate output after successful run
      if (result.success) {
@@ -466,6 +578,17 @@ export async function runClaudePromptWithRetry(prompt, sourceDir, allowedTools =
            console.log(chalk.yellow(`📋 Validation: Ready for exploitation despite API error warnings`));
          }

+          // Record successful attempt in audit system (checkpoint is HEAD before the success commit below)
+          if (auditSession) {
+            await auditSession.endAgent(agentName, {
+              attemptNumber: attempt,
+              duration_ms: result.duration,
+              cost_usd: result.cost || 0,
+              success: true,
+              checkpoint: await getGitCommitHash(sourceDir)
+            });
+          }
+
          // Commit successful changes (will include the snapshot)
          await commitGitSuccess(sourceDir, description);
          console.log(chalk.green.bold(`🎉 ${description} completed successfully on attempt ${attempt}/${maxRetries}`));
@@ -474,6 +597,18 @@ export async function runClaudePromptWithRetry(prompt, sourceDir, allowedTools =
          // Agent completed but output validation failed
          console.log(chalk.yellow(`⚠️ ${description} completed but output validation failed`));

+          // Record failed validation attempt in audit system
+          if (auditSession) {
+            await auditSession.endAgent(agentName, {
+              attemptNumber: attempt,
+              duration_ms: result.duration,
+              cost_usd: result.partialCost || result.cost || 0,
+              success: false,
+              error: 'Output validation failed',
+              isFinalAttempt: attempt === maxRetries
+            });
+          }
+
          // If API error detected AND validation failed, this is a retryable error
          if (result.apiErrorDetected) {
            console.log(chalk.yellow(`⚠️ API Error detected with validation failure - treating as retryable`));
@@ -501,6 +636,18 @@ export async function runClaudePromptWithRetry(prompt, sourceDir, allowedTools =
    } catch (error) {
      lastError = error;

+      // Record failed attempt in audit system
+      if (auditSession) {
+        await auditSession.endAgent(agentName, {
+          attemptNumber: attempt,
+          duration_ms: error.duration || 0,
+          cost_usd: error.cost || 0,
+          success: false,
+          error: error.message,
+          isFinalAttempt: attempt === maxRetries
+        });
+      }
+
      // Check if error is retryable
      if (!isRetryableError(error)) {
        console.log(chalk.red(`❌ ${description} failed with non-retryable error: ${error.message}`));
@@ -533,4 +680,14 @@ export async function runClaudePromptWithRetry(prompt, sourceDir, allowedTools =
  }

  throw lastError;
+}
+
+// Helper function to get git commit hash
+async function getGitCommitHash(sourceDir) {
+  try {
+    const result = await $`cd ${sourceDir} && git rev-parse HEAD`;
+    return result.stdout.trim();
+  } catch (error) {
+    return null;
+  }
 }
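
Reviewer sketch (not part of the commit): the `agentNameToPromptName` helper added above can be checked in isolation. The version below is a condensed, standalone rewrite of the same mapping rules, assuming the two suffix patterns can be folded into one regular expression. The sample agent names come from the JSDoc and the special cases in the diff. The real `MCP_AGENT_MAPPING` keys live in `constants.js` and are not shown here.

```javascript
// Standalone sketch of the agent-name -> prompt-name mapping from this diff.
// Assumption: merging the "-vuln" and "-exploit" patterns into a single regex
// gives the same results as the two separate matches in the original helper.
function agentNameToPromptName(agentName) {
  // Special cases that do not follow the suffix pattern
  if (agentName === 'pre-recon') return 'pre-recon-code';
  if (agentName === 'report') return 'report-executive';

  // Pattern: {type}-vuln -> vuln-{type}, {type}-exploit -> exploit-{type}
  const match = agentName.match(/^(.+)-(vuln|exploit)$/);

  // Anything else (for example 'recon') is returned unchanged
  return match ? `${match[2]}-${match[1]}` : agentName;
}

const samples = ['pre-recon', 'recon', 'xss-vuln', 'injection-exploit', 'report'];
const mapped = samples.map(agentNameToPromptName);
console.log(mapped);
```

Keeping the mapping as a pure function like this makes it straightforward to unit test whenever new agent types are added to the mapping table.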
@@ -0,0 +1,206 @@
+/**
+ * Audit Session - Main Facade
+ *
+ * Coordinates logger, metrics tracker, and concurrency control for comprehensive
+ * crash-safe audit logging.
+ */
+
+import { AgentLogger } from './logger.js';
+import { MetricsTracker } from './metrics-tracker.js';
+import { initializeAuditStructure, formatTimestamp } from './utils.js';
+import { SessionMutex } from '../utils/concurrency.js';
+
+// Global mutex instance
+const sessionMutex = new SessionMutex();
+
+/**
+ * AuditSession - Main audit system facade
+ */
+export class AuditSession {
+  /**
+   * @param {Object} sessionMetadata - Session metadata from Shannon store
+   * @param {string} sessionMetadata.id - Session UUID
+   * @param {string} sessionMetadata.webUrl - Target web URL
+   * @param {string} [sessionMetadata.repoPath] - Target repository path
+   */
+  constructor(sessionMetadata) {
+    this.sessionMetadata = sessionMetadata;
+    this.sessionId = sessionMetadata.id;
+
+    // Validate required fields
+    if (!this.sessionId) {
+      throw new Error('sessionMetadata.id is required');
+    }
+    if (!this.sessionMetadata.webUrl) {
+      throw new Error('sessionMetadata.webUrl is required');
+    }
+
+    // Components
+    this.metricsTracker = new MetricsTracker(sessionMetadata);
+
+    // Active logger (one at a time per agent attempt)
+    this.currentLogger = null;
+
+    // Initialization flag
+    this.initialized = false;
+  }
+
+  /**
+   * Initialize audit session (creates directories, session.json)
+   * Idempotent: later calls return early once the first call has completed
+   * @returns {Promise<void>}
+   */
+  async initialize() {
+    if (this.initialized) {
+      return; // Already initialized
+    }
+
+    // Create directory structure
+    await initializeAuditStructure(this.sessionMetadata);
+
+    // Initialize metrics tracker (loads or creates session.json)
+    await this.metricsTracker.initialize();
+
+    this.initialized = true;
+  }
+
+  /**
+   * Ensure initialized (helper for lazy initialization)
+   * @private
+   * @returns {Promise<void>}
+   */
+  async ensureInitialized() {
+    if (!this.initialized) {
+      await this.initialize();
+    }
+  }
+
+  /**
+   * Start agent execution
+   * @param {string} agentName - Agent name
+   * @param {string} promptContent - Full prompt content
+   * @param {number} [attemptNumber=1] - Attempt number
+   * @returns {Promise<void>}
+   */
+  async startAgent(agentName, promptContent, attemptNumber = 1) {
+    await this.ensureInitialized();
+
+    // Save prompt snapshot (only on first attempt)
+    if (attemptNumber === 1) {
+      await AgentLogger.savePrompt(this.sessionMetadata, agentName, promptContent);
+    }
+
+    // Create and initialize logger for this attempt
+    this.currentLogger = new AgentLogger(this.sessionMetadata, agentName, attemptNumber);
+    await this.currentLogger.initialize();
+
+    // Start metrics tracking
+    this.metricsTracker.startAgent(agentName, attemptNumber);
+
+    // Log start event
+    await this.currentLogger.logEvent('agent_start', {
+      agentName,
+      attemptNumber,
+      timestamp: formatTimestamp()
+    });
+  }
+
+  /**
+   * Log event during agent execution
+   * @param {string} eventType - Event type (tool_start, tool_end, llm_response, etc.)
+   * @param {Object} eventData - Event data
+   * @returns {Promise<void>}
+   */
+  async logEvent(eventType, eventData) {
+    if (!this.currentLogger) {
+      throw new Error('No active logger. Call startAgent() first.');
+    }
+
+    await this.currentLogger.logEvent(eventType, eventData);
+  }
+
+  /**
+   * End agent execution (mutex-protected)
+   * @param {string} agentName - Agent name
+   * @param {Object} result - Execution result
+   * @param {number} result.attemptNumber - Attempt number
+   * @param {number} result.duration_ms - Duration in milliseconds
+   * @param {number} result.cost_usd - Cost in USD
+   * @param {boolean} result.success - Whether attempt succeeded
+   * @param {string} [result.error] - Error message (if failed)
+   * @param {string} [result.checkpoint] - Git checkpoint hash (if succeeded)
+   * @param {boolean} [result.isFinalAttempt=false] - Whether this is the final attempt
+   * @returns {Promise<void>}
+   */
+  async endAgent(agentName, result) {
+    // Log end event
+    if (this.currentLogger) {
+      await this.currentLogger.logEvent('agent_end', {
+        agentName,
+        success: result.success,
+        duration_ms: result.duration_ms,
+        cost_usd: result.cost_usd,
+        timestamp: formatTimestamp()
+      });
+
+      // Close logger
+      await this.currentLogger.close();
+      this.currentLogger = null;
+    }
+
+    // Mutex-protected update to session.json
+    const unlock = await sessionMutex.lock(this.sessionId);
+    try {
+      // Reload metrics (in case of parallel updates)
+      await this.metricsTracker.reload();
+
+      // Update metrics
+      await this.metricsTracker.endAgent(agentName, result);
+    } finally {
+      unlock();
+    }
+  }
+
+  /**
+   * Mark multiple agents as rolled back
+   * @param {string[]} agentNames - Array of agent names
+   * @returns {Promise<void>}
+   */
+  async markMultipleRolledBack(agentNames) {
+    await this.ensureInitialized();
+
+    const unlock = await sessionMutex.lock(this.sessionId);
+    try {
+      await this.metricsTracker.reload();
+      await this.metricsTracker.markMultipleRolledBack(agentNames);
+    } finally {
+      unlock();
+    }
+  }
+
+  /**
+   * Update session status
+   * @param {string} status - New status (in-progress, completed, failed)
+   * @returns {Promise<void>}
+   */
+  async updateSessionStatus(status) {
+    await this.ensureInitialized();
+
+    const unlock = await sessionMutex.lock(this.sessionId);
+    try {
+      await this.metricsTracker.reload();
+      await this.metricsTracker.updateSessionStatus(status);
+    } finally {
+      unlock();
+    }
+  }
+
+  /**
+   * Get current metrics (read-only)
+   * @returns {Promise<Object>} Current metrics
+   */
+  async getMetrics() {
+    await this.ensureInitialized();
+    return this.metricsTracker.getMetrics();
+  }
+}
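
Reviewer sketch (not part of the commit): `endAgent`, `markMultipleRolledBack`, and `updateSessionStatus` all follow the same lock, reload, update, unlock sequence. The `SessionMutex` implementation in `../utils/concurrency.js` is not shown in this diff, so the `KeyedMutex` class below is a hypothetical stand-in that only mirrors the `lock(key)` call returning an unlock function. The in-memory `store` object stands in for `session.json`.

```javascript
// Hypothetical per-key async mutex; the real SessionMutex may differ.
class KeyedMutex {
  constructor() {
    // key -> promise that settles once the most recent holder has unlocked
    this.tails = new Map();
  }

  async lock(key) {
    const previous = this.tails.get(key) ?? Promise.resolve();
    let release;
    const current = new Promise((resolve) => { release = resolve; });
    const tail = previous.then(() => current);
    this.tails.set(key, tail);

    await previous; // wait for every earlier holder of this key

    return () => {
      // Drop the map entry if nobody queued up behind this holder
      if (this.tails.get(key) === tail) this.tails.delete(key);
      release();
    };
  }
}

const mutex = new KeyedMutex();
const store = { total: 0 }; // stands in for session.json

async function recordCost(sessionId, cost) {
  const unlock = await mutex.lock(sessionId);
  try {
    const snapshot = store.total;                // "reload" the latest state
    await new Promise((r) => setTimeout(r, 5));  // simulated file I/O
    store.total = snapshot + cost;               // "update" based on that state
  } finally {
    unlock();                                    // always release, even on error
  }
}

// Three parallel agents finishing at about the same time
const done = Promise.all([1, 2, 3].map((c) => recordCost('session-a', c)))
  .then(() => store.total);

done.then((total) => console.log(`total cost recorded: ${total}`));
```

Reloading inside the lock is what prevents lost updates: each writer reads the state left by the previous holder rather than a stale copy taken before it started waiting.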
@@ -0,0 +1,16 @@
+/**
+ * Unified Audit & Metrics System
+ *
+ * Public API for the audit system. Provides crash-safe, append-only logging
+ * and comprehensive metrics tracking for Shannon penetration testing sessions.
+ *
+ * IMPORTANT: Session objects must have an 'id' field (NOT 'sessionId')
+ * Example: { id: "uuid", webUrl: "...", repoPath: "..." }
+ *
+ * @module audit
+ */
+
+export { AuditSession } from './audit-session.js';
+export { AgentLogger } from './logger.js';
+export { MetricsTracker } from './metrics-tracker.js';
+export * as AuditUtils from './utils.js';
@@ -0,0 +1,172 @@
+/**
+ * Append-Only Agent Logger
+ *
+ * Provides crash-safe, append-only logging for agent execution.
+ * Uses file streams with immediate flush to prevent data loss.
+ */
+
+import fs from 'fs';
+import { generateLogPath, generatePromptPath, atomicWrite, formatTimestamp } from './utils.js';
+
+/**
+ * AgentLogger - Manages append-only logging for a single agent execution
+ */
+export class AgentLogger {
+  /**
+   * @param {Object} sessionMetadata - Session metadata
+   * @param {string} agentName - Name of the agent
+   * @param {number} attemptNumber - Attempt number (1, 2, 3, ...)
+   */
+  constructor(sessionMetadata, agentName, attemptNumber) {
+    this.sessionMetadata = sessionMetadata;
+    this.agentName = agentName;
+    this.attemptNumber = attemptNumber;
+    this.timestamp = Date.now();
+
+    // Generate log file path
+    this.logPath = generateLogPath(sessionMetadata, agentName, this.timestamp, attemptNumber);
+
+    // Write stream is opened lazily in initialize()
+    this.stream = null;
+    this.isOpen = false;
+  }
+
+  /**
+   * Initialize the log stream (creates file and opens stream)
+   * @returns {Promise<void>}
+   */
+  async initialize() {
+    if (this.isOpen) {
+      return; // Already initialized
+    }
+
+    // Create write stream in append mode (autoClose releases the file descriptor on finish or error)
+    this.stream = fs.createWriteStream(this.logPath, {
+      flags: 'a', // Append mode
+      encoding: 'utf8',
+      autoClose: true
+    });
+
+    this.isOpen = true;
+
+    // Write header
+    await this.writeHeader();
+  }
+
+  /**
+   * Write header to log file
+   * @private
+   * @returns {Promise<void>}
+   */
+  async writeHeader() {
+    const header = [
+      `========================================`,
+      `Agent: ${this.agentName}`,
+      `Attempt: ${this.attemptNumber}`,
+      `Started: ${formatTimestamp(this.timestamp)}`,
+      `Session: ${this.sessionMetadata.id}`,
+      `Web URL: ${this.sessionMetadata.webUrl}`,
+      `========================================\n`
+    ].join('\n');
+
+    return this.writeRaw(header);
+  }
+
+  /**
+   * Write raw text to the log file, resolving once the chunk has been written
+   * @private
+   * @param {string} text - Text to write
+   * @returns {Promise<void>}
+   */
+  writeRaw(text) {
+    return new Promise((resolve, reject) => {
+      if (!this.isOpen || !this.stream) {
+        reject(new Error('Logger not initialized'));
+        return;
+      }
+
+      // Settle from the write callback rather than from write()'s return value.
+      // The callback fires after the chunk has been flushed from the stream buffer,
+      // so awaiting it applies backpressure without a separate 'drain' listener,
+      // and a write error rejects the promise instead of being lost after an
+      // early resolve. Note: this is not an fsync; the OS may still buffer the data.
+      this.stream.write(text, 'utf8', (error) => {
+        if (error) {
+          reject(error);
+          return;
+        }
+        resolve();
+      });
+    });
+  }
+
+  /**
+   * Log an event (tool_start, tool_end, llm_response, etc.)
+   * Events are logged as JSON for parseability
+   * @param {string} eventType - Type of event
+   * @param {Object} eventData - Event data
+   * @returns {Promise<void>}
+   */
+  async logEvent(eventType, eventData) {
+    const event = {
+      type: eventType,
+      timestamp: formatTimestamp(),
+      data: eventData
+    };
+
+    const eventLine = `${JSON.stringify(event)}\n`;
+    return this.writeRaw(eventLine);
+  }
+
+  /**
+   * Close the log stream
+   * @returns {Promise<void>}
+   */
+  async close() {
+    if (!this.isOpen || !this.stream) {
+      return;
+    }
+
+    return new Promise((resolve) => {
+      this.stream.end(() => {
+        this.isOpen = false;
+        resolve();
+      });
+    });
+  }
+
+  /**
+   * Save prompt snapshot to prompts directory
+   * Static method - doesn't require logger instance
+   * @param {Object} sessionMetadata - Session metadata
+   * @param {string} agentName - Agent name
+   * @param {string} promptContent - Full prompt content
+   * @returns {Promise<void>}
+   */
+  static async savePrompt(sessionMetadata, agentName, promptContent) {
+    const promptPath = generatePromptPath(sessionMetadata, agentName);
+
+    // Create header with metadata
+    const header = [
+      `# Prompt Snapshot: ${agentName}`,
+      ``,
+      `**Session:** ${sessionMetadata.id}`,
+      `**Web URL:** ${sessionMetadata.webUrl}`,
+      `**Saved:** ${formatTimestamp()}`,
+      ``,
+      `---`,
+      ``
+    ].join('\n');
+
+    const fullContent = header + promptContent;
+
+    // Use atomic write for safety
+    await atomicWrite(promptPath, fullContent);
+  }
+}
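Note on the log layout above: `writeHeader` emits plain-text banner lines, while `logEvent` emits one JSON object per line, so anything that reads these `.log` files has to separate the two. A minimal sketch of such a reader, assuming only the layout shown in this hunk. `parseAgentLog` is a hypothetical helper that is not part of this PR, and the sample agent name, session id, and timestamps are made up.

```javascript
// Hypothetical helper (not part of this PR): split an agent log into
// parsed JSON events and the remaining raw lines (header, truncated writes).
function parseAgentLog(text) {
  const events = [];
  const rawLines = [];
  for (const line of text.split('\n')) {
    if (line.trim() === '') continue;
    if (line.startsWith('{')) {
      try {
        events.push(JSON.parse(line));
        continue;
      } catch {
        // A truncated final line (for example after a crash) falls through
        // and is kept as raw text instead of aborting the parse.
      }
    }
    rawLines.push(line);
  }
  return { events, rawLines };
}

// Sample content; all values are illustrative.
const sample = [
  '========================================',
  'Agent: recon',
  'Attempt: 1',
  'Started: 2025-01-01T00:00:00.000Z',
  'Session: example-session-id',
  'Web URL: https://example.com',
  '========================================',
  '{"type":"tool_start","timestamp":"2025-01-01T00:00:01.000Z","data":{"tool":"bash"}}',
  '{"type":"tool_end","timestamp":"2025-01-01T00:00:02.000Z","data":{"tool":"bash"}}',
  '{"type":"llm_resp' // simulated partial line from an interrupted write
].join('\n');

const parsed = parseAgentLog(sample);
console.log(parsed.events.map((e) => e.type));
console.log(parsed.rawLines.length);
```

Keeping unparseable lines rather than throwing matches the append-only intent: every complete event before an interruption stays readable.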
@@ -0,0 +1,331 @@
+/**
+ * Metrics Tracker
+ *
+ * Manages session.json with comprehensive timing, cost, and validation metrics.
+ * Tracks attempt-level data for complete forensic trail.
+ */
+
+import {
+  generateSessionJsonPath,
+  atomicWrite,
+  readJson,
+  fileExists,
+  formatTimestamp,
+  calculatePercentage
+} from './utils.js';
+
+/**
+ * MetricsTracker - Manages metrics for a session
+ */
+export class MetricsTracker {
+  /**
+   * @param {Object} sessionMetadata - Session metadata from Shannon store
+   */
+  constructor(sessionMetadata) {
+    this.sessionMetadata = sessionMetadata;
+    this.sessionJsonPath = generateSessionJsonPath(sessionMetadata);
+
+    // In-memory state (loaded from/synced to session.json)
+    this.data = null;
+
+    // Active timers (agent name -> start time)
+    this.activeTimers = new Map();
+  }
+
+  /**
+   * Initialize session.json (idempotent)
+   * @returns {Promise<void>}
+   */
+  async initialize() {
+    // Check if session.json already exists
+    const exists = await fileExists(this.sessionJsonPath);
+
+    if (exists) {
+      // Load existing data
+      this.data = await readJson(this.sessionJsonPath);
+    } else {
+      // Create new session.json
+      this.data = this.createInitialData();
+      await this.save();
+    }
+  }
+
+  /**
+   * Create initial session.json structure
+   * @private
+   * @returns {Object} Initial session data
+   */
+  createInitialData() {
+    return {
+      session: {
+        id: this.sessionMetadata.id,
+        webUrl: this.sessionMetadata.webUrl,
+        repoPath: this.sessionMetadata.repoPath,
+        status: 'in-progress',
+        createdAt: formatTimestamp()
+      },
+      metrics: {
+        total_duration_ms: 0,
+        total_cost_usd: 0,
+        phases: {},  // Phase-level aggregations: { duration_ms, duration_percentage, cost_usd, agent_count }
+        agents: {}   // Agent-level metrics: { status, attempts[], final_duration_ms, total_cost_usd, checkpoint }
+      }
+    };
+  }
+
+  /**
+   * Start tracking an agent execution
+   * @param {string} agentName - Agent name
+   * @param {number} attemptNumber - Attempt number
+   * @returns {void}
+   */
+  startAgent(agentName, attemptNumber) {
+    this.activeTimers.set(agentName, {
+      startTime: Date.now(),
+      attemptNumber
+    });
+  }
+
+  /**
+   * End agent execution and update metrics
+   * @param {string} agentName - Agent name
+   * @param {Object} result - Agent execution result
+   * @param {number} result.attemptNumber - Attempt number
+   * @param {number} result.duration_ms - Duration in milliseconds
+   * @param {number} result.cost_usd - Cost in USD
+   * @param {boolean} result.success - Whether attempt succeeded
+   * @param {string} [result.error] - Error message (if failed)
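+   * @param {string} [result.checkpoint] - Git checkpoint hash (if succeeded)
+   * @param {boolean} [result.isFinalAttempt] - True when no retries remain; a failed final attempt sets status to 'failed'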
+   * @returns {Promise<void>}
+   */
+  async endAgent(agentName, result) {
+    // Initialize agent metrics if not exists
+    if (!this.data.metrics.agents[agentName]) {
+      this.data.metrics.agents[agentName] = {
+        status: 'in-progress',
+        attempts: [],
+        final_duration_ms: 0,
+        total_cost_usd: 0  // Total cost across all attempts (including retries)
+      };
+    }
+
+    const agent = this.data.metrics.agents[agentName];
+
+    // Add attempt to array
+    const attempt = {
+      attempt_number: result.attemptNumber,
+      duration_ms: result.duration_ms,
+      cost_usd: result.cost_usd,
+      success: result.success,
+      timestamp: formatTimestamp()
+    };
+
+    if (result.error) {
+      attempt.error = result.error;
+    }
+
+    agent.attempts.push(attempt);
+
+    // Update total cost (includes failed attempts)
+    agent.total_cost_usd = agent.attempts.reduce((sum, a) => sum + a.cost_usd, 0);
+
+    // If successful, update final metrics and status
+    if (result.success) {
+      agent.status = 'success';
+      agent.final_duration_ms = result.duration_ms;
+
+      if (result.checkpoint) {
+        agent.checkpoint = result.checkpoint;
+      }
+    } else {
+      // If this was the last attempt, mark as failed
+      if (result.isFinalAttempt) {
+        agent.status = 'failed';
+      }
+    }
+
+    // Clear active timer
+    this.activeTimers.delete(agentName);
+
+    // Recalculate aggregations
+    this.recalculateAggregations();
+
+    // Save to disk
+    await this.save();
+  }
+
+  /**
+   * Mark agent as rolled back
+   * @param {string} agentName - Agent name
+   * @returns {Promise<void>}
+   */
+  async markRolledBack(agentName) {
+    if (!this.data.metrics.agents[agentName]) {
+      return; // Agent not tracked
+    }
+
+    const agent = this.data.metrics.agents[agentName];
+    agent.status = 'rolled-back';
+    agent.rolled_back_at = formatTimestamp();
+
+    // Recalculate aggregations (exclude rolled-back agents)
+    this.recalculateAggregations();
+
+    await this.save();
+  }
+
+  /**
+   * Mark multiple agents as rolled back
+   * @param {string[]} agentNames - Array of agent names
+   * @returns {Promise<void>}
+   */
+  async markMultipleRolledBack(agentNames) {
+    for (const agentName of agentNames) {
+      if (this.data.metrics.agents[agentName]) {
+        const agent = this.data.metrics.agents[agentName];
+        agent.status = 'rolled-back';
+        agent.rolled_back_at = formatTimestamp();
+      }
+    }
+
+    this.recalculateAggregations();
+    await this.save();
+  }
+
+  /**
+   * Update session status
+   * @param {string} status - New status (in-progress, completed, failed)
+   * @returns {Promise<void>}
+   */
+  async updateSessionStatus(status) {
+    this.data.session.status = status;
+
+    if (status === 'completed' || status === 'failed') {
+      this.data.session.completedAt = formatTimestamp();
+    }
+
+    await this.save();
+  }
+
+  /**
+   * Recalculate aggregations (total duration, total cost, phases)
+   * @private
+   */
+  recalculateAggregations() {
+    const agents = this.data.metrics.agents;
+
+    // Only count successful agents (not rolled-back or failed)
+    const successfulAgents = Object.entries(agents)
+      .filter(([_, data]) => data.status === 'success');
+
+    // Calculate total duration and cost
+    const totalDuration = successfulAgents.reduce(
+      (sum, [_, data]) => sum + data.final_duration_ms,
+      0
+    );
+
+    const totalCost = successfulAgents.reduce(
+      (sum, [_, data]) => sum + data.total_cost_usd,
+      0
+    );
+
+    this.data.metrics.total_duration_ms = totalDuration;
+    this.data.metrics.total_cost_usd = totalCost;
+
+    // Calculate phase-level metrics
+    this.data.metrics.phases = this.calculatePhaseMetrics(successfulAgents);
+  }
+
+  /**
+   * Calculate phase-level metrics
+   * @private
+   * @param {Array} successfulAgents - Array of [agentName, agentData] tuples
+   * @returns {Object} Phase metrics
+   */
+  calculatePhaseMetrics(successfulAgents) {
+    const phases = {
+      'pre-recon': [],
+      'recon': [],
+      'vulnerability-analysis': [],
+      'exploitation': [],
+      'reporting': []
+    };
+
+    // Map agents to phases
+    const agentPhaseMap = {
+      'pre-recon': 'pre-recon',
+      'recon': 'recon',
+      'injection-vuln': 'vulnerability-analysis',
+      'xss-vuln': 'vulnerability-analysis',
+      'auth-vuln': 'vulnerability-analysis',
+      'authz-vuln': 'vulnerability-analysis',
+      'ssrf-vuln': 'vulnerability-analysis',
+      'injection-exploit': 'exploitation',
+      'xss-exploit': 'exploitation',
+      'auth-exploit': 'exploitation',
+      'authz-exploit': 'exploitation',
+      'ssrf-exploit': 'exploitation',
+      'report': 'reporting'
+    };
+
+    // Group agents by phase
+    for (const [agentName, agentData] of successfulAgents) {
+      const phase = agentPhaseMap[agentName];
+      if (phase) {
+        phases[phase].push(agentData);
+      }
+    }
+
+    // Calculate metrics per phase
+    const phaseMetrics = {};
+    const totalDuration = this.data.metrics.total_duration_ms;
+
+    for (const [phaseName, agentList] of Object.entries(phases)) {
+      if (agentList.length === 0) continue;
+
+      const phaseDuration = agentList.reduce(
+        (sum, agent) => sum + agent.final_duration_ms,
+        0
+      );
+
+      const phaseCost = agentList.reduce(
+        (sum, agent) => sum + agent.total_cost_usd,
+        0
+      );
+
+      phaseMetrics[phaseName] = {
+        duration_ms: phaseDuration,
+        duration_percentage: calculatePercentage(phaseDuration, totalDuration),
+        cost_usd: phaseCost,
+        agent_count: agentList.length
+      };
+    }
+
+    return phaseMetrics;
+  }
+
+  /**
+   * Get current metrics
+   * @returns {Object} Current metrics data
+   */
+  getMetrics() {
+    return JSON.parse(JSON.stringify(this.data));
+  }
+
+  /**
+   * Save metrics to session.json (atomic write)
+   * @private
+   * @returns {Promise<void>}
+   */
+  async save() {
+    await atomicWrite(this.sessionJsonPath, this.data);
+  }
+
+  /**
+   * Reload metrics from disk
+   * @returns {Promise<void>}
+   */
+  async reload() {
+    this.data = await readJson(this.sessionJsonPath);
+  }
+}
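Note on the aggregation rule above: `recalculateAggregations` counts only agents whose status is `success`. A rolled-back or failed agent therefore drops out of the session-level `total_duration_ms` and `total_cost_usd`, even though its own `total_cost_usd` still records what its attempts cost. A small worked sketch of that rule, assuming the data shapes from this hunk. `summarize` is a hypothetical helper and the numbers are illustrative.

```javascript
// Illustrative restatement of the aggregation rule (hypothetical helper,
// not part of this PR). Percentages are relative to the successful total.
function summarize(agents) {
  const successful = Object.entries(agents).filter(([, a]) => a.status === 'success');
  const totalDuration = successful.reduce((sum, [, a]) => sum + a.final_duration_ms, 0);
  const totalCost = successful.reduce((sum, [, a]) => sum + a.total_cost_usd, 0);
  const share = {};
  for (const [name, a] of successful) {
    share[name] = totalDuration === 0 ? 0 : (a.final_duration_ms / totalDuration) * 100;
  }
  return { totalDuration, totalCost, counted: successful.length, share };
}

const agents = {
  'pre-recon': { status: 'success', final_duration_ms: 75000, total_cost_usd: 2 },
  'recon': { status: 'success', final_duration_ms: 25000, total_cost_usd: 1.5 },
  'xss-vuln': { status: 'rolled-back', final_duration_ms: 45000, total_cost_usd: 3 },
  'auth-vuln': { status: 'failed', final_duration_ms: 0, total_cost_usd: 0.5 }
};

const totals = summarize(agents);
console.log(totals.totalDuration); // 100000
console.log(totals.totalCost); // 3.5
console.log(totals.share['pre-recon']); // 75
```

In this sample the rolled-back and failed agents account for 3.5 USD that the session total does not show, so a reader who wants total spend has to sum the per-agent `attempts` instead.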
@@ -0,0 +1,199 @@
+/**
+ * Audit System Utilities
+ *
+ * Core utility functions for path generation, atomic writes, and formatting.
+ * Path generators and formatters are pure; the file helpers perform I/O.
+ */
+
+import fs from 'fs/promises';
+import path from 'path';
+import { fileURLToPath } from 'url';
+
+const __filename = fileURLToPath(import.meta.url);
+const __dirname = path.dirname(__filename);
+
+// Get Shannon repository root
+export const SHANNON_ROOT = path.resolve(__dirname, '..', '..');
+export const AUDIT_LOGS_DIR = path.join(SHANNON_ROOT, 'audit-logs');
+
+/**
+ * Generate standardized session identifier: {hostname}_{sessionId}
+ * @param {Object} sessionMetadata - Session metadata from Shannon store
+ * @param {string} sessionMetadata.id - UUID session ID
+ * @param {string} sessionMetadata.webUrl - Target web URL
+ * @returns {string} Formatted session identifier
+ */
+export function generateSessionIdentifier(sessionMetadata) {
+  const { id, webUrl } = sessionMetadata;
+  const hostname = new URL(webUrl).hostname.replace(/[^a-zA-Z0-9-]/g, '-');
+  return `${hostname}_${id}`;
+}
+
+/**
+ * Generate path to audit log directory for a session
+ * @param {Object} sessionMetadata - Session metadata
+ * @returns {string} Absolute path to session audit directory
+ */
+export function generateAuditPath(sessionMetadata) {
+  const sessionIdentifier = generateSessionIdentifier(sessionMetadata);
+  return path.join(AUDIT_LOGS_DIR, sessionIdentifier);
+}
+
+/**
+ * Generate path to agent log file
+ * @param {Object} sessionMetadata - Session metadata
+ * @param {string} agentName - Name of the agent
+ * @param {number} timestamp - Timestamp (ms since epoch)
+ * @param {number} attemptNumber - Attempt number (1, 2, 3, ...)
+ * @returns {string} Absolute path to agent log file
+ */
+export function generateLogPath(sessionMetadata, agentName, timestamp, attemptNumber) {
+  const auditPath = generateAuditPath(sessionMetadata);
+  const filename = `${timestamp}_${agentName}_attempt-${attemptNumber}.log`;
+  return path.join(auditPath, 'agents', filename);
+}
+
+/**
+ * Generate path to prompt snapshot file
+ * @param {Object} sessionMetadata - Session metadata
+ * @param {string} agentName - Name of the agent
+ * @returns {string} Absolute path to prompt file
+ */
+export function generatePromptPath(sessionMetadata, agentName) {
+  const auditPath = generateAuditPath(sessionMetadata);
+  return path.join(auditPath, 'prompts', `${agentName}.md`);
+}
+
+/**
+ * Generate path to session.json file
+ * @param {Object} sessionMetadata - Session metadata
+ * @returns {string} Absolute path to session.json
+ */
+export function generateSessionJsonPath(sessionMetadata) {
+  const auditPath = generateAuditPath(sessionMetadata);
+  return path.join(auditPath, 'session.json');
+}
+
+/**
+ * Ensure directory exists (idempotent, race-safe)
+ * @param {string} dirPath - Directory path to create
+ * @returns {Promise<void>}
+ */
+export async function ensureDirectory(dirPath) {
+  try {
+    await fs.mkdir(dirPath, { recursive: true });
+  } catch (error) {
+    // Defensive: with recursive: true, mkdir already resolves when the directory exists
+    if (error.code !== 'EEXIST') {
+      throw error;
+    }
+  }
+}
+
+/**
+ * Atomic write using temp file + rename pattern
+ * Readers never observe a partially written file; no fsync is issued, so durability across power loss is not guaranteed
+ * @param {string} filePath - Target file path
+ * @param {Object|string} data - Data to write (will be JSON.stringified if object)
+ * @returns {Promise<void>}
+ */
+export async function atomicWrite(filePath, data) {
+  const tempPath = `${filePath}.tmp`;
+  const content = typeof data === 'string' ? data : JSON.stringify(data, null, 2);
+
+  try {
+    // Write to temp file
+    await fs.writeFile(tempPath, content, 'utf8');
+
+    // Atomic rename (POSIX guarantee: atomic on same filesystem)
+    await fs.rename(tempPath, filePath);
+  } catch (error) {
+    // Clean up temp file on failure
+    try {
+      await fs.unlink(tempPath);
+    } catch (cleanupError) {
+      // Ignore cleanup errors
+    }
+    throw error;
+  }
+}
+
+/**
+ * Format duration in milliseconds to human-readable string
+ * @param {number} ms - Duration in milliseconds
+ * @returns {string} Formatted duration (e.g., "2m 34s", "45.0s", "850ms")
+ */
+export function formatDuration(ms) {
+  if (ms < 1000) {
+    return `${ms}ms`;
+  }
+
+  const seconds = ms / 1000;
+  if (seconds < 60) {
+    return `${seconds.toFixed(1)}s`;
+  }
+
+  const minutes = Math.floor(seconds / 60);
+  const remainingSeconds = Math.floor(seconds % 60);
+  return `${minutes}m ${remainingSeconds}s`;
+}
+
+/**
+ * Format timestamp to ISO 8601 string
+ * @param {number} [timestamp] - Unix timestamp in ms (defaults to now)
+ * @returns {string} ISO 8601 formatted string
+ */
+export function formatTimestamp(timestamp = Date.now()) {
+  return new Date(timestamp).toISOString();
+}
+
+/**
+ * Calculate percentage
+ * @param {number} part - Part value
+ * @param {number} total - Total value
+ * @returns {number} Percentage (0-100)
+ */
+export function calculatePercentage(part, total) {
+  if (total === 0) return 0;
+  return (part / total) * 100;
+}
+
+/**
+ * Read and parse JSON file
+ * @param {string} filePath - Path to JSON file
+ * @returns {Promise<Object>} Parsed JSON data
+ */
+export async function readJson(filePath) {
+  const content = await fs.readFile(filePath, 'utf8');
+  return JSON.parse(content);
+}
+
+/**
+ * Check if file exists
+ * @param {string} filePath - Path to check
+ * @returns {Promise<boolean>} True if file exists
+ */
+export async function fileExists(filePath) {
+  try {
+    await fs.access(filePath);
+    return true;
+  } catch {
+    return false;
+  }
+}
+
+/**
+ * Initialize audit directory structure for a session
+ * Creates: audit-logs/{hostname}_{sessionId}/ with agents/ and prompts/ subdirectories
+ * @param {Object} sessionMetadata - Session metadata
+ * @returns {Promise<void>}
+ */
+export async function initializeAuditStructure(sessionMetadata) {
+  const auditPath = generateAuditPath(sessionMetadata);
+  const agentsPath = path.join(auditPath, 'agents');
+  const promptsPath = path.join(auditPath, 'prompts');
+
+  await ensureDirectory(auditPath);
+  await ensureDirectory(agentsPath);
+  await ensureDirectory(promptsPath);
+}
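Note on `formatDuration` above: it replaces the helper that this commit removes from the checkpoint manager, and its output differs in two ways. Values under a minute carry one decimal place, and there is no hours unit, so long runs are reported in minutes. The sketch below restates the function from this hunk (without the `export`) and prints a few sample inputs; the inputs are chosen for illustration.

```javascript
// Restated from the hunk above so the example is self-contained.
function formatDuration(ms) {
  if (ms < 1000) {
    return `${ms}ms`;
  }

  const seconds = ms / 1000;
  if (seconds < 60) {
    return `${seconds.toFixed(1)}s`;
  }

  const minutes = Math.floor(seconds / 60);
  const remainingSeconds = Math.floor(seconds % 60);
  return `${minutes}m ${remainingSeconds}s`;
}

const samples = [850, 1200, 45000, 59999, 154000, 3700000];
const formatted = samples.map((ms) => formatDuration(ms));
console.log(formatted);
// [ '850ms', '1.2s', '45.0s', '60.0s', '2m 34s', '61m 40s' ]
```

Two edge cases are visible here: 59999 ms rounds up to `60.0s` because the sub-minute branch rounds rather than truncates, and 3700000 ms prints as `61m 40s` where the removed helper would have printed `1h 1m 40s`.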
@@ -3,6 +3,7 @@ import chalk from 'chalk';
 import { PentestError } from './error-handling.js';
 import { parseConfig, distributeConfig } from './config-parser.js';
 import { executeGitCommandWithRetry } from './utils/git-manager.js';
+import { formatDuration } from './audit/utils.js';
 import {
  AGENTS,
  PHASES,
@@ -76,10 +77,10 @@ const rollbackGitToCommit = async (targetRepo, commitHash) => {
 };

 // Run a single agent with retry logic and checkpointing
-export const runSingleAgent = async (agentName, session, pipelineTestingMode, runClaudePromptWithRetry, loadPrompt, allowRerun = false, skipWorkspaceClean = false) => {
+const runSingleAgent = async (agentName, session, pipelineTestingMode, runClaudePromptWithRetry, loadPrompt, allowRerun = false, skipWorkspaceClean = false) => {
  // Validate agent first
  const agent = validateAgent(agentName);
-  
+
  console.log(chalk.cyan(`\n🤖 Running agent: ${agent.displayName}`));
  
  // Reload session to get latest state (important for agent ranges)
@@ -191,7 +192,7 @@ export const runSingleAgent = async (agentName, session, pipelineTestingMode, ru
      AGENTS[agentName].displayName,
      agentName,  // Pass agent name for snapshot creation
      getAgentColor(agentName),  // Pass color function for this agent
-      { webUrl: session.webUrl, sessionId: session.id }  // Session metadata for logging
+      { id: session.id, webUrl: session.webUrl, repoPath: session.repoPath }  // Session metadata for audit logging
    );
    
    if (!result.success) {
@@ -218,12 +219,12 @@ export const runSingleAgent = async (agentName, session, pipelineTestingMode, ru
        const validation = await safeValidateQueueAndDeliverable(vulnType, targetRepo);

        if (validation.success) {
+          // Log validation result (don't store - will be re-validated during exploitation phase)
+          console.log(chalk.blue(`📋 Validation: ${validation.data.shouldExploit ? `Ready for exploitation (${validation.data.vulnerabilityCount} vulnerabilities)` : 'No vulnerabilities found'}`));
          validationData = {
            shouldExploit: validation.data.shouldExploit,
-            vulnerabilityCount: validation.data.vulnerabilityCount,
-            validatedAt: new Date().toISOString()
+            vulnerabilityCount: validation.data.vulnerabilityCount
          };
-          console.log(chalk.blue(`📋 Validation: ${validationData.shouldExploit ? `Ready for exploitation (${validationData.vulnerabilityCount} vulnerabilities)` : 'No vulnerabilities found'}`));
        } else {
          console.log(chalk.yellow(`⚠️ Validation failed: ${validation.error.message}`));
        }
@@ -232,8 +233,8 @@ export const runSingleAgent = async (agentName, session, pipelineTestingMode, ru
      }
    }

-    // Mark agent as completed
-    await markAgentCompleted(session.id, agentName, commitHash, timingData, costData, validationData);
+    // Mark agent as completed (validation not stored - will be re-checked during exploitation)
+    await markAgentCompleted(session.id, agentName, commitHash);

    // Only show completion message for sequential execution
    if (!skipWorkspaceClean) {
@@ -299,7 +300,7 @@ export const runSingleAgent = async (agentName, session, pipelineTestingMode, ru
 };

 // Run multiple agents in sequence
-export const runAgentRange = async (startAgent, endAgent, session, pipelineTestingMode, runClaudePromptWithRetry, loadPrompt) => {
+const runAgentRange = async (startAgent, endAgent, session, pipelineTestingMode, runClaudePromptWithRetry, loadPrompt) => {
  const agents = validateAgentRange(startAgent, endAgent);
  
  console.log(chalk.cyan(`\n🔄 Running agent range: ${startAgent} to ${endAgent} (${agents.length} agents)`));
@@ -323,7 +324,7 @@ export const runAgentRange = async (startAgent, endAgent, session, pipelineTesti
 };

 // Run vulnerability agents in parallel
-export const runParallelVuln = async (session, pipelineTestingMode, runClaudePromptWithRetry, loadPrompt) => {
+const runParallelVuln = async (session, pipelineTestingMode, runClaudePromptWithRetry, loadPrompt) => {
  const vulnAgents = ['injection-vuln', 'xss-vuln', 'auth-vuln', 'ssrf-vuln', 'authz-vuln'];
  const activeAgents = vulnAgents.filter(agent => !session.completedAgents.includes(agent));

@@ -421,7 +422,7 @@ export const runParallelVuln = async (session, pipelineTestingMode, runClaudePro
 };

 // Run exploitation agents in parallel
-export const runParallelExploit = async (session, pipelineTestingMode, runClaudePromptWithRetry, loadPrompt) => {
+const runParallelExploit = async (session, pipelineTestingMode, runClaudePromptWithRetry, loadPrompt) => {
  const exploitAgents = ['injection-exploit', 'xss-exploit', 'auth-exploit', 'ssrf-exploit', 'authz-exploit'];

  // Get fresh session data to ensure we have the latest vulnerability analysis results
@@ -429,25 +430,36 @@ export const runParallelExploit = async (session, pipelineTestingMode, runClaude
  const { getSession } = await import('./session-manager.js');
  const freshSession = await getSession(session.id);

+  // Load validation module
+  const { safeValidateQueueAndDeliverable } = await import('./queue-validation.js');
+
  // Only run exploit agents whose vuln counterparts completed successfully AND found vulnerabilities
-  const eligibleAgents = exploitAgents.filter(agentName => {
-    const vulnAgentName = agentName.replace('-exploit', '-vuln');
+  const eligibilityChecks = await Promise.all(
+    exploitAgents.map(async (agentName) => {
+      const vulnAgentName = agentName.replace('-exploit', '-vuln');

-    // Must have completed the vulnerability analysis
-    if (!freshSession.completedAgents.includes(vulnAgentName)) {
-      return false;
-    }
+      // Must have completed the vulnerability analysis
+      if (!freshSession.completedAgents.includes(vulnAgentName)) {
+        return { agentName, eligible: false };
+      }

-    // Must have found vulnerabilities to exploit
-    const validationResult = freshSession.validationResults?.[vulnAgentName];
-    if (!validationResult || !validationResult.shouldExploit) {
-      console.log(chalk.gray(`⏭️  Skipping ${agentName} (no vulnerabilities found in ${vulnAgentName})`));
-      return false;
-    }
+      // Check if vulnerabilities were found by validating the queue file
+      const vulnType = vulnAgentName.replace('-vuln', ''); // "injection-vuln" -> "injection"
+      const validation = await safeValidateQueueAndDeliverable(vulnType, freshSession.targetRepo);

-    console.log(chalk.blue(`✓ ${agentName} eligible (${validationResult.vulnerabilityCount} vulnerabilities from ${vulnAgentName})`));
-    return true;
-  });
+      if (!validation.success || !validation.data.shouldExploit) {
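+        // Distinguish a failed validation from a clean result so the skip reason is accurate
+        const reason = validation.success ? 'no vulnerabilities found in' : 'queue validation failed for';
+        console.log(chalk.gray(`⏭️  Skipping ${agentName} (${reason} ${vulnAgentName})`));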
+        return { agentName, eligible: false };
+      }
+
+      console.log(chalk.blue(`✓ ${agentName} eligible (${validation.data.vulnerabilityCount} vulnerabilities from ${vulnAgentName})`));
+      return { agentName, eligible: true };
+    })
+  );
+
+  const eligibleAgents = eligibilityChecks
+    .filter(check => check.eligible)
+    .map(check => check.agentName);

  const activeAgents = eligibleAgents.filter(agent => !freshSession.completedAgents.includes(agent));

@@ -616,13 +628,35 @@ export const rollbackTo = async (targetAgent, session) => {
  }
  
  const commitHash = session.checkpoints[targetAgent];
-  
+
  // Rollback git workspace
  await rollbackGitToCommit(session.targetRepo, commitHash);
-  
-  // Update session state
+
+  // Update session state (removes agents from completedAgents)
  await rollbackToAgent(session.id, targetAgent);
-  
+
+  // Mark rolled-back agents in audit system (for forensic trail)
+  try {
+    const { AuditSession } = await import('./audit/index.js');
+    const auditSession = new AuditSession(session);
+    await auditSession.initialize();
+
+    // Find agents that were rolled back (agents after targetAgent)
+    const targetOrder = AGENTS[targetAgent].order;
+    const rolledBackAgents = Object.values(AGENTS)
+      .filter(agent => agent.order > targetOrder)
+      .map(agent => agent.name);
+
+    // Mark them as rolled-back in audit system
+    if (rolledBackAgents.length > 0) {
+      await auditSession.markMultipleRolledBack(rolledBackAgents);
+      console.log(chalk.gray(`   Marked ${rolledBackAgents.length} agents as rolled-back in audit logs`));
+    }
+  } catch (error) {
+    // Non-critical: rollback succeeded even if audit update failed
+    console.log(chalk.yellow(`   ⚠️ Failed to update audit logs: ${error.message}`));
+  }
+
  console.log(chalk.green(`✅ Successfully rolled back to agent '${targetAgent}'`));
 };

@@ -867,23 +901,3 @@ const getTimeAgo = (timestamp) => {
  }
 };

-// Helper function to format duration in milliseconds to human readable format
-const formatDuration = (durationMs) => {
-  if (durationMs < 1000) {
-    return `${durationMs}ms`;
-  }
-  
-  const seconds = Math.floor(durationMs / 1000);
-  const minutes = Math.floor(seconds / 60);
-  const hours = Math.floor(minutes / 60);
-  
-  if (hours > 0) {
-    return `${hours}h ${minutes % 60}m ${seconds % 60}s`;
-  } else if (minutes > 0) {
-    return `${minutes}m ${seconds % 60}s`;
-  } else {
-    return `${seconds}s`;
-  }
-};
-
-
@@ -1,13 +1,13 @@
 import chalk from 'chalk';
 import {
  selectSession, deleteSession, deleteAllSessions,
-  validateAgent, validatePhase
+  validateAgent, validatePhase, reconcileSession
 } from '../session-manager.js';
 import {
  runPhase, runAll, rollbackTo, rerunAgent, displayStatus, listAgents
 } from '../checkpoint-manager.js';
 import { logError, PentestError } from '../error-handling.js';
-import { cleanupMCP } from '../setup/environment.js';
+import { promptConfirmation } from './prompts.js';

 // Developer command handlers
 export async function handleDeveloperCommand(command, args, pipelineTestingMode, runClaudePromptWithRetry, loadPrompt) {
@@ -27,41 +27,19 @@ export async function handleDeveloperCommand(command, args, pipelineTestingMode,
        const sessionId = args[0];
        const deletedSession = await deleteSession(sessionId);
        console.log(chalk.green(`✅ Deleted session ${sessionId} (${new URL(deletedSession.webUrl).hostname})`));
-        // Clean up MCP agents when deleting specific session
-        await cleanupMCP();
      } else {
        // Cleanup all sessions - require confirmation
-        console.log(chalk.yellow('⚠️  This will delete all pentest sessions. Are you sure? (y/N):'));
-        const { createInterface } = await import('readline');
-        const readline = createInterface({
-          input: process.stdin,
-          output: process.stdout
-        });
-
-        await new Promise((resolve) => {
-          readline.question('', (answer) => {
-            readline.close();
-            if (answer.toLowerCase() === 'y' || answer.toLowerCase() === 'yes') {
-              deleteAllSessions().then(deleted => {
-                if (deleted) {
-                  console.log(chalk.green('✅ All sessions deleted'));
-                } else {
-                  console.log(chalk.yellow('⚠️  No sessions found to delete'));
-                }
-                // Clean up MCP agents after deleting sessions
-                return cleanupMCP();
-              }).then(() => {
-                resolve();
-              }).catch(error => {
-                console.log(chalk.red(`❌ Failed to delete sessions: ${error.message}`));
-                resolve();
-              });
-            } else {
-              console.log(chalk.gray('Cleanup cancelled'));
-              resolve();
-            }
-          });
-        });
+        const confirmed = await promptConfirmation(chalk.yellow('⚠️  This will delete all pentest sessions. Are you sure? (y/N):'));
+        if (confirmed) {
+          const deleted = await deleteAllSessions();
+          if (deleted) {
+            console.log(chalk.green('✅ All sessions deleted'));
+          } else {
+            console.log(chalk.yellow('⚠️  No sessions found to delete'));
+          }
+        } else {
+          console.log(chalk.gray('Cleanup cancelled'));
+        }
      }
      return;
    }
@@ -94,6 +72,29 @@ export async function handleDeveloperCommand(command, args, pipelineTestingMode,
      process.exit(1);
    }

+    // Self-healing: Reconcile session with audit logs before executing command
+    // This ensures Shannon store is consistent with audit data, even after crash recovery
+    try {
+      const reconcileReport = await reconcileSession(session.id);
+
+      if (reconcileReport.promotions.length > 0) {
+        console.log(chalk.blue(`🔄 Reconciled: Added ${reconcileReport.promotions.length} completed agents from audit logs`));
+      }
+      if (reconcileReport.demotions.length > 0) {
+        console.log(chalk.yellow(`🔄 Reconciled: Removed ${reconcileReport.demotions.length} rolled-back agents`));
+      }
+      if (reconcileReport.failures.length > 0) {
+        console.log(chalk.yellow(`🔄 Reconciled: Marked ${reconcileReport.failures.length} failed agents`));
+      }
+
+      // Reload session after reconciliation to get fresh state
+      const { getSession } = await import('../session-manager.js');
+      session = await getSession(session.id);
+    } catch (error) {
+      // Reconciliation failure is non-critical, but log warning
+      console.log(chalk.yellow(`⚠️  Failed to reconcile session with audit logs: ${error.message}`));
+    }
+
    switch (command) {

      case '--run-phase':
@@ -0,0 +1,62 @@
+import { createInterface } from 'readline';
+import { PentestError } from '../error-handling.js';
+
+/**
+ * Prompt user for yes/no confirmation
+ * @param {string} message - Question to display
+ * @returns {Promise<boolean>} true if confirmed, false otherwise
+ */
+export async function promptConfirmation(message) {
+  const readline = createInterface({
+    input: process.stdin,
+    output: process.stdout
+  });
+
+  return new Promise((resolve) => {
+    readline.question(message + ' ', (answer) => {
+      readline.close();
+      const confirmed = answer.toLowerCase() === 'y' || answer.toLowerCase() === 'yes';
+      resolve(confirmed);
+    });
+  });
+}
+
+/**
+ * Prompt user to select from numbered list
+ * @param {string} message - Selection prompt
+ * @param {Array} items - Items to choose from
+ * @returns {Promise<any>} Selected item
+ * @throws {PentestError} If invalid selection
+ */
+export async function promptSelection(message, items) {
+  if (!items || items.length === 0) {
+    throw new PentestError(
+      'No items available for selection',
+      'validation',
+      false
+    );
+  }
+
+  const readline = createInterface({
+    input: process.stdin,
+    output: process.stdout
+  });
+
+  return new Promise((resolve, reject) => {
+    readline.question(message + ' ', (answer) => {
+      readline.close();
+
+      const choice = parseInt(answer, 10);
+      if (isNaN(choice) || choice < 1 || choice > items.length) {
+        reject(new PentestError(
+          `Invalid selection. Please enter a number between 1 and ${items.length}`,
+          'validation',
+          false,
+          { choice: answer }
+        ));
+      } else {
+        resolve(items[choice - 1]);
+      }
+    });
+  });
+}
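
Review note: the validation inside `promptSelection` can be exercised without stdin if it is pulled into a small function. The sketch below is only an illustration, not code from this commit. `parseSelection` is a hypothetical name, and it throws a plain `RangeError` instead of `PentestError` to stay self-contained. It also rejects input such as `2abc`, which `parseInt` alone would accept as `2`.

```javascript
// Hypothetical helper (not part of this diff): the 1-based selection check
// from promptSelection, separated from the readline plumbing.
function parseSelection(answer, items) {
  const text = String(answer).trim();
  const message = `Invalid selection. Please enter a number between 1 and ${items.length}`;

  // Accept plain decimal digits only; parseInt would silently accept "2abc" as 2.
  if (!/^\d+$/.test(text)) {
    throw new RangeError(message);
  }

  const choice = Number.parseInt(text, 10);
  if (choice < 1 || choice > items.length) {
    throw new RangeError(message);
  }

  // The prompt is 1-based, the array is 0-based.
  return items[choice - 1];
}
```

With this split, the readline callback would only need to call the helper inside a `try`/`catch` and forward the result to `resolve` or `reject`.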
@@ -21,7 +21,6 @@ export function showHelp() {

  console.log(chalk.yellow.bold('OPTIONS:'));
  console.log('  --config <file>      YAML configuration file for authentication and testing parameters');
-  console.log('  --log [file]         Capture all output to log file (default: shannon-<timestamp>.log)');
  console.log('  --pipeline-testing   Use minimal prompts for fast pipeline testing (creates minimal deliverables)\n');

  console.log(chalk.yellow.bold('DEVELOPER COMMANDS:'));
@@ -37,7 +36,6 @@ export function showHelp() {
  console.log('  # Normal mode - create new session');
  console.log('  ./shannon.mjs "https://example.com" "/path/to/local/repo"');
  console.log('  ./shannon.mjs "https://example.com" "/path/to/local/repo" --config auth.yaml');
-  console.log('  ./shannon.mjs "https://example.com" "/path/to/local/repo" --log pentest.log');
  console.log('  ./shannon.mjs "https://example.com" "/path/to/local/repo" --setup-only  # Setup only\n');

  console.log('  # Developer mode - operate on existing session');
@@ -2,6 +2,27 @@ import { path, fs } from 'zx';
 import chalk from 'chalk';
 import { validateQueueAndDeliverable } from './queue-validation.js';

+// Factory function for vulnerability queue validators
+function createVulnValidator(vulnType) {
+  return async (sourceDir) => {
+    try {
+      await validateQueueAndDeliverable(vulnType, sourceDir);
+      return true;
+    } catch (error) {
+      console.log(chalk.yellow(`   Queue validation failed for ${vulnType}: ${error.message}`));
+      return false;
+    }
+  };
+}
+
+// Factory function for exploit deliverable validators
+function createExploitValidator(vulnType) {
+  return async (sourceDir) => {
+    const evidenceFile = path.join(sourceDir, 'deliverables', `${vulnType}_exploitation_evidence.md`);
+    return await fs.pathExists(evidenceFile);
+  };
+}
+
 // MCP agent mapping - assigns each agent to a specific Playwright instance to prevent conflicts
 export const MCP_AGENT_MAPPING = Object.freeze({
  // Phase 1: Pre-reconnaissance (actual prompt name is 'pre-recon-code')
@@ -47,81 +68,18 @@ export const AGENT_VALIDATORS = Object.freeze({
  },

  // Vulnerability analysis agents
-  'injection-vuln': async (sourceDir) => {
-    try {
-      await validateQueueAndDeliverable('injection', sourceDir);
-      return true;
-    } catch (error) {
-      console.log(chalk.yellow(`   Queue validation failed for injection: ${error.message}`));
-      return false;
-    }
-  },
-
-  'xss-vuln': async (sourceDir) => {
-    try {
-      await validateQueueAndDeliverable('xss', sourceDir);
-      return true;
-    } catch (error) {
-      console.log(chalk.yellow(`   Queue validation failed for xss: ${error.message}`));
-      return false;
-    }
-  },
-
-  'auth-vuln': async (sourceDir) => {
-    try {
-      await validateQueueAndDeliverable('auth', sourceDir);
-      return true;
-    } catch (error) {
-      console.log(chalk.yellow(`   Queue validation failed for auth: ${error.message}`));
-      return false;
-    }
-  },
-
-  'ssrf-vuln': async (sourceDir) => {
-    try {
-      await validateQueueAndDeliverable('ssrf', sourceDir);
-      return true;
-    } catch (error) {
-      console.log(chalk.yellow(`   Queue validation failed for ssrf: ${error.message}`));
-      return false;
-    }
-  },
-
-  'authz-vuln': async (sourceDir) => {
-    try {
-      await validateQueueAndDeliverable('authz', sourceDir);
-      return true;
-    } catch (error) {
-      console.log(chalk.yellow(`   Queue validation failed for authz: ${error.message}`));
-      return false;
-    }
-  },
+  'injection-vuln': createVulnValidator('injection'),
+  'xss-vuln': createVulnValidator('xss'),
+  'auth-vuln': createVulnValidator('auth'),
+  'ssrf-vuln': createVulnValidator('ssrf'),
+  'authz-vuln': createVulnValidator('authz'),

  // Exploitation agents
-  'injection-exploit': async (sourceDir) => {
-    const evidenceFile = path.join(sourceDir, 'deliverables', 'injection_exploitation_evidence.md');
-    return await fs.pathExists(evidenceFile);
-  },
-
-  'xss-exploit': async (sourceDir) => {
-    const evidenceFile = path.join(sourceDir, 'deliverables', 'xss_exploitation_evidence.md');
-    return await fs.pathExists(evidenceFile);
-  },
-
-  'auth-exploit': async (sourceDir) => {
-    const evidenceFile = path.join(sourceDir, 'deliverables', 'auth_exploitation_evidence.md');
-    return await fs.pathExists(evidenceFile);
-  },
-
-  'ssrf-exploit': async (sourceDir) => {
-    const evidenceFile = path.join(sourceDir, 'deliverables', 'ssrf_exploitation_evidence.md');
-    return await fs.pathExists(evidenceFile);
-  },
-
-  'authz-exploit': async (sourceDir) => {
-    const evidenceFile = path.join(sourceDir, 'deliverables', 'authz_exploitation_evidence.md');
-    return await fs.pathExists(evidenceFile);
-  },
+  'injection-exploit': createExploitValidator('injection'),
+  'xss-exploit': createExploitValidator('xss'),
+  'auth-exploit': createExploitValidator('auth'),
+  'ssrf-exploit': createExploitValidator('ssrf'),
+  'authz-exploit': createExploitValidator('authz'),

  // Executive report agent
  'report': async (sourceDir) => {
@@ -51,18 +51,6 @@ export const logError = async (error, contextMsg, sourceDir = null) => {
  return logEntry;
 };

-// Handle configuration parsing errors
-const handleConfigError = (error, configPath) => {
-  const configError = new PentestError(
-    `Configuration error in ${configPath}: ${error.message}. Check your config.yaml file format and try again.`,
-    'config',
-    false,
-    { configPath, originalError: error.message }
-  );
-  throw configError;
-};
-
-
 // Handle tool execution errors
 export const handleToolError = (toolName, error) => {
  const isRetryable = error.code === 'ECONNRESET' || error.code === 'ETIMEDOUT' || error.code === 'ENOTFOUND';
@@ -167,22 +155,4 @@ export const getRetryDelay = (error, attempt) => {
  const baseDelay = Math.pow(2, attempt) * 1000; // 2s, 4s, 8s
  const jitter = Math.random() * 1000; // 0-1s random
  return Math.min(baseDelay + jitter, 30000); // Max 30s
-};
-
-// General error handler with context
-const handleError = (error, context, isFatal = false) => {
-  const pentestError = error instanceof PentestError 
-    ? error 
-    : new PentestError(error.message, 'unknown', false, { context, originalError: error.message });
-  
-  if (isFatal) {
-    pentestError.type = 'fatal';
-    throw pentestError;
-  }
-  
-  return {
-    success: false,
-    error: pentestError,
-    continuable: !isFatal
-  };
 };
@@ -1,6 +1,7 @@
 import { $, fs, path } from 'zx';
 import chalk from 'chalk';
-import { Timer, timingResults, formatDuration } from '../utils/metrics.js';
+import { Timer, timingResults } from '../utils/metrics.js';
+import { formatDuration } from '../audit/utils.js';
 import { handleToolError, PentestError } from '../error-handling.js';
 import { AGENTS } from '../session-manager.js';
 import { runClaudePromptWithRetry } from '../ai/claude-executor.js';
@@ -99,7 +100,7 @@ async function runPreReconWave1(webUrl, sourceDir, variables, config, pipelineTe
        AGENTS['pre-recon'].displayName,
        'pre-recon',  // Agent name for snapshot creation
        chalk.cyan,
-        { webUrl, sessionId }  // Session metadata for logging
+        { id: sessionId, webUrl }  // Session metadata for audit logging (STANDARD: use 'id' field)
      )
    );
    const [codeAnalysis] = await Promise.all(operations);
@@ -123,7 +124,7 @@ async function runPreReconWave1(webUrl, sourceDir, variables, config, pipelineTe
        AGENTS['pre-recon'].displayName,
        'pre-recon',  // Agent name for snapshot creation
        chalk.cyan,
-        { webUrl, sessionId }  // Session metadata for logging
+        { id: sessionId, webUrl }  // Session metadata for audit logging (STANDARD: use 'id' field)
      )
    );
  }
@@ -22,10 +22,6 @@ export class ProgressIndicator {
    }, 100);
  }

-  updateMessage(newMessage) {
-    this.message = newMessage;
-  }
-
  stop() {
    if (!this.isRunning) return;

@@ -7,7 +7,7 @@ import { MCP_AGENT_MAPPING } from '../constants.js';
 async function buildLoginInstructions(authentication) {
  try {
    // Load the login instructions template
-    const loginInstructionsPath = path.join(import.meta.dirname, '..', '..', 'login_resources', 'login_instructions.txt');
+    const loginInstructionsPath = path.join(import.meta.dirname, '..', '..', 'prompts', 'shared', 'login-instructions.txt');

    if (!await fs.pathExists(loginInstructionsPath)) {
      throw new PentestError(
@@ -84,6 +84,27 @@ async function buildLoginInstructions(authentication) {
  }
 }

+// Process @include() directives (reads the included files from disk, so not strictly pure)
+async function processIncludes(content, baseDir) {
+  const includeRegex = /@include\(([^)]+)\)/g;
+  // Use Promise.all to load all included files concurrently
+  const replacements = await Promise.all(
+    Array.from(content.matchAll(includeRegex)).map(async (match) => {
+      const includePath = path.join(baseDir, match[1]);
+      const sharedContent = await fs.readFile(includePath, 'utf8');
+      return {
+        placeholder: match[0],
+        content: sharedContent,
+      };
+    })
+  );
+
+  for (const replacement of replacements) {
+    content = content.replace(replacement.placeholder, () => replacement.content); // function replacer keeps '$' sequences in included text literal
+  }
+  return content;
+}
+
 // Pure function: Variable interpolation
 async function interpolateVariables(template, variables, config = null) {
  try {
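
Review note on `processIncludes` above: when the second argument of `String.prototype.replace` is a string, sequences such as `$&` or `$1` in it are treated as replacement patterns, so an included prompt file containing them would be altered on insertion. A minimal sketch of the substitution step with a function replacer follows. `applyIncludes` and the `loaded` map are hypothetical names used only for illustration, and the file contents are assumed to have been read beforehand.

```javascript
// Hypothetical sketch (not part of this diff): substitute @include(path)
// directives using already-loaded file contents.
const INCLUDE_PATTERN = /@include\(([^)]+)\)/g;

function applyIncludes(content, loaded) {
  // A function replacer inserts its return value verbatim, so '$&' or '$1'
  // inside an included file is not interpreted as a replacement pattern.
  return content.replace(INCLUDE_PATTERN, (placeholder, includePath) => {
    const key = includePath.trim();
    if (!loaded.has(key)) {
      throw new Error(`Missing include: ${key}`);
    }
    return loaded.get(key);
  });
}
```

Because the pattern carries the `g` flag, every directive in the template is handled in one pass, including repeated includes of the same file.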
@@ -198,7 +219,11 @@ export async function loadPrompt(promptName, variables, config = null, pipelineT
      console.log(chalk.yellow(`    🎭 Unknown agent ${promptName}, using fallback → ${enhancedVariables.MCP_SERVER}`));
    }

-    const template = await fs.readFile(promptPath, 'utf8');
+    let template = await fs.readFile(promptPath, 'utf8');
+
+    // Pre-process the template to handle @include directives
+    template = await processIncludes(template, promptsDir);
+
    return await interpolateVariables(template, enhancedVariables, config);
  } catch (error) {
    if (error instanceof PentestError) {
@@ -207,36 +232,4 @@ export async function loadPrompt(promptName, variables, config = null, pipelineT
    const promptError = handlePromptError(promptName, error);
    throw promptError.error;
  }
-}
-
-// Save prompt snapshot for successful agent runs only
-export async function savePromptSnapshot(sourceDir, agentName, promptContent) {
-  const snapshotDir = path.join(sourceDir, 'prompt-snapshots');
-  await fs.ensureDir(snapshotDir);
-
-  // Use deterministic naming - one snapshot per agent
-  const fileName = `${agentName}.md`;
-  const filePath = path.join(snapshotDir, fileName);
-
-  const timestamp = new Date().toISOString();
-  const snapshotContent = `# Prompt Snapshot: ${agentName}
-
-**Generated:** ${timestamp}
-**Agent:** ${agentName}
-
---
-
-## Full Interpolated Prompt
-
-\`\`\`markdown
-${promptContent}
-\`\`\`
-
---
-
-*This snapshot represents the exact prompt that was sent to Claude Code to generate the current deliverables for this agent.*
-`;
-
-  await fs.writeFile(filePath, snapshotContent);
-  console.log(chalk.gray(`    📸 Prompt snapshot saved: prompt-snapshots/${fileName}`));
 }
@@ -27,7 +27,6 @@ const VULN_TYPE_CONFIG = Object.freeze({

 // Functional composition utilities - async pipe for promise chain
 const pipe = (...fns) => x => fns.reduce(async (v, f) => f(await v), x);
-const compose = (...fns) => x => fns.reduceRight((v, f) => f(v), x);

 // Pure function to create validation rule
 const createValidationRule = (predicate, errorMessage, retryable = true) => 
@@ -2,41 +2,17 @@ import { fs, path } from 'zx';
 import chalk from 'chalk';
 import crypto from 'crypto';
 import { PentestError } from './error-handling.js';
+import { SessionMutex } from './utils/concurrency.js';
+import { promptSelection } from './cli/prompts.js';

 // Generate a session-based log folder path
+// NEW FORMAT: {hostname}_{sessionId} (no hash, full UUID for consistency with audit system)
 export const generateSessionLogPath = (webUrl, sessionId) => {
-  // Create a hash of the webUrl for uniqueness while keeping it readable
-  const urlHash = crypto.createHash('md5').update(webUrl).digest('hex').substring(0, 8);
  const hostname = new URL(webUrl).hostname.replace(/[^a-zA-Z0-9-]/g, '-');
-  const shortSessionId = sessionId.substring(0, 8);
-
-  const sessionFolderName = `${hostname}_${urlHash}_${shortSessionId}`;
+  const sessionFolderName = `${hostname}_${sessionId}`;
  return path.join(process.cwd(), 'agent-logs', sessionFolderName);
 };

-// Mutex for session file operations to prevent race conditions
-class SessionMutex {
-  constructor() {
-    this.locks = new Map();
-  }
-
-  async lock(sessionId) {
-    if (this.locks.has(sessionId)) {
-      // Wait for existing lock to be released
-      await this.locks.get(sessionId);
-    }
-
-    let resolve;
-    const promise = new Promise(r => resolve = r);
-    this.locks.set(sessionId, promise);
-
-    return () => {
-      this.locks.delete(sessionId);
-      resolve();
-    };
-  }
-}
-
 const sessionMutex = new SessionMutex();

 // Agent definitions according to PRD
@@ -242,6 +218,8 @@ export const createSession = async (webUrl, repoPath, configFile = null, targetR

  const sessionId = generateSessionId();

+  // STANDARD: All sessions use 'id' field (NOT 'sessionId')
+  // This is the canonical session structure used throughout the codebase
  const session = {
    id: sessionId,
    webUrl,
@@ -339,29 +317,10 @@ export const selectSession = async () => {
  });
  
  // Get user selection
-  const { createInterface } = await import('readline');
-  const readline = createInterface({
-    input: process.stdin,
-    output: process.stdout
-  });
-  
-  return new Promise((resolve, reject) => {
-    readline.question(chalk.cyan(`Select session (1-${sessions.length}): `), (answer) => {
-      readline.close();
-      
-      const choice = parseInt(answer);
-      if (isNaN(choice) || choice < 1 || choice > sessions.length) {
-        reject(new PentestError(
-          `Invalid selection. Please enter a number between 1 and ${sessions.length}`,
-          'validation',
-          false,
-          { choice: answer }
-        ));
-      } else {
-        resolve(sessions[choice - 1]);
-      }
-    });
-  });
+  return await promptSelection(
+    chalk.cyan(`Select session (1-${sessions.length}):`),
+    sessions
+  );
 };

 // Validate agent name
@@ -452,7 +411,9 @@ export const getNextAgent = (session) => {
 };

 // Mark agent as completed with checkpoint
-export const markAgentCompleted = async (sessionId, agentName, checkpointCommit, timingData = null, costData = null, validationData = null) => {
+// NOTE: Timing, cost, and validation data now managed by AuditSession (audit-logs/session.json)
+// Shannon store contains ONLY orchestration state (completedAgents, checkpoints)
+export const markAgentCompleted = async (sessionId, agentName, checkpointCommit) => {
  // Use mutex to prevent race conditions during parallel agent execution
  const unlock = await sessionMutex.lock(sessionId);

@@ -473,38 +434,6 @@ export const markAgentCompleted = async (sessionId, agentName, checkpointCommit,
        [agentName]: checkpointCommit
      }
    };
-  
-  // Update timing data if provided
-  if (timingData) {
-    updates.timingBreakdown = {
-      ...session.timingBreakdown,
-      agents: {
-        ...session.timingBreakdown?.agents,
-        [agentName]: timingData
-      }
-    };
-  }
-  
-  // Update cost data if provided
-  if (costData) {
-    const existingCost = session.costBreakdown?.total || 0;
-    updates.costBreakdown = {
-      total: existingCost + costData,
-      agents: {
-        ...session.costBreakdown?.agents,
-        [agentName]: costData
-      }
-    };
-  }
-
-
-  // Update validation data if provided (for vulnerability agents)
-  if (validationData && agentName.includes('-vuln')) {
-    updates.validationResults = {
-      ...session.validationResults,
-      [agentName]: validationData
-    };
-  }

    // Check if all agents are now completed and update session status
    const totalAgents = Object.keys(AGENTS).length;
@@ -583,25 +512,12 @@ export const getSessionStatus = (session) => {
 export const calculateVulnerabilityAnalysisSummary = (session) => {
  const vulnAgents = PHASES['vulnerability-analysis'];
  const completedVulnAgents = session.completedAgents.filter(agent => vulnAgents.includes(agent));
-  const validationResults = session.validationResults || {};
-
-  let totalVulnerabilities = 0;
-  let agentsWithVulns = 0;
-
-  for (const agent of completedVulnAgents) {
-    const validation = validationResults[agent];
-    if (validation?.vulnerabilityCount > 0) {
-      totalVulnerabilities += validation.vulnerabilityCount;
-      agentsWithVulns++;
-    }
-  }

+  // NOTE: Actual vulnerability counts require reading queue files
+  // This summary only shows completion counts
  return Object.freeze({
    totalAnalyses: completedVulnAgents.length,
-    totalVulnerabilities,
-    agentsWithVulnerabilities: agentsWithVulns,
-    successRate: completedVulnAgents.length > 0 ? (agentsWithVulns / completedVulnAgents.length) * 100 : 0,
-    exploitationCandidates: Object.values(validationResults).filter(v => v?.shouldExploit).length
+    completedAgents: completedVulnAgents
  });
 };

@@ -609,19 +525,12 @@ export const calculateVulnerabilityAnalysisSummary = (session) => {
 export const calculateExploitationSummary = (session) => {
  const exploitAgents = PHASES['exploitation'];
  const completedExploitAgents = session.completedAgents.filter(agent => exploitAgents.includes(agent));
-  const validationResults = session.validationResults || {};
-
-  // Count how many exploitation agents were eligible to run
-  const eligibleExploits = exploitAgents.filter(agentName => {
-    const vulnAgentName = agentName.replace('-exploit', '-vuln');
-    return validationResults[vulnAgentName]?.shouldExploit;
-  });

+  // NOTE: Eligibility requires reading queue files
+  // This summary only shows completion counts
  return Object.freeze({
    totalAttempts: completedExploitAgents.length,
-    eligibleExploits: eligibleExploits.length,
-    skippedExploits: eligibleExploits.length - completedExploitAgents.length,
-    successRate: eligibleExploits.length > 0 ? (completedExploitAgents.length / eligibleExploits.length) * 100 : 0
+    completedAgents: completedExploitAgents
  });
 };

@@ -656,33 +565,103 @@ export const rollbackToAgent = async (sessionId, targetAgent) => {
      Object.entries(session.checkpoints).filter(([agent]) => !agentsToRemove.includes(agent))
    )
  };
-  
-  // Clean up timing data for rolled-back agents
-  if (session.timingBreakdown?.agents) {
-    const filteredTimingAgents = Object.fromEntries(
-      Object.entries(session.timingBreakdown.agents).filter(([agent]) => !agentsToRemove.includes(agent))
-    );
-    updates.timingBreakdown = {
-      ...session.timingBreakdown,
-      agents: filteredTimingAgents
-    };
-  }
-  
-  // Clean up cost data for rolled-back agents and recalculate total
-  if (session.costBreakdown?.agents) {
-    const filteredCostAgents = Object.fromEntries(
-      Object.entries(session.costBreakdown.agents).filter(([agent]) => !agentsToRemove.includes(agent))
-    );
-    const recalculatedTotal = Object.values(filteredCostAgents).reduce((sum, cost) => sum + cost, 0);
-    updates.costBreakdown = {
-      total: recalculatedTotal,
-      agents: filteredCostAgents
-    };
-  }
-  
+
+  // NOTE: Timing and cost data now managed in audit-logs/session.json
+  // Rollback is reconciled by reconcileSession(), which removes agents whose audit status is "rolled-back"
+
  return await updateSession(sessionId, updates);
 };

+/**
+ * Reconcile Shannon store with audit logs (self-healing)
+ *
+ * This function ensures the Shannon store (.shannon-store.json) is consistent with
+ * the audit logs (audit-logs/session.json) by syncing agent completion status.
+ *
+ * Three-part reconciliation:
+ * 1. PROMOTIONS: Agents completed in audit → added to Shannon store
+ * 2. DEMOTIONS: Agents rolled-back in audit → removed from Shannon store
+ * 3. FAILURES: Agents failed in audit → marked failed in Shannon store
+ *
+ * Critical for crash recovery, especially crash during rollback operations.
+ *
+ * @param {string} sessionId - Session ID to reconcile
+ * @returns {Promise<Object>} Reconciliation report: { promotions, demotions, failures }, each an array of agent names
+ */
+export const reconcileSession = async (sessionId) => {
+  const { AuditSession } = await import('./audit/index.js');
+
+  // Get Shannon store session
+  const shannonSession = await getSession(sessionId);
+  if (!shannonSession) {
+    throw new PentestError(`Session ${sessionId} not found in Shannon store`, 'validation', false);
+  }
+
+  // Get audit session data
+  const auditSession = new AuditSession(shannonSession);
+  await auditSession.initialize();
+  const auditData = await auditSession.getMetrics();
+
+  const report = {
+    promotions: [],
+    demotions: [],
+    failures: []
+  };
+
+  // PART 1: PROMOTIONS (Additive)
+  // Find agents completed in audit but not in Shannon store
+  const auditCompleted = Object.entries(auditData.metrics.agents)
+    .filter(([_, agentData]) => agentData.status === 'success')
+    .map(([agentName]) => agentName);
+
+  const missing = auditCompleted.filter(agent => !shannonSession.completedAgents.includes(agent));
+
+  for (const agentName of missing) {
+    const agentData = auditData.metrics.agents[agentName];
+    const checkpoint = agentData.checkpoint || null;
+    await markAgentCompleted(sessionId, agentName, checkpoint);
+    report.promotions.push(agentName);
+  }
+
+  // PART 2: DEMOTIONS (Subtractive) - CRITICAL FOR ROLLBACK RECOVERY
+  // Find agents rolled-back in audit but still in Shannon store
+  const auditRolledBack = Object.entries(auditData.metrics.agents)
+    .filter(([_, agentData]) => agentData.status === 'rolled-back')
+    .map(([agentName]) => agentName);
+
+  const toRemove = shannonSession.completedAgents.filter(agent => auditRolledBack.includes(agent));
+
+  if (toRemove.length > 0) {
+    // Reload session to get fresh state
+    const freshSession = await getSession(sessionId);
+
+    const updates = {
+      completedAgents: freshSession.completedAgents.filter(agent => !toRemove.includes(agent)),
+      checkpoints: Object.fromEntries(
+        Object.entries(freshSession.checkpoints).filter(([agent]) => !toRemove.includes(agent))
+      )
+    };
+
+    await updateSession(sessionId, updates);
+    report.demotions.push(...toRemove);
+  }
+
+  // PART 3: FAILURES
+  // Find agents failed in audit but not marked failed in Shannon store
+  const auditFailed = Object.entries(auditData.metrics.agents)
+    .filter(([_, agentData]) => agentData.status === 'failed')
+    .map(([agentName]) => agentName);
+
+  const failedToAdd = auditFailed.filter(agent => !(shannonSession.failedAgents || []).includes(agent));
+
+  for (const agentName of failedToAdd) {
+    await markAgentFailed(sessionId, agentName);
+    report.failures.push(agentName);
+  }
+
+  return report;
+};
+
 // Delete a specific session by ID
 export const deleteSession = async (sessionId) => {
  const store = await loadSessions();
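
Review note on `reconcileSession` above: the three-part comparison can be expressed as a side-effect-free planning step, which makes the promotion, demotion, and failure rules easy to test in isolation. The sketch below is an assumption-laden illustration, not the commit's implementation. `planReconciliation` is a hypothetical name, and the input shapes mirror only the `completedAgents`, `failedAgents`, and per-agent `status` fields visible in the diff.

```javascript
// Hypothetical sketch (not part of this diff): decide what reconciliation
// would change, without touching the store or the audit logs.
function planReconciliation(store, auditAgents) {
  const completed = new Set(store.completedAgents ?? []);
  const failed = new Set(store.failedAgents ?? []);
  const plan = { promotions: [], demotions: [], failures: [] };

  for (const [agentName, agentData] of Object.entries(auditAgents)) {
    if (agentData.status === 'success' && !completed.has(agentName)) {
      // Completed in audit, missing from the store: add it.
      plan.promotions.push(agentName);
    } else if (agentData.status === 'rolled-back' && completed.has(agentName)) {
      // Rolled back in audit, still in the store: remove it.
      plan.demotions.push(agentName);
    } else if (agentData.status === 'failed' && !failed.has(agentName)) {
      // Failed in audit, not yet marked failed in the store.
      plan.failures.push(agentName);
    }
  }
  return plan;
}
```

The caller would then apply the plan with the existing `markAgentCompleted`, `updateSession`, and `markAgentFailed` functions.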
@@ -1,136 +0,0 @@
-import { fs, path, os } from 'zx';
-import chalk from 'chalk';
-import { PentestError, logError } from '../error-handling.js';
-
-// Pure function: Save deliverables permanently to user directory
-export async function savePermanentDeliverables(sourceDir, webUrl, repoPath, session, timingBreakdown, costBreakdown) {
-  try {
-    // Simple universal approach - try Documents, fallback to home
-    const homeDir = os.homedir();
-    const documentsDir = path.join(homeDir, 'Documents');
-
-    // Use Documents if it exists, otherwise use home directory
-    const baseDir = await fs.pathExists(documentsDir) ? documentsDir : homeDir;
-    const permanentBaseDir = path.join(baseDir, 'pentest-deliverables');
-
-    // Generate directory name from repo path and web URL
-    const repoName = path.basename(repoPath);
-    const webDomain = new URL(webUrl).hostname.replace(/[^a-zA-Z0-9-]/g, '-');
-    const timestamp = new Date().toISOString().replace(/[-:]/g, '').replace(/T/, '-').split('.')[0];
-    const dirName = `${webDomain}_${repoName}_${timestamp}`;
-    const permanentDir = path.join(permanentBaseDir, dirName);
-
-    // Ensure base directory exists
-    await fs.ensureDir(permanentBaseDir);
-
-    // Create the specific pentest directory
-    await fs.ensureDir(permanentDir);
-
-    // Copy deliverables folder if it exists
-    const deliverablesSource = path.join(sourceDir, 'deliverables');
-    const deliverablesDest = path.join(permanentDir, 'deliverables');
-
-    if (await fs.pathExists(deliverablesSource)) {
-      await fs.copy(deliverablesSource, deliverablesDest, { overwrite: true });
-    }
-
-    // Save metadata with session information
-    const metadata = {
-      session: {
-        id: session.id,
-        webUrl,
-        repoPath,
-        configFile: session.configFile,
-        status: session.status,
-        completedAgents: session.completedAgents,
-        createdAt: session.createdAt,
-        completedAt: new Date().toISOString()
-      },
-      timing: timingBreakdown,
-      cost: costBreakdown,
-      sourceDirectory: sourceDir,
-      savedAt: new Date().toISOString()
-    };
-
-    await fs.writeJSON(path.join(permanentDir, 'metadata.json'), metadata, { spaces: 2 });
-
-    // Copy prompts directory for reproducibility
-    const promptsSource = path.join(import.meta.dirname, '..', '..', 'prompts');
-    const promptsDest = path.join(permanentDir, 'prompts');
-
-    if (await fs.pathExists(promptsSource)) {
-      await fs.copy(promptsSource, promptsDest, { overwrite: true });
-    }
-
-    console.log(chalk.green(`✅ Deliverables saved to permanent location: ${permanentDir}`));
-    return permanentDir;
-  } catch (error) {
-    // Non-fatal error - log but don't throw
-    console.log(chalk.yellow(`⚠️ Failed to save permanent deliverables: ${error.message}`));
-    return null;
-  }
-}
-
-// Pure function: Save run metadata for debugging and reproducibility
-export async function saveRunMetadata(sourceDir, webUrl, repoPath) {
-  console.log(chalk.blue('💾 Saving run metadata...'));
-
-  try {
-    // Read package.json to get version info with error handling
-    const packagePath = path.join(import.meta.dirname, '..', '..', 'package.json');
-    let packageJson;
-    try {
-      packageJson = await fs.readJSON(packagePath);
-    } catch (packageError) {
-      throw new PentestError(
-        `Cannot read package.json: ${packageError.message}`,
-        'filesystem',
-        false,
-        { packagePath, originalError: packageError.message }
-      );
-    }
-
-    const metadata = {
-      timestamp: new Date().toISOString(),
-      targets: { webUrl, repoPath },
-      environment: {
-        nodeVersion: process.version,
-        platform: process.platform,
-        arch: process.arch,
-        cwd: process.cwd()
-      },
-      dependencies: {
-        claudeCodeVersion: packageJson.dependencies?.['@anthropic-ai/claude-code'] || 'unknown',
-        zxVersion: packageJson.dependencies?.['zx'] || 'unknown',
-        chalkVersion: packageJson.dependencies?.['chalk'] || 'unknown'
-      },
-      execution: {
-        args: process.argv,
-        env: {
-          PLAYWRIGHT_HEADLESS: process.env.PLAYWRIGHT_HEADLESS || 'true',
-          NODE_ENV: process.env.NODE_ENV
-        }
-      }
-    };
-
-    const metadataPath = path.join(sourceDir, 'run-metadata.json');
-    await fs.writeJSON(metadataPath, metadata, { spaces: 2 });
-
-    console.log(chalk.green(`✅ Run metadata saved to: ${metadataPath}`));
-    return metadata;
-  } catch (error) {
-    if (error instanceof PentestError) {
-      await logError(error, 'Saving run metadata', sourceDir);
-      throw error; // Re-throw PentestError to be handled by caller
-    }
-
-    const metadataError = new PentestError(
-      `Run metadata saving failed: ${error.message}`,
-      'filesystem',
-      false,
-      { sourceDir, originalError: error.message }
-    );
-    await logError(metadataError, 'Saving run metadata', sourceDir);
-    throw metadataError;
-  }
-}
@@ -1,96 +1,14 @@
 import { $, fs, path } from 'zx';
 import chalk from 'chalk';
-import { PentestError, logError } from '../error-handling.js';
-
-// Pure function: Setup MCP with multiple isolated Playwright instances
-export async function setupMCP(sourceDir) {
-  console.log(chalk.blue('🎭 Setting up 5 isolated Playwright MCP instances...'));
-
-  // Set headless mode for all instances
-  process.env.PLAYWRIGHT_HEADLESS = 'true';
-
-  try {
-    // Clean slate - remove any existing instances
-    const instancesToRemove = ['playwright', ...Array.from({length: 5}, (_, i) => `playwright-agent${i + 1}`)];
-
-    for (const instance of instancesToRemove) {
-      try {
-        await $`claude mcp remove ${instance} --scope user 2>/dev/null`;
-      } catch {
-        // Silent ignore - instance might not exist
-      }
-    }
-
-    // Ensure screenshot directories exist
-    await fs.ensureDir(path.join(sourceDir, 'screenshots'));
-
-    // Create 5 isolated instances sequentially to avoid config conflicts
-    for (let i = 1; i <= 5; i++) {
-      const instanceName = `playwright-agent${i}`;
-      const screenshotDir = path.join(sourceDir, 'screenshots', instanceName);
-      const userDataDir = `/tmp/${instanceName}`;
-
-      // Ensure both directories exist
-      await fs.ensureDir(screenshotDir);
-      await fs.ensureDir(userDataDir);
-
-      try {
-        await $`claude mcp add ${instanceName} --scope user -- npx @playwright/mcp@latest --isolated --user-data-dir ${userDataDir} --output-dir ${screenshotDir}`;
-        console.log(chalk.green(`  ✅ ${instanceName} configured`));
-      } catch (error) {
-        if (error.message?.includes('already exists')) {
-          console.log(chalk.gray(`  ⏭️ ${instanceName} already exists`));
-        } else {
-          console.log(chalk.yellow(`  ⚠️ ${instanceName} failed: ${error.message}, continuing...`));
-        }
-      }
-    }
-    console.log(chalk.green('✅ All 5 Playwright MCP instances ready for parallel execution'));
-
-  } catch (error) {
-    // All MCP setup failures are fatal
-    const mcpError = new PentestError(
-      `Critical MCP setup failure: ${error.message}. Browser automation required for pentesting.`,
-      'tool',
-      false,
-      { sourceDir, originalError: error.message }
-    );
-    await logError(mcpError, 'MCP setup failure', sourceDir);
-    throw mcpError;
-  }
-}
-
-// Pure function: Cleanup MCP instances
-export async function cleanupMCP() {
-  console.log(chalk.blue('🧹 Cleaning up Playwright MCP instances...'));
-
-  try {
-    // Remove all instances (including legacy 'playwright' if it exists)
-    const instancesToRemove = ['playwright', ...Array.from({length: 5}, (_, i) => `playwright-agent${i + 1}`)];
-
-    for (const instance of instancesToRemove) {
-      try {
-        await $`claude mcp remove ${instance} --scope user 2>/dev/null`;
-        console.log(chalk.gray(`  🗑️ Removed ${instance}`));
-      } catch {
-        // Silent ignore - instance might not exist
-      }
-    }
-    console.log(chalk.green('✅ Playwright MCP cleanup complete'));
-
-  } catch (error) {
-    // Non-fatal - log warning but don't throw
-    console.log(chalk.yellow(`⚠️ MCP cleanup warning: ${error.message}`));
-  }
-}
+import { PentestError } from '../error-handling.js';

 // Pure function: Setup local repository for testing
 export async function setupLocalRepo(repoPath) {
  try {
    const sourceDir = path.resolve(repoPath);

-    // Setup MCP in the local repository - critical for browser automation
-    await setupMCP(sourceDir);
+    // MCP servers are now configured via mcpServers option in claude-executor.js
+    // No need for pre-setup with claude CLI

    // Initialize git repository if not already initialized and create checkpoint
    try {
@@ -114,22 +32,8 @@ export async function setupLocalRepo(repoPath) {
      // Non-fatal - continue without Git setup
    }

-    // Copy TOTP generation script to local repository for agent accessibility
-    try {
-      const totpScriptSource = path.join(import.meta.dirname, '..', '..', 'login_resources', 'generate-totp-standalone.mjs');
-      const totpScriptDest = path.join(sourceDir, 'generate-totp.mjs');
-
-      if (await fs.pathExists(totpScriptSource)) {
-        await fs.copy(totpScriptSource, totpScriptDest);
-        await fs.chmod(totpScriptDest, '755'); // Make executable
-        console.log(chalk.green('✅ TOTP generation script (standalone) copied to target repository'));
-      } else {
-        console.log(chalk.yellow('⚠️ TOTP script not found, authentication may fail if TOTP is required'));
-      }
-    } catch (totpError) {
-      console.log(chalk.yellow(`⚠️ Failed to copy TOTP script: ${totpError.message}`));
-      // Non-fatal - continue without TOTP script
-    }
+    // MCP tools (save_deliverable, generate_totp) are now provided by the shannon-helper MCP server
+    // No need to copy helper scripts (e.g. the TOTP generator) into the target repository

    return sourceDir;
  } catch (error) {
@@ -48,17 +48,6 @@ export const handleMissingTools = (toolAvailability) => {
    });
    console.log('');
  }
-  
+
  return missing;
 };
-
-// Check if a specific tool is available
-const isToolAvailable = async (toolName) => {
-  try {
-    await $`command -v ${toolName}`;
-    return true;
-  } catch {
-    return false;
-  }
-};
-
@@ -0,0 +1,54 @@
+/**
+ * Concurrency Control Utilities
+ *
+ * Provides mutex implementation for preventing race conditions during
+ * concurrent session operations.
+ */
+
+/**
+ * SessionMutex - Promise-based mutex for session file operations
+ *
+ * Prevents race conditions when multiple agents or operations attempt to
+ * modify the same session data simultaneously. This is particularly important
+ * during parallel execution of vulnerability analysis and exploitation phases.
+ *
+ * Usage:
+ * ```js
+ * const mutex = new SessionMutex();
+ * const unlock = await mutex.lock(sessionId);
+ * try {
+ *   // Critical section - modify session data
+ * } finally {
+ *   unlock(); // Always release the lock
+ * }
+ * ```
+ */
+export class SessionMutex {
+  constructor() {
+    // Map of sessionId -> Promise (represents active lock)
+    this.locks = new Map();
+  }
+
+  /**
+   * Acquire lock for a session
+   * @param {string} sessionId - Session ID to lock
+   * @returns {Promise<Function>} Unlock function to release the lock
+   */
+  async lock(sessionId) {
+    while (this.locks.has(sessionId)) {
+      // Wait for the current holder, then re-check: another waiter may have taken the lock first
+      await this.locks.get(sessionId);
+    }
+
+    // Create new lock promise
+    let resolve;
+    const promise = new Promise(r => resolve = r);
+    this.locks.set(sessionId, promise);
+
+    // Return unlock function
+    return () => {
+      this.locks.delete(sessionId);
+      resolve();
+    };
+  }
+}
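Review note: the usage pattern in the JSDoc above can be exercised with a small standalone sketch. Everything here is an assumption for illustration and not part of this commit: `SketchSessionMutex` is a hypothetical stand-in that re-checks the lock map in a loop before acquiring, and `runDemo`, the `'session-1'` ID, and the 5 ms delay are made-up names and values. It shows three concurrent callers contending for one session and records how many critical sections are ever active at once.

```javascript
// Hypothetical stand-in for a per-session mutex, declared here so the sketch runs on its own.
class SketchSessionMutex {
  constructor() {
    // Map of sessionId -> Promise that settles when the current holder releases
    this.locks = new Map();
  }

  async lock(sessionId) {
    // Re-check after every wake-up: several waiters may be resumed by the
    // same release, and only the first one may proceed.
    while (this.locks.has(sessionId)) {
      await this.locks.get(sessionId);
    }

    let release;
    const held = new Promise((resolve) => { release = resolve; });
    this.locks.set(sessionId, held);

    return () => {
      // Only clear the entry if it still belongs to this holder
      if (this.locks.get(sessionId) === held) {
        this.locks.delete(sessionId);
      }
      release();
    };
  }
}

// Hypothetical workload: three concurrent updates to the same session.
async function runDemo() {
  const mutex = new SketchSessionMutex();
  const events = [];
  let active = 0;
  let maxActive = 0;

  const update = async (label) => {
    const unlock = await mutex.lock('session-1');
    try {
      active += 1;
      maxActive = Math.max(maxActive, active);
      events.push(`${label}:start`);
      // Simulated asynchronous work inside the critical section
      await new Promise((resolve) => setTimeout(resolve, 5));
      events.push(`${label}:end`);
    } finally {
      active -= 1;
      unlock(); // Always release the lock
    }
  };

  await Promise.all([update('a'), update('b'), update('c')]);
  return { events, maxActive };
}
```

Calling `runDemo()` resolves with the recorded events and the peak number of simultaneously active critical sections; a peak of 1 means each `start` is followed by its own `end` before the next caller enters. The loop matters once there are two or more waiters: with a single check, every waiter resumed by one release would go on to set its own lock entry.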
@@ -72,7 +72,7 @@ export const executeGitCommandWithRetry = async (commandArgs, sourceDir, descrip
 };

 // Pure functions for Git workspace management
-export const cleanWorkspace = async (sourceDir, reason = 'clean start') => {
+const cleanWorkspace = async (sourceDir, reason = 'clean start') => {
  console.log(chalk.blue(`    🧹 Cleaning workspace for ${reason}`));
  try {
    // Check for uncommitted changes
@@ -1,126 +0,0 @@
-import { fs } from 'zx';
-import { path } from 'zx';
-
-/**
- * Strips ANSI escape codes from a string
- * @param {string} str - String with ANSI codes
- * @returns {string} Clean string without ANSI codes
- */
-function stripAnsi(str) {
-  if (typeof str !== 'string') {
-    return str;
-  }
-
-  // Remove ANSI escape sequences
-  // This regex matches all common ANSI codes including:
-  // - Colors (e.g., \x1b[32m)
-  // - Cursor movement (e.g., \x1b[1;1H)
-  // - Screen clearing (e.g., \x1b[0J)
-  // - 256-color codes (e.g., \x1b[38;2;244;197;66m)
-  return str.replace(
-    // eslint-disable-next-line no-control-regex
-    /\x1b\[[0-9;]*[a-zA-Z]|\x1b\][0-9];.*?\x07|\x1b\[[\d;]*m/g,
-    ''
-  );
-}
-
-/**
- * Sets up logging to capture all stdout and stderr to a file
- * @param {string} logFilePath - Path to the log file
- * @returns {Promise<Function>} Cleanup function to restore original streams
- */
-export async function setupLogging(logFilePath) {
-  // Resolve to absolute path
-  const absoluteLogPath = path.isAbsolute(logFilePath)
-    ? logFilePath
-    : path.join(process.cwd(), logFilePath);
-
-  // Ensure the directory exists
-  await fs.ensureDir(path.dirname(absoluteLogPath));
-
-  // Create write stream for the log file
-  const logStream = fs.createWriteStream(absoluteLogPath, { flags: 'a' });
-
-  // Buffer for lines that might be overwritten (carriage return without newline)
-  let stdoutBuffer = '';
-  let stderrBuffer = '';
-
-  // Store original stdout/stderr write functions
-  const originalStdoutWrite = process.stdout.write.bind(process.stdout);
-  const originalStderrWrite = process.stderr.write.bind(process.stderr);
-
-  // Override stdout
-  process.stdout.write = function(chunk, encoding, callback) {
-    // Write colorized output to terminal
-    originalStdoutWrite(chunk, encoding, callback);
-
-    // Write plain text (without ANSI codes) to log file
-    const cleanChunk = stripAnsi(chunk.toString());
-
-    // Handle carriage returns - only log when we get a newline
-    if (cleanChunk.includes('\r') && !cleanChunk.includes('\n')) {
-      // Buffer this line - it will be overwritten in terminal
-      stdoutBuffer = cleanChunk.replace(/\r/g, '');
-    } else if (cleanChunk.includes('\n')) {
-      // Flush buffer if exists, then write the new line
-      if (stdoutBuffer) {
-        stdoutBuffer = ''; // Clear buffer without writing (it was overwritten)
-      }
-      logStream.write(cleanChunk);
-    } else {
-      // Normal write
-      logStream.write(cleanChunk);
-    }
-
-    return true;
-  };
-
-  // Override stderr
-  process.stderr.write = function(chunk, encoding, callback) {
-    // Write colorized output to terminal
-    originalStderrWrite(chunk, encoding, callback);
-
-    // Write plain text (without ANSI codes) to log file
-    const cleanChunk = stripAnsi(chunk.toString());
-
-    // Handle carriage returns - only log when we get a newline
-    if (cleanChunk.includes('\r') && !cleanChunk.includes('\n')) {
-      // Buffer this line - it will be overwritten in terminal
-      stderrBuffer = cleanChunk.replace(/\r/g, '');
-    } else if (cleanChunk.includes('\n')) {
-      // Flush buffer if exists, then write the new line
-      if (stderrBuffer) {
-        stderrBuffer = ''; // Clear buffer without writing (it was overwritten)
-      }
-      logStream.write(cleanChunk);
-    } else {
-      // Normal write
-      logStream.write(cleanChunk);
-    }
-
-    return true;
-  };
-
-  // Return cleanup function
-  return async function cleanup() {
-    // Restore original streams
-    process.stdout.write = originalStdoutWrite;
-    process.stderr.write = originalStderrWrite;
-
-    // Flush any remaining buffers
-    if (stdoutBuffer) {
-      logStream.write(stdoutBuffer + '\n');
-    }
-    if (stderrBuffer) {
-      logStream.write(stderrBuffer + '\n');
-    }
-
-    // Close the log stream
-    return new Promise((resolve, reject) => {
-      logStream.end((err) => {
-        if (err) reject(err);
-        else resolve();
-      });
-    });
-  };
-}
@@ -1,13 +1,7 @@
 import chalk from 'chalk';
+import { formatDuration } from '../audit/utils.js';

 // Timing utilities
-export const formatDuration = (ms) => {
-  if (ms < 1000) return `${ms}ms`;
-  if (ms < 60000) return `${(ms / 1000).toFixed(1)}s`;
-  const minutes = Math.floor(ms / 60000);
-  const seconds = Math.floor((ms % 60000) / 1000);
-  return `${minutes}m ${seconds}s`;
-};

 export class Timer {
  constructor(name) {
@@ -0,0 +1 @@
+EXTERNAL ATTACKER SCOPE: Only report vulnerabilities exploitable via {{WEB_URL}} from the internet. Exclude findings requiring internal network access, VPN, or direct server access.