Merge pull request #2 from KeygraphHQ/fixing-bugs

Fixing bugs
ajmallesh
2025-10-23 18:18:21 -07:00
committed by GitHub
75 changed files with 3275 additions and 1959 deletions
+2 -1
@@ -1,3 +1,4 @@
node_modules/
.shannon-store.json
agent-logs/
agent-logs/
audit-logs/
+103 -44
@@ -36,9 +36,7 @@ npm start <WEB_URL> <REPO_PATH> --config <CONFIG_FILE>
```
### Generate TOTP for Authentication
```bash
./login_resources/generate-totp.mjs <TOTP_SECRET>
```
TOTP generation is now handled automatically via the `generate_totp` MCP tool during authentication flows.
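For reference, the computation behind a TOTP code can be sketched in a few lines. This is a minimal illustration of RFC 6238 over HMAC-SHA1, not the project's implementation: the names `decodeBase32` and `totpAt` are hypothetical, and the time is passed in explicitly so the result can be checked against a published RFC 4226 test vector.

```javascript
import { createHmac } from 'node:crypto';

// Decode an RFC 4648 base32 string (padding and whitespace ignored) into bytes.
function decodeBase32(encoded) {
  const alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ234567';
  const bytes = [];
  let bits = 0;
  let value = 0;
  for (const char of encoded.toUpperCase().replace(/[=\s]/g, '')) {
    const index = alphabet.indexOf(char);
    if (index === -1) throw new Error(`Invalid base32 character: ${char}`);
    value = ((value << 5) | index) & 0xfff; // keep only the bits still needed
    bits += 5;
    if (bits >= 8) {
      bits -= 8;
      bytes.push((value >>> bits) & 0xff);
    }
  }
  return Buffer.from(bytes);
}

// Compute the TOTP code for a given Unix time in seconds.
function totpAt(secret, unixSeconds, timeStep = 30, digits = 6) {
  const counter = Buffer.alloc(8);
  counter.writeBigUInt64BE(BigInt(Math.floor(unixSeconds / timeStep)));
  const hash = createHmac('sha1', decodeBase32(secret)).update(counter).digest();
  const offset = hash[hash.length - 1] & 0x0f; // dynamic truncation offset
  const code = hash.readUInt32BE(offset) & 0x7fffffff; // 31-bit value
  return (code % 10 ** digits).toString().padStart(digits, '0');
}

// The secret is base32 for the ASCII key "12345678901234567890" used in the RFCs.
// At t = 59 s the counter is 1, which matches the RFC 4226 vector for count 1.
console.log(totpAt('GEZDGNBVGY3TQOJQGEZDGNBVGY3TQOJQ', 59)); // 287082
```

Passing the clock in as an argument, rather than calling `Date.now()` inside, is what makes the function checkable against fixed vectors.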
### Development Commands
```bash
@@ -154,8 +152,8 @@ The `prompts/` directory contains specialized prompt templates for each testing
- `exploit-*.txt` - Exploitation attempt prompts
- `report-executive.txt` - Executive report generation prompts
### Claude Code SDK Integration
The agent uses the `@anthropic-ai/claude-code` SDK with maximum autonomy configuration:
### Claude Agent SDK Integration
The agent uses the `@anthropic-ai/claude-agent-sdk` with maximum autonomy configuration:
- `maxTurns: 10_000` - Allows extensive autonomous analysis
- `permissionMode: 'bypassPermissions'` - Full system access for thorough testing
- Playwright MCP integration for web browser automation
@@ -163,8 +161,8 @@ The agent uses the `@anthropic-ai/claude-code` SDK with maximum autonomy configu
- Configuration context injection for authenticated testing
### Authentication & Login Resources
- `login_resources/generate-totp.mjs` - TOTP token generation utility
- `login_resources/login_instructions.txt` - Login flow documentation
- `prompts/shared/login-instructions.txt` - Login flow template for all agents
- TOTP token generation via MCP `generate_totp` tool
- Support for multi-factor authentication workflows
- Configurable authentication mechanisms (form, SSO, API, basic)
@@ -188,17 +186,46 @@ The agent implements a sophisticated checkpoint system using git:
- Every agent creates a git checkpoint before execution
- Rollback to any previous agent state using `--rollback-to` or `--rerun`
- Failed agents don't affect completed work
- Timing and cost data cleaned up during rollbacks
- Rolled-back agents marked in audit system with status: "rolled-back"
- Reconciliation automatically syncs Shannon store with audit logs after rollback
- Fail-fast safety prevents accidental re-execution of completed agents
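The reconciliation step described above can be pictured as a pure function that rebuilds the store's completed list from the audit data. The sketch below is only an illustration under assumed shapes: `session.agents` mapping agent names to `{ status }`, a `'success'` status value, and the function name `reconcileStore` are all hypothetical; only the `"rolled-back"` status and the `completedAgents` field come from the text.

```javascript
// Assumed shapes (hypothetical): session.agents maps agent name -> { status },
// store.completedAgents is an array of agent names.
function reconcileStore(store, session) {
  const isComplete = (name) => session.agents[name]?.status === 'success';

  // Drop agents the audit log no longer considers complete (e.g. rolled back).
  const completed = new Set(store.completedAgents.filter(isComplete));

  // Add agents the audit log marks complete but the store missed (e.g. after a crash).
  for (const name of Object.keys(session.agents)) {
    if (isComplete(name)) completed.add(name);
  }

  return { ...store, completedAgents: [...completed] };
}

const store = { completedAgents: ['recon', 'xss-vuln'] };
const session = {
  agents: {
    recon: { status: 'success' },
    'xss-vuln': { status: 'rolled-back' },
    'pre-recon': { status: 'success' },
  },
};
console.log(reconcileStore(store, session).completedAgents); // [ 'recon', 'pre-recon' ]
```

Because the function never mutates its inputs, it can be run before every command without side effects when nothing has drifted.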
### Timing & Performance Monitoring
The agent includes comprehensive timing instrumentation that tracks:
- Total execution time
- Phase-level timing breakdown
- Individual command execution times
- Claude Code agent processing times
- Cost tracking for AI agent usage
### Unified Audit & Metrics System
The agent implements a crash-safe, self-healing audit system (v3.0) with the following guarantees:
**Architecture:**
- **audit-logs/**: Centralized metrics and forensic logs (source of truth)
- `{hostname}_{sessionId}/session.json` - Comprehensive metrics with attempt-level detail
- `{hostname}_{sessionId}/prompts/` - Exact prompts used for reproducibility
- `{hostname}_{sessionId}/agents/` - Turn-by-turn execution logs
- **.shannon-store.json**: Minimal orchestration state (completedAgents, checkpoints)
**Crash Safety:**
- Append-only logging with immediate flush (survives kill -9)
- Atomic writes for session.json (no partial writes)
- Event-based logging (tool_start, tool_end, llm_response) closes data loss windows
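A minimal sketch of the two write patterns named above, assuming a POSIX-style filesystem. The helper names `appendEvent` and `writeJsonAtomic`, the event fields, and the temp-file naming are hypothetical and are not taken from `src/audit/`.

```javascript
import { appendFileSync, writeFileSync, renameSync, readFileSync, mkdtempSync } from 'node:fs';
import { join } from 'node:path';
import { tmpdir } from 'node:os';

// Append one JSON event per line. The synchronous append hands the data to the
// OS before returning, so it is not lost if the process is killed afterwards.
function appendEvent(logPath, event) {
  appendFileSync(logPath, JSON.stringify({ ts: new Date().toISOString(), ...event }) + '\n');
}

// Write to a temp file in the same directory, then rename it over the target.
// A rename within one filesystem replaces the file in a single step, so readers
// see either the old content or the new content, never a partial write.
function writeJsonAtomic(targetPath, data) {
  const tempPath = `${targetPath}.${process.pid}.tmp`;
  writeFileSync(tempPath, JSON.stringify(data, null, 2));
  renameSync(tempPath, targetPath);
}

// Usage with a throwaway directory.
const dir = mkdtempSync(join(tmpdir(), 'audit-sketch-'));
const logPath = join(dir, 'agent.log');
const sessionPath = join(dir, 'session.json');

appendEvent(logPath, { type: 'tool_start', tool: 'nmap' });
appendEvent(logPath, { type: 'tool_end', tool: 'nmap' });
writeJsonAtomic(sessionPath, { status: 'success', attempts: 1 });

console.log(readFileSync(logPath, 'utf8').trim().split('\n').length); // 2
```

Surviving a power loss, as opposed to a killed process, would additionally require an explicit `fsync`; the sketch does not attempt that.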
**Self-Healing:**
- Automatic reconciliation before every CLI command
- Recovers from crashes during rollback
- Audit logs are source of truth; Shannon store follows
**Forensic Completeness:**
- All retry attempts logged with errors, costs, durations
- Rolled-back agents preserved with status: "rolled-back"
- Partial cost capture for failed attempts
- Complete event trail for debugging
**Concurrency Safety:**
- SessionMutex prevents race conditions during parallel agent execution
- Safe parallel execution of vulnerability and exploitation phases
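To illustrate why a mutex matters here: parallel agents that read, await, and then write the same session metrics can overwrite each other's updates. The class below is a hypothetical stand-in for `SessionMutex` (its real interface is not shown in this change); it serializes async critical sections per key by chaining promises.

```javascript
// Hypothetical stand-in for SessionMutex: one promise chain per session id.
class KeyedMutex {
  #tails = new Map(); // key -> promise that settles when the last queued task finishes

  runExclusive(key, task) {
    const previous = this.#tails.get(key) ?? Promise.resolve();
    const run = previous.then(() => task());
    // Store a tail that never rejects so one failed task does not block later ones.
    this.#tails.set(key, run.catch(() => {}));
    return run;
  }
}

const mutex = new KeyedMutex();
const metrics = { totalCostCents: 0 };

// Read-modify-write with an await in the middle: unsafe without the mutex.
async function addCost(cents) {
  const current = metrics.totalCostCents;
  await new Promise((resolve) => setTimeout(resolve, 5)); // simulate I/O
  metrics.totalCostCents = current + cents;
}

const done = Promise.all([
  mutex.runExclusive('session-1', () => addCost(1)),
  mutex.runExclusive('session-1', () => addCost(2)),
]);
done.then(() => console.log(metrics.totalCostCents)); // 3
```

Without `runExclusive`, both calls would read `0` before either wrote, and the final total would be `2` rather than `3`.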
**Metrics & Reporting:**
- Export metrics to CSV with `./scripts/export-metrics.js`
- Phase-level and agent-level timing/cost aggregations
- Validation results integrated with metrics
For detailed design, see `docs/unified-audit-system-design.md`.
## Development Notes
@@ -206,7 +233,7 @@ The agent includes comprehensive timing instrumentation that tracks:
- **Configuration-Driven Architecture**: YAML configs with JSON Schema validation
- **Modular Error Handling**: Categorized error types with retry logic
- **Pure Functions**: Most functionality is implemented as pure functions for testability
- **SDK-First Approach**: Heavy reliance on Claude Code SDK for autonomous AI operations
- **SDK-First Approach**: Heavy reliance on Claude Agent SDK for autonomous AI operations
- **Progressive Analysis**: Each phase builds on previous phase results
- **Local Repository Setup**: Target applications are accessed directly from user-provided local directories
@@ -232,34 +259,58 @@ The tool should only be used on systems you own or have explicit permission to t
## File Structure
```
shannon.mjs # Main orchestration script
package.json # Node.js dependencies
src/ # Core modules
├── config-parser.js # Configuration handling
├── error-handling.js # Error management
├── tool-checker.js # Tool validation
├── session-manager.js # Session state management
├── checkpoint-manager.js # Git-based checkpointing
├── queue-validation.js # Deliverable validation
shannon.mjs # Main orchestration script
package.json # Node.js dependencies
.shannon-store.json # Orchestration state (minimal)
src/ # Core modules
├── audit/ # Unified audit system (v3.0)
│ ├── index.js # Public API
│ ├── audit-session.js # Main facade (logger + metrics + mutex)
│ ├── logger.js # Append-only crash-safe logging
│ ├── metrics-tracker.js # Timing, cost, attempt tracking
│ └── utils.js # Path generation, atomic writes
├── config-parser.js # Configuration handling
├── error-handling.js # Error management
├── tool-checker.js # Tool validation
├── session-manager.js # Session state + reconciliation
├── checkpoint-manager.js # Git-based checkpointing + rollback
├── queue-validation.js # Deliverable validation
├── ai/
│ └── claude-executor.js # Claude Agent SDK integration
└── utils/
configs/ # Configuration files
├── config-schema.json # JSON Schema validation
├── example-config.yaml # Template configuration
├── juice-shop-config.yaml # Juice Shop example
├── keygraph-config.yaml # Keygraph configuration
├── chatwoot-config.yaml # Chatwoot configuration
├── metabase-config.yaml # Metabase configuration
└── cal-com-config.yaml # Cal.com configuration
prompts/ # AI prompt templates
├── pre-recon-code.txt # Code analysis
├── recon.txt # Reconnaissance
├── vuln-*.txt # Vulnerability assessment
├── exploit-*.txt # Exploitation
└── report-executive.txt # Executive reporting
login_resources/ # Authentication utilities
├── generate-totp.mjs # TOTP generation
└── login_instructions.txt # Login documentation
deliverables/ # Output directory
audit-logs/ # Centralized audit data (v3.0)
└── {hostname}_{sessionId}/
├── session.json # Comprehensive metrics
├── prompts/ # Prompt snapshots
│ └── {agent}.md
└── agents/ # Agent execution logs
└── {timestamp}_{agent}_attempt-{N}.log
configs/ # Configuration files
├── config-schema.json # JSON Schema validation
├── example-config.yaml # Template configuration
├── juice-shop-config.yaml # Juice Shop example
├── keygraph-config.yaml # Keygraph configuration
├── chatwoot-config.yaml # Chatwoot configuration
├── metabase-config.yaml # Metabase configuration
└── cal-com-config.yaml # Cal.com configuration
prompts/ # AI prompt templates
├── shared/ # Shared content for all prompts
│ ├── _target.txt # Target URL template
│ ├── _rules.txt # Rules template
│ ├── _vuln-scope.txt # Vulnerability scope template
│ ├── _exploit-scope.txt # Exploitation scope template
│ └── login-instructions.txt # Login flow template
├── pre-recon-code.txt # Code analysis
├── recon.txt # Reconnaissance
├── vuln-*.txt # Vulnerability assessment
├── exploit-*.txt # Exploitation
└── report-executive.txt # Executive reporting
scripts/ # Utility scripts
└── export-metrics.js # Export metrics to CSV
deliverables/ # Output directory (in target repo)
docs/ # Documentation
├── unified-audit-system-design.md
└── migration-guide.md
```
## Troubleshooting
@@ -275,4 +326,12 @@ deliverables/ # Output directory
Missing tools can be skipped using `--pipeline-testing` mode during development:
- `nmap` - Network scanning
- `subfinder` - Subdomain discovery
- `whatweb` - Web technology detection
- `whatweb` - Web technology detection
### Diagnostic & Utility Scripts
```bash
# Export metrics to CSV
./scripts/export-metrics.js --session-id <id> --output metrics.csv
```
Note: For recovery from corrupted state, simply delete `.shannon-store.json` or edit JSON files directly.
+21 -2
@@ -68,7 +68,23 @@ RUN apk update && apk add --no-cache \
nodejs-22 \
npm \
python3 \
ruby
ruby \
# Chromium browser and dependencies for Playwright
chromium \
# Additional libraries Chromium needs
nss \
freetype \
harfbuzz \
# X11 libraries for headless browser
libx11 \
libxcomposite \
libxdamage \
libxext \
libxfixes \
libxrandr \
mesa-gbm \
# Font rendering
fontconfig
# Copy Go binaries from builder
COPY --from=builder /go/bin/subfinder /usr/local/bin/
@@ -97,7 +113,7 @@ COPY package*.json ./
# Install Node.js dependencies as root
RUN npm ci --only=production && \
npm install -g zx && \
npm install -g @anthropic-ai/claude-code && \
npm install -g @anthropic-ai/claude-agent-sdk && \
npm cache clean --force
# Copy application code
@@ -116,6 +132,9 @@ USER pentest
# Set environment variables
ENV NODE_ENV=production
ENV PATH="/usr/local/bin:$PATH"
ENV SHANNON_DOCKER=true
ENV PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD=1
ENV PLAYWRIGHT_CHROMIUM_EXECUTABLE_PATH=/usr/bin/chromium-browser
# Set entrypoint
@@ -1,131 +0,0 @@
#!/usr/bin/env node
import { createHmac } from 'crypto';
/**
* Standalone TOTP generator that doesn't require external dependencies
* Based on RFC 6238 (TOTP: Time-Based One-Time Password Algorithm)
*/
function parseArgs() {
const args = {};
for (let i = 2; i < process.argv.length; i++) {
if (process.argv[i] === '--secret' && i + 1 < process.argv.length) {
args.secret = process.argv[i + 1];
i++; // Skip the next argument since it's the value
} else if (process.argv[i] === '--help' || process.argv[i] === '-h') {
args.help = true;
}
}
return args;
}
function showHelp() {
console.log(`
Usage: node generate-totp-standalone.mjs --secret <TOTP_SECRET>
Generate a Time-based One-Time Password (TOTP) from a secret key.
This standalone version doesn't require external dependencies.
Options:
--secret <secret> The base32-encoded TOTP secret key (required)
--help, -h Show this help message
Examples:
node generate-totp-standalone.mjs --secret "JBSWY3DPEHPK3PXP"
node generate-totp-standalone.mjs --secret "u4e2ewg3d6w7gya3p7plgkef6zgfzo23"
Output:
A 6-digit TOTP code (e.g., 123456)
`);
}
// Base32 decoding function
function base32Decode(encoded) {
const alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ234567';
const cleanInput = encoded.toUpperCase().replace(/[^A-Z2-7]/g, '');
if (cleanInput.length === 0) {
return Buffer.alloc(0);
}
const output = [];
let bits = 0;
let value = 0;
for (const char of cleanInput) {
const index = alphabet.indexOf(char);
if (index === -1) {
throw new Error(`Invalid base32 character: ${char}`);
}
value = (value << 5) | index;
bits += 5;
if (bits >= 8) {
output.push((value >>> (bits - 8)) & 255);
bits -= 8;
}
}
return Buffer.from(output);
}
// HOTP implementation (RFC 4226)
function generateHOTP(secret, counter, digits = 6) {
const key = base32Decode(secret);
// Convert counter to 8-byte buffer (big-endian)
const counterBuffer = Buffer.alloc(8);
counterBuffer.writeBigUInt64BE(BigInt(counter));
// Generate HMAC-SHA1
const hmac = createHmac('sha1', key);
hmac.update(counterBuffer);
const hash = hmac.digest();
// Dynamic truncation
const offset = hash[hash.length - 1] & 0x0f;
const code = (
((hash[offset] & 0x7f) << 24) |
((hash[offset + 1] & 0xff) << 16) |
((hash[offset + 2] & 0xff) << 8) |
(hash[offset + 3] & 0xff)
);
// Generate digits
const otp = (code % Math.pow(10, digits)).toString().padStart(digits, '0');
return otp;
}
// TOTP implementation (RFC 6238)
function generateTOTP(secret, timeStep = 30, digits = 6) {
const currentTime = Math.floor(Date.now() / 1000);
const counter = Math.floor(currentTime / timeStep);
return generateHOTP(secret, counter, digits);
}
function main() {
const args = parseArgs();
if (args.help) {
showHelp();
return;
}
if (!args.secret) {
console.error('Error: --secret parameter is required');
console.error('Use --help for usage information');
process.exit(1);
}
try {
const totpCode = generateTOTP(args.secret);
console.log(totpCode);
} catch (error) {
console.error(`Error: ${error.message}`);
process.exit(1);
}
}
main();
+254
@@ -0,0 +1,254 @@
{
"name": "@shannon/mcp-server",
"version": "1.0.0",
"lockfileVersion": 3,
"requires": true,
"packages": {
"": {
"name": "@shannon/mcp-server",
"version": "1.0.0",
"dependencies": {
"@anthropic-ai/claude-agent-sdk": "^0.1.0",
"zod": "^3.22.4"
}
},
"node_modules/@anthropic-ai/claude-agent-sdk": {
"version": "0.1.25",
"resolved": "https://registry.npmjs.org/@anthropic-ai/claude-agent-sdk/-/claude-agent-sdk-0.1.25.tgz",
"integrity": "sha512-qwuydYaA3uamz4ivDzYXfL2PBjGwc0+beeIyo3nvtZQOtFLjH7xPdBK2w3+9KnB3L6V7VooAMdTXPpQyxCwcOg==",
"license": "SEE LICENSE IN README.md",
"engines": {
"node": ">=18.0.0"
},
"optionalDependencies": {
"@img/sharp-darwin-arm64": "^0.33.5",
"@img/sharp-darwin-x64": "^0.33.5",
"@img/sharp-linux-arm": "^0.33.5",
"@img/sharp-linux-arm64": "^0.33.5",
"@img/sharp-linux-x64": "^0.33.5",
"@img/sharp-win32-x64": "^0.33.5"
},
"peerDependencies": {
"zod": "^3.24.1"
}
},
"node_modules/@img/sharp-darwin-arm64": {
"version": "0.33.5",
"resolved": "https://registry.npmjs.org/@img/sharp-darwin-arm64/-/sharp-darwin-arm64-0.33.5.tgz",
"integrity": "sha512-UT4p+iz/2H4twwAoLCqfA9UH5pI6DggwKEGuaPy7nCVQ8ZsiY5PIcrRvD1DzuY3qYL07NtIQcWnBSY/heikIFQ==",
"cpu": [
"arm64"
],
"license": "Apache-2.0",
"optional": true,
"os": [
"darwin"
],
"engines": {
"node": "^18.17.0 || ^20.3.0 || >=21.0.0"
},
"funding": {
"url": "https://opencollective.com/libvips"
},
"optionalDependencies": {
"@img/sharp-libvips-darwin-arm64": "1.0.4"
}
},
"node_modules/@img/sharp-darwin-x64": {
"version": "0.33.5",
"resolved": "https://registry.npmjs.org/@img/sharp-darwin-x64/-/sharp-darwin-x64-0.33.5.tgz",
"integrity": "sha512-fyHac4jIc1ANYGRDxtiqelIbdWkIuQaI84Mv45KvGRRxSAa7o7d1ZKAOBaYbnepLC1WqxfpimdeWfvqqSGwR2Q==",
"cpu": [
"x64"
],
"license": "Apache-2.0",
"optional": true,
"os": [
"darwin"
],
"engines": {
"node": "^18.17.0 || ^20.3.0 || >=21.0.0"
},
"funding": {
"url": "https://opencollective.com/libvips"
},
"optionalDependencies": {
"@img/sharp-libvips-darwin-x64": "1.0.4"
}
},
"node_modules/@img/sharp-libvips-darwin-arm64": {
"version": "1.0.4",
"resolved": "https://registry.npmjs.org/@img/sharp-libvips-darwin-arm64/-/sharp-libvips-darwin-arm64-1.0.4.tgz",
"integrity": "sha512-XblONe153h0O2zuFfTAbQYAX2JhYmDHeWikp1LM9Hul9gVPjFY427k6dFEcOL72O01QxQsWi761svJ/ev9xEDg==",
"cpu": [
"arm64"
],
"license": "LGPL-3.0-or-later",
"optional": true,
"os": [
"darwin"
],
"funding": {
"url": "https://opencollective.com/libvips"
}
},
"node_modules/@img/sharp-libvips-darwin-x64": {
"version": "1.0.4",
"resolved": "https://registry.npmjs.org/@img/sharp-libvips-darwin-x64/-/sharp-libvips-darwin-x64-1.0.4.tgz",
"integrity": "sha512-xnGR8YuZYfJGmWPvmlunFaWJsb9T/AO2ykoP3Fz/0X5XV2aoYBPkX6xqCQvUTKKiLddarLaxpzNe+b1hjeWHAQ==",
"cpu": [
"x64"
],
"license": "LGPL-3.0-or-later",
"optional": true,
"os": [
"darwin"
],
"funding": {
"url": "https://opencollective.com/libvips"
}
},
"node_modules/@img/sharp-libvips-linux-arm": {
"version": "1.0.5",
"resolved": "https://registry.npmjs.org/@img/sharp-libvips-linux-arm/-/sharp-libvips-linux-arm-1.0.5.tgz",
"integrity": "sha512-gvcC4ACAOPRNATg/ov8/MnbxFDJqf/pDePbBnuBDcjsI8PssmjoKMAz4LtLaVi+OnSb5FK/yIOamqDwGmXW32g==",
"cpu": [
"arm"
],
"license": "LGPL-3.0-or-later",
"optional": true,
"os": [
"linux"
],
"funding": {
"url": "https://opencollective.com/libvips"
}
},
"node_modules/@img/sharp-libvips-linux-arm64": {
"version": "1.0.4",
"resolved": "https://registry.npmjs.org/@img/sharp-libvips-linux-arm64/-/sharp-libvips-linux-arm64-1.0.4.tgz",
"integrity": "sha512-9B+taZ8DlyyqzZQnoeIvDVR/2F4EbMepXMc/NdVbkzsJbzkUjhXv/70GQJ7tdLA4YJgNP25zukcxpX2/SueNrA==",
"cpu": [
"arm64"
],
"license": "LGPL-3.0-or-later",
"optional": true,
"os": [
"linux"
],
"funding": {
"url": "https://opencollective.com/libvips"
}
},
"node_modules/@img/sharp-libvips-linux-x64": {
"version": "1.0.4",
"resolved": "https://registry.npmjs.org/@img/sharp-libvips-linux-x64/-/sharp-libvips-linux-x64-1.0.4.tgz",
"integrity": "sha512-MmWmQ3iPFZr0Iev+BAgVMb3ZyC4KeFc3jFxnNbEPas60e1cIfevbtuyf9nDGIzOaW9PdnDciJm+wFFaTlj5xYw==",
"cpu": [
"x64"
],
"license": "LGPL-3.0-or-later",
"optional": true,
"os": [
"linux"
],
"funding": {
"url": "https://opencollective.com/libvips"
}
},
"node_modules/@img/sharp-linux-arm": {
"version": "0.33.5",
"resolved": "https://registry.npmjs.org/@img/sharp-linux-arm/-/sharp-linux-arm-0.33.5.tgz",
"integrity": "sha512-JTS1eldqZbJxjvKaAkxhZmBqPRGmxgu+qFKSInv8moZ2AmT5Yib3EQ1c6gp493HvrvV8QgdOXdyaIBrhvFhBMQ==",
"cpu": [
"arm"
],
"license": "Apache-2.0",
"optional": true,
"os": [
"linux"
],
"engines": {
"node": "^18.17.0 || ^20.3.0 || >=21.0.0"
},
"funding": {
"url": "https://opencollective.com/libvips"
},
"optionalDependencies": {
"@img/sharp-libvips-linux-arm": "1.0.5"
}
},
"node_modules/@img/sharp-linux-arm64": {
"version": "0.33.5",
"resolved": "https://registry.npmjs.org/@img/sharp-linux-arm64/-/sharp-linux-arm64-0.33.5.tgz",
"integrity": "sha512-JMVv+AMRyGOHtO1RFBiJy/MBsgz0x4AWrT6QoEVVTyh1E39TrCUpTRI7mx9VksGX4awWASxqCYLCV4wBZHAYxA==",
"cpu": [
"arm64"
],
"license": "Apache-2.0",
"optional": true,
"os": [
"linux"
],
"engines": {
"node": "^18.17.0 || ^20.3.0 || >=21.0.0"
},
"funding": {
"url": "https://opencollective.com/libvips"
},
"optionalDependencies": {
"@img/sharp-libvips-linux-arm64": "1.0.4"
}
},
"node_modules/@img/sharp-linux-x64": {
"version": "0.33.5",
"resolved": "https://registry.npmjs.org/@img/sharp-linux-x64/-/sharp-linux-x64-0.33.5.tgz",
"integrity": "sha512-opC+Ok5pRNAzuvq1AG0ar+1owsu842/Ab+4qvU879ippJBHvyY5n2mxF1izXqkPYlGuP/M556uh53jRLJmzTWA==",
"cpu": [
"x64"
],
"license": "Apache-2.0",
"optional": true,
"os": [
"linux"
],
"engines": {
"node": "^18.17.0 || ^20.3.0 || >=21.0.0"
},
"funding": {
"url": "https://opencollective.com/libvips"
},
"optionalDependencies": {
"@img/sharp-libvips-linux-x64": "1.0.4"
}
},
"node_modules/@img/sharp-win32-x64": {
"version": "0.33.5",
"resolved": "https://registry.npmjs.org/@img/sharp-win32-x64/-/sharp-win32-x64-0.33.5.tgz",
"integrity": "sha512-MpY/o8/8kj+EcnxwvrP4aTJSWw/aZ7JIGR4aBeZkZw5B7/Jn+tY9/VNwtcoGmdT7GfggGIU4kygOMSbYnOrAbg==",
"cpu": [
"x64"
],
"license": "Apache-2.0 AND LGPL-3.0-or-later",
"optional": true,
"os": [
"win32"
],
"engines": {
"node": "^18.17.0 || ^20.3.0 || >=21.0.0"
},
"funding": {
"url": "https://opencollective.com/libvips"
}
},
"node_modules/zod": {
"version": "3.25.76",
"resolved": "https://registry.npmjs.org/zod/-/zod-3.25.76.tgz",
"integrity": "sha512-gzUt/qt81nXsFGKIFcC3YnfEAx5NkunCfnDlvuBSSFS02bcXu4Lmea0AFIUwbLWxWPx3d9p8S5QoaujKcNQxcQ==",
"license": "MIT",
"funding": {
"url": "https://github.com/sponsors/colinhacks"
}
}
}
}
+13
@@ -0,0 +1,13 @@
{
"name": "@shannon/mcp-server",
"version": "1.0.0",
"type": "module",
"main": "./src/index.js",
"scripts": {
"clean": "rm -rf dist"
},
"dependencies": {
"@anthropic-ai/claude-agent-sdk": "^0.1.0",
"zod": "^3.22.4"
}
}
+35
@@ -0,0 +1,35 @@
/**
* Shannon Helper MCP Server
*
* In-process MCP server providing save_deliverable and generate_totp tools
* for Shannon penetration testing agents.
*
* Replaces bash script invocations with native tool access.
*/
import { createSdkMcpServer } from '@anthropic-ai/claude-agent-sdk';
import { saveDeliverableTool } from './tools/save-deliverable.js';
import { generateTotpTool } from './tools/generate-totp.js';
/**
* Create Shannon Helper MCP Server with target directory context
*
* @param {string} targetDir - The target repository directory where deliverables should be saved
* @returns {Object} MCP server instance
*/
export function createShannonHelperServer(targetDir) {
// Store target directory for tool access
global.__SHANNON_TARGET_DIR = targetDir;
return createSdkMcpServer({
name: 'shannon-helper',
version: '1.0.0',
tools: [saveDeliverableTool, generateTotpTool],
});
}
// Export tools for direct usage if needed
export { saveDeliverableTool, generateTotpTool };
// Export types for external use
export * from './types/index.js';
+137
@@ -0,0 +1,137 @@
/**
* generate_totp MCP Tool
*
* Generates 6-digit TOTP codes for authentication.
* Replaces tools/generate-totp-standalone.mjs bash script.
* Based on RFC 6238 (TOTP) and RFC 4226 (HOTP).
*/
import { tool } from '@anthropic-ai/claude-agent-sdk';
import { createHmac } from 'crypto';
import { z } from 'zod';
import { createToolResult } from '../types/tool-responses.js';
import { base32Decode, validateTotpSecret } from '../validation/totp-validator.js';
import { createCryptoError, createGenericError } from '../utils/error-formatter.js';
/**
* Input schema for generate_totp tool
*/
export const GenerateTotpInputSchema = z.object({
secret: z
.string()
.min(1)
.regex(/^[A-Z2-7]+$/i, 'Must be base32-encoded')
.describe('Base32-encoded TOTP secret'),
});
/**
* Generate HOTP code (RFC 4226)
* Ported from generate-totp-standalone.mjs (lines 74-99)
*
* @param {string} secret - Base32-encoded secret
* @param {number} counter - Counter value
* @param {number} [digits=6] - Number of digits in OTP
* @returns {string} OTP code
*/
function generateHOTP(secret, counter, digits = 6) {
const key = base32Decode(secret);
// Convert counter to 8-byte buffer (big-endian)
const counterBuffer = Buffer.alloc(8);
counterBuffer.writeBigUInt64BE(BigInt(counter));
// Generate HMAC-SHA1
const hmac = createHmac('sha1', key);
hmac.update(counterBuffer);
const hash = hmac.digest();
// Dynamic truncation
const offset = hash[hash.length - 1] & 0x0f;
const code =
((hash[offset] & 0x7f) << 24) |
((hash[offset + 1] & 0xff) << 16) |
((hash[offset + 2] & 0xff) << 8) |
(hash[offset + 3] & 0xff);
// Generate digits
const otp = (code % Math.pow(10, digits)).toString().padStart(digits, '0');
return otp;
}
/**
* Generate TOTP code (RFC 6238)
* Ported from generate-totp-standalone.mjs (lines 101-106)
*
* @param {string} secret - Base32-encoded secret
* @param {number} [timeStep=30] - Time step in seconds
* @param {number} [digits=6] - Number of digits in OTP
* @returns {string} OTP code
*/
function generateTOTP(secret, timeStep = 30, digits = 6) {
const currentTime = Math.floor(Date.now() / 1000);
const counter = Math.floor(currentTime / timeStep);
return generateHOTP(secret, counter, digits);
}
/**
* Get seconds until TOTP code expires
*
* @param {number} [timeStep=30] - Time step in seconds
* @returns {number} Seconds until expiration
*/
function getSecondsUntilExpiration(timeStep = 30) {
const currentTime = Math.floor(Date.now() / 1000);
return timeStep - (currentTime % timeStep);
}
/**
* generate_totp tool implementation
*
* @param {Object} args
* @param {string} args.secret - Base32-encoded TOTP secret
* @returns {Promise<Object>} Tool result
*/
export async function generateTotp(args) {
try {
const { secret } = args;
// Validate secret (throws on error)
validateTotpSecret(secret);
// Generate TOTP code
const totpCode = generateTOTP(secret);
const expiresIn = getSecondsUntilExpiration();
const timestamp = new Date().toISOString();
// Success response
const successResponse = {
status: 'success',
message: 'TOTP code generated successfully',
totpCode,
timestamp,
expiresIn,
};
return createToolResult(successResponse);
} catch (error) {
// Check if it's a validation/crypto error
if (error instanceof Error && (error.message.includes('base32') || error.message.includes('TOTP'))) {
const errorResponse = createCryptoError(error.message, false);
return createToolResult(errorResponse);
}
// Generic error
const errorResponse = createGenericError(error, false);
return createToolResult(errorResponse);
}
}
/**
* Tool definition for MCP server - created using SDK's tool() function
*/
export const generateTotpTool = tool(
'generate_totp',
'Generates 6-digit TOTP code for authentication. Secret must be base32-encoded.',
GenerateTotpInputSchema.shape,
generateTotp
);
+85
@@ -0,0 +1,85 @@
/**
* save_deliverable MCP Tool
*
* Saves deliverable files with automatic validation.
* Replaces tools/save_deliverable.js bash script.
*/
import { tool } from '@anthropic-ai/claude-agent-sdk';
import { z } from 'zod';
import { DeliverableType, DELIVERABLE_FILENAMES, isQueueType } from '../types/deliverables.js';
import { createToolResult } from '../types/tool-responses.js';
import { validateQueueJson } from '../validation/queue-validator.js';
import { saveDeliverableFile } from '../utils/file-operations.js';
import { createValidationError, createGenericError } from '../utils/error-formatter.js';
/**
* Input schema for save_deliverable tool
*/
export const SaveDeliverableInputSchema = z.object({
deliverable_type: z.nativeEnum(DeliverableType).describe('Type of deliverable to save'),
content: z.string().min(1).describe('File content (markdown for analysis/evidence, JSON for queues)'),
});
/**
* save_deliverable tool implementation
*
* @param {Object} args
* @param {string} args.deliverable_type - Type of deliverable to save
* @param {string} args.content - File content
* @returns {Promise<Object>} Tool result
*/
export async function saveDeliverable(args) {
try {
const { deliverable_type, content } = args;
// Validate queue JSON if applicable
if (isQueueType(deliverable_type)) {
const queueValidation = validateQueueJson(content);
if (!queueValidation.valid) {
const errorResponse = createValidationError(
queueValidation.message,
true,
{
deliverableType: deliverable_type,
expectedFormat: '{"vulnerabilities": [...]}',
}
);
return createToolResult(errorResponse);
}
}
// Get filename and save file
const filename = DELIVERABLE_FILENAMES[deliverable_type];
const filepath = saveDeliverableFile(filename, content);
// Success response
const successResponse = {
status: 'success',
message: `Deliverable saved successfully: ${filename}`,
filepath,
deliverableType: deliverable_type,
validated: isQueueType(deliverable_type),
};
return createToolResult(successResponse);
} catch (error) {
const errorResponse = createGenericError(
error,
false,
{ deliverableType: args.deliverable_type }
);
return createToolResult(errorResponse);
}
}
/**
* Tool definition for MCP server - created using SDK's tool() function
*/
export const saveDeliverableTool = tool(
'save_deliverable',
'Saves deliverable files with automatic validation. Queue files must have {"vulnerabilities": [...]} structure.',
SaveDeliverableInputSchema.shape,
saveDeliverable
);
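Note on the queue check used by `save_deliverable`: `validateQueueJson` lives in `../validation/queue-validator.js`, which is not shown here. As a rough sketch of what such a check might look like, assuming only the `{ valid, message }` result shape and the `{"vulnerabilities": [...]}` structure visible above, a hypothetical `checkQueueJson` could be written as:

```javascript
// Hypothetical validator: accepts a JSON string and reports whether it is an
// object with a "vulnerabilities" array. Not the project's implementation.
function checkQueueJson(content) {
  let parsed;
  try {
    parsed = JSON.parse(content);
  } catch (error) {
    return { valid: false, message: `Invalid JSON: ${error.message}` };
  }
  if (parsed === null || typeof parsed !== 'object' || !Array.isArray(parsed.vulnerabilities)) {
    return { valid: false, message: 'Expected an object with a "vulnerabilities" array' };
  }
  return { valid: true, message: 'ok' };
}

console.log(checkQueueJson('{"vulnerabilities": []}').valid); // true
console.log(checkQueueJson('[]').valid); // false
```

Returning a result object instead of throwing lets the tool turn a malformed queue into a retryable validation error, which is how the handler above treats it.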
+107
@@ -0,0 +1,107 @@
/**
* Deliverable Type Definitions
*
* Maps deliverable types to their filenames and defines validation requirements.
* Must match the exact mappings from tools/save_deliverable.js.
*/
/**
* @typedef {Object} DeliverableType
* @property {string} CODE_ANALYSIS
* @property {string} RECON
* @property {string} INJECTION_ANALYSIS
* @property {string} INJECTION_QUEUE
* @property {string} XSS_ANALYSIS
* @property {string} XSS_QUEUE
* @property {string} AUTH_ANALYSIS
* @property {string} AUTH_QUEUE
* @property {string} AUTHZ_ANALYSIS
* @property {string} AUTHZ_QUEUE
* @property {string} SSRF_ANALYSIS
* @property {string} SSRF_QUEUE
* @property {string} INJECTION_EVIDENCE
* @property {string} XSS_EVIDENCE
* @property {string} AUTH_EVIDENCE
* @property {string} AUTHZ_EVIDENCE
* @property {string} SSRF_EVIDENCE
*/
export const DeliverableType = {
// Pre-recon agent
CODE_ANALYSIS: 'CODE_ANALYSIS',
// Recon agent
RECON: 'RECON',
// Vulnerability analysis agents
INJECTION_ANALYSIS: 'INJECTION_ANALYSIS',
INJECTION_QUEUE: 'INJECTION_QUEUE',
XSS_ANALYSIS: 'XSS_ANALYSIS',
XSS_QUEUE: 'XSS_QUEUE',
AUTH_ANALYSIS: 'AUTH_ANALYSIS',
AUTH_QUEUE: 'AUTH_QUEUE',
AUTHZ_ANALYSIS: 'AUTHZ_ANALYSIS',
AUTHZ_QUEUE: 'AUTHZ_QUEUE',
SSRF_ANALYSIS: 'SSRF_ANALYSIS',
SSRF_QUEUE: 'SSRF_QUEUE',
// Exploitation agents
INJECTION_EVIDENCE: 'INJECTION_EVIDENCE',
XSS_EVIDENCE: 'XSS_EVIDENCE',
AUTH_EVIDENCE: 'AUTH_EVIDENCE',
AUTHZ_EVIDENCE: 'AUTHZ_EVIDENCE',
SSRF_EVIDENCE: 'SSRF_EVIDENCE',
};
/**
* Hard-coded filename mappings from agent prompts
* Must match tools/save_deliverable.js exactly
*/
export const DELIVERABLE_FILENAMES = {
[DeliverableType.CODE_ANALYSIS]: 'code_analysis_deliverable.md',
[DeliverableType.RECON]: 'recon_deliverable.md',
[DeliverableType.INJECTION_ANALYSIS]: 'injection_analysis_deliverable.md',
[DeliverableType.INJECTION_QUEUE]: 'injection_exploitation_queue.json',
[DeliverableType.XSS_ANALYSIS]: 'xss_analysis_deliverable.md',
[DeliverableType.XSS_QUEUE]: 'xss_exploitation_queue.json',
[DeliverableType.AUTH_ANALYSIS]: 'auth_analysis_deliverable.md',
[DeliverableType.AUTH_QUEUE]: 'auth_exploitation_queue.json',
[DeliverableType.AUTHZ_ANALYSIS]: 'authz_analysis_deliverable.md',
[DeliverableType.AUTHZ_QUEUE]: 'authz_exploitation_queue.json',
[DeliverableType.SSRF_ANALYSIS]: 'ssrf_analysis_deliverable.md',
[DeliverableType.SSRF_QUEUE]: 'ssrf_exploitation_queue.json',
[DeliverableType.INJECTION_EVIDENCE]: 'injection_exploitation_evidence.md',
[DeliverableType.XSS_EVIDENCE]: 'xss_exploitation_evidence.md',
[DeliverableType.AUTH_EVIDENCE]: 'auth_exploitation_evidence.md',
[DeliverableType.AUTHZ_EVIDENCE]: 'authz_exploitation_evidence.md',
[DeliverableType.SSRF_EVIDENCE]: 'ssrf_exploitation_evidence.md',
};
/**
* Queue types that require JSON validation
*/
export const QUEUE_TYPES = [
DeliverableType.INJECTION_QUEUE,
DeliverableType.XSS_QUEUE,
DeliverableType.AUTH_QUEUE,
DeliverableType.AUTHZ_QUEUE,
DeliverableType.SSRF_QUEUE,
];
/**
* Type guard to check if a deliverable type is a queue
* @param {string} type - Deliverable type to check
* @returns {boolean} True if the type is a queue type
*/
export function isQueueType(type) {
return QUEUE_TYPES.includes(type);
}
/**
* @typedef {Object} VulnerabilityQueue
* @property {Array<Object>} vulnerabilities - Array of vulnerability objects
*/
+6
@@ -0,0 +1,6 @@
/**
* Type definitions barrel export
*/
export * from './deliverables.js';
export * from './tool-responses.js';
+58
@@ -0,0 +1,58 @@
/**
* Tool Response Type Definitions
*
* Defines structured response formats for MCP tools to ensure
* consistent error handling and success reporting.
*/
/**
* @typedef {Object} ErrorResponse
* @property {'error'} status
* @property {string} message
* @property {string} errorType - ValidationError, FileSystemError, CryptoError, etc.
* @property {boolean} retryable
* @property {Record<string, unknown>} [context]
*/
/**
* @typedef {Object} SuccessResponse
* @property {'success'} status
* @property {string} message
*/
/**
* @typedef {Object} SaveDeliverableResponse
* @property {'success'} status
* @property {string} message
* @property {string} filepath
* @property {string} deliverableType
* @property {boolean} validated - true if queue JSON was validated
*/
/**
* @typedef {Object} GenerateTotpResponse
* @property {'success'} status
* @property {string} message
* @property {string} totpCode
* @property {string} timestamp
* @property {number} expiresIn - seconds until expiration
*/
/**
* Helper to create tool result from response
* MCP tools should return this format
*
* @param {ErrorResponse | SaveDeliverableResponse | GenerateTotpResponse} response
* @returns {{ content: Array<{ type: string; text: string }>; isError: boolean }}
*/
export function createToolResult(response) {
return {
content: [
{
type: 'text',
text: JSON.stringify(response, null, 2),
},
],
isError: response.status === 'error',
};
}
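The following sketch shows how `createToolResult` from the file above wraps a success and an error payload. The function body is copied from the diff so the snippet runs on its own; the payload values (message text, file path) are illustrative assumptions, not output captured from the real tool.

```javascript
// Copied from the diff above so this snippet is self-contained.
function createToolResult(response) {
  return {
    content: [{ type: 'text', text: JSON.stringify(response, null, 2) }],
    isError: response.status === 'error',
  };
}

// Hypothetical payload shaped like SaveDeliverableResponse.
const saved = createToolResult({
  status: 'success',
  message: 'Deliverable saved',
  filepath: '/tmp/example/deliverables/example.md', // illustrative path only
  deliverableType: 'AUTH_EVIDENCE',
  validated: false,
});

// Hypothetical payload shaped like ErrorResponse.
const failed = createToolResult({
  status: 'error',
  message: 'Queue JSON is missing the vulnerabilities array',
  errorType: 'ValidationError',
  retryable: true,
});

console.log(saved.isError);  // false
console.log(failed.isError); // true

// The structured response survives a round trip through the text content.
const roundTrip = JSON.parse(failed.content[0].text);
console.log(roundTrip.errorType); // ValidationError
```

Because the response is serialized into a single text item, a caller that wants the structured fields has to parse `content[0].text` again, as the last lines do.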
+71
@@ -0,0 +1,71 @@
/**
* Error Formatting Utilities
*
* Helper functions for creating structured error responses.
*/
/**
* @typedef {Object} ErrorResponse
* @property {'error'} status
* @property {string} message
* @property {string} errorType
* @property {boolean} retryable
* @property {Record<string, unknown>} [context]
*/
/**
* Create a validation error response
*
* @param {string} message
* @param {boolean} [retryable=true]
* @param {Record<string, unknown>} [context]
* @returns {ErrorResponse}
*/
export function createValidationError(message, retryable = true, context) {
return {
status: 'error',
message,
errorType: 'ValidationError',
retryable,
context,
};
}
/**
* Create a crypto error response
*
* @param {string} message
* @param {boolean} [retryable=false]
* @param {Record<string, unknown>} [context]
* @returns {ErrorResponse}
*/
export function createCryptoError(message, retryable = false, context) {
return {
status: 'error',
message,
errorType: 'CryptoError',
retryable,
context,
};
}
/**
* Create a generic error response
*
* @param {unknown} error
* @param {boolean} [retryable=false]
* @param {Record<string, unknown>} [context]
* @returns {ErrorResponse}
*/
export function createGenericError(error, retryable = false, context) {
const message = error instanceof Error ? error.message : String(error);
const errorType = error instanceof Error ? error.constructor.name : 'UnknownError';
return {
status: 'error',
message,
errorType,
retryable,
context,
};
}
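A short sketch of how `createGenericError` from the file above classifies different thrown values. The function is copied from the diff (condensed onto fewer lines) so the snippet is self-contained; the error messages and context object are made-up inputs.

```javascript
// Copied from the diff above (object literal condensed).
function createGenericError(error, retryable = false, context) {
  const message = error instanceof Error ? error.message : String(error);
  const errorType = error instanceof Error ? error.constructor.name : 'UnknownError';
  return { status: 'error', message, errorType, retryable, context };
}

// An Error subclass keeps its constructor name as the errorType.
const fromTypeError = createGenericError(new TypeError('bad input'));

// Anything that is not an Error is stringified and labelled UnknownError.
const fromString = createGenericError('disk full', true, { attempt: 2 });

console.log(fromTypeError.errorType, fromTypeError.retryable); // TypeError false
console.log(fromString.errorType, fromString.message);         // UnknownError disk full

// An omitted context is undefined, so JSON.stringify drops the key entirely.
const serialized = JSON.parse(JSON.stringify(fromTypeError));
console.log('context' in serialized); // false
```

One consequence worth noting: consumers of the serialized response should treat `context` as optional rather than expecting a `null` placeholder.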
+35
@@ -0,0 +1,35 @@
/**
* File Operations Utilities
*
* Handles file system operations for deliverable saving.
* Ported from tools/save_deliverable.js (lines 117-130).
*/
import { writeFileSync, mkdirSync } from 'fs';
import { join } from 'path';
/**
* Save deliverable file to deliverables/ directory
*
* @param {string} filename - Name of the file to save
* @param {string} content - Content to write to the file
* @returns {string} Full path to the saved file
*/
export function saveDeliverableFile(filename, content) {
// Use target directory from global context (set by createShannonHelperServer)
const targetDir = global.__SHANNON_TARGET_DIR || process.cwd();
const deliverablesDir = join(targetDir, 'deliverables');
const filepath = join(deliverablesDir, filename);
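// Ensure deliverables directory exists
try {
mkdirSync(deliverablesDir, { recursive: true });
} catch (error) {
// With recursive: true an existing directory does not throw; any other
// failure (e.g. permissions) will resurface from writeFileSync below
}
// Write file in a single synchronous call (not atomic: an interrupted write can leave a partial file)
writeFileSync(filepath, content, 'utf8');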
return filepath;
}
@@ -0,0 +1,51 @@
/**
* Queue Validator
*
* Validates JSON structure for vulnerability queue files.
* Ported from tools/save_deliverable.js (lines 56-75).
*/
/**
* @typedef {Object} ValidationResult
* @property {boolean} valid
* @property {string} [message]
* @property {Object} [data]
*/
/**
* Validate JSON structure for queue files
* Queue files must have a 'vulnerabilities' array
*
* @param {string} content - JSON string to validate
* @returns {ValidationResult} ValidationResult with valid flag, optional error message, and parsed data
*/
export function validateQueueJson(content) {
try {
const parsed = JSON.parse(content);
// Queue files must have a 'vulnerabilities' array
if (!parsed.vulnerabilities) {
return {
valid: false,
message: `Invalid queue structure: Missing 'vulnerabilities' property. Expected: {"vulnerabilities": [...]}`,
};
}
if (!Array.isArray(parsed.vulnerabilities)) {
return {
valid: false,
message: `Invalid queue structure: 'vulnerabilities' must be an array. Expected: {"vulnerabilities": [...]}`,
};
}
return {
valid: true,
data: parsed,
};
} catch (error) {
return {
valid: false,
message: `Invalid JSON: ${error instanceof Error ? error.message : String(error)}`,
};
}
}
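The sketch below exercises `validateQueueJson` from the file above against the four outcomes it distinguishes. The function is copied from the diff so the snippet runs standalone; the sample JSON strings and the vulnerability ID are invented inputs.

```javascript
// Copied from the diff above so this snippet is self-contained.
function validateQueueJson(content) {
  try {
    const parsed = JSON.parse(content);
    if (!parsed.vulnerabilities) {
      return {
        valid: false,
        message: `Invalid queue structure: Missing 'vulnerabilities' property. Expected: {"vulnerabilities": [...]}`,
      };
    }
    if (!Array.isArray(parsed.vulnerabilities)) {
      return {
        valid: false,
        message: `Invalid queue structure: 'vulnerabilities' must be an array. Expected: {"vulnerabilities": [...]}`,
      };
    }
    return { valid: true, data: parsed };
  } catch (error) {
    return {
      valid: false,
      message: `Invalid JSON: ${error instanceof Error ? error.message : String(error)}`,
    };
  }
}

const ok = validateQueueJson('{"vulnerabilities": [{"id": "EXAMPLE-01"}]}');
const missing = validateQueueJson('{"findings": []}');
const notArray = validateQueueJson('{"vulnerabilities": {"id": "EXAMPLE-01"}}');
const broken = validateQueueJson('{"vulnerabilities": [');

console.log(ok.valid, ok.data.vulnerabilities.length); // true 1
console.log(missing.valid, missing.message);
console.log(notArray.valid, notArray.message);
console.log(broken.valid, broken.message);
```

An empty array passes validation, since `[]` is truthy and is an array. Valid JSON that is not an object, such as `null`, ends up in the `catch` branch and is reported with the "Invalid JSON" prefix, even though the parse itself succeeded.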
@@ -0,0 +1,71 @@
/**
* TOTP Validator
*
* Validates TOTP secrets and provides base32 decoding.
* Ported from tools/generate-totp-standalone.mjs (lines 43-72).
*/
/**
* Base32 decode function
* Ported from generate-totp-standalone.mjs
*
* @param {string} encoded - Base32 encoded string
* @returns {Buffer} Buffer containing decoded bytes
*/
export function base32Decode(encoded) {
const alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ234567';
const cleanInput = encoded.toUpperCase().replace(/[^A-Z2-7]/g, '');
if (cleanInput.length === 0) {
return Buffer.alloc(0);
}
const output = [];
let bits = 0;
let value = 0;
for (const char of cleanInput) {
const index = alphabet.indexOf(char);
if (index === -1) {
throw new Error(`Invalid base32 character: ${char}`);
}
value = (value << 5) | index;
bits += 5;
if (bits >= 8) {
output.push((value >>> (bits - 8)) & 255);
bits -= 8;
}
}
return Buffer.from(output);
}
/**
* Validate TOTP secret
* Must be base32-encoded string
*
* @param {string} secret - Secret to validate
* @returns {boolean} true if valid, throws Error if invalid
*/
export function validateTotpSecret(secret) {
if (!secret || secret.length === 0) {
throw new Error('TOTP secret cannot be empty');
}
// Ignore common formatting (whitespace, hyphens, '=' padding), then require that
// every remaining character is valid base32 (only A-Z and 2-7, case-insensitive)
const base32Regex = /^[A-Z2-7]+$/i;
if (!base32Regex.test(secret.replace(/[\s=-]/g, ''))) {
throw new Error('TOTP secret must be base32-encoded (characters A-Z and 2-7)');
}
// Try to decode to ensure it's valid
try {
base32Decode(secret);
} catch (error) {
throw new Error(`Invalid TOTP secret: ${error instanceof Error ? error.message : String(error)}`);
}
return true;
}
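To make the decoder's behaviour concrete, the sketch below runs `base32Decode` from the file above on two well-known vectors: the RFC 4648 encoding of `foobar` and the base32 form of the ASCII secret `12345678901234567890` used in the RFC 6238 test vectors. The function is copied from the diff so the snippet runs standalone under Node.js, where `Buffer` is a global.

```javascript
// Copied from the diff above so this snippet is self-contained.
function base32Decode(encoded) {
  const alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ234567';
  const cleanInput = encoded.toUpperCase().replace(/[^A-Z2-7]/g, '');
  if (cleanInput.length === 0) {
    return Buffer.alloc(0);
  }
  const output = [];
  let bits = 0;
  let value = 0;
  for (const char of cleanInput) {
    const index = alphabet.indexOf(char);
    if (index === -1) {
      throw new Error(`Invalid base32 character: ${char}`);
    }
    // Shift in 5 bits per character; emit a byte whenever 8 or more are buffered.
    value = (value << 5) | index;
    bits += 5;
    if (bits >= 8) {
      output.push((value >>> (bits - 8)) & 255);
      bits -= 8;
    }
  }
  return Buffer.from(output);
}

// RFC 4648 test vector: '=' padding is stripped by the cleanup step.
const foobar = base32Decode('MZXW6YTBOI======').toString('utf8');

// Lowercase and spaces are tolerated because input is upper-cased and filtered.
const lenient = base32Decode('mzxw 6ytb oi').toString('utf8');

// Base32 form of the 20-byte ASCII secret from the RFC 6238 test vectors.
const rfcSecret = base32Decode('GEZDGNBVGY3TQOJQGEZDGNBVGY3TQOJQ');

console.log(foobar);                     // foobar
console.log(lenient);                    // foobar
console.log(rfcSecret.toString('utf8')); // 12345678901234567890
console.log(rfcSecret.length);           // 20
```

Because the cleanup step drops every character outside `A-Z2-7`, the `index === -1` branch is unreachable in practice, and characters such as `0`, `1`, `8`, and `9` are silently discarded by the decoder rather than rejected.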
+189 -8
@@ -8,7 +8,7 @@
"name": "shannon",
"version": "1.0.0",
"dependencies": {
"@anthropic-ai/claude-code": "^1.0.96",
"@anthropic-ai/claude-agent-sdk": "^0.1.0",
"ajv": "^8.12.0",
"ajv-formats": "^2.1.1",
"boxen": "^8.0.1",
@@ -16,20 +16,18 @@
"figlet": "^1.9.3",
"gradient-string": "^3.0.0",
"js-yaml": "^4.1.0",
"zod": "^3.22.4",
"zx": "^8.0.0"
},
"bin": {
"shannon": "shannon.mjs"
}
},
"node_modules/@anthropic-ai/claude-code": {
"version": "1.0.96",
"resolved": "https://registry.npmjs.org/@anthropic-ai/claude-code/-/claude-code-1.0.96.tgz",
"integrity": "sha512-xnxhYzuh6PYlMcw56REMQiGMW20WaLLOvG8L8TObq70zhNKs3dro7nhYwHRe1c2ubTr20oIJK0aSkyD2BpO8nA==",
"node_modules/@anthropic-ai/claude-agent-sdk": {
"version": "0.1.25",
"resolved": "https://registry.npmjs.org/@anthropic-ai/claude-agent-sdk/-/claude-agent-sdk-0.1.25.tgz",
"integrity": "sha512-qwuydYaA3uamz4ivDzYXfL2PBjGwc0+beeIyo3nvtZQOtFLjH7xPdBK2w3+9KnB3L6V7VooAMdTXPpQyxCwcOg==",
"license": "SEE LICENSE IN README.md",
"bin": {
"claude": "cli.js"
},
"engines": {
"node": ">=18.0.0"
},
@@ -40,6 +38,9 @@
"@img/sharp-linux-arm64": "^0.33.5",
"@img/sharp-linux-x64": "^0.33.5",
"@img/sharp-win32-x64": "^0.33.5"
},
"peerDependencies": {
"zod": "^3.24.1"
}
},
"node_modules/@img/sharp-darwin-arm64": {
@@ -64,6 +65,28 @@
"@img/sharp-libvips-darwin-arm64": "1.0.4"
}
},
"node_modules/@img/sharp-darwin-x64": {
"version": "0.33.5",
"resolved": "https://registry.npmjs.org/@img/sharp-darwin-x64/-/sharp-darwin-x64-0.33.5.tgz",
"integrity": "sha512-fyHac4jIc1ANYGRDxtiqelIbdWkIuQaI84Mv45KvGRRxSAa7o7d1ZKAOBaYbnepLC1WqxfpimdeWfvqqSGwR2Q==",
"cpu": [
"x64"
],
"license": "Apache-2.0",
"optional": true,
"os": [
"darwin"
],
"engines": {
"node": "^18.17.0 || ^20.3.0 || >=21.0.0"
},
"funding": {
"url": "https://opencollective.com/libvips"
},
"optionalDependencies": {
"@img/sharp-libvips-darwin-x64": "1.0.4"
}
},
"node_modules/@img/sharp-libvips-darwin-arm64": {
"version": "1.0.4",
"resolved": "https://registry.npmjs.org/@img/sharp-libvips-darwin-arm64/-/sharp-libvips-darwin-arm64-1.0.4.tgz",
@@ -80,6 +103,155 @@
"url": "https://opencollective.com/libvips"
}
},
"node_modules/@img/sharp-libvips-darwin-x64": {
"version": "1.0.4",
"resolved": "https://registry.npmjs.org/@img/sharp-libvips-darwin-x64/-/sharp-libvips-darwin-x64-1.0.4.tgz",
"integrity": "sha512-xnGR8YuZYfJGmWPvmlunFaWJsb9T/AO2ykoP3Fz/0X5XV2aoYBPkX6xqCQvUTKKiLddarLaxpzNe+b1hjeWHAQ==",
"cpu": [
"x64"
],
"license": "LGPL-3.0-or-later",
"optional": true,
"os": [
"darwin"
],
"funding": {
"url": "https://opencollective.com/libvips"
}
},
"node_modules/@img/sharp-libvips-linux-arm": {
"version": "1.0.5",
"resolved": "https://registry.npmjs.org/@img/sharp-libvips-linux-arm/-/sharp-libvips-linux-arm-1.0.5.tgz",
"integrity": "sha512-gvcC4ACAOPRNATg/ov8/MnbxFDJqf/pDePbBnuBDcjsI8PssmjoKMAz4LtLaVi+OnSb5FK/yIOamqDwGmXW32g==",
"cpu": [
"arm"
],
"license": "LGPL-3.0-or-later",
"optional": true,
"os": [
"linux"
],
"funding": {
"url": "https://opencollective.com/libvips"
}
},
"node_modules/@img/sharp-libvips-linux-arm64": {
"version": "1.0.4",
"resolved": "https://registry.npmjs.org/@img/sharp-libvips-linux-arm64/-/sharp-libvips-linux-arm64-1.0.4.tgz",
"integrity": "sha512-9B+taZ8DlyyqzZQnoeIvDVR/2F4EbMepXMc/NdVbkzsJbzkUjhXv/70GQJ7tdLA4YJgNP25zukcxpX2/SueNrA==",
"cpu": [
"arm64"
],
"license": "LGPL-3.0-or-later",
"optional": true,
"os": [
"linux"
],
"funding": {
"url": "https://opencollective.com/libvips"
}
},
"node_modules/@img/sharp-libvips-linux-x64": {
"version": "1.0.4",
"resolved": "https://registry.npmjs.org/@img/sharp-libvips-linux-x64/-/sharp-libvips-linux-x64-1.0.4.tgz",
"integrity": "sha512-MmWmQ3iPFZr0Iev+BAgVMb3ZyC4KeFc3jFxnNbEPas60e1cIfevbtuyf9nDGIzOaW9PdnDciJm+wFFaTlj5xYw==",
"cpu": [
"x64"
],
"license": "LGPL-3.0-or-later",
"optional": true,
"os": [
"linux"
],
"funding": {
"url": "https://opencollective.com/libvips"
}
},
"node_modules/@img/sharp-linux-arm": {
"version": "0.33.5",
"resolved": "https://registry.npmjs.org/@img/sharp-linux-arm/-/sharp-linux-arm-0.33.5.tgz",
"integrity": "sha512-JTS1eldqZbJxjvKaAkxhZmBqPRGmxgu+qFKSInv8moZ2AmT5Yib3EQ1c6gp493HvrvV8QgdOXdyaIBrhvFhBMQ==",
"cpu": [
"arm"
],
"license": "Apache-2.0",
"optional": true,
"os": [
"linux"
],
"engines": {
"node": "^18.17.0 || ^20.3.0 || >=21.0.0"
},
"funding": {
"url": "https://opencollective.com/libvips"
},
"optionalDependencies": {
"@img/sharp-libvips-linux-arm": "1.0.5"
}
},
"node_modules/@img/sharp-linux-arm64": {
"version": "0.33.5",
"resolved": "https://registry.npmjs.org/@img/sharp-linux-arm64/-/sharp-linux-arm64-0.33.5.tgz",
"integrity": "sha512-JMVv+AMRyGOHtO1RFBiJy/MBsgz0x4AWrT6QoEVVTyh1E39TrCUpTRI7mx9VksGX4awWASxqCYLCV4wBZHAYxA==",
"cpu": [
"arm64"
],
"license": "Apache-2.0",
"optional": true,
"os": [
"linux"
],
"engines": {
"node": "^18.17.0 || ^20.3.0 || >=21.0.0"
},
"funding": {
"url": "https://opencollective.com/libvips"
},
"optionalDependencies": {
"@img/sharp-libvips-linux-arm64": "1.0.4"
}
},
"node_modules/@img/sharp-linux-x64": {
"version": "0.33.5",
"resolved": "https://registry.npmjs.org/@img/sharp-linux-x64/-/sharp-linux-x64-0.33.5.tgz",
"integrity": "sha512-opC+Ok5pRNAzuvq1AG0ar+1owsu842/Ab+4qvU879ippJBHvyY5n2mxF1izXqkPYlGuP/M556uh53jRLJmzTWA==",
"cpu": [
"x64"
],
"license": "Apache-2.0",
"optional": true,
"os": [
"linux"
],
"engines": {
"node": "^18.17.0 || ^20.3.0 || >=21.0.0"
},
"funding": {
"url": "https://opencollective.com/libvips"
},
"optionalDependencies": {
"@img/sharp-libvips-linux-x64": "1.0.4"
}
},
"node_modules/@img/sharp-win32-x64": {
"version": "0.33.5",
"resolved": "https://registry.npmjs.org/@img/sharp-win32-x64/-/sharp-win32-x64-0.33.5.tgz",
"integrity": "sha512-MpY/o8/8kj+EcnxwvrP4aTJSWw/aZ7JIGR4aBeZkZw5B7/Jn+tY9/VNwtcoGmdT7GfggGIU4kygOMSbYnOrAbg==",
"cpu": [
"x64"
],
"license": "Apache-2.0 AND LGPL-3.0-or-later",
"optional": true,
"os": [
"win32"
],
"engines": {
"node": "^18.17.0 || ^20.3.0 || >=21.0.0"
},
"funding": {
"url": "https://opencollective.com/libvips"
}
},
"node_modules/@types/tinycolor2": {
"version": "1.4.6",
"resolved": "https://registry.npmjs.org/@types/tinycolor2/-/tinycolor2-1.4.6.tgz",
@@ -462,6 +634,15 @@
"url": "https://github.com/chalk/wrap-ansi?sponsor=1"
}
},
"node_modules/zod": {
"version": "3.25.76",
"resolved": "https://registry.npmjs.org/zod/-/zod-3.25.76.tgz",
"integrity": "sha512-gzUt/qt81nXsFGKIFcC3YnfEAx5NkunCfnDlvuBSSFS02bcXu4Lmea0AFIUwbLWxWPx3d9p8S5QoaujKcNQxcQ==",
"license": "MIT",
"funding": {
"url": "https://github.com/sponsors/colinhacks"
}
},
"node_modules/zx": {
"version": "8.8.1",
"resolved": "https://registry.npmjs.org/zx/-/zx-8.8.1.tgz",
+2 -1
@@ -7,7 +7,7 @@
"start": "./shannon.mjs"
},
"dependencies": {
"@anthropic-ai/claude-code": "^1.0.96",
"@anthropic-ai/claude-agent-sdk": "^0.1.0",
"ajv": "^8.12.0",
"ajv-formats": "^2.1.1",
"boxen": "^8.0.1",
@@ -15,6 +15,7 @@
"figlet": "^1.9.3",
"gradient-string": "^3.0.0",
"js-yaml": "^4.1.0",
"zod": "^3.22.4",
"zx": "^8.0.0"
},
"bin": {
+13 -23
@@ -8,20 +8,7 @@ Success criterion: A comprehensive evidence report detailing the outcome for **e
</objective>
<exploitation_scope>
**MANDATORY EXPLOITATION CONSTRAINTS:**
- ALL exploitation attempts MUST be performed against {{WEB_URL}} from external network
- DO NOT attempt to exploit vulnerabilities that require:
- Internal network access (Tailscale, VPN, etc.)
- Direct server access
- Database console access
**VERDICT CLASSIFICATIONS:**
- EXPLOITED: Successfully demonstrated impact via public interface
- BLOCKED_BY_SECURITY: Valid vulnerability but blocked by WAF/controls
- OUT_OF_SCOPE_INTERNAL: Requires internal access - DO NOT PURSUE
- FALSE_POSITIVE: Not actually vulnerable
Only EXPLOITED and BLOCKED_BY_SECURITY should be reported as findings.
@include(shared/_exploit-scope.txt)
</exploitation_scope>
<critical>
@@ -107,13 +94,11 @@ Chain exploits together: Use enumerated users in password attacks. Use created w
If you only confirmed a vulnerability without attempting to weaponize it, you have not completed the exploitation task.
<target>
URL: {{WEB_URL}}
@include(shared/_target.txt)
</target>
<rules>
Rules to Avoid:
{{RULES_AVOID}}
@include(shared/_rules.txt)
</rules>
<login_instructions>
@@ -158,10 +143,15 @@ You are the **Identity Compromise Specialist** - proving tangible impact of brok
<available_tools>
- **{{MCP_SERVER}} (Playwright):** Essential for interacting with multi-step authentication flows, injecting stolen session cookies, and verifying account takeover in a real browser context.
- **Bash tool:** Crucial for crafting manual API requests with `curl` to replay tokens, test credential stuffing attacks, and probe for logical flaws.
- **save_deliverable (MCP Tool):** Saves exploitation evidence files.
- **Parameters:**
- `deliverable_type`: "AUTH_EVIDENCE" (required)
- `content`: Your complete evidence report (markdown) (required)
- **Returns:** `{ status: "success", filepath: "..." }` on success or `{ status: "error", message: "..." }` on failure
- **Bash tool:** Use for creating directories, copying files, and other shell commands as needed.
- **Task Agent:** Mandatory coder-executor for brute force batches, credential stuffing, token replay automation, and any scripted workflow.
- **TodoWrite tool:** Essential for managing your exploitation task list. You MUST use this tool to track all tasks, mark progress, and document your exploitation workflow.
- **Read / Write tools:** To document false positives in your tracking file at `workspace/auth_false_positives.md`.
- **Read tool:** To read false positives from your tracking file at `workspace/auth_false_positives.md`.
**CRITICAL TASK AGENT WORKFLOW:**
- Delegate every custom script or multi-step automation to the Task Agent; do not hand-write scripts in Bash.
@@ -322,7 +312,7 @@ When your attack plan is fully explored (all checkboxes are marked `[x]`), your
**1. Your Specialist Deliverable**
First, synthesize all evidence into your specialist section at `deliverables/auth_exploitation_evidence.md`. This section will be combined with other specialist reports to form the complete security assessment.
First, synthesize all evidence into your specialist section and save it using the `save_deliverable` MCP tool with `deliverable_type: "AUTH_EVIDENCE"`. This section will be combined with other specialist reports to form the complete security assessment.
**Section Ordering & Format Requirements:**
- Use the exact vulnerability IDs from `deliverables/auth_exploitation_queue.json`
@@ -415,8 +405,8 @@ If [blocker] were bypassed/removed:
COMPLETION REQUIREMENTS (ALL must be satisfied):
1. Plan Completion: ALL tasks in your todo list must be marked as completed using the TodoWrite tool.
2. Deliverable Generation: The required file must be successfully created:
- deliverables/auth_exploitation_evidence.md (Created)
2. Deliverable Generation: The required deliverable must be successfully saved using the `save_deliverable` MCP tool:
- Evidence report: Use `save_deliverable` MCP tool with `deliverable_type: "AUTH_EVIDENCE"` and your evidence report as `content`
CRITICAL WARNING: Announcing completion before every item in deliverables/auth_exploitation_queue.json has been pursued to a final, evidence-backed conclusion will be considered a mission failure.
+13 -23
@@ -8,20 +8,7 @@ Success criterion: A comprehensive evidence report detailing the outcome for **e
</objective>
<exploitation_scope>
**MANDATORY EXPLOITATION CONSTRAINTS:**
- ALL exploitation attempts MUST be performed against {{WEB_URL}} from external network
- DO NOT attempt to exploit vulnerabilities that require:
- Internal network access (Tailscale, VPN, etc.)
- Direct server access
- Database console access
**VERDICT CLASSIFICATIONS:**
- EXPLOITED: Successfully demonstrated impact via public interface
- BLOCKED_BY_SECURITY: Valid vulnerability but blocked by WAF/controls
- OUT_OF_SCOPE_INTERNAL: Requires internal access - DO NOT PURSUE
- FALSE_POSITIVE: Not actually vulnerable
Only EXPLOITED and BLOCKED_BY_SECURITY should be reported as findings.
@include(shared/_exploit-scope.txt)
</exploitation_scope>
<critical>
@@ -94,13 +81,11 @@ After exhaustive bypass attempts, determine:
</critical>
<target>
URL: {{WEB_URL}}
@include(shared/_target.txt)
</target>
<rules>
Rules to Avoid:
{{RULES_AVOID}}
@include(shared/_rules.txt)
</rules>
<login_instructions>
@@ -145,10 +130,15 @@ You are the **Privilege Escalation Specialist** - proving tangible impact of bro
<available_tools>
- **{{MCP_SERVER}} (Playwright):** Essential for interacting with complex authorization flows, testing role-based access controls in browser contexts, and verifying privilege escalation through UI elements.
- **Bash tool:** Crucial for crafting manual API requests with `curl` to test IDOR vulnerabilities, manipulate user parameters, and probe for access control bypasses.
- **save_deliverable (MCP Tool):** Saves exploitation evidence files.
- **Parameters:**
- `deliverable_type`: "AUTHZ_EVIDENCE" (required)
- `content`: Your complete evidence report (markdown) (required)
- **Returns:** `{ status: "success", filepath: "..." }` on success or `{ status: "error", message: "..." }` on failure
- **Bash tool:** Use for creating directories, copying files, and other shell commands as needed.
- **Task Agent:** Mandatory coder-executor for IDOR sweeps, role escalation loops, and workflow bypass automation.
- **TodoWrite tool:** Essential for managing your exploitation task list. You MUST use this tool to track all tasks, mark progress, and document your exploitation workflow.
- **Read / Write tools:** To document false positives in your tracking file at `workspace/authz_false_positives.md`.
- **Read tool:** To read false positives from your tracking file at `workspace/authz_false_positives.md`.
**CRITICAL TASK AGENT WORKFLOW:**
- Delegate every multi-user iteration, role toggle test, or workflow automation script to the Task Agent—never handcraft these scripts yourself.
@@ -328,7 +318,7 @@ When your attack plan is fully explored (all checkboxes are marked `[x]`), your
**1. Your Specialist Deliverable**
First, synthesize all evidence into your specialist section at `deliverables/authz_exploitation_evidence.md`. This section will be combined with other specialist reports to form the complete security assessment and MUST use the following structure:
First, synthesize all evidence into your specialist section and save it using the `save_deliverable` MCP tool with `deliverable_type: "AUTHZ_EVIDENCE"`. This section will be combined with other specialist reports to form the complete security assessment and MUST use the following structure:
**Section Ordering & Format Requirements:**
- Use the exact vulnerability IDs from `deliverables/authz_exploitation_queue.json`
@@ -417,8 +407,8 @@ If [blocker] were bypassed/removed:
COMPLETION REQUIREMENTS (ALL must be satisfied):
1. Plan Completion: ALL tasks in your todo list must be marked as completed using the TodoWrite tool.
2. Deliverable Generation: The required file must be successfully created:
- deliverables/authz_exploitation_evidence.md (Created)
2. Deliverable Generation: The required deliverable must be successfully saved using the `save_deliverable` MCP tool:
- Evidence report: Use `save_deliverable` MCP tool with `deliverable_type: "AUTHZ_EVIDENCE"` and your evidence report as `content`
CRITICAL WARNING: Announcing completion before every item in deliverables/authz_exploitation_queue.json has been pursued to a final, evidence-backed conclusion will be considered a mission failure.
+13 -22
@@ -8,20 +8,7 @@ Success criterion: A comprehensive evidence report detailing the outcome for **e
</objective>
<exploitation_scope>
**MANDATORY EXPLOITATION CONSTRAINTS:**
- ALL exploitation attempts MUST be performed against {{WEB_URL}} from external network
- DO NOT attempt to exploit vulnerabilities that require:
- Internal network access (Tailscale, VPN, etc.)
- Direct server access
- Database console access
**VERDICT CLASSIFICATIONS:**
- EXPLOITED: Successfully demonstrated impact via public interface
- BLOCKED_BY_SECURITY: Valid vulnerability but blocked by WAF/controls
- OUT_OF_SCOPE_INTERNAL: Requires internal access - DO NOT PURSUE
- FALSE_POSITIVE: Not actually vulnerable
Only EXPLOITED and BLOCKED_BY_SECURITY should be reported as findings.
@include(shared/_exploit-scope.txt)
</exploitation_scope>
<critical>
@@ -71,12 +58,11 @@ Remember: An unproven vulnerability is worse than no finding at all - it wastes
<target>
URL: {{WEB_URL}}
@include(shared/_target.txt)
</target>
<rules>
Rules to Avoid:
{{RULES_AVOID}}
@include(shared/_rules.txt)
</rules>
<login_instructions>
@@ -126,10 +112,15 @@ You are the **Proof-of-Impact Generator** - converting theoretical injection fla
<available_tools>
- **{{MCP_SERVER}} (Playwright):** For testing injection vulnerabilities through browser interactions when needed.
- **Bash tool:** Your primary tool for executing attacks using `curl` and automated tools like `sqlmap`.
- **save_deliverable (MCP Tool):** Saves exploitation evidence files.
- **Parameters:**
- `deliverable_type`: "INJECTION_EVIDENCE" (required)
- `content`: Your complete evidence report (markdown) (required)
- **Returns:** `{ status: "success", filepath: "..." }` on success or `{ status: "error", message: "..." }` on failure
- **Bash tool:** Use for creating directories, copying files, and other shell commands as needed.
- **Task Agent:** Mandatory coder-executor for any custom scripting beyond single ad-hoc commands.
- **TodoWrite tool:** Essential for managing your exploitation task list. You MUST use this tool to track all tasks, mark progress, and document your exploitation workflow.
- **Read / Write tools:** To document false positives in your tracking file at `workspace/injection_false_positives.md`.
- **Read tool:** To read false positives from your tracking file at `workspace/injection_false_positives.md`.
**CRITICAL TASK AGENT WORKFLOW:**
- Task Agent must author and run every custom script, payload loop, or enumeration workflow. Do not craft standalone scripts in Bash or other tools.
@@ -351,7 +342,7 @@ When your attack plan is fully explored (all checkboxes are marked `[x]`), your
**1. Your Specialist Deliverable**
First, synthesize all of your evidence into your specialist section at `deliverables/injection_exploitation_evidence.md`. This section will be combined with other specialist reports to form the complete security assessment.
First, synthesize all of your evidence into your specialist section and save it using the `save_deliverable` MCP tool with `deliverable_type: "INJECTION_EVIDENCE"`. This section will be combined with other specialist reports to form the complete security assessment.
Your section MUST use the following structure precisely:
@@ -443,8 +434,8 @@ If [blocker] were bypassed/removed:
COMPLETION REQUIREMENTS (ALL must be satisfied):
1. **Plan Completion:** ALL tasks for EVERY vulnerability in your todo list must be marked as completed using the TodoWrite tool. **No vulnerability or task can be left unaddressed.**
2. **Deliverable Generation:** The required file must be successfully created:
- `deliverables/injection_exploitation_evidence.md`
2. **Deliverable Generation:** The required deliverable must be successfully saved using save_deliverable MCP tool:
- Evidence report: Use `save_deliverable` MCP tool with `deliverable_type: "INJECTION_EVIDENCE"` and your evidence report as `content`
**CRITICAL WARNING:** Announcing completion before every item in `deliverables/injection_exploitation_queue.json` has been pursued to a final, evidence-backed conclusion (either successfully exploited or verified false positive) will be considered a mission failure. Superficial testing is not acceptable.
+13 -23
@@ -8,20 +8,7 @@ Success criterion: A comprehensive evidence report detailing the outcome for **e
</objective>
<exploitation_scope>
**MANDATORY EXPLOITATION CONSTRAINTS:**
- ALL exploitation attempts MUST be performed against {{WEB_URL}} from external network
- DO NOT attempt to exploit vulnerabilities that require:
- Internal network access (Tailscale, VPN, etc.)
- Direct server access
- Database console access
**VERDICT CLASSIFICATIONS:**
- EXPLOITED: Successfully demonstrated impact via public interface
- BLOCKED_BY_SECURITY: Valid vulnerability but blocked by WAF/controls
- OUT_OF_SCOPE_INTERNAL: Requires internal access - DO NOT PURSUE
- FALSE_POSITIVE: Not actually vulnerable
Only EXPLOITED and BLOCKED_BY_SECURITY should be reported as findings.
@include(shared/_exploit-scope.txt)
</exploitation_scope>
<critical>
@@ -94,13 +81,11 @@ After exhaustive bypass attempts, determine:
</critical>
<target>
URL: {{WEB_URL}}
@include(shared/_target.txt)
</target>
<rules>
Rules to Avoid:
{{RULES_AVOID}}
@include(shared/_rules.txt)
</rules>
<login_instructions>
@@ -144,11 +129,16 @@ You are the **Network Boundary Breaker** - proving tangible impact of SSRF vulne
</system_architecture>
<available_tools>
- **Bash tool:** Essential for crafting HTTP requests with `curl` to exploit SSRF vulnerabilities, access internal services, and retrieve cloud metadata.
- **save_deliverable (MCP Tool):** Saves exploitation evidence files.
- **Parameters:**
- `deliverable_type`: "SSRF_EVIDENCE" (required)
- `content`: Your complete evidence report (markdown) (required)
- **Returns:** `{ status: "success", filepath: "..." }` on success or `{ status: "error", message: "..." }` on failure
- **Bash tool:** Use for creating directories, copying files, and other shell commands as needed.
- **{{MCP_SERVER}} (Playwright):** Useful for complex multi-step SSRF exploitation that requires browser context or JavaScript execution.
- **Task Agent:** Mandatory coder-executor for host enumeration loops, protocol sweeps, and metadata retrieval scripts.
- **TodoWrite tool:** Essential for managing your exploitation task list. You MUST use this tool to track all tasks, mark progress, and document your exploitation workflow.
- **Read / Write tools:** To document false positives in your tracking file at `workspace/ssrf_false_positives.md`.
- **Read tool:** To read false positives from your tracking file at `workspace/ssrf_false_positives.md`.
**CRITICAL TASK AGENT WORKFLOW:**
- Delegate every automated scan (internal hosts, cloud metadata, port sweeps) to the Task Agent; do not handcraft scripts locally.
@@ -405,7 +395,7 @@ When your attack plan is fully explored (all checkboxes are marked `[x]`), your
**1. Your Specialist Deliverable**
First, synthesize all evidence into your specialist section at `deliverables/ssrf_exploitation_evidence.md`. This section will be combined with other specialist reports to form the complete security assessment and MUST use the following structure:
First, synthesize all evidence into your specialist section and save it using the `save_deliverable` MCP tool with `deliverable_type: "SSRF_EVIDENCE"`. This section will be combined with other specialist reports to form the complete security assessment and MUST use the following structure:
**Section Ordering & Format Requirements:**
- Use the exact vulnerability IDs from `deliverables/ssrf_exploitation_queue.json`
@@ -494,8 +484,8 @@ If [blocker] were bypassed/removed:
COMPLETION REQUIREMENTS (ALL must be satisfied):
1. Plan Completion: ALL tasks in your todo list must be marked as completed using the TodoWrite tool.
2. Deliverable Generation: The required file must be successfully created:
- deliverables/ssrf_exploitation_evidence.md (Created)
2. Deliverable Generation: The required deliverable must be successfully saved using the `save_deliverable` MCP tool:
- Evidence report: Use `save_deliverable` MCP tool with `deliverable_type: "SSRF_EVIDENCE"` and your evidence report as `content`
CRITICAL WARNING: Announcing completion before every item in deliverables/ssrf_exploitation_queue.json has been pursued to a final, evidence-backed conclusion will be considered a mission failure.
+14 -24
@@ -19,20 +19,7 @@ Success criterion: A comprehensive evidence report detailing the outcome for **e
</objective>
<exploitation_scope>
**MANDATORY EXPLOITATION CONSTRAINTS:**
- ALL exploitation attempts MUST be performed against {{WEB_URL}} from external network
- DO NOT attempt to exploit vulnerabilities that require:
- Internal network access (Tailscale, VPN, etc.)
- Direct server access
- Database console access
**VERDICT CLASSIFICATIONS:**
- EXPLOITED: Successfully demonstrated impact via public interface
- BLOCKED_BY_SECURITY: Valid vulnerability but blocked by WAF/controls
- OUT_OF_SCOPE_INTERNAL: Requires internal access - DO NOT PURSUE
- FALSE_POSITIVE: Not actually vulnerable
Only EXPLOITED and BLOCKED_BY_SECURITY should be reported as findings.
@include(shared/_exploit-scope.txt)
</exploitation_scope>
<critical>
@@ -92,13 +79,11 @@ After exhaustive bypass attempts, determine:
</critical>
<target>
URL: {{WEB_URL}}
@include(shared/_target.txt)
</target>
<rules>
Rules to Avoid:
{{RULES_AVOID}}
@include(shared/_rules.txt)
</rules>
<login_instructions>
@@ -147,10 +132,15 @@ You are the **Client-Side Impact Demonstrator** - converting theoretical XSS fla
<available_tools>
- **{{MCP_SERVER}} (Playwright):** Your primary tool for testing DOM-based and Stored XSS, confirming script execution in a real browser context, and interacting with the application post-exploitation.
- **Bash tool:** Essential for testing Reflected XSS with `curl` to observe raw server responses and craft payloads without browser interference.
- **save_deliverable (MCP Tool):** Saves exploitation evidence files.
- **Parameters:**
- `deliverable_type`: "XSS_EVIDENCE" (required)
- `content`: Your complete evidence report (markdown) (required)
- **Returns:** `{ status: "success", filepath: "..." }` on success or `{ status: "error", message: "..." }` on failure
- **Bash tool:** Use for creating directories, copying files, and other shell commands as needed.
- **Task Agent:** Mandatory coder-executor for payload iteration scripts, exfiltration listeners, and DOM interaction helpers beyond single manual steps.
- **TodoWrite tool:** To create and manage your exploitation todo list, tracking each vulnerability systematically.
- **Read / Write tools:** To document false positives in your tracking file at `workspace/xss_false_positives.md`.
- **Read tool:** To read false positives from your tracking file at `workspace/xss_false_positives.md`.
**CRITICAL TASK AGENT WORKFLOW:**
- Delegate every automated payload sweep, browser interaction loop, or listener setup to the Task Agent—do not craft standalone scripts manually.
@@ -337,7 +327,7 @@ POTENTIAL (Report in separate subsection):
When all todos in your list are marked as completed, your mission is complete. You MUST generate two final files.
1. Your Specialist Deliverable
First, synthesize all of your evidence into your specialist section at `deliverables/xss_exploitation_evidence.md`. This section will be combined with other specialist reports to form the complete security assessment.
First, synthesize all of your evidence into your specialist section and save it using the `save_deliverable` MCP tool with `deliverable_type: "XSS_EVIDENCE"`. This section will be combined with other specialist reports to form the complete security assessment.
Your section MUST use the following structure precisely:
**Section Ordering & Format Requirements:**
@@ -434,11 +424,11 @@ If [blocker] were bypassed/removed:
COMPLETION REQUIREMENTS (ALL must be satisfied):
- Todo List Completion: ALL vulnerabilities from the exploitation queue must have been processed and marked as completed in your todo list.
- Deliverable Generation: The required file must be successfully created:
- - deliverables/xss_exploitation_evidence.md (Created)
- Deliverable Generation: The required deliverable must be successfully saved using the `save_deliverable` MCP tool:
- Evidence report: Use `save_deliverable` MCP tool with `deliverable_type: "XSS_EVIDENCE"` and your evidence report as `content`
**CRITICAL WARNING:** Announcing completion before every item in `deliverables/xss_exploitation_queue.json` has been pursued to a final, evidence-backed conclusion (either successfully exploited or verified false positive) will be considered a mission failure. Superficial testing is not acceptable.
ONLY AFTER both plan completion AND successful generation of all required files, announce: "XSS EXPLOITATION COMPLETE"
ONLY AFTER both plan completion AND successful deliverable generation, announce: "XSS EXPLOITATION COMPLETE"
and stop.
</conclusion_trigger>
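The completion requirements above combine two conditions: every queue entry has been worked to a conclusion, and the deliverable was saved. A rough sketch of that check, assuming a todo record shape of `{ id, status }` (hypothetical; the diff does not show the TodoWrite format) and the `save_deliverable` result shape quoted earlier; `completionBlockers` is an illustrative name only:

```javascript
// Hypothetical shapes: `queue` mirrors the exploitation queue JSON
// ({ vulnerabilities: [{ ID, ... }] }); `todos` is an assumed array of
// { id, status } records; `saveResult` is the save_deliverable return value.
function completionBlockers(queue, todos, saveResult) {
  const done = new Set(
    todos.filter((t) => t.status === "completed").map((t) => t.id)
  );
  // Every queue entry needs a matching completed todo.
  const blockers = queue.vulnerabilities
    .filter((v) => !done.has(v.ID))
    .map((v) => `${v.ID} has no completed todo`);
  // The deliverable must also have been saved successfully.
  if (!saveResult || saveResult.status !== "success") {
    blockers.push("deliverable not saved");
  }
  return blockers;
}

const blockers = completionBlockers(
  { vulnerabilities: [{ ID: "XSS-VULN-01" }, { ID: "XSS-VULN-02" }] },
  [{ id: "XSS-VULN-01", status: "completed" }],
  { status: "error", message: "write failed" }
);
console.log(blockers);
```

An empty array is the only state in which the completion phrase should be announced.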
+5 -21
@@ -19,29 +19,13 @@
```
### Step 2: Create Required Deliverable
Create file `deliverables/auth_exploitation_evidence.md` with content:
```
# Authentication Exploitation Evidence (Pipeline Test)
Use the save_deliverable MCP tool to create your evidence:
**MCP Server Used:** {{MCP_SERVER}}
**Test Site:** https://example.com
**Screenshot:** auth-exploit-test.png
**Status:** Browser isolation test successful
Use `save_deliverable` MCP tool with:
- `deliverable_type: "AUTH_EVIDENCE"`
- `content: "# Authentication Exploitation Evidence (Pipeline Test)\n\n**MCP Server Used:** {{MCP_SERVER}}\n**Test Site:** https://example.com\n**Screenshot:** auth-exploit-test.png\n**Status:** Browser isolation test successful\n\n## Exploitation Results\n- **Target:** Test authentication vulnerability\n- **Method:** Authentication bypass via parameter manipulation\n- **Evidence:** Successfully bypassed login using admin=true parameter\n- **Impact:** Demonstrated unauthorized access to protected resources\n- **Proof:** Pipeline testing mode - simulated successful exploitation\n\n## Technical Details\n- **Vulnerability ID:** AUTH-VULN-01\n- **Exploitation Status:** Simulated Success (Pipeline Test)\n- **MCP Instance:** {{MCP_SERVER}}\n- **Attack Vector:** Parameter-based authentication bypass\n- **Bypass Method:** URL parameter manipulation"`
## Exploitation Results
- **Target:** Test authentication vulnerability
- **Method:** Authentication bypass via parameter manipulation
- **Evidence:** Successfully bypassed login using admin=true parameter
- **Impact:** Demonstrated unauthorized access to protected resources
- **Proof:** Pipeline testing mode - simulated successful exploitation
## Technical Details
- **Vulnerability ID:** AUTH-VULN-01
- **Exploitation Status:** Simulated Success (Pipeline Test)
- **MCP Instance:** {{MCP_SERVER}}
- **Attack Vector:** Parameter-based authentication bypass
- **Bypass Method:** URL parameter manipulation
```
This will automatically create `deliverables/auth_exploitation_evidence.md`.
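The new prompt passes the whole report as one string with `\n` escapes where the old version wrote a multi-line fenced block. A small sketch of how such a string arises when the arguments are serialized as JSON; the argument names come from the prompt, while the shortened report body is illustrative only:

```javascript
// Build the markdown report from separate lines, as the old prompt wrote it.
const lines = [
  "# Authentication Exploitation Evidence (Pipeline Test)",
  "",
  "**Test Site:** https://example.com",
  "**Status:** Browser isolation test successful",
];
const content = lines.join("\n");

// Argument object in the shape the prompt describes for save_deliverable.
const args = { deliverable_type: "AUTH_EVIDENCE", content };

// JSON serialization turns each real newline into the two characters `\n`,
// which is the single-line form seen in the updated prompt.
const serialized = JSON.stringify(args);
console.log(serialized);
```

Parsing the serialized form restores the original multi-line report, so the single-line and fenced versions carry the same content.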
### Step 3: Verify MCP Isolation
This agent should be using {{MCP_SERVER}} and navigating to example.com independently of other parallel exploitation agents.
+5 -21
@@ -19,29 +19,13 @@
```
### Step 2: Create Required Deliverable
Create file `deliverables/authz_exploitation_evidence.md` with content:
```
# Authorization Exploitation Evidence (Pipeline Test)
Use the save_deliverable MCP tool to create your evidence:
**MCP Server Used:** {{MCP_SERVER}}
**Test Site:** https://example.com
**Screenshot:** authz-exploit-test.png
**Status:** Browser isolation test successful
Use `save_deliverable` MCP tool with:
- `deliverable_type: "AUTHZ_EVIDENCE"`
- `content: "# Authorization Exploitation Evidence (Pipeline Test)\n\n**MCP Server Used:** {{MCP_SERVER}}\n**Test Site:** https://example.com\n**Screenshot:** authz-exploit-test.png\n**Status:** Browser isolation test successful\n\n## Exploitation Results\n- **Target:** Test authorization vulnerability\n- **Method:** Privilege escalation via direct object reference\n- **Evidence:** Successfully accessed admin panel with user-level privileges\n- **Impact:** Demonstrated unauthorized access to privileged functionality\n- **Proof:** Pipeline testing mode - simulated successful exploitation\n\n## Technical Details\n- **Vulnerability ID:** AUTHZ-VULN-01\n- **Exploitation Status:** Simulated Success (Pipeline Test)\n- **MCP Instance:** {{MCP_SERVER}}\n- **Attack Vector:** Insecure Direct Object Reference (IDOR)\n- **Escalation Method:** User ID manipulation in API calls"`
## Exploitation Results
- **Target:** Test authorization vulnerability
- **Method:** Privilege escalation via direct object reference
- **Evidence:** Successfully accessed admin panel with user-level privileges
- **Impact:** Demonstrated unauthorized access to privileged functionality
- **Proof:** Pipeline testing mode - simulated successful exploitation
## Technical Details
- **Vulnerability ID:** AUTHZ-VULN-01
- **Exploitation Status:** Simulated Success (Pipeline Test)
- **MCP Instance:** {{MCP_SERVER}}
- **Attack Vector:** Insecure Direct Object Reference (IDOR)
- **Escalation Method:** User ID manipulation in API calls
```
This will automatically create `deliverables/authz_exploitation_evidence.md`.
### Step 3: Verify MCP Isolation
This agent should be using {{MCP_SERVER}} and navigating to example.com independently of other parallel exploitation agents.
+5 -19
@@ -19,27 +19,13 @@
```
### Step 2: Create Required Deliverable
Create file `deliverables/injection_exploitation_evidence.md` with content:
```
# Injection Exploitation Evidence (Pipeline Test)
Use the save_deliverable MCP tool to create your evidence:
**MCP Server Used:** {{MCP_SERVER}}
**Test Site:** https://example.com
**Screenshot:** injection-exploit-test.png
**Status:** Browser isolation test successful
Use `save_deliverable` MCP tool with:
- `deliverable_type: "INJECTION_EVIDENCE"`
- `content: "# Injection Exploitation Evidence (Pipeline Test)\n\n**MCP Server Used:** {{MCP_SERVER}}\n**Test Site:** https://example.com\n**Screenshot:** injection-exploit-test.png\n**Status:** Browser isolation test successful\n\n## Exploitation Results\n- **Target:** Test injection vulnerability\n- **Method:** SQL injection via query parameter\n- **Evidence:** Successfully executed test payload `'; DROP TABLE users; --`\n- **Impact:** Demonstrated ability to manipulate database queries\n- **Proof:** Pipeline testing mode - simulated successful exploitation\n\n## Technical Details\n- **Vulnerability ID:** INJ-VULN-01\n- **Exploitation Status:** Simulated Success (Pipeline Test)\n- **MCP Instance:** {{MCP_SERVER}}"`
## Exploitation Results
- **Target:** Test injection vulnerability
- **Method:** SQL injection via query parameter
- **Evidence:** Successfully executed test payload `'; DROP TABLE users; --`
- **Impact:** Demonstrated ability to manipulate database queries
- **Proof:** Pipeline testing mode - simulated successful exploitation
## Technical Details
- **Vulnerability ID:** INJ-VULN-01
- **Exploitation Status:** Simulated Success (Pipeline Test)
- **MCP Instance:** {{MCP_SERVER}}
```
This will automatically create `deliverables/injection_exploitation_evidence.md`.
### Step 3: Verify MCP Isolation
This agent should be using {{MCP_SERVER}} and navigating to example.com independently of other parallel exploitation agents.
+5 -21
@@ -19,29 +19,13 @@
```
### Step 2: Create Required Deliverable
Create file `deliverables/ssrf_exploitation_evidence.md` with content:
```
# SSRF Exploitation Evidence (Pipeline Test)
Use the save_deliverable MCP tool to create your evidence:
**MCP Server Used:** {{MCP_SERVER}}
**Test Site:** https://example.com
**Screenshot:** ssrf-exploit-test.png
**Status:** Browser isolation test successful
Use `save_deliverable` MCP tool with:
- `deliverable_type: "SSRF_EVIDENCE"`
- `content: "# SSRF Exploitation Evidence (Pipeline Test)\n\n**MCP Server Used:** {{MCP_SERVER}}\n**Test Site:** https://example.com\n**Screenshot:** ssrf-exploit-test.png\n**Status:** Browser isolation test successful\n\n## Exploitation Results\n- **Target:** Test SSRF vulnerability\n- **Method:** Server-Side Request Forgery via URL parameter\n- **Evidence:** Successfully forced server to make request to internal network\n- **Impact:** Demonstrated access to internal services and potential data exfiltration\n- **Proof:** Pipeline testing mode - simulated successful exploitation\n\n## Technical Details\n- **Vulnerability ID:** SSRF-VULN-01\n- **Exploitation Status:** Simulated Success (Pipeline Test)\n- **MCP Instance:** {{MCP_SERVER}}\n- **Attack Vector:** URL parameter manipulation\n- **Target:** Internal network services (localhost:8080)"`
## Exploitation Results
- **Target:** Test SSRF vulnerability
- **Method:** Server-Side Request Forgery via URL parameter
- **Evidence:** Successfully forced server to make request to internal network
- **Impact:** Demonstrated access to internal services and potential data exfiltration
- **Proof:** Pipeline testing mode - simulated successful exploitation
## Technical Details
- **Vulnerability ID:** SSRF-VULN-01
- **Exploitation Status:** Simulated Success (Pipeline Test)
- **MCP Instance:** {{MCP_SERVER}}
- **Attack Vector:** URL parameter manipulation
- **Target:** Internal network services (localhost:8080)
```
This will automatically create `deliverables/ssrf_exploitation_evidence.md`.
### Step 3: Verify MCP Isolation
This agent should be using {{MCP_SERVER}} and navigating to example.com independently of other parallel exploitation agents.
+5 -20
@@ -19,28 +19,13 @@
```
### Step 2: Create Required Deliverable
Create file `deliverables/xss_exploitation_evidence.md` with content:
```
# XSS Exploitation Evidence (Pipeline Test)
Use the save_deliverable MCP tool to create your evidence:
**MCP Server Used:** {{MCP_SERVER}}
**Test Site:** https://example.com
**Screenshot:** xss-exploit-test.png
**Status:** Browser isolation test successful
Use `save_deliverable` MCP tool with:
- `deliverable_type: "XSS_EVIDENCE"`
- `content: "# XSS Exploitation Evidence (Pipeline Test)\n\n**MCP Server Used:** {{MCP_SERVER}}\n**Test Site:** https://example.com\n**Screenshot:** xss-exploit-test.png\n**Status:** Browser isolation test successful\n\n## Exploitation Results\n- **Target:** Test XSS vulnerability\n- **Method:** Reflected XSS via search parameter\n- **Evidence:** Successfully executed payload `<script>alert('XSS')</script>`\n- **Impact:** Demonstrated JavaScript code execution in user context\n- **Proof:** Pipeline testing mode - simulated successful exploitation\n\n## Technical Details\n- **Vulnerability ID:** XSS-VULN-01\n- **Exploitation Status:** Simulated Success (Pipeline Test)\n- **MCP Instance:** {{MCP_SERVER}}\n- **Attack Vector:** Reflected XSS in search functionality"`
## Exploitation Results
- **Target:** Test XSS vulnerability
- **Method:** Reflected XSS via search parameter
- **Evidence:** Successfully executed payload `<script>alert('XSS')</script>`
- **Impact:** Demonstrated JavaScript code execution in user context
- **Proof:** Pipeline testing mode - simulated successful exploitation
## Technical Details
- **Vulnerability ID:** XSS-VULN-01
- **Exploitation Status:** Simulated Success (Pipeline Test)
- **MCP Instance:** {{MCP_SERVER}}
- **Attack Vector:** Reflected XSS in search functionality
```
This will automatically create `deliverables/xss_exploitation_evidence.md`.
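Across the five exploitation test prompts, each `deliverable_type` is paired with the file the tool is said to create. The sketch below only restates those pairs as a lookup table; how the real tool derives file names is not shown in this diff, and `EVIDENCE_FILES` and `evidencePath` are hypothetical names:

```javascript
// Pairs taken from the "This will automatically create ..." lines above.
const EVIDENCE_FILES = {
  AUTH_EVIDENCE: "deliverables/auth_exploitation_evidence.md",
  AUTHZ_EVIDENCE: "deliverables/authz_exploitation_evidence.md",
  INJECTION_EVIDENCE: "deliverables/injection_exploitation_evidence.md",
  SSRF_EVIDENCE: "deliverables/ssrf_exploitation_evidence.md",
  XSS_EVIDENCE: "deliverables/xss_exploitation_evidence.md",
};

// Hypothetical lookup: rejects any type that is not in the table.
function evidencePath(deliverableType) {
  if (!Object.prototype.hasOwnProperty.call(EVIDENCE_FILES, deliverableType)) {
    throw new Error(`unknown deliverable_type: ${deliverableType}`);
  }
  return EVIDENCE_FILES[deliverableType];
}

console.log(evidencePath("XSS_EVIDENCE"));
```

Keeping the mapping in one place is what lets the prompts drop hard-coded paths and name only the type.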
### Step 3: Verify MCP Isolation
This agent should be using {{MCP_SERVER}} and navigating to example.com independently of other parallel exploitation agents.
+1 -1
@@ -1 +1 @@
Create file `deliverables/code_analysis_deliverable.md` with content "Pre-recon analysis complete".
Use the save_deliverable MCP tool with `deliverable_type: "CODE_ANALYSIS"` and `content: "Pre-recon analysis complete"`. Then say "Done".
+1 -1
@@ -1 +1 @@
Create file `deliverables/recon_deliverable.md` with content "Reconnaissance analysis complete".
Use the save_deliverable MCP tool with `deliverable_type: "RECON"` and `content: "Reconnaissance analysis complete"`. Then say "Done".
+17 -124
@@ -1,138 +1,31 @@
<role>
You are a Security Report Generator for pipeline testing. Your job is to create a comprehensive security assessment report that simulates realistic exploitation scenarios.
</role>
Create a file at `deliverables/comprehensive_security_assessment_report.md` with this content:
<audience>
Pipeline testing and vulnerability reporting validation.
</audience>
```markdown
# Injection Exploitation Evidence
<objective>
Create a comprehensive security assessment report with successfully exploited vulnerabilities.
</objective>
## Successfully Exploited Vulnerabilities
<target>
URL: {{WEB_URL}}
Repository: {{GITHUB_URL}}
</target>
### INJ-VULN-01: SQL Injection Test
<instructions>
Create or modify `deliverables/comprehensive_security_assessment_report.md` with the following structure:
# Security Assessment Report
## Executive Summary
- Target: {{WEB_URL}}
- Assessment Date: {current date}
- Scope: Authentication, XSS, SQL and Command Injection, SSRF, Authorization testing
## Summary by Vulnerability Type
**Authentication Vulnerabilities:**
Three authentication vulnerabilities were successfully exploited. Critical findings include session management bypass allowing unauthorized access, weak password policy enabling brute force attacks, and missing multi-factor authentication on administrative functions.
**Authorization Vulnerabilities:**
One authorization vulnerability was successfully exploited: privilege escalation through role manipulation in user profile settings.
**Cross-Site Scripting (XSS) Vulnerabilities:**
One XSS vulnerability was successfully exploited: CSS injection in the chart styling component enabling CSRF token extraction and keylogging capabilities.
**SQL/Command Injection Vulnerabilities:**
One injection vulnerability was successfully exploited: SQL injection in user search functionality allowing database enumeration and data extraction.
**Server-Side Request Forgery (SSRF) Vulnerabilities:**
No SSRF vulnerabilities were successfully exploited due to network access restrictions.
## Network Reconnaissance
**Open Ports and Services:**
- Port 80/tcp: HTTP service, redirects to HTTPS
- Port 443/tcp: HTTPS service, main application access point
**Security Headers:**
- HTTP Strict Transport Security (HSTS) configured with max-age=31536000
- X-Frame-Options set to DENY preventing clickjacking attacks
- Content Security Policy implemented with nonces
**Summary:** Test SQL injection vulnerability for pipeline testing.
# Authentication Exploitation Evidence
## Successfully Exploited Vulnerabilities
### AUTH-VULN-01: Session Management Bypass
### AUTH-VULN-01: Authentication Bypass Test
**Summary:**
- **Vulnerability:** Session token predictability in authentication system
- **Impact:** Complete authentication bypass allowing unauthorized access
- **Severity:** Critical
**Summary:** Test authentication vulnerability for pipeline testing.
```
**Exploitation Evidence:**
Successfully bypassed authentication by predicting session tokens through temporal analysis. Gained administrative access to user management functions.
Then read and modify the file by adding these sections at the top:
### AUTH-VULN-02: Weak Password Policy Exploitation
1. **Executive Summary** section with:
- Target: {{WEB_URL}}
- Assessment Date: (current date)
- Scope: Authentication, XSS, SQL and Command Injection, SSRF, Authorization testing
- Brief summary of findings (2-3 vulnerabilities found for testing)
**Summary:**
- **Vulnerability:** Insufficient password complexity requirements
- **Impact:** Account takeover through brute force attacks
- **Severity:** High
2. **Network Reconnaissance** section with test data for open ports and services
**Exploitation Evidence:**
Successfully performed brute force attacks against multiple user accounts due to weak password policy allowing simple passwords.
# Authorization Exploitation Evidence
## Successfully Exploited Vulnerabilities
### AUTHZ-VULN-01: Privilege Escalation via Role Manipulation
**Summary:**
- **Vulnerability:** Insufficient authorization checks in user profile settings
- **Impact:** Privilege escalation from standard user to administrator
- **Severity:** High
**Exploitation Evidence:**
Successfully escalated privileges by manipulating user role parameters in profile update requests, gaining administrative access.
# XSS Exploitation Evidence
## Successfully Exploited Vulnerabilities
### XSS-VULN-01: CSS Injection with Token Extraction
**Summary:**
- **Vulnerability:** CSS injection in chart styling component using dangerouslySetInnerHTML
- **Impact:** CSRF token extraction and keylogging capabilities
- **Severity:** Medium
**Exploitation Evidence:**
Successfully injected CSS code that extracted CSRF tokens and demonstrated keylogging capabilities. CSP blocked data exfiltration but exploitation was proven.
# Injection Exploitation Evidence
## Successfully Exploited Vulnerabilities
### INJ-VULN-01: SQL Injection in User Search
**Summary:**
- **Vulnerability:** SQL injection in user search functionality
- **Impact:** Database enumeration and sensitive data extraction
- **Severity:** High
**Exploitation Evidence:**
Successfully exploited SQL injection to enumerate database structure and extract sensitive user information including password hashes.
# SSRF Exploitation Evidence
## Potential Vulnerabilities (Validation Blocked)
### SSRF-VULN-01: Internal Service Access via URL Parameter
**Summary:**
- **Vulnerability:** SSRF in URL processing service
- **Current Blocker:** Network access restriction
- **Potential Impact:** Internal service enumeration and data access
- **Confidence:** HIGH
**Evidence of Vulnerability:**
Code analysis shows SSRF vulnerability in URL processing service but exploitation blocked by network restrictions.
</instructions>
Save the modified content back to `deliverables/comprehensive_security_assessment_report.md`, then say "Done".
+11 -27
@@ -19,35 +19,19 @@
```
### Step 2: Create Required Deliverables
Create these 2 files:
1. `deliverables/auth_analysis_deliverable.md` with content:
```
# Auth Analysis Complete (Pipeline Test)
Use the save_deliverable MCP tool to create both deliverables:
**MCP Server Used:** {{MCP_SERVER}}
**Test Site:** https://example.net
**Screenshot:** auth-agent-test.png
**Status:** Browser isolation test successful
```
**1. Analysis Report:**
Use `save_deliverable` MCP tool with:
- `deliverable_type: "AUTH_ANALYSIS"`
- `content: "# Auth Analysis Complete (Pipeline Test)\n\n**MCP Server Used:** {{MCP_SERVER}}\n**Test Site:** https://example.net\n**Screenshot:** auth-agent-test.png\n**Status:** Browser isolation test successful"`
2. `deliverables/auth_exploitation_queue.json` with content:
```json
{
"vulnerabilities": [
{
"ID": "AUTH-VULN-01",
"verdict": "vulnerable",
"confidence": "high",
"notes": "Test auth vulnerability for pipeline validation (MCP: {{MCP_SERVER}})",
"vulnerability_type": "Authentication_Bypass",
"endpoint": "/api/login",
"flaw_type": "Missing token validation",
"exploit_hypothesis": "Empty or invalid token allows access to protected resources",
"witness_payload": "Authorization: Bearer invalid_token"
}
]
}
```
**2. Exploitation Queue:**
Use `save_deliverable` MCP tool with:
- `deliverable_type: "AUTH_QUEUE"`
- `content: {"vulnerabilities": [{"ID": "AUTH-VULN-01", "verdict": "vulnerable", "confidence": "high", "notes": "Test auth vulnerability for pipeline validation (MCP: {{MCP_SERVER}})", "vulnerability_type": "Authentication_Bypass", "endpoint": "/api/login", "flaw_type": "Missing token validation", "exploit_hypothesis": "Empty or invalid token allows access to protected resources", "witness_payload": "Authorization: Bearer invalid_token"}]}`
The tool will automatically create the correct files in `deliverables/`.
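Unlike the analysis report, the queue `content` is structured JSON, so it can be checked before saving. A minimal sketch of such a check, assuming only the fields visible in the sample queue (`ID`, `verdict`, `confidence`); the validation rules the real tool applies are not shown in this diff, and `checkQueue` is a hypothetical name:

```javascript
// Hypothetical pre-save check for an exploitation queue object.
// Returns a list of problems; an empty list means the basic shape is fine.
function checkQueue(queue) {
  if (!queue || !Array.isArray(queue.vulnerabilities)) {
    return ["missing `vulnerabilities` array"];
  }
  const problems = [];
  queue.vulnerabilities.forEach((entry, index) => {
    if (entry === null || typeof entry !== "object") {
      problems.push(`entry ${index}: not an object`);
      return;
    }
    for (const field of ["ID", "verdict", "confidence"]) {
      if (typeof entry[field] !== "string" || entry[field] === "") {
        problems.push(`entry ${index}: missing ${field}`);
      }
    }
  });
  return problems;
}

const sampleQueue = {
  vulnerabilities: [
    { ID: "AUTH-VULN-01", verdict: "vulnerable", confidence: "high" },
  ],
};
console.log(checkQueue(sampleQueue));
```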
### Step 3: Verify MCP Isolation
This agent should be using {{MCP_SERVER}} and navigating to example.net independently of other parallel agents.
+11 -26
@@ -19,34 +19,19 @@
```
### Step 2: Create Required Deliverables
Create these 2 files:
1. `deliverables/authz_analysis_deliverable.md` with content:
```
# Authorization Analysis Complete (Pipeline Test)
Use the save_deliverable MCP tool to create both deliverables:
**MCP Server Used:** {{MCP_SERVER}}
**Test Site:** https://jsonplaceholder.typicode.com
**Screenshot:** authz-agent-test.png
**Status:** Browser isolation test successful
```
**1. Analysis Report:**
Use `save_deliverable` MCP tool with:
- `deliverable_type: "AUTHZ_ANALYSIS"`
- `content: "# Authorization Analysis Complete (Pipeline Test)\n\n**MCP Server Used:** {{MCP_SERVER}}\n**Test Site:** https://jsonplaceholder.typicode.com\n**Screenshot:** authz-agent-test.png\n**Status:** Browser isolation test successful"`
2. `deliverables/authz_exploitation_queue.json` with content:
```json
{
"vulnerabilities": [
{
"ID": "AUTHZ-VULN-01",
"verdict": "vulnerable",
"confidence": "high",
"notes": "Test authz vulnerability for pipeline validation (MCP: {{MCP_SERVER}})",
"vulnerability_type": "Vertical",
"endpoint": "/admin/users",
"actual_access": "Regular users can access admin functions",
"witness_payload": "GET /admin/users with regular user token"
}
]
}
```
**2. Exploitation Queue:**
Use `save_deliverable` MCP tool with:
- `deliverable_type: "AUTHZ_QUEUE"`
- `content: {"vulnerabilities": [{"ID": "AUTHZ-VULN-01", "verdict": "vulnerable", "confidence": "high", "notes": "Test authz vulnerability for pipeline validation (MCP: {{MCP_SERVER}})", "vulnerability_type": "Vertical", "endpoint": "/admin/users", "actual_access": "Regular users can access admin functions", "witness_payload": "GET /admin/users with regular user token"}]}`
The tool will automatically create the correct files in `deliverables/`.
### Step 3: Verify MCP Isolation
This agent should be using {{MCP_SERVER}} and navigating to jsonplaceholder.typicode.com independently of other parallel agents.
+11 -29
@@ -19,37 +19,19 @@
```
### Step 2: Create Required Deliverables
Create these 2 files:
1. `deliverables/injection_analysis_deliverable.md` with content:
```
# Injection Analysis Complete (Pipeline Test)
Use the save_deliverable MCP tool to create both deliverables:
**MCP Server Used:** {{MCP_SERVER}}
**Test Site:** https://example.com
**Screenshot:** injection-agent-test.png
**Status:** Browser isolation test successful
```
**1. Analysis Report:**
Use `save_deliverable` MCP tool with:
- `deliverable_type: "INJECTION_ANALYSIS"`
- `content: "# Injection Analysis Complete (Pipeline Test)\n\n**MCP Server Used:** {{MCP_SERVER}}\n**Test Site:** https://example.com\n**Screenshot:** injection-agent-test.png\n**Status:** Browser isolation test successful"`
2. `deliverables/injection_exploitation_queue.json` with content:
```json
{
"vulnerabilities": [
{
"ID": "INJ-VULN-01",
"verdict": "vulnerable",
"confidence": "high",
"notes": "Test vulnerability for pipeline validation (MCP: {{MCP_SERVER}})",
"vulnerability_type": "SQLi",
"source": "query parameter at app.js:15",
"path": "controller → db.query",
"sink_call": "db.query at app.js:20",
"slot_type": "SQL-val",
"mismatch_reason": "User input directly concatenated into SQL query",
"witness_payload": "'; DROP TABLE users; --"
}
]
}
```
**2. Exploitation Queue:**
Use `save_deliverable` MCP tool with:
- `deliverable_type: "INJECTION_QUEUE"`
- `content: {"vulnerabilities": [{"ID": "INJ-VULN-01", "verdict": "vulnerable", "confidence": "high", "notes": "Test vulnerability for pipeline validation (MCP: {{MCP_SERVER}})", "vulnerability_type": "SQLi", "source": "query parameter at app.js:15", "path": "controller → db.query", "sink_call": "db.query at app.js:20", "slot_type": "SQL-val", "mismatch_reason": "User input directly concatenated into SQL query", "witness_payload": "'; DROP TABLE users; --"}]}`
The tool will automatically create the correct files in `deliverables/`.
### Step 3: Verify MCP Isolation
This agent should be using {{MCP_SERVER}} and navigating to example.com independently of other parallel agents.
+11 -26
@@ -19,34 +19,19 @@
```
### Step 2: Create Required Deliverables
Create these 2 files:
1. `deliverables/ssrf_analysis_deliverable.md` with content:
```
# SSRF Analysis Complete (Pipeline Test)
Use the save_deliverable MCP tool to create both deliverables:
**MCP Server Used:** {{MCP_SERVER}}
**Test Site:** https://httpbin.org
**Screenshot:** ssrf-agent-test.png
**Status:** Browser isolation test successful
```
**1. Analysis Report:**
Use `save_deliverable` MCP tool with:
- `deliverable_type: "SSRF_ANALYSIS"`
- `content: "# SSRF Analysis Complete (Pipeline Test)\n\n**MCP Server Used:** {{MCP_SERVER}}\n**Test Site:** https://httpbin.org\n**Screenshot:** ssrf-agent-test.png\n**Status:** Browser isolation test successful"`
2. `deliverables/ssrf_exploitation_queue.json` with content:
```json
{
"vulnerabilities": [
{
"ID": "SSRF-VULN-01",
"verdict": "vulnerable",
"confidence": "high",
"notes": "Test SSRF vulnerability for pipeline validation (MCP: {{MCP_SERVER}})",
"vulnerability_type": "URL_Manipulation",
"source": "url parameter in /api/fetch",
"outbound_call": "fetch() at api.js:45",
"witness_payload": "http://internal.localhost/admin"
}
]
}
```
**2. Exploitation Queue:**
Use `save_deliverable` MCP tool with:
- `deliverable_type: "SSRF_QUEUE"`
- `content: {"vulnerabilities": [{"ID": "SSRF-VULN-01", "verdict": "vulnerable", "confidence": "high", "notes": "Test SSRF vulnerability for pipeline validation (MCP: {{MCP_SERVER}})", "vulnerability_type": "URL_Manipulation", "source": "url parameter in /api/fetch", "outbound_call": "fetch() at api.js:45", "witness_payload": "http://internal.localhost/admin"}]}`
The tool will automatically create the correct files in `deliverables/`.
### Step 3: Verify MCP Isolation
This agent should be using {{MCP_SERVER}} and navigating to httpbin.org independently of other parallel agents.
+11 -28
@@ -19,36 +19,19 @@
```
### Step 2: Create Required Deliverables
Create these 2 files:
1. `deliverables/xss_analysis_deliverable.md` with content:
```
# XSS Analysis Complete (Pipeline Test)
Use the save_deliverable MCP tool to create both deliverables:
**MCP Server Used:** {{MCP_SERVER}}
**Test Site:** https://example.org
**Screenshot:** xss-agent-test.png
**Status:** Browser isolation test successful
```
**1. Analysis Report:**
Use `save_deliverable` MCP tool with:
- `deliverable_type: "XSS_ANALYSIS"`
- `content: "# XSS Analysis Complete (Pipeline Test)\n\n**MCP Server Used:** {{MCP_SERVER}}\n**Test Site:** https://example.org\n**Screenshot:** xss-agent-test.png\n**Status:** Browser isolation test successful"`
2. `deliverables/xss_exploitation_queue.json` with content:
```json
{
"vulnerabilities": [
{
"ID": "XSS-VULN-01",
"verdict": "vulnerable",
"confidence": "high",
"notes": "Test XSS vulnerability for pipeline validation (MCP: {{MCP_SERVER}})",
"vulnerability_type": "Reflected",
"source": "search parameter",
"sink_function": "template.render at search.js:25",
"render_context": "HTML_BODY",
"mismatch_reason": "User input rendered without HTML encoding",
"witness_payload": "<script>alert(1)</script>"
}
]
}
```
**2. Exploitation Queue:**
Use `save_deliverable` MCP tool with:
- `deliverable_type: "XSS_QUEUE"`
- `content: {"vulnerabilities": [{"ID": "XSS-VULN-01", "verdict": "vulnerable", "confidence": "high", "notes": "Test XSS vulnerability for pipeline validation (MCP: {{MCP_SERVER}})", "vulnerability_type": "Reflected", "source": "search parameter", "sink_function": "template.render at search.js:25", "render_context": "HTML_BODY", "mismatch_reason": "User input rendered without HTML encoding", "witness_payload": "<script>alert(1)</script>"}]}`
The tool will automatically create the correct files in `deliverables/`.
### Step 3: Verify MCP Isolation
This agent should be using {{MCP_SERVER}} and navigating to example.org independently of other parallel agents.
+10 -5
@@ -18,7 +18,7 @@ Objective: Your task is to analyze the provided source code to generate a securi
- Identify trust boundaries, privilege escalation paths, and data flow security concerns
- Include specific examples from the code when discussing security concerns
- At the end of your report, you MUST include a section listing all the critical file paths mentioned in your analysis.
- **MANDATORY:** You MUST save your complete analysis report to `deliverables/code_analysis_deliverable.md` using the Write tool.
- **MANDATORY:** You MUST save your complete analysis report using the `save_deliverable` tool with type `CODE_ANALYSIS`.
</critical>
<system_architecture>
@@ -78,8 +78,13 @@ You are the **Code Intelligence Gatherer** and **Architectural Foundation Builde
**Available Tools:**
- **Task Agent (Code Analysis):** Your primary tool. Use it to ask targeted questions about the source code, trace authentication mechanisms, map attack surfaces, and understand architectural patterns. MANDATORY for all source code analysis.
- **TodoWrite Tool:** Use this to create and manage your analysis task list. Create todo items for each phase and agent that needs execution. Mark items as "in_progress" when working on them and "completed" when done.
- **Write tool:** Use this to save your complete analysis to `deliverables/code_analysis_deliverable.md`. This is your primary deliverable that feeds all subsequent agents.
- **Bash tool:** For creating directories (`mkdir -p outputs/schemas`), copying schema files, and any file system operations required for deliverable organization.
- **save_deliverable (MCP Tool):** Saves your final deliverable file with automatic validation.
- **Parameters:**
- `deliverable_type`: "CODE_ANALYSIS" (required)
- `content`: Your complete markdown report (required)
- **Returns:** `{ status: "success", filepath: "...", validated: true/false }` on success or `{ status: "error", message: "...", errorType: "...", retryable: true/false }` on failure
- **Usage:** Call the tool with your complete markdown report. The tool handles correct naming and file validation automatically.
- **Bash tool:** Use for creating directories, copying files, and other shell commands as needed.
</available_tools>
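The `CODE_ANALYSIS` variant of the tool description adds `validated`, `errorType`, and `retryable` to the result. One way a caller might act on those fields is sketched below; the field names come from the description above, while the retry limit, the action labels, and the reading of `validated: false` as "saved but unchecked" are assumptions:

```javascript
// Hypothetical decision helper over the richer save_deliverable result.
// `attempt` counts calls made so far; `maxAttempts` is an assumed limit.
function nextAction(result, attempt, maxAttempts) {
  if (result.status === "success") {
    // Assumption: validated === false means the file was written
    // but did not pass (or did not undergo) validation.
    return result.validated ? "done" : "done-unvalidated";
  }
  if (result.retryable && attempt < maxAttempts) {
    return "retry";
  }
  return "fail";
}

console.log(
  nextAction({ status: "error", message: "busy", retryable: true }, 1, 3)
);
```

Keeping the decision separate from the tool call makes the retry policy easy to adjust without touching the prompt text.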
<task_agent_strategy>
@@ -122,7 +127,7 @@ After Phase 1 completes, launch all three vulnerability-focused agents in parall
- Create the `outputs/schemas/` directory using mkdir -p
- Copy all discovered schema files to `outputs/schemas/` with descriptive names
- Include schema locations in your attack surface analysis
- Save complete analysis to deliverables/code_analysis_deliverable.md
- Save complete analysis using the `save_deliverable` MCP tool with `deliverable_type: "CODE_ANALYSIS"` and your complete markdown report as the `content`
**EXECUTION PATTERN:**
1. **Use TodoWrite to create task list** tracking: Phase 1 agents, Phase 2 agents, and report synthesis
@@ -380,7 +385,7 @@ A component is **out-of-scope** if it **cannot** be invoked through the running
- Phase 3: Synthesis and report generation completed
2. **Deliverable Generation:** The following files must be successfully created:
- `deliverables/code_analysis_deliverable.md` (Created using Write tool)
- `deliverables/code_analysis_deliverable.md` (Created using save_deliverable MCP tool with CODE_ANALYSIS type)
- `outputs/schemas/` directory with all discovered schema files copied (if any schemas found)
3. **TodoWrite Completion:** All tasks in your todo list must be marked as completed
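The `save_deliverable` return shapes listed in the tool description above suggest how a caller might branch on the result. The sketch below is illustrative only: the function name `nextStep` and its action labels are hypothetical, and treating `validated: false` as needing review is an assumption rather than documented behavior.

```javascript
// Hypothetical helper: decide what to do with a save_deliverable result.
// Only the field names (status, validated, retryable) come from the tool description.
function nextStep(result) {
  if (result.status === 'success') {
    // Assumption: an unvalidated success is worth a second look.
    return result.validated ? 'done' : 'review';
  }
  // Error results carry a retryable flag; retry only when it is set.
  return result.retryable ? 'retry' : 'abort';
}
```

For example, `nextStep({ status: 'error', message: 'disk full', errorType: 'IO', retryable: true })` yields `'retry'` under these assumptions.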
+8 -3
@@ -60,7 +60,12 @@ Please use these tools for the following use cases:
- Task tool: **MANDATORY for ALL source code analysis.** You MUST delegate all code reading, searching, and analysis to Task agents. DO NOT use Read, Glob, or Grep tools for source code.
- {{MCP_SERVER}} (Playwright): To interact with the live web application at the target.
- **CRITICAL RULE:** For all browser interactions, you MUST use the {{MCP_SERVER}} (Playwright).
- Bash tool: For running simple, non-intrusive commands to gather more information if needed.
- **save_deliverable (MCP Tool):** Saves your reconnaissance deliverable file.
- **Parameters:**
- `deliverable_type`: "RECON" (required)
- `content`: Your complete markdown report (required)
- **Returns:** `{ status: "success", filepath: "..." }` on success or `{ status: "error", message: "..." }` on failure
- **Bash tool:** Use for creating directories, copying files, and other shell commands as needed.
**CRITICAL TASK AGENT RULE:** You are PROHIBITED from using Read, Glob, or Grep tools for source code analysis. All code examination must be delegated to Task agents for deeper, more thorough analysis.
</available_tools>
@@ -129,7 +134,7 @@ You must follow this methodical four-step process:
</systematic_approach>
<deliverable_instructions>
When you have a complete understanding of the attack surface, you MUST synthesize all of your findings into a single, detailed Markdown report located at `deliverables/recon_deliverable.md`.
When you have a complete understanding of the attack surface, you MUST synthesize all of your findings into a single, detailed Markdown report and save it using the `save_deliverable` MCP tool with `deliverable_type: "RECON"`.
Your report MUST use the following structure precisely:
@@ -386,5 +391,5 @@ Vulnerability Sources by Type
</deliverable_instructions>
<conclusion_trigger>
Once you have written the complete deliverable to `deliverables/recon_deliverable.md`, your phase is complete. Announce "RECONNAISSANCE COMPLETE" and await further instructions.
Once you have saved the complete deliverable using the save_deliverable MCP tool with `deliverable_type: "RECON"` and your complete report as the `content`, your phase is complete. Announce "RECONNAISSANCE COMPLETE" and await further instructions.
</conclusion_trigger>
+14
@@ -0,0 +1,14 @@
**MANDATORY EXPLOITATION CONSTRAINTS:**
- ALL exploitation attempts MUST be performed against {{WEB_URL}} from an external network
- DO NOT attempt to exploit vulnerabilities that require:
- Internal network access (Tailscale, VPN, etc.)
- Direct server access
- Database console access
**VERDICT CLASSIFICATIONS:**
- EXPLOITED: Successfully demonstrated impact via public interface
- BLOCKED_BY_SECURITY: Valid vulnerability but blocked by WAF/controls
- OUT_OF_SCOPE_INTERNAL: Requires internal access - DO NOT PURSUE
- FALSE_POSITIVE: Not actually vulnerable
Only EXPLOITED and BLOCKED_BY_SECURITY should be reported as findings.
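The reporting rule above can be sketched as a simple filter. The verdict names come from the list above; the `findings` array shape and its `id` and `verdict` fields are hypothetical, chosen only for illustration.

```javascript
// Verdicts that count as reportable findings, per the classification list.
const REPORTABLE_VERDICTS = new Set(['EXPLOITED', 'BLOCKED_BY_SECURITY']);

// Hypothetical helper: keep only findings whose verdict should be reported.
function reportableFindings(findings) {
  return findings.filter((finding) => REPORTABLE_VERDICTS.has(finding.verdict));
}
```

Findings marked `OUT_OF_SCOPE_INTERNAL` or `FALSE_POSITIVE` drop out of the result, which matches the rule that only the first two verdicts are reported.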
+2
@@ -0,0 +1,2 @@
Rules to Avoid:
{{RULES_AVOID}}
+1
@@ -0,0 +1 @@
URL: {{WEB_URL}}
+1
@@ -0,0 +1 @@
**EXTERNAL ATTACKER SCOPE:** Only report vulnerabilities exploitable via {{WEB_URL}} from the internet. Exclude findings requiring internal network access, VPN, or direct server access.
@@ -15,8 +15,8 @@ Execute the login flow based on the login_type specified in the configuration:
1. Navigate to the specified login_url using Playwright
2. Execute each step in the login_flow array sequentially:
- Replace $username with the provided username credential
- Replace $password with the provided password credential
- Replace $totp with generated code using the TOTP script in the repo root: `node generate-totp.mjs --secret "{{totp_secret}}"`
- Replace $password with the provided password credential
- Replace $totp with generated code using the `generate_totp` MCP tool with the TOTP secret: {{totp_secret}}
- Perform the specified actions (type text, click buttons, etc.)
3. Wait for page navigation/loading to complete after each critical step
4. Handle any consent dialogs or "Continue as [user]" prompts by clicking appropriate buttons
@@ -30,7 +30,7 @@ Execute the login flow based on the login_type specified in the configuration:
- Handle account selection if prompted
- Replace $username with the provided username credential in provider login
- Replace $password with the provided password credential in provider login
- Replace $totp with generated code using the TOTP script in the repo root: `node generate-totp.mjs --secret "{{totp_secret}}"`
- Replace $totp with generated code using the `generate_totp` MCP tool with the TOTP secret: {{totp_secret}}
- Handle OAuth consent screens by clicking "Allow", "Accept", or "Continue", and ticking checkboxes as needed.
- Handle "Continue as [username]" dialogs by clicking "Continue"
3. Wait for OAuth callback and final redirect to complete
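The `$totp` substitution above relies on a time-based one-time password. The implementation of the `generate_totp` MCP tool is not shown here, so the following is only a minimal sketch of standard TOTP (RFC 6238, HMAC-SHA1, 30-second step, 6 digits), assuming the secret is base32-encoded. The function names `base32Decode` and `generateTotp` are hypothetical.

```javascript
const crypto = require('node:crypto');

// Decode an RFC 4648 base32 string (padding and whitespace ignored) into bytes.
function base32Decode(input) {
  const alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ234567';
  const clean = input.toUpperCase().replace(/\s+/g, '').replace(/=+$/, '');
  const bytes = [];
  let bits = 0;
  let value = 0;
  for (const ch of clean) {
    const index = alphabet.indexOf(ch);
    if (index === -1) throw new Error(`Invalid base32 character: ${ch}`);
    value = (value << 5) | index;
    bits += 5;
    if (bits >= 8) {
      bits -= 8;
      bytes.push((value >>> bits) & 0xff);
      value &= (1 << bits) - 1; // keep only the bits not yet consumed
    }
  }
  return Buffer.from(bytes);
}

// Compute a TOTP code for the given Unix time in seconds (defaults to now).
function generateTotp(secretBase32, timeSeconds = Math.floor(Date.now() / 1000), step = 30, digits = 6) {
  const counter = Buffer.alloc(8);
  counter.writeBigUInt64BE(BigInt(Math.floor(timeSeconds / step)));
  const hmac = crypto.createHmac('sha1', base32Decode(secretBase32)).update(counter).digest();
  // Dynamic truncation: the low nibble of the last byte selects a 4-byte window.
  const offset = hmac[hmac.length - 1] & 0x0f;
  const code = (hmac.readUInt32BE(offset) & 0x7fffffff) % 10 ** digits;
  return String(code).padStart(digits, '0');
}
```

With the RFC 6238 test secret (`12345678901234567890` in ASCII, base32 `GEZDGNBVGY3TQOJQGEZDGNBVGY3TQOJQ`) and time 59, this produces `287082`, the last six digits of the RFC's published SHA-1 value.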
+21 -18
@@ -9,16 +9,15 @@ Success criterion: A complete, code-backed analysis of every potential authentic
</objective>
<scope>
**EXTERNAL ATTACKER SCOPE:** Only report vulnerabilities exploitable via {{WEB_URL}} from the internet. Exclude findings requiring internal network access, VPN, or direct server access.
@include(shared/_vuln-scope.txt)
</scope>
<target>
URL: {{WEB_URL}}
@include(shared/_target.txt)
</target>
<rules>
Rules to Avoid:
{{RULES_AVOID}}
@include(shared/_rules.txt)
</rules>
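The `@include(...)` directives and `{{...}}` placeholders above imply a two-step expansion when a prompt is loaded. The loader itself is not part of this diff, so the sketch below is an assumption about how such expansion could work. `resolveIncludes` and `fillPlaceholders` are hypothetical names, and shared files are passed in as an in-memory `Map` to keep the example self-contained.

```javascript
// Replace each line of the form @include(path) with the named file's content.
// `files` is a Map from include path to file text (a stand-in for disk reads).
function resolveIncludes(template, files) {
  return template.replace(/^@include\(([^)]+)\)$/gm, (match, path) => {
    if (!files.has(path)) throw new Error(`Missing include: ${path}`);
    return files.get(path).trimEnd();
  });
}

// Substitute {{NAME}} placeholders; unknown names are left untouched.
function fillPlaceholders(text, vars) {
  return text.replace(/\{\{(\w+)\}\}/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(vars, name) ? vars[name] : match
  );
}
```

Resolving includes first and filling placeholders second lets shared snippets such as `shared/_target.txt` carry their own `{{WEB_URL}}` placeholder, which is how the three shared files in this change are written.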
<login_instructions>
@@ -78,17 +77,21 @@ An **exploitable vulnerability** is a logical flaw in the code that represents a
**Available Tools:**
- **Task Agent (Code Analysis):** Your primary tool. Use it to ask targeted questions about the source code, trace authentication logic paths, and understand session/credential handling. MANDATORY for all source code analysis.
- **Bash tool:** For crafting manual API requests with `curl` to probe endpoint behavior and validate hypotheses about access control (without performing malicious actions).
- **save_deliverable (MCP Tool):** Saves deliverable files with automatic validation.
- **Parameters:**
- `deliverable_type`: "AUTH_ANALYSIS" or "AUTH_QUEUE" (required)
- `content`: Your markdown report or JSON queue (required)
- **Returns:** `{ status: "success", filepath: "...", validated: true/false }` on success or `{ status: "error", message: "...", errorType: "...", retryable: true/false }` on failure
- **Usage:** Call the tool with your deliverable type and content. Queue files must have `{"vulnerabilities": [...]}` structure and will be validated automatically.
- **Bash tool:** Use for creating directories, copying files, and other shell commands as needed.
- **{{MCP_SERVER}} (Playwright):** To interact with the live web application to understand multi-step flows like password reset or registration.
- **TodoWrite Tool:** Use this to create and manage your analysis task list. Create a todo item for each endpoint/flow that needs analysis. Mark items as "in_progress" when working on them and "completed" when done.
- **Write tool:** Use this to create or update `deliverables/auth_exploitation_queue.json` following the exploitation queue format. Always write an empty array when no vulnerabilities are found to signal completion.
</available_tools>
<data_format_specifications>
<exploitation_queue_format>
**Purpose:** Defines the structure for a "exploitation queue" saved to `deliverables/auth_exploitation_queue.json` via the Write tool.
**Purpose:** Defines the structure for an "exploitation queue" saved via the `save_deliverable` MCP tool with `deliverable_type: "AUTH_QUEUE"`.
**Structure:** The `vulnerability` JSON object MUST follow this exact format:
{
"ID": "unique ID for each vulnerability json (AUTH-VULN-XX)",
@@ -179,7 +182,7 @@ Rule: when uncertain, round down (favor Medium/Low) to minimize false positives.
# Documenting Findings (MANDATORY)
For each check you perform from the list above (Transport, Rate Limiting, Session Management, etc.), you must make a final **verdict**.
- If the verdict is **`vulnerable`**, you must document the finding using the Write tool to update `deliverables/auth_exploitation_queue.json`.
- If the verdict is **`vulnerable`**, you must document the finding using the `save_deliverable` MCP tool to update the exploitation queue.
- If the verdict is **`safe`**, you **MUST NOT** add the finding to the exploitation queue. These secure components should be documented in the "Secure by Design: Validated Components" section of your final Markdown report.
</methodology>
@@ -197,7 +200,7 @@ For each check you perform from the list above (Transport, Rate Limiting, Sessio
When you have systematically analyzed all relevant endpoints and logic paths, you MUST generate two final files. Follow these instructions precisely.
**1. Your Specialist Deliverable**
First, synthesize all of your findings into a detailed Markdown report at `deliverables/auth_analysis_deliverable.md`.
First, synthesize all of your findings into a detailed Markdown report and save it using the `save_deliverable` MCP tool with `deliverable_type: "AUTH_ANALYSIS"`.
Your report MUST use the following structure precisely:
---
@@ -206,7 +209,7 @@ Your report MUST use the following structure precisely:
## 1. Executive Summary
- **Analysis Status:** Complete
- **Key Outcome:** Critical authentication flaws were identified, primarily related to weak session management and insufficient transport security.
- **Purpose of this Document:** This report provides the strategic context on the application's authentication mechanisms, dominant flaw patterns, and key architectural details necessary to effectively exploit the vulnerabilities listed in `deliverables/auth_exploitation_queue.json`.
- **Purpose of this Document:** This report provides the strategic context on the application's authentication mechanisms, dominant flaw patterns, and key architectural details necessary to effectively exploit the vulnerabilities listed in the exploitation queue.
## 2. Dominant Vulnerability Patterns
@@ -237,9 +240,9 @@ These components were analyzed and found to have robust defenses. They are low-p
<exploitation_queue_requirements>
**Exploitation Queue (MANDATORY)**
Regardless of whether vulnerabilities are found, you MUST create `deliverables/auth_exploitation_queue.json` using the Write tool:
- **If vulnerabilities found:** Add each confirmed hypothesis using the `exploitation_queue_format`. Set `externally_exploitable` to `true` ONLY if exploitable via public internet without internal access.
- **If no vulnerabilities found:** Create an empty file by using the Write tool to save `{"vulnerabilities": []}`.
Regardless of whether vulnerabilities are found, you MUST create the exploitation queue using the save_deliverable MCP tool:
- **If vulnerabilities found:** Use `save_deliverable` MCP tool with `deliverable_type: "AUTH_QUEUE"` and `content: {"vulnerabilities": [...]}` with each confirmed hypothesis following the `exploitation_queue_format`. Set `externally_exploitable` to `true` ONLY if exploitable via public internet without internal access.
- **If no vulnerabilities found:** Use `save_deliverable` MCP tool with `deliverable_type: "AUTH_QUEUE"` and `content: {"vulnerabilities": []}`.
- **QUEUE INCLUSION CRITERIA:** ONLY include vulnerabilities where `externally_exploitable = true`. Exclude any vulnerability requiring internal network access, VPN, or direct server access.
This file serves as the handoff mechanism and must always be created to signal completion.
</exploitation_queue_requirements>
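The queue rules above (always emit a `{"vulnerabilities": [...]}` object, and include only externally exploitable entries) can be sketched as follows. This is not the tool's actual validation logic, which is not shown in this diff. `buildQueue` and `isValidQueue` are hypothetical names, and the check covers only the top-level structure.

```javascript
// Build the queue content, keeping only externally exploitable entries.
// An empty input still yields {"vulnerabilities": []} so completion is signalled.
function buildQueue(vulnerabilities) {
  const included = vulnerabilities.filter((v) => v.externally_exploitable === true);
  return JSON.stringify({ vulnerabilities: included }, null, 2);
}

// Minimal structural check: valid JSON object with a `vulnerabilities` array.
function isValidQueue(content) {
  let parsed;
  try {
    parsed = JSON.parse(content);
  } catch {
    return false;
  }
  return parsed !== null && typeof parsed === 'object' && Array.isArray(parsed.vulnerabilities);
}
```

The string returned by `buildQueue` is what would be passed as `content` alongside `deliverable_type: "AUTH_QUEUE"`.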
@@ -248,9 +251,9 @@ This file serves as the handoff mechanism and must always be created to signal c
**COMPLETION REQUIREMENTS (ALL must be satisfied):**
1. **Systematic Analysis:** ALL relevant API endpoints and user-facing features identified in the reconnaissance deliverable must be analyzed for AuthN/AuthZ flaws.
2. **Deliverable Generation:** Both files must be successfully created:
- `deliverables/auth_analysis_deliverable.md` (Created)
- `deliverables/auth_exploitation_queue.json` (Created)
2. **Deliverable Generation:** Both required deliverables must be successfully saved using save_deliverable MCP tool:
- Analysis report: Use `save_deliverable` MCP tool with `deliverable_type: "AUTH_ANALYSIS"` and your report as `content`
- Exploitation queue: Use `save_deliverable` MCP tool with `deliverable_type: "AUTH_QUEUE"` and `content: {"vulnerabilities": [...]}`
**ONLY AFTER** both systematic analysis AND successful file generation, announce "**AUTH ANALYSIS COMPLETE**" and stop.
**ONLY AFTER** both systematic analysis AND successful deliverable generation, announce "**AUTH ANALYSIS COMPLETE**" and stop.
</conclusion_trigger>
+21 -19
@@ -8,17 +8,15 @@ Success criterion: A complete, code-backed analysis of every potential authoriza
</objective>
<scope>
**EXTERNAL ATTACKER SCOPE:** Only report vulnerabilities exploitable via {{WEB_URL}} from the internet. Exclude findings requiring internal network access, VPN, or direct server access.
@include(shared/_vuln-scope.txt)
</scope>
<target>
URL: {{WEB_URL}}
@include(shared/_target.txt)
</target>
<rules>
Rules to Avoid:
{{RULES_AVOID}}
@include(shared/_rules.txt)
</rules>
<login_instructions>
@@ -82,17 +80,21 @@ An **exploitable vulnerability** is a logical flaw in the code that represents a
**Available Tools:**
- **Task Agent (Code Analysis):** Your primary tool. Use it to ask targeted questions about the source code, trace authorization logic paths, and understand permission models. MANDATORY for all source code analysis.
- **Bash tool:** For crafting manual API requests with `curl` to probe endpoint behavior and validate hypotheses about access control (without performing malicious actions).
- **save_deliverable (MCP Tool):** Saves deliverable files with automatic validation.
- **Parameters:**
- `deliverable_type`: "AUTHZ_ANALYSIS" or "AUTHZ_QUEUE" (required)
- `content`: Your markdown report or JSON queue (required)
- **Returns:** `{ status: "success", filepath: "...", validated: true/false }` on success or `{ status: "error", message: "...", errorType: "...", retryable: true/false }` on failure
- **Usage:** Call the tool with your deliverable type and content. Queue files must have `{"vulnerabilities": [...]}` structure and will be validated automatically.
- **Bash tool:** Use for creating directories, copying files, and other shell commands as needed.
- **{{MCP_SERVER}} (Playwright):** To interact with the live web application to understand multi-step flows and role-based access controls.
- **TodoWrite Tool:** Use this to create and manage your analysis task list. Create a todo item for each endpoint that needs authorization analysis. Mark items as "in_progress" when working on them and "completed" when done.
- **Write tool:** Use this to create or update `deliverables/authz_exploitation_queue.json` following the exploitation queue format. Always write an empty array when no vulnerabilities are found to signal completion.
</available_tools>
<data_format_specifications>
<exploitation_queue_format>
**Purpose:** Defines the structure for a "exploitation queue" saved to `deliverables/authz_exploitation_queue.json` via the Write tool.
**Purpose:** Defines the structure for an "exploitation queue" saved via the `save_deliverable` MCP tool with `deliverable_type: "AUTHZ_QUEUE"`.
**Structure:** The `vulnerability` JSON object MUST follow this exact format:
{
"ID": "unique ID for each vulnerability (e.g., AUTHZ-VULN-01)",
@@ -237,7 +239,7 @@ An **exploitable vulnerability** is a logical flaw in the code that represents a
For each analysis you perform from the lists above, you must make a final **verdict**:
- If the verdict is **`vulnerable`**, you must document the finding using the Write tool to update `deliverables/authz_exploitation_queue.json`.
- If the verdict is **`vulnerable`**, you must document the finding using the `save_deliverable` MCP tool to update the exploitation queue.
- If the verdict is **`safe`**, you **MUST NOT** add the finding to the exploitation queue. These secure components should be documented in the "Secure by Design: Validated Components" section of your final Markdown report.
</methodology>
@@ -275,7 +277,7 @@ When you have systematically analyzed all relevant endpoints and logic paths, yo
**1. Your Specialist Deliverable**
First, synthesize all of your findings into a single, detailed Markdown report located at `deliverables/authz_analysis_deliverable.md`. This report is the official record of your work.
First, synthesize all of your findings into a single, detailed Markdown report and save it using the `save_deliverable` MCP tool with `deliverable_type: "AUTHZ_ANALYSIS"`. This report is the official record of your work.
Your report MUST use the following structure precisely:
@@ -285,7 +287,7 @@ Your report MUST use the following structure precisely:
## 1. Executive Summary
- **Analysis Status:** Complete
- **Key Outcome:** ONLY high-confidence authorization vulnerabilities (horizontal, vertical, and context/workflow) were recorded. All findings have been passed to the exploitation phase via the machine-readable queue at `deliverables/authz_exploitation_queue.json`.
- **Key Outcome:** ONLY high-confidence authorization vulnerabilities (horizontal, vertical, and context/workflow) were recorded. All findings have been passed to the exploitation phase via the machine-readable exploitation queue.
- **Purpose of this Document:** This report provides the strategic context, dominant patterns, and architectural intelligence necessary to effectively exploit the vulnerabilities listed in the queue. It is intended to be read alongside the JSON deliverable.
## 2. Dominant Vulnerability Patterns
@@ -341,9 +343,9 @@ examples:
<exploitation_queue_requirements>
**Exploitation Queue (MANDATORY)**
Regardless of whether vulnerabilities are found, you MUST create `deliverables/authz_exploitation_queue.json` using the Write tool:
- **If vulnerabilities found:** Add each confirmed hypothesis using the `exploitation_queue_format`. Set `externally_exploitable` to `true` ONLY if exploitable via public internet without internal access.
- **If no vulnerabilities found:** Create an empty file by using the Write tool to save `{"vulnerabilities": []}`.
Regardless of whether vulnerabilities are found, you MUST create the exploitation queue using the save_deliverable MCP tool:
- **If vulnerabilities found:** Use `save_deliverable` MCP tool with `deliverable_type: "AUTHZ_QUEUE"` and `content: {"vulnerabilities": [...]}` with each confirmed hypothesis following the `exploitation_queue_format`. Set `externally_exploitable` to `true` ONLY if exploitable via public internet without internal access.
- **If no vulnerabilities found:** Use `save_deliverable` MCP tool with `deliverable_type: "AUTHZ_QUEUE"` and `content: {"vulnerabilities": []}`.
- **QUEUE INCLUSION CRITERIA:** ONLY include vulnerabilities where `externally_exploitable = true`. Exclude any vulnerability requiring internal network access, VPN, or direct server access.
This file serves as the handoff mechanism and must always be created to signal completion.
</exploitation_queue_requirements>
@@ -352,11 +354,11 @@ This file serves as the handoff mechanism and must always be created to signal c
**COMPLETION REQUIREMENTS (ALL must be satisfied):**
1. **Todo Completion:** ALL tasks in your TodoWrite list must be marked as "completed"
2. **Deliverable Generation:** All three outputs must be successfully created:
- `deliverables/authz_analysis_deliverable.md`
- `deliverables/authz_exploitation_queue.json`
2. **Deliverable Generation:** Both required deliverables must be successfully saved using save_deliverable MCP tool:
- Analysis report: Use `save_deliverable` MCP tool with `deliverable_type: "AUTHZ_ANALYSIS"` and your report as `content`
- Exploitation queue: Use `save_deliverable` MCP tool with `deliverable_type: "AUTHZ_QUEUE"` and `content: {"vulnerabilities": [...]}`
**ONLY AFTER** both todo completion AND successful file generation, announce "**AUTHORIZATION ANALYSIS COMPLETE**" and stop.
**ONLY AFTER** both todo completion AND successful deliverable generation, announce "**AUTHORIZATION ANALYSIS COMPLETE**" and stop.
**FAILURE TO COMPLETE TODOS = INCOMPLETE ANALYSIS** - You will be considered to have failed the mission if you generate deliverables before completing comprehensive testing of all authorization vectors.
</conclusion_trigger>
+18 -15
@@ -11,17 +11,15 @@ Success criterion: A complete source-to-sink trace for every identified vulnerab
</objective>
<scope>
**EXTERNAL ATTACKER SCOPE:** Only report vulnerabilities exploitable via {{WEB_URL}} from the internet. Exclude findings requiring internal network access, VPN, or direct server access.
@include(shared/_vuln-scope.txt)
</scope>
<target>
URL: {{WEB_URL}}
@include(shared/_target.txt)
</target>
<rules>
Rules to Avoid:
{{RULES_AVOID}}
@include(shared/_rules.txt)
</rules>
<login_instructions>
@@ -84,16 +82,21 @@ An **exploitable vulnerability** is a confirmed source-to-sink path where the en
**Available Tools:**
- **Task Agent (Code Analysis):** Your primary tool. Use it to ask targeted questions about the source code, map query/command construction paths, and verify sanitization coverage. MANDATORY for all source code analysis.
- **Bash tool:** For crafting manual API requests with `curl` to probe injection surfaces and validate hypotheses about server behavior (without performing malicious actions).
- **save_deliverable (MCP Tool):** Saves deliverable files with automatic validation.
- **Parameters:**
- `deliverable_type`: "INJECTION_ANALYSIS" or "INJECTION_QUEUE" (required)
- `content`: Your markdown report or JSON queue (required)
- **Returns:** `{ status: "success", filepath: "...", validated: true/false }` on success or `{ status: "error", message: "...", errorType: "...", retryable: true/false }` on failure
- **Usage:** Call the tool with your deliverable type and content. Queue files must have `{"vulnerabilities": [...]}` structure and will be validated automatically.
- **Bash tool:** Use for creating directories, copying files, and other shell commands as needed.
- **{{MCP_SERVER}} (Playwright):** To interact with the live web application to understand multi-step flows like password reset or registration.
- **TodoWrite Tool:** Use this to create and manage your analysis task list. Create a todo item for each injection source that needs analysis. Mark items as "in_progress" when working on them and "completed" when done.
- **Write tool:** Use this to create or update `deliverables/injection_exploitation_queue.json` following the exploitation queue format. Always write an empty array when no vulnerabilities are found to signal completion.
</available_tools>
<data_format_specifications>
<exploitation_queue_format>
**Purpose:** Defines the structure for a "exploitation queue" saved to `deliverables/injection_exploitation_queue.json` via the Write tool.
**Purpose:** Defines the structure for an "exploitation queue" saved via the `save_deliverable` MCP tool with `deliverable_type: "INJECTION_QUEUE"`.
**Structure:** The `vulnerability` JSON object MUST follow this exact format:
{
@@ -359,10 +362,10 @@ These input vectors were traced and confirmed to have robust, context-appropriat
<exploitation_queue_requirements>
**Exploitation Queue (MANDATORY)**
Regardless of whether vulnerabilities are found, you MUST create `deliverables/injection_exploitation_queue.json` using the Write tool:
Regardless of whether vulnerabilities are found, you MUST create the exploitation queue using the save_deliverable MCP tool:
- **If vulnerabilities found:** Add each exploitable injection vulnerability (vertict: "vulnerable") using the exploitation_queue_format
- **If no vulnerabilities found:** Create an empty file by using the Write tool to save `{"vulnerabilities": []}`
- **If vulnerabilities found:** Use `save_deliverable` MCP tool with `deliverable_type: "INJECTION_QUEUE"` and `content: {"vulnerabilities": [...]}` with each exploitable injection vulnerability (verdict: "vulnerable") following the exploitation_queue_format
- **If no vulnerabilities found:** Use `save_deliverable` MCP tool with `deliverable_type: "INJECTION_QUEUE"` and `content: {"vulnerabilities": []}`
This file serves as the handoff mechanism to the Exploitation phase and must always be created to signal completion of your analysis.
</exploitation_queue_requirements>
@@ -371,11 +374,11 @@ This file serves as the handoff mechanism to the Exploitation phase and must alw
**COMPLETION REQUIREMENTS (ALL must be satisfied):**
1. **Todo Completion:** ALL tasks in your TodoWrite list must be marked as "completed"
2. **Deliverable Generation:** Both required files must be successfully created:
- `deliverables/injection_analysis_deliverable.md`
- `deliverables/injection_exploitation_queue.json`
2. **Deliverable Generation:** Both required deliverables must be successfully saved using save_deliverable MCP tool:
- Analysis report: Use `save_deliverable` MCP tool with `deliverable_type: "INJECTION_ANALYSIS"` and your report as `content`
- Exploitation queue: Use `save_deliverable` MCP tool with `deliverable_type: "INJECTION_QUEUE"` and `content: {"vulnerabilities": [...]}`
**ONLY AFTER** both todo completion AND successful file generation, announce "**INJECTION ANALYSIS COMPLETE**" and stop.
**ONLY AFTER** both todo completion AND successful deliverable generation, announce "**INJECTION ANALYSIS COMPLETE**" and stop.
**FAILURE TO COMPLETE TODOS = INCOMPLETE ANALYSIS** - You will be considered to have failed the mission if you generate deliverables before completing comprehensive testing of all input vectors.
</conclusion_trigger>
+22 -19
@@ -8,16 +8,15 @@ Success criterion: A complete source-to-sink trace for every identified SSRF vul
</objective>
<scope>
**EXTERNAL ATTACKER SCOPE:** Only report vulnerabilities exploitable via {{WEB_URL}} from the internet. Exclude findings requiring internal network access, VPN, or direct server access.
@include(shared/_vuln-scope.txt)
</scope>
<target>
URL: {{WEB_URL}}
@include(shared/_target.txt)
</target>
<rules>
Rules to Avoid:
{{RULES_AVOID}}
@include(shared/_rules.txt)
</rules>
<login_instructions>
@@ -77,17 +76,21 @@ An **exploitable vulnerability** is a data flow where user-controlled input infl
**Available Tools:**
- **Task Agent (Code Analysis):** Your primary tool. Use it to ask targeted questions about the source code, trace data flows, and understand HTTP client usage. MANDATORY for all source code analysis.
- **Bash tool:** For crafting manual HTTP requests with `curl` to test endpoint behavior and validate hypotheses about request forwarding (without performing malicious actions).
- **save_deliverable (MCP Tool):** Saves deliverable files with automatic validation.
- **Parameters:**
- `deliverable_type`: "SSRF_ANALYSIS" or "SSRF_QUEUE" (required)
- `content`: Your markdown report or JSON queue (required)
- **Returns:** `{ status: "success", filepath: "...", validated: true/false }` on success or `{ status: "error", message: "...", errorType: "...", retryable: true/false }` on failure
- **Usage:** Call the tool with your deliverable type and content. Queue files must have `{"vulnerabilities": [...]}` structure and will be validated automatically.
- **Bash tool:** Use for creating directories, copying files, and other shell commands as needed.
- **{{MCP_SERVER}} (Playwright):** To interact with the live web application to understand multi-step flows that might involve URL redirection or proxy functionality.
- **TodoWrite Tool:** Use this to create and manage your analysis task list. Create a todo item for each SSRF sink that needs analysis. Mark items as "in_progress" when working on them and "completed" when done.
- **Write tool:** Use this to create or update `deliverables/ssrf_exploitation_queue.json` following the exploitation queue format. Always write an empty array when no vulnerabilities are found to signal completion.
</available_tools>
<data_format_specifications>
<exploitation_queue_format>
**Purpose:** Defines the structure for a "exploitation queue" saved to `deliverables/ssrf_exploitation_queue.json` via the Write tool.
**Purpose:** Defines the structure for an "exploitation queue" saved via the `save_deliverable` MCP tool with `deliverable_type: "SSRF_QUEUE"`.
**Structure:** The `vulnerability` JSON object MUST follow this exact format:
{
"ID": "unique ID for each vulnerability json (SSRF-VULN-XX)",
@@ -214,7 +217,7 @@ Confidence levels:
### **4) Documentation**
- **Vulnerable paths** → add to exploitation queue, include a minimal `witness_payload` (e.g., `http://127.0.0.1:22/`).
- **Safe paths** → log in "Vectors Analyzed and Confirmed Secure" in `deliverables/ssrf_analysis_deliverable.md`.
- **Safe paths** → log in "Vectors Analyzed and Confirmed Secure" in your analysis deliverable.
# Confidence scoring (analysis phase; applies to all checks above)
- **High** — The flaw is directly established and deterministic in the target context. You have direct evidence with no material alternate control. Scope is clear (which endpoints/parameters).
@@ -226,7 +229,7 @@ Rule: when uncertain, round down (favor Medium/Low) to minimize false positives.
# Documenting Findings (MANDATORY)
For each check you perform from the list above, you must make a final **verdict**.
- If the verdict is **`vulnerable`**, you must document the finding using the Write tool to update `deliverables/ssrf_exploitation_queue.json`.
- If the verdict is **`vulnerable`**, you must document the finding using the `save_deliverable` MCP tool to update the exploitation queue.
- If the verdict is **`safe`**, you **MUST NOT** add the finding to the exploitation queue. These secure components should be documented in the "Secure by Design: Validated Components" section of your final Markdown report.
</methodology>
@@ -244,7 +247,7 @@ For each check you perform from the list above, you must make a final **verdict*
When you have systematically analyzed all relevant endpoints and request-making functions, you MUST generate two final files. Follow these instructions precisely.
**1. Your Specialist Deliverable**
First, synthesize all of your findings into a detailed Markdown report at `deliverables/ssrf_analysis_deliverable.md`.
First, synthesize all of your findings into a detailed Markdown report and save it using the `save_deliverable` MCP tool with `deliverable_type: "SSRF_ANALYSIS"`.
Your report MUST use the following structure precisely:
---
@@ -253,7 +256,7 @@ Your report MUST use the following structure precisely:
## 1. Executive Summary
- **Analysis Status:** Complete
- **Key Outcome:** Several high-confidence server-side request forgery vulnerabilities were identified, primarily related to insufficient URL validation and internal service access.
- **Purpose of this Document:** This report provides the strategic context on the application's outbound request mechanisms, dominant flaw patterns, and key architectural details necessary to effectively exploit the vulnerabilities listed in `deliverables/ssrf_exploitation_queue.json`.
- **Purpose of this Document:** This report provides the strategic context on the application's outbound request mechanisms, dominant flaw patterns, and key architectural details necessary to effectively exploit the vulnerabilities listed in the exploitation queue.
## 2. Dominant Vulnerability Patterns
@@ -284,9 +287,9 @@ These components were analyzed and found to have robust defenses. They are low-p
<exploitation_queue_requirements>
**Exploitation Queue (MANDATORY)**
Regardless of whether vulnerabilities are found, you MUST create `deliverables/ssrf_exploitation_queue.json` using the Write tool:
- **If vulnerabilities found:** Add each confirmed hypothesis using the `exploitation_queue_format`. Set `externally_exploitable` to `true` ONLY if exploitable via public internet without internal access.
- **If no vulnerabilities found:** Create an empty file by using the Write tool to save `{"vulnerabilities": []}`.
Regardless of whether vulnerabilities are found, you MUST create the exploitation queue using the save_deliverable MCP tool:
- **If vulnerabilities found:** Use `save_deliverable` MCP tool with `deliverable_type: "SSRF_QUEUE"` and `content: {"vulnerabilities": [...]}` with each confirmed hypothesis following the `exploitation_queue_format`. Set `externally_exploitable` to `true` ONLY if exploitable via public internet without internal access.
- **If no vulnerabilities found:** Use `save_deliverable` MCP tool with `deliverable_type: "SSRF_QUEUE"` and `content: {"vulnerabilities": []}`.
- **QUEUE INCLUSION CRITERIA:** ONLY include vulnerabilities where `externally_exploitable = true`. Exclude any vulnerability requiring internal network access, VPN, or direct server access.
This file serves as the handoff mechanism and must always be created to signal completion.
</exploitation_queue_requirements>
@@ -295,9 +298,9 @@ This file serves as the handoff mechanism and must always be created to signal c
**COMPLETION REQUIREMENTS (ALL must be satisfied):**
1. **Systematic Analysis:** ALL relevant API endpoints and request-making features identified in the reconnaissance deliverable must be analyzed for SSRF vulnerabilities.
2. **Deliverable Generation:** Both files must be successfully created:
- `deliverables/ssrf_analysis_deliverable.md` (Created)
- `deliverables/ssrf_exploitation_queue.json` (Created)
2. **Deliverable Generation:** Both required deliverables must be successfully saved using save_deliverable MCP tool:
- Analysis report: Use `save_deliverable` MCP tool with `deliverable_type: "SSRF_ANALYSIS"` and your report as `content`
- Exploitation queue: Use `save_deliverable` MCP tool with `deliverable_type: "SSRF_QUEUE"` and `content: {"vulnerabilities": [...]}`
**ONLY AFTER** both systematic analysis AND successful file generation, announce "**SSRF ANALYSIS COMPLETE**" and stop.
**ONLY AFTER** both systematic analysis AND successful deliverable generation, announce "**SSRF ANALYSIS COMPLETE**" and stop.
</conclusion_trigger>
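The queue rules in this prompt (always save a queue, and include only externally exploitable findings) can be sketched as follows. This is a minimal illustration, not the project's implementation: `buildQueueContent` is a hypothetical helper, and the `SSRF-VULN-XX` ID pattern is assumed by analogy with the XSS queue format.

```javascript
// Hypothetical helper (not part of the repository): builds the `content`
// payload for a queue deliverable such as SSRF_QUEUE.
// Only findings with `externally_exploitable === true` are kept, so an
// input with no qualifying findings still yields {"vulnerabilities": []}.
function buildQueueContent(findings) {
  const vulnerabilities = findings.filter(
    (finding) => finding.externally_exploitable === true
  );
  return { vulnerabilities };
}

// Example findings; the ID values are assumed for illustration.
const content = buildQueueContent([
  { ID: 'SSRF-VULN-01', externally_exploitable: true },
  { ID: 'SSRF-VULN-02', externally_exploitable: false },
]);

console.log(JSON.stringify(content));
```

An empty result is still a valid payload, which matches the requirement that the queue is always saved to signal completion.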
+21 -22
@@ -8,16 +8,15 @@ Success criterion: Live confirmation of XSS execution for every vulnerability th
</objective>
<scope>
**EXTERNAL ATTACKER SCOPE:** Only report vulnerabilities exploitable via {{WEB_URL}} from the internet. Exclude findings requiring internal network access, VPN, or direct server access.
@include(shared/_vuln-scope.txt)
</scope>
<target>
URL: {{WEB_URL}}
@include(shared/_target.txt)
</target>
<rules>
Rules to Avoid:
{{RULES_AVOID}}
@include(shared/_rules.txt)
</rules>
<login_instructions>
@@ -82,15 +81,19 @@ An **exploitable vulnerability** is a confirmed source-to-sink path where the en
- **Terminal (curl):** MANDATORY for testing HTTP-based XSS vectors and observing raw HTML responses. Use for reflected XSS testing and JSONP injection testing.
- **{{MCP_SERVER}} (Playwright):** MANDATORY for testing DOM-based XSS and form submission vectors. Use for stored XSS testing and client-side payload execution verification.
- **TodoWrite Tool:** Use this to create and manage your analysis task list. Create a todo item for each sink you need to analyze.
**Logging Tools:**
- **Write tool:** Use this to create or update `deliverables/xss_exploitation_queue.json` following the exploitation queue format. Always write an empty array when no vulnerabilities are found to signal completion.
- **save_deliverable (MCP Tool):** Saves deliverable files with automatic validation.
- **Parameters:**
- `deliverable_type`: "XSS_ANALYSIS" or "XSS_QUEUE" (required)
- `content`: Your markdown report or JSON queue (required)
- **Returns:** `{ status: "success", filepath: "...", validated: true/false }` on success or `{ status: "error", message: "...", errorType: "...", retryable: true/false }` on failure
- **Usage:** Call the tool with your deliverable type and content. Queue files must have `{"vulnerabilities": [...]}` structure and will be validated automatically.
- **Bash tool:** Use for creating directories, copying files, and other shell commands as needed.
</available_tools>
<data_format_specifications>
<exploitation_queue_format>
Purpose: Defines the structure for a "exploitation queue" saved to `deliverables/xss_exploitation_queue.json` via the Write tool.
Purpose: Defines the structure for an "exploitation queue" saved via the save_deliverable MCP tool with type XSS_QUEUE.
Structure: The vulnerability JSON object MUST follow this exact format:
{
"ID": "unique ID for each vulnerability json (XSS-VULN-XX)",
@@ -175,8 +178,8 @@ This rulebook is used for the **Early Termination** check in Step 2.
- Include both safe and vulnerable paths to demonstrate **full coverage**.
- Craft a minimal `witness_payload` that proves control over the render context.
- For every path analyzed, you must document the outcome. The location of the documentation depends on the verdict:
- If the verdict is 'vulnerable', you MUST use the Write tool to save the finding to `deliverables/xss_exploitation_queue.json`, including complete source-to-sink information.
- If the verdict is 'safe', you MUST NOT add it to the exploitation queue. Instead, you will document these secure paths in the "Vectors Analyzed and Confirmed Secure" table of your final report (deliverables/xss_analysis_deliverable.md).
- If the verdict is 'vulnerable', you MUST use the save_deliverable MCP tool to save the finding to the exploitation queue, including complete source-to-sink information.
- If the verdict is 'safe', you MUST NOT add it to the exploitation queue. Instead, you will document these secure paths in the "Vectors Analyzed and Confirmed Secure" table of your final analysis report.
- For vulnerable findings, craft a minimal witness_payload that proves control over the render context.
### **8) Score Confidence**
@@ -267,20 +270,16 @@ These input vectors were traced and confirmed to have robust, context-appropriat
## Exploitation Queue (MANDATORY)
Regardless of whether vulnerabilities are found, you MUST create:
`deliverables/xss_exploitation_queue.json`
using the **Write tool**.
Regardless of whether vulnerabilities are found, you MUST create the exploitation queue using the save_deliverable MCP tool.
- **If exploitable vulnerabilities found:**
Add each exploitable XSS vulnerability (verdict: "vulnerable") using the `exploitation_queue_format` when writing the file. Set `externally_exploitable` to `true` ONLY if exploitable via public internet without internal access.
Use `save_deliverable` MCP tool with `deliverable_type: "XSS_QUEUE"` and `content: {"vulnerabilities": [...]}` with each exploitable XSS vulnerability (verdict: "vulnerable") following the `exploitation_queue_format`. Set `externally_exploitable` to `true` ONLY if exploitable via public internet without internal access.
- **If no exploitable vulnerabilities found:**
Create an empty file by using the Write tool to save the following JSON:
Use `save_deliverable` MCP tool with `deliverable_type: "XSS_QUEUE"` and `content: {"vulnerabilities": []}`
```json
{"vulnerabilities": []}
```
- **QUEUE INCLUSION CRITERIA:** ONLY include vulnerabilities where `externally_exploitable = true`. Exclude any vulnerability requiring internal network access, VPN, or direct server access.
This file is the mandatory handoff to the Exploitation phase.
</exploitation_queue_requirements>
@@ -288,9 +287,9 @@ This file is the mandatory handoff to the Exploitation phase.
COMPLETION REQUIREMENTS (ALL must be satisfied):
1. Systematic Analysis: ALL input vectors identified from the reconnaissance deliverable must be analyzed.
2. Deliverable Generation: Both required files must be successfully created:
- deliverables/xss_analysis_deliverable.md (Created)
- deliverables/xss_exploitation_queue.json (Created)
2. Deliverable Generation: Both required deliverables must be successfully saved using save_deliverable MCP tool:
- Analysis report: Use `save_deliverable` MCP tool with `deliverable_type: "XSS_ANALYSIS"` and your report as `content`
- Exploitation queue: Use `save_deliverable` MCP tool with `deliverable_type: "XSS_QUEUE"` and `content: {"vulnerabilities": [...]}`
ONLY AFTER both systematic analysis AND successful file generation, announce "XSS ANALYSIS COMPLETE" and stop.
ONLY AFTER both systematic analysis AND successful deliverable generation, announce "XSS ANALYSIS COMPLETE" and stop.
</conclusion_trigger>
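The `save_deliverable` return shape described in this prompt (`status`, `validated`, `retryable`) suggests a simple decision rule for the caller. The sketch below assumes only those documented fields; the function name `nextAction` and the action labels it returns are hypothetical and chosen for illustration.

```javascript
// Hypothetical sketch: decide what to do with a save_deliverable result.
// Only the fields named in the prompt are read: status, validated, retryable.
function nextAction(result) {
  if (result.status === 'success') {
    // A queue file that saved but did not validate may deserve a second look.
    return result.validated ? 'done' : 'saved-unvalidated';
  }
  // On error, the tool reports whether another attempt is worthwhile.
  return result.retryable ? 'retry' : 'abort';
}

const examples = [
  { status: 'success', filepath: 'example-path', validated: true },
  { status: 'error', message: 'example failure', errorType: 'Example', retryable: true },
  { status: 'error', message: 'example failure', errorType: 'Example', retryable: false },
];

for (const result of examples) {
  console.log(result.status, nextAction(result));
}
```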
+175
@@ -0,0 +1,175 @@
#!/usr/bin/env node
/**
* Export Metrics to CSV
*
* Converts session.json from audit-logs into CSV format for spreadsheet analysis.
*
* DATA SOURCE:
* - Reads from: audit-logs/{hostname}_{sessionId}/session.json
* - Source of truth for all metrics, timing, and cost data
* - Automatically created by Shannon during agent execution
*
* CSV OUTPUT:
* - One row per agent with: agent, phase, status, attempts, duration_ms, cost_usd
* - Perfect for importing into Excel/Google Sheets for analysis
*
* USE CASES:
* - Compare performance across multiple sessions
* - Track costs and optimize budget
* - Identify slow agents for optimization
* - Generate charts and visualizations
* - Export data for external reporting tools
*
* EXAMPLES:
* ```bash
* # Export to stdout
* ./scripts/export-metrics.js --session-id abc123
*
* # Export to file
* ./scripts/export-metrics.js --session-id abc123 --output metrics.csv
*
* # Find session ID from Shannon store
* cat .shannon-store.json | jq '.sessions | keys'
* ```
*
* NOTE: For raw metrics, just read audit-logs/.../session.json directly.
* This script only exists to provide a spreadsheet-friendly CSV format.
*/
import chalk from 'chalk';
import { fs, path } from 'zx';
import { getSession } from '../src/session-manager.js';
import { AuditSession } from '../src/audit/index.js';
// Parse command-line arguments
function parseArgs() {
const args = {
sessionId: null,
output: null
};
for (let i = 2; i < process.argv.length; i++) {
const arg = process.argv[i];
if (arg === '--session-id' && process.argv[i + 1]) {
args.sessionId = process.argv[i + 1];
i++;
} else if (arg === '--output' && process.argv[i + 1]) {
args.output = process.argv[i + 1];
i++;
} else if (arg === '--help' || arg === '-h') {
printUsage();
process.exit(0);
} else {
console.log(chalk.red(`❌ Unknown argument: ${arg}`));
printUsage();
process.exit(1);
}
}
return args;
}
function printUsage() {
console.log(chalk.cyan('\n📊 Export Metrics to CSV'));
console.log(chalk.gray('\nUsage: ./scripts/export-metrics.js [options]\n'));
console.log(chalk.white('Options:'));
console.log(chalk.gray(' --session-id <id> Session ID to export (required)'));
console.log(chalk.gray(' --output <file> Output CSV file path (default: stdout)'));
console.log(chalk.gray(' --help, -h Show this help\n'));
console.log(chalk.white('Examples:'));
console.log(chalk.gray(' # Export to stdout'));
console.log(chalk.gray(' ./scripts/export-metrics.js --session-id abc123\n'));
console.log(chalk.gray(' # Export to file'));
console.log(chalk.gray(' ./scripts/export-metrics.js --session-id abc123 --output metrics.csv\n'));
}
// Export metrics for a session
async function exportMetrics(sessionId) {
const session = await getSession(sessionId);
if (!session) {
throw new Error(`Session ${sessionId} not found`);
}
const auditSession = new AuditSession(session);
await auditSession.initialize();
const metrics = await auditSession.getMetrics();
return exportAsCSV(session, metrics);
}
// Export as CSV
function exportAsCSV(session, metrics) {
const lines = [];
// Header
lines.push('agent,phase,status,attempts,duration_ms,cost_usd');
// Phase mapping
const phaseMap = {
'pre-recon': 'pre-recon',
'recon': 'recon',
'injection-vuln': 'vulnerability-analysis',
'xss-vuln': 'vulnerability-analysis',
'auth-vuln': 'vulnerability-analysis',
'authz-vuln': 'vulnerability-analysis',
'ssrf-vuln': 'vulnerability-analysis',
'injection-exploit': 'exploitation',
'xss-exploit': 'exploitation',
'auth-exploit': 'exploitation',
'authz-exploit': 'exploitation',
'ssrf-exploit': 'exploitation',
'report': 'reporting'
};
// Agent rows
for (const [agentName, agentData] of Object.entries(metrics.metrics.agents)) {
const phase = phaseMap[agentName] || 'unknown';
lines.push([
agentName,
phase,
agentData.status,
agentData.attempts.length,
agentData.final_duration_ms,
agentData.total_cost_usd.toFixed(4)
].join(','));
}
return lines.join('\n');
}
// Main execution
async function main() {
const args = parseArgs();
if (!args.sessionId) {
console.log(chalk.red('❌ Must specify --session-id'));
printUsage();
process.exit(1);
}
console.log(chalk.cyan.bold('\n📊 Exporting Metrics to CSV\n'));
console.log(chalk.gray(`Session ID: ${args.sessionId}\n`));
const output = await exportMetrics(args.sessionId);
if (args.output) {
await fs.writeFile(args.output, output);
console.log(chalk.green(`✅ Exported to: ${args.output}`));
} else {
console.log(chalk.cyan('CSV Output:\n'));
console.log(output);
}
console.log();
}
main().catch(error => {
console.log(chalk.red.bold(`\n🚨 Fatal error: ${error.message}`));
if (process.env.DEBUG) {
console.log(chalk.gray(error.stack));
}
process.exit(1);
});
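`exportAsCSV` joins raw field values with commas, which works for the current agent names and numeric fields but would produce a malformed row if a value ever contained a comma, quote, or newline. A minimal sketch of RFC 4180 style quoting is shown below; `escapeCsvField` and `toCsvRow` are hypothetical helpers and are not part of the script above.

```javascript
// Hypothetical helpers: quote a CSV field only when it needs quoting.
function escapeCsvField(value) {
  const text = String(value ?? '');
  // Fields containing a comma, double quote, or line break must be wrapped
  // in double quotes, with embedded double quotes doubled.
  if (/[",\r\n]/.test(text)) {
    return `"${text.replace(/"/g, '""')}"`;
  }
  return text;
}

function toCsvRow(fields) {
  return fields.map(escapeCsvField).join(',');
}

const row = toCsvRow(['xss-vuln', 'a,b', 'say "hi"']);
console.log(row);
```

Swapping `.join(',')` for a helper like `toCsvRow` would leave the existing output unchanged for plain values while keeping the file parseable if a status string ever includes punctuation.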
+10 -68
@@ -12,8 +12,7 @@ import { createSession, updateSession, getSession, AGENTS } from './src/session-
import { runPhase, getGitCommitHash } from './src/checkpoint-manager.js';
// Setup and Deliverables
import { setupLocalRepo, cleanupMCP } from './src/setup/environment.js';
import { saveRunMetadata, savePermanentDeliverables } from './src/setup/deliverables.js';
import { setupLocalRepo } from './src/setup/environment.js';
// AI and Prompts
import { runClaudePromptWithRetry } from './src/ai/claude-executor.js';
@@ -24,8 +23,8 @@ import { executePreReconPhase } from './src/phases/pre-recon.js';
import { assembleFinalReport } from './src/phases/reporting.js';
// Utils
import { timingResults, costResults, displayTimingSummary, Timer, formatDuration } from './src/utils/metrics.js';
import { setupLogging } from './src/utils/logger.js';
import { timingResults, costResults, displayTimingSummary, Timer } from './src/utils/metrics.js';
import { formatDuration } from './src/audit/utils.js';
// CLI
import { handleDeveloperCommand } from './src/cli/command-handler.js';
@@ -45,21 +44,16 @@ import {
// Configure zx to disable timeouts (let tools run as long as needed)
$.timeout = 0;
// Global cleanup function for logging
let cleanupLogging = null;
// Setup graceful cleanup on process signals
process.on('SIGINT', async () => {
console.log(chalk.yellow('\n⚠️ Received SIGINT, cleaning up...'));
await cleanupMCP();
if (cleanupLogging) await cleanupLogging();
process.exit(0);
});
process.on('SIGTERM', async () => {
console.log(chalk.yellow('\n⚠️ Received SIGTERM, cleaning up...'));
await cleanupMCP();
if (cleanupLogging) await cleanupLogging();
process.exit(0);
});
@@ -137,7 +131,6 @@ async function main(webUrl, repoPath, configPath = null, pipelineTestingMode = f
console.log(chalk.gray('Use developer commands to run individual agents:'));
console.log(chalk.gray(' ./shannon.mjs --run-agent pre-recon'));
console.log(chalk.gray(' ./shannon.mjs --status'));
await cleanupMCP();
process.exit(0);
}
@@ -178,21 +171,11 @@ async function main(webUrl, repoPath, configPath = null, pipelineTestingMode = f
);
}
// Save run metadata with error handling
try {
await saveRunMetadata(sourceDir, webUrl, repoPath);
} catch (error) {
// Non-critical operation, log warning and continue
console.log(chalk.yellow(`⚠️ Failed to save run metadata: ${error.message}`));
await logError(error, 'Run metadata saving', sourceDir);
}
// Check if we should continue from where session left off
const nextAgent = getNextAgent(session);
if (!nextAgent) {
console.log(chalk.green(`✅ All agents completed! Session is finished.`));
await displayTimingSummary(timingResults, costResults, session.completedAgents);
await cleanupMCP();
process.exit(0);
}
@@ -233,7 +216,7 @@ async function main(webUrl, repoPath, configPath = null, pipelineTestingMode = f
AGENTS['recon'].displayName,
'recon', // Agent name for snapshot creation
chalk.cyan,
{ webUrl, sessionId: session.id } // Session metadata for logging
{ id: session.id, webUrl } // Session metadata for audit logging (STANDARD: use 'id' field)
);
const reconDuration = reconTimer.stop();
timingResults.phases['recon'] = reconDuration;
@@ -309,7 +292,7 @@ async function main(webUrl, repoPath, configPath = null, pipelineTestingMode = f
'Executive Summary and Report Cleanup',
'report', // Agent name for snapshot creation
chalk.cyan,
{ webUrl, sessionId: session.id } // Session metadata for logging
{ id: session.id, webUrl } // Session metadata for audit logging (STANDARD: use 'id' field)
);
const reportDuration = reportTimer.stop();
@@ -350,19 +333,6 @@ async function main(webUrl, repoPath, configPath = null, pipelineTestingMode = f
costBreakdown
});
// Save deliverables to permanent location in Documents
const permanentPath = await savePermanentDeliverables(
sourceDir, webUrl, repoPath, session, timingBreakdown, costBreakdown
);
if (permanentPath) {
console.log(chalk.green(`📂 Deliverables permanently saved to: ${permanentPath}`));
}
// Keep files for manual review
console.log(chalk.blue(`📁 Files preserved for review at: ${sourceDir}`));
console.log(chalk.gray(` Deliverables: ${sourceDir}/deliverables/`));
console.log(chalk.gray(` Source code: ${sourceDir}/`));
// Display comprehensive timing summary
displayTimingSummary();
@@ -383,7 +353,6 @@ if (args[0] && args[0].includes('shannon.mjs')) {
// Parse flags and arguments
let configPath = null;
let pipelineTestingMode = false;
let logFilePath = null;
const nonFlagArgs = [];
let developerCommand = null;
const developerCommands = ['--run-phase', '--run-all', '--rollback-to', '--rerun', '--status', '--list-agents', '--cleanup'];
@@ -397,16 +366,6 @@ for (let i = 0; i < args.length; i++) {
console.log(chalk.red('❌ --config flag requires a file path'));
process.exit(1);
}
} else if (args[i] === '--log') {
// --log can optionally take a file path, otherwise use default
if (i + 1 < args.length && !args[i + 1].startsWith('-')) {
logFilePath = args[i + 1];
i++; // Skip the next argument
} else {
// Generate default log filename with timestamp
const timestamp = new Date().toISOString().replace(/[:.]/g, '-').slice(0, -5);
logFilePath = `shannon-${timestamp}.log`;
}
} else if (args[i] === '--pipeline-testing') {
pipelineTestingMode = true;
} else if (developerCommands.includes(args[i])) {
@@ -433,25 +392,10 @@ if (args.includes('--help') || args.includes('-h') || args.includes('help')) {
process.exit(0);
}
// Setup logging if --log flag is present
if (logFilePath) {
try {
cleanupLogging = await setupLogging(logFilePath);
const absoluteLogPath = path.isAbsolute(logFilePath)
? logFilePath
: path.join(process.cwd(), logFilePath);
console.log(chalk.green(`📝 Logging enabled: ${absoluteLogPath}`));
} catch (error) {
console.log(chalk.yellow(`⚠️ Failed to setup logging: ${error.message}`));
console.log(chalk.gray('Continuing without logging...'));
}
}
// Handle developer commands
if (developerCommand) {
await handleDeveloperCommand(developerCommand, nonFlagArgs, pipelineTestingMode, runClaudePromptWithRetry, loadPrompt);
await cleanupMCP();
if (cleanupLogging) await cleanupLogging();
process.exit(0);
}
@@ -501,8 +445,7 @@ try {
const finalReportPath = await main(webUrl, repoPathValidation.path, configPath, pipelineTestingMode);
console.log(chalk.green.bold('\n📄 FINAL REPORT AVAILABLE:'));
console.log(chalk.cyan(finalReportPath));
await cleanupMCP();
if (cleanupLogging) await cleanupLogging();
} catch (error) {
// Enhanced error boundary with proper logging
if (error instanceof PentestError) {
@@ -522,7 +465,6 @@ try {
console.log(chalk.gray(` Stack: ${error?.stack || 'No stack trace available'}`));
}
}
await cleanupMCP();
if (cleanupLogging) await cleanupLogging();
process.exit(1);
}
-309
@@ -1,309 +0,0 @@
import chalk from 'chalk';
import { path } from 'zx';
export class AgentStatusManager {
constructor(options = {}) {
this.mode = options.mode || 'parallel'; // 'parallel' or 'single'
this.activeStatuses = new Map();
this.lastStatusLine = '';
this.hiddenOperationCount = 0;
this.lastSummaryCount = 0;
this.summaryInterval = options.summaryInterval || 10;
this.showTodos = options.showTodos !== false;
// Tools to completely hide in output
this.suppressedTools = new Set([
'Read', 'Write', 'Edit', 'MultiEdit',
'Grep', 'Glob', 'LS'
]);
// Tools that might be noisy bash commands to hide
this.hiddenBashCommands = new Set([
'pwd', 'echo', 'ls', 'cd'
]);
}
/**
* Update status for an agent based on its current turn data
*/
updateAgentStatus(agentName, turnData) {
if (this.mode === 'single') {
this.handleSingleAgentOutput(agentName, turnData);
} else {
const status = this.extractMeaningfulStatus(turnData);
if (status) {
this.activeStatuses.set(agentName, status);
this.redrawStatusLine();
}
}
}
/**
* Handle output for single agent mode with clean formatting
*/
handleSingleAgentOutput(agentName, turnData) {
const toolUse = turnData.tool_use;
const text = turnData.assistant_text;
const turnCount = turnData.turnCount;
// Check if this is a tool we should hide
if (toolUse && this.shouldHideTool(toolUse)) {
this.hiddenOperationCount++;
// Show summary every N hidden operations
if (this.hiddenOperationCount - this.lastSummaryCount >= this.summaryInterval) {
const operationCount = this.hiddenOperationCount - this.lastSummaryCount;
console.log(chalk.gray(` [${operationCount} file operations...]`));
this.lastSummaryCount = this.hiddenOperationCount;
}
return;
}
// Format and show meaningful tools
if (toolUse) {
const formatted = this.formatMeaningfulTool(toolUse);
if (formatted) {
console.log(`🤖 ${formatted}`);
return;
}
}
// For turns without tool use, just ignore them silently
// These are planning/thinking turns that don't need any output
}
/**
* Check if a tool should be hidden from output
*/
shouldHideTool(toolUse) {
const toolName = toolUse.name;
// Always hide these tools
if (this.suppressedTools.has(toolName)) {
return true;
}
// Hide TodoWrite unless we're configured to show todos
if (toolName === 'TodoWrite' && !this.showTodos) {
return true;
}
// Hide simple bash commands
if (toolName === 'Bash') {
const command = toolUse.input?.command || '';
const simpleCommand = command.split(' ')[0];
return this.hiddenBashCommands.has(simpleCommand);
}
return false;
}
/**
* Format meaningful tools for single agent display
*/
formatMeaningfulTool(toolUse) {
const toolName = toolUse.name;
const input = toolUse.input || {};
switch (toolName) {
case 'Task':
const description = input.description || 'analysis agent';
return `🚀 Launching ${description}`;
case 'TodoWrite':
if (this.showTodos) {
return this.formatTodoUpdate(input);
}
return null;
case 'WebFetch':
const domain = this.extractDomain(input.url || '');
return `🌐 Fetching ${domain}`;
case 'Bash':
// Only show meaningful bash commands
const command = input.command || '';
if (command.includes('nmap') || command.includes('subfinder') || command.includes('whatweb')) {
const tool = command.split(' ')[0];
return `🔍 Running ${tool}`;
}
return null;
// Browser tools (keep existing formatting)
default:
if (toolName.startsWith('mcp__playwright__browser_')) {
return this.extractBrowserAction(toolUse);
}
}
return null;
}
/**
* Format TodoWrite updates for display
*/
formatTodoUpdate(input) {
if (!input.todos || !Array.isArray(input.todos)) {
return null;
}
const todos = input.todos;
const inProgress = todos.filter(t => t.status === 'in_progress');
const completed = todos.filter(t => t.status === 'completed');
if (completed.length > 0) {
const recent = completed[completed.length - 1];
return `${recent.content.slice(0, 50)}${recent.content.length > 50 ? '...' : ''}`;
}
if (inProgress.length > 0) {
const current = inProgress[0];
return `🔄 ${current.content.slice(0, 50)}${current.content.length > 50 ? '...' : ''}`;
}
return null;
}
/**
* Extract meaningful status from turn data, suppressing internal operations
*/
extractMeaningfulStatus(turnData) {
// Check for tool use first
if (turnData.tool_use?.name) {
// Suppress internal operations completely
if (this.suppressedTools.has(turnData.tool_use.name)) {
return null;
}
// Show browser testing actions
if (turnData.tool_use.name.startsWith('mcp__playwright__browser_')) {
return this.extractBrowserAction(turnData.tool_use);
}
// Show Task agent launches
if (turnData.tool_use.name === 'Task') {
const description = turnData.tool_use.input?.description || 'analysis';
return `🚀 ${description.slice(0, 40)}`;
}
}
// Parse assistant text for progress milestones
if (turnData.assistant_text) {
return this.extractProgressFromText(turnData.assistant_text);
}
return null; // Suppress everything else
}
/**
* Extract browser action details
*/
extractBrowserAction(toolUse) {
const actionType = toolUse.name.split('_').pop();
switch (actionType) {
case 'navigate':
const url = toolUse.input?.url || '';
const domain = this.extractDomain(url);
return `🌐 Testing ${domain}`;
case 'click':
const element = toolUse.input?.element || 'element';
return `🖱️ Clicking ${element.slice(0, 20)}`;
case 'fill':
case 'form':
return `📝 Testing form inputs`;
case 'snapshot':
return `📸 Capturing page state`;
case 'type':
return `⌨️ Testing input fields`;
default:
return `🌐 Browser: ${actionType}`;
}
}
/**
* Extract meaningful progress from assistant text (single-agent mode only)
*/
extractProgressFromText(text) {
// Only extract progress for single agents, not parallel ones
if (this.mode !== 'single') {
return null;
}
// For single agents, be very conservative about what we show
// Most progress should come from tool formatting, not text parsing
return null;
}
/**
* Extract domain from URL for display
*/
extractDomain(url) {
try {
const urlObj = new URL(url);
return urlObj.hostname || url.slice(0, 30);
} catch {
return url.slice(0, 30);
}
}
/**
* Redraw the status line showing all active agents
*/
redrawStatusLine() {
// Clear previous line
if (this.lastStatusLine) {
process.stdout.write('\r' + ' '.repeat(this.lastStatusLine.length) + '\r');
}
// Build new status line
const statusEntries = Array.from(this.activeStatuses.entries())
.map(([agent, status]) => `[${chalk.cyan(agent)}] ${status}`)
.join(' | ');
if (statusEntries) {
process.stdout.write(statusEntries);
this.lastStatusLine = statusEntries.replace(/\u001b\[[0-9;]*m/g, ''); // Remove ANSI codes for length calc
}
}
/**
* Clear status for a specific agent
*/
clearAgentStatus(agentName) {
this.activeStatuses.delete(agentName);
this.redrawStatusLine();
}
/**
* Clear all statuses and finish the status line
*/
finishStatusLine() {
if (this.lastStatusLine) {
process.stdout.write('\n'); // Move to next line
this.lastStatusLine = '';
this.activeStatuses.clear();
}
}
/**
* Parse JSON tool use from message content
*/
parseToolUse(content) {
try {
// Look for JSON tool use patterns
const jsonMatch = content.match(/\{"type":"tool_use".*?\}/s);
if (jsonMatch) {
return JSON.parse(jsonMatch[0]);
}
} catch (error) {
// Ignore parsing errors
}
return null;
}
}
+220 -63
@@ -1,15 +1,50 @@
import { $, fs, path } from 'zx';
import chalk from 'chalk';
import { query } from '@anthropic-ai/claude-code';
import { query } from '@anthropic-ai/claude-agent-sdk';
import { fileURLToPath } from 'url';
import { dirname } from 'path';
import { isRetryableError, getRetryDelay, PentestError } from '../error-handling.js';
import { ProgressIndicator } from '../progress-indicator.js';
import { timingResults, costResults, Timer, formatDuration } from '../utils/metrics.js';
import { timingResults, costResults, Timer } from '../utils/metrics.js';
import { formatDuration } from '../audit/utils.js';
import { createGitCheckpoint, commitGitSuccess, rollbackGitWorkspace } from '../utils/git-manager.js';
import { savePromptSnapshot } from '../prompts/prompt-manager.js';
import { AGENT_VALIDATORS } from '../constants.js';
import { AGENT_VALIDATORS, MCP_AGENT_MAPPING } from '../constants.js';
import { filterJsonToolCalls, getAgentPrefix } from '../utils/output-formatter.js';
import { generateSessionLogPath } from '../session-manager.js';
import { AuditSession } from '../audit/index.js';
import { createShannonHelperServer } from '../../mcp-server/src/index.js';
const __filename = fileURLToPath(import.meta.url);
const __dirname = dirname(__filename);
/**
* Convert agent name to prompt name for MCP_AGENT_MAPPING lookup
*
* @param {string} agentName - Agent name (e.g., 'xss-vuln', 'injection-exploit')
* @returns {string} Prompt name (e.g., 'vuln-xss', 'exploit-injection')
*/
function agentNameToPromptName(agentName) {
// Special cases
if (agentName === 'pre-recon') return 'pre-recon-code';
if (agentName === 'report') return 'report-executive';
if (agentName === 'recon') return 'recon';
// Pattern: {type}-vuln → vuln-{type}
const vulnMatch = agentName.match(/^(.+)-vuln$/);
if (vulnMatch) {
return `vuln-${vulnMatch[1]}`;
}
// Pattern: {type}-exploit → exploit-{type}
const exploitMatch = agentName.match(/^(.+)-exploit$/);
if (exploitMatch) {
return `exploit-${exploitMatch[1]}`;
}
// Default: return as-is
return agentName;
}
// Simplified validation using direct agent name mapping
async function validateAgentOutput(result, agentName, sourceDir) {
@@ -57,10 +92,11 @@ async function validateAgentOutput(result, agentName, sourceDir) {
// - Output validation
// - Prompt snapshotting for debugging
// - Git checkpoint/rollback safety
async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context = '', description = 'Claude analysis', colorFn = chalk.cyan, sessionMetadata = null) {
async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context = '', description = 'Claude analysis', agentName = null, colorFn = chalk.cyan, sessionMetadata = null, auditSession = null, attemptNumber = 1) {
const timer = new Timer(`agent-${description.toLowerCase().replace(/\s+/g, '-')}`);
const fullPrompt = context ? `${context}\n\n${prompt}` : prompt;
let totalCost = 0;
let partialCost = 0; // Track partial cost for crash safety
// Auto-detect execution mode to adjust logging behavior
const isParallelExecution = description.includes('vuln agent') || description.includes('exploit agent');
@@ -82,39 +118,80 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
progressIndicator = new ProgressIndicator(`Running ${agentType}...`);
}
// Setup detailed logging for all agents (if session metadata is available)
// NOTE: Logging now handled by AuditSession (append-only, crash-safe)
// Legacy log path generation kept for compatibility
let logFilePath = null;
let logBuffer = [];
if (sessionMetadata && sessionMetadata.webUrl && sessionMetadata.sessionId) {
if (sessionMetadata && sessionMetadata.webUrl && sessionMetadata.id) {
const timestamp = new Date().toISOString().replace(/T/, '_').replace(/[:.]/g, '-').slice(0, 19);
const agentName = description.toLowerCase().replace(/\s+/g, '-');
// Use session-based folder structure
const logDir = generateSessionLogPath(sessionMetadata.webUrl, sessionMetadata.sessionId);
await fs.ensureDir(logDir);
logFilePath = path.join(logDir, `${timestamp}_${agentName}_attempt-1.log`);
// Initialize log with agent startup info
const sessionId = sessionMetadata?.sessionId || path.basename(sourceDir).split('-').pop().substring(0, 8);
logBuffer.push(`=== ${description} - Detailed Execution Log ===`);
logBuffer.push(`Timestamp: ${new Date().toISOString()}`);
logBuffer.push(`Working Directory: ${sourceDir}`);
logBuffer.push(`Session ID: ${sessionId}`);
logBuffer.push(`Log File: ${logFilePath}`);
logBuffer.push(`\n=== Agent Execution Start ===\n`);
const logDir = generateSessionLogPath(sessionMetadata.webUrl, sessionMetadata.id);
logFilePath = path.join(logDir, `${timestamp}_${agentName}_attempt-${attemptNumber}.log`);
} else {
console.log(chalk.blue(` 🤖 Running Claude Code: ${description}...`));
}
// Declare variables that need to be accessible in both try and catch blocks
let turnCount = 0;
try {
// Create MCP server with target directory context
const shannonHelperServer = createShannonHelperServer(sourceDir);
// Look up agent's assigned Playwright MCP server
// Convert agent name (e.g., 'xss-vuln') to prompt name (e.g., 'vuln-xss')
let playwrightMcpName = null;
if (agentName) {
const promptName = agentNameToPromptName(agentName);
playwrightMcpName = MCP_AGENT_MAPPING[promptName];
if (playwrightMcpName) {
console.log(chalk.gray(` 🎭 Assigned ${agentName} → ${playwrightMcpName}`));
}
}
// Configure MCP servers: shannon-helper (SDK) + playwright-agentN (stdio)
const mcpServers = {
'shannon-helper': shannonHelperServer,
};
// Add Playwright MCP server if this agent needs browser automation
if (playwrightMcpName) {
const userDataDir = `/tmp/${playwrightMcpName}`;
// Detect if running in Docker via explicit environment variable
const isDocker = process.env.SHANNON_DOCKER === 'true';
// Build args array - conditionally add --executable-path for Docker
const mcpArgs = [
'@playwright/mcp@latest',
'--isolated',
'--user-data-dir', userDataDir,
];
// Docker: Use system Chromium; Local: Use Playwright's bundled browsers
if (isDocker) {
mcpArgs.push('--executable-path', '/usr/bin/chromium-browser');
mcpArgs.push('--browser', 'chromium');
}
mcpServers[playwrightMcpName] = {
type: 'stdio',
command: 'npx',
args: mcpArgs,
env: {
...process.env,
PLAYWRIGHT_HEADLESS: 'true', // Ensure headless mode for security and CI compatibility
...(isDocker && { PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD: '1' }), // Only skip in Docker
},
};
}
const options = {
model: 'claude-sonnet-4-20250514', // Use latest Claude 4 Sonnet
model: 'claude-sonnet-4-5-20250929', // Use latest Claude 4.5 Sonnet
maxTurns: 10_000, // Maximum turns for autonomous work
cwd: sourceDir, // Set working directory using SDK option
permissionMode: 'bypassPermissions', // Bypass all permission checks for pentesting
customSystemPrompt: fullPrompt, // Use system prompt for better security and consistency
mcpServers,
};
// SDK Options only shown for verbose agents (not clean output)
@@ -124,7 +201,6 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
let result = null;
let messages = [];
let turnCount = 0;
let apiErrorDetected = false;
// Start progress indicator for clean output agents
@@ -132,9 +208,15 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
progressIndicator.start();
}
for await (const message of query({ prompt: 'Begin.', options })) {
let messageCount = 0;
try {
for await (const message of query({ prompt: fullPrompt, options })) {
messageCount++;
if (message.type === "assistant") {
turnCount++;
const content = Array.isArray(message.message.content)
? message.message.content.map(c => c.text || JSON.stringify(c)).join('\n')
: message.message.content;
@@ -177,9 +259,15 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
console.log(colorFn(` ${content}`));
}
// Log full details to file for later review
logBuffer.push(`\n🤖 Turn ${turnCount} (${description}):`);
logBuffer.push(content);
// Log to audit system (crash-safe, append-only)
if (auditSession) {
await auditSession.logEvent('llm_response', {
turn: turnCount,
content,
timestamp: new Date().toISOString()
});
}
messages.push(content);
// Check for API error patterns in assistant message content
@@ -210,6 +298,15 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
if (message.input && Object.keys(message.input).length > 0) {
console.log(chalk.gray(` Input: ${JSON.stringify(message.input, null, 2)}`));
}
// Log tool start event
if (auditSession) {
await auditSession.logEvent('tool_start', {
toolName: message.name,
parameters: message.input,
timestamp: new Date().toISOString()
});
}
} else if (message.type === "tool_result") {
console.log(chalk.green(` ✅ Tool Result:`));
if (message.content) {
@@ -221,6 +318,14 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
console.log(chalk.gray(` ${resultStr}`));
}
}
// Log tool end event
if (auditSession) {
await auditSession.logEvent('tool_end', {
result: message.content,
timestamp: new Date().toISOString()
});
}
} else if (message.type === "result") {
result = message.result;
@@ -273,13 +378,17 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
costResults.agents[agentKey] = cost;
costResults.total += cost;
// Store cost for return value
// Store cost for return value and partial tracking
totalCost = cost;
partialCost = cost;
break;
} else {
// Log any other message types we might not be handling
console.log(chalk.gray(` 💬 ${message.type}: ${JSON.stringify(message, null, 2)}`));
}
}
} catch (queryError) {
throw queryError; // Re-throw to outer catch
}
const duration = timer.stop();
@@ -292,23 +401,14 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
console.log(chalk.yellow(` ⚠️ API Error detected in ${description} - will validate deliverables before failing`));
}
// Finish status line for parallel execution and save detailed log
// Finish status line for parallel execution
if (statusManager) {
statusManager.clearAgentStatus(description);
statusManager.finishStatusLine();
}
// Write detailed log to file
if (logFilePath && logBuffer.length > 0) {
logBuffer.push(`\n=== Agent Execution Complete ===`);
logBuffer.push(`Duration: ${formatDuration(duration)}`);
logBuffer.push(`Turns: ${turnCount}`);
logBuffer.push(`Cost: $${totalCost.toFixed(4)}`);
logBuffer.push(`Status: Success`);
logBuffer.push(`Completed: ${new Date().toISOString()}`);
await fs.writeFile(logFilePath, logBuffer.join('\n'));
}
// NOTE: Log writing now handled by AuditSession (crash-safe, append-only)
// Legacy log writing removed - audit system handles this automatically
// Show completion messages based on agent type
if (progressIndicator) {
@@ -327,7 +427,15 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
}
// Return result with log file path for all agents
const returnData = { result, success: true, duration, turns: turnCount, cost: totalCost, apiErrorDetected };
const returnData = {
result,
success: true,
duration,
turns: turnCount,
cost: totalCost,
partialCost, // Include partial cost for crash recovery
apiErrorDetected
};
if (logFilePath) {
returnData.logFile = logFilePath;
}
@@ -344,17 +452,16 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
statusManager.finishStatusLine();
}
// Write error log to file
if (logFilePath && logBuffer.length > 0) {
logBuffer.push(`\n=== Agent Execution Failed ===`);
logBuffer.push(`Duration: ${formatDuration(duration)}`);
logBuffer.push(`Turns: ${turnCount}`);
logBuffer.push(`Error: ${error.message}`);
logBuffer.push(`Error Type: ${error.constructor.name}`);
logBuffer.push(`Status: Failed`);
logBuffer.push(`Failed: ${new Date().toISOString()}`);
await fs.writeFile(logFilePath, logBuffer.join('\n'));
// Log error to audit system
if (auditSession) {
await auditSession.logEvent('error', {
message: error.message,
errorType: error.constructor.name,
stack: error.stack,
duration,
turns: turnCount,
timestamp: new Date().toISOString()
});
}
// Show error messages based on agent type
@@ -420,6 +527,7 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
prompt: fullPrompt.slice(0, 100) + '...',
success: false,
duration,
cost: partialCost, // Include partial cost on error
retryable: isRetryableError(error)
};
}
@@ -432,6 +540,7 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
// - Prompt snapshotting for debugging and reproducibility
// - Git checkpoint/rollback safety for workspace protection
// - Comprehensive error handling and logging
// - Crash-safe audit logging via AuditSession
export async function runClaudePromptWithRetry(prompt, sourceDir, allowedTools = 'Read', context = '', description = 'Claude analysis', agentName = null, colorFn = chalk.cyan, sessionMetadata = null) {
const maxRetries = 3;
let lastError;
@@ -439,22 +548,25 @@ export async function runClaudePromptWithRetry(prompt, sourceDir, allowedTools =
console.log(chalk.cyan(`🚀 Starting ${description} with ${maxRetries} max attempts`));
// Save prompt snapshot before execution starts (for debugging failed runs)
let snapshotSaved = false;
// Initialize audit session (crash-safe logging)
let auditSession = null;
if (sessionMetadata && agentName) {
auditSession = new AuditSession(sessionMetadata);
await auditSession.initialize();
}
for (let attempt = 1; attempt <= maxRetries; attempt++) {
// Create checkpoint before each attempt
await createGitCheckpoint(sourceDir, description, attempt);
// Save snapshot on first attempt only (before any execution)
if (!snapshotSaved && agentName) {
// Start agent tracking in audit system (saves prompt snapshot automatically)
if (auditSession) {
const fullPrompt = retryContext ? `${retryContext}\n\n${prompt}` : prompt;
await savePromptSnapshot(sourceDir, agentName, fullPrompt);
snapshotSaved = true;
await auditSession.startAgent(agentName, fullPrompt, attempt);
}
try {
const result = await runClaudePrompt(prompt, sourceDir, allowedTools, retryContext, description, colorFn, sessionMetadata);
const result = await runClaudePrompt(prompt, sourceDir, allowedTools, retryContext, description, agentName, colorFn, sessionMetadata, auditSession, attempt);
// Validate output after successful run
if (result.success) {
@@ -466,6 +578,17 @@ export async function runClaudePromptWithRetry(prompt, sourceDir, allowedTools =
console.log(chalk.yellow(`📋 Validation: Ready for exploitation despite API error warnings`));
}
// Record successful attempt in audit system
if (auditSession) {
await auditSession.endAgent(agentName, {
attemptNumber: attempt,
duration_ms: result.duration,
cost_usd: result.cost || 0,
success: true,
checkpoint: await getGitCommitHash(sourceDir)
});
}
// Commit successful changes (will include the snapshot)
await commitGitSuccess(sourceDir, description);
console.log(chalk.green.bold(`🎉 ${description} completed successfully on attempt ${attempt}/${maxRetries}`));
@@ -474,6 +597,18 @@ export async function runClaudePromptWithRetry(prompt, sourceDir, allowedTools =
// Agent completed but output validation failed
console.log(chalk.yellow(`⚠️ ${description} completed but output validation failed`));
// Record failed validation attempt in audit system
if (auditSession) {
await auditSession.endAgent(agentName, {
attemptNumber: attempt,
duration_ms: result.duration,
cost_usd: result.partialCost || result.cost || 0,
success: false,
error: 'Output validation failed',
isFinalAttempt: attempt === maxRetries
});
}
// If API error detected AND validation failed, this is a retryable error
if (result.apiErrorDetected) {
console.log(chalk.yellow(`⚠️ API Error detected with validation failure - treating as retryable`));
@@ -501,6 +636,18 @@ export async function runClaudePromptWithRetry(prompt, sourceDir, allowedTools =
} catch (error) {
lastError = error;
// Record failed attempt in audit system
if (auditSession) {
await auditSession.endAgent(agentName, {
attemptNumber: attempt,
duration_ms: error.duration || 0,
cost_usd: error.cost || 0,
success: false,
error: error.message,
isFinalAttempt: attempt === maxRetries
});
}
// Check if error is retryable
if (!isRetryableError(error)) {
console.log(chalk.red(`${description} failed with non-retryable error: ${error.message}`));
@@ -533,4 +680,14 @@ export async function runClaudePromptWithRetry(prompt, sourceDir, allowedTools =
}
throw lastError;
}
// Helper function to get git commit hash
async function getGitCommitHash(sourceDir) {
try {
const result = await $`cd ${sourceDir} && git rev-parse HEAD`;
return result.stdout.trim();
} catch (error) {
return null;
}
}
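The Playwright wiring that `runClaudePrompt` builds inline can be reviewed more easily as a pure function. The following is a minimal sketch, assuming a hypothetical helper named `buildPlaywrightMcpConfig` that is not part of this PR. The flags, paths, and environment variables are copied from the diff above and have not been re-verified against the `@playwright/mcp` CLI.

```javascript
// Hypothetical helper (not part of this PR): builds the stdio MCP server entry
// that runClaudePrompt assembles inline. Flags and paths mirror the diff above.
function buildPlaywrightMcpConfig(playwrightMcpName, env = {}) {
  const isDocker = env.SHANNON_DOCKER === 'true';
  const args = [
    '@playwright/mcp@latest',
    '--isolated',
    '--user-data-dir', `/tmp/${playwrightMcpName}`,
  ];
  if (isDocker) {
    // In Docker the diff points at a system Chromium instead of bundled browsers
    args.push('--executable-path', '/usr/bin/chromium-browser');
    args.push('--browser', 'chromium');
  }
  return {
    type: 'stdio',
    command: 'npx',
    args,
    env: {
      ...env,
      PLAYWRIGHT_HEADLESS: 'true',
      // Spreading `false` adds nothing, so this key only appears in Docker
      ...(isDocker && { PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD: '1' }),
    },
  };
}

const localConfig = buildPlaywrightMcpConfig('playwright-agent1', {});
const dockerConfig = buildPlaywrightMcpConfig('playwright-agent1', { SHANNON_DOCKER: 'true' });
console.log(localConfig.args);
console.log(dockerConfig.args);
```

Passing the environment as a parameter, rather than reading `process.env` directly, is an assumption made here so that both the Docker and the local branch can be exercised without changing global state.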
+206
@@ -0,0 +1,206 @@
/**
* Audit Session - Main Facade
*
* Coordinates logger, metrics tracker, and concurrency control for comprehensive
* crash-safe audit logging.
*/
import { AgentLogger } from './logger.js';
import { MetricsTracker } from './metrics-tracker.js';
import { initializeAuditStructure, formatTimestamp } from './utils.js';
import { SessionMutex } from '../utils/concurrency.js';
// Global mutex instance
const sessionMutex = new SessionMutex();
/**
* AuditSession - Main audit system facade
*/
export class AuditSession {
/**
* @param {Object} sessionMetadata - Session metadata from Shannon store
* @param {string} sessionMetadata.id - Session UUID
* @param {string} sessionMetadata.webUrl - Target web URL
* @param {string} [sessionMetadata.repoPath] - Target repository path
*/
constructor(sessionMetadata) {
this.sessionMetadata = sessionMetadata;
this.sessionId = sessionMetadata.id;
// Validate required fields
if (!this.sessionId) {
throw new Error('sessionMetadata.id is required');
}
if (!this.sessionMetadata.webUrl) {
throw new Error('sessionMetadata.webUrl is required');
}
// Components
this.metricsTracker = new MetricsTracker(sessionMetadata);
// Active logger (one at a time per agent attempt)
this.currentLogger = null;
// Initialization flag
this.initialized = false;
}
/**
* Initialize audit session (creates directories, session.json)
* Idempotent and race-safe
* @returns {Promise<void>}
*/
async initialize() {
if (this.initialized) {
return; // Already initialized
}
// Create directory structure
await initializeAuditStructure(this.sessionMetadata);
// Initialize metrics tracker (loads or creates session.json)
await this.metricsTracker.initialize();
this.initialized = true;
}
/**
* Ensure initialized (helper for lazy initialization)
* @private
* @returns {Promise<void>}
*/
async ensureInitialized() {
if (!this.initialized) {
await this.initialize();
}
}
/**
* Start agent execution
* @param {string} agentName - Agent name
* @param {string} promptContent - Full prompt content
* @param {number} [attemptNumber=1] - Attempt number
* @returns {Promise<void>}
*/
async startAgent(agentName, promptContent, attemptNumber = 1) {
await this.ensureInitialized();
// Save prompt snapshot (only on first attempt)
if (attemptNumber === 1) {
await AgentLogger.savePrompt(this.sessionMetadata, agentName, promptContent);
}
// Create and initialize logger for this attempt
this.currentLogger = new AgentLogger(this.sessionMetadata, agentName, attemptNumber);
await this.currentLogger.initialize();
// Start metrics tracking
this.metricsTracker.startAgent(agentName, attemptNumber);
// Log start event
await this.currentLogger.logEvent('agent_start', {
agentName,
attemptNumber,
timestamp: formatTimestamp()
});
}
/**
* Log event during agent execution
* @param {string} eventType - Event type (tool_start, tool_end, llm_response, etc.)
* @param {Object} eventData - Event data
* @returns {Promise<void>}
*/
async logEvent(eventType, eventData) {
if (!this.currentLogger) {
throw new Error('No active logger. Call startAgent() first.');
}
await this.currentLogger.logEvent(eventType, eventData);
}
/**
* End agent execution (mutex-protected)
* @param {string} agentName - Agent name
* @param {Object} result - Execution result
* @param {number} result.attemptNumber - Attempt number
* @param {number} result.duration_ms - Duration in milliseconds
* @param {number} result.cost_usd - Cost in USD
* @param {boolean} result.success - Whether attempt succeeded
* @param {string} [result.error] - Error message (if failed)
* @param {string} [result.checkpoint] - Git checkpoint hash (if succeeded)
* @param {boolean} [result.isFinalAttempt=false] - Whether this is the final attempt
* @returns {Promise<void>}
*/
async endAgent(agentName, result) {
// Log end event
if (this.currentLogger) {
await this.currentLogger.logEvent('agent_end', {
agentName,
success: result.success,
duration_ms: result.duration_ms,
cost_usd: result.cost_usd,
timestamp: formatTimestamp()
});
// Close logger
await this.currentLogger.close();
this.currentLogger = null;
}
// Mutex-protected update to session.json
const unlock = await sessionMutex.lock(this.sessionId);
try {
// Reload metrics (in case of parallel updates)
await this.metricsTracker.reload();
// Update metrics
await this.metricsTracker.endAgent(agentName, result);
} finally {
unlock();
}
}
/**
* Mark multiple agents as rolled back
* @param {string[]} agentNames - Array of agent names
* @returns {Promise<void>}
*/
async markMultipleRolledBack(agentNames) {
await this.ensureInitialized();
const unlock = await sessionMutex.lock(this.sessionId);
try {
await this.metricsTracker.reload();
await this.metricsTracker.markMultipleRolledBack(agentNames);
} finally {
unlock();
}
}
/**
* Update session status
* @param {string} status - New status (in-progress, completed, failed)
* @returns {Promise<void>}
*/
async updateSessionStatus(status) {
await this.ensureInitialized();
const unlock = await sessionMutex.lock(this.sessionId);
try {
await this.metricsTracker.reload();
await this.metricsTracker.updateSessionStatus(status);
} finally {
unlock();
}
}
/**
* Get current metrics (read-only)
* @returns {Promise<Object>} Current metrics
*/
async getMetrics() {
await this.ensureInitialized();
return this.metricsTracker.getMetrics();
}
}
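`endAgent`, `markMultipleRolledBack`, and `updateSessionStatus` all follow the same lock, reload, update, unlock sequence. The sketch below illustrates why that sequence matters when several agents finish in parallel. `SessionMutex` lives in `../utils/concurrency.js` and is not shown in this diff, so `KeyedMutex` is a hypothetical stand-in that only assumes the interface used above: `lock(key)` resolves to an unlock function. The in-memory `store` stands in for `session.json`.

```javascript
// Hypothetical stand-in for SessionMutex (the real class is not shown in this diff).
class KeyedMutex {
  constructor() {
    this.tails = new Map(); // key -> promise that settles when the last holder unlocks
  }

  async lock(key) {
    const previous = this.tails.get(key) || Promise.resolve();
    let release;
    const current = new Promise((resolve) => { release = resolve; });
    // Queue behind whoever holds (or is waiting for) this key
    const tail = previous.then(() => current);
    this.tails.set(key, tail);
    await previous;
    return () => {
      // Drop the map entry when nobody queued up behind this holder
      if (this.tails.get(key) === tail) this.tails.delete(key);
      release();
    };
  }
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Same shape as AuditSession.endAgent: lock, reload, update, save, unlock
async function addCost(mutex, store, sessionId, cost) {
  const unlock = await mutex.lock(sessionId);
  try {
    const snapshot = store.total_cost_usd;   // "reload" from disk
    await sleep(5);                          // simulated I/O latency
    store.total_cost_usd = snapshot + cost;  // "save" to disk
  } finally {
    unlock();
  }
}

async function runDemo() {
  const mutex = new KeyedMutex();
  const store = { total_cost_usd: 0 };
  await Promise.all([
    addCost(mutex, store, 'session-1', 1),
    addCost(mutex, store, 'session-1', 2),
    addCost(mutex, store, 'session-1', 3),
  ]);
  return store.total_cost_usd;
}

runDemo().then((total) => console.log('total cost:', total));
```

Without the lock, all three callers would read the same stale snapshot and the last write would win. A mutex of this kind only serializes callers inside one Node.js process; it does not coordinate separate processes writing the same file.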
+16
@@ -0,0 +1,16 @@
/**
* Unified Audit & Metrics System
*
* Public API for the audit system. Provides crash-safe, append-only logging
* and comprehensive metrics tracking for Shannon penetration testing sessions.
*
* IMPORTANT: Session objects must have an 'id' field (NOT 'sessionId')
* Example: { id: "uuid", webUrl: "...", repoPath: "..." }
*
* @module audit
*/
export { AuditSession } from './audit-session.js';
export { AgentLogger } from './logger.js';
export { MetricsTracker } from './metrics-tracker.js';
export * as AuditUtils from './utils.js';
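The `id` versus `sessionId` requirement in the module comment is easy to get wrong at call sites. A minimal sketch of a pre-flight check follows, assuming a hypothetical helper named `checkSessionShape` that is not part of this PR. It mirrors the two checks in the `AuditSession` constructor and reports problems instead of throwing.

```javascript
// Hypothetical pre-flight check (not part of this PR) mirroring the
// validation performed by the AuditSession constructor.
function checkSessionShape(session) {
  if (!session || typeof session !== 'object') {
    return ['session must be an object'];
  }
  const problems = [];
  if (!session.id) {
    // Call out the legacy field name explicitly, since that is the likely mistake
    problems.push(session.sessionId
      ? "found 'sessionId'; the audit system expects 'id'"
      : "missing required field 'id'");
  }
  if (!session.webUrl) {
    problems.push("missing required field 'webUrl'");
  }
  return problems;
}

const validSession = { id: 'uuid', webUrl: 'https://example.com', repoPath: '/repo' };
const legacySession = { sessionId: 'uuid', webUrl: 'https://example.com' };
console.log(checkSessionShape(validSession));
console.log(checkSessionShape(legacySession));
```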
+172
@@ -0,0 +1,172 @@
/**
* Append-Only Agent Logger
*
* Provides crash-safe, append-only logging for agent execution.
* Appends each event through a file stream, so a crash loses at most the writes still buffered in the stream.
*/
import fs from 'fs';
import { generateLogPath, generatePromptPath, atomicWrite, formatTimestamp } from './utils.js';
/**
* AgentLogger - Manages append-only logging for a single agent execution
*/
export class AgentLogger {
/**
* @param {Object} sessionMetadata - Session metadata
* @param {string} agentName - Name of the agent
* @param {number} attemptNumber - Attempt number (1, 2, 3, ...)
*/
constructor(sessionMetadata, agentName, attemptNumber) {
this.sessionMetadata = sessionMetadata;
this.agentName = agentName;
this.attemptNumber = attemptNumber;
this.timestamp = Date.now();
// Generate log file path
this.logPath = generateLogPath(sessionMetadata, agentName, this.timestamp, attemptNumber);
// Create write stream (append mode)
this.stream = null;
this.isOpen = false;
}
/**
* Initialize the log stream (creates file and opens stream)
* @returns {Promise<void>}
*/
async initialize() {
if (this.isOpen) {
return; // Already initialized
}
// Create write stream in append mode (autoClose closes the file descriptor on finish or error)
this.stream = fs.createWriteStream(this.logPath, {
flags: 'a', // Append mode
encoding: 'utf8',
autoClose: true
});
this.isOpen = true;
// Write header
await this.writeHeader();
}
/**
* Write header to log file
* @private
* @returns {Promise<void>}
*/
async writeHeader() {
const header = [
`========================================`,
`Agent: ${this.agentName}`,
`Attempt: ${this.attemptNumber}`,
`Started: ${formatTimestamp(this.timestamp)}`,
`Session: ${this.sessionMetadata.id}`,
`Web URL: ${this.sessionMetadata.webUrl}`,
`========================================\n`
].join('\n');
return this.writeRaw(header);
}
/**
* Write raw text to the log file
* Resolves once the stream has flushed the chunk to the OS (no fsync is issued)
* @private
* @param {string} text - Text to write
* @returns {Promise<void>}
*/
writeRaw(text) {
return new Promise((resolve, reject) => {
if (!this.isOpen || !this.stream) {
reject(new Error('Logger not initialized'));
return;
}
// Settle exactly once, from the write callback: it fires after the chunk
// has left the stream's buffer, or with an error if the write failed.
// Resolving earlier would silently drop write errors.
this.stream.write(text, 'utf8', (error) => {
if (error) {
reject(error);
} else {
resolve();
}
});
});
}
/**
* Log an event (tool_start, tool_end, llm_response, etc.)
* Events are logged as JSON for parseability
* @param {string} eventType - Type of event
* @param {Object} eventData - Event data
* @returns {Promise<void>}
*/
async logEvent(eventType, eventData) {
const event = {
type: eventType,
timestamp: formatTimestamp(),
data: eventData
};
const eventLine = `${JSON.stringify(event)}\n`;
return this.writeRaw(eventLine);
}
/**
* Close the log stream
* @returns {Promise<void>}
*/
async close() {
if (!this.isOpen || !this.stream) {
return;
}
return new Promise((resolve) => {
this.stream.end(() => {
this.isOpen = false;
resolve();
});
});
}
/**
* Save prompt snapshot to prompts directory
* Static method - doesn't require logger instance
* @param {Object} sessionMetadata - Session metadata
* @param {string} agentName - Agent name
* @param {string} promptContent - Full prompt content
* @returns {Promise<void>}
*/
static async savePrompt(sessionMetadata, agentName, promptContent) {
const promptPath = generatePromptPath(sessionMetadata, agentName);
// Create header with metadata
const header = [
`# Prompt Snapshot: ${agentName}`,
``,
`**Session:** ${sessionMetadata.id}`,
`**Web URL:** ${sessionMetadata.webUrl}`,
`**Saved:** ${formatTimestamp()}`,
``,
`---`,
``
].join('\n');
const fullContent = header + promptContent;
// Use atomic write for safety
await atomicWrite(promptPath, fullContent);
}
}
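Because `logEvent` writes one JSON object per line after a plain-text header, a log can be read back line by line even when the process died mid-write. The following is a minimal sketch of such a reader, assuming a hypothetical helper named `parseAgentLog` that is not part of this PR. The sample text is invented for illustration and only imitates the header and event layout shown above.

```javascript
// Hypothetical reader for AgentLogger output (not part of this PR).
// Header and separator lines are skipped; a truncated final line is counted, not fatal.
function parseAgentLog(text) {
  const events = [];
  let skipped = 0;
  for (const line of text.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed.startsWith('{')) continue; // header, separators, blank lines
    try {
      events.push(JSON.parse(trimmed));
    } catch (error) {
      skipped++; // most likely a line cut short by a crash
    }
  }
  return { events, skipped };
}

// Invented sample: header, two complete events, one line truncated mid-write
const sample = [
  '========================================',
  'Agent: xss-vuln',
  'Attempt: 1',
  '========================================',
  '',
  JSON.stringify({ type: 'agent_start', timestamp: '2025-10-23T00:00:00.000Z', data: { agentName: 'xss-vuln' } }),
  JSON.stringify({ type: 'tool_start', timestamp: '2025-10-23T00:00:01.000Z', data: { toolName: 'Read' } }),
  '{"type":"llm_response","timestamp":"2025-10-23T00:00:02',
].join('\n');

const parsed = parseAgentLog(sample);
console.log(parsed.events.map((event) => event.type), parsed.skipped);
```

Treating an unparseable line as skipped rather than as an error is a design choice of this sketch: in an append-only log, only the last line can be incomplete after a crash, so everything before it remains usable.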
+331
@@ -0,0 +1,331 @@
/**
* Metrics Tracker
*
* Manages session.json with comprehensive timing, cost, and validation metrics.
* Tracks attempt-level data for complete forensic trail.
*/
import {
generateSessionJsonPath,
atomicWrite,
readJson,
fileExists,
formatTimestamp,
calculatePercentage
} from './utils.js';
/**
* MetricsTracker - Manages metrics for a session
*/
export class MetricsTracker {
/**
* @param {Object} sessionMetadata - Session metadata from Shannon store
*/
constructor(sessionMetadata) {
this.sessionMetadata = sessionMetadata;
this.sessionJsonPath = generateSessionJsonPath(sessionMetadata);
// In-memory state (loaded from/synced to session.json)
this.data = null;
// Active timers (agent name -> start time)
this.activeTimers = new Map();
}
/**
* Initialize session.json (idempotent)
* @returns {Promise<void>}
*/
async initialize() {
// Check if session.json already exists
const exists = await fileExists(this.sessionJsonPath);
if (exists) {
// Load existing data
this.data = await readJson(this.sessionJsonPath);
} else {
// Create new session.json
this.data = this.createInitialData();
await this.save();
}
}
/**
* Create initial session.json structure
* @private
* @returns {Object} Initial session data
*/
createInitialData() {
return {
session: {
id: this.sessionMetadata.id,
webUrl: this.sessionMetadata.webUrl,
repoPath: this.sessionMetadata.repoPath,
status: 'in-progress',
createdAt: formatTimestamp()
},
metrics: {
total_duration_ms: 0,
total_cost_usd: 0,
phases: {}, // Phase-level aggregations: { duration_ms, duration_percentage, cost_usd, agent_count }
agents: {} // Agent-level metrics: { status, attempts[], final_duration_ms, total_cost_usd, checkpoint }
}
};
}
/**
* Start tracking an agent execution
* @param {string} agentName - Agent name
* @param {number} attemptNumber - Attempt number
* @returns {void}
*/
startAgent(agentName, attemptNumber) {
this.activeTimers.set(agentName, {
startTime: Date.now(),
attemptNumber
});
}
/**
* End agent execution and update metrics
* @param {string} agentName - Agent name
* @param {Object} result - Agent execution result
* @param {number} result.attemptNumber - Attempt number
* @param {number} result.duration_ms - Duration in milliseconds
* @param {number} result.cost_usd - Cost in USD
* @param {boolean} result.success - Whether attempt succeeded
* @param {string} [result.error] - Error message (if failed)
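* @param {string} [result.checkpoint] - Git checkpoint hash (if succeeded)
* @param {boolean} [result.isFinalAttempt=false] - Whether this is the final attempt (marks the agent as failed)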
* @returns {Promise<void>}
*/
async endAgent(agentName, result) {
// Initialize agent metrics if not exists
if (!this.data.metrics.agents[agentName]) {
this.data.metrics.agents[agentName] = {
status: 'in-progress',
attempts: [],
final_duration_ms: 0,
total_cost_usd: 0 // Total cost across all attempts (including retries)
};
}
const agent = this.data.metrics.agents[agentName];
// Add attempt to array
const attempt = {
attempt_number: result.attemptNumber,
duration_ms: result.duration_ms,
cost_usd: result.cost_usd,
success: result.success,
timestamp: formatTimestamp()
};
if (result.error) {
attempt.error = result.error;
}
agent.attempts.push(attempt);
// Update total cost (includes failed attempts)
agent.total_cost_usd = agent.attempts.reduce((sum, a) => sum + a.cost_usd, 0);
// If successful, update final metrics and status
if (result.success) {
agent.status = 'success';
agent.final_duration_ms = result.duration_ms;
if (result.checkpoint) {
agent.checkpoint = result.checkpoint;
}
} else {
// If this was the last attempt, mark as failed
if (result.isFinalAttempt) {
agent.status = 'failed';
}
}
// Clear active timer
this.activeTimers.delete(agentName);
// Recalculate aggregations
this.recalculateAggregations();
// Save to disk
await this.save();
}
/**
* Mark agent as rolled back
* @param {string} agentName - Agent name
* @returns {Promise<void>}
*/
async markRolledBack(agentName) {
if (!this.data.metrics.agents[agentName]) {
return; // Agent not tracked
}
const agent = this.data.metrics.agents[agentName];
agent.status = 'rolled-back';
agent.rolled_back_at = formatTimestamp();
// Recalculate aggregations (exclude rolled-back agents)
this.recalculateAggregations();
await this.save();
}
/**
* Mark multiple agents as rolled back
* @param {string[]} agentNames - Array of agent names
* @returns {Promise<void>}
*/
async markMultipleRolledBack(agentNames) {
for (const agentName of agentNames) {
if (this.data.metrics.agents[agentName]) {
const agent = this.data.metrics.agents[agentName];
agent.status = 'rolled-back';
agent.rolled_back_at = formatTimestamp();
}
}
this.recalculateAggregations();
await this.save();
}
/**
* Update session status
* @param {string} status - New status (in-progress, completed, failed)
* @returns {Promise<void>}
*/
async updateSessionStatus(status) {
this.data.session.status = status;
if (status === 'completed' || status === 'failed') {
this.data.session.completedAt = formatTimestamp();
}
await this.save();
}
/**
* Recalculate aggregations (total duration, total cost, phases)
* @private
*/
recalculateAggregations() {
const agents = this.data.metrics.agents;
// Only count successful agents (not rolled-back or failed)
const successfulAgents = Object.entries(agents)
.filter(([_, data]) => data.status === 'success');
// Calculate total duration and cost
const totalDuration = successfulAgents.reduce(
(sum, [_, data]) => sum + data.final_duration_ms,
0
);
const totalCost = successfulAgents.reduce(
(sum, [_, data]) => sum + data.total_cost_usd,
0
);
this.data.metrics.total_duration_ms = totalDuration;
this.data.metrics.total_cost_usd = totalCost;
// Calculate phase-level metrics
this.data.metrics.phases = this.calculatePhaseMetrics(successfulAgents);
}
/**
* Calculate phase-level metrics
* @private
* @param {Array} successfulAgents - Array of [agentName, agentData] tuples
* @returns {Object} Phase metrics
*/
calculatePhaseMetrics(successfulAgents) {
const phases = {
'pre-recon': [],
'recon': [],
'vulnerability-analysis': [],
'exploitation': [],
'reporting': []
};
// Map agents to phases
const agentPhaseMap = {
'pre-recon': 'pre-recon',
'recon': 'recon',
'injection-vuln': 'vulnerability-analysis',
'xss-vuln': 'vulnerability-analysis',
'auth-vuln': 'vulnerability-analysis',
'authz-vuln': 'vulnerability-analysis',
'ssrf-vuln': 'vulnerability-analysis',
'injection-exploit': 'exploitation',
'xss-exploit': 'exploitation',
'auth-exploit': 'exploitation',
'authz-exploit': 'exploitation',
'ssrf-exploit': 'exploitation',
'report': 'reporting'
};
// Group agents by phase
for (const [agentName, agentData] of successfulAgents) {
const phase = agentPhaseMap[agentName];
if (phase) {
phases[phase].push(agentData);
}
}
// Calculate metrics per phase
const phaseMetrics = {};
const totalDuration = this.data.metrics.total_duration_ms;
for (const [phaseName, agentList] of Object.entries(phases)) {
if (agentList.length === 0) continue;
const phaseDuration = agentList.reduce(
(sum, agent) => sum + agent.final_duration_ms,
0
);
const phaseCost = agentList.reduce(
(sum, agent) => sum + agent.total_cost_usd,
0
);
phaseMetrics[phaseName] = {
duration_ms: phaseDuration,
duration_percentage: calculatePercentage(phaseDuration, totalDuration),
cost_usd: phaseCost,
agent_count: agentList.length
};
}
return phaseMetrics;
}
/**
* Get current metrics
* @returns {Object} Current metrics data
*/
getMetrics() {
return JSON.parse(JSON.stringify(this.data));
}
/**
* Save metrics to session.json (atomic write)
* @private
* @returns {Promise<void>}
*/
async save() {
await atomicWrite(this.sessionJsonPath, this.data);
}
/**
* Reload metrics from disk
* @returns {Promise<void>}
*/
async reload() {
this.data = await readJson(this.sessionJsonPath);
}
}
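The accounting rules in `endAgent` and `recalculateAggregations` are easiest to check with a small worked example. The sketch below is a simplified, in-memory reduction of those two methods with no disk I/O and no phase grouping. The function names `recordAttempt` and `sessionTotals` and the sample numbers are assumptions made for illustration.

```javascript
// Simplified in-memory version of MetricsTracker.endAgent (illustrative only).
function recordAttempt(agents, agentName, result) {
  if (!agents[agentName]) {
    agents[agentName] = { status: 'in-progress', attempts: [], final_duration_ms: 0, total_cost_usd: 0 };
  }
  const agent = agents[agentName];
  agent.attempts.push({
    attempt_number: result.attemptNumber,
    duration_ms: result.duration_ms,
    cost_usd: result.cost_usd,
    success: result.success,
  });
  // Failed attempts are paid for too, so cost sums every attempt
  agent.total_cost_usd = agent.attempts.reduce((sum, a) => sum + a.cost_usd, 0);
  if (result.success) {
    agent.status = 'success';
    agent.final_duration_ms = result.duration_ms; // only the successful attempt's duration
  } else if (result.isFinalAttempt) {
    agent.status = 'failed';
  }
  return agent;
}

// Simplified version of recalculateAggregations: only 'success' agents count
function sessionTotals(agents) {
  const successful = Object.values(agents).filter((a) => a.status === 'success');
  return {
    total_duration_ms: successful.reduce((sum, a) => sum + a.final_duration_ms, 0),
    total_cost_usd: successful.reduce((sum, a) => sum + a.total_cost_usd, 0),
  };
}

const agents = {};
recordAttempt(agents, 'xss-vuln', { attemptNumber: 1, duration_ms: 4000, cost_usd: 0.25, success: false });
recordAttempt(agents, 'xss-vuln', { attemptNumber: 2, duration_ms: 6000, cost_usd: 0.5, success: true });
recordAttempt(agents, 'ssrf-vuln', { attemptNumber: 3, duration_ms: 1000, cost_usd: 1, success: false, isFinalAttempt: true });

const totals = sessionTotals(agents);
console.log(totals); // { total_duration_ms: 6000, total_cost_usd: 0.75 }
```

Two consequences follow from the rules as written in the diff: an agent's cost includes its failed retries, while its duration reflects only the successful attempt, and an agent that ends in `failed` or `rolled-back` contributes nothing to the session totals even though its attempts remain recorded.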
+199
@@ -0,0 +1,199 @@
/**
* Audit System Utilities
*
* Core utility functions for path generation, atomic writes, and formatting.
* Path and formatting helpers are pure; file operations are idempotent or atomic.
*/
import fs from 'fs/promises';
import path from 'path';
import { fileURLToPath } from 'url';
const __filename = fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);
// Get Shannon repository root
export const SHANNON_ROOT = path.resolve(__dirname, '..', '..');
export const AUDIT_LOGS_DIR = path.join(SHANNON_ROOT, 'audit-logs');
/**
* Generate standardized session identifier: {hostname}_{sessionId}
* @param {Object} sessionMetadata - Session metadata from Shannon store
* @param {string} sessionMetadata.id - UUID session ID
* @param {string} sessionMetadata.webUrl - Target web URL
* @returns {string} Formatted session identifier
*/
export function generateSessionIdentifier(sessionMetadata) {
const { id, webUrl } = sessionMetadata;
const hostname = new URL(webUrl).hostname.replace(/[^a-zA-Z0-9-]/g, '-');
return `${hostname}_${id}`;
}
/**
* Generate path to audit log directory for a session
* @param {Object} sessionMetadata - Session metadata
* @returns {string} Absolute path to session audit directory
*/
export function generateAuditPath(sessionMetadata) {
const sessionIdentifier = generateSessionIdentifier(sessionMetadata);
return path.join(AUDIT_LOGS_DIR, sessionIdentifier);
}
/**
* Generate path to agent log file
* @param {Object} sessionMetadata - Session metadata
* @param {string} agentName - Name of the agent
* @param {number} timestamp - Timestamp (ms since epoch)
* @param {number} attemptNumber - Attempt number (1, 2, 3, ...)
* @returns {string} Absolute path to agent log file
*/
export function generateLogPath(sessionMetadata, agentName, timestamp, attemptNumber) {
const auditPath = generateAuditPath(sessionMetadata);
const filename = `${timestamp}_${agentName}_attempt-${attemptNumber}.log`;
return path.join(auditPath, 'agents', filename);
}
/**
* Generate path to prompt snapshot file
* @param {Object} sessionMetadata - Session metadata
* @param {string} agentName - Name of the agent
* @returns {string} Absolute path to prompt file
*/
export function generatePromptPath(sessionMetadata, agentName) {
const auditPath = generateAuditPath(sessionMetadata);
return path.join(auditPath, 'prompts', `${agentName}.md`);
}
/**
* Generate path to session.json file
* @param {Object} sessionMetadata - Session metadata
* @returns {string} Absolute path to session.json
*/
export function generateSessionJsonPath(sessionMetadata) {
const auditPath = generateAuditPath(sessionMetadata);
return path.join(auditPath, 'session.json');
}
/**
* Ensure directory exists (idempotent, race-safe)
* @param {string} dirPath - Directory path to create
* @returns {Promise<void>}
*/
export async function ensureDirectory(dirPath) {
try {
await fs.mkdir(dirPath, { recursive: true });
} catch (error) {
// Ignore EEXIST errors (race condition safe)
if (error.code !== 'EEXIST') {
throw error;
}
}
}
/**
* Atomic write using temp file + rename pattern
* Guarantees no partial writes or corruption on crash
* @param {string} filePath - Target file path
* @param {Object|string} data - Data to write (will be JSON.stringified if object)
* @returns {Promise<void>}
*/
export async function atomicWrite(filePath, data) {
const tempPath = `${filePath}.tmp`;
const content = typeof data === 'string' ? data : JSON.stringify(data, null, 2);
try {
// Write to temp file
await fs.writeFile(tempPath, content, 'utf8');
// Atomic rename (POSIX guarantee: atomic on same filesystem)
await fs.rename(tempPath, filePath);
} catch (error) {
// Clean up temp file on failure
try {
await fs.unlink(tempPath);
} catch (cleanupError) {
// Ignore cleanup errors
}
throw error;
}
}
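The function above writes to a fixed `${filePath}.tmp` name, so two callers writing the same target at the same time would share one temp file. A possible variation, assuming concurrent writers are a concern, gives each call its own temp name. `atomicWriteUnique` and the naming scheme are hypothetical and not part of this commit.

```javascript
// Sketch only: temp file + rename, with a per-call temp name.
// `atomicWriteUnique` is a hypothetical name, not a function from this diff.
let tempCounter = 0;

async function atomicWriteUnique(filePath, data) {
  const fs = await import('node:fs/promises');
  const content = typeof data === 'string' ? data : JSON.stringify(data, null, 2);
  // pid + counter keeps temp names distinct across processes and within one process.
  const tempPath = `${filePath}.${process.pid}.${tempCounter++}.tmp`;
  try {
    await fs.writeFile(tempPath, content, 'utf8');
    // rename replaces the target in a single step when both paths are on the same filesystem.
    await fs.rename(tempPath, filePath);
  } catch (error) {
    // Best-effort cleanup; `force` ignores a temp file that was never created.
    await fs.rm(tempPath, { force: true });
    throw error;
  }
}
```

With this variant the last rename wins, but readers still only ever see one complete version of the file.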
/**
* Format duration in milliseconds to human-readable string
* @param {number} ms - Duration in milliseconds
* @returns {string} Formatted duration (e.g., "2m 34s", "45.0s", "1.2s", "850ms")
*/
export function formatDuration(ms) {
if (ms < 1000) {
return `${ms}ms`;
}
const seconds = ms / 1000;
if (seconds < 60) {
return `${seconds.toFixed(1)}s`;
}
const minutes = Math.floor(seconds / 60);
const remainingSeconds = Math.floor(seconds % 60);
return `${minutes}m ${remainingSeconds}s`;
}
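To make the three output shapes concrete, the snippet below copies the formatting logic so it runs on its own and prints a few sample values. The inputs are arbitrary. Note that, unlike the inline helper removed elsewhere in this diff, this version has no hours component.

```javascript
// Standalone copy of the formatting logic, reproduced only for illustration.
function formatDuration(ms) {
  if (ms < 1000) return `${ms}ms`;
  const seconds = ms / 1000;
  if (seconds < 60) return `${seconds.toFixed(1)}s`;
  const minutes = Math.floor(seconds / 60);
  const remainingSeconds = Math.floor(seconds % 60);
  return `${minutes}m ${remainingSeconds}s`;
}

console.log(formatDuration(850));     // 850ms
console.log(formatDuration(1200));    // 1.2s
console.log(formatDuration(45000));   // 45.0s
console.log(formatDuration(154000));  // 2m 34s
console.log(formatDuration(3700000)); // 61m 40s (minutes keep counting past 60)
```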
/**
* Format timestamp to ISO 8601 string
* @param {number} [timestamp] - Unix timestamp in ms (defaults to now)
* @returns {string} ISO 8601 formatted string
*/
export function formatTimestamp(timestamp = Date.now()) {
return new Date(timestamp).toISOString();
}
/**
* Calculate percentage
* @param {number} part - Part value
* @param {number} total - Total value
* @returns {number} Percentage (0-100)
*/
export function calculatePercentage(part, total) {
if (total === 0) return 0;
return (part / total) * 100;
}
/**
* Read and parse JSON file
* @param {string} filePath - Path to JSON file
* @returns {Promise<Object>} Parsed JSON data
*/
export async function readJson(filePath) {
const content = await fs.readFile(filePath, 'utf8');
return JSON.parse(content);
}
/**
* Check if file exists
* @param {string} filePath - Path to check
* @returns {Promise<boolean>} True if file exists
*/
export async function fileExists(filePath) {
try {
await fs.access(filePath);
return true;
} catch {
return false;
}
}
/**
* Initialize audit directory structure for a session
* Creates: audit-logs/{hostname}_{sessionId}/ plus its agents/ and prompts/ subdirectories
* @param {Object} sessionMetadata - Session metadata
* @returns {Promise<void>}
*/
export async function initializeAuditStructure(sessionMetadata) {
const auditPath = generateAuditPath(sessionMetadata);
const agentsPath = path.join(auditPath, 'agents');
const promptsPath = path.join(auditPath, 'prompts');
await ensureDirectory(auditPath);
await ensureDirectory(agentsPath);
await ensureDirectory(promptsPath);
}
+64 -50
@@ -3,6 +3,7 @@ import chalk from 'chalk';
import { PentestError } from './error-handling.js';
import { parseConfig, distributeConfig } from './config-parser.js';
import { executeGitCommandWithRetry } from './utils/git-manager.js';
import { formatDuration } from './audit/utils.js';
import {
AGENTS,
PHASES,
@@ -76,10 +77,10 @@ const rollbackGitToCommit = async (targetRepo, commitHash) => {
};
// Run a single agent with retry logic and checkpointing
export const runSingleAgent = async (agentName, session, pipelineTestingMode, runClaudePromptWithRetry, loadPrompt, allowRerun = false, skipWorkspaceClean = false) => {
const runSingleAgent = async (agentName, session, pipelineTestingMode, runClaudePromptWithRetry, loadPrompt, allowRerun = false, skipWorkspaceClean = false) => {
// Validate agent first
const agent = validateAgent(agentName);
console.log(chalk.cyan(`\n🤖 Running agent: ${agent.displayName}`));
// Reload session to get latest state (important for agent ranges)
@@ -191,7 +192,7 @@ export const runSingleAgent = async (agentName, session, pipelineTestingMode, ru
AGENTS[agentName].displayName,
agentName, // Pass agent name for snapshot creation
getAgentColor(agentName), // Pass color function for this agent
{ webUrl: session.webUrl, sessionId: session.id } // Session metadata for logging
{ id: session.id, webUrl: session.webUrl, repoPath: session.repoPath } // Session metadata for audit logging
);
if (!result.success) {
@@ -218,12 +219,12 @@ export const runSingleAgent = async (agentName, session, pipelineTestingMode, ru
const validation = await safeValidateQueueAndDeliverable(vulnType, targetRepo);
if (validation.success) {
// Log validation result (don't store - will be re-validated during exploitation phase)
console.log(chalk.blue(`📋 Validation: ${validation.data.shouldExploit ? `Ready for exploitation (${validation.data.vulnerabilityCount} vulnerabilities)` : 'No vulnerabilities found'}`));
validationData = {
shouldExploit: validation.data.shouldExploit,
vulnerabilityCount: validation.data.vulnerabilityCount,
validatedAt: new Date().toISOString()
vulnerabilityCount: validation.data.vulnerabilityCount
};
console.log(chalk.blue(`📋 Validation: ${validationData.shouldExploit ? `Ready for exploitation (${validationData.vulnerabilityCount} vulnerabilities)` : 'No vulnerabilities found'}`));
} else {
console.log(chalk.yellow(`⚠️ Validation failed: ${validation.error.message}`));
}
@@ -232,8 +233,8 @@ export const runSingleAgent = async (agentName, session, pipelineTestingMode, ru
}
}
// Mark agent as completed
await markAgentCompleted(session.id, agentName, commitHash, timingData, costData, validationData);
// Mark agent as completed (validation not stored - will be re-checked during exploitation)
await markAgentCompleted(session.id, agentName, commitHash);
// Only show completion message for sequential execution
if (!skipWorkspaceClean) {
@@ -299,7 +300,7 @@ export const runSingleAgent = async (agentName, session, pipelineTestingMode, ru
};
// Run multiple agents in sequence
export const runAgentRange = async (startAgent, endAgent, session, pipelineTestingMode, runClaudePromptWithRetry, loadPrompt) => {
const runAgentRange = async (startAgent, endAgent, session, pipelineTestingMode, runClaudePromptWithRetry, loadPrompt) => {
const agents = validateAgentRange(startAgent, endAgent);
console.log(chalk.cyan(`\n🔄 Running agent range: ${startAgent} to ${endAgent} (${agents.length} agents)`));
@@ -323,7 +324,7 @@ export const runAgentRange = async (startAgent, endAgent, session, pipelineTesti
};
// Run vulnerability agents in parallel
export const runParallelVuln = async (session, pipelineTestingMode, runClaudePromptWithRetry, loadPrompt) => {
const runParallelVuln = async (session, pipelineTestingMode, runClaudePromptWithRetry, loadPrompt) => {
const vulnAgents = ['injection-vuln', 'xss-vuln', 'auth-vuln', 'ssrf-vuln', 'authz-vuln'];
const activeAgents = vulnAgents.filter(agent => !session.completedAgents.includes(agent));
@@ -421,7 +422,7 @@ export const runParallelVuln = async (session, pipelineTestingMode, runClaudePro
};
// Run exploitation agents in parallel
export const runParallelExploit = async (session, pipelineTestingMode, runClaudePromptWithRetry, loadPrompt) => {
const runParallelExploit = async (session, pipelineTestingMode, runClaudePromptWithRetry, loadPrompt) => {
const exploitAgents = ['injection-exploit', 'xss-exploit', 'auth-exploit', 'ssrf-exploit', 'authz-exploit'];
// Get fresh session data to ensure we have the latest vulnerability analysis results
@@ -429,25 +430,36 @@ export const runParallelExploit = async (session, pipelineTestingMode, runClaude
const { getSession } = await import('./session-manager.js');
const freshSession = await getSession(session.id);
// Load validation module
const { safeValidateQueueAndDeliverable } = await import('./queue-validation.js');
// Only run exploit agents whose vuln counterparts completed successfully AND found vulnerabilities
const eligibleAgents = exploitAgents.filter(agentName => {
const vulnAgentName = agentName.replace('-exploit', '-vuln');
const eligibilityChecks = await Promise.all(
exploitAgents.map(async (agentName) => {
const vulnAgentName = agentName.replace('-exploit', '-vuln');
// Must have completed the vulnerability analysis
if (!freshSession.completedAgents.includes(vulnAgentName)) {
return false;
}
// Must have completed the vulnerability analysis
if (!freshSession.completedAgents.includes(vulnAgentName)) {
return { agentName, eligible: false };
}
// Must have found vulnerabilities to exploit
const validationResult = freshSession.validationResults?.[vulnAgentName];
if (!validationResult || !validationResult.shouldExploit) {
console.log(chalk.gray(`⏭️ Skipping ${agentName} (no vulnerabilities found in ${vulnAgentName})`));
return false;
}
// Check if vulnerabilities were found by validating the queue file
const vulnType = vulnAgentName.replace('-vuln', ''); // "injection-vuln" -> "injection"
const validation = await safeValidateQueueAndDeliverable(vulnType, freshSession.targetRepo);
console.log(chalk.blue(`${agentName} eligible (${validationResult.vulnerabilityCount} vulnerabilities from ${vulnAgentName})`));
return true;
});
if (!validation.success || !validation.data.shouldExploit) {
console.log(chalk.gray(`⏭️ Skipping ${agentName} (no vulnerabilities found in ${vulnAgentName})`));
return { agentName, eligible: false };
}
console.log(chalk.blue(`${agentName} eligible (${validation.data.vulnerabilityCount} vulnerabilities from ${vulnAgentName})`));
return { agentName, eligible: true };
})
);
const eligibleAgents = eligibilityChecks
.filter(check => check.eligible)
.map(check => check.agentName);
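The change above replaces a synchronous `filter` with a map-then-filter step because `Array.prototype.filter` cannot await its predicate. A minimal sketch of that pattern follows; `selectEligible` and `hasFindings` are hypothetical names standing in for the real queue validation.

```javascript
// Sketch of the "async filter" pattern: resolve every check first,
// then filter on the settled results.
// `hasFindings` is a hypothetical async predicate, not a function from this diff.
async function selectEligible(names, hasFindings) {
  const checks = await Promise.all(
    names.map(async (name) => ({ name, eligible: await hasFindings(name) }))
  );
  // Promise.all preserves input order, so the result keeps the original ordering.
  return checks.filter((check) => check.eligible).map((check) => check.name);
}
```

All checks run concurrently, and a single rejected check rejects the whole call, which is why the code in the diff relies on a `safe*` validator that returns a result object instead of throwing.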
const activeAgents = eligibleAgents.filter(agent => !freshSession.completedAgents.includes(agent));
@@ -616,13 +628,35 @@ export const rollbackTo = async (targetAgent, session) => {
}
const commitHash = session.checkpoints[targetAgent];
// Rollback git workspace
await rollbackGitToCommit(session.targetRepo, commitHash);
// Update session state
// Update session state (removes agents from completedAgents)
await rollbackToAgent(session.id, targetAgent);
// Mark rolled-back agents in audit system (for forensic trail)
try {
const { AuditSession } = await import('./audit/index.js');
const auditSession = new AuditSession(session);
await auditSession.initialize();
// Find agents that were rolled back (agents after targetAgent)
const targetOrder = AGENTS[targetAgent].order;
const rolledBackAgents = Object.values(AGENTS)
.filter(agent => agent.order > targetOrder)
.map(agent => agent.name);
// Mark them as rolled-back in audit system
if (rolledBackAgents.length > 0) {
await auditSession.markMultipleRolledBack(rolledBackAgents);
console.log(chalk.gray(` Marked ${rolledBackAgents.length} agents as rolled-back in audit logs`));
}
} catch (error) {
// Non-critical: rollback succeeded even if audit update failed
console.log(chalk.yellow(` ⚠️ Failed to update audit logs: ${error.message}`));
}
console.log(chalk.green(`✅ Successfully rolled back to agent '${targetAgent}'`));
};
@@ -867,23 +901,3 @@ const getTimeAgo = (timestamp) => {
}
};
// Helper function to format duration in milliseconds to human readable format
const formatDuration = (durationMs) => {
if (durationMs < 1000) {
return `${durationMs}ms`;
}
const seconds = Math.floor(durationMs / 1000);
const minutes = Math.floor(seconds / 60);
const hours = Math.floor(minutes / 60);
if (hours > 0) {
return `${hours}h ${minutes % 60}m ${seconds % 60}s`;
} else if (minutes > 0) {
return `${minutes}m ${seconds % 60}s`;
} else {
return `${seconds}s`;
}
};
+36 -35
@@ -1,13 +1,13 @@
import chalk from 'chalk';
import {
selectSession, deleteSession, deleteAllSessions,
validateAgent, validatePhase
validateAgent, validatePhase, reconcileSession
} from '../session-manager.js';
import {
runPhase, runAll, rollbackTo, rerunAgent, displayStatus, listAgents
} from '../checkpoint-manager.js';
import { logError, PentestError } from '../error-handling.js';
import { cleanupMCP } from '../setup/environment.js';
import { promptConfirmation } from './prompts.js';
// Developer command handlers
export async function handleDeveloperCommand(command, args, pipelineTestingMode, runClaudePromptWithRetry, loadPrompt) {
@@ -27,41 +27,19 @@ export async function handleDeveloperCommand(command, args, pipelineTestingMode,
const sessionId = args[0];
const deletedSession = await deleteSession(sessionId);
console.log(chalk.green(`✅ Deleted session ${sessionId} (${new URL(deletedSession.webUrl).hostname})`));
// Clean up MCP agents when deleting specific session
await cleanupMCP();
} else {
// Cleanup all sessions - require confirmation
console.log(chalk.yellow('⚠️ This will delete all pentest sessions. Are you sure? (y/N):'));
const { createInterface } = await import('readline');
const readline = createInterface({
input: process.stdin,
output: process.stdout
});
await new Promise((resolve) => {
readline.question('', (answer) => {
readline.close();
if (answer.toLowerCase() === 'y' || answer.toLowerCase() === 'yes') {
deleteAllSessions().then(deleted => {
if (deleted) {
console.log(chalk.green('✅ All sessions deleted'));
} else {
console.log(chalk.yellow('⚠️ No sessions found to delete'));
}
// Clean up MCP agents after deleting sessions
return cleanupMCP();
}).then(() => {
resolve();
}).catch(error => {
console.log(chalk.red(`❌ Failed to delete sessions: ${error.message}`));
resolve();
});
} else {
console.log(chalk.gray('Cleanup cancelled'));
resolve();
}
});
});
const confirmed = await promptConfirmation(chalk.yellow('⚠️ This will delete all pentest sessions. Are you sure? (y/N):'));
if (confirmed) {
const deleted = await deleteAllSessions();
if (deleted) {
console.log(chalk.green('✅ All sessions deleted'));
} else {
console.log(chalk.yellow('⚠️ No sessions found to delete'));
}
} else {
console.log(chalk.gray('Cleanup cancelled'));
}
}
return;
}
@@ -94,6 +72,29 @@ export async function handleDeveloperCommand(command, args, pipelineTestingMode,
process.exit(1);
}
// Self-healing: Reconcile session with audit logs before executing command
// This ensures Shannon store is consistent with audit data, even after crash recovery
try {
const reconcileReport = await reconcileSession(session.id);
if (reconcileReport.promotions.length > 0) {
console.log(chalk.blue(`🔄 Reconciled: Added ${reconcileReport.promotions.length} completed agents from audit logs`));
}
if (reconcileReport.demotions.length > 0) {
console.log(chalk.yellow(`🔄 Reconciled: Removed ${reconcileReport.demotions.length} rolled-back agents`));
}
if (reconcileReport.failures.length > 0) {
console.log(chalk.yellow(`🔄 Reconciled: Marked ${reconcileReport.failures.length} failed agents`));
}
// Reload session after reconciliation to get fresh state
const { getSession } = await import('../session-manager.js');
session = await getSession(session.id);
} catch (error) {
// Reconciliation failure is non-critical, but log warning
console.log(chalk.yellow(`⚠️ Failed to reconcile session with audit logs: ${error.message}`));
}
switch (command) {
case '--run-phase':
+62
@@ -0,0 +1,62 @@
import { createInterface } from 'readline';
import { PentestError } from '../error-handling.js';
/**
* Prompt user for yes/no confirmation
* @param {string} message - Question to display
* @returns {Promise<boolean>} true if confirmed, false otherwise
*/
export async function promptConfirmation(message) {
const readline = createInterface({
input: process.stdin,
output: process.stdout
});
return new Promise((resolve) => {
readline.question(message + ' ', (answer) => {
readline.close();
const confirmed = answer.toLowerCase() === 'y' || answer.toLowerCase() === 'yes';
resolve(confirmed);
});
});
}
/**
* Prompt user to select from numbered list
* @param {string} message - Selection prompt
* @param {Array} items - Items to choose from
* @returns {Promise<any>} Selected item
* @throws {PentestError} If invalid selection
*/
export async function promptSelection(message, items) {
if (!items || items.length === 0) {
throw new PentestError(
'No items available for selection',
'validation',
false
);
}
const readline = createInterface({
input: process.stdin,
output: process.stdout
});
return new Promise((resolve, reject) => {
readline.question(message + ' ', (answer) => {
readline.close();
const choice = parseInt(answer);
if (isNaN(choice) || choice < 1 || choice > items.length) {
reject(new PentestError(
`Invalid selection. Please enter a number between 1 and ${items.length}`,
'validation',
false,
{ choice: answer }
));
} else {
resolve(items[choice - 1]);
}
});
});
}
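The range check in `promptSelection` is tied to `readline`, which makes it awkward to exercise without a terminal. One possible way to isolate it, shown here as a sketch, is a small helper that only interprets the answer string; `parseChoice` is a hypothetical name and not part of this commit.

```javascript
// Hypothetical helper: validates a 1-based menu choice without touching stdin.
// Returns a zero-based index, or null when the answer is not a valid choice.
function parseChoice(answer, itemCount) {
  const choice = Number.parseInt(answer.trim(), 10); // explicit radix avoids hex parsing of "0x..."
  if (Number.isNaN(choice) || choice < 1 || choice > itemCount) {
    return null;
  }
  return choice - 1;
}
```

A caller could then turn `null` into the same validation error the diff raises and use the returned index to pick the item.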
-2
@@ -21,7 +21,6 @@ export function showHelp() {
console.log(chalk.yellow.bold('OPTIONS:'));
console.log(' --config <file> YAML configuration file for authentication and testing parameters');
console.log(' --log [file] Capture all output to log file (default: shannon-<timestamp>.log)');
console.log(' --pipeline-testing Use minimal prompts for fast pipeline testing (creates minimal deliverables)\n');
console.log(chalk.yellow.bold('DEVELOPER COMMANDS:'));
@@ -37,7 +36,6 @@ export function showHelp() {
console.log(' # Normal mode - create new session');
console.log(' ./shannon.mjs "https://example.com" "/path/to/local/repo"');
console.log(' ./shannon.mjs "https://example.com" "/path/to/local/repo" --config auth.yaml');
console.log(' ./shannon.mjs "https://example.com" "/path/to/local/repo" --log pentest.log');
console.log(' ./shannon.mjs "https://example.com" "/path/to/local/repo" --setup-only # Setup only\n');
console.log(' # Developer mode - operate on existing session');
+31 -73
@@ -2,6 +2,27 @@ import { path, fs } from 'zx';
import chalk from 'chalk';
import { validateQueueAndDeliverable } from './queue-validation.js';
// Factory function for vulnerability queue validators
function createVulnValidator(vulnType) {
return async (sourceDir) => {
try {
await validateQueueAndDeliverable(vulnType, sourceDir);
return true;
} catch (error) {
console.log(chalk.yellow(` Queue validation failed for ${vulnType}: ${error.message}`));
return false;
}
};
}
// Factory function for exploit deliverable validators
function createExploitValidator(vulnType) {
return async (sourceDir) => {
const evidenceFile = path.join(sourceDir, 'deliverables', `${vulnType}_exploitation_evidence.md`);
return await fs.pathExists(evidenceFile);
};
}
// MCP agent mapping - assigns each agent to a specific Playwright instance to prevent conflicts
export const MCP_AGENT_MAPPING = Object.freeze({
// Phase 1: Pre-reconnaissance (actual prompt name is 'pre-recon-code')
@@ -47,81 +68,18 @@ export const AGENT_VALIDATORS = Object.freeze({
},
// Vulnerability analysis agents
'injection-vuln': async (sourceDir) => {
try {
await validateQueueAndDeliverable('injection', sourceDir);
return true;
} catch (error) {
console.log(chalk.yellow(` Queue validation failed for injection: ${error.message}`));
return false;
}
},
'xss-vuln': async (sourceDir) => {
try {
await validateQueueAndDeliverable('xss', sourceDir);
return true;
} catch (error) {
console.log(chalk.yellow(` Queue validation failed for xss: ${error.message}`));
return false;
}
},
'auth-vuln': async (sourceDir) => {
try {
await validateQueueAndDeliverable('auth', sourceDir);
return true;
} catch (error) {
console.log(chalk.yellow(` Queue validation failed for auth: ${error.message}`));
return false;
}
},
'ssrf-vuln': async (sourceDir) => {
try {
await validateQueueAndDeliverable('ssrf', sourceDir);
return true;
} catch (error) {
console.log(chalk.yellow(` Queue validation failed for ssrf: ${error.message}`));
return false;
}
},
'authz-vuln': async (sourceDir) => {
try {
await validateQueueAndDeliverable('authz', sourceDir);
return true;
} catch (error) {
console.log(chalk.yellow(` Queue validation failed for authz: ${error.message}`));
return false;
}
},
'injection-vuln': createVulnValidator('injection'),
'xss-vuln': createVulnValidator('xss'),
'auth-vuln': createVulnValidator('auth'),
'ssrf-vuln': createVulnValidator('ssrf'),
'authz-vuln': createVulnValidator('authz'),
// Exploitation agents
'injection-exploit': async (sourceDir) => {
const evidenceFile = path.join(sourceDir, 'deliverables', 'injection_exploitation_evidence.md');
return await fs.pathExists(evidenceFile);
},
'xss-exploit': async (sourceDir) => {
const evidenceFile = path.join(sourceDir, 'deliverables', 'xss_exploitation_evidence.md');
return await fs.pathExists(evidenceFile);
},
'auth-exploit': async (sourceDir) => {
const evidenceFile = path.join(sourceDir, 'deliverables', 'auth_exploitation_evidence.md');
return await fs.pathExists(evidenceFile);
},
'ssrf-exploit': async (sourceDir) => {
const evidenceFile = path.join(sourceDir, 'deliverables', 'ssrf_exploitation_evidence.md');
return await fs.pathExists(evidenceFile);
},
'authz-exploit': async (sourceDir) => {
const evidenceFile = path.join(sourceDir, 'deliverables', 'authz_exploitation_evidence.md');
return await fs.pathExists(evidenceFile);
},
'injection-exploit': createExploitValidator('injection'),
'xss-exploit': createExploitValidator('xss'),
'auth-exploit': createExploitValidator('auth'),
'ssrf-exploit': createExploitValidator('ssrf'),
'authz-exploit': createExploitValidator('authz'),
// Executive report agent
'report': async (sourceDir) => {
-30
@@ -51,18 +51,6 @@ export const logError = async (error, contextMsg, sourceDir = null) => {
return logEntry;
};
// Handle configuration parsing errors
const handleConfigError = (error, configPath) => {
const configError = new PentestError(
`Configuration error in ${configPath}: ${error.message}. Check your config.yaml file format and try again.`,
'config',
false,
{ configPath, originalError: error.message }
);
throw configError;
};
// Handle tool execution errors
export const handleToolError = (toolName, error) => {
const isRetryable = error.code === 'ECONNRESET' || error.code === 'ETIMEDOUT' || error.code === 'ENOTFOUND';
@@ -167,22 +155,4 @@ export const getRetryDelay = (error, attempt) => {
const baseDelay = Math.pow(2, attempt) * 1000; // 2s, 4s, 8s
const jitter = Math.random() * 1000; // 0-1s random
return Math.min(baseDelay + jitter, 30000); // Max 30s
};
// General error handler with context
const handleError = (error, context, isFatal = false) => {
const pentestError = error instanceof PentestError
? error
: new PentestError(error.message, 'unknown', false, { context, originalError: error.message });
if (isFatal) {
pentestError.type = 'fatal';
throw pentestError;
}
return {
success: false,
error: pentestError,
continuable: !isFatal
};
};
+4 -3
@@ -1,6 +1,7 @@
import { $, fs, path } from 'zx';
import chalk from 'chalk';
import { Timer, timingResults, formatDuration } from '../utils/metrics.js';
import { Timer, timingResults } from '../utils/metrics.js';
import { formatDuration } from '../audit/utils.js';
import { handleToolError, PentestError } from '../error-handling.js';
import { AGENTS } from '../session-manager.js';
import { runClaudePromptWithRetry } from '../ai/claude-executor.js';
@@ -99,7 +100,7 @@ async function runPreReconWave1(webUrl, sourceDir, variables, config, pipelineTe
AGENTS['pre-recon'].displayName,
'pre-recon', // Agent name for snapshot creation
chalk.cyan,
{ webUrl, sessionId } // Session metadata for logging
{ id: sessionId, webUrl } // Session metadata for audit logging (STANDARD: use 'id' field)
)
);
const [codeAnalysis] = await Promise.all(operations);
@@ -123,7 +124,7 @@ async function runPreReconWave1(webUrl, sourceDir, variables, config, pipelineTe
AGENTS['pre-recon'].displayName,
'pre-recon', // Agent name for snapshot creation
chalk.cyan,
{ webUrl, sessionId } // Session metadata for logging
{ id: sessionId, webUrl } // Session metadata for audit logging (STANDARD: use 'id' field)
)
);
}
-4
@@ -22,10 +22,6 @@ export class ProgressIndicator {
}, 100);
}
updateMessage(newMessage) {
this.message = newMessage;
}
stop() {
if (!this.isRunning) return;
+27 -34
@@ -7,7 +7,7 @@ import { MCP_AGENT_MAPPING } from '../constants.js';
async function buildLoginInstructions(authentication) {
try {
// Load the login instructions template
const loginInstructionsPath = path.join(import.meta.dirname, '..', '..', 'login_resources', 'login_instructions.txt');
const loginInstructionsPath = path.join(import.meta.dirname, '..', '..', 'prompts', 'shared', 'login-instructions.txt');
if (!await fs.pathExists(loginInstructionsPath)) {
throw new PentestError(
@@ -84,6 +84,27 @@ async function buildLoginInstructions(authentication) {
}
}
// Process @include() directives (reads the included files from disk)
async function processIncludes(content, baseDir) {
const includeRegex = /@include\(([^)]+)\)/g;
// Use a Promise.all to handle all includes concurrently
const replacements = await Promise.all(
Array.from(content.matchAll(includeRegex)).map(async (match) => {
const includePath = path.join(baseDir, match[1]);
const sharedContent = await fs.readFile(includePath, 'utf8');
return {
placeholder: match[0],
content: sharedContent,
};
})
);
for (const replacement of replacements) {
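// Replacer function inserts the file content verbatim ("$&"-style patterns are not expanded)
content = content.replace(replacement.placeholder, () => replacement.content);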
}
return content;
}
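When file content is spliced into a template with `String.prototype.replace`, it is worth remembering that a replacement given as a string is scanned for special patterns such as `$&`, even when the search value is a plain string. The sketch below contrasts a string replacement with a replacer function; the template and include content are invented for the example.

```javascript
// Made-up template and include content; the include text contains "$&".
const template = 'Intro\n@include(shared/login.txt)\nOutro';
const shared = 'Price is $& per seat';

// A string replacement expands "$&" to the matched text ...
const viaString = template.replace('@include(shared/login.txt)', shared);
// ... while a replacer function inserts its return value verbatim.
const viaFunction = template.replace('@include(shared/login.txt)', () => shared);

console.log(viaString);   // the "$&" has been replaced by the @include(...) directive
console.log(viaFunction); // the "$&" is preserved as written
```

Prompt fragments that mention shell variables or regex examples are the kind of content where this difference could matter.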
// Pure function: Variable interpolation
async function interpolateVariables(template, variables, config = null) {
try {
@@ -198,7 +219,11 @@ export async function loadPrompt(promptName, variables, config = null, pipelineT
console.log(chalk.yellow(` 🎭 Unknown agent ${promptName}, using fallback → ${enhancedVariables.MCP_SERVER}`));
}
const template = await fs.readFile(promptPath, 'utf8');
let template = await fs.readFile(promptPath, 'utf8');
// Pre-process the template to handle @include directives
template = await processIncludes(template, promptsDir);
return await interpolateVariables(template, enhancedVariables, config);
} catch (error) {
if (error instanceof PentestError) {
@@ -207,36 +232,4 @@ export async function loadPrompt(promptName, variables, config = null, pipelineT
const promptError = handlePromptError(promptName, error);
throw promptError.error;
}
}
// Save prompt snapshot for successful agent runs only
export async function savePromptSnapshot(sourceDir, agentName, promptContent) {
const snapshotDir = path.join(sourceDir, 'prompt-snapshots');
await fs.ensureDir(snapshotDir);
// Use deterministic naming - one snapshot per agent
const fileName = `${agentName}.md`;
const filePath = path.join(snapshotDir, fileName);
const timestamp = new Date().toISOString();
const snapshotContent = `# Prompt Snapshot: ${agentName}
**Generated:** ${timestamp}
**Agent:** ${agentName}
---
## Full Interpolated Prompt
\`\`\`markdown
${promptContent}
\`\`\`
---
*This snapshot represents the exact prompt that was sent to Claude Code to generate the current deliverables for this agent.*
`;
await fs.writeFile(filePath, snapshotContent);
console.log(chalk.gray(` 📸 Prompt snapshot saved: prompt-snapshots/${fileName}`));
}
-1
@@ -27,7 +27,6 @@ const VULN_TYPE_CONFIG = Object.freeze({
// Functional composition utilities - async pipe for promise chain
const pipe = (...fns) => x => fns.reduce(async (v, f) => f(await v), x);
const compose = (...fns) => x => fns.reduceRight((v, f) => f(v), x);
// Pure function to create validation rule
const createValidationRule = (predicate, errorMessage, retryable = true) =>
+113 -134
@@ -2,41 +2,17 @@ import { fs, path } from 'zx';
import chalk from 'chalk';
import crypto from 'crypto';
import { PentestError } from './error-handling.js';
import { SessionMutex } from './utils/concurrency.js';
import { promptSelection } from './cli/prompts.js';
// Generate a session-based log folder path
// NEW FORMAT: {hostname}_{sessionId} (no hash, full UUID for consistency with audit system)
export const generateSessionLogPath = (webUrl, sessionId) => {
// Create a hash of the webUrl for uniqueness while keeping it readable
const urlHash = crypto.createHash('md5').update(webUrl).digest('hex').substring(0, 8);
const hostname = new URL(webUrl).hostname.replace(/[^a-zA-Z0-9-]/g, '-');
const shortSessionId = sessionId.substring(0, 8);
const sessionFolderName = `${hostname}_${urlHash}_${shortSessionId}`;
const sessionFolderName = `${hostname}_${sessionId}`;
return path.join(process.cwd(), 'agent-logs', sessionFolderName);
};
// Mutex for session file operations to prevent race conditions
class SessionMutex {
constructor() {
this.locks = new Map();
}
async lock(sessionId) {
if (this.locks.has(sessionId)) {
// Wait for existing lock to be released
await this.locks.get(sessionId);
}
let resolve;
const promise = new Promise(r => resolve = r);
this.locks.set(sessionId, promise);
return () => {
this.locks.delete(sessionId);
resolve();
};
}
}
const sessionMutex = new SessionMutex();
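The inline `SessionMutex` removed above awaits only the lock that exists at the time of the call, so several callers waiting on the same promise appear able to resume together once it resolves. The version now imported from `./utils/concurrency.js` is not shown in this section. As a sketch of one way to serialize waiters strictly, assuming a per-key promise chain, consider the following; `KeyedMutex` is a hypothetical name.

```javascript
// Sketch only: a per-key mutex that queues waiters on a promise chain.
// `KeyedMutex` is a hypothetical class, not the implementation in utils/concurrency.js.
class KeyedMutex {
  constructor() {
    this.tails = new Map(); // key -> promise that settles when the last holder releases
  }

  async lock(key) {
    const previous = this.tails.get(key) ?? Promise.resolve();
    let release;
    const current = new Promise((resolve) => { release = resolve; });
    // The new tail settles only after every earlier holder and this one have released.
    const tail = previous.then(() => current);
    this.tails.set(key, tail);
    await previous; // wait for all earlier holders
    return () => {
      release();
      // Drop the entry when nobody queued behind this holder.
      if (this.tails.get(key) === tail) this.tails.delete(key);
    };
  }
}
```

Usage mirrors the call site in this file: `const unlock = await mutex.lock(sessionId);` followed by `unlock()` in a `finally` block.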
// Agent definitions according to PRD
@@ -242,6 +218,8 @@ export const createSession = async (webUrl, repoPath, configFile = null, targetR
const sessionId = generateSessionId();
// STANDARD: All sessions use 'id' field (NOT 'sessionId')
// This is the canonical session structure used throughout the codebase
const session = {
id: sessionId,
webUrl,
@@ -339,29 +317,10 @@ export const selectSession = async () => {
});
// Get user selection
const { createInterface } = await import('readline');
const readline = createInterface({
input: process.stdin,
output: process.stdout
});
return new Promise((resolve, reject) => {
readline.question(chalk.cyan(`Select session (1-${sessions.length}): `), (answer) => {
readline.close();
const choice = parseInt(answer);
if (isNaN(choice) || choice < 1 || choice > sessions.length) {
reject(new PentestError(
`Invalid selection. Please enter a number between 1 and ${sessions.length}`,
'validation',
false,
{ choice: answer }
));
} else {
resolve(sessions[choice - 1]);
}
});
});
return await promptSelection(
chalk.cyan(`Select session (1-${sessions.length}):`),
sessions
);
};
// Validate agent name
@@ -452,7 +411,9 @@ export const getNextAgent = (session) => {
};
// Mark agent as completed with checkpoint
export const markAgentCompleted = async (sessionId, agentName, checkpointCommit, timingData = null, costData = null, validationData = null) => {
// NOTE: Timing, cost, and validation data now managed by AuditSession (audit-logs/session.json)
// Shannon store contains ONLY orchestration state (completedAgents, checkpoints)
export const markAgentCompleted = async (sessionId, agentName, checkpointCommit) => {
// Use mutex to prevent race conditions during parallel agent execution
const unlock = await sessionMutex.lock(sessionId);
@@ -473,38 +434,6 @@ export const markAgentCompleted = async (sessionId, agentName, checkpointCommit,
[agentName]: checkpointCommit
}
};
// Update timing data if provided
if (timingData) {
updates.timingBreakdown = {
...session.timingBreakdown,
agents: {
...session.timingBreakdown?.agents,
[agentName]: timingData
}
};
}
// Update cost data if provided
if (costData) {
const existingCost = session.costBreakdown?.total || 0;
updates.costBreakdown = {
total: existingCost + costData,
agents: {
...session.costBreakdown?.agents,
[agentName]: costData
}
};
}
// Update validation data if provided (for vulnerability agents)
if (validationData && agentName.includes('-vuln')) {
updates.validationResults = {
...session.validationResults,
[agentName]: validationData
};
}
// Check if all agents are now completed and update session status
const totalAgents = Object.keys(AGENTS).length;
@@ -583,25 +512,12 @@ export const getSessionStatus = (session) => {
export const calculateVulnerabilityAnalysisSummary = (session) => {
const vulnAgents = PHASES['vulnerability-analysis'];
const completedVulnAgents = session.completedAgents.filter(agent => vulnAgents.includes(agent));
const validationResults = session.validationResults || {};
let totalVulnerabilities = 0;
let agentsWithVulns = 0;
for (const agent of completedVulnAgents) {
const validation = validationResults[agent];
if (validation?.vulnerabilityCount > 0) {
totalVulnerabilities += validation.vulnerabilityCount;
agentsWithVulns++;
}
}
// NOTE: Actual vulnerability counts require reading queue files
// This summary only shows completion counts
return Object.freeze({
totalAnalyses: completedVulnAgents.length,
totalVulnerabilities,
agentsWithVulnerabilities: agentsWithVulns,
successRate: completedVulnAgents.length > 0 ? (agentsWithVulns / completedVulnAgents.length) * 100 : 0,
exploitationCandidates: Object.values(validationResults).filter(v => v?.shouldExploit).length
completedAgents: completedVulnAgents
});
};
@@ -609,19 +525,12 @@ export const calculateVulnerabilityAnalysisSummary = (session) => {
export const calculateExploitationSummary = (session) => {
const exploitAgents = PHASES['exploitation'];
const completedExploitAgents = session.completedAgents.filter(agent => exploitAgents.includes(agent));
const validationResults = session.validationResults || {};
// Count how many exploitation agents were eligible to run
const eligibleExploits = exploitAgents.filter(agentName => {
const vulnAgentName = agentName.replace('-exploit', '-vuln');
return validationResults[vulnAgentName]?.shouldExploit;
});
// NOTE: Eligibility requires reading queue files
// This summary only shows completion counts
return Object.freeze({
totalAttempts: completedExploitAgents.length,
eligibleExploits: eligibleExploits.length,
skippedExploits: eligibleExploits.length - completedExploitAgents.length,
successRate: eligibleExploits.length > 0 ? (completedExploitAgents.length / eligibleExploits.length) * 100 : 0
completedAgents: completedExploitAgents
});
};
@@ -656,33 +565,103 @@ export const rollbackToAgent = async (sessionId, targetAgent) => {
Object.entries(session.checkpoints).filter(([agent]) => !agentsToRemove.includes(agent))
)
};
// Clean up timing data for rolled-back agents
if (session.timingBreakdown?.agents) {
const filteredTimingAgents = Object.fromEntries(
Object.entries(session.timingBreakdown.agents).filter(([agent]) => !agentsToRemove.includes(agent))
);
updates.timingBreakdown = {
...session.timingBreakdown,
agents: filteredTimingAgents
};
}
// Clean up cost data for rolled-back agents and recalculate total
if (session.costBreakdown?.agents) {
const filteredCostAgents = Object.fromEntries(
Object.entries(session.costBreakdown.agents).filter(([agent]) => !agentsToRemove.includes(agent))
);
const recalculatedTotal = Object.values(filteredCostAgents).reduce((sum, cost) => sum + cost, 0);
updates.costBreakdown = {
total: recalculatedTotal,
agents: filteredCostAgents
};
}
// NOTE: Timing and cost data now managed in audit-logs/session.json
// Rollback will be reflected via reconcileSession() which marks agents as "rolled-back"
return await updateSession(sessionId, updates);
};
/**
* Reconcile Shannon store with audit logs (self-healing)
*
* This function ensures the Shannon store (.shannon-store.json) is consistent with
* the audit logs (audit-logs/session.json) by syncing agent completion status.
*
* Three-part reconciliation:
* 1. PROMOTIONS: Agents completed in audit → added to Shannon store
* 2. DEMOTIONS: Agents rolled-back in audit → removed from Shannon store
* 3. FAILURES: Agents failed in audit → marked failed in Shannon store
*
* Critical for crash recovery, especially crash during rollback operations.
*
* @param {string} sessionId - Session ID to reconcile
* @returns {Promise<Object>} Reconciliation report with promotions, demotions, and failures arrays
*/
export const reconcileSession = async (sessionId) => {
const { AuditSession } = await import('./audit/index.js');
// Get Shannon store session
const shannonSession = await getSession(sessionId);
if (!shannonSession) {
throw new PentestError(`Session ${sessionId} not found in Shannon store`, 'validation', false);
}
// Get audit session data
const auditSession = new AuditSession(shannonSession);
await auditSession.initialize();
const auditData = await auditSession.getMetrics();
const report = {
promotions: [],
demotions: [],
failures: []
};
// PART 1: PROMOTIONS (Additive)
// Find agents completed in audit but not in Shannon store
const auditCompleted = Object.entries(auditData.metrics.agents)
.filter(([_, agentData]) => agentData.status === 'success')
.map(([agentName]) => agentName);
const missing = auditCompleted.filter(agent => !shannonSession.completedAgents.includes(agent));
for (const agentName of missing) {
const agentData = auditData.metrics.agents[agentName];
const checkpoint = agentData.checkpoint || null;
await markAgentCompleted(sessionId, agentName, checkpoint);
report.promotions.push(agentName);
}
// PART 2: DEMOTIONS (Subtractive) - CRITICAL FOR ROLLBACK RECOVERY
// Find agents rolled-back in audit but still in Shannon store
const auditRolledBack = Object.entries(auditData.metrics.agents)
.filter(([_, agentData]) => agentData.status === 'rolled-back')
.map(([agentName]) => agentName);
const toRemove = shannonSession.completedAgents.filter(agent => auditRolledBack.includes(agent));
if (toRemove.length > 0) {
// Reload session to get fresh state
const freshSession = await getSession(sessionId);
const updates = {
completedAgents: freshSession.completedAgents.filter(agent => !toRemove.includes(agent)),
checkpoints: Object.fromEntries(
Object.entries(freshSession.checkpoints).filter(([agent]) => !toRemove.includes(agent))
)
};
await updateSession(sessionId, updates);
report.demotions.push(...toRemove);
}
// PART 3: FAILURES
// Find agents failed in audit but not marked failed in Shannon store
const auditFailed = Object.entries(auditData.metrics.agents)
.filter(([_, agentData]) => agentData.status === 'failed')
.map(([agentName]) => agentName);
const failedToAdd = auditFailed.filter(agent => !(shannonSession.failedAgents || []).includes(agent));
for (const agentName of failedToAdd) {
await markAgentFailed(sessionId, agentName);
report.failures.push(agentName);
}
return report;
};
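The three-way comparison in reconcileSession can be sketched as a pure function that only computes the plan and performs no I/O. This is an illustrative sketch, not the repository's code: `planReconciliation` and the sample agent names are hypothetical, and the input shape is assumed to mirror `auditData.metrics.agents` (agent name mapped to an object with a `status` field).

```javascript
// Hypothetical helper (not part of the diff): computes what reconciliation
// would change, given audit statuses and the current store state.
function planReconciliation(auditAgents, store) {
  // Collect the names of all agents whose audit status matches `status`
  const byStatus = (status) =>
    Object.entries(auditAgents)
      .filter(([, data]) => data.status === status)
      .map(([name]) => name);

  const completed = store.completedAgents ?? [];
  const failed = store.failedAgents ?? [];
  const rolledBack = byStatus('rolled-back');

  return {
    // Succeeded in audit but missing from the store
    promotions: byStatus('success').filter((name) => !completed.includes(name)),
    // Rolled back in audit but still listed as completed in the store
    demotions: completed.filter((name) => rolledBack.includes(name)),
    // Failed in audit but not yet marked failed in the store
    failures: byStatus('failed').filter((name) => !failed.includes(name)),
  };
}

// Sample data: agent names here are placeholders for illustration only
const plan = planReconciliation(
  {
    recon: { status: 'success' },
    'xss-vuln': { status: 'rolled-back' },
    'auth-vuln': { status: 'failed' },
  },
  { completedAgents: ['xss-vuln'], failedAgents: [] }
);
console.log(plan);
```

Separating the plan from the writes would make the comparison testable without a session store or audit log on disk; whether that split suits this codebase is a design judgment left to the maintainers.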
// Delete a specific session by ID
export const deleteSession = async (sessionId) => {
const store = await loadSessions();
-136
@@ -1,136 +0,0 @@
import { fs, path, os } from 'zx';
import chalk from 'chalk';
import { PentestError, logError } from '../error-handling.js';
// Pure function: Save deliverables permanently to user directory
export async function savePermanentDeliverables(sourceDir, webUrl, repoPath, session, timingBreakdown, costBreakdown) {
try {
// Simple universal approach - try Documents, fallback to home
const homeDir = os.homedir();
const documentsDir = path.join(homeDir, 'Documents');
// Use Documents if it exists, otherwise use home directory
const baseDir = await fs.pathExists(documentsDir) ? documentsDir : homeDir;
const permanentBaseDir = path.join(baseDir, 'pentest-deliverables');
// Generate directory name from repo path and web URL
const repoName = path.basename(repoPath);
const webDomain = new URL(webUrl).hostname.replace(/[^a-zA-Z0-9-]/g, '-');
const timestamp = new Date().toISOString().replace(/[-:]/g, '').replace(/T/, '-').split('.')[0];
const dirName = `${webDomain}_${repoName}_${timestamp}`;
const permanentDir = path.join(permanentBaseDir, dirName);
// Ensure base directory exists
await fs.ensureDir(permanentBaseDir);
// Create the specific pentest directory
await fs.ensureDir(permanentDir);
// Copy deliverables folder if it exists
const deliverablesSource = path.join(sourceDir, 'deliverables');
const deliverablesDest = path.join(permanentDir, 'deliverables');
if (await fs.pathExists(deliverablesSource)) {
await fs.copy(deliverablesSource, deliverablesDest, { overwrite: true });
}
// Save metadata with session information
const metadata = {
session: {
id: session.id,
webUrl,
repoPath,
configFile: session.configFile,
status: session.status,
completedAgents: session.completedAgents,
createdAt: session.createdAt,
completedAt: new Date().toISOString()
},
timing: timingBreakdown,
cost: costBreakdown,
sourceDirectory: sourceDir,
savedAt: new Date().toISOString()
};
await fs.writeJSON(path.join(permanentDir, 'metadata.json'), metadata, { spaces: 2 });
// Copy prompts directory for reproducibility
const promptsSource = path.join(import.meta.dirname, '..', '..', 'prompts');
const promptsDest = path.join(permanentDir, 'prompts');
if (await fs.pathExists(promptsSource)) {
await fs.copy(promptsSource, promptsDest, { overwrite: true });
}
console.log(chalk.green(`✅ Deliverables saved to permanent location: ${permanentDir}`));
return permanentDir;
} catch (error) {
// Non-fatal error - log but don't throw
console.log(chalk.yellow(`⚠️ Failed to save permanent deliverables: ${error.message}`));
return null;
}
}
// Pure function: Save run metadata for debugging and reproducibility
export async function saveRunMetadata(sourceDir, webUrl, repoPath) {
console.log(chalk.blue('💾 Saving run metadata...'));
try {
// Read package.json to get version info with error handling
const packagePath = path.join(import.meta.dirname, '..', '..', 'package.json');
let packageJson;
try {
packageJson = await fs.readJSON(packagePath);
} catch (packageError) {
throw new PentestError(
`Cannot read package.json: ${packageError.message}`,
'filesystem',
false,
{ packagePath, originalError: packageError.message }
);
}
const metadata = {
timestamp: new Date().toISOString(),
targets: { webUrl, repoPath },
environment: {
nodeVersion: process.version,
platform: process.platform,
arch: process.arch,
cwd: process.cwd()
},
dependencies: {
claudeCodeVersion: packageJson.dependencies?.['@anthropic-ai/claude-code'] || 'unknown',
zxVersion: packageJson.dependencies?.['zx'] || 'unknown',
chalkVersion: packageJson.dependencies?.['chalk'] || 'unknown'
},
execution: {
args: process.argv,
env: {
PLAYWRIGHT_HEADLESS: process.env.PLAYWRIGHT_HEADLESS || 'true',
NODE_ENV: process.env.NODE_ENV
}
}
};
const metadataPath = path.join(sourceDir, 'run-metadata.json');
await fs.writeJSON(metadataPath, metadata, { spaces: 2 });
console.log(chalk.green(`✅ Run metadata saved to: ${metadataPath}`));
return metadata;
} catch (error) {
if (error instanceof PentestError) {
await logError(error, 'Saving run metadata', sourceDir);
throw error; // Re-throw PentestError to be handled by caller
}
const metadataError = new PentestError(
`Run metadata saving failed: ${error.message}`,
'filesystem',
false,
{ sourceDir, originalError: error.message }
);
await logError(metadataError, 'Saving run metadata', sourceDir);
throw metadataError;
}
}
+5 -101
@@ -1,96 +1,14 @@
import { $, fs, path } from 'zx';
import chalk from 'chalk';
import { PentestError, logError } from '../error-handling.js';
// Pure function: Setup MCP with multiple isolated Playwright instances
export async function setupMCP(sourceDir) {
console.log(chalk.blue('🎭 Setting up 5 isolated Playwright MCP instances...'));
// Set headless mode for all instances
process.env.PLAYWRIGHT_HEADLESS = 'true';
try {
// Clean slate - remove any existing instances
const instancesToRemove = ['playwright', ...Array.from({length: 5}, (_, i) => `playwright-agent${i + 1}`)];
for (const instance of instancesToRemove) {
try {
await $`claude mcp remove ${instance} --scope user 2>/dev/null`;
} catch {
// Silent ignore - instance might not exist
}
}
// Ensure screenshot directories exist
await fs.ensureDir(path.join(sourceDir, 'screenshots'));
// Create 5 isolated instances sequentially to avoid config conflicts
for (let i = 1; i <= 5; i++) {
const instanceName = `playwright-agent${i}`;
const screenshotDir = path.join(sourceDir, 'screenshots', instanceName);
const userDataDir = `/tmp/${instanceName}`;
// Ensure both directories exist
await fs.ensureDir(screenshotDir);
await fs.ensureDir(userDataDir);
try {
await $`claude mcp add ${instanceName} --scope user -- npx @playwright/mcp@latest --isolated --user-data-dir ${userDataDir} --output-dir ${screenshotDir}`;
console.log(chalk.green(`${instanceName} configured`));
} catch (error) {
if (error.message?.includes('already exists')) {
console.log(chalk.gray(` ⏭️ ${instanceName} already exists`));
} else {
console.log(chalk.yellow(` ⚠️ ${instanceName} failed: ${error.message}, continuing...`));
}
}
}
console.log(chalk.green('✅ All 5 Playwright MCP instances ready for parallel execution'));
} catch (error) {
// All MCP setup failures are fatal
const mcpError = new PentestError(
`Critical MCP setup failure: ${error.message}. Browser automation required for pentesting.`,
'tool',
false,
{ sourceDir, originalError: error.message }
);
await logError(mcpError, 'MCP setup failure', sourceDir);
throw mcpError;
}
}
// Pure function: Cleanup MCP instances
export async function cleanupMCP() {
console.log(chalk.blue('🧹 Cleaning up Playwright MCP instances...'));
try {
// Remove all instances (including legacy 'playwright' if it exists)
const instancesToRemove = ['playwright', ...Array.from({length: 5}, (_, i) => `playwright-agent${i + 1}`)];
for (const instance of instancesToRemove) {
try {
await $`claude mcp remove ${instance} --scope user 2>/dev/null`;
console.log(chalk.gray(` 🗑️ Removed ${instance}`));
} catch {
// Silent ignore - instance might not exist
}
}
console.log(chalk.green('✅ Playwright MCP cleanup complete'));
} catch (error) {
// Non-fatal - log warning but don't throw
console.log(chalk.yellow(`⚠️ MCP cleanup warning: ${error.message}`));
}
}
import { PentestError } from '../error-handling.js';
// Pure function: Setup local repository for testing
export async function setupLocalRepo(repoPath) {
try {
const sourceDir = path.resolve(repoPath);
// Setup MCP in the local repository - critical for browser automation
await setupMCP(sourceDir);
// MCP servers are now configured via mcpServers option in claude-executor.js
// No need for pre-setup with claude CLI
// Initialize git repository if not already initialized and create checkpoint
try {
@@ -114,22 +32,8 @@ export async function setupLocalRepo(repoPath) {
// Non-fatal - continue without Git setup
}
// Copy TOTP generation script to local repository for agent accessibility
try {
const totpScriptSource = path.join(import.meta.dirname, '..', '..', 'login_resources', 'generate-totp-standalone.mjs');
const totpScriptDest = path.join(sourceDir, 'generate-totp.mjs');
if (await fs.pathExists(totpScriptSource)) {
await fs.copy(totpScriptSource, totpScriptDest);
await fs.chmod(totpScriptDest, '755'); // Make executable
console.log(chalk.green('✅ TOTP generation script (standalone) copied to target repository'));
} else {
console.log(chalk.yellow('⚠️ TOTP script not found, authentication may fail if TOTP is required'));
}
} catch (totpError) {
console.log(chalk.yellow(`⚠️ Failed to copy TOTP script: ${totpError.message}`));
// Non-fatal - continue without TOTP script
}
// MCP tools (save_deliverable, generate_totp) are now available natively via shannon-helper MCP server
// No need to copy bash scripts to target repository
return sourceDir;
} catch (error) {
+1 -12
@@ -48,17 +48,6 @@ export const handleMissingTools = (toolAvailability) => {
});
console.log('');
}
return missing;
};
// Check if a specific tool is available
const isToolAvailable = async (toolName) => {
try {
await $`command -v ${toolName}`;
return true;
} catch {
return false;
}
};
+54
@@ -0,0 +1,54 @@
/**
* Concurrency Control Utilities
*
* Provides mutex implementation for preventing race conditions during
* concurrent session operations.
*/
/**
* SessionMutex - Promise-based mutex for session file operations
*
* Prevents race conditions when multiple agents or operations attempt to
* modify the same session data simultaneously. This is particularly important
* during parallel execution of vulnerability analysis and exploitation phases.
*
* Usage:
* ```js
* const mutex = new SessionMutex();
* const unlock = await mutex.lock(sessionId);
* try {
* // Critical section - modify session data
* } finally {
* unlock(); // Always release the lock
* }
* ```
*/
export class SessionMutex {
constructor() {
// Map of sessionId -> Promise (represents active lock)
this.locks = new Map();
}
/**
* Acquire lock for a session
* @param {string} sessionId - Session ID to lock
* @returns {Promise<Function>} Unlock function to release the lock
*/
async lock(sessionId) {
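while (this.locks.has(sessionId)) {
// Wait for the current holder to release, then re-check: when several
// callers wait on the same promise, only the first to wake may proceed
await this.locks.get(sessionId);
}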
// Create new lock promise
let resolve;
const promise = new Promise(r => resolve = r);
this.locks.set(sessionId, promise);
// Return unlock function
return () => {
this.locks.delete(sessionId);
resolve();
};
}
}
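To illustrate how a per-session lock serializes read-modify-write updates during parallel agent execution, here is a minimal, self-contained sketch. `KeyedMutex` is a hypothetical stand-in written for this example (it re-checks the lock after every wake-up), and the agent names, the `sleep` helper, and the in-memory `session` object are assumptions used only to simulate concurrent file updates.

```javascript
// Hypothetical stand-in for a per-session mutex; not the repository's class.
class KeyedMutex {
  constructor() {
    this.locks = new Map(); // key -> Promise that resolves when the lock is released
  }

  async lock(key) {
    // Re-check after each wake-up: another waiter may have taken the lock first
    while (this.locks.has(key)) {
      await this.locks.get(key);
    }
    let release;
    const held = new Promise((resolve) => { release = resolve; });
    this.locks.set(key, held);
    return () => {
      this.locks.delete(key);
      release();
    };
  }
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Simulates three agents appending to the same session record concurrently
async function runParallelUpdates() {
  const mutex = new KeyedMutex();
  const session = { completedAgents: [] };
  let inCriticalSection = 0;
  let maxConcurrent = 0;

  const markCompleted = async (agentName) => {
    const unlock = await mutex.lock('session-1');
    try {
      inCriticalSection++;
      maxConcurrent = Math.max(maxConcurrent, inCriticalSection);
      const snapshot = [...session.completedAgents];      // read
      await sleep(5);                                     // simulated file I/O
      session.completedAgents = [...snapshot, agentName]; // write
    } finally {
      inCriticalSection--;
      unlock(); // always release, even if the update throws
    }
  };

  await Promise.all(['injection-vuln', 'xss-vuln', 'auth-vuln'].map(markCompleted));
  return { session, maxConcurrent };
}

runParallelUpdates().then(({ session, maxConcurrent }) => {
  console.log(session.completedAgents, maxConcurrent);
});
```

Without the lock, each simulated agent would read the same empty snapshot and the last write would overwrite the others. The `maxConcurrent` counter is included only to make the serialization observable.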
+1 -1
@@ -72,7 +72,7 @@ export const executeGitCommandWithRetry = async (commandArgs, sourceDir, descrip
};
// Pure functions for Git workspace management
export const cleanWorkspace = async (sourceDir, reason = 'clean start') => {
const cleanWorkspace = async (sourceDir, reason = 'clean start') => {
console.log(chalk.blue(` 🧹 Cleaning workspace for ${reason}`));
try {
// Check for uncommitted changes
-126
@@ -1,126 +0,0 @@
import { fs } from 'zx';
import { path } from 'zx';
/**
* Strips ANSI escape codes from a string
* @param {string} str - String with ANSI codes
* @returns {string} Clean string without ANSI codes
*/
function stripAnsi(str) {
if (typeof str !== 'string') {
return str;
}
// Remove ANSI escape sequences
// This regex matches all common ANSI codes including:
// - Colors (e.g., \x1b[32m)
// - Cursor movement (e.g., \x1b[1;1H)
// - Screen clearing (e.g., \x1b[0J)
// - 256-color codes (e.g., \x1b[38;2;244;197;66m)
return str.replace(
// eslint-disable-next-line no-control-regex
/\x1b\[[0-9;]*[a-zA-Z]|\x1b\][0-9];.*?\x07|\x1b\[[\d;]*m/g,
''
);
}
/**
* Sets up logging to capture all stdout and stderr to a file
* @param {string} logFilePath - Path to the log file
* @returns {Promise<Function>} Cleanup function to restore original streams
*/
export async function setupLogging(logFilePath) {
// Resolve to absolute path
const absoluteLogPath = path.isAbsolute(logFilePath)
? logFilePath
: path.join(process.cwd(), logFilePath);
// Ensure the directory exists
await fs.ensureDir(path.dirname(absoluteLogPath));
// Create write stream for the log file
const logStream = fs.createWriteStream(absoluteLogPath, { flags: 'a' });
// Buffer for lines that might be overwritten (carriage return without newline)
let stdoutBuffer = '';
let stderrBuffer = '';
// Store original stdout/stderr write functions
const originalStdoutWrite = process.stdout.write.bind(process.stdout);
const originalStderrWrite = process.stderr.write.bind(process.stderr);
// Override stdout
process.stdout.write = function(chunk, encoding, callback) {
// Write colorized output to terminal
originalStdoutWrite(chunk, encoding, callback);
// Write plain text (without ANSI codes) to log file
const cleanChunk = stripAnsi(chunk.toString());
// Handle carriage returns - only log when we get a newline
if (cleanChunk.includes('\r') && !cleanChunk.includes('\n')) {
// Buffer this line - it will be overwritten in terminal
stdoutBuffer = cleanChunk.replace(/\r/g, '');
} else if (cleanChunk.includes('\n')) {
// Flush buffer if exists, then write the new line
if (stdoutBuffer) {
stdoutBuffer = ''; // Clear buffer without writing (it was overwritten)
}
logStream.write(cleanChunk);
} else {
// Normal write
logStream.write(cleanChunk);
}
return true;
};
// Override stderr
process.stderr.write = function(chunk, encoding, callback) {
// Write colorized output to terminal
originalStderrWrite(chunk, encoding, callback);
// Write plain text (without ANSI codes) to log file
const cleanChunk = stripAnsi(chunk.toString());
// Handle carriage returns - only log when we get a newline
if (cleanChunk.includes('\r') && !cleanChunk.includes('\n')) {
// Buffer this line - it will be overwritten in terminal
stderrBuffer = cleanChunk.replace(/\r/g, '');
} else if (cleanChunk.includes('\n')) {
// Flush buffer if exists, then write the new line
if (stderrBuffer) {
stderrBuffer = ''; // Clear buffer without writing (it was overwritten)
}
logStream.write(cleanChunk);
} else {
// Normal write
logStream.write(cleanChunk);
}
return true;
};
// Return cleanup function
return async function cleanup() {
// Restore original streams
process.stdout.write = originalStdoutWrite;
process.stderr.write = originalStderrWrite;
// Flush any remaining buffers
if (stdoutBuffer) {
logStream.write(stdoutBuffer + '\n');
}
if (stderrBuffer) {
logStream.write(stderrBuffer + '\n');
}
// Close the log stream
return new Promise((resolve, reject) => {
logStream.end((err) => {
if (err) reject(err);
else resolve();
});
});
};
}
+1 -7
@@ -1,13 +1,7 @@
import chalk from 'chalk';
import { formatDuration } from '../audit/utils.js';
// Timing utilities
export const formatDuration = (ms) => {
if (ms < 1000) return `${ms}ms`;
if (ms < 60000) return `${(ms / 1000).toFixed(1)}s`;
const minutes = Math.floor(ms / 60000);
const seconds = Math.floor((ms % 60000) / 1000);
return `${minutes}m ${seconds}s`;
};
export class Timer {
constructor(name) {