feat: implement unified audit system v3.0 with crash-safety and self-healing

## Unified Audit System (v3.0)
- Implemented crash-safe, append-only logging to audit-logs/{hostname}_{sessionId}/
- Added session.json with comprehensive metrics (timing, cost, attempts)
- Agent execution logs with turn-by-turn detail
- Prompt snapshots saved to audit-logs/.../prompts/{agent}.md
- SessionMutex prevents race conditions during parallel execution
- Self-healing reconciliation before every CLI command

## Session Metadata Standardization
- Fixed critical bug: standardized on 'id' field (not 'sessionId') throughout codebase
- Updated: shannon.mjs (recon, report), src/phases/pre-recon.js
- Added validation in AuditSession to fail fast on incorrect field usage
- Root cause: JavaScript object shorthand (`{ sessionId }`) silently emitted the wrong field name
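
A minimal reproduction of this bug class (the validator shown is a sketch of the kind of fail-fast check added to AuditSession; the exact function name is an assumption):

```javascript
// Object shorthand silently produces a 'sessionId' key, not the standard 'id':
const sessionId = 'abc123';
const wrong = { sessionId };     // -> { sessionId: 'abc123' }, no 'id' field
const right = { id: sessionId }; // -> { id: 'abc123' }, matches the standard

// Hypothetical fail-fast validation of the kind added to AuditSession:
function validateSessionMetadata(meta) {
  if (!meta.id) {
    throw new Error(
      `Session metadata must use 'id' (got keys: ${Object.keys(meta).join(', ')})`
    );
  }
  return meta;
}
```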

## Schema Improvements
- session.json: Added cost_usd per phase, removed redundant final_cost_usd
- Renamed 'percentage' -> 'duration_percentage' for clarity
- Simplified agent metrics to single total_cost_usd field
- Removed unused validation object from schema

## Legacy System Removal
- Removed savePromptSnapshot() - prompts now only saved by audit system
- Removed target repo pollution (prompt-snapshots/ no longer created)
- Single source of truth: audit-logs/{hostname}_{sessionId}/prompts/

## Export Script Simplification
- Removed JSON export mode (session.json already exists)
- CSV-only export with clean columns: agent, phase, status, attempts, duration_ms, cost_usd
- Tested on real session data
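
A row of the resulting CSV looks like this (header from the column list above; the data values are illustrative, not from a real session):

```
agent,phase,status,attempts,duration_ms,cost_usd
recon,recon,completed,1,184203,0.4312
```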

## Documentation
- Updated CLAUDE.md with audit system architecture
- Added .gitignore entry for audit-logs/

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
ajmallesh
2025-10-22 16:09:08 -07:00
parent a9e00ca19f
commit 27334a4dd6
18 changed files with 1871 additions and 206 deletions
+2 -1
@@ -1,3 +1,4 @@
node_modules/
.shannon-store.json
agent-logs/
audit-logs/
+102 -36
@@ -188,17 +188,47 @@ The agent implements a sophisticated checkpoint system using git:
- Every agent creates a git checkpoint before execution
- Rollback to any previous agent state using `--rollback-to` or `--rerun`
- Failed agents don't affect completed work
- Timing and cost data cleaned up during rollbacks
- Rolled-back agents marked in audit system with status: "rolled-back"
- Reconciliation automatically syncs Shannon store with audit logs after rollback
- Fail-fast safety prevents accidental re-execution of completed agents
### Timing & Performance Monitoring
The agent includes comprehensive timing instrumentation that tracks:
- Total execution time
- Phase-level timing breakdown
- Individual command execution times
- Claude Code agent processing times
- Cost tracking for AI agent usage
### Unified Audit & Metrics System
The agent implements a crash-safe, self-healing audit system (v3.0) with the following guarantees:
**Architecture:**
- **audit-logs/**: Centralized metrics and forensic logs (source of truth)
- `{hostname}_{sessionId}/session.json` - Comprehensive metrics with attempt-level detail
- `{hostname}_{sessionId}/prompts/` - Exact prompts used for reproducibility
- `{hostname}_{sessionId}/agents/` - Turn-by-turn execution logs
- **.shannon-store.json**: Minimal orchestration state (completedAgents, checkpoints)
**Crash Safety:**
- Append-only logging with immediate flush (survives kill -9)
- Atomic writes for session.json (no partial writes)
- Event-based logging (tool_start, tool_end, llm_response) closes data loss windows
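
The two write patterns above can be sketched as follows (function names are hypothetical; the real implementation lives in `src/audit/`):

```javascript
import fs from 'fs';

// Append-only logging: open with the 'a' flag and fsync so each event
// reaches disk before the call returns — survives an abrupt kill -9.
function appendEvent(logFile, event) {
  const fd = fs.openSync(logFile, 'a');
  try {
    fs.writeSync(fd, JSON.stringify(event) + '\n');
    fs.fsyncSync(fd); // flush to disk immediately
  } finally {
    fs.closeSync(fd);
  }
}

// Atomic JSON write: write to a temp file, then rename over the target,
// so readers never observe a partially written session.json.
function writeJsonAtomic(target, data) {
  const tmp = target + '.tmp';
  fs.writeFileSync(tmp, JSON.stringify(data, null, 2));
  fs.renameSync(tmp, target); // rename is atomic on POSIX filesystems
}
```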
**Self-Healing:**
- Automatic reconciliation before every CLI command
- Recovers from crashes during rollback
- Audit logs are source of truth; Shannon store follows
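
A sketch of the reconciliation rule: the audit log's view of each agent wins, and the Shannon store is rewritten to match. Field and status names here are assumptions based on the report shape in this commit (`promotions`/`demotions`), not the exact implementation:

```javascript
// Audit logs are the source of truth; the store follows.
function reconcile(auditAgents, store) {
  const completed = Object.entries(auditAgents)
    .filter(([, agent]) => agent.status === 'completed')
    .map(([name]) => name);

  const report = {
    // Agents the audit log says finished but the store missed:
    promotions: completed.filter(name => !store.completedAgents.includes(name)),
    // Agents the store still lists but the audit log rolled back or failed:
    demotions: store.completedAgents.filter(name => !completed.includes(name)),
  };

  store.completedAgents = completed; // store follows the audit logs
  return report;
}
```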
**Forensic Completeness:**
- All retry attempts logged with errors, costs, durations
- Rolled-back agents preserved with status: "rolled-back"
- Partial cost capture for failed attempts
- Complete event trail for debugging
**Concurrency Safety:**
- SessionMutex prevents race conditions during parallel agent execution
- Safe parallel execution of vulnerability and exploitation phases
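
A minimal sketch of a promise-chain mutex of the kind SessionMutex implements (the class name matches; the method name and internals are assumptions):

```javascript
// Each caller queues behind the previous holder, serializing
// session.json updates during parallel agent execution.
class SessionMutex {
  constructor() {
    this._tail = Promise.resolve();
  }

  // Run fn exclusively; concurrent callers proceed in FIFO order.
  runExclusive(fn) {
    const run = this._tail.then(() => fn());
    // Keep the chain alive even if fn rejects.
    this._tail = run.catch(() => {});
    return run;
  }
}
```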
**Metrics & Reporting:**
- Export metrics with `./scripts/export-metrics.js`
- Manual reconciliation (diagnostics) with `./scripts/reconcile-session.js`
- Phase-level and agent-level timing/cost aggregations
- Validation results integrated with metrics
For detailed design, see `docs/unified-audit-system-design.md`.
## Development Notes
@@ -232,34 +262,56 @@ The tool should only be used on systems you own or have explicit permission to t
## File Structure
```
shannon.mjs # Main orchestration script
package.json # Node.js dependencies
.shannon-store.json # Orchestration state (minimal)
src/ # Core modules
├── audit/ # Unified audit system (v3.0)
│ ├── index.js # Public API
│ ├── audit-session.js # Main facade (logger + metrics + mutex)
│ ├── logger.js # Append-only crash-safe logging
│ ├── metrics-tracker.js # Timing, cost, attempt tracking
│ └── utils.js # Path generation, atomic writes
├── config-parser.js # Configuration handling
├── error-handling.js # Error management
├── tool-checker.js # Tool validation
├── session-manager.js # Session state + reconciliation
├── checkpoint-manager.js # Git-based checkpointing + rollback
├── queue-validation.js # Deliverable validation
├── ai/
│ └── claude-executor.js # Claude Code SDK integration
└── utils/
audit-logs/ # Centralized audit data (v3.0)
└── {hostname}_{sessionId}/
├── session.json # Comprehensive metrics
├── prompts/ # Prompt snapshots
│ └── {agent}.md
└── agents/ # Agent execution logs
└── {timestamp}_{agent}_attempt-{N}.log
configs/ # Configuration files
├── config-schema.json # JSON Schema validation
├── example-config.yaml # Template configuration
├── juice-shop-config.yaml # Juice Shop example
├── keygraph-config.yaml # Keygraph configuration
├── chatwoot-config.yaml # Chatwoot configuration
├── metabase-config.yaml    # Metabase configuration
└── cal-com-config.yaml # Cal.com configuration
prompts/ # AI prompt templates
├── pre-recon-code.txt      # Code analysis
├── recon.txt # Reconnaissance
├── vuln-*.txt # Vulnerability assessment
├── exploit-*.txt # Exploitation
└── report-executive.txt # Executive reporting
login_resources/ # Authentication utilities
├── generate-totp.mjs # TOTP generation
└── login_instructions.txt # Login documentation
scripts/ # Utility scripts
├── reconcile-session.js # Manual reconciliation (diagnostics)
└── export-metrics.js       # Export metrics to CSV
deliverables/ # Output directory (in target repo)
docs/ # Documentation
├── unified-audit-system-design.md
└── migration-guide.md
```
## Troubleshooting
@@ -275,4 +327,18 @@ deliverables/ # Output directory
Missing tools can be skipped using `--pipeline-testing` mode during development:
- `nmap` - Network scanning
- `subfinder` - Subdomain discovery
- `whatweb` - Web technology detection
### Diagnostic & Utility Scripts
```bash
# Manual reconciliation (for diagnostics only)
./scripts/reconcile-session.js --session-id <id> --dry-run --verbose
# Export metrics to CSV
./scripts/export-metrics.js --session-id <id> --output metrics.csv
# System-wide consistency audit
./scripts/reconcile-session.js --all-sessions --dry-run
```
Note: Manual reconciliation should rarely be needed. Frequent use indicates bugs in automatic reconciliation.
+2 -3
@@ -1,7 +1,6 @@
Use the save_deliverable script to create your deliverable:
Run this command and do nothing else:
```bash
node save_deliverable.js CODE_ANALYSIS 'Pre-recon analysis complete'
```
This will automatically create `deliverables/code_analysis_deliverable.md` with the correct filename.
Then say "Done".
+2 -3
@@ -1,7 +1,6 @@
Use the save_deliverable script to create your deliverable:
Run this command and do nothing else:
```bash
node save_deliverable.js RECON 'Reconnaissance analysis complete'
```
This will automatically create `deliverables/recon_deliverable.md` with the correct filename.
Then say "Done".
+150
@@ -0,0 +1,150 @@
#!/usr/bin/env node
/**
* Export Metrics Script
*
* Export session metrics from audit logs to CSV format for analysis.
*
* Use Cases:
* - Performance analysis across sessions
* - Cost tracking and budgeting
* - Agent success rate analysis
* - Benchmarking improvements
*/
import chalk from 'chalk';
import { fs, path } from 'zx';
import { getSession } from '../src/session-manager.js';
import { AuditSession } from '../src/audit/index.js';
// Parse command-line arguments
function parseArgs() {
const args = {
sessionId: null,
output: null
};
for (let i = 2; i < process.argv.length; i++) {
const arg = process.argv[i];
if (arg === '--session-id' && process.argv[i + 1]) {
args.sessionId = process.argv[i + 1];
i++;
} else if (arg === '--output' && process.argv[i + 1]) {
args.output = process.argv[i + 1];
i++;
} else if (arg === '--help' || arg === '-h') {
printUsage();
process.exit(0);
} else {
console.log(chalk.red(`❌ Unknown argument: ${arg}`));
printUsage();
process.exit(1);
}
}
return args;
}
function printUsage() {
console.log(chalk.cyan('\n📊 Export Metrics to CSV'));
console.log(chalk.gray('\nUsage: ./scripts/export-metrics.js [options]\n'));
console.log(chalk.white('Options:'));
console.log(chalk.gray(' --session-id <id> Session ID to export (required)'));
console.log(chalk.gray(' --output <file> Output CSV file path (default: stdout)'));
console.log(chalk.gray(' --help, -h Show this help\n'));
console.log(chalk.white('Examples:'));
console.log(chalk.gray(' # Export to stdout'));
console.log(chalk.gray(' ./scripts/export-metrics.js --session-id abc123\n'));
console.log(chalk.gray(' # Export to file'));
console.log(chalk.gray(' ./scripts/export-metrics.js --session-id abc123 --output metrics.csv\n'));
}
// Export metrics for a session
async function exportMetrics(sessionId) {
const session = await getSession(sessionId);
if (!session) {
throw new Error(`Session ${sessionId} not found`);
}
const auditSession = new AuditSession(session);
await auditSession.initialize();
const metrics = await auditSession.getMetrics();
return exportAsCSV(session, metrics);
}
// Export as CSV
function exportAsCSV(session, metrics) {
const lines = [];
// Header
lines.push('agent,phase,status,attempts,duration_ms,cost_usd');
// Phase mapping
const phaseMap = {
'pre-recon': 'pre-recon',
'recon': 'recon',
'injection-vuln': 'vulnerability-analysis',
'xss-vuln': 'vulnerability-analysis',
'auth-vuln': 'vulnerability-analysis',
'authz-vuln': 'vulnerability-analysis',
'ssrf-vuln': 'vulnerability-analysis',
'injection-exploit': 'exploitation',
'xss-exploit': 'exploitation',
'auth-exploit': 'exploitation',
'authz-exploit': 'exploitation',
'ssrf-exploit': 'exploitation',
'report': 'reporting'
};
// Agent rows
for (const [agentName, agentData] of Object.entries(metrics.metrics.agents)) {
const phase = phaseMap[agentName] || 'unknown';
lines.push([
agentName,
phase,
agentData.status,
agentData.attempts.length,
agentData.final_duration_ms,
agentData.total_cost_usd.toFixed(4)
].join(','));
}
return lines.join('\n');
}
// Main execution
async function main() {
const args = parseArgs();
if (!args.sessionId) {
console.log(chalk.red('❌ Must specify --session-id'));
printUsage();
process.exit(1);
}
console.log(chalk.cyan.bold('\n📊 Exporting Metrics to CSV\n'));
console.log(chalk.gray(`Session ID: ${args.sessionId}\n`));
const output = await exportMetrics(args.sessionId);
if (args.output) {
await fs.writeFile(args.output, output);
console.log(chalk.green(`✅ Exported to: ${args.output}`));
} else {
console.log(chalk.cyan('CSV Output:\n'));
console.log(output);
}
console.log();
}
main().catch(error => {
console.log(chalk.red.bold(`\n🚨 Fatal error: ${error.message}`));
if (process.env.DEBUG) {
console.log(chalk.gray(error.stack));
}
process.exit(1);
});
+225
@@ -0,0 +1,225 @@
#!/usr/bin/env node
/**
* Manual Session Reconciliation Script
*
* Purpose: Diagnostics and exceptional recovery (NOT normal operations).
*
* Use Cases:
* 1. Diagnostics (Primary): Non-destructively report inconsistencies
* 2. Debugging: Test reconciliation logic in isolation
* 3. Exceptional Recovery: Malformed JSON recovery, reconciliation bugs
* 4. Bulk Operations: System-wide consistency audit
*
* Design Principle:
* "Self-healing is the norm. Manual intervention is the exception."
*
* Red Flags (indicate bugs):
* - Manual script needed frequently
* - Automatic reconciliation failing consistently
* - Manual intervention after every crash
*/
import chalk from 'chalk';
import { fs, path } from 'zx';
import { reconcileSession, getSession } from '../src/session-manager.js';
const STORE_FILE = path.join(process.cwd(), '.shannon-store.json');
// Parse command-line arguments
function parseArgs() {
const args = {
sessionId: null,
allSessions: false,
dryRun: false,
verbose: false
};
for (let i = 2; i < process.argv.length; i++) {
const arg = process.argv[i];
if (arg === '--session-id' && process.argv[i + 1]) {
args.sessionId = process.argv[i + 1];
i++;
} else if (arg === '--all-sessions') {
args.allSessions = true;
} else if (arg === '--dry-run') {
args.dryRun = true;
} else if (arg === '--verbose') {
args.verbose = true;
} else if (arg === '--help' || arg === '-h') {
printUsage();
process.exit(0);
} else {
console.log(chalk.red(`❌ Unknown argument: ${arg}`));
printUsage();
process.exit(1);
}
}
return args;
}
function printUsage() {
console.log(chalk.cyan('\n📋 Manual Session Reconciliation Script'));
console.log(chalk.gray('\nUsage: ./scripts/reconcile-session.js [options]\n'));
console.log(chalk.white('Options:'));
console.log(chalk.gray(' --session-id <id> Reconcile specific session'));
console.log(chalk.gray(' --all-sessions Reconcile all sessions'));
console.log(chalk.gray(' --dry-run Report inconsistencies without fixing'));
console.log(chalk.gray(' --verbose Detailed logging'));
console.log(chalk.gray(' --help, -h Show this help\n'));
console.log(chalk.white('Examples:'));
console.log(chalk.gray(' # Diagnostics (primary use case)'));
console.log(chalk.gray(' ./scripts/reconcile-session.js --session-id abc123 --dry-run\n'));
console.log(chalk.gray(' # System-wide consistency audit'));
console.log(chalk.gray(' ./scripts/reconcile-session.js --all-sessions --dry-run --verbose\n'));
console.log(chalk.gray(' # Exceptional recovery'));
console.log(chalk.gray(' ./scripts/reconcile-session.js --session-id abc123\n'));
}
// Load all sessions
async function loadAllSessions() {
try {
if (!await fs.pathExists(STORE_FILE)) {
return [];
}
const content = await fs.readFile(STORE_FILE, 'utf8');
const store = JSON.parse(content);
return Object.values(store.sessions || {});
} catch (error) {
throw new Error(`Failed to load sessions: ${error.message}`);
}
}
// Reconcile a single session
async function reconcileSingleSession(sessionId, dryRun, verbose) {
try {
const session = await getSession(sessionId);
if (!session) {
console.log(chalk.red(`❌ Session ${sessionId} not found`));
return { success: false, sessionId };
}
if (verbose) {
console.log(chalk.blue(`\n🔍 Analyzing session: ${sessionId}`));
console.log(chalk.gray(` Web URL: ${session.webUrl}`));
console.log(chalk.gray(` Status: ${session.status}`));
console.log(chalk.gray(` Completed Agents: ${session.completedAgents.length}`));
}
if (dryRun) {
console.log(chalk.yellow(` [DRY RUN] Would reconcile session ${sessionId.substring(0, 8)}...`));
return { success: true, sessionId, dryRun: true };
}
// Perform actual reconciliation
const report = await reconcileSession(sessionId);
const hasChanges = report.promotions.length > 0 ||
report.demotions.length > 0 ||
report.failures.length > 0;
if (hasChanges) {
console.log(chalk.green(`✅ Reconciled session ${sessionId.substring(0, 8)}...`));
if (report.promotions.length > 0) {
console.log(chalk.blue(` Added ${report.promotions.length} completed agents: ${report.promotions.join(', ')}`));
}
if (report.demotions.length > 0) {
console.log(chalk.yellow(` Removed ${report.demotions.length} rolled-back agents: ${report.demotions.join(', ')}`));
}
if (report.failures.length > 0) {
console.log(chalk.red(` ❌ Marked ${report.failures.length} failed agents: ${report.failures.join(', ')}`));
}
} else {
if (verbose) {
console.log(chalk.gray(` ✓ No inconsistencies found`));
}
}
return { success: true, sessionId, ...report };
} catch (error) {
console.log(chalk.red(`❌ Failed to reconcile session ${sessionId}: ${error.message}`));
return { success: false, sessionId, error: error.message };
}
}
// Main execution
async function main() {
const args = parseArgs();
console.log(chalk.cyan.bold('\n🔄 Manual Session Reconciliation\n'));
if (args.dryRun) {
console.log(chalk.yellow('⚠️ DRY RUN MODE - No changes will be made\n'));
}
let sessions = [];
if (args.sessionId) {
sessions = [{ id: args.sessionId }];
} else if (args.allSessions) {
sessions = await loadAllSessions();
console.log(chalk.blue(`Found ${sessions.length} sessions\n`));
} else {
console.log(chalk.red('❌ Must specify either --session-id or --all-sessions'));
printUsage();
process.exit(1);
}
const results = {
total: sessions.length,
success: 0,
failed: 0,
totalPromotions: 0,
totalDemotions: 0,
totalFailures: 0
};
for (const session of sessions) {
const result = await reconcileSingleSession(session.id, args.dryRun, args.verbose);
if (result.success) {
results.success++;
results.totalPromotions += result.promotions?.length || 0;
results.totalDemotions += result.demotions?.length || 0;
results.totalFailures += result.failures?.length || 0;
} else {
results.failed++;
}
}
// Summary
console.log(chalk.cyan.bold('\n📊 Summary:'));
console.log(chalk.gray(`Total sessions: ${results.total}`));
console.log(chalk.green(`Successful: ${results.success}`));
if (results.failed > 0) {
console.log(chalk.red(`Failed: ${results.failed}`));
}
console.log(chalk.blue(`Promotions: ${results.totalPromotions}`));
console.log(chalk.yellow(`Demotions: ${results.totalDemotions}`));
console.log(chalk.red(`Failures: ${results.totalFailures}`));
// Health check
if (args.allSessions) {
const consistencyRate = (results.success / results.total) * 100;
console.log(chalk.cyan(`\n📈 Consistency Rate: ${consistencyRate.toFixed(1)}%`));
if (consistencyRate < 98) {
console.log(chalk.red('\n⚠️ WARNING: Low consistency rate detected!'));
console.log(chalk.red('This may indicate bugs in automatic reconciliation.'));
}
}
console.log();
}
main().catch(error => {
console.log(chalk.red.bold(`\n🚨 Fatal error: ${error.message}`));
if (process.env.DEBUG) {
console.log(chalk.gray(error.stack));
}
process.exit(1);
});
+2 -2
@@ -233,7 +233,7 @@ async function main(webUrl, repoPath, configPath = null, pipelineTestingMode = f
AGENTS['recon'].displayName,
'recon', // Agent name for snapshot creation
chalk.cyan,
{ webUrl, sessionId: session.id } // Session metadata for logging
{ id: session.id, webUrl } // Session metadata for audit logging (STANDARD: use 'id' field)
);
const reconDuration = reconTimer.stop();
timingResults.phases['recon'] = reconDuration;
@@ -309,7 +309,7 @@ async function main(webUrl, repoPath, configPath = null, pipelineTestingMode = f
'Executive Summary and Report Cleanup',
'report', // Agent name for snapshot creation
chalk.cyan,
{ webUrl, sessionId: session.id } // Session metadata for logging
{ id: session.id, webUrl } // Session metadata for audit logging (STANDARD: use 'id' field)
);
const reportDuration = reportTimer.stop();
+116 -58
@@ -6,10 +6,10 @@ import { isRetryableError, getRetryDelay, PentestError } from '../error-handling
import { ProgressIndicator } from '../progress-indicator.js';
import { timingResults, costResults, Timer, formatDuration } from '../utils/metrics.js';
import { createGitCheckpoint, commitGitSuccess, rollbackGitWorkspace } from '../utils/git-manager.js';
import { savePromptSnapshot } from '../prompts/prompt-manager.js';
import { AGENT_VALIDATORS } from '../constants.js';
import { filterJsonToolCalls, getAgentPrefix } from '../utils/output-formatter.js';
import { generateSessionLogPath } from '../session-manager.js';
import { AuditSession } from '../audit/index.js';
// Simplified validation using direct agent name mapping
async function validateAgentOutput(result, agentName, sourceDir) {
@@ -57,10 +57,11 @@ async function validateAgentOutput(result, agentName, sourceDir) {
// - Output validation
// - Prompt snapshotting for debugging
// - Git checkpoint/rollback safety
async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context = '', description = 'Claude analysis', colorFn = chalk.cyan, sessionMetadata = null) {
async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context = '', description = 'Claude analysis', colorFn = chalk.cyan, sessionMetadata = null, auditSession = null, attemptNumber = 1) {
const timer = new Timer(`agent-${description.toLowerCase().replace(/\s+/g, '-')}`);
const fullPrompt = context ? `${context}\n\n${prompt}` : prompt;
let totalCost = 0;
let partialCost = 0; // Track partial cost for crash safety
// Auto-detect execution mode to adjust logging behavior
const isParallelExecution = description.includes('vuln agent') || description.includes('exploit agent');
@@ -82,28 +83,14 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
progressIndicator = new ProgressIndicator(`Running ${agentType}...`);
}
// Setup detailed logging for all agents (if session metadata is available)
// NOTE: Logging now handled by AuditSession (append-only, crash-safe)
// Legacy log path generation kept for compatibility
let logFilePath = null;
let logBuffer = [];
if (sessionMetadata && sessionMetadata.webUrl && sessionMetadata.sessionId) {
if (sessionMetadata && sessionMetadata.webUrl && sessionMetadata.id) {
const timestamp = new Date().toISOString().replace(/T/, '_').replace(/[:.]/g, '-').slice(0, 19);
const agentName = description.toLowerCase().replace(/\s+/g, '-');
// Use session-based folder structure
const logDir = generateSessionLogPath(sessionMetadata.webUrl, sessionMetadata.sessionId);
await fs.ensureDir(logDir);
logFilePath = path.join(logDir, `${timestamp}_${agentName}_attempt-1.log`);
// Initialize log with agent startup info
const sessionId = sessionMetadata?.sessionId || path.basename(sourceDir).split('-').pop().substring(0, 8);
logBuffer.push(`=== ${description} - Detailed Execution Log ===`);
logBuffer.push(`Timestamp: ${new Date().toISOString()}`);
logBuffer.push(`Working Directory: ${sourceDir}`);
logBuffer.push(`Session ID: ${sessionId}`);
logBuffer.push(`Log File: ${logFilePath}`);
logBuffer.push(`\n=== Agent Execution Start ===\n`);
const logDir = generateSessionLogPath(sessionMetadata.webUrl, sessionMetadata.id);
logFilePath = path.join(logDir, `${timestamp}_${agentName}_attempt-${attemptNumber}.log`);
} else {
console.log(chalk.blue(` 🤖 Running Claude Code: ${description}...`));
}
@@ -114,7 +101,6 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
maxTurns: 10_000, // Maximum turns for autonomous work
cwd: sourceDir, // Set working directory using SDK option
permissionMode: 'bypassPermissions', // Bypass all permission checks for pentesting
customSystemPrompt: fullPrompt, // Use system prompt for better security and consistency
};
// SDK Options only shown for verbose agents (not clean output)
@@ -132,7 +118,7 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
progressIndicator.start();
}
for await (const message of query({ prompt: 'Begin.', options })) {
for await (const message of query({ prompt: fullPrompt, options })) {
if (message.type === "assistant") {
turnCount++;
const content = Array.isArray(message.message.content)
@@ -177,9 +163,15 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
console.log(colorFn(` ${content}`));
}
// Log full details to file for later review
logBuffer.push(`\n🤖 Turn ${turnCount} (${description}):`);
logBuffer.push(content);
// Log to audit system (crash-safe, append-only)
if (auditSession) {
await auditSession.logEvent('llm_response', {
turn: turnCount,
content,
timestamp: new Date().toISOString()
});
}
messages.push(content);
// Check for API error patterns in assistant message content
@@ -210,6 +202,15 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
if (message.input && Object.keys(message.input).length > 0) {
console.log(chalk.gray(` Input: ${JSON.stringify(message.input, null, 2)}`));
}
// Log tool start event
if (auditSession) {
await auditSession.logEvent('tool_start', {
toolName: message.name,
parameters: message.input,
timestamp: new Date().toISOString()
});
}
} else if (message.type === "tool_result") {
console.log(chalk.green(` ✅ Tool Result:`));
if (message.content) {
@@ -221,6 +222,14 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
console.log(chalk.gray(` ${resultStr}`));
}
}
// Log tool end event
if (auditSession) {
await auditSession.logEvent('tool_end', {
result: message.content,
timestamp: new Date().toISOString()
});
}
} else if (message.type === "result") {
result = message.result;
@@ -273,8 +282,9 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
costResults.agents[agentKey] = cost;
costResults.total += cost;
// Store cost for return value
// Store cost for return value and partial tracking
totalCost = cost;
partialCost = cost;
break;
} else {
// Log any other message types we might not be handling
@@ -292,23 +302,14 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
console.log(chalk.yellow(` ⚠️ API Error detected in ${description} - will validate deliverables before failing`));
}
// Finish status line for parallel execution and save detailed log
// Finish status line for parallel execution
if (statusManager) {
statusManager.clearAgentStatus(description);
statusManager.finishStatusLine();
}
// Write detailed log to file
if (logFilePath && logBuffer.length > 0) {
logBuffer.push(`\n=== Agent Execution Complete ===`);
logBuffer.push(`Duration: ${formatDuration(duration)}`);
logBuffer.push(`Turns: ${turnCount}`);
logBuffer.push(`Cost: $${totalCost.toFixed(4)}`);
logBuffer.push(`Status: Success`);
logBuffer.push(`Completed: ${new Date().toISOString()}`);
await fs.writeFile(logFilePath, logBuffer.join('\n'));
}
// NOTE: Log writing now handled by AuditSession (crash-safe, append-only)
// Legacy log writing removed - audit system handles this automatically
// Show completion messages based on agent type
if (progressIndicator) {
@@ -327,7 +328,15 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
}
// Return result with log file path for all agents
const returnData = { result, success: true, duration, turns: turnCount, cost: totalCost, apiErrorDetected };
const returnData = {
result,
success: true,
duration,
turns: turnCount,
cost: totalCost,
partialCost, // Include partial cost for crash recovery
apiErrorDetected
};
if (logFilePath) {
returnData.logFile = logFilePath;
}
@@ -344,17 +353,16 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
statusManager.finishStatusLine();
}
// Write error log to file
if (logFilePath && logBuffer.length > 0) {
logBuffer.push(`\n=== Agent Execution Failed ===`);
logBuffer.push(`Duration: ${formatDuration(duration)}`);
logBuffer.push(`Turns: ${turnCount}`);
logBuffer.push(`Error: ${error.message}`);
logBuffer.push(`Error Type: ${error.constructor.name}`);
logBuffer.push(`Status: Failed`);
logBuffer.push(`Failed: ${new Date().toISOString()}`);
await fs.writeFile(logFilePath, logBuffer.join('\n'));
// Log error to audit system
if (auditSession) {
await auditSession.logEvent('error', {
message: error.message,
errorType: error.constructor.name,
stack: error.stack,
duration,
turns: turnCount,
timestamp: new Date().toISOString()
});
}
// Show error messages based on agent type
@@ -420,6 +428,7 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
prompt: fullPrompt.slice(0, 100) + '...',
success: false,
duration,
cost: partialCost, // Include partial cost on error
retryable: isRetryableError(error)
};
}
@@ -432,6 +441,7 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
// - Prompt snapshotting for debugging and reproducibility
// - Git checkpoint/rollback safety for workspace protection
// - Comprehensive error handling and logging
// - Crash-safe audit logging via AuditSession
export async function runClaudePromptWithRetry(prompt, sourceDir, allowedTools = 'Read', context = '', description = 'Claude analysis', agentName = null, colorFn = chalk.cyan, sessionMetadata = null) {
const maxRetries = 3;
let lastError;
@@ -439,22 +449,25 @@ export async function runClaudePromptWithRetry(prompt, sourceDir, allowedTools =
console.log(chalk.cyan(`🚀 Starting ${description} with ${maxRetries} max attempts`));
// Save prompt snapshot before execution starts (for debugging failed runs)
let snapshotSaved = false;
// Initialize audit session (crash-safe logging)
let auditSession = null;
if (sessionMetadata && agentName) {
auditSession = new AuditSession(sessionMetadata);
await auditSession.initialize();
}
for (let attempt = 1; attempt <= maxRetries; attempt++) {
// Create checkpoint before each attempt
await createGitCheckpoint(sourceDir, description, attempt);
// Save snapshot on first attempt only (before any execution)
if (!snapshotSaved && agentName) {
// Start agent tracking in audit system (saves prompt snapshot automatically)
if (auditSession) {
const fullPrompt = retryContext ? `${retryContext}\n\n${prompt}` : prompt;
await savePromptSnapshot(sourceDir, agentName, fullPrompt);
snapshotSaved = true;
await auditSession.startAgent(agentName, fullPrompt, attempt);
}
try {
const result = await runClaudePrompt(prompt, sourceDir, allowedTools, retryContext, description, colorFn, sessionMetadata);
const result = await runClaudePrompt(prompt, sourceDir, allowedTools, retryContext, description, colorFn, sessionMetadata, auditSession, attempt);
// Validate output after successful run
if (result.success) {
@@ -466,6 +479,17 @@ export async function runClaudePromptWithRetry(prompt, sourceDir, allowedTools =
console.log(chalk.yellow(`📋 Validation: Ready for exploitation despite API error warnings`));
}
// Record successful attempt in audit system
if (auditSession) {
await auditSession.endAgent(agentName, {
attemptNumber: attempt,
duration_ms: result.duration,
cost_usd: result.cost || 0,
success: true,
checkpoint: await getGitCommitHash(sourceDir)
});
}
// Commit successful changes (will include the snapshot)
await commitGitSuccess(sourceDir, description);
console.log(chalk.green.bold(`🎉 ${description} completed successfully on attempt ${attempt}/${maxRetries}`));
@@ -474,6 +498,18 @@ export async function runClaudePromptWithRetry(prompt, sourceDir, allowedTools =
// Agent completed but output validation failed
console.log(chalk.yellow(`⚠️ ${description} completed but output validation failed`));
// Record failed validation attempt in audit system
if (auditSession) {
await auditSession.endAgent(agentName, {
attemptNumber: attempt,
duration_ms: result.duration,
cost_usd: result.partialCost || result.cost || 0,
success: false,
error: 'Output validation failed',
isFinalAttempt: attempt === maxRetries
});
}
// If API error detected AND validation failed, this is a retryable error
if (result.apiErrorDetected) {
console.log(chalk.yellow(`⚠️ API Error detected with validation failure - treating as retryable`));
@@ -501,6 +537,18 @@ export async function runClaudePromptWithRetry(prompt, sourceDir, allowedTools =
} catch (error) {
lastError = error;
// Record failed attempt in audit system
if (auditSession) {
await auditSession.endAgent(agentName, {
attemptNumber: attempt,
duration_ms: error.duration || 0,
cost_usd: error.cost || 0,
success: false,
error: error.message,
isFinalAttempt: attempt === maxRetries
});
}
// Check if error is retryable
if (!isRetryableError(error)) {
console.log(chalk.red(`${description} failed with non-retryable error: ${error.message}`));
@@ -533,4 +581,14 @@ export async function runClaudePromptWithRetry(prompt, sourceDir, allowedTools =
}
throw lastError;
}
// Helper function to get git commit hash
async function getGitCommitHash(sourceDir) {
try {
const result = await $`cd ${sourceDir} && git rev-parse HEAD`;
return result.stdout.trim();
} catch (error) {
return null;
}
}
+313
@@ -0,0 +1,313 @@
/**
* Audit Session - Main Facade
*
* Coordinates logger, metrics tracker, and concurrency control for comprehensive
* crash-safe audit logging.
*/
import { AgentLogger } from './logger.js';
import { MetricsTracker } from './metrics-tracker.js';
import { initializeAuditStructure, formatTimestamp } from './utils.js';
/**
* SessionMutex for concurrency control
* (Mirrors the session-manager.js implementation)
*/
class SessionMutex {
constructor() {
this.locks = new Map();
}
async lock(sessionId) {
while (this.locks.has(sessionId)) {
// Wait for the current holder to release; re-check in case another waiter acquired first
await this.locks.get(sessionId);
}
let resolve;
const promise = new Promise(r => resolve = r);
this.locks.set(sessionId, promise);
return () => {
this.locks.delete(sessionId);
resolve();
};
}
}
// Global mutex instance
const sessionMutex = new SessionMutex();
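A self-contained sketch of this mutex pattern (class names mirror the code above; the re-check loop is what keeps multiple concurrent waiters serialized). The simulated `session.json` write is, of course, hypothetical:

```javascript
// Minimal sketch of the SessionMutex pattern used by AuditSession.
class SessionMutex {
  constructor() {
    this.locks = new Map();
  }
  async lock(sessionId) {
    // Re-check after each wake-up so only one waiter acquires at a time
    while (this.locks.has(sessionId)) {
      await this.locks.get(sessionId);
    }
    let release;
    const held = new Promise(r => { release = r; });
    this.locks.set(sessionId, held);
    return () => {
      this.locks.delete(sessionId);
      release();
    };
  }
}

// Two parallel "endAgent" updates to the same session never interleave:
const mutex = new SessionMutex();
const order = [];
async function update(name) {
  const unlock = await mutex.lock('session-1');
  try {
    order.push(`${name}:start`);
    await new Promise(r => setTimeout(r, 10)); // simulate a session.json write
    order.push(`${name}:end`);
  } finally {
    unlock();
  }
}
await Promise.all([update('a'), update('b')]);
```

The `finally { unlock(); }` shape matters: a throw inside the critical section must still release the lock, or every later command on that session would hang.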
/**
* AuditSession - Main audit system facade
*/
export class AuditSession {
/**
* @param {Object} sessionMetadata - Session metadata from Shannon store
* @param {string} sessionMetadata.id - Session UUID
* @param {string} sessionMetadata.webUrl - Target web URL
* @param {string} [sessionMetadata.repoPath] - Target repository path
*/
constructor(sessionMetadata) {
this.sessionMetadata = sessionMetadata;
this.sessionId = sessionMetadata.id;
// Validate required fields
if (!this.sessionId) {
throw new Error('sessionMetadata.id is required');
}
if (!this.sessionMetadata.webUrl) {
throw new Error('sessionMetadata.webUrl is required');
}
// Components
this.metricsTracker = new MetricsTracker(sessionMetadata);
// Active logger (one at a time per agent attempt)
this.currentLogger = null;
// Initialization flag
this.initialized = false;
}
/**
* Initialize audit session (creates directories, session.json)
* Idempotent and race-safe
* @returns {Promise<void>}
*/
async initialize() {
if (this.initialized) {
return; // Already initialized
}
// Create directory structure
await initializeAuditStructure(this.sessionMetadata);
// Initialize metrics tracker (loads or creates session.json)
await this.metricsTracker.initialize();
this.initialized = true;
}
/**
* Ensure initialized (helper for lazy initialization)
* @private
* @returns {Promise<void>}
*/
async ensureInitialized() {
if (!this.initialized) {
await this.initialize();
}
}
/**
* Log session-level failure (pre-agent failures)
* @param {Error} error - Error object
* @param {Object} context - Additional context
* @returns {Promise<void>}
*/
async logSessionFailure(error, context = {}) {
await this.ensureInitialized();
// Update session status
await this.metricsTracker.updateSessionStatus('failed');
// Create a special failure logger
const failureLogger = new AgentLogger(this.sessionMetadata, 'session-failure', 1);
await failureLogger.initialize();
await failureLogger.logError(error, {
...context,
timestamp: formatTimestamp(),
sessionId: this.sessionId
});
await failureLogger.close();
}
/**
* Start agent execution
* @param {string} agentName - Agent name
* @param {string} promptContent - Full prompt content
* @param {number} [attemptNumber=1] - Attempt number
* @returns {Promise<void>}
*/
async startAgent(agentName, promptContent, attemptNumber = 1) {
await this.ensureInitialized();
// Save prompt snapshot (only on first attempt)
if (attemptNumber === 1) {
await AgentLogger.savePrompt(this.sessionMetadata, agentName, promptContent);
}
// Create and initialize logger for this attempt
this.currentLogger = new AgentLogger(this.sessionMetadata, agentName, attemptNumber);
await this.currentLogger.initialize();
// Start metrics tracking
this.metricsTracker.startAgent(agentName, attemptNumber);
// Log start event
await this.currentLogger.logEvent('agent_start', {
agentName,
attemptNumber,
timestamp: formatTimestamp()
});
}
/**
* Log event during agent execution
* @param {string} eventType - Event type (tool_start, tool_end, llm_response, etc.)
* @param {Object} eventData - Event data
* @returns {Promise<void>}
*/
async logEvent(eventType, eventData) {
if (!this.currentLogger) {
throw new Error('No active logger. Call startAgent() first.');
}
await this.currentLogger.logEvent(eventType, eventData);
}
/**
* Log a text message (for compatibility)
* @param {string} message - Message to log
* @returns {Promise<void>}
*/
async logMessage(message) {
if (!this.currentLogger) {
throw new Error('No active logger. Call startAgent() first.');
}
await this.currentLogger.logMessage(message);
}
/**
* End agent execution (mutex-protected)
* @param {string} agentName - Agent name
* @param {Object} result - Execution result
* @param {number} result.attemptNumber - Attempt number
* @param {number} result.duration_ms - Duration in milliseconds
* @param {number} result.cost_usd - Cost in USD
* @param {boolean} result.success - Whether attempt succeeded
* @param {string} [result.error] - Error message (if failed)
* @param {string} [result.checkpoint] - Git checkpoint hash (if succeeded)
* @param {boolean} [result.isFinalAttempt=false] - Whether this is the final attempt
* @returns {Promise<void>}
*/
async endAgent(agentName, result) {
// Log end event
if (this.currentLogger) {
await this.currentLogger.logEvent('agent_end', {
agentName,
success: result.success,
duration_ms: result.duration_ms,
cost_usd: result.cost_usd,
timestamp: formatTimestamp()
});
// Close logger
await this.currentLogger.close();
this.currentLogger = null;
}
// Mutex-protected update to session.json
const unlock = await sessionMutex.lock(this.sessionId);
try {
// Reload metrics (in case of parallel updates)
await this.metricsTracker.reload();
// Update metrics
await this.metricsTracker.endAgent(agentName, result);
} finally {
unlock();
}
}
/**
* Update validation results
* @param {string} agentName - Agent name
* @param {Object} validationData - Validation data
* @returns {Promise<void>}
*/
async updateValidation(agentName, validationData) {
await this.ensureInitialized();
const unlock = await sessionMutex.lock(this.sessionId);
try {
await this.metricsTracker.reload();
await this.metricsTracker.updateValidation(agentName, validationData);
} finally {
unlock();
}
}
/**
* Mark agent as rolled back
* @param {string} agentName - Agent name
* @returns {Promise<void>}
*/
async markRolledBack(agentName) {
await this.ensureInitialized();
const unlock = await sessionMutex.lock(this.sessionId);
try {
await this.metricsTracker.reload();
await this.metricsTracker.markRolledBack(agentName);
} finally {
unlock();
}
}
/**
* Mark multiple agents as rolled back
* @param {string[]} agentNames - Array of agent names
* @returns {Promise<void>}
*/
async markMultipleRolledBack(agentNames) {
await this.ensureInitialized();
const unlock = await sessionMutex.lock(this.sessionId);
try {
await this.metricsTracker.reload();
await this.metricsTracker.markMultipleRolledBack(agentNames);
} finally {
unlock();
}
}
/**
* Update session status
* @param {string} status - New status (in-progress, completed, failed)
* @returns {Promise<void>}
*/
async updateSessionStatus(status) {
await this.ensureInitialized();
const unlock = await sessionMutex.lock(this.sessionId);
try {
await this.metricsTracker.reload();
await this.metricsTracker.updateSessionStatus(status);
} finally {
unlock();
}
}
/**
* Get current metrics (read-only)
* @returns {Promise<Object>} Current metrics
*/
async getMetrics() {
await this.ensureInitialized();
return this.metricsTracker.getMetrics();
}
/**
* Get validation results (read-only)
* @returns {Promise<Object>} Validation results
*/
async getValidation() {
await this.ensureInitialized();
return this.metricsTracker.getValidation();
}
}
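The intended call order for the facade above is initialize → startAgent → logEvent(s) → endAgent. A sketch of that lifecycle, stubbed in-memory so it is self-contained (the stub class, agent name, and metric values are illustrative; the real `AuditSession` writes under `audit-logs/`), including the fail-fast check on the standardized `id` field:

```javascript
// Hypothetical in-memory stand-in demonstrating the AuditSession call sequence.
class FakeAuditSession {
  constructor(meta) {
    // Same fail-fast rule as the real class: 'id' is required, not 'sessionId'
    if (!meta.id) throw new Error('sessionMetadata.id is required');
    this.events = [];
  }
  async initialize() { this.events.push('initialize'); }
  async startAgent(name, prompt, attempt) { this.events.push(`start:${name}:${attempt}`); }
  async logEvent(type) { this.events.push(type); }
  async endAgent(name, result) { this.events.push(`end:${name}:${result.success}`); }
}

const audit = new FakeAuditSession({ id: 'uuid-123', webUrl: 'https://example.com' });
await audit.initialize();
await audit.startAgent('recon', '...prompt...', 1);
await audit.logEvent('tool_start');
await audit.endAgent('recon', { attemptNumber: 1, duration_ms: 1200, cost_usd: 0.04, success: true });
```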
+16
@@ -0,0 +1,16 @@
/**
* Unified Audit & Metrics System
*
* Public API for the audit system. Provides crash-safe, append-only logging
* and comprehensive metrics tracking for Shannon penetration testing sessions.
*
* IMPORTANT: Session objects must have an 'id' field (NOT 'sessionId')
* Example: { id: "uuid", webUrl: "...", repoPath: "..." }
*
* @module audit
*/
export { AuditSession } from './audit-session.js';
export { AgentLogger } from './logger.js';
export { MetricsTracker } from './metrics-tracker.js';
export * as AuditUtils from './utils.js';
+247
@@ -0,0 +1,247 @@
/**
* Append-Only Agent Logger
*
* Provides crash-safe, append-only logging for agent execution.
* Uses file streams with immediate flush to prevent data loss.
*/
import fs from 'fs';
import { generateLogPath, generatePromptPath, atomicWrite, formatTimestamp } from './utils.js';
/**
* AgentLogger - Manages append-only logging for a single agent execution
*/
export class AgentLogger {
/**
* @param {Object} sessionMetadata - Session metadata
* @param {string} agentName - Name of the agent
* @param {number} attemptNumber - Attempt number (1, 2, 3, ...)
*/
constructor(sessionMetadata, agentName, attemptNumber) {
this.sessionMetadata = sessionMetadata;
this.agentName = agentName;
this.attemptNumber = attemptNumber;
this.timestamp = Date.now();
// Generate log file path
this.logPath = generateLogPath(sessionMetadata, agentName, this.timestamp, attemptNumber);
// Create write stream (append mode)
this.stream = null;
this.isOpen = false;
}
/**
* Initialize the log stream (creates file and opens stream)
* @returns {Promise<void>}
*/
async initialize() {
if (this.isOpen) {
return; // Already initialized
}
// Create write stream with append mode and auto-flush
this.stream = fs.createWriteStream(this.logPath, {
flags: 'a', // Append mode
encoding: 'utf8',
autoClose: true
});
this.isOpen = true;
// Write header
await this.writeHeader();
}
/**
* Write header to log file
* @private
* @returns {Promise<void>}
*/
async writeHeader() {
const header = [
`========================================`,
`Agent: ${this.agentName}`,
`Attempt: ${this.attemptNumber}`,
`Started: ${formatTimestamp(this.timestamp)}`,
`Session: ${this.sessionMetadata.id}`,
`Web URL: ${this.sessionMetadata.webUrl}`,
`========================================\n`
].join('\n');
return this.writeRaw(header);
}
/**
* Write raw text to log file with immediate flush
* @private
* @param {string} text - Text to write
* @returns {Promise<void>}
*/
writeRaw(text) {
return new Promise((resolve, reject) => {
if (!this.isOpen || !this.stream) {
reject(new Error('Logger not initialized'));
return;
}
// Write immediately; with an append-mode stream, a crash loses at most the buffered tail
const needsDrain = !this.stream.write(text, 'utf8', (error) => {
if (error) {
reject(error);
}
});
if (needsDrain) {
// Buffer is full, wait for drain
const drainHandler = () => {
this.stream.removeListener('drain', drainHandler);
resolve();
};
this.stream.once('drain', drainHandler);
} else {
// Buffer has space, resolve immediately
resolve();
}
});
}
/**
* Log an event (tool_start, tool_end, llm_response, etc.)
* Events are logged as JSON for parseability
* @param {string} eventType - Type of event
* @param {Object} eventData - Event data
* @returns {Promise<void>}
*/
async logEvent(eventType, eventData) {
const event = {
type: eventType,
timestamp: formatTimestamp(),
data: eventData
};
const eventLine = `${JSON.stringify(event)}\n`;
return this.writeRaw(eventLine);
}
/**
* Log a text message (for compatibility with existing logging)
* @param {string} message - Message to log
* @returns {Promise<void>}
*/
async logMessage(message) {
const timestamp = formatTimestamp();
const line = `[${timestamp}] ${message}\n`;
return this.writeRaw(line);
}
/**
* Log tool start event
* @param {string} toolName - Name of the tool
* @param {Object} [parameters] - Tool parameters
* @returns {Promise<void>}
*/
async logToolStart(toolName, parameters = {}) {
return this.logEvent('tool_start', { toolName, parameters });
}
/**
* Log tool end event
* @param {string} toolName - Name of the tool
* @param {Object} result - Tool result
* @returns {Promise<void>}
*/
async logToolEnd(toolName, result) {
return this.logEvent('tool_end', { toolName, result });
}
/**
* Log LLM response event
* @param {string} content - Response content
* @param {Object} [metadata] - Additional metadata
* @returns {Promise<void>}
*/
async logLLMResponse(content, metadata = {}) {
return this.logEvent('llm_response', { content, ...metadata });
}
/**
* Log validation start event
* @param {string} validationType - Type of validation
* @returns {Promise<void>}
*/
async logValidationStart(validationType) {
return this.logEvent('validation_start', { validationType });
}
/**
* Log validation end event
* @param {string} validationType - Type of validation
* @param {boolean} success - Whether validation passed
* @param {Object} [details] - Validation details
* @returns {Promise<void>}
*/
async logValidationEnd(validationType, success, details = {}) {
return this.logEvent('validation_end', { validationType, success, ...details });
}
/**
* Log error event
* @param {Error} error - Error object
* @param {Object} [context] - Additional context
* @returns {Promise<void>}
*/
async logError(error, context = {}) {
return this.logEvent('error', {
message: error.message,
stack: error.stack,
...context
});
}
/**
* Close the log stream
* @returns {Promise<void>}
*/
async close() {
if (!this.isOpen || !this.stream) {
return;
}
return new Promise((resolve) => {
this.stream.end(() => {
this.isOpen = false;
resolve();
});
});
}
/**
* Save prompt snapshot to prompts directory
* Static method - doesn't require logger instance
* @param {Object} sessionMetadata - Session metadata
* @param {string} agentName - Agent name
* @param {string} promptContent - Full prompt content
* @returns {Promise<void>}
*/
static async savePrompt(sessionMetadata, agentName, promptContent) {
const promptPath = generatePromptPath(sessionMetadata, agentName);
// Create header with metadata
const header = [
`# Prompt Snapshot: ${agentName}`,
``,
`**Session:** ${sessionMetadata.id}`,
`**Web URL:** ${sessionMetadata.webUrl}`,
`**Saved:** ${formatTimestamp()}`,
``,
`---`,
``
].join('\n');
const fullContent = header + promptContent;
// Use atomic write for safety
await atomicWrite(promptPath, fullContent);
}
}
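The crash-safety property of the logger above comes from its append-mode stream plus one JSON object per line: every completed line parses on its own, so a crash mid-run loses at most the last partial line. A runnable sketch of that format (the temp-file path and event payloads are illustrative):

```javascript
// Minimal sketch of the append-only JSON-lines format AgentLogger writes.
import fs from 'fs';
import os from 'os';
import path from 'path';

const logPath = path.join(os.tmpdir(), `audit-demo-${process.pid}-${Date.now()}.log`);
const stream = fs.createWriteStream(logPath, { flags: 'a', encoding: 'utf8' });

function logEvent(type, data) {
  return new Promise((resolve, reject) => {
    // One self-contained JSON object per line
    const line = JSON.stringify({ type, timestamp: new Date().toISOString(), data }) + '\n';
    stream.write(line, 'utf8', err => (err ? reject(err) : resolve()));
  });
}

await logEvent('tool_start', { toolName: 'nmap' });
await logEvent('tool_end', { toolName: 'nmap', result: 'ok' });
await new Promise(resolve => stream.end(resolve));

// Each line parses independently, which is what makes partial logs recoverable
const events = fs.readFileSync(logPath, 'utf8').trim().split('\n').map(JSON.parse);
```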
+331
@@ -0,0 +1,331 @@
/**
* Metrics Tracker
*
* Manages session.json with comprehensive timing, cost, and validation metrics.
* Tracks attempt-level data for complete forensic trail.
*/
import {
generateSessionJsonPath,
atomicWrite,
readJson,
fileExists,
formatTimestamp,
calculatePercentage
} from './utils.js';
/**
* MetricsTracker - Manages metrics for a session
*/
export class MetricsTracker {
/**
* @param {Object} sessionMetadata - Session metadata from Shannon store
*/
constructor(sessionMetadata) {
this.sessionMetadata = sessionMetadata;
this.sessionJsonPath = generateSessionJsonPath(sessionMetadata);
// In-memory state (loaded from/synced to session.json)
this.data = null;
// Active timers (agent name -> start time)
this.activeTimers = new Map();
}
/**
* Initialize session.json (idempotent)
* @returns {Promise<void>}
*/
async initialize() {
// Check if session.json already exists
const exists = await fileExists(this.sessionJsonPath);
if (exists) {
// Load existing data
this.data = await readJson(this.sessionJsonPath);
} else {
// Create new session.json
this.data = this.createInitialData();
await this.save();
}
}
/**
* Create initial session.json structure
* @private
* @returns {Object} Initial session data
*/
createInitialData() {
return {
session: {
id: this.sessionMetadata.id,
webUrl: this.sessionMetadata.webUrl,
repoPath: this.sessionMetadata.repoPath,
status: 'in-progress',
createdAt: formatTimestamp()
},
metrics: {
total_duration_ms: 0,
total_cost_usd: 0,
phases: {}, // Phase-level aggregations: { duration_ms, duration_percentage, cost_usd, agent_count }
agents: {} // Agent-level metrics: { status, attempts[], final_duration_ms, total_cost_usd, checkpoint }
}
};
}
/**
* Start tracking an agent execution
* @param {string} agentName - Agent name
* @param {number} attemptNumber - Attempt number
* @returns {void}
*/
startAgent(agentName, attemptNumber) {
this.activeTimers.set(agentName, {
startTime: Date.now(),
attemptNumber
});
}
/**
* End agent execution and update metrics
* @param {string} agentName - Agent name
* @param {Object} result - Agent execution result
* @param {number} result.attemptNumber - Attempt number
* @param {number} result.duration_ms - Duration in milliseconds
* @param {number} result.cost_usd - Cost in USD
* @param {boolean} result.success - Whether attempt succeeded
* @param {string} [result.error] - Error message (if failed)
* @param {string} [result.checkpoint] - Git checkpoint hash (if succeeded)
* @returns {Promise<void>}
*/
async endAgent(agentName, result) {
// Initialize agent metrics if not exists
if (!this.data.metrics.agents[agentName]) {
this.data.metrics.agents[agentName] = {
status: 'in-progress',
attempts: [],
final_duration_ms: 0,
total_cost_usd: 0 // Total cost across all attempts (including retries)
};
}
const agent = this.data.metrics.agents[agentName];
// Add attempt to array
const attempt = {
attempt_number: result.attemptNumber,
duration_ms: result.duration_ms,
cost_usd: result.cost_usd,
success: result.success,
timestamp: formatTimestamp()
};
if (result.error) {
attempt.error = result.error;
}
agent.attempts.push(attempt);
// Update total cost (includes failed attempts)
agent.total_cost_usd = agent.attempts.reduce((sum, a) => sum + (a.cost_usd || 0), 0);
// If successful, update final metrics and status
if (result.success) {
agent.status = 'success';
agent.final_duration_ms = result.duration_ms;
if (result.checkpoint) {
agent.checkpoint = result.checkpoint;
}
} else {
// If this was the last attempt, mark as failed
if (result.isFinalAttempt) {
agent.status = 'failed';
}
}
// Clear active timer
this.activeTimers.delete(agentName);
// Recalculate aggregations
this.recalculateAggregations();
// Save to disk
await this.save();
}
/**
* Mark agent as rolled back
* @param {string} agentName - Agent name
* @returns {Promise<void>}
*/
async markRolledBack(agentName) {
if (!this.data.metrics.agents[agentName]) {
return; // Agent not tracked
}
const agent = this.data.metrics.agents[agentName];
agent.status = 'rolled-back';
agent.rolled_back_at = formatTimestamp();
// Recalculate aggregations (exclude rolled-back agents)
this.recalculateAggregations();
await this.save();
}
/**
* Mark multiple agents as rolled back
* @param {string[]} agentNames - Array of agent names
* @returns {Promise<void>}
*/
async markMultipleRolledBack(agentNames) {
for (const agentName of agentNames) {
if (this.data.metrics.agents[agentName]) {
const agent = this.data.metrics.agents[agentName];
agent.status = 'rolled-back';
agent.rolled_back_at = formatTimestamp();
}
}
this.recalculateAggregations();
await this.save();
}
/**
* Update session status
* @param {string} status - New status (in-progress, completed, failed)
* @returns {Promise<void>}
*/
async updateSessionStatus(status) {
this.data.session.status = status;
if (status === 'completed' || status === 'failed') {
this.data.session.completedAt = formatTimestamp();
}
await this.save();
}
/**
* Recalculate aggregations (total duration, total cost, phases)
* @private
*/
recalculateAggregations() {
const agents = this.data.metrics.agents;
// Only count successful agents (not rolled-back or failed)
const successfulAgents = Object.entries(agents)
.filter(([_, data]) => data.status === 'success');
// Calculate total duration and cost
const totalDuration = successfulAgents.reduce(
(sum, [_, data]) => sum + data.final_duration_ms,
0
);
const totalCost = successfulAgents.reduce(
(sum, [_, data]) => sum + data.total_cost_usd,
0
);
this.data.metrics.total_duration_ms = totalDuration;
this.data.metrics.total_cost_usd = totalCost;
// Calculate phase-level metrics
this.data.metrics.phases = this.calculatePhaseMetrics(successfulAgents);
}
/**
* Calculate phase-level metrics
* @private
* @param {Array} successfulAgents - Array of [agentName, agentData] tuples
* @returns {Object} Phase metrics
*/
calculatePhaseMetrics(successfulAgents) {
const phases = {
'pre-recon': [],
'recon': [],
'vulnerability-analysis': [],
'exploitation': [],
'reporting': []
};
// Map agents to phases
const agentPhaseMap = {
'pre-recon': 'pre-recon',
'recon': 'recon',
'injection-vuln': 'vulnerability-analysis',
'xss-vuln': 'vulnerability-analysis',
'auth-vuln': 'vulnerability-analysis',
'authz-vuln': 'vulnerability-analysis',
'ssrf-vuln': 'vulnerability-analysis',
'injection-exploit': 'exploitation',
'xss-exploit': 'exploitation',
'auth-exploit': 'exploitation',
'authz-exploit': 'exploitation',
'ssrf-exploit': 'exploitation',
'report': 'reporting'
};
// Group agents by phase
for (const [agentName, agentData] of successfulAgents) {
const phase = agentPhaseMap[agentName];
if (phase) {
phases[phase].push(agentData);
}
}
// Calculate metrics per phase
const phaseMetrics = {};
const totalDuration = this.data.metrics.total_duration_ms;
for (const [phaseName, agentList] of Object.entries(phases)) {
if (agentList.length === 0) continue;
const phaseDuration = agentList.reduce(
(sum, agent) => sum + agent.final_duration_ms,
0
);
const phaseCost = agentList.reduce(
(sum, agent) => sum + agent.total_cost_usd,
0
);
phaseMetrics[phaseName] = {
duration_ms: phaseDuration,
duration_percentage: calculatePercentage(phaseDuration, totalDuration),
cost_usd: phaseCost,
agent_count: agentList.length
};
}
return phaseMetrics;
}
/**
* Get current metrics
* @returns {Object} Current metrics data
*/
getMetrics() {
return JSON.parse(JSON.stringify(this.data));
}
/**
* Save metrics to session.json (atomic write)
* @private
* @returns {Promise<void>}
*/
async save() {
await atomicWrite(this.sessionJsonPath, this.data);
}
/**
* Reload metrics from disk
* @returns {Promise<void>}
*/
async reload() {
this.data = await readJson(this.sessionJsonPath);
}
}
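The key accounting rule in `recalculateAggregations` above: duration counts only the final successful attempt, while cost counts every attempt (retries included), and failed or rolled-back agents are excluded from both totals. A hand-worked sketch with made-up numbers:

```javascript
// Recomputing the session-level aggregations by hand, mirroring the logic above.
const agents = {
  'recon': {
    status: 'success', final_duration_ms: 60000,
    attempts: [{ attempt_number: 1, duration_ms: 60000, cost_usd: 0.12, success: true }],
    total_cost_usd: 0.12
  },
  'xss-vuln': {
    status: 'success', final_duration_ms: 30000,
    attempts: [
      { attempt_number: 1, duration_ms: 25000, cost_usd: 0.05, success: false }, // failed retry still billed
      { attempt_number: 2, duration_ms: 30000, cost_usd: 0.07, success: true }
    ],
    total_cost_usd: 0.12
  },
  'auth-vuln': { status: 'failed', final_duration_ms: 0, attempts: [], total_cost_usd: 0.02 }
};

// Only successful agents contribute to session totals
const successful = Object.values(agents).filter(a => a.status === 'success');
const total_duration_ms = successful.reduce((s, a) => s + a.final_duration_ms, 0);
const total_cost_usd = successful.reduce((s, a) => s + a.total_cost_usd, 0);
```

Here `total_duration_ms` is 90000 (the failed xss attempt's 25s does not count toward duration) while `total_cost_usd` is 0.24 (its $0.05 does count toward cost); auth-vuln's $0.02 is dropped entirely.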
+208
@@ -0,0 +1,208 @@
/**
* Audit System Utilities
*
* Core utility functions for path generation, atomic writes, and formatting.
* All functions are pure and crash-safe.
*/
import fs from 'fs/promises';
import path from 'path';
import { fileURLToPath } from 'url';
const __filename = fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);
// Get Shannon repository root
export const SHANNON_ROOT = path.resolve(__dirname, '..', '..');
export const AUDIT_LOGS_DIR = path.join(SHANNON_ROOT, 'audit-logs');
/**
* Generate standardized session identifier: {hostname}_{sessionId}
* @param {Object} sessionMetadata - Session metadata from Shannon store
* @param {string} sessionMetadata.id - UUID session ID
* @param {string} sessionMetadata.webUrl - Target web URL
* @returns {string} Formatted session identifier
*/
export function generateSessionIdentifier(sessionMetadata) {
const { id, webUrl } = sessionMetadata;
const hostname = new URL(webUrl).hostname.replace(/[^a-zA-Z0-9-]/g, '-');
return `${hostname}_${id}`;
}
/**
* Generate path to audit log directory for a session
* @param {Object} sessionMetadata - Session metadata
* @returns {string} Absolute path to session audit directory
*/
export function generateAuditPath(sessionMetadata) {
const sessionIdentifier = generateSessionIdentifier(sessionMetadata);
return path.join(AUDIT_LOGS_DIR, sessionIdentifier);
}
/**
* Generate path to agent log file
* @param {Object} sessionMetadata - Session metadata
* @param {string} agentName - Name of the agent
* @param {number} timestamp - Timestamp (ms since epoch)
* @param {number} attemptNumber - Attempt number (1, 2, 3, ...)
* @returns {string} Absolute path to agent log file
*/
export function generateLogPath(sessionMetadata, agentName, timestamp, attemptNumber) {
const auditPath = generateAuditPath(sessionMetadata);
const filename = `${timestamp}_${agentName}_attempt-${attemptNumber}.log`;
return path.join(auditPath, 'agents', filename);
}
/**
* Generate path to prompt snapshot file
* @param {Object} sessionMetadata - Session metadata
* @param {string} agentName - Name of the agent
* @returns {string} Absolute path to prompt file
*/
export function generatePromptPath(sessionMetadata, agentName) {
const auditPath = generateAuditPath(sessionMetadata);
return path.join(auditPath, 'prompts', `${agentName}.md`);
}
/**
* Generate path to session.json file
* @param {Object} sessionMetadata - Session metadata
* @returns {string} Absolute path to session.json
*/
export function generateSessionJsonPath(sessionMetadata) {
const auditPath = generateAuditPath(sessionMetadata);
return path.join(auditPath, 'session.json');
}
/**
* Ensure directory exists (idempotent, race-safe)
* @param {string} dirPath - Directory path to create
* @returns {Promise<void>}
*/
export async function ensureDirectory(dirPath) {
try {
await fs.mkdir(dirPath, { recursive: true });
} catch (error) {
// Ignore EEXIST errors (race condition safe)
if (error.code !== 'EEXIST') {
throw error;
}
}
}
/**
* Atomic write using temp file + rename pattern
* Guarantees no partial writes or corruption on crash
* @param {string} filePath - Target file path
* @param {Object|string} data - Data to write (will be JSON.stringified if object)
* @returns {Promise<void>}
*/
export async function atomicWrite(filePath, data) {
const tempPath = `${filePath}.tmp`;
const content = typeof data === 'string' ? data : JSON.stringify(data, null, 2);
try {
// Write to temp file
await fs.writeFile(tempPath, content, 'utf8');
// Atomic rename (POSIX guarantee: atomic on same filesystem)
await fs.rename(tempPath, filePath);
} catch (error) {
// Clean up temp file on failure
try {
await fs.unlink(tempPath);
} catch (cleanupError) {
// Ignore cleanup errors
}
throw error;
}
}
/**
* Format duration in milliseconds to human-readable string
* @param {number} ms - Duration in milliseconds
* @returns {string} Formatted duration (e.g., "2m 34s", "45s", "1.2s")
*/
export function formatDuration(ms) {
if (ms < 1000) {
return `${ms}ms`;
}
const seconds = ms / 1000;
if (seconds < 60) {
return `${seconds.toFixed(1)}s`;
}
const minutes = Math.floor(seconds / 60);
const remainingSeconds = Math.floor(seconds % 60);
return `${minutes}m ${remainingSeconds}s`;
}
/**
* Format cost in USD
* @param {number} usd - Cost in USD
* @returns {string} Formatted cost (e.g., "$0.0823", "$2.14")
*/
export function formatCost(usd) {
return `$${usd.toFixed(4)}`;
}
/**
* Format timestamp to ISO 8601 string
* @param {number} [timestamp] - Unix timestamp in ms (defaults to now)
* @returns {string} ISO 8601 formatted string
*/
export function formatTimestamp(timestamp = Date.now()) {
return new Date(timestamp).toISOString();
}
/**
* Calculate percentage
* @param {number} part - Part value
* @param {number} total - Total value
* @returns {number} Percentage (0-100)
*/
export function calculatePercentage(part, total) {
if (total === 0) return 0;
return (part / total) * 100;
}
/**
* Read and parse JSON file
* @param {string} filePath - Path to JSON file
* @returns {Promise<Object>} Parsed JSON data
*/
export async function readJson(filePath) {
const content = await fs.readFile(filePath, 'utf8');
return JSON.parse(content);
}
/**
* Check if file exists
* @param {string} filePath - Path to check
* @returns {Promise<boolean>} True if file exists
*/
export async function fileExists(filePath) {
try {
await fs.access(filePath);
return true;
} catch {
return false;
}
}
/**
* Initialize audit directory structure for a session
* Creates: audit-logs/{hostname}_{sessionId}/, agents/, prompts/
* @param {Object} sessionMetadata - Session metadata
* @returns {Promise<void>}
*/
export async function initializeAuditStructure(sessionMetadata) {
const auditPath = generateAuditPath(sessionMetadata);
const agentsPath = path.join(auditPath, 'agents');
const promptsPath = path.join(auditPath, 'prompts');
await ensureDirectory(auditPath);
await ensureDirectory(agentsPath);
await ensureDirectory(promptsPath);
}
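The temp-file + rename pattern in `atomicWrite` above is what keeps `session.json` from ever being observed half-written: readers see either the old file or the new one, never a partial write. A self-contained sketch against a throwaway file (the tmpdir path and payload are illustrative):

```javascript
// Sketch of the atomicWrite pattern: write to a sibling temp file, then rename.
import fs from 'fs/promises';
import os from 'os';
import path from 'path';

async function atomicWrite(filePath, data) {
  const tempPath = `${filePath}.tmp`;
  const content = typeof data === 'string' ? data : JSON.stringify(data, null, 2);
  await fs.writeFile(tempPath, content, 'utf8');
  // rename() is atomic on the same filesystem, so the target is never partial
  await fs.rename(tempPath, filePath);
}

const target = path.join(os.tmpdir(), `session-demo-${Date.now()}.json`);
await atomicWrite(target, { session: { id: 'uuid-123', status: 'in-progress' } });
const parsed = JSON.parse(await fs.readFile(target, 'utf8'));
```

Note the temp file must live on the same filesystem as the target (here it is a sibling path), since cross-device renames are not atomic.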
+28 -6
@@ -79,7 +79,7 @@ const rollbackGitToCommit = async (targetRepo, commitHash) => {
export const runSingleAgent = async (agentName, session, pipelineTestingMode, runClaudePromptWithRetry, loadPrompt, allowRerun = false, skipWorkspaceClean = false) => {
// Validate agent first
const agent = validateAgent(agentName);
console.log(chalk.cyan(`\n🤖 Running agent: ${agent.displayName}`));
// Reload session to get latest state (important for agent ranges)
@@ -191,7 +191,7 @@ export const runSingleAgent = async (agentName, session, pipelineTestingMode, ru
AGENTS[agentName].displayName,
agentName, // Pass agent name for snapshot creation
getAgentColor(agentName), // Pass color function for this agent
{ webUrl: session.webUrl, sessionId: session.id } // Session metadata for logging
{ id: session.id, webUrl: session.webUrl, repoPath: session.repoPath } // Session metadata for audit logging
);
if (!result.success) {
@@ -616,13 +616,35 @@ export const rollbackTo = async (targetAgent, session) => {
}
const commitHash = session.checkpoints[targetAgent];
// Rollback git workspace
await rollbackGitToCommit(session.targetRepo, commitHash);
// Update session state
// Update session state (removes agents from completedAgents)
await rollbackToAgent(session.id, targetAgent);
// Mark rolled-back agents in audit system (for forensic trail)
try {
const { AuditSession } = await import('./audit/index.js');
const auditSession = new AuditSession(session);
await auditSession.initialize();
// Find agents that were rolled back (agents after targetAgent)
const targetOrder = AGENTS[targetAgent].order;
const rolledBackAgents = Object.values(AGENTS)
.filter(agent => agent.order > targetOrder)
.map(agent => agent.name);
// Mark them as rolled-back in audit system
if (rolledBackAgents.length > 0) {
await auditSession.markMultipleRolledBack(rolledBackAgents);
console.log(chalk.gray(` Marked ${rolledBackAgents.length} agents as rolled-back in audit logs`));
}
} catch (error) {
// Non-critical: rollback succeeded even if audit update failed
console.log(chalk.yellow(` ⚠️ Failed to update audit logs: ${error.message}`));
}
console.log(chalk.green(`✅ Successfully rolled back to agent '${targetAgent}'`));
};
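The rollback selection above works off agent ordering: every agent ordered after the target is treated as rolled back. A self-contained sketch of that filter (the `AGENTS` entries here are a made-up subset assuming the `{ name, order }` shape used above):

```javascript
// Sketch of the rolled-back agent selection from rollbackTo.
const AGENTS = {
  'pre-recon': { name: 'pre-recon', order: 1 },
  'recon':     { name: 'recon',     order: 2 },
  'report':    { name: 'report',    order: 3 }
};

const targetAgent = 'recon';
const targetOrder = AGENTS[targetAgent].order;

// Everything that ran after the rollback target is considered rolled back
const rolledBack = Object.values(AGENTS)
  .filter(agent => agent.order > targetOrder)
  .map(agent => agent.name);
```

With this subset, rolling back to `recon` marks only `report` as rolled back; the target itself and everything before it keep their audit status.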
+24 -1
@@ -1,7 +1,7 @@
import chalk from 'chalk';
import {
selectSession, deleteSession, deleteAllSessions,
validateAgent, validatePhase
validateAgent, validatePhase, reconcileSession
} from '../session-manager.js';
import {
runPhase, runAll, rollbackTo, rerunAgent, displayStatus, listAgents
@@ -94,6 +94,29 @@ export async function handleDeveloperCommand(command, args, pipelineTestingMode,
process.exit(1);
}
// Self-healing: Reconcile session with audit logs before executing command
// This ensures Shannon store is consistent with audit data, even after crash recovery
try {
const reconcileReport = await reconcileSession(session.id);
if (reconcileReport.promotions.length > 0) {
console.log(chalk.blue(`🔄 Reconciled: Added ${reconcileReport.promotions.length} completed agents from audit logs`));
}
if (reconcileReport.demotions.length > 0) {
console.log(chalk.yellow(`🔄 Reconciled: Removed ${reconcileReport.demotions.length} rolled-back agents`));
}
if (reconcileReport.failures.length > 0) {
console.log(chalk.yellow(`🔄 Reconciled: Marked ${reconcileReport.failures.length} failed agents`));
}
// Reload session after reconciliation to get fresh state
const { getSession } = await import('../session-manager.js');
session = await getSession(session.id);
} catch (error) {
// Reconciliation failure is non-critical, but log a warning
console.log(chalk.yellow(`⚠️ Failed to reconcile session with audit logs: ${error.message}`));
}
switch (command) {
case '--run-phase':
+2 -2
View File
@@ -99,7 +99,7 @@ async function runPreReconWave1(webUrl, sourceDir, variables, config, pipelineTe
AGENTS['pre-recon'].displayName,
'pre-recon', // Agent name for snapshot creation
chalk.cyan,
{ webUrl, sessionId } // Session metadata for logging
{ id: sessionId, webUrl } // Session metadata for audit logging (STANDARD: use 'id' field)
)
);
const [codeAnalysis] = await Promise.all(operations);
@@ -123,7 +123,7 @@ async function runPreReconWave1(webUrl, sourceDir, variables, config, pipelineTe
AGENTS['pre-recon'].displayName,
'pre-recon', // Agent name for snapshot creation
chalk.cyan,
{ webUrl, sessionId } // Session metadata for logging
{ id: sessionId, webUrl } // Session metadata for audit logging (STANDARD: use 'id' field)
)
);
}
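The `id` standardization above exists because JavaScript object shorthand silently uses the variable name as the key; a minimal illustration of the bug:

```javascript
const sessionId = 'abc-123';

// Shorthand: the key is taken from the variable name → wrong field
const wrong = { sessionId };        // { sessionId: 'abc-123' }

// Explicit key: matches the canonical session shape → correct field
const right = { id: sessionId };    // { id: 'abc-123' }
```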
-32
View File
@@ -207,36 +207,4 @@ export async function loadPrompt(promptName, variables, config = null, pipelineT
const promptError = handlePromptError(promptName, error);
throw promptError.error;
}
}
// Save prompt snapshot for successful agent runs only
export async function savePromptSnapshot(sourceDir, agentName, promptContent) {
const snapshotDir = path.join(sourceDir, 'prompt-snapshots');
await fs.ensureDir(snapshotDir);
// Use deterministic naming - one snapshot per agent
const fileName = `${agentName}.md`;
const filePath = path.join(snapshotDir, fileName);
const timestamp = new Date().toISOString();
const snapshotContent = `# Prompt Snapshot: ${agentName}
**Generated:** ${timestamp}
**Agent:** ${agentName}
---
## Full Interpolated Prompt
\`\`\`markdown
${promptContent}
\`\`\`
---
*This snapshot represents the exact prompt that was sent to Claude Code to generate the current deliverables for this agent.*
`;
await fs.writeFile(filePath, snapshotContent);
console.log(chalk.gray(` 📸 Prompt snapshot saved: prompt-snapshots/${fileName}`));
}
+101 -62
View File
@@ -4,13 +4,10 @@ import crypto from 'crypto';
import { PentestError } from './error-handling.js';
// Generate a session-based log folder path
// NEW FORMAT: {hostname}_{sessionId} (no hash, full UUID for consistency with audit system)
export const generateSessionLogPath = (webUrl, sessionId) => {
// Create a hash of the webUrl for uniqueness while keeping it readable
const urlHash = crypto.createHash('md5').update(webUrl).digest('hex').substring(0, 8);
const hostname = new URL(webUrl).hostname.replace(/[^a-zA-Z0-9-]/g, '-');
const shortSessionId = sessionId.substring(0, 8);
const sessionFolderName = `${hostname}_${urlHash}_${shortSessionId}`;
const sessionFolderName = `${hostname}_${sessionId}`;
return path.join(process.cwd(), 'agent-logs', sessionFolderName);
};
@@ -242,6 +239,8 @@ export const createSession = async (webUrl, repoPath, configFile = null, targetR
const sessionId = generateSessionId();
// STANDARD: All sessions use 'id' field (NOT 'sessionId')
// This is the canonical session structure used throughout the codebase
const session = {
id: sessionId,
webUrl,
@@ -452,7 +451,9 @@ export const getNextAgent = (session) => {
};
// Mark agent as completed with checkpoint
export const markAgentCompleted = async (sessionId, agentName, checkpointCommit, timingData = null, costData = null, validationData = null) => {
// NOTE: Timing, cost, and validation data now managed by AuditSession (audit-logs/session.json)
// Shannon store contains ONLY orchestration state (completedAgents, checkpoints)
export const markAgentCompleted = async (sessionId, agentName, checkpointCommit) => {
// Use mutex to prevent race conditions during parallel agent execution
const unlock = await sessionMutex.lock(sessionId);
@@ -473,38 +474,6 @@ export const markAgentCompleted = async (sessionId, agentName, checkpointCommit,
[agentName]: checkpointCommit
}
};
// Update timing data if provided
if (timingData) {
updates.timingBreakdown = {
...session.timingBreakdown,
agents: {
...session.timingBreakdown?.agents,
[agentName]: timingData
}
};
}
// Update cost data if provided
if (costData) {
const existingCost = session.costBreakdown?.total || 0;
updates.costBreakdown = {
total: existingCost + costData,
agents: {
...session.costBreakdown?.agents,
[agentName]: costData
}
};
}
// Update validation data if provided (for vulnerability agents)
if (validationData && agentName.includes('-vuln')) {
updates.validationResults = {
...session.validationResults,
[agentName]: validationData
};
}
// Check if all agents are now completed and update session status
const totalAgents = Object.keys(AGENTS).length;
@@ -656,33 +625,103 @@ export const rollbackToAgent = async (sessionId, targetAgent) => {
Object.entries(session.checkpoints).filter(([agent]) => !agentsToRemove.includes(agent))
)
};
// Clean up timing data for rolled-back agents
if (session.timingBreakdown?.agents) {
const filteredTimingAgents = Object.fromEntries(
Object.entries(session.timingBreakdown.agents).filter(([agent]) => !agentsToRemove.includes(agent))
);
updates.timingBreakdown = {
...session.timingBreakdown,
agents: filteredTimingAgents
};
}
// Clean up cost data for rolled-back agents and recalculate total
if (session.costBreakdown?.agents) {
const filteredCostAgents = Object.fromEntries(
Object.entries(session.costBreakdown.agents).filter(([agent]) => !agentsToRemove.includes(agent))
);
const recalculatedTotal = Object.values(filteredCostAgents).reduce((sum, cost) => sum + cost, 0);
updates.costBreakdown = {
total: recalculatedTotal,
agents: filteredCostAgents
};
}
// NOTE: Timing and cost data now managed in audit-logs/session.json
// Rollbacks are reflected via reconcileSession(), which removes agents marked "rolled-back" in the audit logs
return await updateSession(sessionId, updates);
};
/**
* Reconcile Shannon store with audit logs (self-healing)
*
* This function ensures the Shannon store (.shannon-store.json) is consistent with
* the audit logs (audit-logs/session.json) by syncing agent completion status.
*
* Three-part reconciliation:
* 1. PROMOTIONS: Agents completed in audit → added to Shannon store
* 2. DEMOTIONS: Agents rolled-back in audit → removed from Shannon store
* 3. FAILURES: Agents failed in audit → marked failed in Shannon store
*
* Critical for crash recovery, especially after a crash during a rollback operation.
*
* @param {string} sessionId - Session ID to reconcile
* @returns {Promise<Object>} Reconciliation report with promotions, demotions, and failures
*/
export const reconcileSession = async (sessionId) => {
const { AuditSession } = await import('./audit/index.js');
// Get Shannon store session
const shannonSession = await getSession(sessionId);
if (!shannonSession) {
throw new PentestError(`Session ${sessionId} not found in Shannon store`, 'validation', false);
}
// Get audit session data
const auditSession = new AuditSession(shannonSession);
await auditSession.initialize();
const auditData = await auditSession.getMetrics();
const report = {
promotions: [],
demotions: [],
failures: []
};
// PART 1: PROMOTIONS (Additive)
// Find agents completed in audit but not in Shannon store
const auditCompleted = Object.entries(auditData.metrics.agents)
.filter(([_, agentData]) => agentData.status === 'success')
.map(([agentName]) => agentName);
const missing = auditCompleted.filter(agent => !shannonSession.completedAgents.includes(agent));
for (const agentName of missing) {
const agentData = auditData.metrics.agents[agentName];
const checkpoint = agentData.checkpoint || null;
await markAgentCompleted(sessionId, agentName, checkpoint);
report.promotions.push(agentName);
}
// PART 2: DEMOTIONS (Subtractive) - CRITICAL FOR ROLLBACK RECOVERY
// Find agents rolled-back in audit but still in Shannon store
const auditRolledBack = Object.entries(auditData.metrics.agents)
.filter(([_, agentData]) => agentData.status === 'rolled-back')
.map(([agentName]) => agentName);
const toRemove = shannonSession.completedAgents.filter(agent => auditRolledBack.includes(agent));
if (toRemove.length > 0) {
// Reload session to get fresh state
const freshSession = await getSession(sessionId);
const updates = {
completedAgents: freshSession.completedAgents.filter(agent => !toRemove.includes(agent)),
checkpoints: Object.fromEntries(
Object.entries(freshSession.checkpoints).filter(([agent]) => !toRemove.includes(agent))
)
};
await updateSession(sessionId, updates);
report.demotions.push(...toRemove);
}
// PART 3: FAILURES
// Find agents failed in audit but not marked failed in Shannon store
const auditFailed = Object.entries(auditData.metrics.agents)
.filter(([_, agentData]) => agentData.status === 'failed')
.map(([agentName]) => agentName);
const failedToAdd = auditFailed.filter(agent => !shannonSession.failedAgents.includes(agent));
for (const agentName of failedToAdd) {
await markAgentFailed(sessionId, agentName);
report.failures.push(agentName);
}
return report;
};
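The three-part reconciliation above can be reduced to a minimal in-memory sketch operating on plain objects. The data shapes below are simplified assumptions; the real implementation persists changes through markAgentCompleted, updateSession, and markAgentFailed:

```javascript
// Simplified sketch: reconcile a store object against audit agent statuses.
function reconcile(store, auditAgents) {
  const report = { promotions: [], demotions: [], failures: [] };
  const byStatus = (status) => Object.entries(auditAgents)
    .filter(([, agent]) => agent.status === status)
    .map(([name]) => name);

  // PART 1: PROMOTIONS - completed in audit, missing from the store
  for (const name of byStatus('success')) {
    if (!store.completedAgents.includes(name)) {
      store.completedAgents.push(name);
      report.promotions.push(name);
    }
  }

  // PART 2: DEMOTIONS - rolled-back in audit, still present in the store
  const rolledBack = byStatus('rolled-back');
  report.demotions.push(...store.completedAgents.filter(a => rolledBack.includes(a)));
  store.completedAgents = store.completedAgents.filter(a => !rolledBack.includes(a));

  // PART 3: FAILURES - failed in audit, not yet marked failed in the store
  for (const name of byStatus('failed')) {
    if (!store.failedAgents.includes(name)) {
      store.failedAgents.push(name);
      report.failures.push(name);
    }
  }
  return report;
}
```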
// Delete a specific session by ID
export const deleteSession = async (sessionId) => {
const store = await loadSessions();