mirror of
https://github.com/KeygraphHQ/shannon.git
synced 2026-05-17 14:53:32 +02:00
feat: implement unified audit system v3.0 with crash-safety and self-healing
## Unified Audit System (v3.0)
- Implemented crash-safe, append-only logging to audit-logs/{hostname}_{sessionId}/
- Added session.json with comprehensive metrics (timing, cost, attempts)
- Agent execution logs with turn-by-turn detail
- Prompt snapshots saved to audit-logs/.../prompts/{agent}.md
- SessionMutex prevents race conditions during parallel execution
- Self-healing reconciliation before every CLI command
## Session Metadata Standardization
- Fixed critical bug: standardized on 'id' field (not 'sessionId') throughout codebase
- Updated: shannon.mjs (recon, report), src/phases/pre-recon.js
- Added validation in AuditSession to fail fast on incorrect field usage
- JavaScript shorthand syntax was causing wrong field names
## Schema Improvements
- session.json: Added cost_usd per phase, removed redundant final_cost_usd
- Renamed 'percentage' -> 'duration_percentage' for clarity
- Simplified agent metrics to single total_cost_usd field
- Removed unused validation object from schema
## Legacy System Removal
- Removed savePromptSnapshot() - prompts now only saved by audit system
- Removed target repo pollution (prompt-snapshots/ no longer created)
- Single source of truth: audit-logs/{hostname}_{sessionId}/prompts/
## Export Script Simplification
- Removed JSON export mode (session.json already exists)
- CSV-only export with clean columns: agent, phase, status, attempts, duration_ms, cost_usd
- Tested on real session data
## Documentation
- Updated CLAUDE.md with audit system architecture
- Added .gitignore entry for audit-logs/
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
+2
-1
@@ -1,3 +1,4 @@
|
||||
node_modules/
|
||||
.shannon-store.json
|
||||
agent-logs/
|
||||
agent-logs/
|
||||
audit-logs/
|
||||
@@ -188,17 +188,47 @@ The agent implements a sophisticated checkpoint system using git:
|
||||
- Every agent creates a git checkpoint before execution
|
||||
- Rollback to any previous agent state using `--rollback-to` or `--rerun`
|
||||
- Failed agents don't affect completed work
|
||||
- Timing and cost data cleaned up during rollbacks
|
||||
- Rolled-back agents marked in audit system with status: "rolled-back"
|
||||
- Reconciliation automatically syncs Shannon store with audit logs after rollback
|
||||
- Fail-fast safety prevents accidental re-execution of completed agents
|
||||
|
||||
### Timing & Performance Monitoring
|
||||
The agent includes comprehensive timing instrumentation that tracks:
|
||||
- Total execution time
|
||||
- Phase-level timing breakdown
|
||||
- Individual command execution times
|
||||
- Claude Code agent processing times
|
||||
- Cost tracking for AI agent usage
|
||||
### Unified Audit & Metrics System
|
||||
The agent implements a crash-safe, self-healing audit system (v3.0) with the following guarantees:
|
||||
|
||||
**Architecture:**
|
||||
- **audit-logs/**: Centralized metrics and forensic logs (source of truth)
|
||||
- `{hostname}_{sessionId}/session.json` - Comprehensive metrics with attempt-level detail
|
||||
- `{hostname}_{sessionId}/prompts/` - Exact prompts used for reproducibility
|
||||
- `{hostname}_{sessionId}/agents/` - Turn-by-turn execution logs
|
||||
- **.shannon-store.json**: Minimal orchestration state (completedAgents, checkpoints)
|
||||
|
||||
**Crash Safety:**
|
||||
- Append-only logging with immediate flush (survives kill -9)
|
||||
- Atomic writes for session.json (no partial writes)
|
||||
- Event-based logging (tool_start, tool_end, llm_response) closes data loss windows
|
||||
|
||||
**Self-Healing:**
|
||||
- Automatic reconciliation before every CLI command
|
||||
- Recovers from crashes during rollback
|
||||
- Audit logs are source of truth; Shannon store follows
|
||||
|
||||
**Forensic Completeness:**
|
||||
- All retry attempts logged with errors, costs, durations
|
||||
- Rolled-back agents preserved with status: "rolled-back"
|
||||
- Partial cost capture for failed attempts
|
||||
- Complete event trail for debugging
|
||||
|
||||
**Concurrency Safety:**
|
||||
- SessionMutex prevents race conditions during parallel agent execution
|
||||
- Safe parallel execution of vulnerability and exploitation phases
|
||||
|
||||
**Metrics & Reporting:**
|
||||
- Export metrics with `./scripts/export-metrics.js`
|
||||
- Manual reconciliation (diagnostics) with `./scripts/reconcile-session.js`
|
||||
- Phase-level and agent-level timing/cost aggregations
|
||||
- Validation results integrated with metrics
|
||||
|
||||
For detailed design, see `docs/unified-audit-system-design.md`.
|
||||
|
||||
## Development Notes
|
||||
|
||||
@@ -232,34 +262,56 @@ The tool should only be used on systems you own or have explicit permission to t
|
||||
## File Structure
|
||||
|
||||
```
|
||||
shannon.mjs # Main orchestration script
|
||||
package.json # Node.js dependencies
|
||||
src/ # Core modules
|
||||
├── config-parser.js # Configuration handling
|
||||
├── error-handling.js # Error management
|
||||
├── tool-checker.js # Tool validation
|
||||
├── session-manager.js # Session state management
|
||||
├── checkpoint-manager.js # Git-based checkpointing
|
||||
├── queue-validation.js # Deliverable validation
|
||||
shannon.mjs # Main orchestration script
|
||||
package.json # Node.js dependencies
|
||||
.shannon-store.json # Orchestration state (minimal)
|
||||
src/ # Core modules
|
||||
├── audit/ # Unified audit system (v3.0)
|
||||
│ ├── index.js # Public API
|
||||
│ ├── audit-session.js # Main facade (logger + metrics + mutex)
|
||||
│ ├── logger.js # Append-only crash-safe logging
|
||||
│ ├── metrics-tracker.js # Timing, cost, attempt tracking
|
||||
│ └── utils.js # Path generation, atomic writes
|
||||
├── config-parser.js # Configuration handling
|
||||
├── error-handling.js # Error management
|
||||
├── tool-checker.js # Tool validation
|
||||
├── session-manager.js # Session state + reconciliation
|
||||
├── checkpoint-manager.js # Git-based checkpointing + rollback
|
||||
├── queue-validation.js # Deliverable validation
|
||||
├── ai/
|
||||
│ └── claude-executor.js # Claude Code SDK integration
|
||||
└── utils/
|
||||
configs/ # Configuration files
|
||||
├── config-schema.json # JSON Schema validation
|
||||
├── example-config.yaml # Template configuration
|
||||
├── juice-shop-config.yaml # Juice Shop example
|
||||
├── keygraph-config.yaml # Keygraph configuration
|
||||
├── chatwoot-config.yaml # Chatwoot configuration
|
||||
├── metabase-config.yaml # Metabase configuration
|
||||
└── cal-com-config.yaml # Cal.com configuration
|
||||
prompts/ # AI prompt templates
|
||||
├── pre-recon-code.txt # Code analysis
|
||||
├── recon.txt # Reconnaissance
|
||||
├── vuln-*.txt # Vulnerability assessment
|
||||
├── exploit-*.txt # Exploitation
|
||||
└── report-executive.txt # Executive reporting
|
||||
login_resources/ # Authentication utilities
|
||||
├── generate-totp.mjs # TOTP generation
|
||||
└── login_instructions.txt # Login documentation
|
||||
deliverables/ # Output directory
|
||||
audit-logs/ # Centralized audit data (v3.0)
|
||||
└── {hostname}_{sessionId}/
|
||||
├── session.json # Comprehensive metrics
|
||||
├── prompts/ # Prompt snapshots
|
||||
│ └── {agent}.md
|
||||
└── agents/ # Agent execution logs
|
||||
└── {timestamp}_{agent}_attempt-{N}.log
|
||||
configs/ # Configuration files
|
||||
├── config-schema.json # JSON Schema validation
|
||||
├── example-config.yaml # Template configuration
|
||||
├── juice-shop-config.yaml # Juice Shop example
|
||||
├── keygraph-config.yaml # Keygraph configuration
|
||||
├── chatwoot-config.yaml # Chatwoot configuration
|
||||
├── metabase-config.yaml # Metabase configuration
|
||||
└── cal-com-config.yaml # Cal.com configuration
|
||||
prompts/ # AI prompt templates
|
||||
├── pre-recon-code.txt # Code analysis
|
||||
├── recon.txt # Reconnaissance
|
||||
├── vuln-*.txt # Vulnerability assessment
|
||||
├── exploit-*.txt # Exploitation
|
||||
└── report-executive.txt # Executive reporting
|
||||
login_resources/ # Authentication utilities
|
||||
├── generate-totp.mjs # TOTP generation
|
||||
└── login_instructions.txt # Login documentation
|
||||
scripts/ # Utility scripts
|
||||
├── reconcile-session.js # Manual reconciliation (diagnostics)
|
||||
└── export-metrics.js # Export metrics to CSV/JSON
|
||||
deliverables/ # Output directory (in target repo)
|
||||
docs/ # Documentation
|
||||
├── unified-audit-system-design.md
|
||||
└── migration-guide.md
|
||||
```
|
||||
|
||||
## Troubleshooting
|
||||
@@ -275,4 +327,18 @@ deliverables/ # Output directory
|
||||
Missing tools can be skipped using `--pipeline-testing` mode during development:
|
||||
- `nmap` - Network scanning
|
||||
- `subfinder` - Subdomain discovery
|
||||
- `whatweb` - Web technology detection
|
||||
- `whatweb` - Web technology detection
|
||||
|
||||
### Diagnostic & Utility Scripts
|
||||
```bash
|
||||
# Manual reconciliation (for diagnostics only)
|
||||
./scripts/reconcile-session.js --session-id <id> --dry-run --verbose
|
||||
|
||||
# Export metrics to CSV/JSON
|
||||
./scripts/export-metrics.js --session-id <id> --format csv --output metrics.csv
|
||||
|
||||
# System-wide consistency audit
|
||||
./scripts/reconcile-session.js --all-sessions --dry-run
|
||||
```
|
||||
|
||||
Note: Manual reconciliation should rarely be needed. Frequent use indicates bugs in automatic reconciliation.
|
||||
|
||||
@@ -1,7 +1,6 @@
|
||||
Use the save_deliverable script to create your deliverable:
|
||||
Run this command and do nothing else:
|
||||
|
||||
```bash
|
||||
node save_deliverable.js CODE_ANALYSIS 'Pre-recon analysis complete'
|
||||
```
|
||||
|
||||
This will automatically create `deliverables/code_analysis_deliverable.md` with the correct filename.
|
||||
Then say "Done".
|
||||
@@ -1,7 +1,6 @@
|
||||
Use the save_deliverable script to create your deliverable:
|
||||
Run this command and do nothing else:
|
||||
|
||||
```bash
|
||||
node save_deliverable.js RECON 'Reconnaissance analysis complete'
|
||||
```
|
||||
|
||||
This will automatically create `deliverables/recon_deliverable.md` with the correct filename.
|
||||
Then say "Done".
|
||||
Executable
+150
@@ -0,0 +1,150 @@
|
||||
#!/usr/bin/env node
|
||||
|
||||
/**
|
||||
* Export Metrics Script
|
||||
*
|
||||
* Export session metrics from audit logs to CSV format for analysis.
|
||||
*
|
||||
* Use Cases:
|
||||
* - Performance analysis across sessions
|
||||
* - Cost tracking and budgeting
|
||||
* - Agent success rate analysis
|
||||
* - Benchmarking improvements
|
||||
*/
|
||||
|
||||
import chalk from 'chalk';
|
||||
import { fs, path } from 'zx';
|
||||
import { getSession } from '../src/session-manager.js';
|
||||
import { AuditSession } from '../src/audit/index.js';
|
||||
|
||||
// Parse command-line arguments
|
||||
function parseArgs() {
|
||||
const args = {
|
||||
sessionId: null,
|
||||
output: null
|
||||
};
|
||||
|
||||
for (let i = 2; i < process.argv.length; i++) {
|
||||
const arg = process.argv[i];
|
||||
|
||||
if (arg === '--session-id' && process.argv[i + 1]) {
|
||||
args.sessionId = process.argv[i + 1];
|
||||
i++;
|
||||
} else if (arg === '--output' && process.argv[i + 1]) {
|
||||
args.output = process.argv[i + 1];
|
||||
i++;
|
||||
} else if (arg === '--help' || arg === '-h') {
|
||||
printUsage();
|
||||
process.exit(0);
|
||||
} else {
|
||||
console.log(chalk.red(`❌ Unknown argument: ${arg}`));
|
||||
printUsage();
|
||||
process.exit(1);
|
||||
}
|
||||
}
|
||||
|
||||
return args;
|
||||
}
|
||||
|
||||
function printUsage() {
|
||||
console.log(chalk.cyan('\n📊 Export Metrics to CSV'));
|
||||
console.log(chalk.gray('\nUsage: ./scripts/export-metrics.js [options]\n'));
|
||||
console.log(chalk.white('Options:'));
|
||||
console.log(chalk.gray(' --session-id <id> Session ID to export (required)'));
|
||||
console.log(chalk.gray(' --output <file> Output CSV file path (default: stdout)'));
|
||||
console.log(chalk.gray(' --help, -h Show this help\n'));
|
||||
console.log(chalk.white('Examples:'));
|
||||
console.log(chalk.gray(' # Export to stdout'));
|
||||
console.log(chalk.gray(' ./scripts/export-metrics.js --session-id abc123\n'));
|
||||
console.log(chalk.gray(' # Export to file'));
|
||||
console.log(chalk.gray(' ./scripts/export-metrics.js --session-id abc123 --output metrics.csv\n'));
|
||||
}
|
||||
|
||||
// Export metrics for a session
|
||||
async function exportMetrics(sessionId) {
|
||||
const session = await getSession(sessionId);
|
||||
if (!session) {
|
||||
throw new Error(`Session ${sessionId} not found`);
|
||||
}
|
||||
|
||||
const auditSession = new AuditSession(session);
|
||||
await auditSession.initialize();
|
||||
const metrics = await auditSession.getMetrics();
|
||||
|
||||
return exportAsCSV(session, metrics);
|
||||
}
|
||||
|
||||
// Export as CSV
|
||||
function exportAsCSV(session, metrics) {
|
||||
const lines = [];
|
||||
|
||||
// Header
|
||||
lines.push('agent,phase,status,attempts,duration_ms,cost_usd');
|
||||
|
||||
// Phase mapping
|
||||
const phaseMap = {
|
||||
'pre-recon': 'pre-recon',
|
||||
'recon': 'recon',
|
||||
'injection-vuln': 'vulnerability-analysis',
|
||||
'xss-vuln': 'vulnerability-analysis',
|
||||
'auth-vuln': 'vulnerability-analysis',
|
||||
'authz-vuln': 'vulnerability-analysis',
|
||||
'ssrf-vuln': 'vulnerability-analysis',
|
||||
'injection-exploit': 'exploitation',
|
||||
'xss-exploit': 'exploitation',
|
||||
'auth-exploit': 'exploitation',
|
||||
'authz-exploit': 'exploitation',
|
||||
'ssrf-exploit': 'exploitation',
|
||||
'report': 'reporting'
|
||||
};
|
||||
|
||||
// Agent rows
|
||||
for (const [agentName, agentData] of Object.entries(metrics.metrics.agents)) {
|
||||
const phase = phaseMap[agentName] || 'unknown';
|
||||
|
||||
lines.push([
|
||||
agentName,
|
||||
phase,
|
||||
agentData.status,
|
||||
agentData.attempts.length,
|
||||
agentData.final_duration_ms,
|
||||
agentData.total_cost_usd.toFixed(4)
|
||||
].join(','));
|
||||
}
|
||||
|
||||
return lines.join('\n');
|
||||
}
|
||||
|
||||
// Main execution
|
||||
async function main() {
|
||||
const args = parseArgs();
|
||||
|
||||
if (!args.sessionId) {
|
||||
console.log(chalk.red('❌ Must specify --session-id'));
|
||||
printUsage();
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
console.log(chalk.cyan.bold('\n📊 Exporting Metrics to CSV\n'));
|
||||
console.log(chalk.gray(`Session ID: ${args.sessionId}\n`));
|
||||
|
||||
const output = await exportMetrics(args.sessionId);
|
||||
|
||||
if (args.output) {
|
||||
await fs.writeFile(args.output, output);
|
||||
console.log(chalk.green(`✅ Exported to: ${args.output}`));
|
||||
} else {
|
||||
console.log(chalk.cyan('CSV Output:\n'));
|
||||
console.log(output);
|
||||
}
|
||||
|
||||
console.log();
|
||||
}
|
||||
|
||||
main().catch(error => {
|
||||
console.log(chalk.red.bold(`\n🚨 Fatal error: ${error.message}`));
|
||||
if (process.env.DEBUG) {
|
||||
console.log(chalk.gray(error.stack));
|
||||
}
|
||||
process.exit(1);
|
||||
});
|
||||
@@ -0,0 +1,225 @@
|
||||
#!/usr/bin/env node
|
||||
|
||||
/**
|
||||
* Manual Session Reconciliation Script
|
||||
*
|
||||
* Purpose: Diagnostics and exceptional recovery (NOT normal operations).
|
||||
*
|
||||
* Use Cases:
|
||||
* 1. Diagnostics (Primary): Non-destructively report inconsistencies
|
||||
* 2. Debugging: Test reconciliation logic in isolation
|
||||
* 3. Exceptional Recovery: Malformed JSON recovery, reconciliation bugs
|
||||
* 4. Bulk Operations: System-wide consistency audit
|
||||
*
|
||||
* Design Principle:
|
||||
* "Self-healing is the norm. Manual intervention is the exception."
|
||||
*
|
||||
* Red Flags (indicate bugs):
|
||||
* - Manual script needed frequently
|
||||
* - Automatic reconciliation failing consistently
|
||||
* - Manual intervention after every crash
|
||||
*/
|
||||
|
||||
import chalk from 'chalk';
|
||||
import { fs, path } from 'zx';
|
||||
import { reconcileSession, getSession } from '../src/session-manager.js';
|
||||
|
||||
const STORE_FILE = path.join(process.cwd(), '.shannon-store.json');
|
||||
|
||||
// Parse command-line arguments
|
||||
function parseArgs() {
|
||||
const args = {
|
||||
sessionId: null,
|
||||
allSessions: false,
|
||||
dryRun: false,
|
||||
verbose: false
|
||||
};
|
||||
|
||||
for (let i = 2; i < process.argv.length; i++) {
|
||||
const arg = process.argv[i];
|
||||
|
||||
if (arg === '--session-id' && process.argv[i + 1]) {
|
||||
args.sessionId = process.argv[i + 1];
|
||||
i++;
|
||||
} else if (arg === '--all-sessions') {
|
||||
args.allSessions = true;
|
||||
} else if (arg === '--dry-run') {
|
||||
args.dryRun = true;
|
||||
} else if (arg === '--verbose') {
|
||||
args.verbose = true;
|
||||
} else if (arg === '--help' || arg === '-h') {
|
||||
printUsage();
|
||||
process.exit(0);
|
||||
} else {
|
||||
console.log(chalk.red(`❌ Unknown argument: ${arg}`));
|
||||
printUsage();
|
||||
process.exit(1);
|
||||
}
|
||||
}
|
||||
|
||||
return args;
|
||||
}
|
||||
|
||||
function printUsage() {
|
||||
console.log(chalk.cyan('\n📋 Manual Session Reconciliation Script'));
|
||||
console.log(chalk.gray('\nUsage: ./scripts/reconcile-session.js [options]\n'));
|
||||
console.log(chalk.white('Options:'));
|
||||
console.log(chalk.gray(' --session-id <id> Reconcile specific session'));
|
||||
console.log(chalk.gray(' --all-sessions Reconcile all sessions'));
|
||||
console.log(chalk.gray(' --dry-run Report inconsistencies without fixing'));
|
||||
console.log(chalk.gray(' --verbose Detailed logging'));
|
||||
console.log(chalk.gray(' --help, -h Show this help\n'));
|
||||
console.log(chalk.white('Examples:'));
|
||||
console.log(chalk.gray(' # Diagnostics (primary use case)'));
|
||||
console.log(chalk.gray(' ./scripts/reconcile-session.js --session-id abc123 --dry-run\n'));
|
||||
console.log(chalk.gray(' # System-wide consistency audit'));
|
||||
console.log(chalk.gray(' ./scripts/reconcile-session.js --all-sessions --dry-run --verbose\n'));
|
||||
console.log(chalk.gray(' # Exceptional recovery'));
|
||||
console.log(chalk.gray(' ./scripts/reconcile-session.js --session-id abc123\n'));
|
||||
}
|
||||
|
||||
// Load all sessions
|
||||
async function loadAllSessions() {
|
||||
try {
|
||||
if (!await fs.pathExists(STORE_FILE)) {
|
||||
return [];
|
||||
}
|
||||
|
||||
const content = await fs.readFile(STORE_FILE, 'utf8');
|
||||
const store = JSON.parse(content);
|
||||
return Object.values(store.sessions || {});
|
||||
} catch (error) {
|
||||
throw new Error(`Failed to load sessions: ${error.message}`);
|
||||
}
|
||||
}
|
||||
|
||||
// Reconcile a single session
|
||||
async function reconcileSingleSession(sessionId, dryRun, verbose) {
|
||||
try {
|
||||
const session = await getSession(sessionId);
|
||||
if (!session) {
|
||||
console.log(chalk.red(`❌ Session ${sessionId} not found`));
|
||||
return { success: false, sessionId };
|
||||
}
|
||||
|
||||
if (verbose) {
|
||||
console.log(chalk.blue(`\n🔍 Analyzing session: ${sessionId}`));
|
||||
console.log(chalk.gray(` Web URL: ${session.webUrl}`));
|
||||
console.log(chalk.gray(` Status: ${session.status}`));
|
||||
console.log(chalk.gray(` Completed Agents: ${session.completedAgents.length}`));
|
||||
}
|
||||
|
||||
if (dryRun) {
|
||||
console.log(chalk.yellow(` [DRY RUN] Would reconcile session ${sessionId.substring(0, 8)}...`));
|
||||
return { success: true, sessionId, dryRun: true };
|
||||
}
|
||||
|
||||
// Perform actual reconciliation
|
||||
const report = await reconcileSession(sessionId);
|
||||
|
||||
const hasChanges = report.promotions.length > 0 ||
|
||||
report.demotions.length > 0 ||
|
||||
report.failures.length > 0;
|
||||
|
||||
if (hasChanges) {
|
||||
console.log(chalk.green(`✅ Reconciled session ${sessionId.substring(0, 8)}...`));
|
||||
|
||||
if (report.promotions.length > 0) {
|
||||
console.log(chalk.blue(` ➕ Added ${report.promotions.length} completed agents: ${report.promotions.join(', ')}`));
|
||||
}
|
||||
if (report.demotions.length > 0) {
|
||||
console.log(chalk.yellow(` ➖ Removed ${report.demotions.length} rolled-back agents: ${report.demotions.join(', ')}`));
|
||||
}
|
||||
if (report.failures.length > 0) {
|
||||
console.log(chalk.red(` ❌ Marked ${report.failures.length} failed agents: ${report.failures.join(', ')}`));
|
||||
}
|
||||
} else {
|
||||
if (verbose) {
|
||||
console.log(chalk.gray(` ✓ No inconsistencies found`));
|
||||
}
|
||||
}
|
||||
|
||||
return { success: true, sessionId, ...report };
|
||||
} catch (error) {
|
||||
console.log(chalk.red(`❌ Failed to reconcile session ${sessionId}: ${error.message}`));
|
||||
return { success: false, sessionId, error: error.message };
|
||||
}
|
||||
}
|
||||
|
||||
// Main execution
|
||||
async function main() {
|
||||
const args = parseArgs();
|
||||
|
||||
console.log(chalk.cyan.bold('\n🔄 Manual Session Reconciliation\n'));
|
||||
|
||||
if (args.dryRun) {
|
||||
console.log(chalk.yellow('⚠️ DRY RUN MODE - No changes will be made\n'));
|
||||
}
|
||||
|
||||
let sessions = [];
|
||||
|
||||
if (args.sessionId) {
|
||||
sessions = [{ id: args.sessionId }];
|
||||
} else if (args.allSessions) {
|
||||
sessions = await loadAllSessions();
|
||||
console.log(chalk.blue(`Found ${sessions.length} sessions\n`));
|
||||
} else {
|
||||
console.log(chalk.red('❌ Must specify either --session-id or --all-sessions'));
|
||||
printUsage();
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
const results = {
|
||||
total: sessions.length,
|
||||
success: 0,
|
||||
failed: 0,
|
||||
totalPromotions: 0,
|
||||
totalDemotions: 0,
|
||||
totalFailures: 0
|
||||
};
|
||||
|
||||
for (const session of sessions) {
|
||||
const result = await reconcileSingleSession(session.id, args.dryRun, args.verbose);
|
||||
|
||||
if (result.success) {
|
||||
results.success++;
|
||||
results.totalPromotions += result.promotions?.length || 0;
|
||||
results.totalDemotions += result.demotions?.length || 0;
|
||||
results.totalFailures += result.failures?.length || 0;
|
||||
} else {
|
||||
results.failed++;
|
||||
}
|
||||
}
|
||||
|
||||
// Summary
|
||||
console.log(chalk.cyan.bold('\n📊 Summary:'));
|
||||
console.log(chalk.gray(`Total sessions: ${results.total}`));
|
||||
console.log(chalk.green(`Successful: ${results.success}`));
|
||||
if (results.failed > 0) {
|
||||
console.log(chalk.red(`Failed: ${results.failed}`));
|
||||
}
|
||||
console.log(chalk.blue(`Promotions: ${results.totalPromotions}`));
|
||||
console.log(chalk.yellow(`Demotions: ${results.totalDemotions}`));
|
||||
console.log(chalk.red(`Failures: ${results.totalFailures}`));
|
||||
|
||||
// Health check
|
||||
if (args.allSessions) {
|
||||
const consistencyRate = (results.success / results.total) * 100;
|
||||
console.log(chalk.cyan(`\n📈 Consistency Rate: ${consistencyRate.toFixed(1)}%`));
|
||||
|
||||
if (consistencyRate < 98) {
|
||||
console.log(chalk.red('\n⚠️ WARNING: Low consistency rate detected!'));
|
||||
console.log(chalk.red('This may indicate bugs in automatic reconciliation.'));
|
||||
}
|
||||
}
|
||||
|
||||
console.log();
|
||||
}
|
||||
|
||||
main().catch(error => {
|
||||
console.log(chalk.red.bold(`\n🚨 Fatal error: ${error.message}`));
|
||||
if (process.env.DEBUG) {
|
||||
console.log(chalk.gray(error.stack));
|
||||
}
|
||||
process.exit(1);
|
||||
});
|
||||
+2
-2
@@ -233,7 +233,7 @@ async function main(webUrl, repoPath, configPath = null, pipelineTestingMode = f
|
||||
AGENTS['recon'].displayName,
|
||||
'recon', // Agent name for snapshot creation
|
||||
chalk.cyan,
|
||||
{ webUrl, sessionId: session.id } // Session metadata for logging
|
||||
{ id: session.id, webUrl } // Session metadata for audit logging (STANDARD: use 'id' field)
|
||||
);
|
||||
const reconDuration = reconTimer.stop();
|
||||
timingResults.phases['recon'] = reconDuration;
|
||||
@@ -309,7 +309,7 @@ async function main(webUrl, repoPath, configPath = null, pipelineTestingMode = f
|
||||
'Executive Summary and Report Cleanup',
|
||||
'report', // Agent name for snapshot creation
|
||||
chalk.cyan,
|
||||
{ webUrl, sessionId: session.id } // Session metadata for logging
|
||||
{ id: session.id, webUrl } // Session metadata for audit logging (STANDARD: use 'id' field)
|
||||
);
|
||||
|
||||
const reportDuration = reportTimer.stop();
|
||||
|
||||
+116
-58
@@ -6,10 +6,10 @@ import { isRetryableError, getRetryDelay, PentestError } from '../error-handling
|
||||
import { ProgressIndicator } from '../progress-indicator.js';
|
||||
import { timingResults, costResults, Timer, formatDuration } from '../utils/metrics.js';
|
||||
import { createGitCheckpoint, commitGitSuccess, rollbackGitWorkspace } from '../utils/git-manager.js';
|
||||
import { savePromptSnapshot } from '../prompts/prompt-manager.js';
|
||||
import { AGENT_VALIDATORS } from '../constants.js';
|
||||
import { filterJsonToolCalls, getAgentPrefix } from '../utils/output-formatter.js';
|
||||
import { generateSessionLogPath } from '../session-manager.js';
|
||||
import { AuditSession } from '../audit/index.js';
|
||||
|
||||
// Simplified validation using direct agent name mapping
|
||||
async function validateAgentOutput(result, agentName, sourceDir) {
|
||||
@@ -57,10 +57,11 @@ async function validateAgentOutput(result, agentName, sourceDir) {
|
||||
// - Output validation
|
||||
// - Prompt snapshotting for debugging
|
||||
// - Git checkpoint/rollback safety
|
||||
async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context = '', description = 'Claude analysis', colorFn = chalk.cyan, sessionMetadata = null) {
|
||||
async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context = '', description = 'Claude analysis', colorFn = chalk.cyan, sessionMetadata = null, auditSession = null, attemptNumber = 1) {
|
||||
const timer = new Timer(`agent-${description.toLowerCase().replace(/\s+/g, '-')}`);
|
||||
const fullPrompt = context ? `${context}\n\n${prompt}` : prompt;
|
||||
let totalCost = 0;
|
||||
let partialCost = 0; // Track partial cost for crash safety
|
||||
|
||||
// Auto-detect execution mode to adjust logging behavior
|
||||
const isParallelExecution = description.includes('vuln agent') || description.includes('exploit agent');
|
||||
@@ -82,28 +83,14 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
|
||||
progressIndicator = new ProgressIndicator(`Running ${agentType}...`);
|
||||
}
|
||||
|
||||
// Setup detailed logging for all agents (if session metadata is available)
|
||||
// NOTE: Logging now handled by AuditSession (append-only, crash-safe)
|
||||
// Legacy log path generation kept for compatibility
|
||||
let logFilePath = null;
|
||||
let logBuffer = [];
|
||||
|
||||
if (sessionMetadata && sessionMetadata.webUrl && sessionMetadata.sessionId) {
|
||||
if (sessionMetadata && sessionMetadata.webUrl && sessionMetadata.id) {
|
||||
const timestamp = new Date().toISOString().replace(/T/, '_').replace(/[:.]/g, '-').slice(0, 19);
|
||||
const agentName = description.toLowerCase().replace(/\s+/g, '-');
|
||||
|
||||
// Use session-based folder structure
|
||||
const logDir = generateSessionLogPath(sessionMetadata.webUrl, sessionMetadata.sessionId);
|
||||
|
||||
await fs.ensureDir(logDir);
|
||||
logFilePath = path.join(logDir, `${timestamp}_${agentName}_attempt-1.log`);
|
||||
|
||||
// Initialize log with agent startup info
|
||||
const sessionId = sessionMetadata?.sessionId || path.basename(sourceDir).split('-').pop().substring(0, 8);
|
||||
logBuffer.push(`=== ${description} - Detailed Execution Log ===`);
|
||||
logBuffer.push(`Timestamp: ${new Date().toISOString()}`);
|
||||
logBuffer.push(`Working Directory: ${sourceDir}`);
|
||||
logBuffer.push(`Session ID: ${sessionId}`);
|
||||
logBuffer.push(`Log File: ${logFilePath}`);
|
||||
logBuffer.push(`\n=== Agent Execution Start ===\n`);
|
||||
const logDir = generateSessionLogPath(sessionMetadata.webUrl, sessionMetadata.id);
|
||||
logFilePath = path.join(logDir, `${timestamp}_${agentName}_attempt-${attemptNumber}.log`);
|
||||
} else {
|
||||
console.log(chalk.blue(` 🤖 Running Claude Code: ${description}...`));
|
||||
}
|
||||
@@ -114,7 +101,6 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
|
||||
maxTurns: 10_000, // Maximum turns for autonomous work
|
||||
cwd: sourceDir, // Set working directory using SDK option
|
||||
permissionMode: 'bypassPermissions', // Bypass all permission checks for pentesting
|
||||
customSystemPrompt: fullPrompt, // Use system prompt for better security and consistency
|
||||
};
|
||||
|
||||
// SDK Options only shown for verbose agents (not clean output)
|
||||
@@ -132,7 +118,7 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
|
||||
progressIndicator.start();
|
||||
}
|
||||
|
||||
for await (const message of query({ prompt: 'Begin.', options })) {
|
||||
for await (const message of query({ prompt: fullPrompt, options })) {
|
||||
if (message.type === "assistant") {
|
||||
turnCount++;
|
||||
const content = Array.isArray(message.message.content)
|
||||
@@ -177,9 +163,15 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
|
||||
console.log(colorFn(` ${content}`));
|
||||
}
|
||||
|
||||
// Log full details to file for later review
|
||||
logBuffer.push(`\n🤖 Turn ${turnCount} (${description}):`);
|
||||
logBuffer.push(content);
|
||||
// Log to audit system (crash-safe, append-only)
|
||||
if (auditSession) {
|
||||
await auditSession.logEvent('llm_response', {
|
||||
turn: turnCount,
|
||||
content,
|
||||
timestamp: new Date().toISOString()
|
||||
});
|
||||
}
|
||||
|
||||
messages.push(content);
|
||||
|
||||
// Check for API error patterns in assistant message content
|
||||
@@ -210,6 +202,15 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
|
||||
if (message.input && Object.keys(message.input).length > 0) {
|
||||
console.log(chalk.gray(` Input: ${JSON.stringify(message.input, null, 2)}`));
|
||||
}
|
||||
|
||||
// Log tool start event
|
||||
if (auditSession) {
|
||||
await auditSession.logEvent('tool_start', {
|
||||
toolName: message.name,
|
||||
parameters: message.input,
|
||||
timestamp: new Date().toISOString()
|
||||
});
|
||||
}
|
||||
} else if (message.type === "tool_result") {
|
||||
console.log(chalk.green(` ✅ Tool Result:`));
|
||||
if (message.content) {
|
||||
@@ -221,6 +222,14 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
|
||||
console.log(chalk.gray(` ${resultStr}`));
|
||||
}
|
||||
}
|
||||
|
||||
// Log tool end event
|
||||
if (auditSession) {
|
||||
await auditSession.logEvent('tool_end', {
|
||||
result: message.content,
|
||||
timestamp: new Date().toISOString()
|
||||
});
|
||||
}
|
||||
} else if (message.type === "result") {
|
||||
result = message.result;
|
||||
|
||||
@@ -273,8 +282,9 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
|
||||
costResults.agents[agentKey] = cost;
|
||||
costResults.total += cost;
|
||||
|
||||
// Store cost for return value
|
||||
// Store cost for return value and partial tracking
|
||||
totalCost = cost;
|
||||
partialCost = cost;
|
||||
break;
|
||||
} else {
|
||||
// Log any other message types we might not be handling
|
||||
@@ -292,23 +302,14 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
|
||||
console.log(chalk.yellow(` ⚠️ API Error detected in ${description} - will validate deliverables before failing`));
|
||||
}
|
||||
|
||||
// Finish status line for parallel execution and save detailed log
|
||||
// Finish status line for parallel execution
|
||||
if (statusManager) {
|
||||
statusManager.clearAgentStatus(description);
|
||||
statusManager.finishStatusLine();
|
||||
}
|
||||
|
||||
// Write detailed log to file
|
||||
if (logFilePath && logBuffer.length > 0) {
|
||||
logBuffer.push(`\n=== Agent Execution Complete ===`);
|
||||
logBuffer.push(`Duration: ${formatDuration(duration)}`);
|
||||
logBuffer.push(`Turns: ${turnCount}`);
|
||||
logBuffer.push(`Cost: $${totalCost.toFixed(4)}`);
|
||||
logBuffer.push(`Status: Success`);
|
||||
logBuffer.push(`Completed: ${new Date().toISOString()}`);
|
||||
|
||||
await fs.writeFile(logFilePath, logBuffer.join('\n'));
|
||||
}
|
||||
// NOTE: Log writing now handled by AuditSession (crash-safe, append-only)
|
||||
// Legacy log writing removed - audit system handles this automatically
|
||||
|
||||
// Show completion messages based on agent type
|
||||
if (progressIndicator) {
|
||||
@@ -327,7 +328,15 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
|
||||
}
|
||||
|
||||
// Return result with log file path for all agents
|
||||
const returnData = { result, success: true, duration, turns: turnCount, cost: totalCost, apiErrorDetected };
|
||||
const returnData = {
|
||||
result,
|
||||
success: true,
|
||||
duration,
|
||||
turns: turnCount,
|
||||
cost: totalCost,
|
||||
partialCost, // Include partial cost for crash recovery
|
||||
apiErrorDetected
|
||||
};
|
||||
if (logFilePath) {
|
||||
returnData.logFile = logFilePath;
|
||||
}
|
||||
@@ -344,17 +353,16 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
|
||||
statusManager.finishStatusLine();
|
||||
}
|
||||
|
||||
// Write error log to file
|
||||
if (logFilePath && logBuffer.length > 0) {
|
||||
logBuffer.push(`\n=== Agent Execution Failed ===`);
|
||||
logBuffer.push(`Duration: ${formatDuration(duration)}`);
|
||||
logBuffer.push(`Turns: ${turnCount}`);
|
||||
logBuffer.push(`Error: ${error.message}`);
|
||||
logBuffer.push(`Error Type: ${error.constructor.name}`);
|
||||
logBuffer.push(`Status: Failed`);
|
||||
logBuffer.push(`Failed: ${new Date().toISOString()}`);
|
||||
|
||||
await fs.writeFile(logFilePath, logBuffer.join('\n'));
|
||||
// Log error to audit system
|
||||
if (auditSession) {
|
||||
await auditSession.logEvent('error', {
|
||||
message: error.message,
|
||||
errorType: error.constructor.name,
|
||||
stack: error.stack,
|
||||
duration,
|
||||
turns: turnCount,
|
||||
timestamp: new Date().toISOString()
|
||||
});
|
||||
}
|
||||
|
||||
// Show error messages based on agent type
|
||||
@@ -420,6 +428,7 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
|
||||
prompt: fullPrompt.slice(0, 100) + '...',
|
||||
success: false,
|
||||
duration,
|
||||
cost: partialCost, // Include partial cost on error
|
||||
retryable: isRetryableError(error)
|
||||
};
|
||||
}
|
||||
@@ -432,6 +441,7 @@ async function runClaudePrompt(prompt, sourceDir, allowedTools = 'Read', context
|
||||
// - Prompt snapshotting for debugging and reproducibility
|
||||
// - Git checkpoint/rollback safety for workspace protection
|
||||
// - Comprehensive error handling and logging
|
||||
// - Crash-safe audit logging via AuditSession
|
||||
export async function runClaudePromptWithRetry(prompt, sourceDir, allowedTools = 'Read', context = '', description = 'Claude analysis', agentName = null, colorFn = chalk.cyan, sessionMetadata = null) {
|
||||
const maxRetries = 3;
|
||||
let lastError;
|
||||
@@ -439,22 +449,25 @@ export async function runClaudePromptWithRetry(prompt, sourceDir, allowedTools =
|
||||
|
||||
console.log(chalk.cyan(`🚀 Starting ${description} with ${maxRetries} max attempts`));
|
||||
|
||||
// Save prompt snapshot before execution starts (for debugging failed runs)
|
||||
let snapshotSaved = false;
|
||||
// Initialize audit session (crash-safe logging)
|
||||
let auditSession = null;
|
||||
if (sessionMetadata && agentName) {
|
||||
auditSession = new AuditSession(sessionMetadata);
|
||||
await auditSession.initialize();
|
||||
}
|
||||
|
||||
for (let attempt = 1; attempt <= maxRetries; attempt++) {
|
||||
// Create checkpoint before each attempt
|
||||
await createGitCheckpoint(sourceDir, description, attempt);
|
||||
|
||||
// Save snapshot on first attempt only (before any execution)
|
||||
if (!snapshotSaved && agentName) {
|
||||
// Start agent tracking in audit system (saves prompt snapshot automatically)
|
||||
if (auditSession) {
|
||||
const fullPrompt = retryContext ? `${retryContext}\n\n${prompt}` : prompt;
|
||||
await savePromptSnapshot(sourceDir, agentName, fullPrompt);
|
||||
snapshotSaved = true;
|
||||
await auditSession.startAgent(agentName, fullPrompt, attempt);
|
||||
}
|
||||
|
||||
try {
|
||||
const result = await runClaudePrompt(prompt, sourceDir, allowedTools, retryContext, description, colorFn, sessionMetadata);
|
||||
const result = await runClaudePrompt(prompt, sourceDir, allowedTools, retryContext, description, colorFn, sessionMetadata, auditSession, attempt);
|
||||
|
||||
// Validate output after successful run
|
||||
if (result.success) {
|
||||
@@ -466,6 +479,17 @@ export async function runClaudePromptWithRetry(prompt, sourceDir, allowedTools =
|
||||
console.log(chalk.yellow(`📋 Validation: Ready for exploitation despite API error warnings`));
|
||||
}
|
||||
|
||||
// Record successful attempt in audit system
|
||||
if (auditSession) {
|
||||
await auditSession.endAgent(agentName, {
|
||||
attemptNumber: attempt,
|
||||
duration_ms: result.duration,
|
||||
cost_usd: result.cost || 0,
|
||||
success: true,
|
||||
checkpoint: await getGitCommitHash(sourceDir)
|
||||
});
|
||||
}
|
||||
|
||||
// Commit successful changes (will include the snapshot)
|
||||
await commitGitSuccess(sourceDir, description);
|
||||
console.log(chalk.green.bold(`🎉 ${description} completed successfully on attempt ${attempt}/${maxRetries}`));
|
||||
@@ -474,6 +498,18 @@ export async function runClaudePromptWithRetry(prompt, sourceDir, allowedTools =
|
||||
// Agent completed but output validation failed
|
||||
console.log(chalk.yellow(`⚠️ ${description} completed but output validation failed`));
|
||||
|
||||
// Record failed validation attempt in audit system
|
||||
if (auditSession) {
|
||||
await auditSession.endAgent(agentName, {
|
||||
attemptNumber: attempt,
|
||||
duration_ms: result.duration,
|
||||
cost_usd: result.partialCost || result.cost || 0,
|
||||
success: false,
|
||||
error: 'Output validation failed',
|
||||
isFinalAttempt: attempt === maxRetries
|
||||
});
|
||||
}
|
||||
|
||||
// If API error detected AND validation failed, this is a retryable error
|
||||
if (result.apiErrorDetected) {
|
||||
console.log(chalk.yellow(`⚠️ API Error detected with validation failure - treating as retryable`));
|
||||
@@ -501,6 +537,18 @@ export async function runClaudePromptWithRetry(prompt, sourceDir, allowedTools =
|
||||
} catch (error) {
|
||||
lastError = error;
|
||||
|
||||
// Record failed attempt in audit system
|
||||
if (auditSession) {
|
||||
await auditSession.endAgent(agentName, {
|
||||
attemptNumber: attempt,
|
||||
duration_ms: error.duration || 0,
|
||||
cost_usd: error.cost || 0,
|
||||
success: false,
|
||||
error: error.message,
|
||||
isFinalAttempt: attempt === maxRetries
|
||||
});
|
||||
}
|
||||
|
||||
// Check if error is retryable
|
||||
if (!isRetryableError(error)) {
|
||||
console.log(chalk.red(`❌ ${description} failed with non-retryable error: ${error.message}`));
|
||||
@@ -533,4 +581,14 @@ export async function runClaudePromptWithRetry(prompt, sourceDir, allowedTools =
|
||||
}
|
||||
|
||||
throw lastError;
|
||||
}
|
||||
|
||||
// Helper function to get git commit hash
|
||||
async function getGitCommitHash(sourceDir) {
|
||||
try {
|
||||
const result = await $`cd ${sourceDir} && git rev-parse HEAD`;
|
||||
return result.stdout.trim();
|
||||
} catch (error) {
|
||||
return null;
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,313 @@
|
||||
/**
|
||||
* Audit Session - Main Facade
|
||||
*
|
||||
* Coordinates logger, metrics tracker, and concurrency control for comprehensive
|
||||
* crash-safe audit logging.
|
||||
*/
|
||||
|
||||
import { AgentLogger } from './logger.js';
|
||||
import { MetricsTracker } from './metrics-tracker.js';
|
||||
import { initializeAuditStructure, formatTimestamp } from './utils.js';
|
||||
|
||||
/**
|
||||
* SessionMutex for concurrency control
|
||||
* (Identical to session-manager.js implementation)
|
||||
*/
|
||||
class SessionMutex {
|
||||
constructor() {
|
||||
this.locks = new Map();
|
||||
}
|
||||
|
||||
async lock(sessionId) {
|
||||
if (this.locks.has(sessionId)) {
|
||||
// Wait for existing lock to be released
|
||||
await this.locks.get(sessionId);
|
||||
}
|
||||
|
||||
let resolve;
|
||||
const promise = new Promise(r => resolve = r);
|
||||
this.locks.set(sessionId, promise);
|
||||
|
||||
return () => {
|
||||
this.locks.delete(sessionId);
|
||||
resolve();
|
||||
};
|
||||
}
|
||||
}
|
||||
|
||||
// Global mutex instance
|
||||
const sessionMutex = new SessionMutex();
|
||||
|
||||
/**
|
||||
* AuditSession - Main audit system facade
|
||||
*/
|
||||
export class AuditSession {
|
||||
/**
|
||||
* @param {Object} sessionMetadata - Session metadata from Shannon store
|
||||
* @param {string} sessionMetadata.id - Session UUID
|
||||
* @param {string} sessionMetadata.webUrl - Target web URL
|
||||
* @param {string} [sessionMetadata.repoPath] - Target repository path
|
||||
*/
|
||||
constructor(sessionMetadata) {
|
||||
this.sessionMetadata = sessionMetadata;
|
||||
this.sessionId = sessionMetadata.id;
|
||||
|
||||
// Validate required fields
|
||||
if (!this.sessionId) {
|
||||
throw new Error('sessionMetadata.id is required');
|
||||
}
|
||||
if (!this.sessionMetadata.webUrl) {
|
||||
throw new Error('sessionMetadata.webUrl is required');
|
||||
}
|
||||
|
||||
// Components
|
||||
this.metricsTracker = new MetricsTracker(sessionMetadata);
|
||||
|
||||
// Active logger (one at a time per agent attempt)
|
||||
this.currentLogger = null;
|
||||
|
||||
// Initialization flag
|
||||
this.initialized = false;
|
||||
}
|
||||
|
||||
/**
|
||||
* Initialize audit session (creates directories, session.json)
|
||||
* Idempotent and race-safe
|
||||
* @returns {Promise<void>}
|
||||
*/
|
||||
async initialize() {
|
||||
if (this.initialized) {
|
||||
return; // Already initialized
|
||||
}
|
||||
|
||||
// Create directory structure
|
||||
await initializeAuditStructure(this.sessionMetadata);
|
||||
|
||||
// Initialize metrics tracker (loads or creates session.json)
|
||||
await this.metricsTracker.initialize();
|
||||
|
||||
this.initialized = true;
|
||||
}
|
||||
|
||||
/**
|
||||
* Ensure initialized (helper for lazy initialization)
|
||||
* @private
|
||||
* @returns {Promise<void>}
|
||||
*/
|
||||
async ensureInitialized() {
|
||||
if (!this.initialized) {
|
||||
await this.initialize();
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Log session-level failure (pre-agent failures)
|
||||
* @param {Error} error - Error object
|
||||
* @param {Object} context - Additional context
|
||||
* @returns {Promise<void>}
|
||||
*/
|
||||
async logSessionFailure(error, context = {}) {
|
||||
await this.ensureInitialized();
|
||||
|
||||
// Update session status
|
||||
await this.metricsTracker.updateSessionStatus('failed');
|
||||
|
||||
// Create a special failure logger
|
||||
const failureLogger = new AgentLogger(this.sessionMetadata, 'session-failure', 1);
|
||||
await failureLogger.initialize();
|
||||
|
||||
await failureLogger.logError(error, {
|
||||
...context,
|
||||
timestamp: formatTimestamp(),
|
||||
sessionId: this.sessionId
|
||||
});
|
||||
|
||||
await failureLogger.close();
|
||||
}
|
||||
|
||||
/**
|
||||
* Start agent execution
|
||||
* @param {string} agentName - Agent name
|
||||
* @param {string} promptContent - Full prompt content
|
||||
* @param {number} [attemptNumber=1] - Attempt number
|
||||
* @returns {Promise<void>}
|
||||
*/
|
||||
async startAgent(agentName, promptContent, attemptNumber = 1) {
|
||||
await this.ensureInitialized();
|
||||
|
||||
// Save prompt snapshot (only on first attempt)
|
||||
if (attemptNumber === 1) {
|
||||
await AgentLogger.savePrompt(this.sessionMetadata, agentName, promptContent);
|
||||
}
|
||||
|
||||
// Create and initialize logger for this attempt
|
||||
this.currentLogger = new AgentLogger(this.sessionMetadata, agentName, attemptNumber);
|
||||
await this.currentLogger.initialize();
|
||||
|
||||
// Start metrics tracking
|
||||
this.metricsTracker.startAgent(agentName, attemptNumber);
|
||||
|
||||
// Log start event
|
||||
await this.currentLogger.logEvent('agent_start', {
|
||||
agentName,
|
||||
attemptNumber,
|
||||
timestamp: formatTimestamp()
|
||||
});
|
||||
}
|
||||
|
||||
/**
|
||||
* Log event during agent execution
|
||||
* @param {string} eventType - Event type (tool_start, tool_end, llm_response, etc.)
|
||||
* @param {Object} eventData - Event data
|
||||
* @returns {Promise<void>}
|
||||
*/
|
||||
async logEvent(eventType, eventData) {
|
||||
if (!this.currentLogger) {
|
||||
throw new Error('No active logger. Call startAgent() first.');
|
||||
}
|
||||
|
||||
await this.currentLogger.logEvent(eventType, eventData);
|
||||
}
|
||||
|
||||
/**
|
||||
* Log a text message (for compatibility)
|
||||
* @param {string} message - Message to log
|
||||
* @returns {Promise<void>}
|
||||
*/
|
||||
async logMessage(message) {
|
||||
if (!this.currentLogger) {
|
||||
throw new Error('No active logger. Call startAgent() first.');
|
||||
}
|
||||
|
||||
await this.currentLogger.logMessage(message);
|
||||
}
|
||||
|
||||
/**
|
||||
* End agent execution (mutex-protected)
|
||||
* @param {string} agentName - Agent name
|
||||
* @param {Object} result - Execution result
|
||||
* @param {number} result.attemptNumber - Attempt number
|
||||
* @param {number} result.duration_ms - Duration in milliseconds
|
||||
* @param {number} result.cost_usd - Cost in USD
|
||||
* @param {boolean} result.success - Whether attempt succeeded
|
||||
* @param {string} [result.error] - Error message (if failed)
|
||||
* @param {string} [result.checkpoint] - Git checkpoint hash (if succeeded)
|
||||
* @param {boolean} [result.isFinalAttempt=false] - Whether this is the final attempt
|
||||
* @returns {Promise<void>}
|
||||
*/
|
||||
async endAgent(agentName, result) {
|
||||
// Log end event
|
||||
if (this.currentLogger) {
|
||||
await this.currentLogger.logEvent('agent_end', {
|
||||
agentName,
|
||||
success: result.success,
|
||||
duration_ms: result.duration_ms,
|
||||
cost_usd: result.cost_usd,
|
||||
timestamp: formatTimestamp()
|
||||
});
|
||||
|
||||
// Close logger
|
||||
await this.currentLogger.close();
|
||||
this.currentLogger = null;
|
||||
}
|
||||
|
||||
// Mutex-protected update to session.json
|
||||
const unlock = await sessionMutex.lock(this.sessionId);
|
||||
try {
|
||||
// Reload metrics (in case of parallel updates)
|
||||
await this.metricsTracker.reload();
|
||||
|
||||
// Update metrics
|
||||
await this.metricsTracker.endAgent(agentName, result);
|
||||
} finally {
|
||||
unlock();
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Update validation results
|
||||
* @param {string} agentName - Agent name
|
||||
* @param {Object} validationData - Validation data
|
||||
* @returns {Promise<void>}
|
||||
*/
|
||||
async updateValidation(agentName, validationData) {
|
||||
await this.ensureInitialized();
|
||||
|
||||
const unlock = await sessionMutex.lock(this.sessionId);
|
||||
try {
|
||||
await this.metricsTracker.reload();
|
||||
await this.metricsTracker.updateValidation(agentName, validationData);
|
||||
} finally {
|
||||
unlock();
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Mark agent as rolled back
|
||||
* @param {string} agentName - Agent name
|
||||
* @returns {Promise<void>}
|
||||
*/
|
||||
async markRolledBack(agentName) {
|
||||
await this.ensureInitialized();
|
||||
|
||||
const unlock = await sessionMutex.lock(this.sessionId);
|
||||
try {
|
||||
await this.metricsTracker.reload();
|
||||
await this.metricsTracker.markRolledBack(agentName);
|
||||
} finally {
|
||||
unlock();
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Mark multiple agents as rolled back
|
||||
* @param {string[]} agentNames - Array of agent names
|
||||
* @returns {Promise<void>}
|
||||
*/
|
||||
async markMultipleRolledBack(agentNames) {
|
||||
await this.ensureInitialized();
|
||||
|
||||
const unlock = await sessionMutex.lock(this.sessionId);
|
||||
try {
|
||||
await this.metricsTracker.reload();
|
||||
await this.metricsTracker.markMultipleRolledBack(agentNames);
|
||||
} finally {
|
||||
unlock();
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Update session status
|
||||
* @param {string} status - New status (in-progress, completed, failed)
|
||||
* @returns {Promise<void>}
|
||||
*/
|
||||
async updateSessionStatus(status) {
|
||||
await this.ensureInitialized();
|
||||
|
||||
const unlock = await sessionMutex.lock(this.sessionId);
|
||||
try {
|
||||
await this.metricsTracker.reload();
|
||||
await this.metricsTracker.updateSessionStatus(status);
|
||||
} finally {
|
||||
unlock();
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Get current metrics (read-only)
|
||||
* @returns {Promise<Object>} Current metrics
|
||||
*/
|
||||
async getMetrics() {
|
||||
await this.ensureInitialized();
|
||||
return this.metricsTracker.getMetrics();
|
||||
}
|
||||
|
||||
/**
|
||||
* Get validation results (read-only)
|
||||
* @returns {Promise<Object>} Validation results
|
||||
*/
|
||||
async getValidation() {
|
||||
await this.ensureInitialized();
|
||||
return this.metricsTracker.getValidation();
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,16 @@
|
||||
/**
|
||||
* Unified Audit & Metrics System
|
||||
*
|
||||
* Public API for the audit system. Provides crash-safe, append-only logging
|
||||
* and comprehensive metrics tracking for Shannon penetration testing sessions.
|
||||
*
|
||||
* IMPORTANT: Session objects must have an 'id' field (NOT 'sessionId')
|
||||
* Example: { id: "uuid", webUrl: "...", repoPath: "..." }
|
||||
*
|
||||
* @module audit
|
||||
*/
|
||||
|
||||
export { AuditSession } from './audit-session.js';
|
||||
export { AgentLogger } from './logger.js';
|
||||
export { MetricsTracker } from './metrics-tracker.js';
|
||||
export * as AuditUtils from './utils.js';
|
||||
@@ -0,0 +1,247 @@
|
||||
/**
|
||||
* Append-Only Agent Logger
|
||||
*
|
||||
* Provides crash-safe, append-only logging for agent execution.
|
||||
* Uses file streams with immediate flush to prevent data loss.
|
||||
*/
|
||||
|
||||
import fs from 'fs';
|
||||
import { generateLogPath, generatePromptPath, atomicWrite, formatTimestamp } from './utils.js';
|
||||
|
||||
/**
|
||||
* AgentLogger - Manages append-only logging for a single agent execution
|
||||
*/
|
||||
export class AgentLogger {
|
||||
/**
|
||||
* @param {Object} sessionMetadata - Session metadata
|
||||
* @param {string} agentName - Name of the agent
|
||||
* @param {number} attemptNumber - Attempt number (1, 2, 3, ...)
|
||||
*/
|
||||
constructor(sessionMetadata, agentName, attemptNumber) {
|
||||
this.sessionMetadata = sessionMetadata;
|
||||
this.agentName = agentName;
|
||||
this.attemptNumber = attemptNumber;
|
||||
this.timestamp = Date.now();
|
||||
|
||||
// Generate log file path
|
||||
this.logPath = generateLogPath(sessionMetadata, agentName, this.timestamp, attemptNumber);
|
||||
|
||||
// Create write stream (append mode)
|
||||
this.stream = null;
|
||||
this.isOpen = false;
|
||||
}
|
||||
|
||||
/**
|
||||
* Initialize the log stream (creates file and opens stream)
|
||||
* @returns {Promise<void>}
|
||||
*/
|
||||
async initialize() {
|
||||
if (this.isOpen) {
|
||||
return; // Already initialized
|
||||
}
|
||||
|
||||
// Create write stream with append mode and auto-flush
|
||||
this.stream = fs.createWriteStream(this.logPath, {
|
||||
flags: 'a', // Append mode
|
||||
encoding: 'utf8',
|
||||
autoClose: true
|
||||
});
|
||||
|
||||
this.isOpen = true;
|
||||
|
||||
// Write header
|
||||
await this.writeHeader();
|
||||
}
|
||||
|
||||
/**
|
||||
* Write header to log file
|
||||
* @private
|
||||
* @returns {Promise<void>}
|
||||
*/
|
||||
async writeHeader() {
|
||||
const header = [
|
||||
`========================================`,
|
||||
`Agent: ${this.agentName}`,
|
||||
`Attempt: ${this.attemptNumber}`,
|
||||
`Started: ${formatTimestamp(this.timestamp)}`,
|
||||
`Session: ${this.sessionMetadata.id}`,
|
||||
`Web URL: ${this.sessionMetadata.webUrl}`,
|
||||
`========================================\n`
|
||||
].join('\n');
|
||||
|
||||
return this.writeRaw(header);
|
||||
}
|
||||
|
||||
/**
|
||||
* Write raw text to log file with immediate flush
|
||||
* @private
|
||||
* @param {string} text - Text to write
|
||||
* @returns {Promise<void>}
|
||||
*/
|
||||
writeRaw(text) {
|
||||
return new Promise((resolve, reject) => {
|
||||
if (!this.isOpen || !this.stream) {
|
||||
reject(new Error('Logger not initialized'));
|
||||
return;
|
||||
}
|
||||
|
||||
// Write and flush immediately (crash-safe)
|
||||
const needsDrain = !this.stream.write(text, 'utf8', (error) => {
|
||||
if (error) {
|
||||
reject(error);
|
||||
}
|
||||
});
|
||||
|
||||
if (needsDrain) {
|
||||
// Buffer is full, wait for drain
|
||||
const drainHandler = () => {
|
||||
this.stream.removeListener('drain', drainHandler);
|
||||
resolve();
|
||||
};
|
||||
this.stream.once('drain', drainHandler);
|
||||
} else {
|
||||
// Buffer has space, resolve immediately
|
||||
resolve();
|
||||
}
|
||||
});
|
||||
}
|
||||
|
||||
/**
|
||||
* Log an event (tool_start, tool_end, llm_response, etc.)
|
||||
* Events are logged as JSON for parseability
|
||||
* @param {string} eventType - Type of event
|
||||
* @param {Object} eventData - Event data
|
||||
* @returns {Promise<void>}
|
||||
*/
|
||||
async logEvent(eventType, eventData) {
|
||||
const event = {
|
||||
type: eventType,
|
||||
timestamp: formatTimestamp(),
|
||||
data: eventData
|
||||
};
|
||||
|
||||
const eventLine = `${JSON.stringify(event)}\n`;
|
||||
return this.writeRaw(eventLine);
|
||||
}
|
||||
|
||||
/**
|
||||
* Log a text message (for compatibility with existing logging)
|
||||
* @param {string} message - Message to log
|
||||
* @returns {Promise<void>}
|
||||
*/
|
||||
async logMessage(message) {
|
||||
const timestamp = formatTimestamp();
|
||||
const line = `[${timestamp}] ${message}\n`;
|
||||
return this.writeRaw(line);
|
||||
}
|
||||
|
||||
/**
|
||||
* Log tool start event
|
||||
* @param {string} toolName - Name of the tool
|
||||
* @param {Object} [parameters] - Tool parameters
|
||||
* @returns {Promise<void>}
|
||||
*/
|
||||
async logToolStart(toolName, parameters = {}) {
|
||||
return this.logEvent('tool_start', { toolName, parameters });
|
||||
}
|
||||
|
||||
/**
|
||||
* Log tool end event
|
||||
* @param {string} toolName - Name of the tool
|
||||
* @param {Object} result - Tool result
|
||||
* @returns {Promise<void>}
|
||||
*/
|
||||
async logToolEnd(toolName, result) {
|
||||
return this.logEvent('tool_end', { toolName, result });
|
||||
}
|
||||
|
||||
/**
|
||||
* Log LLM response event
|
||||
* @param {string} content - Response content
|
||||
* @param {Object} [metadata] - Additional metadata
|
||||
* @returns {Promise<void>}
|
||||
*/
|
||||
async logLLMResponse(content, metadata = {}) {
|
||||
return this.logEvent('llm_response', { content, ...metadata });
|
||||
}
|
||||
|
||||
/**
|
||||
* Log validation start event
|
||||
* @param {string} validationType - Type of validation
|
||||
* @returns {Promise<void>}
|
||||
*/
|
||||
async logValidationStart(validationType) {
|
||||
return this.logEvent('validation_start', { validationType });
|
||||
}
|
||||
|
||||
/**
|
||||
* Log validation end event
|
||||
* @param {string} validationType - Type of validation
|
||||
* @param {boolean} success - Whether validation passed
|
||||
* @param {Object} [details] - Validation details
|
||||
* @returns {Promise<void>}
|
||||
*/
|
||||
async logValidationEnd(validationType, success, details = {}) {
|
||||
return this.logEvent('validation_end', { validationType, success, ...details });
|
||||
}
|
||||
|
||||
/**
|
||||
* Log error event
|
||||
* @param {Error} error - Error object
|
||||
* @param {Object} [context] - Additional context
|
||||
* @returns {Promise<void>}
|
||||
*/
|
||||
async logError(error, context = {}) {
|
||||
return this.logEvent('error', {
|
||||
message: error.message,
|
||||
stack: error.stack,
|
||||
...context
|
||||
});
|
||||
}
|
||||
|
||||
/**
|
||||
* Close the log stream
|
||||
* @returns {Promise<void>}
|
||||
*/
|
||||
async close() {
|
||||
if (!this.isOpen || !this.stream) {
|
||||
return;
|
||||
}
|
||||
|
||||
return new Promise((resolve) => {
|
||||
this.stream.end(() => {
|
||||
this.isOpen = false;
|
||||
resolve();
|
||||
});
|
||||
});
|
||||
}
|
||||
|
||||
/**
|
||||
* Save prompt snapshot to prompts directory
|
||||
* Static method - doesn't require logger instance
|
||||
* @param {Object} sessionMetadata - Session metadata
|
||||
* @param {string} agentName - Agent name
|
||||
* @param {string} promptContent - Full prompt content
|
||||
* @returns {Promise<void>}
|
||||
*/
|
||||
static async savePrompt(sessionMetadata, agentName, promptContent) {
|
||||
const promptPath = generatePromptPath(sessionMetadata, agentName);
|
||||
|
||||
// Create header with metadata
|
||||
const header = [
|
||||
`# Prompt Snapshot: ${agentName}`,
|
||||
``,
|
||||
`**Session:** ${sessionMetadata.id}`,
|
||||
`**Web URL:** ${sessionMetadata.webUrl}`,
|
||||
`**Saved:** ${formatTimestamp()}`,
|
||||
``,
|
||||
`---`,
|
||||
``
|
||||
].join('\n');
|
||||
|
||||
const fullContent = header + promptContent;
|
||||
|
||||
// Use atomic write for safety
|
||||
await atomicWrite(promptPath, fullContent);
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,331 @@
|
||||
/**
|
||||
* Metrics Tracker
|
||||
*
|
||||
* Manages session.json with comprehensive timing, cost, and validation metrics.
|
||||
* Tracks attempt-level data for complete forensic trail.
|
||||
*/
|
||||
|
||||
import {
|
||||
generateSessionJsonPath,
|
||||
atomicWrite,
|
||||
readJson,
|
||||
fileExists,
|
||||
formatTimestamp,
|
||||
calculatePercentage
|
||||
} from './utils.js';
|
||||
|
||||
/**
|
||||
* MetricsTracker - Manages metrics for a session
|
||||
*/
|
||||
export class MetricsTracker {
|
||||
/**
|
||||
* @param {Object} sessionMetadata - Session metadata from Shannon store
|
||||
*/
|
||||
constructor(sessionMetadata) {
|
||||
this.sessionMetadata = sessionMetadata;
|
||||
this.sessionJsonPath = generateSessionJsonPath(sessionMetadata);
|
||||
|
||||
// In-memory state (loaded from/synced to session.json)
|
||||
this.data = null;
|
||||
|
||||
// Active timers (agent name -> start time)
|
||||
this.activeTimers = new Map();
|
||||
}
|
||||
|
||||
/**
|
||||
* Initialize session.json (idempotent)
|
||||
* @returns {Promise<void>}
|
||||
*/
|
||||
async initialize() {
|
||||
// Check if session.json already exists
|
||||
const exists = await fileExists(this.sessionJsonPath);
|
||||
|
||||
if (exists) {
|
||||
// Load existing data
|
||||
this.data = await readJson(this.sessionJsonPath);
|
||||
} else {
|
||||
// Create new session.json
|
||||
this.data = this.createInitialData();
|
||||
await this.save();
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Create initial session.json structure
|
||||
* @private
|
||||
* @returns {Object} Initial session data
|
||||
*/
|
||||
createInitialData() {
|
||||
return {
|
||||
session: {
|
||||
id: this.sessionMetadata.id,
|
||||
webUrl: this.sessionMetadata.webUrl,
|
||||
repoPath: this.sessionMetadata.repoPath,
|
||||
status: 'in-progress',
|
||||
createdAt: formatTimestamp()
|
||||
},
|
||||
metrics: {
|
||||
total_duration_ms: 0,
|
||||
total_cost_usd: 0,
|
||||
phases: {}, // Phase-level aggregations: { duration_ms, duration_percentage, cost_usd, agent_count }
|
||||
agents: {} // Agent-level metrics: { status, attempts[], final_duration_ms, total_cost_usd, checkpoint }
|
||||
}
|
||||
};
|
||||
}
|
||||
|
||||
/**
|
||||
* Start tracking an agent execution
|
||||
* @param {string} agentName - Agent name
|
||||
* @param {number} attemptNumber - Attempt number
|
||||
* @returns {void}
|
||||
*/
|
||||
startAgent(agentName, attemptNumber) {
|
||||
this.activeTimers.set(agentName, {
|
||||
startTime: Date.now(),
|
||||
attemptNumber
|
||||
});
|
||||
}
|
||||
|
||||
/**
|
||||
* End agent execution and update metrics
|
||||
* @param {string} agentName - Agent name
|
||||
* @param {Object} result - Agent execution result
|
||||
* @param {number} result.attemptNumber - Attempt number
|
||||
* @param {number} result.duration_ms - Duration in milliseconds
|
||||
* @param {number} result.cost_usd - Cost in USD
|
||||
* @param {boolean} result.success - Whether attempt succeeded
|
||||
* @param {string} [result.error] - Error message (if failed)
|
||||
* @param {string} [result.checkpoint] - Git checkpoint hash (if succeeded)
|
||||
* @returns {Promise<void>}
|
||||
*/
|
||||
async endAgent(agentName, result) {
|
||||
// Initialize agent metrics if not exists
|
||||
if (!this.data.metrics.agents[agentName]) {
|
||||
this.data.metrics.agents[agentName] = {
|
||||
status: 'in-progress',
|
||||
attempts: [],
|
||||
final_duration_ms: 0,
|
||||
total_cost_usd: 0 // Total cost across all attempts (including retries)
|
||||
};
|
||||
}
|
||||
|
||||
const agent = this.data.metrics.agents[agentName];
|
||||
|
||||
// Add attempt to array
|
||||
const attempt = {
|
||||
attempt_number: result.attemptNumber,
|
||||
duration_ms: result.duration_ms,
|
||||
cost_usd: result.cost_usd,
|
||||
success: result.success,
|
||||
timestamp: formatTimestamp()
|
||||
};
|
||||
|
||||
if (result.error) {
|
||||
attempt.error = result.error;
|
||||
}
|
||||
|
||||
agent.attempts.push(attempt);
|
||||
|
||||
// Update total cost (includes failed attempts)
|
||||
agent.total_cost_usd = agent.attempts.reduce((sum, a) => sum + a.cost_usd, 0);
|
||||
|
||||
// If successful, update final metrics and status
|
||||
if (result.success) {
|
||||
agent.status = 'success';
|
||||
agent.final_duration_ms = result.duration_ms;
|
||||
|
||||
if (result.checkpoint) {
|
||||
agent.checkpoint = result.checkpoint;
|
||||
}
|
||||
} else {
|
||||
// If this was the last attempt, mark as failed
|
||||
if (result.isFinalAttempt) {
|
||||
agent.status = 'failed';
|
||||
}
|
||||
}
|
||||
|
||||
// Clear active timer
|
||||
this.activeTimers.delete(agentName);
|
||||
|
||||
// Recalculate aggregations
|
||||
this.recalculateAggregations();
|
||||
|
||||
// Save to disk
|
||||
await this.save();
|
||||
}
|
||||
|
||||
/**
|
||||
* Mark agent as rolled back
|
||||
* @param {string} agentName - Agent name
|
||||
* @returns {Promise<void>}
|
||||
*/
|
||||
async markRolledBack(agentName) {
|
||||
if (!this.data.metrics.agents[agentName]) {
|
||||
return; // Agent not tracked
|
||||
}
|
||||
|
||||
const agent = this.data.metrics.agents[agentName];
|
||||
agent.status = 'rolled-back';
|
||||
agent.rolled_back_at = formatTimestamp();
|
||||
|
||||
// Recalculate aggregations (exclude rolled-back agents)
|
||||
this.recalculateAggregations();
|
||||
|
||||
await this.save();
|
||||
}
|
||||
|
||||
/**
|
||||
* Mark multiple agents as rolled back
|
||||
* @param {string[]} agentNames - Array of agent names
|
||||
* @returns {Promise<void>}
|
||||
*/
|
||||
async markMultipleRolledBack(agentNames) {
|
||||
for (const agentName of agentNames) {
|
||||
if (this.data.metrics.agents[agentName]) {
|
||||
const agent = this.data.metrics.agents[agentName];
|
||||
agent.status = 'rolled-back';
|
||||
agent.rolled_back_at = formatTimestamp();
|
||||
}
|
||||
}
|
||||
|
||||
this.recalculateAggregations();
|
||||
await this.save();
|
||||
}
|
||||
|
||||
/**
|
||||
* Update session status
|
||||
* @param {string} status - New status (in-progress, completed, failed)
|
||||
* @returns {Promise<void>}
|
||||
*/
|
||||
async updateSessionStatus(status) {
|
||||
this.data.session.status = status;
|
||||
|
||||
if (status === 'completed' || status === 'failed') {
|
||||
this.data.session.completedAt = formatTimestamp();
|
||||
}
|
||||
|
||||
await this.save();
|
||||
}
|
||||
|
||||
/**
|
||||
* Recalculate aggregations (total duration, total cost, phases)
|
||||
* @private
|
||||
*/
|
||||
recalculateAggregations() {
|
||||
const agents = this.data.metrics.agents;
|
||||
|
||||
// Only count successful agents (not rolled-back or failed)
|
||||
const successfulAgents = Object.entries(agents)
|
||||
.filter(([_, data]) => data.status === 'success');
|
||||
|
||||
// Calculate total duration and cost
|
||||
const totalDuration = successfulAgents.reduce(
|
||||
(sum, [_, data]) => sum + data.final_duration_ms,
|
||||
0
|
||||
);
|
||||
|
||||
const totalCost = successfulAgents.reduce(
|
||||
(sum, [_, data]) => sum + data.total_cost_usd,
|
||||
0
|
||||
);
|
||||
|
||||
this.data.metrics.total_duration_ms = totalDuration;
|
||||
this.data.metrics.total_cost_usd = totalCost;
|
||||
|
||||
// Calculate phase-level metrics
|
||||
this.data.metrics.phases = this.calculatePhaseMetrics(successfulAgents);
|
||||
}
|
||||
|
||||
/**
|
||||
* Calculate phase-level metrics
|
||||
* @private
|
||||
* @param {Array} successfulAgents - Array of [agentName, agentData] tuples
|
||||
* @returns {Object} Phase metrics
|
||||
*/
|
||||
calculatePhaseMetrics(successfulAgents) {
|
||||
const phases = {
|
||||
'pre-recon': [],
|
||||
'recon': [],
|
||||
'vulnerability-analysis': [],
|
||||
'exploitation': [],
|
||||
'reporting': []
|
||||
};
|
||||
|
||||
// Map agents to phases
|
||||
const agentPhaseMap = {
|
||||
'pre-recon': 'pre-recon',
|
||||
'recon': 'recon',
|
||||
'injection-vuln': 'vulnerability-analysis',
|
||||
'xss-vuln': 'vulnerability-analysis',
|
||||
'auth-vuln': 'vulnerability-analysis',
|
||||
'authz-vuln': 'vulnerability-analysis',
|
||||
'ssrf-vuln': 'vulnerability-analysis',
|
||||
'injection-exploit': 'exploitation',
|
||||
'xss-exploit': 'exploitation',
|
||||
'auth-exploit': 'exploitation',
|
||||
'authz-exploit': 'exploitation',
|
||||
'ssrf-exploit': 'exploitation',
|
||||
'report': 'reporting'
|
||||
};
|
||||
|
||||
// Group agents by phase
|
||||
for (const [agentName, agentData] of successfulAgents) {
|
||||
const phase = agentPhaseMap[agentName];
|
||||
if (phase) {
|
||||
phases[phase].push(agentData);
|
||||
}
|
||||
}
|
||||
|
||||
// Calculate metrics per phase
|
||||
const phaseMetrics = {};
|
||||
const totalDuration = this.data.metrics.total_duration_ms;
|
||||
|
||||
for (const [phaseName, agentList] of Object.entries(phases)) {
|
||||
if (agentList.length === 0) continue;
|
||||
|
||||
const phaseDuration = agentList.reduce(
|
||||
(sum, agent) => sum + agent.final_duration_ms,
|
||||
0
|
||||
);
|
||||
|
||||
const phaseCost = agentList.reduce(
|
||||
(sum, agent) => sum + agent.total_cost_usd,
|
||||
0
|
||||
);
|
||||
|
||||
phaseMetrics[phaseName] = {
|
||||
duration_ms: phaseDuration,
|
||||
duration_percentage: calculatePercentage(phaseDuration, totalDuration),
|
||||
cost_usd: phaseCost,
|
||||
agent_count: agentList.length
|
||||
};
|
||||
}
|
||||
|
||||
return phaseMetrics;
|
||||
}
|
||||
|
||||
/**
|
||||
* Get current metrics
|
||||
* @returns {Object} Current metrics data
|
||||
*/
|
||||
getMetrics() {
|
||||
return JSON.parse(JSON.stringify(this.data));
|
||||
}
|
||||
|
||||
/**
|
||||
* Save metrics to session.json (atomic write)
|
||||
* @private
|
||||
* @returns {Promise<void>}
|
||||
*/
|
||||
async save() {
|
||||
await atomicWrite(this.sessionJsonPath, this.data);
|
||||
}
|
||||
|
||||
/**
|
||||
* Reload metrics from disk
|
||||
* @returns {Promise<void>}
|
||||
*/
|
||||
async reload() {
|
||||
this.data = await readJson(this.sessionJsonPath);
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,208 @@
|
||||
/**
|
||||
* Audit System Utilities
|
||||
*
|
||||
* Core utility functions for path generation, atomic writes, and formatting.
|
||||
* All functions are pure and crash-safe.
|
||||
*/
|
||||
|
||||
import fs from 'fs/promises';
|
||||
import path from 'path';
|
||||
import { fileURLToPath } from 'url';
|
||||
|
||||
const __filename = fileURLToPath(import.meta.url);
|
||||
const __dirname = path.dirname(__filename);
|
||||
|
||||
// Get Shannon repository root
|
||||
export const SHANNON_ROOT = path.resolve(__dirname, '..', '..');
|
||||
export const AUDIT_LOGS_DIR = path.join(SHANNON_ROOT, 'audit-logs');
|
||||
|
||||
/**
|
||||
* Generate standardized session identifier: {hostname}_{sessionId}
|
||||
* @param {Object} sessionMetadata - Session metadata from Shannon store
|
||||
* @param {string} sessionMetadata.id - UUID session ID
|
||||
* @param {string} sessionMetadata.webUrl - Target web URL
|
||||
* @returns {string} Formatted session identifier
|
||||
*/
|
||||
export function generateSessionIdentifier(sessionMetadata) {
|
||||
const { id, webUrl } = sessionMetadata;
|
||||
const hostname = new URL(webUrl).hostname.replace(/[^a-zA-Z0-9-]/g, '-');
|
||||
return `${hostname}_${id}`;
|
||||
}
|
||||
|
||||
/**
|
||||
* Generate path to audit log directory for a session
|
||||
* @param {Object} sessionMetadata - Session metadata
|
||||
* @returns {string} Absolute path to session audit directory
|
||||
*/
|
||||
export function generateAuditPath(sessionMetadata) {
|
||||
const sessionIdentifier = generateSessionIdentifier(sessionMetadata);
|
||||
return path.join(AUDIT_LOGS_DIR, sessionIdentifier);
|
||||
}
|
||||
|
||||
/**
|
||||
* Generate path to agent log file
|
||||
* @param {Object} sessionMetadata - Session metadata
|
||||
* @param {string} agentName - Name of the agent
|
||||
* @param {number} timestamp - Timestamp (ms since epoch)
|
||||
* @param {number} attemptNumber - Attempt number (1, 2, 3, ...)
|
||||
* @returns {string} Absolute path to agent log file
|
||||
*/
|
||||
export function generateLogPath(sessionMetadata, agentName, timestamp, attemptNumber) {
|
||||
const auditPath = generateAuditPath(sessionMetadata);
|
||||
const filename = `${timestamp}_${agentName}_attempt-${attemptNumber}.log`;
|
||||
return path.join(auditPath, 'agents', filename);
|
||||
}
|
||||
|
||||
/**
|
||||
* Generate path to prompt snapshot file
|
||||
* @param {Object} sessionMetadata - Session metadata
|
||||
* @param {string} agentName - Name of the agent
|
||||
* @returns {string} Absolute path to prompt file
|
||||
*/
|
||||
export function generatePromptPath(sessionMetadata, agentName) {
|
||||
const auditPath = generateAuditPath(sessionMetadata);
|
||||
return path.join(auditPath, 'prompts', `${agentName}.md`);
|
||||
}
|
||||
|
||||
/**
|
||||
* Generate path to session.json file
|
||||
* @param {Object} sessionMetadata - Session metadata
|
||||
* @returns {string} Absolute path to session.json
|
||||
*/
|
||||
export function generateSessionJsonPath(sessionMetadata) {
|
||||
const auditPath = generateAuditPath(sessionMetadata);
|
||||
return path.join(auditPath, 'session.json');
|
||||
}
|
||||
|
||||
/**
|
||||
* Ensure directory exists (idempotent, race-safe)
|
||||
* @param {string} dirPath - Directory path to create
|
||||
* @returns {Promise<void>}
|
||||
*/
|
||||
export async function ensureDirectory(dirPath) {
|
||||
try {
|
||||
await fs.mkdir(dirPath, { recursive: true });
|
||||
} catch (error) {
|
||||
// Ignore EEXIST errors (race condition safe)
|
||||
if (error.code !== 'EEXIST') {
|
||||
throw error;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Atomic write using temp file + rename pattern
|
||||
* Guarantees no partial writes or corruption on crash
|
||||
* @param {string} filePath - Target file path
|
||||
* @param {Object|string} data - Data to write (will be JSON.stringified if object)
|
||||
* @returns {Promise<void>}
|
||||
*/
|
||||
export async function atomicWrite(filePath, data) {
|
||||
const tempPath = `${filePath}.tmp`;
|
||||
const content = typeof data === 'string' ? data : JSON.stringify(data, null, 2);
|
||||
|
||||
try {
|
||||
// Write to temp file
|
||||
await fs.writeFile(tempPath, content, 'utf8');
|
||||
|
||||
// Atomic rename (POSIX guarantee: atomic on same filesystem)
|
||||
await fs.rename(tempPath, filePath);
|
||||
} catch (error) {
|
||||
// Clean up temp file on failure
|
||||
try {
|
||||
await fs.unlink(tempPath);
|
||||
} catch (cleanupError) {
|
||||
// Ignore cleanup errors
|
||||
}
|
||||
throw error;
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Format duration in milliseconds to human-readable string
|
||||
* @param {number} ms - Duration in milliseconds
|
||||
* @returns {string} Formatted duration (e.g., "2m 34s", "45s", "1.2s")
|
||||
*/
|
||||
export function formatDuration(ms) {
|
||||
if (ms < 1000) {
|
||||
return `${ms}ms`;
|
||||
}
|
||||
|
||||
const seconds = ms / 1000;
|
||||
if (seconds < 60) {
|
||||
return `${seconds.toFixed(1)}s`;
|
||||
}
|
||||
|
||||
const minutes = Math.floor(seconds / 60);
|
||||
const remainingSeconds = Math.floor(seconds % 60);
|
||||
return `${minutes}m ${remainingSeconds}s`;
|
||||
}
|
||||
|
||||
/**
|
||||
* Format cost in USD
|
||||
* @param {number} usd - Cost in USD
|
||||
* @returns {string} Formatted cost (e.g., "$0.0823", "$2.14")
|
||||
*/
|
||||
export function formatCost(usd) {
|
||||
return `$${usd.toFixed(4)}`;
|
||||
}
|
||||
|
||||
/**
|
||||
* Format timestamp to ISO 8601 string
|
||||
* @param {number} [timestamp] - Unix timestamp in ms (defaults to now)
|
||||
* @returns {string} ISO 8601 formatted string
|
||||
*/
|
||||
export function formatTimestamp(timestamp = Date.now()) {
|
||||
return new Date(timestamp).toISOString();
|
||||
}
|
||||
|
||||
/**
|
||||
* Calculate percentage
|
||||
* @param {number} part - Part value
|
||||
* @param {number} total - Total value
|
||||
* @returns {number} Percentage (0-100)
|
||||
*/
|
||||
export function calculatePercentage(part, total) {
|
||||
if (total === 0) return 0;
|
||||
return (part / total) * 100;
|
||||
}
|
||||
|
||||
/**
|
||||
* Read and parse JSON file
|
||||
* @param {string} filePath - Path to JSON file
|
||||
* @returns {Promise<Object>} Parsed JSON data
|
||||
*/
|
||||
export async function readJson(filePath) {
|
||||
const content = await fs.readFile(filePath, 'utf8');
|
||||
return JSON.parse(content);
|
||||
}
|
||||
|
||||
/**
|
||||
* Check if file exists
|
||||
* @param {string} filePath - Path to check
|
||||
* @returns {Promise<boolean>} True if file exists
|
||||
*/
|
||||
export async function fileExists(filePath) {
|
||||
try {
|
||||
await fs.access(filePath);
|
||||
return true;
|
||||
} catch {
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Initialize audit directory structure for a session
|
||||
* Creates: audit-logs/{sessionId}/, agents/, prompts/
|
||||
* @param {Object} sessionMetadata - Session metadata
|
||||
* @returns {Promise<void>}
|
||||
*/
|
||||
export async function initializeAuditStructure(sessionMetadata) {
|
||||
const auditPath = generateAuditPath(sessionMetadata);
|
||||
const agentsPath = path.join(auditPath, 'agents');
|
||||
const promptsPath = path.join(auditPath, 'prompts');
|
||||
|
||||
await ensureDirectory(auditPath);
|
||||
await ensureDirectory(agentsPath);
|
||||
await ensureDirectory(promptsPath);
|
||||
}
|
||||
@@ -79,7 +79,7 @@ const rollbackGitToCommit = async (targetRepo, commitHash) => {
|
||||
export const runSingleAgent = async (agentName, session, pipelineTestingMode, runClaudePromptWithRetry, loadPrompt, allowRerun = false, skipWorkspaceClean = false) => {
|
||||
// Validate agent first
|
||||
const agent = validateAgent(agentName);
|
||||
|
||||
|
||||
console.log(chalk.cyan(`\n🤖 Running agent: ${agent.displayName}`));
|
||||
|
||||
// Reload session to get latest state (important for agent ranges)
|
||||
@@ -191,7 +191,7 @@ export const runSingleAgent = async (agentName, session, pipelineTestingMode, ru
|
||||
AGENTS[agentName].displayName,
|
||||
agentName, // Pass agent name for snapshot creation
|
||||
getAgentColor(agentName), // Pass color function for this agent
|
||||
{ webUrl: session.webUrl, sessionId: session.id } // Session metadata for logging
|
||||
{ id: session.id, webUrl: session.webUrl, repoPath: session.repoPath } // Session metadata for audit logging
|
||||
);
|
||||
|
||||
if (!result.success) {
|
||||
@@ -616,13 +616,35 @@ export const rollbackTo = async (targetAgent, session) => {
|
||||
}
|
||||
|
||||
const commitHash = session.checkpoints[targetAgent];
|
||||
|
||||
|
||||
// Rollback git workspace
|
||||
await rollbackGitToCommit(session.targetRepo, commitHash);
|
||||
|
||||
// Update session state
|
||||
|
||||
// Update session state (removes agents from completedAgents)
|
||||
await rollbackToAgent(session.id, targetAgent);
|
||||
|
||||
|
||||
// Mark rolled-back agents in audit system (for forensic trail)
|
||||
try {
|
||||
const { AuditSession } = await import('./audit/index.js');
|
||||
const auditSession = new AuditSession(session);
|
||||
await auditSession.initialize();
|
||||
|
||||
// Find agents that were rolled back (agents after targetAgent)
|
||||
const targetOrder = AGENTS[targetAgent].order;
|
||||
const rolledBackAgents = Object.values(AGENTS)
|
||||
.filter(agent => agent.order > targetOrder)
|
||||
.map(agent => agent.name);
|
||||
|
||||
// Mark them as rolled-back in audit system
|
||||
if (rolledBackAgents.length > 0) {
|
||||
await auditSession.markMultipleRolledBack(rolledBackAgents);
|
||||
console.log(chalk.gray(` Marked ${rolledBackAgents.length} agents as rolled-back in audit logs`));
|
||||
}
|
||||
} catch (error) {
|
||||
// Non-critical: rollback succeeded even if audit update failed
|
||||
console.log(chalk.yellow(` ⚠️ Failed to update audit logs: ${error.message}`));
|
||||
}
|
||||
|
||||
console.log(chalk.green(`✅ Successfully rolled back to agent '${targetAgent}'`));
|
||||
};
|
||||
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
import chalk from 'chalk';
|
||||
import {
|
||||
selectSession, deleteSession, deleteAllSessions,
|
||||
validateAgent, validatePhase
|
||||
validateAgent, validatePhase, reconcileSession
|
||||
} from '../session-manager.js';
|
||||
import {
|
||||
runPhase, runAll, rollbackTo, rerunAgent, displayStatus, listAgents
|
||||
@@ -94,6 +94,29 @@ export async function handleDeveloperCommand(command, args, pipelineTestingMode,
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
// Self-healing: Reconcile session with audit logs before executing command
|
||||
// This ensures Shannon store is consistent with audit data, even after crash recovery
|
||||
try {
|
||||
const reconcileReport = await reconcileSession(session.id);
|
||||
|
||||
if (reconcileReport.promotions.length > 0) {
|
||||
console.log(chalk.blue(`🔄 Reconciled: Added ${reconcileReport.promotions.length} completed agents from audit logs`));
|
||||
}
|
||||
if (reconcileReport.demotions.length > 0) {
|
||||
console.log(chalk.yellow(`🔄 Reconciled: Removed ${reconcileReport.demotions.length} rolled-back agents`));
|
||||
}
|
||||
if (reconcileReport.failures.length > 0) {
|
||||
console.log(chalk.yellow(`🔄 Reconciled: Marked ${reconcileReport.failures.length} failed agents`));
|
||||
}
|
||||
|
||||
// Reload session after reconciliation to get fresh state
|
||||
const { getSession } = await import('../session-manager.js');
|
||||
session = await getSession(session.id);
|
||||
} catch (error) {
|
||||
// Reconciliation failure is non-critical, but log warning
|
||||
console.log(chalk.yellow(`⚠️ Failed to reconcile session with audit logs: ${error.message}`));
|
||||
}
|
||||
|
||||
switch (command) {
|
||||
|
||||
case '--run-phase':
|
||||
|
||||
@@ -99,7 +99,7 @@ async function runPreReconWave1(webUrl, sourceDir, variables, config, pipelineTe
|
||||
AGENTS['pre-recon'].displayName,
|
||||
'pre-recon', // Agent name for snapshot creation
|
||||
chalk.cyan,
|
||||
{ webUrl, sessionId } // Session metadata for logging
|
||||
{ id: sessionId, webUrl } // Session metadata for audit logging (STANDARD: use 'id' field)
|
||||
)
|
||||
);
|
||||
const [codeAnalysis] = await Promise.all(operations);
|
||||
@@ -123,7 +123,7 @@ async function runPreReconWave1(webUrl, sourceDir, variables, config, pipelineTe
|
||||
AGENTS['pre-recon'].displayName,
|
||||
'pre-recon', // Agent name for snapshot creation
|
||||
chalk.cyan,
|
||||
{ webUrl, sessionId } // Session metadata for logging
|
||||
{ id: sessionId, webUrl } // Session metadata for audit logging (STANDARD: use 'id' field)
|
||||
)
|
||||
);
|
||||
}
|
||||
|
||||
@@ -207,36 +207,4 @@ export async function loadPrompt(promptName, variables, config = null, pipelineT
|
||||
const promptError = handlePromptError(promptName, error);
|
||||
throw promptError.error;
|
||||
}
|
||||
}
|
||||
|
||||
// Save prompt snapshot for successful agent runs only
|
||||
export async function savePromptSnapshot(sourceDir, agentName, promptContent) {
|
||||
const snapshotDir = path.join(sourceDir, 'prompt-snapshots');
|
||||
await fs.ensureDir(snapshotDir);
|
||||
|
||||
// Use deterministic naming - one snapshot per agent
|
||||
const fileName = `${agentName}.md`;
|
||||
const filePath = path.join(snapshotDir, fileName);
|
||||
|
||||
const timestamp = new Date().toISOString();
|
||||
const snapshotContent = `# Prompt Snapshot: ${agentName}
|
||||
|
||||
**Generated:** ${timestamp}
|
||||
**Agent:** ${agentName}
|
||||
|
||||
---
|
||||
|
||||
## Full Interpolated Prompt
|
||||
|
||||
\`\`\`markdown
|
||||
${promptContent}
|
||||
\`\`\`
|
||||
|
||||
---
|
||||
|
||||
*This snapshot represents the exact prompt that was sent to Claude Code to generate the current deliverables for this agent.*
|
||||
`;
|
||||
|
||||
await fs.writeFile(filePath, snapshotContent);
|
||||
console.log(chalk.gray(` 📸 Prompt snapshot saved: prompt-snapshots/${fileName}`));
|
||||
}
|
||||
+101
-62
@@ -4,13 +4,10 @@ import crypto from 'crypto';
|
||||
import { PentestError } from './error-handling.js';
|
||||
|
||||
// Generate a session-based log folder path
|
||||
// NEW FORMAT: {hostname}_{sessionId} (no hash, full UUID for consistency with audit system)
|
||||
export const generateSessionLogPath = (webUrl, sessionId) => {
|
||||
// Create a hash of the webUrl for uniqueness while keeping it readable
|
||||
const urlHash = crypto.createHash('md5').update(webUrl).digest('hex').substring(0, 8);
|
||||
const hostname = new URL(webUrl).hostname.replace(/[^a-zA-Z0-9-]/g, '-');
|
||||
const shortSessionId = sessionId.substring(0, 8);
|
||||
|
||||
const sessionFolderName = `${hostname}_${urlHash}_${shortSessionId}`;
|
||||
const sessionFolderName = `${hostname}_${sessionId}`;
|
||||
return path.join(process.cwd(), 'agent-logs', sessionFolderName);
|
||||
};
|
||||
|
||||
@@ -242,6 +239,8 @@ export const createSession = async (webUrl, repoPath, configFile = null, targetR
|
||||
|
||||
const sessionId = generateSessionId();
|
||||
|
||||
// STANDARD: All sessions use 'id' field (NOT 'sessionId')
|
||||
// This is the canonical session structure used throughout the codebase
|
||||
const session = {
|
||||
id: sessionId,
|
||||
webUrl,
|
||||
@@ -452,7 +451,9 @@ export const getNextAgent = (session) => {
|
||||
};
|
||||
|
||||
// Mark agent as completed with checkpoint
|
||||
export const markAgentCompleted = async (sessionId, agentName, checkpointCommit, timingData = null, costData = null, validationData = null) => {
|
||||
// NOTE: Timing, cost, and validation data now managed by AuditSession (audit-logs/session.json)
|
||||
// Shannon store contains ONLY orchestration state (completedAgents, checkpoints)
|
||||
export const markAgentCompleted = async (sessionId, agentName, checkpointCommit) => {
|
||||
// Use mutex to prevent race conditions during parallel agent execution
|
||||
const unlock = await sessionMutex.lock(sessionId);
|
||||
|
||||
@@ -473,38 +474,6 @@ export const markAgentCompleted = async (sessionId, agentName, checkpointCommit,
|
||||
[agentName]: checkpointCommit
|
||||
}
|
||||
};
|
||||
|
||||
// Update timing data if provided
|
||||
if (timingData) {
|
||||
updates.timingBreakdown = {
|
||||
...session.timingBreakdown,
|
||||
agents: {
|
||||
...session.timingBreakdown?.agents,
|
||||
[agentName]: timingData
|
||||
}
|
||||
};
|
||||
}
|
||||
|
||||
// Update cost data if provided
|
||||
if (costData) {
|
||||
const existingCost = session.costBreakdown?.total || 0;
|
||||
updates.costBreakdown = {
|
||||
total: existingCost + costData,
|
||||
agents: {
|
||||
...session.costBreakdown?.agents,
|
||||
[agentName]: costData
|
||||
}
|
||||
};
|
||||
}
|
||||
|
||||
|
||||
// Update validation data if provided (for vulnerability agents)
|
||||
if (validationData && agentName.includes('-vuln')) {
|
||||
updates.validationResults = {
|
||||
...session.validationResults,
|
||||
[agentName]: validationData
|
||||
};
|
||||
}
|
||||
|
||||
// Check if all agents are now completed and update session status
|
||||
const totalAgents = Object.keys(AGENTS).length;
|
||||
@@ -656,33 +625,103 @@ export const rollbackToAgent = async (sessionId, targetAgent) => {
|
||||
Object.entries(session.checkpoints).filter(([agent]) => !agentsToRemove.includes(agent))
|
||||
)
|
||||
};
|
||||
|
||||
// Clean up timing data for rolled-back agents
|
||||
if (session.timingBreakdown?.agents) {
|
||||
const filteredTimingAgents = Object.fromEntries(
|
||||
Object.entries(session.timingBreakdown.agents).filter(([agent]) => !agentsToRemove.includes(agent))
|
||||
);
|
||||
updates.timingBreakdown = {
|
||||
...session.timingBreakdown,
|
||||
agents: filteredTimingAgents
|
||||
};
|
||||
}
|
||||
|
||||
// Clean up cost data for rolled-back agents and recalculate total
|
||||
if (session.costBreakdown?.agents) {
|
||||
const filteredCostAgents = Object.fromEntries(
|
||||
Object.entries(session.costBreakdown.agents).filter(([agent]) => !agentsToRemove.includes(agent))
|
||||
);
|
||||
const recalculatedTotal = Object.values(filteredCostAgents).reduce((sum, cost) => sum + cost, 0);
|
||||
updates.costBreakdown = {
|
||||
total: recalculatedTotal,
|
||||
agents: filteredCostAgents
|
||||
};
|
||||
}
|
||||
|
||||
|
||||
// NOTE: Timing and cost data now managed in audit-logs/session.json
|
||||
// Rollback will be reflected via reconcileSession() which marks agents as "rolled-back"
|
||||
|
||||
return await updateSession(sessionId, updates);
|
||||
};
|
||||
|
||||
/**
|
||||
* Reconcile Shannon store with audit logs (self-healing)
|
||||
*
|
||||
* This function ensures the Shannon store (.shannon-store.json) is consistent with
|
||||
* the audit logs (audit-logs/session.json) by syncing agent completion status.
|
||||
*
|
||||
* Three-part reconciliation:
|
||||
* 1. PROMOTIONS: Agents completed/failed in audit → added to Shannon store
|
||||
* 2. DEMOTIONS: Agents rolled-back in audit → removed from Shannon store
|
||||
* 3. VERIFICATION: Ensure audit state fully reflected in orchestration
|
||||
*
|
||||
* Critical for crash recovery, especially crash during rollback operations.
|
||||
*
|
||||
* @param {string} sessionId - Session ID to reconcile
|
||||
* @returns {Promise<Object>} Reconciliation report with added/removed/failed agents
|
||||
*/
|
||||
export const reconcileSession = async (sessionId) => {
|
||||
const { AuditSession } = await import('./audit/index.js');
|
||||
|
||||
// Get Shannon store session
|
||||
const shannonSession = await getSession(sessionId);
|
||||
if (!shannonSession) {
|
||||
throw new PentestError(`Session ${sessionId} not found in Shannon store`, 'validation', false);
|
||||
}
|
||||
|
||||
// Get audit session data
|
||||
const auditSession = new AuditSession(shannonSession);
|
||||
await auditSession.initialize();
|
||||
const auditData = await auditSession.getMetrics();
|
||||
|
||||
const report = {
|
||||
promotions: [],
|
||||
demotions: [],
|
||||
failures: []
|
||||
};
|
||||
|
||||
// PART 1: PROMOTIONS (Additive)
|
||||
// Find agents completed in audit but not in Shannon store
|
||||
const auditCompleted = Object.entries(auditData.metrics.agents)
|
||||
.filter(([_, agentData]) => agentData.status === 'success')
|
||||
.map(([agentName]) => agentName);
|
||||
|
||||
const missing = auditCompleted.filter(agent => !shannonSession.completedAgents.includes(agent));
|
||||
|
||||
for (const agentName of missing) {
|
||||
const agentData = auditData.metrics.agents[agentName];
|
||||
const checkpoint = agentData.checkpoint || null;
|
||||
await markAgentCompleted(sessionId, agentName, checkpoint);
|
||||
report.promotions.push(agentName);
|
||||
}
|
||||
|
||||
// PART 2: DEMOTIONS (Subtractive) - CRITICAL FOR ROLLBACK RECOVERY
|
||||
// Find agents rolled-back in audit but still in Shannon store
|
||||
const auditRolledBack = Object.entries(auditData.metrics.agents)
|
||||
.filter(([_, agentData]) => agentData.status === 'rolled-back')
|
||||
.map(([agentName]) => agentName);
|
||||
|
||||
const toRemove = shannonSession.completedAgents.filter(agent => auditRolledBack.includes(agent));
|
||||
|
||||
if (toRemove.length > 0) {
|
||||
// Reload session to get fresh state
|
||||
const freshSession = await getSession(sessionId);
|
||||
|
||||
const updates = {
|
||||
completedAgents: freshSession.completedAgents.filter(agent => !toRemove.includes(agent)),
|
||||
checkpoints: Object.fromEntries(
|
||||
Object.entries(freshSession.checkpoints).filter(([agent]) => !toRemove.includes(agent))
|
||||
)
|
||||
};
|
||||
|
||||
await updateSession(sessionId, updates);
|
||||
report.demotions.push(...toRemove);
|
||||
}
|
||||
|
||||
// PART 3: FAILURES
|
||||
// Find agents failed in audit but not marked failed in Shannon store
|
||||
const auditFailed = Object.entries(auditData.metrics.agents)
|
||||
.filter(([_, agentData]) => agentData.status === 'failed')
|
||||
.map(([agentName]) => agentName);
|
||||
|
||||
const failedToAdd = auditFailed.filter(agent => !shannonSession.failedAgents.includes(agent));
|
||||
|
||||
for (const agentName of failedToAdd) {
|
||||
await markAgentFailed(sessionId, agentName);
|
||||
report.failures.push(agentName);
|
||||
}
|
||||
|
||||
return report;
|
||||
};
|
||||
|
||||
// Delete a specific session by ID
|
||||
export const deleteSession = async (sessionId) => {
|
||||
const store = await loadSessions();
|
||||
|
||||
Reference in New Issue
Block a user