feat: implement unified audit system v3.0 with crash-safety and self-healing

## Unified Audit System (v3.0)
- Implemented crash-safe, append-only logging to audit-logs/{hostname}_{sessionId}/
- Added session.json with comprehensive metrics (timing, cost, attempts)
- Agent execution logs with turn-by-turn detail
- Prompt snapshots saved to audit-logs/.../prompts/{agent}.md
- SessionMutex prevents race conditions during parallel execution
- Self-healing reconciliation before every CLI command

## Session Metadata Standardization
- Fixed critical bug: standardized on 'id' field (not 'sessionId') throughout codebase
- Updated: shannon.mjs (recon, report), src/phases/pre-recon.js
- Added validation in AuditSession to fail fast on incorrect field usage
- JavaScript shorthand syntax was causing wrong field names

## Schema Improvements
- session.json: Added cost_usd per phase, removed redundant final_cost_usd
- Renamed 'percentage' -> 'duration_percentage' for clarity
- Simplified agent metrics to single total_cost_usd field
- Removed unused validation object from schema

## Legacy System Removal
- Removed savePromptSnapshot() - prompts now only saved by audit system
- Removed target repo pollution (prompt-snapshots/ no longer created)
- Single source of truth: audit-logs/{hostname}_{sessionId}/prompts/

## Export Script Simplification
- Removed JSON export mode (session.json already exists)
- CSV-only export with clean columns: agent, phase, status, attempts, duration_ms, cost_usd
- Tested on real session data

## Documentation
- Updated CLAUDE.md with audit system architecture
- Added .gitignore entry for audit-logs/

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-02-12 17:22:50 +00:00 · 2025-10-22 16:09:08 -07:00
parent b8dc33f180
commit 3babf02d68
18 changed files with 1871 additions and 206 deletions
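
Note on the 'id' vs 'sessionId' fix described in the commit message: the sketch below illustrates how object shorthand can silently produce the wrong field name, and how a fail-fast check could catch it. This is a minimal illustration only. `validateSession` and the sample values are hypothetical and are not taken from the AuditSession code in this commit.

```javascript
// Hypothetical validator: NOT the actual AuditSession implementation.
// It rejects metadata that lacks 'id' or that carries the legacy 'sessionId' key.
function validateSession(session) {
  if (!session || typeof session.id !== 'string' || session.id.length === 0) {
    throw new Error("session metadata must include a non-empty string 'id'");
  }
  if ('sessionId' in session) {
    throw new Error("use 'id', not 'sessionId', in session metadata");
  }
  return session;
}

const sessionId = 'abc123'; // placeholder value for the example

// Shorthand expands to { sessionId: 'abc123' }, so the 'id' field is missing.
const wrong = { sessionId };

// An explicit key produces the expected shape.
const right = { id: sessionId };

validateSession(right); // passes

let rejected = false;
try {
  validateSession(wrong);
} catch (err) {
  rejected = true; // the missing 'id' is caught immediately
}
console.log(rejected); // true
```

Failing at construction time, rather than when a log path is later built from an undefined field, is one way to keep a mismatch like this from reaching disk.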
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -188,17 +188,47 @@ The agent implements a sophisticated checkpoint system using git:
 - Every agent creates a git checkpoint before execution
 - Rollback to any previous agent state using `--rollback-to` or `--rerun`
 - Failed agents don't affect completed work
-- Timing and cost data cleaned up during rollbacks
+- Rolled-back agents marked in audit system with status: "rolled-back"
+- Reconciliation automatically syncs Shannon store with audit logs after rollback
 - Fail-fast safety prevents accidental re-execution of completed agents

-### Timing & Performance Monitoring
-The agent includes comprehensive timing instrumentation that tracks:
-- Total execution time
-- Phase-level timing breakdown
-- Individual command execution times
-- Claude Code agent processing times
-- Cost tracking for AI agent usage
+### Unified Audit & Metrics System
+The agent implements a crash-safe, self-healing audit system (v3.0) with the following guarantees:

+**Architecture:**
+- **audit-logs/**: Centralized metrics and forensic logs (source of truth)
+  - `{hostname}_{sessionId}/session.json` - Comprehensive metrics with attempt-level detail
+  - `{hostname}_{sessionId}/prompts/` - Exact prompts used for reproducibility
+  - `{hostname}_{sessionId}/agents/` - Turn-by-turn execution logs
+- **.shannon-store.json**: Minimal orchestration state (completedAgents, checkpoints)
+
+**Crash Safety:**
+- Append-only logging with immediate flush (survives kill -9)
+- Atomic writes for session.json (no partial writes)
+- Event-based logging (tool_start, tool_end, llm_response) closes data loss windows
+
+**Self-Healing:**
+- Automatic reconciliation before every CLI command
+- Recovers from crashes during rollback
+- Audit logs are source of truth; Shannon store follows
+
+**Forensic Completeness:**
+- All retry attempts logged with errors, costs, durations
+- Rolled-back agents preserved with status: "rolled-back"
+- Partial cost capture for failed attempts
+- Complete event trail for debugging
+
+**Concurrency Safety:**
+- SessionMutex prevents race conditions during parallel agent execution
+- Safe parallel execution of vulnerability and exploitation phases
+
+**Metrics & Reporting:**
+- Export metrics with `./scripts/export-metrics.js`
+- Manual reconciliation (diagnostics) with `./scripts/reconcile-session.js`
+- Phase-level and agent-level timing/cost aggregations
+- Validation results integrated with metrics
+
+For detailed design, see `docs/unified-audit-system-design.md`.

 ## Development Notes

@@ -232,34 +262,56 @@ The tool should only be used on systems you own or have explicit permission to t
 ## File Structure

 ```
-shannon.mjs              # Main orchestration script
-package.json                  # Node.js dependencies
-src/                         # Core modules
-├── config-parser.js         # Configuration handling
-├── error-handling.js        # Error management
-├── tool-checker.js          # Tool validation
-├── session-manager.js       # Session state management
-├── checkpoint-manager.js    # Git-based checkpointing
-├── queue-validation.js      # Deliverable validation
+shannon.mjs                      # Main orchestration script
+package.json                     # Node.js dependencies
+.shannon-store.json              # Orchestration state (minimal)
+src/                             # Core modules
+├── audit/                       # Unified audit system (v3.0)
+│   ├── index.js                 # Public API
+│   ├── audit-session.js         # Main facade (logger + metrics + mutex)
+│   ├── logger.js                # Append-only crash-safe logging
+│   ├── metrics-tracker.js       # Timing, cost, attempt tracking
+│   └── utils.js                 # Path generation, atomic writes
+├── config-parser.js             # Configuration handling
+├── error-handling.js            # Error management
+├── tool-checker.js              # Tool validation
+├── session-manager.js           # Session state + reconciliation
+├── checkpoint-manager.js        # Git-based checkpointing + rollback
+├── queue-validation.js          # Deliverable validation
+├── ai/
+│   └── claude-executor.js       # Claude Code SDK integration
 └── utils/
-configs/                     # Configuration files
-├── config-schema.json       # JSON Schema validation
-├── example-config.yaml      # Template configuration
-├── juice-shop-config.yaml   # Juice Shop example
-├── keygraph-config.yaml     # Keygraph configuration
-├── chatwoot-config.yaml     # Chatwoot configuration
-├── metabase-config.yaml     # Metabase configuration
-└── cal-com-config.yaml      # Cal.com configuration
-prompts/                     # AI prompt templates
-├── pre-recon-code.txt       # Code analysis
-├── recon.txt               # Reconnaissance  
-├── vuln-*.txt              # Vulnerability assessment
-├── exploit-*.txt           # Exploitation
-└── report-executive.txt    # Executive reporting
-login_resources/            # Authentication utilities
-├── generate-totp.mjs       # TOTP generation
-└── login_instructions.txt  # Login documentation
-deliverables/              # Output directory
+audit-logs/                      # Centralized audit data (v3.0)
+└── {hostname}_{sessionId}/
+    ├── session.json             # Comprehensive metrics
+    ├── prompts/                 # Prompt snapshots
+    │   └── {agent}.md
+    └── agents/                  # Agent execution logs
+        └── {timestamp}_{agent}_attempt-{N}.log
+configs/                         # Configuration files
+├── config-schema.json           # JSON Schema validation
+├── example-config.yaml          # Template configuration
+├── juice-shop-config.yaml       # Juice Shop example
+├── keygraph-config.yaml         # Keygraph configuration
+├── chatwoot-config.yaml         # Chatwoot configuration
+├── metabase-config.yaml         # Metabase configuration
+└── cal-com-config.yaml          # Cal.com configuration
+prompts/                         # AI prompt templates
+├── pre-recon-code.txt           # Code analysis
+├── recon.txt                    # Reconnaissance
+├── vuln-*.txt                   # Vulnerability assessment
+├── exploit-*.txt                # Exploitation
+└── report-executive.txt         # Executive reporting
+login_resources/                 # Authentication utilities
+├── generate-totp.mjs            # TOTP generation
+└── login_instructions.txt       # Login documentation
+scripts/                         # Utility scripts
+├── reconcile-session.js         # Manual reconciliation (diagnostics)
+└── export-metrics.js            # Export metrics to CSV/JSON
+deliverables/                    # Output directory (in target repo)
+docs/                            # Documentation
+├── unified-audit-system-design.md
+└── migration-guide.md
 ```

 ## Troubleshooting
@@ -275,4 +327,18 @@ deliverables/              # Output directory
 Missing tools can be skipped using `--pipeline-testing` mode during development:
 - `nmap` - Network scanning
 - `subfinder` - Subdomain discovery
-- `whatweb` - Web technology detection  
+- `whatweb` - Web technology detection
+
+### Diagnostic & Utility Scripts
+```bash
+# Manual reconciliation (for diagnostics only)
+./scripts/reconcile-session.js --session-id <id> --dry-run --verbose
+
+# Export metrics to CSV/JSON
+./scripts/export-metrics.js --session-id <id> --format csv --output metrics.csv
+
+# System-wide consistency audit
+./scripts/reconcile-session.js --all-sessions --dry-run
+```
+
+Note: Manual reconciliation should rarely be needed. Frequent use indicates bugs in automatic reconciliation.
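
Note on the Crash Safety bullets added to CLAUDE.md above: the sketch below shows one common way to get append-only logging with immediate flush and atomic writes for a JSON file, using only Node.js built-ins. It is a sketch under assumptions, not the code in `src/audit/logger.js` or `src/audit/utils.js`. The function names, the temp-file naming, and the demo paths are all hypothetical.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: append one JSON event per line and flush to disk
// before returning, so a killed process loses at most the event in flight.
function appendEvent(logPath, event) {
  const fd = fs.openSync(logPath, 'a');
  try {
    fs.writeSync(fd, JSON.stringify(event) + '\n');
    fs.fsyncSync(fd);
  } finally {
    fs.closeSync(fd);
  }
}

// Hypothetical helper: write to a temp file in the same directory, flush it,
// then rename over the target. Readers see the old file or the new one,
// never a partially written one.
function atomicWriteJson(filePath, data) {
  const tmpPath = `${filePath}.${process.pid}.tmp`;
  const fd = fs.openSync(tmpPath, 'w');
  try {
    fs.writeSync(fd, JSON.stringify(data, null, 2));
    fs.fsyncSync(fd);
  } finally {
    fs.closeSync(fd);
  }
  fs.renameSync(tmpPath, filePath);
}

// Demo in a throwaway temp directory (not the real audit-logs/ layout).
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'audit-demo-'));
const logPath = path.join(dir, 'agent.log');
const sessionPath = path.join(dir, 'session.json');

appendEvent(logPath, { type: 'tool_start', tool: 'nmap' });
appendEvent(logPath, { type: 'tool_end', tool: 'nmap' });
atomicWriteJson(sessionPath, { id: 'demo', status: 'in-progress' });

const events = fs
  .readFileSync(logPath, 'utf8')
  .trim()
  .split('\n')
  .map((line) => JSON.parse(line));
console.log(events.length, fs.readdirSync(dir).sort());
```

The temp file lives next to the target because a rename is only atomic within a single filesystem. The per-event `fsync` trades throughput for durability, which may or may not match the trade-off the real logger makes.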