feat: implement unified audit system v3.0 with crash-safety and self-healing

## Unified Audit System (v3.0)
- Implemented crash-safe, append-only logging to audit-logs/{hostname}_{sessionId}/
- Added session.json with comprehensive metrics (timing, cost, attempts)
- Agent execution logs with turn-by-turn detail
- Prompt snapshots saved to audit-logs/.../prompts/{agent}.md
- SessionMutex prevents race conditions during parallel execution
- Self-healing reconciliation before every CLI command

## Session Metadata Standardization
- Fixed critical bug: standardized on 'id' field (not 'sessionId') throughout codebase
- Updated: shannon.mjs (recon, report), src/phases/pre-recon.js
- Added validation in AuditSession to fail fast on incorrect field usage
- Root cause: JavaScript object shorthand syntax (e.g. `{ sessionId }`) was producing the wrong field name
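
To illustrate the shorthand pitfall and the fail-fast check described above, here is a minimal sketch. The function name `validateSessionMetadata` and the error messages are hypothetical; the actual validation inside AuditSession is not shown in this commit message.

```javascript
// Hypothetical validator: rejects metadata that uses 'sessionId' instead of 'id'.
// This is an illustration, not the project's AuditSession code.
function validateSessionMetadata(session) {
  if (session === null || typeof session !== 'object') {
    throw new TypeError('session metadata must be an object');
  }
  if ('sessionId' in session) {
    throw new Error("session metadata must use 'id', not 'sessionId'");
  }
  if (typeof session.id !== 'string' || session.id.length === 0) {
    throw new Error("session metadata requires a non-empty string 'id'");
  }
  return session;
}

const sessionId = 'abc123';

// Object shorthand copies the variable name into the key: { sessionId: 'abc123' }.
const wrong = { sessionId };

// An explicit key produces the standardized field: { id: 'abc123' }.
const right = { id: sessionId };
```

Calling `validateSessionMetadata(wrong)` throws immediately, while `validateSessionMetadata(right)` returns the object unchanged.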

## Schema Improvements
- session.json: Added cost_usd per phase, removed redundant final_cost_usd
- Renamed 'percentage' -> 'duration_percentage' for clarity
- Simplified agent metrics to single total_cost_usd field
- Removed unused validation object from schema
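
One plausible reading of `duration_percentage` is each phase's share of total run time. The sketch below assumes that reading and an input shape of `{ phaseName: { duration_ms, cost_usd } }`; neither the shape nor the helper name `withDurationPercentage` is confirmed by this commit.

```javascript
// Assumed input shape: { phaseName: { duration_ms, cost_usd } }.
// This is not the project's confirmed session.json schema.
function withDurationPercentage(phases) {
  const totalMs = Object.values(phases).reduce((sum, p) => sum + p.duration_ms, 0);
  const result = {};
  for (const [name, phase] of Object.entries(phases)) {
    result[name] = {
      duration_ms: phase.duration_ms,
      // Share of total run time, rounded to one decimal place.
      duration_percentage:
        totalMs === 0 ? 0 : Math.round((phase.duration_ms / totalMs) * 1000) / 10,
      cost_usd: phase.cost_usd,
    };
  }
  return result;
}
```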

## Legacy System Removal
- Removed savePromptSnapshot() - prompts now only saved by audit system
- Removed target repo pollution (prompt-snapshots/ no longer created)
- Single source of truth: audit-logs/{hostname}_{sessionId}/prompts/

## Export Script Simplification
- Removed JSON export mode (session.json already exists)
- CSV-only export with clean columns: agent, phase, status, attempts, duration_ms, cost_usd
- Tested on real session data
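
A CSV export with the columns listed above could look roughly like the following. This is a sketch, not the contents of the export script: the helper names `csvEscape` and `toCsv` are hypothetical, and the input is assumed to be an array of flat row objects keyed by column name.

```javascript
// Column order taken from the commit message.
const COLUMNS = ['agent', 'phase', 'status', 'attempts', 'duration_ms', 'cost_usd'];

// Quote a field only when it contains a comma, quote, or line break (RFC 4180 style).
function csvEscape(value) {
  const text = value === undefined || value === null ? '' : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Hypothetical helper: turn row objects into CSV text with a header line.
function toCsv(rows) {
  const lines = [COLUMNS.join(',')];
  for (const row of rows) {
    lines.push(COLUMNS.map((col) => csvEscape(row[col])).join(','));
  }
  return lines.join('\n') + '\n';
}
```

Missing fields are written as empty cells, so a failed attempt with no recorded cost still produces a well-formed row.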

## Documentation
- Updated CLAUDE.md with audit system architecture
- Added .gitignore entry for audit-logs/

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
ajmallesh
2025-10-22 16:09:08 -07:00
parent b8dc33f180
commit 3babf02d68
18 changed files with 1871 additions and 206 deletions

CLAUDE.md (138 changes)

@@ -188,17 +188,47 @@ The agent implements a sophisticated checkpoint system using git:
- Every agent creates a git checkpoint before execution
- Rollback to any previous agent state using `--rollback-to` or `--rerun`
- Failed agents don't affect completed work
- Timing and cost data cleaned up during rollbacks
- Rolled-back agents marked in audit system with status: "rolled-back"
- Reconciliation automatically syncs Shannon store with audit logs after rollback
- Fail-fast safety prevents accidental re-execution of completed agents
### Timing & Performance Monitoring
The agent includes comprehensive timing instrumentation that tracks:
- Total execution time
- Phase-level timing breakdown
- Individual command execution times
- Claude Code agent processing times
- Cost tracking for AI agent usage
### Unified Audit & Metrics System
The agent implements a crash-safe, self-healing audit system (v3.0) with the following guarantees:
**Architecture:**
- **audit-logs/**: Centralized metrics and forensic logs (source of truth)
- `{hostname}_{sessionId}/session.json` - Comprehensive metrics with attempt-level detail
- `{hostname}_{sessionId}/prompts/` - Exact prompts used for reproducibility
- `{hostname}_{sessionId}/agents/` - Turn-by-turn execution logs
- **.shannon-store.json**: Minimal orchestration state (completedAgents, checkpoints)
**Crash Safety:**
- Append-only logging with immediate flush (survives kill -9)
- Atomic writes for session.json (no partial writes)
- Event-based logging (tool_start, tool_end, llm_response) closes data loss windows
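
The two crash-safety techniques named above can be sketched with Node's built-in `fs` module. This is an illustration under assumptions, not the code in `src/audit/logger.js` or `src/audit/utils.js`; the function names `appendEvent` and `atomicWriteJson` and the one-JSON-object-per-line format are hypothetical.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Append one JSON event per line, then fsync so the data reaches the disk
// before the function returns.
function appendEvent(logPath, event) {
  const fd = fs.openSync(logPath, 'a');
  try {
    fs.writeSync(fd, JSON.stringify(event) + '\n');
    fs.fsyncSync(fd);
  } finally {
    fs.closeSync(fd);
  }
}

// Write to a temporary file in the same directory, fsync it, then rename it
// over the target. A same-filesystem rename replaces the file in one step,
// so readers see either the old content or the new content, never a partial file.
function atomicWriteJson(targetPath, data) {
  const tmpPath = path.join(
    path.dirname(targetPath),
    `.${path.basename(targetPath)}.${process.pid}.tmp`
  );
  const fd = fs.openSync(tmpPath, 'w');
  try {
    fs.writeSync(fd, JSON.stringify(data, null, 2));
    fs.fsyncSync(fd);
  } finally {
    fs.closeSync(fd);
  }
  fs.renameSync(tmpPath, targetPath);
}
```

Keeping the temporary file next to the target matters: a rename across filesystems is not atomic and can fail outright.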
**Self-Healing:**
- Automatic reconciliation before every CLI command
- Recovers from crashes during rollback
- Audit logs are source of truth; Shannon store follows
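
The "audit logs are source of truth; Shannon store follows" rule can be pictured as a pure function that rebuilds the store's `completedAgents` from audit data. This is a sketch under assumptions: the function name `reconcile`, the audit input shape, and the `'success'` status value are hypothetical; only `completedAgents` and the `"rolled-back"` status appear in the source.

```javascript
// Hypothetical reconciliation step.
// store:       { completedAgents: string[], ... }
// auditAgents: { [agentName]: { status: string } }  (assumed shape)
function reconcile(store, auditAgents) {
  // Only agents the audit log records as successful count as completed.
  const shouldBeCompleted = Object.entries(auditAgents)
    .filter(([, agent]) => agent.status === 'success')
    .map(([name]) => name)
    .sort();

  const current = [...store.completedAgents].sort();
  const added = shouldBeCompleted.filter((name) => !current.includes(name));
  const removed = current.filter((name) => !shouldBeCompleted.includes(name));

  // Return a new store object; the caller decides whether to persist it.
  return {
    store: { ...store, completedAgents: shouldBeCompleted },
    added,
    removed,
  };
}
```

Returning `added` and `removed` separately makes a dry run straightforward: report the differences without writing the new store.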
**Forensic Completeness:**
- All retry attempts logged with errors, costs, durations
- Rolled-back agents preserved with status: "rolled-back"
- Partial cost capture for failed attempts
- Complete event trail for debugging
**Concurrency Safety:**
- SessionMutex prevents race conditions during parallel agent execution
- Safe parallel execution of vulnerability and exploitation phases
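
One common way to serialize updates per session in JavaScript is a promise chain keyed by session id. The class below is a minimal sketch of that general technique, not the project's SessionMutex; the name `KeyedMutex` and the method `runExclusive` are hypothetical.

```javascript
// Promise-chain mutex: tasks for the same key run one at a time, in call order.
class KeyedMutex {
  constructor() {
    this.tails = new Map(); // key -> promise that settles when the last queued task finishes
  }

  runExclusive(key, task) {
    const previous = this.tails.get(key) ?? Promise.resolve();
    // Start the task only after the previous one for this key has settled.
    const result = previous.then(() => task());
    // The stored tail never rejects, so one failed task does not block later ones.
    const tail = result.catch(() => {});
    this.tails.set(key, tail);
    // Drop the entry once the queue for this key is drained.
    tail.then(() => {
      if (this.tails.get(key) === tail) this.tails.delete(key);
    });
    return result;
  }
}
```

With this shape, two agents running in parallel could each wrap their read-modify-write of session metrics in `mutex.runExclusive(sessionId, ...)` so the updates apply one after the other.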
**Metrics & Reporting:**
- Export metrics with `./scripts/export-metrics.js`
- Manual reconciliation (diagnostics) with `./scripts/reconcile-session.js`
- Phase-level and agent-level timing/cost aggregations
- Validation results integrated with metrics
For detailed design, see `docs/unified-audit-system-design.md`.
## Development Notes
@@ -232,34 +262,56 @@ The tool should only be used on systems you own or have explicit permission to t
## File Structure
```
shannon.mjs # Main orchestration script
package.json # Node.js dependencies
src/ # Core modules
├── config-parser.js # Configuration handling
├── error-handling.js # Error management
├── tool-checker.js # Tool validation
├── session-manager.js # Session state management
├── checkpoint-manager.js # Git-based checkpointing
├── queue-validation.js # Deliverable validation
shannon.mjs # Main orchestration script
package.json # Node.js dependencies
.shannon-store.json # Orchestration state (minimal)
src/ # Core modules
├── audit/ # Unified audit system (v3.0)
│ ├── index.js # Public API
│ ├── audit-session.js # Main facade (logger + metrics + mutex)
│ ├── logger.js # Append-only crash-safe logging
│ ├── metrics-tracker.js # Timing, cost, attempt tracking
│ └── utils.js # Path generation, atomic writes
├── config-parser.js # Configuration handling
├── error-handling.js # Error management
├── tool-checker.js # Tool validation
├── session-manager.js # Session state + reconciliation
├── checkpoint-manager.js # Git-based checkpointing + rollback
├── queue-validation.js # Deliverable validation
├── ai/
│ └── claude-executor.js # Claude Code SDK integration
└── utils/
configs/ # Configuration files
├── config-schema.json # JSON Schema validation
├── example-config.yaml # Template configuration
├── juice-shop-config.yaml # Juice Shop example
├── keygraph-config.yaml # Keygraph configuration
├── chatwoot-config.yaml # Chatwoot configuration
├── metabase-config.yaml # Metabase configuration
└── cal-com-config.yaml # Cal.com configuration
prompts/ # AI prompt templates
├── pre-recon-code.txt # Code analysis
├── recon.txt # Reconnaissance
├── vuln-*.txt # Vulnerability assessment
├── exploit-*.txt # Exploitation
└── report-executive.txt # Executive reporting
login_resources/ # Authentication utilities
├── generate-totp.mjs # TOTP generation
└── login_instructions.txt # Login documentation
deliverables/ # Output directory
audit-logs/ # Centralized audit data (v3.0)
└── {hostname}_{sessionId}/
    ├── session.json # Comprehensive metrics
    ├── prompts/ # Prompt snapshots
    │   └── {agent}.md
    └── agents/ # Agent execution logs
        └── {timestamp}_{agent}_attempt-{N}.log
configs/ # Configuration files
├── config-schema.json # JSON Schema validation
├── example-config.yaml # Template configuration
├── juice-shop-config.yaml # Juice Shop example
├── keygraph-config.yaml # Keygraph configuration
├── chatwoot-config.yaml # Chatwoot configuration
├── metabase-config.yaml # Metabase configuration
└── cal-com-config.yaml # Cal.com configuration
prompts/ # AI prompt templates
├── pre-recon-code.txt # Code analysis
├── recon.txt # Reconnaissance
├── vuln-*.txt # Vulnerability assessment
├── exploit-*.txt # Exploitation
└── report-executive.txt # Executive reporting
login_resources/ # Authentication utilities
├── generate-totp.mjs # TOTP generation
└── login_instructions.txt # Login documentation
scripts/ # Utility scripts
├── reconcile-session.js # Manual reconciliation (diagnostics)
└── export-metrics.js # Export metrics to CSV/JSON
deliverables/ # Output directory (in target repo)
docs/ # Documentation
├── unified-audit-system-design.md
└── migration-guide.md
```
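
The `audit-logs/` layout above maps naturally onto a few path helpers. The sketch below is illustrative only: the function names are hypothetical, and because the timestamp format is not specified in this diff, it is passed in as an opaque string.

```javascript
const path = require('node:path');

// audit-logs/{hostname}_{sessionId}/
function sessionDir(baseDir, hostname, sessionId) {
  return path.join(baseDir, `${hostname}_${sessionId}`);
}

// audit-logs/{hostname}_{sessionId}/session.json
function sessionJsonPath(baseDir, hostname, sessionId) {
  return path.join(sessionDir(baseDir, hostname, sessionId), 'session.json');
}

// audit-logs/{hostname}_{sessionId}/prompts/{agent}.md
function promptPath(baseDir, hostname, sessionId, agent) {
  return path.join(sessionDir(baseDir, hostname, sessionId), 'prompts', `${agent}.md`);
}

// audit-logs/{hostname}_{sessionId}/agents/{timestamp}_{agent}_attempt-{N}.log
// The timestamp format is an assumption left to the caller.
function agentLogPath(baseDir, hostname, sessionId, agent, attempt, timestamp) {
  return path.join(
    sessionDir(baseDir, hostname, sessionId),
    'agents',
    `${timestamp}_${agent}_attempt-${attempt}.log`
  );
}
```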
## Troubleshooting
@@ -275,4 +327,18 @@ deliverables/ # Output directory
Missing tools can be skipped using `--pipeline-testing` mode during development:
- `nmap` - Network scanning
- `subfinder` - Subdomain discovery
- `whatweb` - Web technology detection
### Diagnostic & Utility Scripts
```bash
# Manual reconciliation (for diagnostics only)
./scripts/reconcile-session.js --session-id <id> --dry-run --verbose
# Export metrics to CSV/JSON
./scripts/export-metrics.js --session-id <id> --format csv --output metrics.csv
# System-wide consistency audit
./scripts/reconcile-session.js --all-sessions --dry-run
```
Note: Manual reconciliation should rarely be needed. Frequent use indicates bugs in automatic reconciliation.