feat: implement unified audit system v3.0 with crash-safety and self-healing

## Unified Audit System (v3.0)
- Implemented crash-safe, append-only logging to audit-logs/{hostname}_{sessionId}/
- Added session.json with comprehensive metrics (timing, cost, attempts)
- Agent execution logs with turn-by-turn detail
- Prompt snapshots saved to audit-logs/.../prompts/{agent}.md
- SessionMutex prevents race conditions during parallel execution
- Self-healing reconciliation before every CLI command

## Session Metadata Standardization
- Fixed critical bug: standardized on 'id' field (not 'sessionId') throughout codebase
- Updated: shannon.mjs (recon, report), src/phases/pre-recon.js
- Added validation in AuditSession to fail fast on incorrect field usage
- JavaScript shorthand syntax was causing wrong field names

## Schema Improvements
- session.json: Added cost_usd per phase, removed redundant final_cost_usd
- Renamed 'percentage' -> 'duration_percentage' for clarity
- Simplified agent metrics to single total_cost_usd field
- Removed unused validation object from schema

## Legacy System Removal
- Removed savePromptSnapshot() - prompts now only saved by audit system
- Removed target repo pollution (prompt-snapshots/ no longer created)
- Single source of truth: audit-logs/{hostname}_{sessionId}/prompts/

## Export Script Simplification
- Removed JSON export mode (session.json already exists)
- CSV-only export with clean columns: agent, phase, status, attempts, duration_ms, cost_usd
- Tested on real session data

## Documentation
- Updated CLAUDE.md with audit system architecture
- Added .gitignore entry for audit-logs/

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
ajmallesh
2025-10-22 16:09:08 -07:00
parent a9e00ca19f
commit 27334a4dd6
18 changed files with 1871 additions and 206 deletions
+24 -1
View File
@@ -1,7 +1,7 @@
import chalk from 'chalk';
import {
selectSession, deleteSession, deleteAllSessions,
validateAgent, validatePhase
validateAgent, validatePhase, reconcileSession
} from '../session-manager.js';
import {
runPhase, runAll, rollbackTo, rerunAgent, displayStatus, listAgents
@@ -94,6 +94,29 @@ export async function handleDeveloperCommand(command, args, pipelineTestingMode,
process.exit(1);
}
// Self-healing: Reconcile session with audit logs before executing command
// This ensures Shannon store is consistent with audit data, even after crash recovery
try {
const reconcileReport = await reconcileSession(session.id);
if (reconcileReport.promotions.length > 0) {
console.log(chalk.blue(`🔄 Reconciled: Added ${reconcileReport.promotions.length} completed agents from audit logs`));
}
if (reconcileReport.demotions.length > 0) {
console.log(chalk.yellow(`🔄 Reconciled: Removed ${reconcileReport.demotions.length} rolled-back agents`));
}
if (reconcileReport.failures.length > 0) {
console.log(chalk.yellow(`🔄 Reconciled: Marked ${reconcileReport.failures.length} failed agents`));
}
// Reload session after reconciliation to get fresh state
const { getSession } = await import('../session-manager.js');
session = await getSession(session.id);
} catch (error) {
// Reconciliation failure is non-critical, but log warning
console.log(chalk.yellow(`⚠️ Failed to reconcile session with audit logs: ${error.message}`));
}
switch (command) {
case '--run-phase':