shannon

mirror of https://github.com/KeygraphHQ/shannon.git synced 2026-07-23 13:01:00 +02:00

Author	SHA1	Message	Date
Arjun Malleswaran	378ed824ad	Feat/temporal (#52 ) * refactor: modularize claude-executor and extract shared utilities - Extract message handling into src/ai/message-handlers.ts with pure functions - Extract output formatting into src/ai/output-formatters.ts - Extract progress management into src/ai/progress-manager.ts - Add audit-logger.ts with Null Object pattern for optional logging - Add shared utilities: formatting.ts, file-io.ts, functional.ts - Consolidate getPromptNameForAgent into src/types/agents.ts * feat: add Claude Code custom commands for debug and review * feat: add Temporal integration foundation (phase 1-2) - Add Temporal SDK dependencies (@temporalio/client, worker, workflow, activity) - Add shared types for pipeline state, metrics, and progress queries - Add classifyErrorForTemporal() for retry behavior classification - Add docker-compose for Temporal server with SQLite persistence * feat: add Temporal activities for agent execution (phase 3) - Add activities.ts with heartbeat loop, git checkpoint/rollback, and error classification - Export runClaudePrompt, validateAgentOutput, ClaudePromptResult for Temporal use - Track attempt number via Temporal Context for accurate audit logging - Rollback git workspace before retry to ensure clean state * feat: add Temporal workflow for 5-phase pipeline orchestration (phase 4) * feat: add Temporal worker, client, and query tools (phase 5) - Add worker.ts with workflow bundling and graceful shutdown - Add client.ts CLI to start pipelines with progress polling - Add query.ts CLI to inspect running workflow state - Fix buffer overflow by truncating error messages and stack traces - Skip git operations gracefully on non-git repositories - Add kill.sh/start.sh dev scripts and Dockerfile.worker * feat: fix Docker worker container setup - Install uv instead of deprecated uvx package - Add mcp-server and configs directories to container - Mount target repo dynamically via TARGET_REPO env variable * fix: add report assembly step to Temporal workflow - Add assembleReportActivity to concatenate exploitation evidence files before report agent runs - Call assembleFinalReport in workflow Phase 5 before runReportAgent - Ensure deliverables directory exists before writing final report - Simplify pipeline-testing report prompt to just prepend header * refactor: consolidate Docker setup to root docker-compose.yml * feat: improve Temporal client UX and env handling - Change default to fire-and-forget (--wait flag to opt-in) - Add splash screen and improve console output formatting - Add .env to gitignore, remove from dockerignore for container access - Add Taskfile for common development commands * refactor: simplify session ID handling and improve Taskfile options - Include hostname in workflow ID for better audit log organization - Extract sanitizeHostname utility to audit/utils.ts for reuse - Remove unused generateSessionLogPath and buildLogFilePath functions - Simplify Taskfile with CONFIG/OUTPUT/CLEAN named parameters * chore: add .env.example and simplify .gitignore * docs: update README and CLAUDE.md for Temporal workflow usage - Replace Docker CLI instructions with Task-based commands - Add monitoring/stopping sections and workflow examples - Document Temporal orchestration layer and troubleshooting - Simplify file structure to key files overview * refactor: replace Taskfile with bash CLI script - Add shannon bash script with start/logs/query/stop/help commands - Remove Taskfile.yml dependency (no longer requires Task installation) - Update README.md and CLAUDE.md to use ./shannon commands - Update client.ts output to show ./shannon commands * docs: fix deliverable filename in README * refactor: remove direct CLI and .shannon-store.json in favor of Temporal - Delete src/shannon.ts direct CLI entry point (Temporal is now the only mode) - Remove .shannon-store.json session lock (Temporal handles workflow deduplication) - Remove broken scripts/export-metrics.js (imported non-existent function) - Update package.json to remove main, start script, and bin entry - Clean up CLAUDE.md and debug.md to remove obsolete references * chore: remove licensing comments from prompt files to prevent leaking into actual prompts * fix: resolve parallel workflow race conditions and retry logic bugs - Fix save_deliverable race condition using closure pattern instead of global variable - Fix error classification order so OutputValidationError matches before generic validation - Fix ApplicationFailure re-classification bug by checking instanceof before re-throwing - Add per-error-type retry limits (3 for output validation, 50 for billing) - Add fast retry intervals for pipeline testing mode (10s vs 5min) - Increase worker concurrent activities to 25 for parallel workflows * refactor: pipeline vuln→exploit workflow for parallel execution - Replace sync barrier between vuln/exploit phases with independent pipelines - Each vuln type runs: vuln agent → queue check → conditional exploit - Add checkExploitationQueue activity to skip exploits when no vulns found - Use Promise.allSettled for graceful failure handling across pipelines - Add PipelineSummary type for aggregated cost/duration/turns metrics * fix: re-throw retryable errors in checkExploitationQueue * fix: detect and retry on Claude Code spending cap errors - Add spending cap pattern detection in detectApiError() with retryable error - Add matching patterns to classifyErrorForTemporal() for proper Temporal retry - Add defense-in-depth safeguard in runClaudePrompt() for $0 cost / low turn detection - Add final sanity check in activities before declaring success * fix: increase heartbeat timeout to prevent false worker-dead detection Original 30s timeout was from POC spec assuming <5min activities. With hour-long activities and multiple concurrent workflows sharing one worker, resource contention causes event loop stalls exceeding 30s, triggering false heartbeat timeouts. Increased to 10min (prod) and 5min (testing). * fix: temporal db init * fix: persist home dir * feat: add per-workflow unified logging with ./shannon logs ID=<workflow-id> - Add WorkflowLogger class for human-readable, per-workflow log files - Create workflow.log in audit-logs/{workflowId}/ with phase, agent, tool, and LLM events - Update ./shannon logs to require ID param and tail specific workflow log - Add phase transition logging at workflow boundaries - Include workflow completion summary with agent breakdown (duration, cost) - Mount audit-logs volume in docker-compose for host access * feat: configurable OUTPUT directory with auto-discovery - Add OUTPUT=<path> option to write reports to custom directory - Mount custom output dir as volume for container-to-host persistence - Auto-discover workflow logs regardless of output path used - Display host output path in workflow start message - Add ASCII splash screen to ./shannon help --------- Co-authored-by: ezl-keygraph <ezhil@keygraph.io>	2026-01-15 11:30:46 -08:00
Arjun Malleswaran	78a0a61208	Feat/temporal (#46 ) * refactor: modularize claude-executor and extract shared utilities - Extract message handling into src/ai/message-handlers.ts with pure functions - Extract output formatting into src/ai/output-formatters.ts - Extract progress management into src/ai/progress-manager.ts - Add audit-logger.ts with Null Object pattern for optional logging - Add shared utilities: formatting.ts, file-io.ts, functional.ts - Consolidate getPromptNameForAgent into src/types/agents.ts * feat: add Claude Code custom commands for debug and review * feat: add Temporal integration foundation (phase 1-2) - Add Temporal SDK dependencies (@temporalio/client, worker, workflow, activity) - Add shared types for pipeline state, metrics, and progress queries - Add classifyErrorForTemporal() for retry behavior classification - Add docker-compose for Temporal server with SQLite persistence * feat: add Temporal activities for agent execution (phase 3) - Add activities.ts with heartbeat loop, git checkpoint/rollback, and error classification - Export runClaudePrompt, validateAgentOutput, ClaudePromptResult for Temporal use - Track attempt number via Temporal Context for accurate audit logging - Rollback git workspace before retry to ensure clean state * feat: add Temporal workflow for 5-phase pipeline orchestration (phase 4) * feat: add Temporal worker, client, and query tools (phase 5) - Add worker.ts with workflow bundling and graceful shutdown - Add client.ts CLI to start pipelines with progress polling - Add query.ts CLI to inspect running workflow state - Fix buffer overflow by truncating error messages and stack traces - Skip git operations gracefully on non-git repositories - Add kill.sh/start.sh dev scripts and Dockerfile.worker * feat: fix Docker worker container setup - Install uv instead of deprecated uvx package - Add mcp-server and configs directories to container - Mount target repo dynamically via TARGET_REPO env variable * fix: add report assembly step to Temporal workflow - Add assembleReportActivity to concatenate exploitation evidence files before report agent runs - Call assembleFinalReport in workflow Phase 5 before runReportAgent - Ensure deliverables directory exists before writing final report - Simplify pipeline-testing report prompt to just prepend header * refactor: consolidate Docker setup to root docker-compose.yml * feat: improve Temporal client UX and env handling - Change default to fire-and-forget (--wait flag to opt-in) - Add splash screen and improve console output formatting - Add .env to gitignore, remove from dockerignore for container access - Add Taskfile for common development commands * refactor: simplify session ID handling and improve Taskfile options - Include hostname in workflow ID for better audit log organization - Extract sanitizeHostname utility to audit/utils.ts for reuse - Remove unused generateSessionLogPath and buildLogFilePath functions - Simplify Taskfile with CONFIG/OUTPUT/CLEAN named parameters * chore: add .env.example and simplify .gitignore * docs: update README and CLAUDE.md for Temporal workflow usage - Replace Docker CLI instructions with Task-based commands - Add monitoring/stopping sections and workflow examples - Document Temporal orchestration layer and troubleshooting - Simplify file structure to key files overview * refactor: replace Taskfile with bash CLI script - Add shannon bash script with start/logs/query/stop/help commands - Remove Taskfile.yml dependency (no longer requires Task installation) - Update README.md and CLAUDE.md to use ./shannon commands - Update client.ts output to show ./shannon commands * docs: fix deliverable filename in README * refactor: remove direct CLI and .shannon-store.json in favor of Temporal - Delete src/shannon.ts direct CLI entry point (Temporal is now the only mode) - Remove .shannon-store.json session lock (Temporal handles workflow deduplication) - Remove broken scripts/export-metrics.js (imported non-existent function) - Update package.json to remove main, start script, and bin entry - Clean up CLAUDE.md and debug.md to remove obsolete references * chore: remove licensing comments from prompt files to prevent leaking into actual prompts * fix: resolve parallel workflow race conditions and retry logic bugs - Fix save_deliverable race condition using closure pattern instead of global variable - Fix error classification order so OutputValidationError matches before generic validation - Fix ApplicationFailure re-classification bug by checking instanceof before re-throwing - Add per-error-type retry limits (3 for output validation, 50 for billing) - Add fast retry intervals for pipeline testing mode (10s vs 5min) - Increase worker concurrent activities to 25 for parallel workflows * refactor: pipeline vuln→exploit workflow for parallel execution - Replace sync barrier between vuln/exploit phases with independent pipelines - Each vuln type runs: vuln agent → queue check → conditional exploit - Add checkExploitationQueue activity to skip exploits when no vulns found - Use Promise.allSettled for graceful failure handling across pipelines - Add PipelineSummary type for aggregated cost/duration/turns metrics * fix: re-throw retryable errors in checkExploitationQueue * fix: detect and retry on Claude Code spending cap errors - Add spending cap pattern detection in detectApiError() with retryable error - Add matching patterns to classifyErrorForTemporal() for proper Temporal retry - Add defense-in-depth safeguard in runClaudePrompt() for $0 cost / low turn detection - Add final sanity check in activities before declaring success * fix: increase heartbeat timeout to prevent false worker-dead detection Original 30s timeout was from POC spec assuming <5min activities. With hour-long activities and multiple concurrent workflows sharing one worker, resource contention causes event loop stalls exceeding 30s, triggering false heartbeat timeouts. Increased to 10min (prod) and 5min (testing). * fix: temporal db init * fix: persist home dir * feat: add per-workflow unified logging with ./shannon logs ID=<workflow-id> - Add WorkflowLogger class for human-readable, per-workflow log files - Create workflow.log in audit-logs/{workflowId}/ with phase, agent, tool, and LLM events - Update ./shannon logs to require ID param and tail specific workflow log - Add phase transition logging at workflow boundaries - Include workflow completion summary with agent breakdown (duration, cost) - Mount audit-logs volume in docker-compose for host access --------- Co-authored-by: ezl-keygraph <ezhil@keygraph.io>	2026-01-15 10:36:11 -08:00
ezl-keygraph	bc52d67dd5	refactor: remove orchestration layer (#45 ) * refactor: remove orchestration layer and simplify CLI Remove the complex orchestration layer including checkpoint management, rollback/recovery commands, and session management commands. This consolidates the execution logic directly in shannon.ts for a simpler fire-and-forget execution model. Changes: - Remove checkpoint-manager.ts and rollback functionality - Remove command-handler.ts and cli/prompts.ts - Simplify session-manager.ts to just agent definitions - Consolidate orchestration logic in shannon.ts - Update CLAUDE.md documentation Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * refactor: move session lock logic to shannon.ts, simplify session-manager - Reduce session-manager.ts to only AGENTS, AGENT_ORDER, getParallelGroups() - Move Session interface and lock file functions to shannon.ts - Simplify Session to only: id, webUrl, repoPath, status, startedAt - Remove unused types/session.ts Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * refactor: use crypto.randomUUID() for session ID generation Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-12 22:58:17 +05:30
ezl-keygraph	264b16991a	feat: add configurable output directory with --output flag (#41 ) * feat: add configurable output directory with --output flag Add --output CLI flag to specify custom output directory for session folders containing audit logs, prompts, agent logs, and deliverables. Changes: - Add --output <path> CLI flag parsing - Update generateAuditPath() to use custom path when provided - Add consolidateOutputs() to copy deliverables to session folder - Update Docker examples with volume mounts for output directories - Default remains ./audit-logs/ when --output is not specified 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * feat: add configurable output directory with --output flag Add --output CLI flag to specify custom output directory for session folders containing audit logs, prompts, agent logs, and deliverables. Changes: - Add --output <path> CLI flag parsing - Store outputPath in Session interface for persistence - Update generateAuditPath() to use custom path when provided - Pass outputPath through pre-recon and checkpoint-manager - Add consolidateOutputs() to copy deliverables to session folder - Update Docker examples with volume mount instructions - Default remains ./audit-logs/ when --output is not specified 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * chore: add gitkeep and fix formatting * fix: correct docker run command formatting in README Remove invalid inline comments after backslash continuations in docker run commands. Comments cannot appear after backslash line continuations in shell scripts, as the backslash escapes the newline character. Reorganized comments to appear on separate lines before or after the command block for better clarity and proper shell syntax. Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>	2026-01-08 23:50:42 +05:30
ezl-keygraph	dd18f4629b	feat: typescript migration (#40 ) * chore: initialize TypeScript configuration and build setup - Add tsconfig.json for root and mcp-server with strict type checking - Install typescript and @types/node as devDependencies - Add npm build script for TypeScript compilation - Update main entrypoint to compiled dist/shannon.js - Update Dockerfile to build TypeScript before running - Configure output directory and module resolution for Node.js * refactor: migrate codebase from JavaScript to TypeScript - Convert all 37 JavaScript files to TypeScript (.js -> .ts) - Add type definitions in src/types/ for agents, config, errors, session - Update mcp-server with proper TypeScript types - Move entry point from shannon.mjs to src/shannon.ts - Update tsconfig.json with rootDir: "./src" for cleaner dist output - Update Dockerfile to build TypeScript before runtime - Update package.json paths to use compiled dist/shannon.js No runtime behavior changes - pure type safety migration. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * docs: update CLI references from ./shannon.mjs to shannon - Update help text in src/cli/ui.ts - Update usage examples in src/cli/command-handler.ts - Update setup message in src/shannon.ts - Update CLAUDE.md documentation with TypeScript file structure - Replace all ./shannon.mjs references with shannon command 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * chore: remove unnecessary eslint-disable comments ESLint is not configured in this project, making these comments redundant. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-08 00:18:25 +05:30
ajmallesh	1e784e650d	fix: support absolute config paths in checkpoint manager Co-Authored-By: Khaushik-keygraph <khaushik.contractor@keygraph.io>	2025-12-15 10:34:25 -08:00
Khaushik-keygraph	38e49eb1eb	chore: added flag additions for minimizing logs	2025-12-09 23:59:12 +05:30
ajmallesh	534b18e303	chore: change license to AGPL-3.0	2025-11-26 18:45:36 -08:00
ajmallesh	1051d40527	chore: add MPL license comments	2025-11-13 16:55:13 +05:30
ajmallesh	939398074f	refactor: update injection display name and add max tokens docs - Change agent prefix from [SQLi/Cmd] to [Injection] to reflect expanded scope - Add README documentation for CLAUDE_CODE_MAX_OUTPUT_TOKENS environment variable This update aligns the display naming with the expanded injection analysis scope that now covers SQLi, Command Injection, LFI/RFI, SSTI, Path Traversal, and Insecure Deserialization vulnerabilities. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-03 10:21:17 -08:00
ajmallesh	9d338ec948	fix: err handling for claude code session limit	2025-10-30 10:28:35 -07:00
ajmallesh	7f3bff9b36	Revert "feat: improve audit log naming with timestamp and app context" This reverts the timestamp-based naming scheme that was causing audit log fragmentation. Each agent execution was creating a new folder because the timestamp kept changing. Reverting back to simple, stable naming: {hostname}_{sessionId} This ensures ONE folder per session, preventing the bug where multiple folders were created for the same session. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-27 13:30:25 -07:00
ajmallesh	95b8e876eb	fix: use session's original createdAt instead of current time Fixed bug where audit system would create duplicate folders for the same session because it was using current time instead of the session's original createdAt timestamp. Bug behavior: - Session created at T1 → folder: {T1}_app_host_id/ - Audit re-initialized at T2 → NEW folder: {T2}_app_host_id/ - Result: 2 folders per session with same ID but different timestamps Root cause: - metrics-tracker.js:65 was calling formatTimestamp() (current time) - Should use sessionMetadata.createdAt (original creation time) Impact: Each running benchmark was creating 2 audit log folders instead of 1 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-27 10:55:53 -07:00
ajmallesh	6e89f26474	feat: improve audit log naming with timestamp and app context Enhances audit log directory naming from `{hostname}_{uuid}` to `{timestamp}_{appName}_{hostname}_{shortId}` for better discoverability and benchmarking analysis. Changes: - Add extractAppName() helper to extract app name from config files - Add smart fallback: use port number for localhost without config - Update generateSessionIdentifier() to include timestamp prefix - Shorten session ID to first 8 characters for readability Examples: - With config: 20251025T193847Z_myapp_localhost_efc60ee0/ - Without config: 20251025T193913Z_8080_localhost_d47e3bfd/ - Remote: 20251024T004401Z_noconfig_example-com_d47e3bfd/ Benefits: - Chronologically sortable audit logs - Instant app identification in directory listings - Efficient filtering for benchmarking queries - Non-breaking: existing logs keep their names 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-27 10:14:19 -07:00
ajmallesh	dcae34af81	fix: enable Playwright MCP browser automation in Docker containers Resolves Playwright browser installation failures in Docker by using Wolfi's system Chromium instead of downloading Playwright's bundled browsers at runtime. ## Problem When running in Docker, agents attempted to install browsers via `browser_install` tool, which failed due to: - Permission issues (non-root user couldn't install system dependencies) - npx @playwright/mcp spawns with its own Playwright dependency separate from global installations - Playwright's bundled browsers require runtime download (~280MB) and glibc deps - Environment variables alone (PLAYWRIGHT_BROWSERS_PATH) weren't sufficient ## Solution Dockerfile changes: - Use Wolfi's native `chromium` package (guaranteed compatible, already installed) - Remove Playwright browser installation step (saves ~280MB and build time) - Add explicit `SHANNON_DOCKER=true` environment variable for reliable detection - Set PLAYWRIGHT_CHROMIUM_EXECUTABLE_PATH to point to system Chromium Code changes (claude-executor.js): - Detect Docker via `process.env.SHANNON_DOCKER` (more reliable than /.dockerenv) - Conditionally add `--executable-path /usr/bin/chromium-browser` CLI arg for Docker - Local: Use Playwright's bundled browsers (downloaded to ~/Library/Caches/) - Docker: Use system Chromium with no runtime downloads ## Research Findings - @playwright/mcp has separate playwright-core dependency (v1.56.0-alpha) - MCP server spawned via npx doesn't inherit browser binaries from global install - --executable-path CLI argument is required (env vars insufficient) - /.dockerenv file is unreliable (missing in BuildKit, K8s, can be spoofed) ## Testing ✅ Docker: All 5 parallel agents successfully navigate, screenshot, create deliverables ✅ Local: All 5 parallel agents successfully navigate, screenshot, create deliverables ✅ No browser_install calls, no permission errors ✅ Image size reduced by ~280MB Fixes #docker-playwright-browser-issues 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-23 17:56:19 -07:00
ajmallesh	d372f87297	refactor: remove ~500 lines of dead code and consolidate duplicates Comprehensive codebase cleanup based on parallel agent analysis and automated dead code detection (knip, depcheck). Reduces codebase by ~10% with zero functional changes. ## Phase 1: Obsolete MCP Setup Removal (~82 lines) - Delete setupMCP() and cleanupMCP() functions from environment.js - Remove all calls to cleanupMCP() (8 instances across 3 files) - Migrate from claude CLI to SDK's mcpServers option - Remove --log flag (obsolete logging system) ## Phase 2: Dead Code Removal (~317 lines) - Delete src/utils/logger.js entirely (127 lines, superseded by audit system) - Remove handleConfigError() and handleError() from error-handling.js - Remove isToolAvailable() from tool-checker.js - Remove 5 dead methods from audit-session.js (logSessionFailure, logMessage, markRolledBack, updateValidation, getValidation) - Remove 6 wrapper methods from audit/logger.js (all callers use logEvent directly) - Remove formatCost(), updateMessage(), compose() utilities (unused) ## Phase 3: Consolidation (~195 lines) - Extract SessionMutex to src/utils/concurrency.js (was duplicated in 2 files) - Consolidate formatDuration to src/audit/utils.js (was in 3 files) - Extract readline prompts to src/cli/prompts.js (was duplicated in 2 files) - Create validator factories in constants.js (reduce 72 lines to 30) ## Impact - Total reduction: 488 lines (20 files modified, 2 created, 1 deleted) - Codebase: ~4,900 → ~4,400 LOC (10% reduction) - Zero functional changes, all tests pass - Improved maintainability and DRY compliance 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-23 17:01:17 -07:00
ajmallesh	369bf29588	refactor: deduplicate prompt templates with shared content system Implemented @include() directive system to eliminate ~800 lines of duplicated content across 10 specialist prompt files. All prompt-related content now consolidated under prompts/ directory for better maintainability. Changes: - Added processIncludes() to prompt-manager.js for generic @include() support - Created prompts/shared/ with 5 reusable template files - Refactored all 10 specialist prompts to use @include() for common sections - Moved login_instructions.txt to prompts/shared/ (deleted login_resources/) - Updated CLAUDE.md to reflect new structure Impact: -137 net lines, zero breaking changes, infinitely scalable for future shared content. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-23 16:19:25 -07:00
ajmallesh	dafd9148f6	chore: remove ~500 lines of dead code identified by knip Remove unused files and exports to improve codebase maintainability: Phase 1 - Deleted files (5): - login_resources/generate-totp-standalone.mjs (replaced by MCP tool) - mcp-server/src/tools/index.js (unused barrel export) - mcp-server/src/utils/index.js (unused barrel export) - mcp-server/src/validation/index.js (unused barrel export) - src/agent-status.js (deprecated 309-line status manager) Phase 2 - Removed unused exports (3): - mcp-server/src/index.js: shannonHelperServer constant - mcp-server/src/utils/error-formatter.js: createFileSystemError function - src/utils/git-manager.js: cleanWorkspace (now internal-only) Phase 3 - Unexported internal functions (4): - src/checkpoint-manager.js: runSingleAgent, runAgentRange, runParallelVuln, runParallelExploit (internal use only) All Shannon CLI commands tested and verified working. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-23 12:46:51 -07:00
ajmallesh	55716963da	feat: migrate to use MCP tools instead of helper scripts	2025-10-23 11:56:47 -07:00
ajmallesh	d6e5db2397	fix: critical bug - exploitation phase was always skipped ROOT CAUSE: - Exploitation phase checked session.validationResults to determine eligibility - validationResults field was removed during audit system refactor - Field never existed in session schema, so all exploits were skipped THE FIX: - Exploitation phase now validates queue files directly when checking eligibility - Reads exploitation_queue.json and checks if vulnerabilities array is non-empty - No need to store validation results - just re-validate on demand CHANGES: 1. runParallelExploit() now calls safeValidateQueueAndDeliverable() directly 2. Removed validationResults parameter from markAgentCompleted() 3. Simplified calculateVulnerabilityAnalysisSummary() - no longer needs validation data 4. Simplified calculateExploitationSummary() - no longer needs validation data IMPACT: - Exploitation agents will now run when vulnerabilities are found - Queue files are the single source of truth for eligibility - Simpler architecture - no duplicate state storage 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-22 17:41:41 -07:00
ajmallesh	50bf583993	chore: remove run-metadata.json functionality Reasoning: - Pollutes target repo with run-metadata.json - Redundant with audit system (session.json has all metadata) - Less useful than comprehensive audit logs - Target repos should stay clean - only deliverables belong there All debugging info now lives in audit-logs/{hostname}_{sessionId}/session.json 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-22 16:19:40 -07:00
ajmallesh	3babf02d68	feat: implement unified audit system v3.0 with crash-safety and self-healing ## Unified Audit System (v3.0) - Implemented crash-safe, append-only logging to audit-logs/{hostname}_{sessionId}/ - Added session.json with comprehensive metrics (timing, cost, attempts) - Agent execution logs with turn-by-turn detail - Prompt snapshots saved to audit-logs/.../prompts/{agent}.md - SessionMutex prevents race conditions during parallel execution - Self-healing reconciliation before every CLI command ## Session Metadata Standardization - Fixed critical bug: standardized on 'id' field (not 'sessionId') throughout codebase - Updated: shannon.mjs (recon, report), src/phases/pre-recon.js - Added validation in AuditSession to fail fast on incorrect field usage - JavaScript shorthand syntax was causing wrong field names ## Schema Improvements - session.json: Added cost_usd per phase, removed redundant final_cost_usd - Renamed 'percentage' -> 'duration_percentage' for clarity - Simplified agent metrics to single total_cost_usd field - Removed unused validation object from schema ## Legacy System Removal - Removed savePromptSnapshot() - prompts now only saved by audit system - Removed target repo pollution (prompt-snapshots/ no longer created) - Single source of truth: audit-logs/{hostname}_{sessionId}/prompts/ ## Export Script Simplification - Removed JSON export mode (session.json already exists) - CSV-only export with clean columns: agent, phase, status, attempts, duration_ms, cost_usd - Tested on real session data ## Documentation - Updated CLAUDE.md with audit system architecture - Added .gitignore entry for audit-logs/ 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-22 16:09:08 -07:00
ajmallesh	b8dc33f180	chore: remove screenshot saving from Playwright MCP instances Remove unnecessary screenshot storage to reduce file I/O and disk usage: - Removed screenshot directory creation - Removed --output-dir flag from Playwright MCP setup - Agents can still take screenshots, but they won't persist to disk Screenshots were not being used by any part of Shannon for analysis or reporting, making their storage unnecessary overhead. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-22 12:15:47 -07:00
ajmallesh	04f810b9fb	chore: remove permanent deliverables copying to Documents folder Simplified deliverable management by removing automatic copying to ~/Documents/pentest-deliverables/. All deliverables now remain only in <target-repo>/deliverables/, eliminating file duplication and improving UX. Changes: - Removed savePermanentDeliverables() function from src/setup/deliverables.js - Removed function call and related console output from shannon.mjs - Removed unused 'os' import from deliverables.js 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-22 12:11:48 -07:00
ajmallesh	be776c4640	chore: save deliverable script decoupling deliverable creation from the actual content	2025-10-22 11:31:58 -07:00
ajmallesh	de8c5ee041	chore: upgrade model from Sonnet 4 -> Sonnet 4.5	2025-10-21 16:34:56 -07:00
Khaushik-keygraph	ac15ac284f	chore: optimized logging	2025-10-17 13:59:34 +05:30
Khaushik-keygraph	e193d04594	chore: added logging	2025-10-17 13:52:13 +05:30
ajmallesh	9327630c45	Initial commit	2025-10-03 19:35:08 -07:00

29 Commits