Commit Graph

13 Commits

Author SHA1 Message Date
ajmallesh bb89d6f458 refactor: replace console.log/chalk with ActivityLogger across services
- Add ActivityLogger interface wrapping Temporal's Context.current().log
- Thread logger parameter through claude-executor, message-handlers, git-manager, prompt-manager, reporting, and agent validators
- Remove chalk dependency from all service/activity files; CLI files keep console.log for terminal output
- Replace colorFn: ChalkInstance parameter with structured logger.info/warn/error calls
- Use replay-safe `log` import from @temporalio/workflow in workflows.ts
2026-02-16 17:16:27 -08:00
ajmallesh d3816a29fa refactor: extract services layer, Result type, and ErrorCode classification
- Add DI container (src/services/) with AgentExecutionService, ConfigLoaderService, and ExploitationCheckerService — pure domain logic with no Temporal dependencies
- Introduce Result<T, E> type and ErrorCode enum for code-based error classification in classifyErrorForTemporal, replacing scattered string matching
- Consolidate billing/spending cap detection into utils/billing-detection.ts with shared pattern lists across message-handlers, claude-executor, and error-handling
- Extract LogStream abstraction for append-only logging with backpressure, used by both AgentLogger and WorkflowLogger
- Simplify activities.ts from inline lifecycle logic to thin wrappers delegating to services, with heartbeat and error classification
- Expand config-parser with human-readable AJV errors, security validation, and rule type-specific checks
2026-02-16 16:12:21 -08:00
ajmallesh 13731f5ebf refactor: remove ~750 lines of dead code across 12 files
- Delete 4 dead files: pre-recon.ts, tool-checker.ts, input-validator.ts, environment.ts
- Remove runClaudePromptWithRetry() and its now-unused imports from claude-executor.ts
- De-export unused symbols: AGENT_ORDER, getParallelGroups, logError, isRouterMode, showHelp, displayTimingSummary
- De-export unused types: ProcessingState, ProcessingResult, SdkMessage, MessageDispatchResult, MessageDispatchContext
- Remove dead import (path from zx) in session-manager.ts and deprecated comment in config.ts
2026-02-16 11:30:00 -08:00
ajmallesh cd04c7a6d2 feat: add model tracking and reporting across pipeline
- Track actual model name from router through audit logs, session.json, and query output
- Add router-utils.ts to resolve model names from ROUTER_DEFAULT env var
- Inject model info into final report's Executive Summary section
- Update documentation with supported providers, pricing, and config examples
- Update router-config.json with latest model versions (GPT-5.2, Gemini 2.5, etc.)
2026-01-15 18:30:19 -08:00
Arjun Malleswaran 51e621d0d5 Feat/temporal (#46)
* refactor: modularize claude-executor and extract shared utilities

- Extract message handling into src/ai/message-handlers.ts with pure functions
- Extract output formatting into src/ai/output-formatters.ts
- Extract progress management into src/ai/progress-manager.ts
- Add audit-logger.ts with Null Object pattern for optional logging
- Add shared utilities: formatting.ts, file-io.ts, functional.ts
- Consolidate getPromptNameForAgent into src/types/agents.ts

* feat: add Claude Code custom commands for debug and review

* feat: add Temporal integration foundation (phase 1-2)

- Add Temporal SDK dependencies (@temporalio/client, worker, workflow, activity)
- Add shared types for pipeline state, metrics, and progress queries
- Add classifyErrorForTemporal() for retry behavior classification
- Add docker-compose for Temporal server with SQLite persistence

* feat: add Temporal activities for agent execution (phase 3)

- Add activities.ts with heartbeat loop, git checkpoint/rollback, and error classification
- Export runClaudePrompt, validateAgentOutput, ClaudePromptResult for Temporal use
- Track attempt number via Temporal Context for accurate audit logging
- Rollback git workspace before retry to ensure clean state

* feat: add Temporal workflow for 5-phase pipeline orchestration (phase 4)

* feat: add Temporal worker, client, and query tools (phase 5)

- Add worker.ts with workflow bundling and graceful shutdown
- Add client.ts CLI to start pipelines with progress polling
- Add query.ts CLI to inspect running workflow state
- Fix buffer overflow by truncating error messages and stack traces
- Skip git operations gracefully on non-git repositories
- Add kill.sh/start.sh dev scripts and Dockerfile.worker

* feat: fix Docker worker container setup

- Install uv instead of deprecated uvx package
- Add mcp-server and configs directories to container
- Mount target repo dynamically via TARGET_REPO env variable

* fix: add report assembly step to Temporal workflow

- Add assembleReportActivity to concatenate exploitation evidence files before report agent runs
- Call assembleFinalReport in workflow Phase 5 before runReportAgent
- Ensure deliverables directory exists before writing final report
- Simplify pipeline-testing report prompt to just prepend header

* refactor: consolidate Docker setup to root docker-compose.yml

* feat: improve Temporal client UX and env handling

- Change default to fire-and-forget (--wait flag to opt-in)
- Add splash screen and improve console output formatting
- Add .env to gitignore, remove from dockerignore for container access
- Add Taskfile for common development commands

* refactor: simplify session ID handling and improve Taskfile options

- Include hostname in workflow ID for better audit log organization
- Extract sanitizeHostname utility to audit/utils.ts for reuse
- Remove unused generateSessionLogPath and buildLogFilePath functions
- Simplify Taskfile with CONFIG/OUTPUT/CLEAN named parameters

* chore: add .env.example and simplify .gitignore

* docs: update README and CLAUDE.md for Temporal workflow usage

- Replace Docker CLI instructions with Task-based commands
- Add monitoring/stopping sections and workflow examples
- Document Temporal orchestration layer and troubleshooting
- Simplify file structure to key files overview

* refactor: replace Taskfile with bash CLI script

- Add shannon bash script with start/logs/query/stop/help commands
- Remove Taskfile.yml dependency (no longer requires Task installation)
- Update README.md and CLAUDE.md to use ./shannon commands
- Update client.ts output to show ./shannon commands

* docs: fix deliverable filename in README

* refactor: remove direct CLI and .shannon-store.json in favor of Temporal

- Delete src/shannon.ts direct CLI entry point (Temporal is now the only mode)
- Remove .shannon-store.json session lock (Temporal handles workflow deduplication)
- Remove broken scripts/export-metrics.js (imported non-existent function)
- Update package.json to remove main, start script, and bin entry
- Clean up CLAUDE.md and debug.md to remove obsolete references

* chore: remove licensing comments from prompt files to prevent leaking into actual prompts

* fix: resolve parallel workflow race conditions and retry logic bugs

- Fix save_deliverable race condition using closure pattern instead of global variable
- Fix error classification order so OutputValidationError matches before generic validation
- Fix ApplicationFailure re-classification bug by checking instanceof before re-throwing
- Add per-error-type retry limits (3 for output validation, 50 for billing)
- Add fast retry intervals for pipeline testing mode (10s vs 5min)
- Increase worker concurrent activities to 25 for parallel workflows

* refactor: pipeline vuln→exploit workflow for parallel execution

- Replace sync barrier between vuln/exploit phases with independent pipelines
- Each vuln type runs: vuln agent → queue check → conditional exploit
- Add checkExploitationQueue activity to skip exploits when no vulns found
- Use Promise.allSettled for graceful failure handling across pipelines
- Add PipelineSummary type for aggregated cost/duration/turns metrics

* fix: re-throw retryable errors in checkExploitationQueue

* fix: detect and retry on Claude Code spending cap errors

- Add spending cap pattern detection in detectApiError() with retryable error
- Add matching patterns to classifyErrorForTemporal() for proper Temporal retry
- Add defense-in-depth safeguard in runClaudePrompt() for $0 cost / low turn detection
- Add final sanity check in activities before declaring success

* fix: increase heartbeat timeout to prevent false worker-dead detection

Original 30s timeout was from POC spec assuming <5min activities. With
hour-long activities and multiple concurrent workflows sharing one worker,
resource contention causes event loop stalls exceeding 30s, triggering
false heartbeat timeouts. Increased to 10min (prod) and 5min (testing).

* fix: temporal db init

* fix: persist home dir

* feat: add per-workflow unified logging with ./shannon logs ID=<workflow-id>

- Add WorkflowLogger class for human-readable, per-workflow log files
- Create workflow.log in audit-logs/{workflowId}/ with phase, agent, tool, and LLM events
- Update ./shannon logs to require ID param and tail specific workflow log
- Add phase transition logging at workflow boundaries
- Include workflow completion summary with agent breakdown (duration, cost)
- Mount audit-logs volume in docker-compose for host access

---------

Co-authored-by: ezl-keygraph <ezhil@keygraph.io>
2026-01-15 10:36:11 -08:00
ezl-keygraph 45acb16711 refactor: remove orchestration layer (#45)
* refactor: remove orchestration layer and simplify CLI

Remove the complex orchestration layer including checkpoint management,
rollback/recovery commands, and session management commands. This
consolidates the execution logic directly in shannon.ts for a simpler
fire-and-forget execution model.

Changes:
- Remove checkpoint-manager.ts and rollback functionality
- Remove command-handler.ts and cli/prompts.ts
- Simplify session-manager.ts to just agent definitions
- Consolidate orchestration logic in shannon.ts
- Update CLAUDE.md documentation

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* refactor: move session lock logic to shannon.ts, simplify session-manager

- Reduce session-manager.ts to only AGENTS, AGENT_ORDER, getParallelGroups()
- Move Session interface and lock file functions to shannon.ts
- Simplify Session to only: id, webUrl, repoPath, status, startedAt
- Remove unused types/session.ts

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* refactor: use crypto.randomUUID() for session ID generation

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-12 22:58:17 +05:30
ezl-keygraph 8381198c41 feat: add configurable output directory with --output flag (#41)
* feat: add configurable output directory with --output flag

Add --output CLI flag to specify custom output directory for session
folders containing audit logs, prompts, agent logs, and deliverables.

Changes:
- Add --output <path> CLI flag parsing
- Update generateAuditPath() to use custom path when provided
- Add consolidateOutputs() to copy deliverables to session folder
- Update Docker examples with volume mounts for output directories
- Default remains ./audit-logs/ when --output is not specified

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* feat: add configurable output directory with --output flag

Add --output CLI flag to specify custom output directory for session
folders containing audit logs, prompts, agent logs, and deliverables.

Changes:
- Add --output <path> CLI flag parsing
- Store outputPath in Session interface for persistence
- Update generateAuditPath() to use custom path when provided
- Pass outputPath through pre-recon and checkpoint-manager
- Add consolidateOutputs() to copy deliverables to session folder
- Update Docker examples with volume mount instructions
- Default remains ./audit-logs/ when --output is not specified

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* chore: add gitkeep and fix formatting

* fix: correct docker run command formatting in README

Remove invalid inline comments after backslash continuations in docker
run commands. Comments cannot appear after backslash line continuations
in shell scripts, as the backslash escapes the newline character.

Reorganized comments to appear on separate lines before or after the
command block for better clarity and proper shell syntax.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-01-08 23:50:42 +05:30
ezl-keygraph 3ac07a4718 feat: typescript migration (#40)
* chore: initialize TypeScript configuration and build setup

- Add tsconfig.json for root and mcp-server with strict type checking
- Install typescript and @types/node as devDependencies
- Add npm build script for TypeScript compilation
- Update main entrypoint to compiled dist/shannon.js
- Update Dockerfile to build TypeScript before running
- Configure output directory and module resolution for Node.js

* refactor: migrate codebase from JavaScript to TypeScript

- Convert all 37 JavaScript files to TypeScript (.js -> .ts)
- Add type definitions in src/types/ for agents, config, errors, session
- Update mcp-server with proper TypeScript types
- Move entry point from shannon.mjs to src/shannon.ts
- Update tsconfig.json with rootDir: "./src" for cleaner dist output
- Update Dockerfile to build TypeScript before runtime
- Update package.json paths to use compiled dist/shannon.js

No runtime behavior changes - pure type safety migration.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs: update CLI references from ./shannon.mjs to shannon

- Update help text in src/cli/ui.ts
- Update usage examples in src/cli/command-handler.ts
- Update setup message in src/shannon.ts
- Update CLAUDE.md documentation with TypeScript file structure
- Replace all ./shannon.mjs references with shannon command

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* chore: remove unnecessary eslint-disable comments

ESLint is not configured in this project, making these comments redundant.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-08 00:18:25 +05:30
ajmallesh 7c2edeb4c0 chore: change license to AGPL-3.0 2025-11-26 18:45:36 -08:00
ajmallesh e4eb59870a chore: add MPL license comments 2025-11-13 16:55:13 +05:30
ajmallesh f13c7421f4 refactor: remove ~500 lines of dead code and consolidate duplicates
Comprehensive codebase cleanup based on parallel agent analysis and automated
dead code detection (knip, depcheck). Reduces codebase by ~10% with zero
functional changes.

## Phase 1: Obsolete MCP Setup Removal (~82 lines)
- Delete setupMCP() and cleanupMCP() functions from environment.js
- Remove all calls to cleanupMCP() (8 instances across 3 files)
- Migrate from claude CLI to SDK's mcpServers option
- Remove --log flag (obsolete logging system)

## Phase 2: Dead Code Removal (~317 lines)
- Delete src/utils/logger.js entirely (127 lines, superseded by audit system)
- Remove handleConfigError() and handleError() from error-handling.js
- Remove isToolAvailable() from tool-checker.js
- Remove 5 dead methods from audit-session.js (logSessionFailure, logMessage,
  markRolledBack, updateValidation, getValidation)
- Remove 6 wrapper methods from audit/logger.js (all callers use logEvent directly)
- Remove formatCost(), updateMessage(), compose() utilities (unused)

## Phase 3: Consolidation (~195 lines)
- Extract SessionMutex to src/utils/concurrency.js (was duplicated in 2 files)
- Consolidate formatDuration to src/audit/utils.js (was in 3 files)
- Extract readline prompts to src/cli/prompts.js (was duplicated in 2 files)
- Create validator factories in constants.js (reduce 72 lines to 30)

## Impact
- Total reduction: 488 lines (20 files modified, 2 created, 1 deleted)
- Codebase: ~4,900 → ~4,400 LOC (10% reduction)
- Zero functional changes, all tests pass
- Improved maintainability and DRY compliance

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-23 17:01:17 -07:00
ajmallesh 27334a4dd6 feat: implement unified audit system v3.0 with crash-safety and self-healing
## Unified Audit System (v3.0)
- Implemented crash-safe, append-only logging to audit-logs/{hostname}_{sessionId}/
- Added session.json with comprehensive metrics (timing, cost, attempts)
- Agent execution logs with turn-by-turn detail
- Prompt snapshots saved to audit-logs/.../prompts/{agent}.md
- SessionMutex prevents race conditions during parallel execution
- Self-healing reconciliation before every CLI command

## Session Metadata Standardization
- Fixed critical bug: standardized on 'id' field (not 'sessionId') throughout codebase
- Updated: shannon.mjs (recon, report), src/phases/pre-recon.js
- Added validation in AuditSession to fail fast on incorrect field usage
- JavaScript shorthand syntax was causing wrong field names

## Schema Improvements
- session.json: Added cost_usd per phase, removed redundant final_cost_usd
- Renamed 'percentage' -> 'duration_percentage' for clarity
- Simplified agent metrics to single total_cost_usd field
- Removed unused validation object from schema

## Legacy System Removal
- Removed savePromptSnapshot() - prompts now only saved by audit system
- Removed target repo pollution (prompt-snapshots/ no longer created)
- Single source of truth: audit-logs/{hostname}_{sessionId}/prompts/

## Export Script Simplification
- Removed JSON export mode (session.json already exists)
- CSV-only export with clean columns: agent, phase, status, attempts, duration_ms, cost_usd
- Tested on real session data

## Documentation
- Updated CLAUDE.md with audit system architecture
- Added .gitignore entry for audit-logs/

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-22 16:09:08 -07:00
ajmallesh 8f52722d56 Initial commit
Co-Authored-By: Nellie Mullane <nellie@keygraph.io>
2025-10-03 19:35:08 -07:00