shannon

mirror of https://github.com/KeygraphHQ/shannon.git synced 2026-07-05 04:38:03 +02:00

Author	SHA1	Message	Date
ajmallesh	92db01bd2d	docs: add ctf-mode branch documentation to README Add a TIP callout in the Overview section documenting the ctf-mode branch for users who want to run Shannon against Capture-The-Flag challenges with optimized flag extraction prompts. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-03 10:35:45 -08:00
ajmallesh	34850477a2	refactor: update injection display name and add max tokens docs - Change agent prefix from [SQLi/Cmd] to [Injection] to reflect expanded scope - Add README documentation for CLAUDE_CODE_MAX_OUTPUT_TOKENS environment variable This update aligns the display naming with the expanded injection analysis scope that now covers SQLi, Command Injection, LFI/RFI, SSTI, Path Traversal, and Insecure Deserialization vulnerabilities. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-03 10:21:17 -08:00
ajmallesh	d82d1fa753	feat: expand injection analysis scope to cover LFI/RFI/SSTI/Path Traversal/Deserialization Fixes responsibility gap where agents found vulnerabilities but rejected them as "out of scope" Changes: - vuln-injection.txt: Added LFI/RFI, SSTI, Path Traversal, Deserialization to scope - Updated role definition and objective - Added new vulnerability_type and slot_type enums - Added sink definitions and defense rules for new injection classes - Added witness payload examples - pre-recon-code.txt: Expanded sink hunter agent to find file/template/deserialize sinks - recon.txt: Updated Section 9 with clear injection source definitions for all types - exploit-injection.txt: Updated evidence template to handle all injection types Token-optimized: Condensed verbose sections while preserving critical guidance Addresses XBEN benchmark failures where LFI/SSTI/Path Traversal were detected but excluded from exploitation queues 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-03 10:20:15 -08:00
ajmallesh	0b9580a99a	feat: add environment variable support for Claude Code token limits Introduces .env file configuration to manage CLAUDE_CODE_MAX_TOKENS, allowing flexible control of the context window size for AI analysis sessions. This enables users to tune token limits based on their specific penetration testing needs without modifying code. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-30 10:53:42 -07:00
ajmallesh	cc36fe933d	fix: err handling for claude code session limit	2025-10-30 10:28:35 -07:00
ajmallesh	5b92ff52c4	chore: print audit logs folder location	2025-10-28 10:31:00 -07:00
ajmallesh	d8efd78ac0	Merge pull request #3 from KeygraphHQ/feature/improve-audit-log-naming Feature/improve audit log naming	2025-10-27 14:56:57 -07:00
ajmallesh	a099500d9b	Revert "feat: improve audit log naming with timestamp and app context" This reverts the timestamp-based naming scheme that was causing audit log fragmentation. Each agent execution was creating a new folder because the timestamp kept changing. Reverting back to simple, stable naming: {hostname}_{sessionId} This ensures ONE folder per session, preventing the bug where multiple folders were created for the same session. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-27 13:30:25 -07:00
ajmallesh	f0b8c3aa6e	fix: use session's original createdAt instead of current time Fixed bug where audit system would create duplicate folders for the same session because it was using current time instead of the session's original createdAt timestamp. Bug behavior: - Session created at T1 → folder: {T1}_app_host_id/ - Audit re-initialized at T2 → NEW folder: {T2}_app_host_id/ - Result: 2 folders per session with same ID but different timestamps Root cause: - metrics-tracker.js:65 was calling formatTimestamp() (current time) - Should use sessionMetadata.createdAt (original creation time) Impact: Each running benchmark was creating 2 audit log folders instead of 1 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-27 10:55:53 -07:00
ajmallesh	258830b030	feat: improve audit log naming with timestamp and app context Enhances audit log directory naming from `{hostname}_{uuid}` to `{timestamp}_{appName}_{hostname}_{shortId}` for better discoverability and benchmarking analysis. Changes: - Add extractAppName() helper to extract app name from config files - Add smart fallback: use port number for localhost without config - Update generateSessionIdentifier() to include timestamp prefix - Shorten session ID to first 8 characters for readability Examples: - With config: 20251025T193847Z_myapp_localhost_efc60ee0/ - Without config: 20251025T193913Z_8080_localhost_d47e3bfd/ - Remote: 20251024T004401Z_noconfig_example-com_d47e3bfd/ Benefits: - Chronologically sortable audit logs - Instant app identification in directory listings - Efficient filtering for benchmarking queries - Non-breaking: existing logs keep their names 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-27 10:14:19 -07:00
ajmallesh	d85b6af5f5	Merge pull request #2 from KeygraphHQ/fixing-bugs Fixing bugs	2025-10-23 18:18:21 -07:00
ajmallesh	f40f52f118	fix: enable Playwright MCP browser automation in Docker containers Resolves Playwright browser installation failures in Docker by using Wolfi's system Chromium instead of downloading Playwright's bundled browsers at runtime. ## Problem When running in Docker, agents attempted to install browsers via `browser_install` tool, which failed due to: - Permission issues (non-root user couldn't install system dependencies) - npx @playwright/mcp spawns with its own Playwright dependency separate from global installations - Playwright's bundled browsers require runtime download (~280MB) and glibc deps - Environment variables alone (PLAYWRIGHT_BROWSERS_PATH) weren't sufficient ## Solution Dockerfile changes: - Use Wolfi's native `chromium` package (guaranteed compatible, already installed) - Remove Playwright browser installation step (saves ~280MB and build time) - Add explicit `SHANNON_DOCKER=true` environment variable for reliable detection - Set PLAYWRIGHT_CHROMIUM_EXECUTABLE_PATH to point to system Chromium Code changes (claude-executor.js): - Detect Docker via `process.env.SHANNON_DOCKER` (more reliable than /.dockerenv) - Conditionally add `--executable-path /usr/bin/chromium-browser` CLI arg for Docker - Local: Use Playwright's bundled browsers (downloaded to ~/Library/Caches/) - Docker: Use system Chromium with no runtime downloads ## Research Findings - @playwright/mcp has separate playwright-core dependency (v1.56.0-alpha) - MCP server spawned via npx doesn't inherit browser binaries from global install - --executable-path CLI argument is required (env vars insufficient) - /.dockerenv file is unreliable (missing in BuildKit, K8s, can be spoofed) ## Testing ✅ Docker: All 5 parallel agents successfully navigate, screenshot, create deliverables ✅ Local: All 5 parallel agents successfully navigate, screenshot, create deliverables ✅ No browser_install calls, no permission errors ✅ Image size reduced by ~280MB Fixes #docker-playwright-browser-issues 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-23 17:56:19 -07:00
ajmallesh	f2870e3340	refactor: simplify pipeline testing report prompt by 78% Reduce prompts/pipeline-testing/report-executive.txt from 137 to 30 lines by: - Removing hardcoded detailed vulnerability content - Testing actual workflow (read → modify → save) instead of creating from scratch - Removing meta-commentary, keeping only direct instructions - Making it consistent with other pipeline testing prompts (30 lines like exploit agents) The prompt now properly mimics the real reporting agent behavior where the orchestration code stitches files first, then the agent modifies the result. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-23 17:13:25 -07:00
ajmallesh	f13c7421f4	refactor: remove ~500 lines of dead code and consolidate duplicates Comprehensive codebase cleanup based on parallel agent analysis and automated dead code detection (knip, depcheck). Reduces codebase by ~10% with zero functional changes. ## Phase 1: Obsolete MCP Setup Removal (~82 lines) - Delete setupMCP() and cleanupMCP() functions from environment.js - Remove all calls to cleanupMCP() (8 instances across 3 files) - Migrate from claude CLI to SDK's mcpServers option - Remove --log flag (obsolete logging system) ## Phase 2: Dead Code Removal (~317 lines) - Delete src/utils/logger.js entirely (127 lines, superseded by audit system) - Remove handleConfigError() and handleError() from error-handling.js - Remove isToolAvailable() from tool-checker.js - Remove 5 dead methods from audit-session.js (logSessionFailure, logMessage, markRolledBack, updateValidation, getValidation) - Remove 6 wrapper methods from audit/logger.js (all callers use logEvent directly) - Remove formatCost(), updateMessage(), compose() utilities (unused) ## Phase 3: Consolidation (~195 lines) - Extract SessionMutex to src/utils/concurrency.js (was duplicated in 2 files) - Consolidate formatDuration to src/audit/utils.js (was in 3 files) - Extract readline prompts to src/cli/prompts.js (was duplicated in 2 files) - Create validator factories in constants.js (reduce 72 lines to 30) ## Impact - Total reduction: 488 lines (20 files modified, 2 created, 1 deleted) - Codebase: ~4,900 → ~4,400 LOC (10% reduction) - Zero functional changes, all tests pass - Improved maintainability and DRY compliance 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-23 17:01:17 -07:00
ajmallesh	9be2e71ff2	refactor: deduplicate prompt templates with shared content system Implemented @include() directive system to eliminate ~800 lines of duplicated content across 10 specialist prompt files. All prompt-related content now consolidated under prompts/ directory for better maintainability. Changes: - Added processIncludes() to prompt-manager.js for generic @include() support - Created prompts/shared/ with 5 reusable template files - Refactored all 10 specialist prompts to use @include() for common sections - Moved login_instructions.txt to prompts/shared/ (deleted login_resources/) - Updated CLAUDE.md to reflect new structure Impact: -137 net lines, zero breaking changes, infinitely scalable for future shared content. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-23 16:19:25 -07:00
ajmallesh	2966157596	chore: remove ~500 lines of dead code identified by knip Remove unused files and exports to improve codebase maintainability: Phase 1 - Deleted files (5): - login_resources/generate-totp-standalone.mjs (replaced by MCP tool) - mcp-server/src/tools/index.js (unused barrel export) - mcp-server/src/utils/index.js (unused barrel export) - mcp-server/src/validation/index.js (unused barrel export) - src/agent-status.js (deprecated 309-line status manager) Phase 2 - Removed unused exports (3): - mcp-server/src/index.js: shannonHelperServer constant - mcp-server/src/utils/error-formatter.js: createFileSystemError function - src/utils/git-manager.js: cleanWorkspace (now internal-only) Phase 3 - Unexported internal functions (4): - src/checkpoint-manager.js: runSingleAgent, runAgentRange, runParallelVuln, runParallelExploit (internal use only) All Shannon CLI commands tested and verified working. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-23 12:46:51 -07:00
ajmallesh	d649fccfdb	chore: migrate from deprecated @anthropic-ai/claude-code to @anthropic-ai/claude-agent-sdk Anthropic rebranded the SDK in 2025 from "Claude Code SDK" to "Claude Agent SDK". Updated all references across package.json, Dockerfile, and documentation to use the current @anthropic-ai/claude-agent-sdk package. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-23 12:06:55 -07:00
ajmallesh	ef3ae0aead	chore: remove deprecated scripts	2025-10-23 11:57:14 -07:00
ajmallesh	eae0b8d654	feat: migrate to use MCP tools instead of helper scripts	2025-10-23 11:56:47 -07:00
ajmallesh	cfe8dc8bc8	fix: critical bug - exploitation phase was always skipped ROOT CAUSE: - Exploitation phase checked session.validationResults to determine eligibility - validationResults field was removed during audit system refactor - Field never existed in session schema, so all exploits were skipped THE FIX: - Exploitation phase now validates queue files directly when checking eligibility - Reads exploitation_queue.json and checks if vulnerabilities array is non-empty - No need to store validation results - just re-validate on demand CHANGES: 1. runParallelExploit() now calls safeValidateQueueAndDeliverable() directly 2. Removed validationResults parameter from markAgentCompleted() 3. Simplified calculateVulnerabilityAnalysisSummary() - no longer needs validation data 4. Simplified calculateExploitationSummary() - no longer needs validation data IMPACT: - Exploitation agents will now run when vulnerabilities are found - Queue files are the single source of truth for eligibility - Simpler architecture - no duplicate state storage 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-22 17:41:41 -07:00
ajmallesh	255956d113	chore: remove run-metadata.json functionality Reasoning: - Pollutes target repo with run-metadata.json - Redundant with audit system (session.json has all metadata) - Less useful than comprehensive audit logs - Target repos should stay clean - only deliverables belong there All debugging info now lives in audit-logs/{hostname}_{sessionId}/session.json 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-22 16:19:40 -07:00
ajmallesh	95a4639d90	docs: enhance export-metrics.js documentation - Added comprehensive header comment explaining use case - Documents data source (session.json from audit-logs) - CSV output format and use cases clearly described - Includes usage examples and note about raw data access - Removes need for separate docs/ folder in repo Docs were design artifacts, not needed in open source repo. All relevant documentation now lives in code comments. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-22 16:16:36 -07:00
ajmallesh	a8b4e6899a	chore: remove reconcile-session.js script Reasoning: - Shannon is a local CLI tool with direct filesystem access - Manual file editing (JSON, rm -rf) is simpler than reconciliation script - Automatic reconciliation runs before every command (built-in) - If auto-reconciliation has bugs, fix the code, don't create workarounds - Over-engineered for a local development tool For recovery: Just delete .shannon-store.json or edit JSON files directly 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-22 16:13:50 -07:00
ajmallesh	27334a4dd6	feat: implement unified audit system v3.0 with crash-safety and self-healing ## Unified Audit System (v3.0) - Implemented crash-safe, append-only logging to audit-logs/{hostname}_{sessionId}/ - Added session.json with comprehensive metrics (timing, cost, attempts) - Agent execution logs with turn-by-turn detail - Prompt snapshots saved to audit-logs/.../prompts/{agent}.md - SessionMutex prevents race conditions during parallel execution - Self-healing reconciliation before every CLI command ## Session Metadata Standardization - Fixed critical bug: standardized on 'id' field (not 'sessionId') throughout codebase - Updated: shannon.mjs (recon, report), src/phases/pre-recon.js - Added validation in AuditSession to fail fast on incorrect field usage - JavaScript shorthand syntax was causing wrong field names ## Schema Improvements - session.json: Added cost_usd per phase, removed redundant final_cost_usd - Renamed 'percentage' -> 'duration_percentage' for clarity - Simplified agent metrics to single total_cost_usd field - Removed unused validation object from schema ## Legacy System Removal - Removed savePromptSnapshot() - prompts now only saved by audit system - Removed target repo pollution (prompt-snapshots/ no longer created) - Single source of truth: audit-logs/{hostname}_{sessionId}/prompts/ ## Export Script Simplification - Removed JSON export mode (session.json already exists) - CSV-only export with clean columns: agent, phase, status, attempts, duration_ms, cost_usd - Tested on real session data ## Documentation - Updated CLAUDE.md with audit system architecture - Added .gitignore entry for audit-logs/ 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-22 16:09:08 -07:00
ajmallesh	a9e00ca19f	chore: remove screenshot saving from Playwright MCP instances Remove unnecessary screenshot storage to reduce file I/O and disk usage: - Removed screenshot directory creation - Removed --output-dir flag from Playwright MCP setup - Agents can still take screenshots, but they won't persist to disk Screenshots were not being used by any part of Shannon for analysis or reporting, making their storage unnecessary overhead. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-22 12:15:47 -07:00
ajmallesh	e1237416f5	chore: remove permanent deliverables copying to Documents folder Simplified deliverable management by removing automatic copying to ~/Documents/pentest-deliverables/. All deliverables now remain only in <target-repo>/deliverables/, eliminating file duplication and improving UX. Changes: - Removed savePermanentDeliverables() function from src/setup/deliverables.js - Removed function call and related console output from shannon.mjs - Removed unused 'os' import from deliverables.js 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-22 12:11:48 -07:00
ajmallesh	ac682b0172	chore: save deliverable script decoupling deliverable creation from the actual content	2025-10-22 11:31:58 -07:00
ajmallesh	66c549f3b7	chore: upgrade model from Sonnet 4 -> Sonnet 4.5	2025-10-21 16:34:56 -07:00
ajmallesh	3a8b7ae496	Merge pull request #1 from Khaushik-keygraph/main chore: added logging	2025-10-21 09:16:59 -07:00
Khaushik-keygraph	e0ff1453a5	chore: optimized logging	2025-10-17 13:59:34 +05:30
Khaushik-keygraph	46a30fd8c9	chore: added logging	2025-10-17 13:52:13 +05:30
Khaushik-keygraph	80747a0204	Update README.md	2025-10-09 15:54:04 +05:30
Khaushik-keygraph	bbd9db2a61	fix: renamed agent filename	2025-10-08 23:49:16 +05:30
ajmallesh	770dae387a	docs: update Discord invite link to infinite expiry Updated Discord invite links in README.md to use a permanent invite link that will not expire. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-07 14:10:55 -07:00
keygraphVarun	0c446382e6	Update README.md	2025-10-07 13:50:31 -07:00
keygraphVarun	d72222dcb9	Update README.md	2025-10-07 13:09:29 -07:00
keygraphVarun	851752bcc1	Update README.md	2025-10-07 12:59:22 -07:00
keygraphVarun	d30553d7dd	Create SHANNON-PRO.md	2025-10-07 12:49:33 -07:00
keygraphVarun	8490196c78	Add files via upload gif	2025-10-07 12:47:04 -07:00
keygraphVarun	7e0ca8c49d	Add files via upload assets	2025-10-07 11:51:31 -07:00
keygraphVarun	1fe4c1f828	Update README.md italics	2025-10-06 18:28:11 -07:00
keygraphVarun	59e7e3c586	Update README.md typo	2025-10-06 18:27:00 -07:00
keygraphVarun	7c4559d4aa	Update LICENSE Simplified	2025-10-06 18:25:18 -07:00
keygraphVarun	96eee1c3b6	Update README.md fixes	2025-10-06 18:20:41 -07:00
ajmallesh	8f52722d56	Initial commit Co-Authored-By: Nellie Mullane <nellie@keygraph.io>	2025-10-03 19:35:08 -07:00

45 Commits