Commit Graph

36 Commits

Author SHA1 Message Date
ajmallesh
6e89f26474 feat: improve audit log naming with timestamp and app context
Enhances audit log directory naming from `{hostname}_{uuid}` to
`{timestamp}_{appName}_{hostname}_{shortId}` for better discoverability
and benchmarking analysis.

Changes:
- Add extractAppName() helper to extract app name from config files
- Add smart fallback: use port number for localhost without config
- Update generateSessionIdentifier() to include timestamp prefix
- Shorten session ID to first 8 characters for readability

Examples:
- With config: 20251025T193847Z_myapp_localhost_efc60ee0/
- Without config: 20251025T193913Z_8080_localhost_d47e3bfd/
- Remote: 20251024T004401Z_noconfig_example-com_d47e3bfd/

Benefits:
- Chronologically sortable audit logs
- Instant app identification in directory listings
- Efficient filtering for benchmarking queries
- Non-breaking: existing logs keep their names
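
A minimal sketch of how such an identifier could be assembled, inferred only from the examples above. The function name generateSessionIdentifier() comes from this commit; its parameters (`webUrl`, `sessionId`, `appName`, `now`) and the helper `formatTimestamp()` are assumptions for illustration, not the actual implementation.

```javascript
// Hypothetical helper: 2025-10-25T19:38:47.000Z -> 20251025T193847Z
function formatTimestamp(date) {
  return date.toISOString().replace(/[-:]/g, '').replace(/\.\d{3}Z$/, 'Z');
}

// Assumed signature; only the output format is taken from the commit message.
function generateSessionIdentifier({ webUrl, sessionId, appName = null, now = new Date() }) {
  const url = new URL(webUrl);
  // Dots become hyphens so the hostname is safe inside a directory name.
  const host = url.hostname.replace(/\./g, '-');

  let label = appName;
  if (!label) {
    // Fallback: port number for localhost, otherwise a fixed placeholder.
    label = url.hostname === 'localhost' && url.port ? url.port : 'noconfig';
  }

  // Keep only the first 8 characters of the session ID for readability.
  return `${formatTimestamp(now)}_${label}_${host}_${sessionId.slice(0, 8)}`;
}
```

Because the timestamp prefix is fixed-width and in UTC, a plain lexicographic sort of the directory names is also a chronological sort.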

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-27 10:14:19 -07:00
ajmallesh
b4cd1066d6 Merge pull request #2 from KeygraphHQ/fixing-bugs
Fixing bugs
2025-10-23 18:18:21 -07:00
ajmallesh
dcae34af81 fix: enable Playwright MCP browser automation in Docker containers
Resolves Playwright browser installation failures in Docker by using Wolfi's
system Chromium instead of downloading Playwright's bundled browsers at runtime.

## Problem
When running in Docker, agents attempted to install browsers via `browser_install`
tool, which failed due to:
- Permission issues (non-root user couldn't install system dependencies)
- npx @playwright/mcp spawns with its own Playwright dependency separate from
  global installations
- Playwright's bundled browsers require runtime download (~280MB) and glibc deps
- Environment variables alone (PLAYWRIGHT_BROWSERS_PATH) weren't sufficient

## Solution
**Dockerfile changes:**
- Use Wolfi's native `chromium` package (guaranteed compatible, already installed)
- Remove Playwright browser installation step (saves ~280MB and build time)
- Add explicit `SHANNON_DOCKER=true` environment variable for reliable detection
- Set PLAYWRIGHT_CHROMIUM_EXECUTABLE_PATH to point to system Chromium

**Code changes (claude-executor.js):**
- Detect Docker via `process.env.SHANNON_DOCKER` (more reliable than /.dockerenv)
- Conditionally add `--executable-path /usr/bin/chromium-browser` CLI arg for Docker
- Local: Use Playwright's bundled browsers (downloaded to ~/Library/Caches/)
- Docker: Use system Chromium with no runtime downloads
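
The conditional argument handling described above might look roughly like the following. This is a sketch under stated assumptions: the function name `buildPlaywrightMcpArgs()` is hypothetical, and the comparison against the string 'true' is assumed from the Dockerfile variable; only `SHANNON_DOCKER`, `--executable-path`, and the Chromium path come from this commit.

```javascript
// Hypothetical helper: builds the npx argument list for the Playwright MCP server.
function buildPlaywrightMcpArgs(env = process.env) {
  const args = ['@playwright/mcp'];

  // Explicit flag set in the Dockerfile; no reliance on /.dockerenv.
  const inDocker = env.SHANNON_DOCKER === 'true';

  if (inDocker) {
    // Point the server at the system Chromium so nothing is downloaded at runtime.
    args.push('--executable-path', '/usr/bin/chromium-browser');
  }
  // Outside Docker no path is passed, so Playwright's bundled browser is used.
  return args;
}
```

Passing the environment as a parameter keeps the function easy to exercise in both modes without mutating `process.env`.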

## Research Findings
- @playwright/mcp has separate playwright-core dependency (v1.56.0-alpha)
- MCP server spawned via npx doesn't inherit browser binaries from global install
- --executable-path CLI argument is required (env vars insufficient)
- /.dockerenv file is unreliable (missing in BuildKit, K8s, can be spoofed)

## Testing
- Docker: All 5 parallel agents successfully navigate, screenshot, create deliverables
- Local: All 5 parallel agents successfully navigate, screenshot, create deliverables
- No browser_install calls, no permission errors
- Image size reduced by ~280MB

Fixes #docker-playwright-browser-issues

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-23 17:56:19 -07:00
ajmallesh
3094862310 refactor: simplify pipeline testing report prompt by 78%
Reduce prompts/pipeline-testing/report-executive.txt from 137 to 30 lines by:
- Removing hardcoded detailed vulnerability content
- Testing actual workflow (read → modify → save) instead of creating from scratch
- Removing meta-commentary, keeping only direct instructions
- Making it consistent with other pipeline testing prompts (30 lines like exploit agents)

The prompt now properly mimics the real reporting agent behavior where the orchestration code stitches files first, then the agent modifies the result.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-23 17:13:25 -07:00
ajmallesh
d372f87297 refactor: remove ~500 lines of dead code and consolidate duplicates
Comprehensive codebase cleanup based on parallel agent analysis and automated
dead code detection (knip, depcheck). Reduces codebase by ~10% with zero
functional changes.

## Phase 1: Obsolete MCP Setup Removal (~82 lines)
- Delete setupMCP() and cleanupMCP() functions from environment.js
- Remove all calls to cleanupMCP() (8 instances across 3 files)
- Migrate from claude CLI to SDK's mcpServers option
- Remove --log flag (obsolete logging system)

## Phase 2: Dead Code Removal (~317 lines)
- Delete src/utils/logger.js entirely (127 lines, superseded by audit system)
- Remove handleConfigError() and handleError() from error-handling.js
- Remove isToolAvailable() from tool-checker.js
- Remove 5 dead methods from audit-session.js (logSessionFailure, logMessage,
  markRolledBack, updateValidation, getValidation)
- Remove 6 wrapper methods from audit/logger.js (all callers use logEvent directly)
- Remove formatCost(), updateMessage(), compose() utilities (unused)

## Phase 3: Consolidation (~195 lines)
- Extract SessionMutex to src/utils/concurrency.js (was duplicated in 2 files)
- Consolidate formatDuration to src/audit/utils.js (was in 3 files)
- Extract readline prompts to src/cli/prompts.js (was duplicated in 2 files)
- Create validator factories in constants.js (reduce 72 lines to 30)

## Impact
- Total reduction: 488 lines (20 files modified, 2 created, 1 deleted)
- Codebase: ~4,900 → ~4,400 LOC (10% reduction)
- Zero functional changes, all tests pass
- Improved maintainability and DRY compliance

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-23 17:01:17 -07:00
ajmallesh
369bf29588 refactor: deduplicate prompt templates with shared content system
Implemented @include() directive system to eliminate ~800 lines of duplicated content across 10 specialist prompt files. All prompt-related content now consolidated under prompts/ directory for better maintainability.

Changes:
- Added processIncludes() to prompt-manager.js for generic @include() support
- Created prompts/shared/ with 5 reusable template files
- Refactored all 10 specialist prompts to use @include() for common sections
- Moved login_instructions.txt to prompts/shared/ (deleted login_resources/)
- Updated CLAUDE.md to reflect new structure
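
For illustration, an include expander of this kind can be sketched as below. The name processIncludes() is from this commit, but the exact directive syntax (`@include(path)`), the `loadFile` callback, and the depth limit are assumptions; the real function presumably reads from the prompts/ directory.

```javascript
// Sketch only: expands @include(path) directives recursively.
// loadFile is a hypothetical callback that returns the text for a given path.
function processIncludes(template, loadFile, depth = 0) {
  if (depth > 10) {
    // Guard against files that include each other in a cycle.
    throw new Error('Include depth exceeded (possible circular @include)');
  }
  return template.replace(/@include\(([^)]+)\)/g, (_match, path) =>
    processIncludes(loadFile(path.trim()), loadFile, depth + 1)
  );
}

// Usage with an in-memory stand-in for prompts/shared/ (file names are made up).
const sharedFiles = {
  'shared/rules.txt': 'Rule A\n@include(shared/scope.txt)',
  'shared/scope.txt': 'Scope: in-scope hosts only',
};
const expanded = processIncludes(
  'Intro\n@include(shared/rules.txt)\nEnd',
  (path) => {
    if (!(path in sharedFiles)) throw new Error(`Missing include: ${path}`);
    return sharedFiles[path];
  }
);
```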

Impact: -137 net lines, zero breaking changes, and future shared content can be added under prompts/shared/ without further duplication.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-23 16:19:25 -07:00
ajmallesh
dafd9148f6 chore: remove ~500 lines of dead code identified by knip
Remove unused files and exports to improve codebase maintainability:

Phase 1 - Deleted files (5):
- login_resources/generate-totp-standalone.mjs (replaced by MCP tool)
- mcp-server/src/tools/index.js (unused barrel export)
- mcp-server/src/utils/index.js (unused barrel export)
- mcp-server/src/validation/index.js (unused barrel export)
- src/agent-status.js (deprecated 309-line status manager)

Phase 2 - Removed unused exports (3):
- mcp-server/src/index.js: shannonHelperServer constant
- mcp-server/src/utils/error-formatter.js: createFileSystemError function
- src/utils/git-manager.js: cleanWorkspace (now internal-only)

Phase 3 - Unexported internal functions (4):
- src/checkpoint-manager.js: runSingleAgent, runAgentRange,
  runParallelVuln, runParallelExploit (internal use only)

All Shannon CLI commands tested and verified working.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-23 12:46:51 -07:00
ajmallesh
d6c7cfd2a6 chore: migrate from deprecated @anthropic-ai/claude-code to @anthropic-ai/claude-agent-sdk
Anthropic rebranded the SDK in 2025 from "Claude Code SDK" to "Claude Agent SDK". Updated all references across package.json, Dockerfile, and documentation to use the current @anthropic-ai/claude-agent-sdk package.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-23 12:06:55 -07:00
ajmallesh
4704982184 chore: remove deprecated scripts 2025-10-23 11:57:14 -07:00
ajmallesh
55716963da feat: migrate to use MCP tools instead of helper scripts 2025-10-23 11:56:47 -07:00
ajmallesh
d6e5db2397 fix: critical bug - exploitation phase was always skipped
ROOT CAUSE:
- Exploitation phase checked session.validationResults to determine eligibility
- validationResults field was removed during audit system refactor
- After the refactor the field no longer existed in the session schema, so all exploits were skipped

THE FIX:
- Exploitation phase now validates queue files directly when checking eligibility
- Reads exploitation_queue.json and checks if vulnerabilities array is non-empty
- No need to store validation results - just re-validate on demand
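
The on-demand eligibility check can be sketched as a pure function over the queue file's contents. The helper name `hasExploitableFindings()` is hypothetical (the commit's own function is safeValidateQueueAndDeliverable(), whose signature is not shown here); only the `vulnerabilities` array check is taken from the description above.

```javascript
// Sketch: decide eligibility from the raw text of exploitation_queue.json.
function hasExploitableFindings(queueJsonText) {
  let queue;
  try {
    queue = JSON.parse(queueJsonText);
  } catch {
    // A malformed queue file is treated as "nothing to exploit".
    return false;
  }
  return Array.isArray(queue?.vulnerabilities) && queue.vulnerabilities.length > 0;
}
```

Deriving the answer from the file each time avoids keeping a second copy of the result in session state, which is the stale-state problem described in the root cause.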

CHANGES:
1. runParallelExploit() now calls safeValidateQueueAndDeliverable() directly
2. Removed validationResults parameter from markAgentCompleted()
3. Simplified calculateVulnerabilityAnalysisSummary() - no longer needs validation data
4. Simplified calculateExploitationSummary() - no longer needs validation data

IMPACT:
- Exploitation agents will now run when vulnerabilities are found
- Queue files are the single source of truth for eligibility
- Simpler architecture - no duplicate state storage

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-22 17:41:41 -07:00
ajmallesh
50bf583993 chore: remove run-metadata.json functionality
Reasoning:
- Pollutes target repo with run-metadata.json
- Redundant with audit system (session.json has all metadata)
- Less useful than comprehensive audit logs
- Target repos should stay clean - only deliverables belong there

All debugging info now lives in audit-logs/{hostname}_{sessionId}/session.json

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-22 16:19:40 -07:00
ajmallesh
82909185aa docs: enhance export-metrics.js documentation
- Added comprehensive header comment explaining use case
- Documents data source (session.json from audit-logs)
- CSV output format and use cases clearly described
- Includes usage examples and note about raw data access
- Removes need for separate docs/ folder in repo

Docs were design artifacts, not needed in open source repo.
All relevant documentation now lives in code comments.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-22 16:16:36 -07:00
ajmallesh
cb9971241b chore: remove reconcile-session.js script
Reasoning:
- Shannon is a local CLI tool with direct filesystem access
- Manual file editing (JSON, rm -rf) is simpler than reconciliation script
- Automatic reconciliation runs before every command (built-in)
- If auto-reconciliation has bugs, fix the code, don't create workarounds
- Over-engineered for a local development tool

For recovery: Just delete .shannon-store.json or edit JSON files directly

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-22 16:13:50 -07:00
ajmallesh
3babf02d68 feat: implement unified audit system v3.0 with crash-safety and self-healing
## Unified Audit System (v3.0)
- Implemented crash-safe, append-only logging to audit-logs/{hostname}_{sessionId}/
- Added session.json with comprehensive metrics (timing, cost, attempts)
- Agent execution logs with turn-by-turn detail
- Prompt snapshots saved to audit-logs/.../prompts/{agent}.md
- SessionMutex prevents race conditions during parallel execution
- Self-healing reconciliation before every CLI command
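
As an illustration of the locking idea, a per-session mutex can be built from a chain of promises. The class name SessionMutex is from this commit; the `lock()` method, its return value, and the internal Map are assumptions, not the actual implementation.

```javascript
// Sketch of a per-session async mutex: callers queue up behind the previous holder.
class SessionMutex {
  constructor() {
    this.tails = new Map(); // sessionId -> promise that settles when the queue drains
  }

  // Resolves to an unlock function once the caller holds the lock.
  async lock(sessionId) {
    const previous = this.tails.get(sessionId) ?? Promise.resolve();
    let release;
    const current = new Promise((resolve) => { release = resolve; });
    const tail = previous.then(() => current);
    this.tails.set(sessionId, tail);

    await previous; // wait for every earlier holder to unlock
    return () => {
      release();
      // Drop the entry when no later caller has queued behind this one.
      if (this.tails.get(sessionId) === tail) this.tails.delete(sessionId);
    };
  }
}
```

A parallel agent would call `const unlock = await mutex.lock(id)`, update session.json, and then call `unlock()` in a `finally` block so a thrown error cannot leave the session locked.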

## Session Metadata Standardization
- Fixed critical bug: standardized on 'id' field (not 'sessionId') throughout codebase
- Updated: shannon.mjs (recon, report), src/phases/pre-recon.js
- Added validation in AuditSession to fail fast on incorrect field usage
- JavaScript shorthand syntax was causing wrong field names
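
The shorthand pitfall is easy to reproduce: `{ sessionId }` creates a key named after the variable, not the `id` key the schema expects. The validator below is a hypothetical stand-in for the fail-fast check mentioned above; its name and error message are assumptions.

```javascript
// Hypothetical fail-fast check in the spirit of the AuditSession validation.
function assertSessionMetadata(metadata) {
  if (!metadata || typeof metadata.id !== 'string' || metadata.id.length === 0) {
    throw new Error("Session metadata requires a non-empty string 'id' field");
  }
  return metadata;
}

const sessionId = 'efc60ee0-example'; // made-up value for illustration

// Shorthand property: the key is 'sessionId', so 'id' is silently missing.
const wrongMetadata = { sessionId, webUrl: 'http://localhost:8080' };

// Explicit key: matches the standardized field name.
const rightMetadata = { id: sessionId, webUrl: 'http://localhost:8080' };
```

Passing `wrongMetadata` to the check throws immediately, instead of producing a session whose identifier is `undefined` further down the pipeline.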

## Schema Improvements
- session.json: Added cost_usd per phase, removed redundant final_cost_usd
- Renamed 'percentage' -> 'duration_percentage' for clarity
- Simplified agent metrics to single total_cost_usd field
- Removed unused validation object from schema

## Legacy System Removal
- Removed savePromptSnapshot() - prompts now only saved by audit system
- Removed target repo pollution (prompt-snapshots/ no longer created)
- Single source of truth: audit-logs/{hostname}_{sessionId}/prompts/

## Export Script Simplification
- Removed JSON export mode (session.json already exists)
- CSV-only export with clean columns: agent, phase, status, attempts, duration_ms, cost_usd
- Tested on real session data
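
A rough sketch of the CSV export, using the column list above. The shape of session.json assumed here (an `agents` object keyed by agent name, with `phase`, `status`, `attempts`, `duration_ms`, and `total_cost_usd` per agent) is a guess for illustration and may not match the real schema.

```javascript
const COLUMNS = ['agent', 'phase', 'status', 'attempts', 'duration_ms', 'cost_usd'];

// Quote a field only when it contains a comma, quote, or newline.
function csvEscape(value) {
  const text = value === undefined || value === null ? '' : String(value);
  return /[",\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Assumed input: a parsed session.json with per-agent metrics under `agents`.
function sessionToCsv(session) {
  const rows = Object.entries(session.agents ?? {}).map(([agent, metrics]) => [
    agent,
    metrics.phase,
    metrics.status,
    metrics.attempts,
    metrics.duration_ms,
    metrics.total_cost_usd,
  ]);
  return [COLUMNS, ...rows].map((row) => row.map(csvEscape).join(',')).join('\n');
}
```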

## Documentation
- Updated CLAUDE.md with audit system architecture
- Added .gitignore entry for audit-logs/

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-22 16:09:08 -07:00
ajmallesh
b8dc33f180 chore: remove screenshot saving from Playwright MCP instances
Remove unnecessary screenshot storage to reduce file I/O and disk usage:
- Removed screenshot directory creation
- Removed --output-dir flag from Playwright MCP setup
- Agents can still take screenshots, but they won't persist to disk

Screenshots were not being used by any part of Shannon for analysis
or reporting, making their storage unnecessary overhead.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-22 12:15:47 -07:00
ajmallesh
04f810b9fb chore: remove permanent deliverables copying to Documents folder
Simplified deliverable management by removing automatic copying to ~/Documents/pentest-deliverables/. All deliverables now remain only in <target-repo>/deliverables/, eliminating file duplication and improving UX.

Changes:
- Removed savePermanentDeliverables() function from src/setup/deliverables.js
- Removed function call and related console output from shannon.mjs
- Removed unused 'os' import from deliverables.js

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-22 12:11:48 -07:00
ajmallesh
be776c4640 chore: add save-deliverable script, decoupling deliverable creation from the actual content 2025-10-22 11:31:58 -07:00
ajmallesh
de8c5ee041 chore: upgrade model from Sonnet 4 -> Sonnet 4.5 2025-10-21 16:34:56 -07:00
ajmallesh
dc290a8fc4 Merge pull request #1 from Khaushik-keygraph/main
chore: added logging
2025-10-21 09:16:59 -07:00
Khaushik-keygraph
ac15ac284f chore: optimized logging 2025-10-17 13:59:34 +05:30
Khaushik-keygraph
e193d04594 chore: added logging 2025-10-17 13:52:13 +05:30
Khaushik-keygraph
6cdd067dd4 Update README.md 2025-10-09 15:54:04 +05:30
Khaushik-keygraph
76d4b3a040 fix: renamed agent filename 2025-10-08 23:49:16 +05:30
ajmallesh
c470983d15 docs: update Discord invite link to infinite expiry
Updated Discord invite links in README.md to use a permanent invite link
that will not expire.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-07 14:10:55 -07:00
keygraphVarun
ad36ab6a4a Update README.md 2025-10-07 13:50:31 -07:00
keygraphVarun
2ad8278112 Update README.md 2025-10-07 13:09:29 -07:00
keygraphVarun
64126c816a Update README.md 2025-10-07 12:59:22 -07:00
keygraphVarun
2239cee695 Create SHANNON-PRO.md 2025-10-07 12:49:33 -07:00
keygraphVarun
ed7cbb0672 Add files via upload
gif
2025-10-07 12:47:04 -07:00
keygraphVarun
3b3151c175 Add files via upload
assets
2025-10-07 11:51:31 -07:00
keygraphVarun
e9ee158fc7 Update README.md
italics
2025-10-06 18:28:11 -07:00
keygraphVarun
605b1d9971 Update README.md
typo
2025-10-06 18:27:00 -07:00
keygraphVarun
9a69c705d5 Update LICENSE
Simplified
2025-10-06 18:25:18 -07:00
keygraphVarun
9e213314de Update README.md
fixes
2025-10-06 18:20:41 -07:00
ajmallesh
9327630c45 Initial commit 2025-10-03 19:35:08 -07:00