Fixed bug where audit system would create duplicate folders for the same
session because it was using current time instead of the session's original
createdAt timestamp.
Bug behavior:
- Session created at T1 → folder: {T1}_app_host_id/
- Audit re-initialized at T2 → NEW folder: {T2}_app_host_id/
- Result: 2 folders per session with same ID but different timestamps
Root cause:
- metrics-tracker.js:65 was calling formatTimestamp() (current time)
- Should use sessionMetadata.createdAt (original creation time)
Impact: Each running benchmark was creating 2 audit log folders instead of 1
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Enhances audit log directory naming from `{hostname}_{uuid}` to
`{timestamp}_{appName}_{hostname}_{shortId}` for better discoverability
and benchmarking analysis.
Changes:
- Add extractAppName() helper to extract app name from config files
- Add smart fallback: use port number for localhost without config
- Update generateSessionIdentifier() to include timestamp prefix
- Shorten session ID to first 8 characters for readability
Examples:
- With config: 20251025T193847Z_myapp_localhost_efc60ee0/
- Without config: 20251025T193913Z_8080_localhost_d47e3bfd/
- Remote: 20251024T004401Z_noconfig_example-com_d47e3bfd/
Benefits:
- Chronologically sortable audit logs
- Instant app identification in directory listings
- Efficient filtering for benchmarking queries
- Non-breaking: existing logs keep their names
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Resolves Playwright browser installation failures in Docker by using Wolfi's
system Chromium instead of downloading Playwright's bundled browsers at runtime.
## Problem
When running in Docker, agents attempted to install browsers via `browser_install`
tool, which failed due to:
- Permission issues (non-root user couldn't install system dependencies)
- npx @playwright/mcp spawns with its own Playwright dependency separate from
global installations
- Playwright's bundled browsers require runtime download (~280MB) and glibc deps
- Environment variables alone (PLAYWRIGHT_BROWSERS_PATH) weren't sufficient
## Solution
**Dockerfile changes:**
- Use Wolfi's native `chromium` package (guaranteed compatible, already installed)
- Remove Playwright browser installation step (saves ~280MB and build time)
- Add explicit `SHANNON_DOCKER=true` environment variable for reliable detection
- Set PLAYWRIGHT_CHROMIUM_EXECUTABLE_PATH to point to system Chromium
**Code changes (claude-executor.js):**
- Detect Docker via `process.env.SHANNON_DOCKER` (more reliable than /.dockerenv)
- Conditionally add `--executable-path /usr/bin/chromium-browser` CLI arg for Docker
- Local: Use Playwright's bundled browsers (downloaded to ~/Library/Caches/)
- Docker: Use system Chromium with no runtime downloads
## Research Findings
- @playwright/mcp has separate playwright-core dependency (v1.56.0-alpha)
- MCP server spawned via npx doesn't inherit browser binaries from global install
- --executable-path CLI argument is required (env vars insufficient)
- /.dockerenv file is unreliable (missing in BuildKit, K8s, can be spoofed)
## Testing
✅ Docker: All 5 parallel agents successfully navigate, screenshot, create deliverables
✅ Local: All 5 parallel agents successfully navigate, screenshot, create deliverables
✅ No browser_install calls, no permission errors
✅ Image size reduced by ~280MB
Fixes #docker-playwright-browser-issues
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Implemented @include() directive system to eliminate ~800 lines of duplicated content across 10 specialist prompt files. All prompt-related content now consolidated under prompts/ directory for better maintainability.
Changes:
- Added processIncludes() to prompt-manager.js for generic @include() support
- Created prompts/shared/ with 5 reusable template files
- Refactored all 10 specialist prompts to use @include() for common sections
- Moved login_instructions.txt to prompts/shared/ (deleted login_resources/)
- Updated CLAUDE.md to reflect new structure
Impact: -137 net lines, zero breaking changes, infinitely scalable for future shared content.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
ROOT CAUSE:
- Exploitation phase checked session.validationResults to determine eligibility
- validationResults field was removed during audit system refactor
- Field never existed in session schema, so all exploits were skipped
THE FIX:
- Exploitation phase now validates queue files directly when checking eligibility
- Reads exploitation_queue.json and checks if vulnerabilities array is non-empty
- No need to store validation results - just re-validate on demand
CHANGES:
1. runParallelExploit() now calls safeValidateQueueAndDeliverable() directly
2. Removed validationResults parameter from markAgentCompleted()
3. Simplified calculateVulnerabilityAnalysisSummary() - no longer needs validation data
4. Simplified calculateExploitationSummary() - no longer needs validation data
IMPACT:
- Exploitation agents will now run when vulnerabilities are found
- Queue files are the single source of truth for eligibility
- Simpler architecture - no duplicate state storage
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Reasoning:
- Pollutes target repo with run-metadata.json
- Redundant with audit system (session.json has all metadata)
- Less useful than comprehensive audit logs
- Target repos should stay clean - only deliverables belong there
All debugging info now lives in audit-logs/{hostname}_{sessionId}/session.json
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Remove unnecessary screenshot storage to reduce file I/O and disk usage:
- Removed screenshot directory creation
- Removed --output-dir flag from Playwright MCP setup
- Agents can still take screenshots, but they won't persist to disk
Screenshots were not being used by any part of Shannon for analysis
or reporting, making their storage unnecessary overhead.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Simplified deliverable management by removing automatic copying to ~/Documents/pentest-deliverables/. All deliverables now remain only in <target-repo>/deliverables/, eliminating file duplication and improving UX.
Changes:
- Removed savePermanentDeliverables() function from src/setup/deliverables.js
- Removed function call and related console output from shannon.mjs
- Removed unused 'os' import from deliverables.js
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>