Commit Graph

85 Commits

Author SHA1 Message Date
keygraphVarun
81ceabac1f clarify contributions 2025-12-16 13:14:29 -08:00
Arjun Malleswaran
41d3d3912d Merge pull request #21 from KeygraphHQ/bug-fixes
Docker and config path fixes
2025-12-15 10:41:12 -08:00
ajmallesh
1e784e650d fix: support absolute config paths in checkpoint manager
Co-Authored-By: Khaushik-keygraph <khaushik.contractor@keygraph.io>
2025-12-15 10:34:25 -08:00
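
The commit above says only that the checkpoint manager now supports absolute config paths; the actual change is not shown in this log. As a minimal sketch of the general technique, assuming a hypothetical helper `resolveConfigPath` and a hypothetical base directory for relative paths, an absolute path can be passed through unchanged while a relative one is resolved against the base:

```javascript
const path = require("node:path");

// Hypothetical helper, not taken from the repository.
// Absolute paths are returned as given (normalized); relative paths are
// resolved against baseDir instead of being blindly joined onto it.
function resolveConfigPath(configPath, baseDir) {
  if (path.isAbsolute(configPath)) {
    return path.normalize(configPath);
  }
  return path.resolve(baseDir, configPath);
}
```

A naive `path.join(baseDir, configPath)` would prepend `baseDir` even to an absolute path, which is one plausible way such a bug can arise; whether that was the cause here is an assumption.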
ajmallesh
906d464abd fix: configure git to trust all directories in Docker
Co-Authored-By: Khaushik-keygraph <khaushik.contractor@keygraph.io>
2025-12-15 10:34:25 -08:00
ajmallesh
fba798ac49 docs: add Docker instructions for testing local applications
Co-Authored-By: Khaushik-keygraph <khaushik.contractor@keygraph.io>
2025-12-15 10:34:24 -08:00
Khaushik-keygraph
c655e8a716 chore: added disable loader functionality 2025-12-10 00:59:56 +05:30
Arjun Malleswaran
accb9562ba Merge pull request #19 from KeygraphHQ/additional-flags
chore: added flag additions for minimizing logs
2025-12-09 10:33:36 -08:00
Khaushik-keygraph
38e49eb1eb chore: added flag additions for minimizing logs 2025-12-09 23:59:12 +05:30
Arjun Malleswaran
c664000458 Merge pull request #18 from KeygraphHQ/16-windows-defender-flags-benchmark-deliverables-as-backdoorphpperhetshell-during-local-use
docs: add Windows Defender false positive guidance
2025-12-08 10:20:51 -08:00
ajmallesh
af41570ae9 docs: add Windows Defender false positive guidance
Closes #16
2025-12-02 19:07:37 -08:00
ajmallesh
2c410d90b3 docs: update Discord invite links 2025-12-01 09:24:19 -08:00
ajmallesh
534b18e303 chore: change license to AGPL-3.0 2025-11-26 18:45:36 -08:00
ajmallesh
8f2825b32f docs: clarify Shannon is a white-box pentesting tool
- Add prominent callout that Shannon Lite is designed for white-box
  (source-available) application security testing
- Update XBOW benchmark description to "hint-free, source-aware"
- Clarify benchmark comparison context (white-box vs black-box results)
- Update benchmark performance comparison image

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-24 12:37:55 -08:00
Khaushik-keygraph
369e3a34cf chore: added licensing to dockerfile 2025-11-22 20:46:15 +05:30
keygraphVarun
7f7285702e fix link 2025-11-22 20:43:09 +05:30
keygraphVarun
deb4e51f98 cleanup 2025-11-22 20:43:09 +05:30
keygraphVarun
2b14282ff6 consistency on score 2025-11-22 20:43:09 +05:30
ajmallesh
5bbd757b45 fix: resolve Docker build failure and clarify env var configuration
- Remove .env file with incorrect CLAUDE_CODE_MAX_TOKENS variable
- Remove .env copy from Dockerfile that was causing build to fail
- Update README to distinguish local (export) vs Docker (-e) env var usage
- Add CLAUDE_CODE_MAX_OUTPUT_TOKENS to all Docker run examples

The correct variable is CLAUDE_CODE_MAX_OUTPUT_TOKENS (not CLAUDE_CODE_MAX_TOKENS)
and should be passed at runtime via -e flag for Docker or export for local runs.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-19 10:28:44 -08:00
Khaushik-keygraph
d2519322d2 fix: removed comments 2025-11-13 20:33:58 +05:30
keygraphVarun
456d852b87 style changes 2025-11-13 20:28:15 +05:30
keygraphVarun
341448c8a3 Link to benchmark 2025-11-13 20:27:26 +05:30
ajmallesh
b32e71a9b4 chore: add licensing comments to prompts 2025-11-13 17:53:41 +05:30
ajmallesh
30f324be5e Update license references from BSL to MPL in documentation
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-13 17:48:05 +05:30
Arjun Malleswaran
378585a4a3 Merge pull request #14 from KeygraphHQ/license-change
License change
2025-11-13 16:57:18 +05:30
Arjun Malleswaran
c040efc6b5 Update LICENSE 2025-11-13 16:56:19 +05:30
ajmallesh
1051d40527 chore: add MPL license comments 2025-11-13 16:55:13 +05:30
Arjun Malleswaran
fbf24c5d10 Update README.md 2025-11-04 08:47:18 -08:00
Arjun Malleswaran
13231ae016 Update README.md 2025-11-04 08:46:15 -08:00
ajmallesh
76591ae5e2 Update README.md 2025-11-03 20:23:16 -08:00
ajmallesh
23e072a236 Merge branch 'main' of github.com:KeygraphHQ/shannon 2025-11-03 20:22:27 -08:00
ajmallesh
9d9b81d8a9 Update README.md 2025-11-03 20:22:18 -08:00
Arjun Malleswaran
3ff1609151 Merge pull request #9 from KeygraphHQ/adding-xben-results
Update README.md
2025-11-03 20:19:55 -08:00
ajmallesh
aa045b65da Update README.md 2025-11-03 20:16:08 -08:00
Arjun Malleswaran
b25cbaa643 Merge pull request #7 from KeygraphHQ/adding-xben-results
Adding xben results
2025-11-03 20:04:45 -08:00
ajmallesh
40f80fc4cb Update README.md 2025-11-03 20:04:21 -08:00
ajmallesh
c8d7ec1e29 docs: add benchmarks README 2025-11-03 20:03:06 -08:00
ajmallesh
cb54ad46a0 Rename SQLi/Command Injection to Injection throughout README
Consolidates SQL Injection and Command Injection references to the unified "Injection" terminology for consistency with agent naming and OWASP categorization.

Changes:
- Updated feature descriptions and vulnerability lists
- Modified architecture diagrams
- Simplified targeted vulnerability scope

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-03 16:56:40 -08:00
ajmallesh
7454b1a581 Add audit logs and update gitignore for xben results
Updates .gitignore to only ignore top-level audit-logs/ directory, allowing xben-benchmark-results audit logs to be tracked. This enables full reproducibility of benchmark runs with complete session data, prompts, and agent execution logs.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-03 16:29:56 -08:00
ajmallesh
aa49f9fc66 Add X-Bow benchmark performance visualization
This commit adds a professional performance comparison chart showing Shannon's 96% success rate against other autonomous pentesting systems on the X-Bow benchmark.

Chart features:
- Y-axis properly starts at 0% (honest data visualization)
- Shannon bar highlighted in brand orange
- Descriptive title with sample size (104 challenges)
- SVG format for scalability

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-03 12:34:55 -08:00
ajmallesh
c686d7f800 Add X-Bow benchmark results (104 test cases)
This commit adds comprehensive X-Bow (XBEN) benchmark results demonstrating Shannon's performance across 104 CTF security challenges. Each test case includes detailed penetration testing reports and exploitation evidence for reproducible research.

Contents:
- 104 XBEN test case directories (XBEN-001-24 through XBEN-104-24)
- Deliverables including analysis reports and exploitation evidence
- Individual test case results with vulnerability assessments

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-03 12:34:41 -08:00
ajmallesh
33f17dd570 docs: add ctf-mode branch documentation to README
Add a TIP callout in the Overview section documenting the ctf-mode branch
for users who want to run Shannon against Capture-The-Flag challenges with
optimized flag extraction prompts.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-03 10:35:45 -08:00
ajmallesh
939398074f refactor: update injection display name and add max tokens docs
- Change agent prefix from [SQLi/Cmd] to [Injection] to reflect expanded scope
- Add README documentation for CLAUDE_CODE_MAX_OUTPUT_TOKENS environment variable

This update aligns the display naming with the expanded injection analysis scope
that now covers SQLi, Command Injection, LFI/RFI, SSTI, Path Traversal, and
Insecure Deserialization vulnerabilities.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-03 10:21:17 -08:00
ajmallesh
4224d1c4f4 feat: expand injection analysis scope to cover LFI/RFI/SSTI/Path Traversal/Deserialization
Fixes responsibility gap where agents found vulnerabilities but rejected them as "out of scope"

Changes:
- vuln-injection.txt: Added LFI/RFI, SSTI, Path Traversal, Deserialization to scope
  - Updated role definition and objective
  - Added new vulnerability_type and slot_type enums
  - Added sink definitions and defense rules for new injection classes
  - Added witness payload examples
- pre-recon-code.txt: Expanded sink hunter agent to find file/template/deserialize sinks
- recon.txt: Updated Section 9 with clear injection source definitions for all types
- exploit-injection.txt: Updated evidence template to handle all injection types

Token-optimized: Condensed verbose sections while preserving critical guidance

Addresses XBEN benchmark failures where LFI/SSTI/Path Traversal were detected but excluded from exploitation queues

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-03 10:20:15 -08:00
ajmallesh
52d7cc46a6 feat: add environment variable support for Claude Code token limits
Introduces .env file configuration to manage CLAUDE_CODE_MAX_TOKENS, allowing flexible control of the context window size for AI analysis sessions. This enables users to tune token limits based on their specific penetration testing needs without modifying code.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-30 10:53:42 -07:00
ajmallesh
9d338ec948 fix: err handling for claude code session limit 2025-10-30 10:28:35 -07:00
ajmallesh
905156bbc1 chore: print audit logs folder location 2025-10-28 10:31:00 -07:00
ajmallesh
4be7f969a9 Merge pull request #3 from KeygraphHQ/feature/improve-audit-log-naming
Feature/improve audit log naming
2025-10-27 14:56:57 -07:00
ajmallesh
7f3bff9b36 Revert "feat: improve audit log naming with timestamp and app context"
This reverts the timestamp-based naming scheme that was causing audit log
fragmentation. Each agent execution was creating a new folder because the
timestamp kept changing.

Reverting back to simple, stable naming: {hostname}_{sessionId}

This ensures ONE folder per session, preventing the bug where multiple
folders were created for the same session.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-27 13:30:25 -07:00
ajmallesh
95b8e876eb fix: use session's original createdAt instead of current time
Fixed bug where audit system would create duplicate folders for the same
session because it was using current time instead of the session's original
createdAt timestamp.

Bug behavior:
- Session created at T1 → folder: {T1}_app_host_id/
- Audit re-initialized at T2 → NEW folder: {T2}_app_host_id/
- Result: 2 folders per session with same ID but different timestamps

Root cause:
- metrics-tracker.js:65 was calling formatTimestamp() (current time)
- Should use sessionMetadata.createdAt (original creation time)

Impact: Each running benchmark was creating 2 audit log folders instead of 1

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-27 10:55:53 -07:00
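
The bug described in the commit above can be illustrated with a small sketch. The session shape and the names `stampUtc`, `auditFolderFromClock`, and `auditFolderFromSession` are assumptions for illustration; the real `metrics-tracker.js` is not shown in this log.

```javascript
// Hypothetical sketch: helper names and the session shape are assumptions.

// Format a Date as a compact UTC stamp such as 20251027T170000Z.
function stampUtc(date) {
  return date.toISOString().replace(/\.\d{3}Z$/, "Z").replace(/[-:]/g, "");
}

// Buggy variant: the prefix comes from the clock at initialization time,
// so re-initializing the same session later yields a different folder.
function auditFolderFromClock(session, now = new Date()) {
  return `${stampUtc(now)}_${session.app}_${session.host}_${session.id}`;
}

// Fixed variant: the prefix comes from the session's original createdAt,
// so every initialization of the same session maps to the same folder.
function auditFolderFromSession(session) {
  const created = new Date(session.createdAt);
  return `${stampUtc(created)}_${session.app}_${session.host}_${session.id}`;
}
```

With a session created at `2025-10-27T17:00:00.000Z`, the clock-based variant returns different names when called at T1 and at T2, while the session-based variant returns the same name on every call.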
ajmallesh
6e89f26474 feat: improve audit log naming with timestamp and app context
Enhances audit log directory naming from `{hostname}_{uuid}` to
`{timestamp}_{appName}_{hostname}_{shortId}` for better discoverability
and benchmarking analysis.

Changes:
- Add extractAppName() helper to extract app name from config files
- Add smart fallback: use port number for localhost without config
- Update generateSessionIdentifier() to include timestamp prefix
- Shorten session ID to first 8 characters for readability

Examples:
- With config: 20251025T193847Z_myapp_localhost_efc60ee0/
- Without config: 20251025T193913Z_8080_localhost_d47e3bfd/
- Remote: 20251024T004401Z_noconfig_example-com_d47e3bfd/

Benefits:
- Chronologically sortable audit logs
- Instant app identification in directory listings
- Efficient filtering for benchmarking queries
- Non-breaking: existing logs keep their names

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-27 10:14:19 -07:00
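
The naming scheme `{timestamp}_{appName}_{hostname}_{shortId}` from the commit above can be sketched as follows. The commit names `extractAppName()` and `generateSessionIdentifier()` but does not show their signatures, so the functions here (`formatCompactTimestamp`, `pickAppLabel`, `buildAuditDirName`) are hypothetical stand-ins, and replacing dots in the hostname with hyphens is inferred from the `example-com` example rather than confirmed.

```javascript
// Hypothetical sketch: none of these signatures come from the repository.

// Format a Date as a compact UTC stamp, e.g. 20251025T193847Z.
function formatCompactTimestamp(date) {
  // toISOString() gives "2025-10-25T19:38:47.000Z"; drop milliseconds, then punctuation.
  return date.toISOString().replace(/\.\d{3}Z$/, "Z").replace(/[-:]/g, "");
}

// Choose the app label: the configured name if present, the port number for
// localhost without a config, otherwise the literal "noconfig".
function pickAppLabel(targetUrl, configAppName) {
  if (configAppName) return configAppName;
  const url = new URL(targetUrl);
  if (url.hostname === "localhost" && url.port) return url.port;
  return "noconfig";
}

// Assemble the directory name from its four parts.
function buildAuditDirName({ createdAt, targetUrl, configAppName, sessionId }) {
  const host = new URL(targetUrl).hostname.replace(/\./g, "-"); // assumed sanitization
  const shortId = sessionId.slice(0, 8); // first 8 characters of the session ID
  return [
    formatCompactTimestamp(createdAt),
    pickAppLabel(targetUrl, configAppName),
    host,
    shortId,
  ].join("_");
}
```

Because the stamp is fixed-width and zero-padded, a plain lexicographic sort of the directory names is also a chronological sort, which is the "chronologically sortable" benefit the commit lists.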