Commit Graph

4 Commits

Author SHA1 Message Date
CyberSecurityUP a8676fee0a v3.5.1: POMDP belief-state + value-of-information planner + grounded anti-hallucination
Partial observability is now first-class:

- belief.rs — property-graph world model; nodes (host/service/vuln/exploit/cred)
  carry a probability, not a boolean. Bayesian observation updates; per-node
  Shannon entropy; mean-uncertainty + recon-frontier. Black-box = diffuse priors
  that sharpen with observation; white-box collapses toward deterministic (MDP).
- pomdp.rs — value_of_information(), decide() (recon vs exploit falls out of
  belief entropy), and may_assert() — the mathematical anti-hallucination gate:
  no exploitability claim while the belief is diffuse (high entropy) → observe first.
- grounding.rs — verification engine, hard rule "no claim without a tool receipt":
  empirical grounding for black-box (raw HTTP/OOB/error markers), symbolic for
  white-box (file:line into reviewed source). Ungrounded claims demoted + flagged
  receipt_missing (feeds future reward shaping).
- pipeline.finish(): grounding gate before reporting + belief-uncertainty readout.
- bump 3.5.0 → 3.5.1; README documents the v3.5.1 belief/grounding architecture
  and the infra/bandit/reward roadmap.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 21:41:18 -03:00
CyberSecurityUP 435463979b v3.5.0: Claude-Code-style interactive harness (REPL) + instruction-steered testing
- New persistent interactive session (app/src/repl.rs), launched when run with no args:
  banner, model selection, API-key config (/key) or subscription (/sub), then a live
  session to set /target, /repo, /auth, and free-text /focus instructions (or just type
  them) that STEER which agents run and how.
- Slash-commands: /model /providers /key /sub /target /repo /auth /focus /mcp /votes
  /agents /show /run /quit  (+ bare text = focus).
- RunConfig gains `instructions` and `auth`:
  * instructions bias both LLM agent-selection and the heuristic (focus keywords →
    injection/access-control/etc. agents get a strong boost)
  * operator directives (focus + auth) injected into recon and exploit prompts so agents
    test as an authenticated user and prioritise the requested vuln classes
- bump 3.4.1 → 3.5.0 (CLI, harness, reports, credits)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 19:58:35 -03:00
CyberSecurityUP 5d83e8848e v3.4.1: harness intelligence — router, ReAct, dedup, token-trim, configurable MCP, +54 code agents, credits
- Task-based model ROUTER (recon/select prefer a fast model; exploit prefers primary; validate uses a different model than the finder)
- ReAct doctrine injected into exploit prompts (Thought→Action→Observation, token-efficient)
- Dedup: unique agents per run + findings deduped by CWE/endpoint/title (highest confidence kept)
- Token economy: recon blob capped for selector + per-agent context
- Configurable MCP: merge user mcp.servers.json into the pipeline's .mcp.json
- +54 white-box/code-analysis agents (NoSQLi, LDAP/XPath, JWT-none, Java/.NET/PHP/Go/Node/Python
  specifics, SSTI, ReDoS, deserialization, etc.) → 303 agents total (78 code)
- Credits: Joas A Santos & Red Team Leaders (CLI banner, interactive header, HTML+Typst report)
- README: GitHub stars/forks badges, 60-second quick start, full API config steps, intuitive layout

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 19:49:01 -03:00
CyberSecurityUP 96f00c1c68 v3.4.1: CLI-only Rust harness — interactive wizard, smart selection, tool doctrine, Typst, status
- Remove Rust web server (axum/tower-http); CLI-only binary
- Verbose logging (-v) + unique run-id output folder runs/ns-<ts>-<target>/
- status.json lifecycle (running → complete) + ✓ COMPLETE summary
- Interactive wizard when run with no args; detailed --help with testphp/DVWA examples + Kali tip
- Tool-usage doctrine injected into recon/exploit prompts: curl + rustscan/nmap
  (apt/brew/cargo install guidance) + browser via Playwright when present, else curl
- Smart recon-aware selection: map recon signals → agent categories, only run
  matching agents; heuristic fallback when LLM selection is empty
- Cross-model false-positive validation: voting prefers a model other than the finder
- Playwright MCP auto-provision (npx) + per-backend support (claude/codex; gemini/grok degrade)
- Gemini provider (API + gemini CLI subscription)
- Typst report (report.typ + compiled report.pdf) via blank structured template
- Lenient finding parsing (confidence as word/number) — fixes empty-results bug
- bump version 3.4.0 -> 3.4.1

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 19:34:13 -03:00