# NeuroSploit v3.5.2 — Release Notes
**Release Date:** June 2026
**Codename:** Exploitation Depth & Report Hygiene
**License:** MIT
**Credits:** Joas A Santos & Red Team Leaders
---
## TL;DR
v3.5.2 hard-codes the discipline that separates a great pentest from a noisy
one — distilled from reviewing real AI-pentest output that kept stopping at
*"exposed"* instead of *"exploited"*. The engine now pushes every exposure to
demonstrated impact, **chains** findings, decodes/fingerprints artifacts and
correlates CVEs, audits tokens, and keeps the final report honest (deduplicated
and severity-calibrated).
## Highlights
- **DEPTH doctrine (exploit, don't just expose).** A new doctrine is injected
into every exploitation prompt (black/grey/chain): any info-disclosure,
exposed service/catalog/WSDL, leaked credential/token, or reachable dev host
**must be USED** before it can be a finding — call it, decode it, log in, hit
the dev host. If it was only observed, it's reported as a **lead**, not a
confirmed High/Critical.
- **Finding chaining.** Reuse any session/JWT/cookie/credential obtained in one
step across all other modules; pivot access into IDOR/privesc/exfil and report
the **chain**, not isolated parts (e.g. captcha-bypass→admin JWT→authenticated
surface; enum + no-rate-limit→password spraying).
- **Decode & fingerprint → CVE.** Decode opaque tokens/paths (base64/JSON/marshal)
and pin exact library/gem/plugin/CMS versions, then correlate to known CVEs and
attempt a safe PoC.
- **Token auditor.** JWT alg-confusion (RS→HS), `alg:none`, kid/jku injection,
real signature verification, **weak HS256 secret cracking**, and token
lifecycle (logout/expiry/refresh).
- **Report-hygiene & depth pass (deterministic, in the harness).** After
validation the run now:
  - **calibrates severity to proven impact**: an unproven High/Critical
    (hedged language, no payload, thin evidence) is capped to Medium and
    re-titled "(potential)";
  - flags **"exposed → exploited" gaps**: exposures on a host with no actual
    exploit get an advisory to go use them;
  - advises **consolidating hygiene** classes (headers/cookies/TLS/HSTS/
    clickjacking/disclosure) repeated across many assets into ONE finding with
    an affected-asset table, instead of inflating the count one-per-host.
- **5 new doctrine meta-agents** (`agents_md/meta/`): `exploit_depth_doctrine`,
`finding_chainer`, `artifact_decoder`, `token_auditor`, `report_calibrator`
(meta agents 17 → 22; total library 343 → 348).
- **Source from a GitHub URL.** `whitebox` / `greybox --repo` (and the REPL
`/repo`) now accept a **git URL** (`https://github.com/owner/repo[.git]`) or an
`owner/repo` shorthand — the repo is cloned (shallow) into `/repos/` and
reviewed automatically, no manual `git clone` needed:
```bash
neurosploit whitebox https://github.com/digininja/DVWA \
--subscription --model anthropic:claude-opus-4-8 -v
```
- **Azure OpenAI provider** (resolves #21). OpenAI-compatible: set
`AZURE_OPENAI_ENDPOINT` (+ optional `AZURE_OPENAI_API_VERSION`, default
`2024-10-21`) and `AZURE_OPENAI_API_KEY`, then `--model azure:`
(the model name is your Azure *deployment* name; auth via the `api-key`
header).
- **`GOOGLE_API_KEY` alias for Gemini** (resolves the confusion in #25). Gemini's API
path reads `GEMINI_API_KEY`, and now also accepts `GOOGLE_API_KEY` (Google's
standard env var) when the former is unset. Local providers (ollama/litellm)
still need **no** key at all.
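
The token-auditor checks listed above can be illustrated with a small, language-agnostic sketch. It is not the harness's Rust code: `audit_token`, `make_hs256` and the wordlist are hypothetical names, and only two of the listed checks (`alg:none` detection and weak HS256 secret cracking) are shown, using the Python standard library.

```python
import base64
import hashlib
import hmac
import json


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the padding that JWTs strip."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def b64url_encode(raw: bytes) -> str:
    """Encode bytes as unpadded base64url, as used in JWT segments."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def make_hs256(payload: dict, secret: str) -> str:
    """Build an HS256 token so the example has something to audit."""
    header = b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = b64url_encode(json.dumps(payload).encode())
    sig = hmac.new(secret.encode(), f"{header}.{body}".encode(), hashlib.sha256).digest()
    return f"{header}.{body}.{b64url_encode(sig)}"


def audit_token(token: str, wordlist: list[str]) -> dict:
    """Report whether the token declares alg:none and whether a wordlist secret signs it."""
    header_b64, body_b64, sig_b64 = token.split(".")
    header = json.loads(b64url_decode(header_b64))
    alg = str(header.get("alg", ""))
    result = {"alg": alg, "alg_none": alg.lower() == "none", "cracked_secret": None}
    if alg == "HS256":
        signing_input = f"{header_b64}.{body_b64}".encode()
        expected = b64url_decode(sig_b64)
        for candidate in wordlist:
            digest = hmac.new(candidate.encode(), signing_input, hashlib.sha256).digest()
            # Constant-time comparison of the recomputed and presented signatures.
            if hmac.compare_digest(digest, expected):
                result["cracked_secret"] = candidate
                break
    return result
```

A cracked secret is only evidence of impact once a forged token is accepted by the target, which is the "use it before reporting it" rule from the DEPTH doctrine.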
## Notes
- Purely additive and backward-compatible: existing modes, REPL, TUI, pause/continue,
crash-recovery and reports are unchanged. The hygiene pass only annotates and
down-calibrates unproven severities — it never invents or drops findings.
- New unit tests cover the calibration and depth-audit logic
(`harness::hygiene`).
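
As a rough illustration of the calibration rule described in these notes (an unproven High/Critical is capped to Medium and re-titled "(potential)"), here is a minimal Python sketch. It is not the `harness::hygiene` implementation: the `Finding` fields, the hedge-word list, the evidence-length threshold, and the choice to treat any single signal as "unproven" are all assumptions made for the example.

```python
from dataclasses import dataclass, replace

# Assumed heuristics; the real pass may use different signals and thresholds.
HEDGE_WORDS = ("may", "might", "could", "possibly", "potentially", "appears")
MIN_EVIDENCE_CHARS = 40


@dataclass(frozen=True)
class Finding:
    title: str
    severity: str
    description: str
    payload: str = ""
    evidence: str = ""


def is_unproven(f: Finding) -> bool:
    """True if the finding uses hedged language, has no payload, or has thin evidence."""
    words = (w.strip(".,;:") for w in f.description.lower().split())
    hedged = any(w in HEDGE_WORDS for w in words)
    return hedged or not f.payload.strip() or len(f.evidence.strip()) < MIN_EVIDENCE_CHARS


def calibrate(f: Finding) -> Finding:
    """Cap unproven High/Critical findings to Medium; leave everything else untouched."""
    if f.severity in ("High", "Critical") and is_unproven(f):
        title = f.title if f.title.endswith("(potential)") else f"{f.title} (potential)"
        return replace(f, severity="Medium", title=title)
    return f
```

The function only returns a modified copy, mirroring the note that the pass annotates and down-calibrates but never invents or drops findings.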
---
# NeuroSploit v3.5.1 — Release Notes
**Release Date:** June 2026
**Codename:** Interactive POMDP Harness
**License:** MIT
**Credits:** Joas A Santos & Red Team Leaders
---
## TL;DR
The 3.5.x line turns the Rust harness into a full **interactive REPL** (Claude
Code / Codex / Cursor-CLI style) on top of the multi-model engine: pick models
with arrow-keys, configure API keys per provider, set target/repo/auth/creds and
free-text instructions that steer the agents, then `/run` engagements **in the
background** while you keep typing. v3.5.1 adds a **POMDP belief spine** with
anti-hallucination grounding ("no claim without a tool receipt"), **infra/host**
testing (IP + SSH + Windows/AD) with Linux/Windows/AD agents, **attack-chain
agents**, a **Mission-Control TUI**, structured **Typst** reports, and resilient
run control (live checkpointing, pause-on-quota, instant stop).
## Highlights
- **Interactive REPL** (`neurosploit` with no subcommand): real line editing
(history ↑/↓, Ctrl-A/E/K, multiline), Tab-completion of `/commands` and
`@filesystem-paths` (Claude-Code-style file menu), arrow-key model multi-select,
per-provider API-key config, and a live context bar (`model · cwd · mode▸target`).
- **Engagement modes**: **black-box** (`run`), **white-box** SAST (`whitebox`,
set `/repo`), **grey-box** (`greybox`, `/repo` + `/target`), **host/infra**
(`/target ` + `/creds` for SSH / Windows / AD), plus the **TUI** dashboard.
- **POMDP belief state** (`belief.rs`, `pomdp.rs`): a property-graph with
probabilities + Bayesian update + Shannon-entropy uncertainty, a
value-of-information planner, and a **grounding gate** (`grounding.rs`,
`may_assert`) — findings must carry an empirical/symbolic **tool receipt**.
- **Infra / credentials** (`creds.rs`): multi-block YAML (jwt/header/cookie,
HTTP login, SSH, Windows/AD); real automated login; Linux/Windows/AD agents.
- **Attack-chain agents**: sqli→rce→lpe, ssrf→aws, upload→lfi→rce, and more —
injected as chain recipes during exploitation.
- **App-stack & CVE hunting**: IIS/.NET (tilde shortname, WebDAV, ViewState),
CMS (WordPress/Joomla/Drupal), app-server consoles, known-CVE exploitation.
- **13 providers** incl. **LiteLLM** proxy and Gemini/xAI alongside the existing
OpenAI-compatible set; **subscription mode** drives local agentic CLIs
(claude/codex/gemini/grok) via stream-json.
- **Mission-Control TUI** (`ratatui`): concurrent activity/findings/targets panels
with a non-blocking composer active during the run.
- **Structured Typst report**: executive summary, vulnerability-summary table,
and per-finding sections (criticality, CVSS, OWASP/CWE, PoC, evidence,
remediation) + an attack-graph / kill-chain mapping (OWASP/CWE/MITRE).
- **Per-project persistence** (`.neurosploit/`, no database): `session.json`,
`runs.json`, `history.txt` — resumes automatically on reopen.
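
The belief-state ideas above (Bayesian update, Shannon-entropy uncertainty, and a grounding gate) can be sketched for a single yes/no hypothesis. This is a simplified Python illustration, not the code in `belief.rs` or `grounding.rs`: the function signatures, the 0.9 threshold, and modelling a receipt as a plain string are assumptions.

```python
import math


def bayes_update(prior: float, p_obs_if_true: float, p_obs_if_false: float) -> float:
    """Posterior probability of a hypothesis after one observation (Bayes' rule)."""
    numerator = p_obs_if_true * prior
    evidence = numerator + p_obs_if_false * (1.0 - prior)
    return numerator / evidence if evidence > 0 else prior


def entropy_bits(p: float) -> float:
    """Shannon entropy of a binary belief, in bits; 0 means no uncertainty left."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def may_assert(posterior: float, receipts: list[str], threshold: float = 0.9) -> bool:
    """Grounding gate: a claim needs both high belief and at least one tool receipt."""
    return posterior >= threshold and len(receipts) > 0
```

For example, starting from a prior of 0.5, an observation that is nine times more likely when the hypothesis is true raises the belief and lowers its entropy, but `may_assert` still returns `False` until a receipt is attached.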
## Run control (new in 3.5.1)
- **Background `/run`** with a live progress bar, severity-colored findings, and
the full `file://` report URL on completion/stop.
- **3-way `/stop`**: **[1]** validate findings so far → report · **[2]** raw
report **now** without validating · **[3]** discard. Raw/discard abort
in-flight agents immediately (running CLI children are killed via
`kill_on_drop`); validate soft-stops so the validator still runs.
- **Crash/quit recovery**: every finding is checkpointed live to
`.neurosploit/active_run.json`; an interrupted run is recovered into `/runs`
on the next launch, so `/results`, `/finding` and `/report` keep working.
- **Pause-on-exhaustion**: when all models are rate-limited / out of quota the
run **parks** (state kept) and prints `⏸ token/quota exhausted … PAUSED`.
Resume with **`/continue`** when your quota renews, or switch with
**`/model `** (or the `/model` selector) then **`/continue`**.
- **Inspection**: `/results` (live findings), `/finding` (pick one → full
command + PoC + evidence), `/expand` / Ctrl-O (full untruncated commands),
`/status`, `/diff`, `/retest`.
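
The crash/quit recovery behaviour can be pictured as "write the whole checkpoint atomically after each finding, read it back on launch". The Python sketch below assumes a simple `{"findings": [...]}` layout for `.neurosploit/active_run.json`; the real file format and the function names `checkpoint` and `recover` are hypothetical.

```python
import json
import os
from pathlib import Path


def checkpoint(path: Path, findings: list[dict]) -> None:
    """Persist all findings so far; write to a temp file, then rename over the target."""
    path.parent.mkdir(parents=True, exist_ok=True)
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps({"findings": findings}))
    os.replace(tmp, path)  # atomic on the same filesystem, so readers never see a partial file


def recover(path: Path) -> list[dict]:
    """Load findings from an interrupted run; return an empty list if none are usable."""
    if not path.exists():
        return []
    try:
        return json.loads(path.read_text()).get("findings", [])
    except json.JSONDecodeError:
        return []
```

Writing through a temporary file is one common way to keep a checkpoint readable even if the process is killed mid-write.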
## Usage
```bash
cd neurosploit-rs && cargo build --release
./target/release/neurosploit # interactive REPL
./target/release/neurosploit run http://target -v --model anthropic:claude-opus-4-8
./target/release/neurosploit whitebox --repo /path/to/code # white-box SAST
./target/release/neurosploit greybox --repo /path --target http://target # grey-box
./target/release/neurosploit run --creds creds.yaml # host / infra
./target/release/neurosploit tui http://target --subscription --mcp
```
Cross-platform install (Linux / macOS / Windows, x64 + arm64) via `setup.sh` and
`install.ps1`. See **README.md** and **TUTORIAL.md** for the full walkthrough.
---
# NeuroSploit v3.4.0 — Release Notes
**Release Date:** June 2026
**Codename:** Rust Multi-Model Harness
**License:** MIT
---
## TL;DR
A new **Rust harness** (`neurosploit-rs/`) re-implements the autonomous runtime
as a single, fast binary built on `tokio` + `axum`. It drives a **pool of LLM
models** with concurrency limits, **provider failover**, and **N-model validator
voting** (multiple models must independently agree a finding is real before it
is reported), then serves its own web dashboard. It reuses the existing
`agents_md/` library (213 agents) unchanged.
## Highlights
- **`neurosploit-rs/` cargo workspace**: `harness` lib crate + `neurosploit`
binary. `cargo build --release` → one static-ish binary.
- **Multi-model pool** (`pool.rs`): bounded concurrency + automatic **failover**
across providers; the same panel is reused as the **validator voting** jury.
- **Pipeline** (`pipeline.rs`): recon → parallel agent exploitation (semaphore
bounded) → **N-model adversarial vote** → score → report. Streams live
progress over a channel.
- **11 providers / 31 models** (`models.rs`), all OpenAI-compatible: Anthropic,
OpenAI, xAI, NVIDIA NIM, DeepSeek, Mistral, Qwen, Groq, Together, OpenRouter,
Ollama. Models like **Qwen / DeepSeek / Llama** are usable directly.
- **Axum web dashboard** (`app/`): multi-model selection panel, live execution
console, findings, agent browser, embedded HTML report. Single binary serves
the SPA — no npm/build.
- **CLI**: `neurosploit serve | run | agents | models`, plus `--offline`
mode to exercise the full pipeline without any API keys.
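
The validator-voting step with bounded concurrency can be sketched as follows. This is a Python `asyncio` illustration of the idea, not the `pool.rs` / `pipeline.rs` code: the `Validator` callable type, the `need` threshold, and counting a failed model as a "no" vote are assumptions.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical shape: a validator takes a finding and answers "is this real?".
Validator = Callable[[dict], Awaitable[bool]]


async def vote(finding: dict, validators: list[Validator], need: int, max_parallel: int = 2) -> bool:
    """Ask every validator, at most `max_parallel` at a time; pass if `need` agree."""
    gate = asyncio.Semaphore(max_parallel)

    async def ask(validator: Validator) -> bool:
        async with gate:
            try:
                return await validator(finding)
            except Exception:
                # In this sketch a model that errors out simply counts as a "no" vote.
                return False

    votes = await asyncio.gather(*(ask(v) for v in validators))
    return sum(votes) >= need
```

A real pool would likely retry a failed call on another provider before counting it as a "no"; that failover step is omitted here for brevity.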
## Usage
```bash
cd neurosploit-rs && cargo build --release
./target/release/neurosploit serve # → http://127.0.0.1:8788
./target/release/neurosploit run https://t.example \
--model anthropic:claude-opus-4-8 --model openai:gpt-5.1 --vote-n 3
```
---
# NeuroSploit v3.3.0 — Release Notes
**Release Date:** June 2026
**Codename:** Autonomous MD-Agent Engine
**License:** MIT
---
## TL;DR
NeuroSploit's pentest agent has been **re-modeled into an autonomous,
markdown-driven engine**. You give it a URL; it composes a master prompt from a
curated library of **213 markdown agents** and drives a locally-installed
**agentic CLI backend** (Claude Code / Codex / Grok CLI, or a Claude
subscription) to run the engagement end-to-end — with **Playwright MCP** for
proof-of-execution and a **reinforcement-learning** loop that adapts agent
selection across runs. The old Python orchestration was retired to `legacy/`.
## Highlights
- **New engine `neurosploit_agent/`** + `./neurosploit` terminal launcher.
Interactive (`./neurosploit`) or one-shot (`./neurosploit run `).
- **213-agent markdown library (`agents_md/`)**: **196 vulnerability
specialists** (now covering LLM/AI, cloud/K8s, modern API/auth, advanced
injection, protocol smuggling, logic/crypto/supply-chain) + **17 meta-agents**.
- **Meta-agents for quality**: `recon`, `exploit_validator`,
`false_positive_filter`, `severity_assessor`, `impact_evaluator`, `reporter`,
and `rl_feedback` — the pipeline validates and adversarially refutes every
candidate before it can become a finding.
- **Pluggable agentic CLI backends** with auto-detection: Claude Code, Codex,
Grok CLI; **subscription mode** via Claude Code login.
- **Playwright MCP** wired in (`.mcp.json`) so agents prove client-side execution
(XSS/CSTI) and capture DOM/network/screenshots instead of trusting reflection.
- **Reinforcement learning** (`neurosploit_agent/rl.py` + `meta/rl_feedback.md`):
bounded per-agent weights with per-tech-stack affinity, persisted to
`data/rl_state.json`.
- **Latest model registry** (`neurosploit_agent/models.py`): Anthropic Claude
4.x, OpenAI, xAI Grok, Gemini, OpenRouter, Ollama, and **NVIDIA NIM** (PR #28,
OpenAI-compatible `integrate.api.nvidia.com`, `nvapi-` keys).
- **Data-driven agent builder** `scripts/build_agents.py` for extending the
library without boilerplate.
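
To make "bounded per-agent weights with per-tech-stack affinity" concrete, here is a minimal sketch. It is not the logic in `neurosploit_agent/rl.py`: the bounds, the learning rate, the additive update rule, and the nested-dict state layout are assumptions chosen so the state stays JSON-serializable.

```python
# Assumed hyperparameters for illustration only.
W_MIN, W_MAX, LEARNING_RATE = 0.1, 3.0, 0.2


def update_weight(state: dict, agent: str, stack: str, reward: float) -> float:
    """Nudge an agent's weight for one tech stack by the reward, clamped to the bounds."""
    per_agent = state.setdefault(agent, {})
    current = per_agent.get(stack, 1.0)  # unseen (agent, stack) pairs start neutral
    updated = current + LEARNING_RATE * reward
    per_agent[stack] = max(W_MIN, min(W_MAX, updated))
    return per_agent[stack]


def rank_agents(state: dict, stack: str) -> list[str]:
    """Order known agents by their weight for the given stack, best first."""
    return sorted(state, key=lambda a: state[a].get(stack, 1.0), reverse=True)
```

Clamping keeps a lucky or unlucky streak from permanently promoting or silencing an agent, which is presumably the point of the weights being bounded.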
## Breaking changes
- The monolithic `neurosploit.py` orchestrator and Python agent classes moved to
`legacy/` and are no longer the supported entrypoint. Use `./neurosploit`.
- Primary agent library moved from `prompts/agents/` to `agents_md/` (originals
preserved; meta/role prompts split into `agents_md/meta/`).
## Upgrade notes
1. Install at least one agentic CLI: Claude Code, Codex, or Grok CLI.
2. `npx` (Node) is required for Playwright MCP.
3. Copy `.env.example` → `.env`; set a provider key (or use Claude subscription).
4. `./neurosploit backends` to confirm detection, then `./neurosploit`.
---
# NeuroSploit v3.0.0 — Release Notes
**Release Date:** February 2026
**Codename:** Autonomous Pentester
**License:** MIT
---
## Overview
NeuroSploit v3 is a ground-up overhaul of the AI-powered penetration testing platform. This release transforms the tool from a scanner into an autonomous pentesting agent capable of reasoning, adapting strategy in real time, chaining exploits, validating findings with anti-hallucination safeguards, and executing tools inside isolated Kali Linux containers.
### By the Numbers
| Metric | Count |
|--------|-------|
| Vulnerability types supported | 100 |
| Payload libraries | 107 |
| Total payloads | 477+ |
| Kali sandbox tools | 55 |
| Backend core modules | 63 Python files |
| Backend core code | 37,546 lines |
| Autonomous agent | 7,592 lines |
| AI decision prompts | 100 (per-vuln-type) |
| Anti-hallucination prompts | 12 composable templates |
| Proof-of-execution rules | 100 (per-vuln-type) |
| Known CVE signatures | 400 |
| EOL version checks | 19 |
| WAF signatures | 16 |
| WAF bypass techniques | 12 |
| Exploit chain rules | 10+ |
| Frontend pages | 14 |
| API endpoints | 111+ |
| LLM providers supported | 6 |
---
## Architecture
```
             +---------------------+
             |  React/TypeScript   |
             |  Frontend (14p)     |
             +----------+----------+
                        |
                WebSocket + REST
                        |
             +----------v----------+
             |  FastAPI Backend    |
             |  14 API routers     |
             +----------+----------+
                        |
     +---------+--------+--------+---------+
     |         |        |        |         |
+----v---+ +---v----+ +v------+ +v------+ +v--------+
| LLM    | | Vuln   | | Agent | | Kali  | | Report  |
| Manager| | Engine | | Core  | |Sandbox| | Engine  |
| 6 provs| | 100typ | |7592 ln| | 55 tl | | 2 fmts  |
+--------+ +--------+ +-------+ +-------+ +---------+
```
**Stack:** Python 3.10+ / FastAPI / SQLAlchemy (async) / React 18 / TypeScript / Tailwind CSS / Vite / Docker
---
## Core Engine: 100 Vulnerability Types
The vulnerability engine covers 100 distinct vulnerability types organized into the 11 categories listed below, with dedicated testers, payloads, AI prompts, and proof-of-execution rules for each.
### Categories & Types
| Category | Types | Examples |
|----------|-------|---------|
| **Injection** | 12 | SQLi (error, union, blind, time-based), Command Injection, SSTI, NoSQL, LDAP, XPath, Expression Language, HTTP Parameter Pollution |
| **XSS** | 3 | Reflected, Stored (two-phase form+display), DOM-based |
| **Authentication** | 7 | Auth Bypass, JWT Manipulation, Session Fixation, Weak Password, Default Credentials, 2FA Bypass, OAuth Misconfig |
| **Authorization** | 5 | IDOR, BOLA, BFLA, Privilege Escalation, Mass Assignment, Forced Browsing |
| **Client-Side** | 9 | CORS, Clickjacking, Open Redirect, DOM Clobbering, PostMessage, WebSocket Hijack, Prototype Pollution, CSS Injection, Tabnabbing |
| **File Access** | 5 | LFI, RFI, Path Traversal, XXE, File Upload |
| **Request Forgery** | 3 | SSRF, SSRF Cloud (AWS/GCP/Azure metadata), CSRF |
| **Infrastructure** | 7 | Security Headers, SSL/TLS, HTTP Methods, Directory Listing, Debug Mode, Exposed Admin, Exposed API Docs, Insecure Cookies |
| **Advanced** | 9 | Race Condition, Business Logic, Rate Limit Bypass, Type Juggling, Timing Attack, Host Header Injection, HTTP Smuggling, Cache Poisoning, CRLF |
| **Data Exposure** | 6 | Sensitive Data, Information Disclosure, API Key Exposure, Source Code Disclosure, Backup Files, Version Disclosure |
| **Cloud & Supply Chain** | 6 | S3 Misconfig, Cloud Metadata, Subdomain Takeover, Vulnerable Dependency, Container Escape, Serverless Misconfig |
### Injection Routing
Every vulnerability type is routed to the correct injection point:
- **Parameter injection** (default): SQLi, XSS, IDOR, SSRF, etc.
- **Header injection**: CRLF, Host Header, HTTP Smuggling
- **Body injection**: XXE
- **Path injection**: Path Traversal, LFI
- **Both (param + path)**: LFI, directory traversal variants
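
The routing rules above amount to a lookup table with a parameter-injection default. A minimal Python sketch, assuming hypothetical snake_case type keys and treating LFI as the "both" case:

```python
from enum import Enum


class InjectionPoint(Enum):
    PARAMETER = "parameter"
    HEADER = "header"
    BODY = "body"
    PATH = "path"


# Only the non-default routes are listed; the keys are illustrative, not the engine's identifiers.
ROUTES = {
    "crlf": (InjectionPoint.HEADER,),
    "host_header": (InjectionPoint.HEADER,),
    "http_smuggling": (InjectionPoint.HEADER,),
    "xxe": (InjectionPoint.BODY,),
    "path_traversal": (InjectionPoint.PATH,),
    "lfi": (InjectionPoint.PARAMETER, InjectionPoint.PATH),
}


def route(vuln_type: str) -> tuple:
    """Return the injection points for a vulnerability type, defaulting to parameters."""
    return ROUTES.get(vuln_type, (InjectionPoint.PARAMETER,))
```

With this shape, adding a new vulnerability type only requires a table entry when it deviates from the default.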
### XSS Pipeline (Reflected)
The reflected XSS engine is a multi-stage pipeline:
1. **Canary probe** — unique marker per endpoint+param to detect reflection
2. **Context analysis** — 8 contexts: html_body, attribute_value, script_string, script_block, html_comment, url_context, style_context, event_handler
3. **Filter detection** — batch probe to map allowed/blocked chars, tags, events
4. **AI payload generation** — LLM generates context-aware bypass payloads
5. **Escalation payloads** — WAF/encoding bypass variants
6. **Testing** — up to 30 payloads per param with per-payload dedup
7. **Browser validation** — Playwright popup/cookie/DOM/event verification (optional)
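
The first two stages of the pipeline above (canary probe and context analysis) can be approximated with string inspection. The sketch below is a simplification rather than the engine's implementation: `make_canary` and `classify_context` are hypothetical names, the canary is derived from a hash (a random marker would work equally well), and only five of the eight contexts are distinguished, using heuristics that a real HTML parser would do better.

```python
import hashlib
import re
from typing import Optional


def make_canary(endpoint: str, param: str) -> str:
    """Derive a marker that is unique per endpoint+param and unlikely to occur naturally."""
    digest = hashlib.sha256(f"{endpoint}|{param}".encode()).hexdigest()[:10]
    return f"nsx{digest}"


def classify_context(body: str, canary: str) -> Optional[str]:
    """Guess the reflection context of the first canary occurrence; None if not reflected."""
    idx = body.find(canary)
    if idx == -1:
        return None
    before = body[:idx]
    lowered = before.lower()
    # Inside an unclosed HTML comment?
    if before.rfind("<!--") > before.rfind("-->"):
        return "html_comment"
    # Inside an unclosed <script> element?
    script_idx = lowered.rfind("<script")
    if script_idx > lowered.rfind("</script"):
        segment = before[script_idx:]
        # An odd number of quotes suggests we are inside a JavaScript string literal.
        if segment.count("'") % 2 == 1 or segment.count('"') % 2 == 1:
            return "script_string"
        return "script_block"
    # Inside an unclosed tag, right after an opening attribute quote?
    last_open = before.rfind("<")
    if last_open > before.rfind(">") and re.search(r"=\s*[\"'][^\"']*$", before[last_open:]):
        return "attribute_value"
    return "html_body"
```

The detected context is what lets later stages pick payloads that break out of the right syntax, for instance closing a quote first when the reflection lands in `attribute_value`.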
### POST Form Support
- HTML forms detected during recon with method, action, all input fields (including `