v3.5.2 — Exploitation Depth & Report Hygiene

Distilled from reviewing real AI-pentest output that kept stopping at "exposed" instead of "exploited". Pure-additive, back-compatible.

Behavior (injected into black/grey/chain exploit prompts via DEPTH_DOCTRINE):
- Exposed → exploited: any info-disclosure / exposed service/WSDL / leaked credential|token / reachable dev host MUST be used before it's a finding; otherwise it's a lead, not a confirmed High/Critical.
- Chain across modules: reuse obtained session/JWT/cookie/credential and pivot to IDOR/privesc/exfil; report the chain, not isolated parts.
- Decode & fingerprint → CVE; audit tokens (alg-confusion/none/kid/JWKS, weak HS256 secret cracking, lifecycle).

Deterministic post-pass (new crates/harness/src/hygiene.rs, wired into finish()):
- calibrate severity to PROVEN impact: unproven High/Critical (hedged, no payload, thin evidence) capped to Medium and re-titled "(potential)";
- depth_audit: flag exposures on a host with no real exploit;
- hygiene_summary: advise consolidating hygiene classes repeated across assets.

Unit tests cover calibration + depth audit.

5 new doctrine meta-agents (scripts/build_methodology_v352.py → agents_md/meta/): exploit_depth_doctrine, finding_chainer, artifact_decoder, token_auditor, report_calibrator (meta 17→22, total 343→348).

Version bumped 3.5.1 → 3.5.2 across crates/app/installers/docs; RELEASE/README updated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-29 23:05:30 +02:00 · 2026-06-26 11:31:11 -03:00
parent ac84db024c
commit e4efa9bbb0
23 changed files with 628 additions and 28 deletions
@@ -1,4 +1,4 @@
-<h1 align="center">🧠 NeuroSploit v3.5.1</h1>
+<h1 align="center">🧠 NeuroSploit v3.5.2</h1>

 <p align="center">
  <a href="https://github.com/JoasASantos/NeuroSploit/stargazers"><img src="https://img.shields.io/github/stars/JoasASantos/NeuroSploit?style=for-the-badge&logo=github&color=8b5cf6" alt="Stars"></a>
@@ -8,7 +8,7 @@
 </p>

 <p align="center">
-  <img src="https://img.shields.io/badge/Version-3.5.1-blue?style=flat-square">
+  <img src="https://img.shields.io/badge/Version-3.5.2-blue?style=flat-square">
  <img src="https://img.shields.io/badge/Harness-Rust%20%7C%20tokio-e6b673?style=flat-square">
  <img src="https://img.shields.io/badge/License-MIT-green?style=flat-square">
  <img src="https://img.shields.io/badge/MD%20Agents-329-red?style=flat-square">
@@ -24,6 +24,13 @@
 >
 > 📖 **New here? Read the [full Tutorial & User Guide →](TUTORIAL.md)** — every mode, flag, config and example explained.

+> 🆕 **New in v3.5.2 — Exploitation Depth & Report Hygiene:** a **DEPTH doctrine**
+> makes the engine *use* what it finds (exposed → exploited), **chain** findings
+> across modules, decode/fingerprint artifacts → CVEs, and **audit tokens** (JWT
+> alg-confusion / weak HS256 secrets). A deterministic post-pass **calibrates
+> severity to proven impact** and **consolidates duplicated hygiene** findings.
+> See [RELEASE.md](RELEASE.md).
+
 ---

 **NeuroSploit** turns a URL, a source repository, a running app, or a host/IP into
@@ -1,3 +1,63 @@
+# NeuroSploit v3.5.2 — Release Notes
+
+**Release Date:** June 2026
+**Codename:** Exploitation Depth & Report Hygiene
+**License:** MIT
+**Credits:** Joas A Santos & Red Team Leaders
+
+---
+
+## TL;DR
+
+v3.5.2 hard-codes the discipline that separates a great pentest from a noisy
+one — distilled from reviewing real AI-pentest output that kept stopping at
+*"exposed"* instead of *"exploited"*. The engine now pushes every exposure to
+demonstrated impact, **chains** findings, decodes/fingerprints artifacts and
+correlates CVEs, audits tokens, and keeps the final report honest (deduplicated
+and severity-calibrated).
+
+## Highlights
+
+- **DEPTH doctrine (exploit, don't just expose).** A new doctrine is injected
+  into every exploitation prompt (black/grey/chain): any info-disclosure,
+  exposed service/catalog/WSDL, leaked credential/token, or reachable dev host
+  **must be USED** before it can be a finding — call it, decode it, log in, hit
+  the dev host. If it was only observed, it's reported as a **lead**, not a
+  confirmed High/Critical.
+- **Finding chaining.** Reuse any session/JWT/cookie/credential obtained in one
+  step across all other modules; pivot access into IDOR/privesc/exfil and report
+  the **chain**, not isolated parts (e.g. captcha-bypass→admin JWT→authenticated
+  surface; enum + no-rate-limit→password spraying).
+- **Decode & fingerprint → CVE.** Decode opaque tokens/paths (base64/JSON/marshal)
+  and pin exact library/gem/plugin/CMS versions, then correlate to known CVEs and
+  attempt a safe PoC.
+- **Token auditor.** JWT alg-confusion (RS→HS), `alg:none`, kid/jku injection,
+  real signature verification, **weak HS256 secret cracking**, and token
+  lifecycle (logout/expiry/refresh).
+- **Report-hygiene & depth pass (deterministic, in the harness).** After
+  validation the run now:
+  - **calibrates severity to proven impact**: a High/Critical finding with no
+    payload and either hedged language or thin evidence (under 40 characters)
+    is capped to Medium and re-titled "(potential)";
+  - flags **"exposed → exploited" gaps** — exposures on a host with no actual
+    exploit get an advisory to go use them;
+  - advises **consolidating hygiene** classes (headers/cookies/TLS/HSTS/
+    clickjacking/disclosure) repeated across many assets into ONE finding with
+    an affected-asset table, instead of inflating the count one-per-host.
+- **5 new doctrine meta-agents** (`agents_md/meta/`): `exploit_depth_doctrine`,
+  `finding_chainer`, `artifact_decoder`, `token_auditor`, `report_calibrator`
+  (meta agents 17 → 22; total library 343 → 348).
+
+## Notes
+
+- Pure-additive and back-compatible: existing modes, REPL, TUI, pause/continue,
+  crash-recovery and reports are unchanged. The hygiene pass only annotates and
+  down-calibrates unproven severities — it never invents or drops findings.
+- New unit tests cover the calibration and depth-audit logic
+  (`harness::hygiene`).
+
+---
+
 # NeuroSploit v3.5.1 — Release Notes

 **Release Date:** June 2026
@@ -1,4 +1,4 @@
-# NeuroSploit — Tutorial & User Guide (v3.5.1)
+# NeuroSploit — Tutorial & User Guide (v3.5.2)

 A complete, hands-on guide to installing, configuring and running NeuroSploit —
 the autonomous, multi-model penetration-testing harness.
@@ -98,7 +98,7 @@ Agents **degrade gracefully**: if `rustscan` is absent they use `nmap`; if neith
 ### Verify

 ```bash
-neurosploit --version          # neurosploit 3.5.1
+neurosploit --version          # neurosploit 3.5.2
 neurosploit agents             # {"vulns":196,...,"chains":12,"total":329}
 neurosploit models             # all providers & models
 ```
@@ -0,0 +1,27 @@
+# Artifact Decoder & CVE Correlator Agent
+
+> Meta-agent (v3.5.2 doctrine). Decodes opaque tokens/paths, fingerprints the stack, and maps versions to CVEs.
+
+## User Prompt
+For **{target}**, inspect every opaque or technology-revealing artifact seen in
+recon and responses:
+
+1. **Decode** opaque tokens, IDs and URL paths (base64 / base64url / JSON /
+   marshal / JWT segments). A decoded value often reveals the framework or an
+   internal file path (e.g. a Dragonfly job `[["f","...file"]]`, a signed-URL
+   structure, a serialized object).
+2. **Fingerprint** the stack: server, framework, language, and exact library /
+   gem / plugin / CMS versions (headers, asset paths, readme/changelog, error
+   pages, manifests).
+3. **Correlate to CVEs**: map each exact version to known CVEs; prioritize
+   unauth RCE / SQLi / auth-bypass with a reliable, non-destructive PoC, and
+   attempt a safe confirmation (version/echo/OOB), never a destructive payload.
+
+Output JSON: {decoded:[{artifact, decoded_value, implication}],
+stack:[{component, version}], cves:[{component, version, cve, cvss, exploitable, poc}]}.
+
+## System Prompt
+You decode the opaque and correlate the obvious. Base64/JSON/marshal blobs and
+version banners are leads, not noise — you decode them, fingerprint exact
+versions, and check them against known CVEs, confirming only with a safe PoC and
+a real receipt. Authorized engagement; no destructive or DoS actions. Credits: Joas A Santos and Red Team Leaders.
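Review note: the "Decode" step above can be sketched without external crates. The block below is a minimal illustration only, not code from this commit; `b64_decode` and `decode_jwt_parts` are hypothetical helper names. It accepts both the standard and URL-safe alphabets and, as a simplification, ignores `=` wherever it appears.

```rust
/// Map one base64 / base64url character to its 6-bit value.
fn sextet(c: u8) -> Option<u32> {
    match c {
        b'A'..=b'Z' => Some((c - b'A') as u32),
        b'a'..=b'z' => Some((c - b'a') as u32 + 26),
        b'0'..=b'9' => Some((c - b'0') as u32 + 52),
        b'+' | b'-' => Some(62), // standard and URL-safe alphabets
        b'/' | b'_' => Some(63),
        _ => None,
    }
}

/// Decode base64 or base64url, with or without `=` padding.
/// Returns None on an invalid character or an impossible length.
/// (Hypothetical helper; `=` is stripped wherever it occurs.)
pub fn b64_decode(input: &str) -> Option<Vec<u8>> {
    let bytes: Vec<u8> = input.bytes().filter(|b| *b != b'=').collect();
    if bytes.len() % 4 == 1 {
        return None; // a lone trailing character cannot encode a byte
    }
    let mut out = Vec::with_capacity(bytes.len() * 3 / 4);
    for chunk in bytes.chunks(4) {
        let mut acc: u32 = 0;
        for &c in chunk {
            acc = (acc << 6) | sextet(c)?;
        }
        // Left-align a short final chunk so byte extraction is uniform.
        acc <<= 6 * (4 - chunk.len()) as u32;
        let n_bytes = chunk.len() * 3 / 4; // 4 chars -> 3 bytes, 3 -> 2, 2 -> 1
        for i in 0..n_bytes {
            out.push((acc >> (16 - 8 * i)) as u8);
        }
    }
    Some(out)
}

/// Split a JWT-shaped string and decode its header and payload as UTF-8.
/// The signature segment is ignored here; this only inspects, it does not verify.
pub fn decode_jwt_parts(token: &str) -> Option<(String, String)> {
    let mut parts = token.split('.');
    let header = String::from_utf8(b64_decode(parts.next()?)?).ok()?;
    let payload = String::from_utf8(b64_decode(parts.next()?)?).ok()?;
    Some((header, payload))
}
```

Calling `decode_jwt_parts` on a captured token yields the raw header and payload JSON, which is where fields such as `alg`, `kid` or an internal file path would show up for the fingerprinting step.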
@@ -0,0 +1,30 @@
+# Exploitation Depth Doctrine Agent
+
+> Meta-agent (v3.5.2 doctrine). Turns every exposure into an exploitation attempt before it becomes a finding.
+
+## User Prompt
+You are reviewing the candidate findings and live transcript for **{target}**.
+
+For EACH candidate that merely *exposes* something (information disclosure,
+exposed service/catalog/WSDL, leaked credential or token, reachable dev/staging
+host, permissive CORS, open .git), drive it one step further BEFORE it is
+reported:
+
+1. **Use what was exposed.** Call the exposed endpoint, decode the leaked
+   artifact, log in with the leaked credential, hit the dev host, send the
+   cross-origin request. Capture the real request/response.
+2. **Decide honestly.** If using it proved impact → keep/raise severity with the
+   new evidence. If it could not be used → down-rate to a LEAD (low confidence),
+   never a confirmed High/Critical.
+3. **Report the gap.** List any exposure you could not yet exploit, with the
+   exact next command to try, so the next round (or the human) can finish it.
+
+Output JSON: {"escalations":[{id, action_taken, new_evidence, new_severity}],
+"leads":[{id, why_not_proven, next_command}]}.
+
+## System Prompt
+You are a senior exploitation lead. Detection is not a finding — impact is. You
+never let an info-disclosure, exposed service, leaked secret or reachable
+non-prod host be reported as confirmed without an attempt to actually use it,
+backed by a real tool receipt. Unproven impact is a lead, not a High. Authorized
+engagement; no destructive or DoS actions. Credits: Joas A Santos and Red Team Leaders.
@@ -0,0 +1,25 @@
+# Finding Chainer Agent
+
+> Meta-agent (v3.5.2 doctrine). Reuses obtained access across modules and reports the chain, not the parts.
+
+## User Prompt
+Given the confirmed findings and any sessions/tokens/credentials obtained during
+the engagement on **{target}**, build exploitation CHAINS:
+
+- Reuse every session/JWT/cookie/credential from one step against ALL other
+  modules and hosts in scope (a captcha/login bypass that yields a token unlocks
+  the entire authenticated surface — use it).
+- Pivot access into higher impact: IDOR/BOLA, horizontal/vertical privesc, mass
+  assignment, data exfiltration, account takeover.
+- Combine separate weaknesses (e.g. user-enumeration + missing rate-limit =
+  password spraying; token-in-URL + no throttle = mass exfil).
+
+For each chain output: {chain_id, steps:[{finding_id, action}], combined_impact,
+combined_severity, evidence}. Prefer ONE well-evidenced chain over several
+isolated low-severity items.
+
+## System Prompt
+You are an exploit-chaining specialist. Isolated findings understate risk; the
+real story is the chain. You always try to reuse obtained access across the
+whole scope and escalate to business impact, reporting the combined chain with
+concrete evidence. Authorized engagement; no destructive or DoS actions. Credits: Joas A Santos and Red Team Leaders.
@@ -0,0 +1,30 @@
+# Report Calibrator Agent
+
+> Meta-agent (v3.5.2 doctrine). Deduplicates by class, calibrates severity to proven impact, demands evidence per claim.
+
+## User Prompt
+Before the final report for **{target}**, clean and calibrate the findings:
+
+1. **Consolidate hygiene by class.** Merge repeated hygiene findings (missing
+   security headers, clickjacking, cookie flags, weak TLS, HSTS, version/banner
+   disclosure) into ONE finding per class with an affected-asset TABLE — do not
+   inflate the count one-per-host.
+2. **Calibrate severity to PROVEN impact.** High/Critical requires demonstrated
+   impact with evidence. Unproven DoS/abuse, "could/may/potential" language, or a
+   finding with no concrete payload/PoC → cap to Low/Medium or mark
+   "(potential)". Recompute the CVSS vector to match the proven impact.
+3. **Evidence per claim.** Every finding — and every item in the "tests
+   performed" log — must carry a concrete request/response receipt; flag any
+   claim that has none, and any contradiction between the test log and the
+   findings.
+
+Output JSON: {merged:[{class, severity, assets:[...]}],
+recalibrated:[{id, old_severity, new_severity, reason}],
+unevidenced:[{id_or_test, missing}]}.
+
+## System Prompt
+You are a meticulous report editor. You group hygiene by class with an
+asset table, calibrate every severity to demonstrated impact (no inflated
+High/Critical, no padding the count with duplicates), and require a real
+receipt behind every claim — including each line of the tests-performed log.
+Honest, deduplicated, evidence-backed reporting only. Credits: Joas A Santos and Red Team Leaders.
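Review note: the "ONE finding per class with an affected-asset TABLE" instruction can be illustrated with a small grouping routine. This is a sketch under assumptions, not part of the commit: `consolidate` and `asset_table` are hypothetical names, the input is reduced to `(class, asset)` string pairs, and the Markdown layout is only one possible rendering.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Group (class, asset) pairs into one entry per class.
/// Assets are lowercased and deduplicated; BTree containers keep output order stable.
pub fn consolidate(items: &[(&str, &str)]) -> BTreeMap<String, BTreeSet<String>> {
    let mut groups: BTreeMap<String, BTreeSet<String>> = BTreeMap::new();
    for (class, asset) in items {
        groups
            .entry(class.to_string())
            .or_default()
            .insert(asset.to_lowercase());
    }
    groups
}

/// Render one consolidated finding as a Markdown affected-asset table.
pub fn asset_table(class: &str, assets: &BTreeSet<String>) -> String {
    let mut out = format!(
        "### {class} ({} assets)\n\n| # | Asset |\n|---|-------|\n",
        assets.len()
    );
    for (i, asset) in assets.iter().enumerate() {
        out.push_str(&format!("| {} | {} |\n", i + 1, asset));
    }
    out
}
```

Three per-host "missing HSTS" entries would collapse into a single section with a three-row table, so the finding count reflects classes rather than hosts.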
@@ -0,0 +1,26 @@
+# Token & JWT Auditor Agent
+
+> Meta-agent (v3.5.2 doctrine). Attacks tokens: alg-confusion, none, kid/jku, signature checks, weak HS256 secrets.
+
+## User Prompt
+For any session token or JWT issued by **{target}**, run a full auth-token audit:
+
+1. **Decode** the header/payload; note alg (HS*/RS*/none), kid, jku, exp, claims.
+2. **Algorithm attacks**: try `alg:none`, RS→HS confusion (sign with the public
+   key as HMAC secret), and kid/jku injection. Confirm whether the server
+   actually verifies the signature (tamper a claim and replay).
+3. **Weak secret**: for HS256, attempt to crack the signing secret offline
+   (wordlist/rules); a static or guessable shared secret (e.g. an `x-auth-*`
+   header value) is a strong lead — if cracked, forge a token for any user.
+4. **Lifecycle**: test reuse after logout, expiry enforcement, and refresh-token
+   revocation.
+
+Output JSON: {token_type, alg, verified:true|false,
+attacks:[{name, result, evidence}], forged_token_possible:true|false}.
+
+## System Prompt
+You are a token-security specialist. Every JWT/session token gets audited for
+algorithm confusion, none, kid/jku injection, real signature verification, weak
+HS256 secrets, and lifecycle (logout/expiry/refresh). A forged or replayable
+token is account takeover — you prove it with a real receipt. Authorized
+engagement; no destructive or DoS actions. Credits: Joas A Santos and Red Team Leaders.
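Review note: which of the audit steps above apply depends on the decoded header. The sketch below shows one way that selection might be expressed; it is not code from this commit. `planned_checks` and the step labels it returns are hypothetical, and it assumes `alg`, `kid` and `jku` have already been read from the decoded header. Treating every `HS*` algorithm as a weak-secret candidate is also an assumption; the agent text names HS256 specifically.

```rust
/// Select which audit steps apply to a JWT, given fields read from its
/// already-decoded header. The returned labels are illustrative only.
pub fn planned_checks(alg: &str, has_kid: bool, has_jku: bool) -> Vec<&'static str> {
    let alg = alg.to_ascii_uppercase();
    // These two apply to every token regardless of algorithm.
    let mut checks = vec!["signature-verification", "lifecycle"];
    if alg == "NONE" {
        // The server itself issued an unsigned token.
        checks.push("unsigned-token-issued");
    } else {
        // Check whether the server accepts a downgrade to alg:none.
        checks.push("alg-none");
    }
    if alg.starts_with("RS") {
        checks.push("rs-to-hs-confusion");
    }
    if alg.starts_with("HS") {
        checks.push("weak-hmac-secret");
    }
    if has_kid {
        checks.push("kid-injection");
    }
    if has_jku {
        checks.push("jku-injection");
    }
    checks
}
```

Keeping the selection as a plain function makes it easy to record in the report which checks were planned and which were actually run, which supports the "evidence per claim" rule.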
@@ -11,7 +11,7 @@ function Ok ($m) { Write-Host "  + $m" -ForegroundColor Green }
 function Warn($m){ Write-Host "  ! $m" -ForegroundColor Yellow }

 Write-Host ""
-Write-Host "  NeuroSploit installer (Windows) — v3.5.1" -ForegroundColor Cyan
+Write-Host "  NeuroSploit installer (Windows) — v3.5.2" -ForegroundColor Cyan
 $arch = $env:PROCESSOR_ARCHITECTURE
 Say "Platform: Windows / $arch"

@@ -871,7 +871,7 @@ dependencies = [

 [[package]]
 name = "neurosploit"
-version = "3.5.1"
+version = "3.5.2"
 dependencies = [
 "anyhow",
 "clap",
@@ -888,7 +888,7 @@ dependencies = [

 [[package]]
 name = "neurosploit-harness"
-version = "3.5.1"
+version = "3.5.2"
 dependencies = [
 "anyhow",
 "futures",
@@ -3,7 +3,7 @@ members = ["crates/harness", "app"]
 resolver = "2"

 [workspace.package]
-version = "3.5.1"
+version = "3.5.2"
 edition = "2021"
 license = "MIT"
 repository = "https://github.com/JoasASantos/NeuroSploit"
@@ -1,4 +1,4 @@
-//! NeuroSploit v3.5.1 — interactive harness + CLI (`run` / `whitebox` / `agents` / `models`).
+//! NeuroSploit v3.5.2 — interactive harness + CLI (`run` / `whitebox` / `agents` / `models`).

 mod repl;
 mod tui;
@@ -11,8 +11,8 @@ use std::path::{Path, PathBuf};
 #[command(
    name = "neurosploit",
    version,
-    about = "NeuroSploit v3.5.1 — multi-model autonomous pentest harness",
-    long_about = "NeuroSploit v3.5.1 — a Rust multi-model harness that drives a pool of LLMs \
+    about = "NeuroSploit v3.5.2 — multi-model autonomous pentest harness",
+    long_about = "NeuroSploit v3.5.2 — a Rust multi-model harness that drives a pool of LLMs \
 (API key or local subscription: Claude/Codex/Gemini/Grok) to autonomously test a target. \
 After recon it INTELLIGENTLY selects only the agents matching the discovered surface, runs \
 them in parallel, then validates every finding by cross-model voting before reporting.\n\n\
@@ -379,7 +379,7 @@ pub(crate) fn spawn_engagement(base: &Path, mut cfg: RunConfig, mcp: bool, mode:
    cfg.rl_path = Some(base.join("data").join("rl_state_rs.json").display().to_string());
    write_status(&workdir, "running", &format!("\"target\":{:?}", cfg.target));

-    println!("  ┌─ NeuroSploit v3.5.1  ·  by Joas A Santos & Red Team Leaders");
+    println!("  ┌─ NeuroSploit v3.5.2  ·  by Joas A Santos & Red Team Leaders");
    println!("  │  run id : {run_id}");
    println!("  │  target : {}", cfg.target);
    println!("  │  models : {}", cfg.models.join(", "));
@@ -1,4 +1,4 @@
-//! NeuroSploit v3.5.1 — interactive session (Claude-Code / Codex / Cursor-CLI style).
+//! NeuroSploit v3.5.2 — interactive session (Claude-Code / Codex / Cursor-CLI style).
 //!
 //! Launched when `neurosploit` runs with no subcommand. A persistent REPL with
 //! real line editing (arrow-key history recall, Ctrl-A/E/K, paste), model
@@ -299,7 +299,7 @@ pub async fn repl(base: &Path) -> anyhow::Result<()> {
    let backends = harness::installed_cli_backends();
    println!("\x1b[1m");
    println!("  ███╗   ██╗███████╗██╗   ██╗██████╗  ██████╗");
-    println!("  ████╗  ██║██╔════╝██║   ██║██╔══██╗██╔═══██╗   NeuroSploit v3.5.1");
+    println!("  ████╗  ██║██╔════╝██║   ██║██╔══██╗██╔═══██╗   NeuroSploit v3.5.2");
    println!("  ██╔██╗ ██║█████╗  ██║   ██║██████╔╝██║   ██║   interactive harness");
    println!("  ██║╚██╗██║██╔══╝  ██║   ██║██╔══██╗██║   ██║   by Joas A Santos");
    println!("  ██║ ╚████║███████╗╚██████╔╝██║  ██║╚██████╔╝   & Red Team Leaders");
@@ -1,4 +1,4 @@
-//! NeuroSploit v3.5.1 — TUI "Mission Control" mode.
+//! NeuroSploit v3.5.2 — TUI "Mission Control" mode.
 //!
 //! Concurrent panels that update live while the engagement runs in the
 //! background, with a composer input that stays active during execution:
@@ -1,4 +1,4 @@
-//! POMDP belief-state world model (v3.5.1).
+//! POMDP belief-state world model (v3.5.2).
 //!
 //! The target is only partially observable, so we don't track booleans — we
 //! track a **belief**: a property graph whose nodes (host / service / vuln /
@@ -1,4 +1,4 @@
-//! Verification / grounding engine (v3.5.1).
+//! Verification / grounding engine (v3.5.2).
 //!
 //! Hard rule: **no claim enters the world model without a tool receipt** — raw
 //! tool output, not the LLM's paraphrase. This is the empirical anti-hallucination
@@ -0,0 +1,186 @@
+//! Report-hygiene & exploitation-depth pass (v3.5.2).
+//!
+//! Encodes the post-engagement discipline learned from reviewing real
+//! AI-pentest output, applied deterministically after validation:
+//!  1. **Calibrate severity to PROVEN impact** — an unproven High/Critical
+//!     (hedged language, no payload, thin evidence) is capped to Medium and
+//!     re-titled "(potential)". No inflated severities.
+//!  2. **Exposed → exploited** — flag info-disclosure / exposed-service /
+//!     leaked-credential findings on a host that has no actual exploit, so the
+//!     operator knows to *use* what was exposed (or down-rate it to a lead).
+//!  3. **Consolidate hygiene** — when the same hygiene class (missing headers,
+//!     clickjacking, cookie flags, TLS, info-disclosure…) repeats across many
+//!     assets, advise merging into ONE finding with an affected-asset table,
+//!     instead of inflating the count one-per-host.
+//!
+//! All functions are pure/deterministic; only `calibrate` mutates findings
+//! (severity/title/confidence). The rest return advisory strings streamed to
+//! the operator and recorded with the run.
+use crate::types::Finding;
+
+fn host_of(endpoint: &str) -> String {
+    let s = endpoint.trim();
+    let s = s.split("://").last().unwrap_or(s);
+    let s = s.split('/').next().unwrap_or(s);
+    s.split('?').next().unwrap_or(s).to_lowercase()
+}
+
+fn sev_rank(s: &str) -> u8 {
+    match s.to_lowercase().as_str() {
+        x if x.starts_with("crit") => 4,
+        x if x.starts_with("high") => 3,
+        x if x.starts_with("med") => 2,
+        x if x.starts_with("low") => 1,
+        _ => 0,
+    }
+}
+
+fn short(s: &str) -> String {
+    s.chars().take(64).collect()
+}
+
+/// Hedging words that signal an impact was described but not demonstrated
+/// (English + Portuguese, since engagements are bilingual).
+const WEASEL: &[&str] = &[
+    "could ", "may ", "might ", "potential", "possible", "possibly", "teóric", "theoret",
+    "poderia", "possív", "potencial", "if the ", "caso o", "caso a", "would allow", "permitiria",
+];
+
+/// A finding that *exposes* something (recon/disclosure) rather than being an
+/// exploit with demonstrated impact.
+fn is_exposure(f: &Finding) -> bool {
+    let cwe = f.cwe.to_lowercase();
+    let t = f.title.to_lowercase();
+    ["200", "527", "538", "942", "497", "209", "548", "16"].iter().any(|c| cwe.contains(c))
+        || [
+            "disclosure", "exposed", "exposi", "exposure", "catalog", "catálogo", "cors",
+            "banner", "version", "versão", "header", "cabeçalho", ".git", "enumerat",
+            "fingerprint", "wsdl", "swagger", "missing security", "outdated", "eol",
+        ]
+        .iter()
+        .any(|k| t.contains(k))
+}
+
+/// Reads as unproven: hedged or thin evidence AND no concrete payload.
+fn looks_unproven(f: &Finding) -> bool {
+    let blob = format!("{} {} {}", f.title, f.impact, f.evidence).to_lowercase();
+    let hedged = WEASEL.iter().any(|w| blob.contains(w));
+    let weak_ev = f.evidence.trim().chars().count() < 40;
+    let no_payload = f.payload.trim().is_empty();
+    (hedged || weak_ev) && no_payload
+}
+
+/// Normalized hygiene class, for consolidation advice.
+fn class_of(f: &Finding) -> &'static str {
+    let t = f.title.to_lowercase();
+    if t.contains("header") || t.contains("cabeçalho") { "missing-security-headers" }
+    else if t.contains("clickjack") || t.contains("frame") { "clickjacking" }
+    else if t.contains("hsts") || t.contains("strict-transport") { "missing-hsts" }
+    else if t.contains("cookie") { "cookie-flags" }
+    else if t.contains("tls") || t.contains("ssl") { "weak-tls" }
+    else if t.contains("cors") { "cors-misconfig" }
+    else if t.contains("version") || t.contains("versão") || t.contains("banner") || t.contains("eol") || t.contains("outdated") { "version-disclosure" }
+    else { "information-disclosure" }
+}
+
+/// Cap inflated, unproven High/Critical findings to Medium. Returns advisories.
+pub fn calibrate(findings: &mut [Finding]) -> Vec<String> {
+    let mut notes = Vec::new();
+    for f in findings.iter_mut() {
+        if sev_rank(&f.severity) >= 3 && looks_unproven(f) {
+            let old = f.severity.clone();
+            f.severity = "Medium".into();
+            f.confidence = f.confidence.min(0.5);
+            let low = f.title.to_lowercase();
+            if !low.contains("potential") && !low.contains("potencial") {
+                f.title = format!("{} (potential — impact not demonstrated)", f.title);
+            }
+            notes.push(format!(
+                "severity calibrated: \"{}\" {old} → Medium (impact not demonstrated)",
+                short(&f.title)
+            ));
+        }
+    }
+    notes
+}
+
+/// "Exposed → exploited": flag exposures on hosts where no non-exposure finding
+/// of Medium or higher exists. Returns at most 8 advisories.
+pub fn depth_audit(findings: &[Finding]) -> Vec<String> {
+    let exploited: std::collections::HashSet<String> = findings
+        .iter()
+        .filter(|f| !is_exposure(f) && sev_rank(&f.severity) >= 2)
+        .map(|f| host_of(&f.endpoint))
+        .collect();
+    let mut notes = Vec::new();
+    for f in findings.iter().filter(|f| is_exposure(f)) {
+        if !exploited.contains(&host_of(&f.endpoint)) {
+            notes.push(format!(
+                "depth gap: \"{}\" exposed but not exploited — USE it (call the endpoint / decode the artifact / log in / hit the dev host) to prove impact, or down-rate to a lead",
+                short(&f.title)
+            ));
+        }
+    }
+    notes.truncate(8);
+    notes
+}
+
+/// Advise consolidating hygiene classes that repeat across multiple assets.
+pub fn hygiene_summary(findings: &[Finding]) -> Vec<String> {
+    use std::collections::{BTreeMap, BTreeSet};
+    let mut groups: BTreeMap<&'static str, BTreeSet<String>> = BTreeMap::new();
+    for f in findings.iter().filter(|f| is_exposure(f)) {
+        groups.entry(class_of(f)).or_default().insert(host_of(&f.endpoint));
+    }
+    let mut notes = Vec::new();
+    for (class, hosts) in groups {
+        if hosts.len() > 1 {
+            notes.push(format!(
+                "hygiene: '{class}' affects {} assets — consolidate into ONE finding with an affected-asset table (don't inflate the count one-per-host)",
+                hosts.len()
+            ));
+        }
+    }
+    notes
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+    fn f(title: &str, sev: &str, cwe: &str, ep: &str, ev: &str, payload: &str) -> Finding {
+        let mut x = Finding::default();
+        x.title = title.into(); x.severity = sev.into(); x.cwe = cwe.into();
+        x.endpoint = ep.into(); x.evidence = ev.into(); x.payload = payload.into();
+        x
+    }
+
+    #[test]
+    fn unproven_high_is_capped() {
+        let mut v = vec![f("Flooding DoS", "High", "CWE-770", "https://a/x", "could overload", "")];
+        let notes = calibrate(&mut v);
+        assert_eq!(v[0].severity, "Medium");
+        assert_eq!(notes.len(), 1);
+    }
+
+    #[test]
+    fn proven_high_is_kept() {
+        let mut v = vec![f("SQLi", "High", "CWE-89", "https://a/x",
+            "id=1' UNION SELECT version()-- returned 8.0.32 in the response body, proving injection", "1' OR '1'='1")];
+        calibrate(&mut v);
+        assert_eq!(v[0].severity, "High");
+    }
+
+    #[test]
+    fn exposure_without_exploit_flagged() {
+        let v = vec![f("Information Disclosure - .git exposed", "Low", "CWE-527", "https://a/.git", "leaked", "")];
+        assert_eq!(depth_audit(&v).len(), 1);
+    }
+
+    #[test]
+    fn exposure_with_exploit_on_same_host_not_flagged() {
+        let v = vec![
+            f("Information Disclosure - banner", "Low", "CWE-200", "https://a/x", "Server: IIS", ""),
+            f("SQL Injection", "High", "CWE-89", "https://a/login", "dumped users", "1'--"),
+        ];
+        assert!(depth_audit(&v).is_empty());
+    }
+}
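Review note: `is_exposure` matches CWE IDs by substring, so the entry `"16"` also matches inside `CWE-916` (a password-hashing weakness) and `"200"` matches inside `CWE-1200`. A stricter variant might compare whole numeric IDs instead. The following is a sketch of that idea, not a drop-in patch; `EXPOSURE_CWES`, `cwe_ids` and `is_exposure_cwe` are hypothetical names, and the ID list is copied from the diff above.

```rust
/// CWE IDs treated as exposure classes (same values as the list in `is_exposure`).
const EXPOSURE_CWES: &[u32] = &[200, 527, 538, 942, 497, 209, 548, 16];

/// Extract every numeric CWE ID from a free-form field such as
/// "CWE-200", "cwe-209 / CWE-548" or "200".
pub fn cwe_ids(field: &str) -> Vec<u32> {
    field
        .split(|c: char| !c.is_ascii_digit())
        .filter(|s| !s.is_empty())
        .filter_map(|s| s.parse().ok())
        .collect()
}

/// True only when a whole CWE ID is in the exposure list,
/// so "CWE-916" no longer matches the entry 16.
pub fn is_exposure_cwe(field: &str) -> bool {
    cwe_ids(field).iter().any(|id| EXPOSURE_CWES.contains(id))
}
```

The title-keyword half of `is_exposure` has a similar substring caveat (for example `"version"` inside "conversion"), which this sketch does not address.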
@@ -1,4 +1,4 @@
-//! NeuroSploit v3.5.1 harness — a robust multi-model runtime for the
+//! NeuroSploit v3.5.2 harness — a robust multi-model runtime for the
 //! markdown-driven autonomous pentest engine.
 //!
 //! The harness loads the `agents_md/` library, drives a *pool* of LLM models
@@ -11,6 +11,7 @@ pub mod attack_graph;
 pub mod belief;
 pub mod creds;
 pub mod grounding;
+pub mod hygiene;
 pub mod pomdp;
 pub mod models;
 pub mod pipeline;
@@ -69,6 +69,16 @@ const REACT_DOCTRINE: &str = "METHOD (ReAct): work in explicit Thought → Actio
 Each Action runs ONE concrete tool command (e.g. a curl request); read its real Observation before the next Thought. \
 Base every claim on an actual observed response — never assume. Stop when you've either proven an issue or exhausted reasonable checks. Be token-efficient: no filler, no repetition.\n\n";

+/// DEPTH doctrine (v3.5.2): push past detection to demonstrated impact, and
+/// chain. Distilled from reviewing real AI-pentest output that kept stopping at
+/// "exposed" instead of "exploited".
+const DEPTH_DOCTRINE: &str = "DEPTH (exploit, don't just expose):\n\
+- Exposed → exploited: any info-disclosure, exposed service/catalog/WSDL, leaked credential/token, or non-prod (dev/staging) host you find MUST be USED before you report it — call the exposed endpoint, decode the leaked artifact, log in with the leaked credential, hit the dev host. If you only observed it but never used it, report it as a LEAD (low confidence), not a confirmed finding.\n\
+- Chain across steps: reuse any session/JWT/cookie/credential you obtain in one step against every other module; if one bug yields access, pivot it into IDOR/privesc/data-exfil and report the CHAIN, not isolated parts.\n\
+- Decode & fingerprint → CVE: decode opaque tokens/paths (base64/JSON/marshal) and fingerprint the stack (server, framework, library/gem/plugin versions); map exact versions to known CVEs and attempt a safe, non-destructive PoC.\n\
+- Audit tokens: for any JWT, check alg-confusion (RS→HS), alg:none, kid/jku injection, whether the signature is actually verified, and weak/guessable HS256 secrets.\n\
+- Calibrate honestly: claim High/Critical ONLY when impact is DEMONSTRATED; unproven DoS/abuse is Low/Info or a lead, never inflated.\n\n";
+
 /// Black-box web engagement: recon → parallel exploit → N-model vote → report.
 pub async fn run(cfg: RunConfig, lib: &Library, pool: &ModelPool, tx: Sender<String>) -> RunOutput {
    pool.set_progress(tx.clone());
@@ -168,12 +178,13 @@ pub async fn run(cfg: RunConfig, lib: &Library, pool: &ModelPool, tx: Sender<Str
                let user = format!(
                    "AUTHORIZED engagement — you have explicit permission to test {target}. \
                     Do not ask for confirmation — proceed and PROVE each issue.\n\n\
-                     {directives}{react}{doctrine}{body}\n\nWhen done, reply with ONLY a JSON array of confirmed findings (may be empty []). \
+                     {directives}{react}{depth}{doctrine}{body}\n\nWhen done, reply with ONLY a JSON array of confirmed findings (may be empty []). \
                     Each item: {{id,title,severity,cwe,endpoint,payload,evidence,impact,remediation,confidence}}. \
                     `evidence` must contain the concrete proof (request/response excerpt).",
                    target = target,
                    directives = directives,
                    react = REACT_DOCTRINE,
+                    depth = DEPTH_DOCTRINE,
                    doctrine = tool_doctrine(mcp_on),
                    body = ag.user.replace("{target}", &target).replace("{recon_json}", &recon),
                );
@@ -387,11 +398,11 @@ pub async fn run_greybox(cfg: RunConfig, lib: &Library, pool: &ModelPool, tx: Se
                }
                let user = format!(
                    "AUTHORIZED greybox engagement on {target} — you also have the source review below. \
-                     Proceed and PROVE each issue against the LIVE app.\n\n{directives}{leads}{react}{doctrine}{body}\n\n\
+                     Proceed and PROVE each issue against the LIVE app.\n\n{directives}{leads}{react}{depth}{doctrine}{body}\n\n\
                     Reply ONLY a JSON array of confirmed findings (may be []): \
                     {{id,title,severity,cwe,endpoint,payload,evidence,impact,remediation,confidence}}.",
                    target = target, directives = directives, leads = leads,
-                    react = REACT_DOCTRINE, doctrine = tool_doctrine(mcp_on),
+                    react = REACT_DOCTRINE, depth = DEPTH_DOCTRINE, doctrine = tool_doctrine(mcp_on),
                    body = ag.user.replace("{target}", &target).replace("{recon_json}", &recon),
                );
                match pool.complete_routed(Task::Exploit, &ag.name, &ag.system, &user).await {
@@ -439,12 +450,12 @@ async fn chain_round(pool: &ModelPool, target: &str, recon: &str, directives: &s
    let _ = tx.send(format!("chaining {} confirmed finding(s) for deeper impact…", confirmed.len())).await;
    let recon_ctx: String = recon.chars().take(2500).collect();
    let user = format!(
-        "AUTHORIZED engagement on {target}.\n\n{directives}{react}{doctrine}{recipe_block}\
+        "AUTHORIZED engagement on {target}.\n\n{directives}{react}{depth}{doctrine}{recipe_block}\
         CONFIRMED FINDINGS TO CHAIN:\n{summary}\n\nRecon:\n{recon_ctx}\n\n\
         Chain these into deeper impact (e.g. SQLi→RCE→LPE, SSRF→cloud creds, upload→LFI→RCE) and PROVE each stage. \
         Reply ONLY a JSON array of NEW findings \
         (may be []): {{id,title,severity,cwe,endpoint,payload,evidence,impact,remediation,confidence}}.",
-        react = REACT_DOCTRINE, doctrine = tool_doctrine(pool.mcp_config.is_some()),
+        react = REACT_DOCTRINE, depth = DEPTH_DOCTRINE, doctrine = tool_doctrine(pool.mcp_config.is_some()),
    );
    match pool.complete_routed(Task::Exploit, "chain", CHAIN_SYS, &user).await {
        Ok((m, text)) => {
@@ -623,6 +634,20 @@ async fn finish(cfg: RunConfig, _lib: &Library, recon: String, transcript: Strin
        let _ = tx.send(format!("grounding gate: demoted {demoted}/{before} ungrounded claim(s) (no tool receipt)")).await;
    }

+    // --- v3.5.2 report-hygiene & exploitation-depth pass ---
+    // Calibrate inflated/unproven High/Critical to Medium, flag exposures that
+    // were never exploited ("exposed → exploited"), and advise consolidating
+    // hygiene findings duplicated across many assets.
+    for n in crate::hygiene::calibrate(&mut findings) {
+        let _ = tx.send(format!("calibrate: {n}")).await;
+    }
+    for n in crate::hygiene::depth_audit(&findings) {
+        let _ = tx.send(format!("notify: {n}")).await;
+    }
+    for n in crate::hygiene::hygiene_summary(&findings) {
+        let _ = tx.send(format!("notify: {n}")).await;
+    }
+
    // --- POMDP belief: build from grounded findings, report residual uncertainty ---
    let mut wm = crate::belief::WorldModel::new();
    wm.deterministic = whitebox;
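
Reviewer note: `crates/harness/src/hygiene.rs` is not shown in this part of the diff, so the calibration rule wired into `finish()` above can only be sketched here. The Python below is a minimal illustration, not the Rust implementation. The finding fields come from the JSON schema in the prompts (`id`, `title`, `severity`, `payload`, `evidence`, `impact`). The hedge-word list, the 40-character "thin evidence" threshold, and the choice that any single signal triggers the cap are all assumptions.

```python
import re

# Assumed hedge vocabulary; the real list in hygiene.rs may differ.
HEDGE = re.compile(r"\b(could|may|might|potential(ly)?|possibly)\b", re.IGNORECASE)
# Assumed cutoff for "thin evidence" (hypothetical value).
MIN_EVIDENCE_CHARS = 40


def calibrate(findings):
    """Cap unproven High/Critical findings to Medium, in place.

    Returns one human-readable note per changed finding.
    """
    notes = []
    for f in findings:
        if f.get("severity", "").lower() not in ("high", "critical"):
            continue
        reasons = []
        if not f.get("payload", "").strip():
            reasons.append("no payload")
        if len(f.get("evidence", "").strip()) < MIN_EVIDENCE_CHARS:
            reasons.append("thin evidence")
        if HEDGE.search(f.get("impact", "")):
            reasons.append("hedged impact")
        if not reasons:
            continue  # proven impact: leave the severity alone
        old = f["severity"]
        f["severity"] = "Medium"
        title = f.get("title", "")
        if not title.endswith("(potential)"):
            f["title"] = f"{title} (potential)"
        notes.append(f"{f.get('id', '?')}: {old} -> Medium ({', '.join(reasons)})")
    return notes
```

Each returned note corresponds to one `calibrate: …` line sent over `tx` in the hunk above. Mutating the findings in place mirrors the `&mut findings` argument there.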
@@ -1,4 +1,4 @@
-//! POMDP decision layer (v3.5.1): value-of-information planning + the
+//! POMDP decision layer (v3.5.2): value-of-information planning + the
 //! anti-hallucination gate.
 //!
 //! The choice "scan more vs exploit now" is **not** a heuristic here — it falls
@@ -97,9 +97,9 @@ pub fn html(target: &str, findings: &[Finding]) -> String {
         h4{{margin:12px 0 3px;font-size:12px;text-transform:uppercase;letter-spacing:.5px;color:#8b5cf6}}\
         .b{{color:#8b5cf6;font-weight:800}}</style></head><body>\
         <h1><span class=b>NeuroSploit</span> Penetration Test Report</h1>\
-         <div class=meta>Target: <b>{t}</b> · v3.5.1 Rust harness · multi-model validated</div>\
+         <div class=meta>Target: <b>{t}</b> · v3.5.2 Rust harness · multi-model validated</div>\
         <div>{chips}</div>{graph_block}<h2>Findings ({n})</h2>{body}\
-         <p class=meta>Authorized testing only. Findings confirmed by multi-model adversarial voting.<br>NeuroSploit v3.5.1 · by <b>Joas A Santos</b> &amp; <b>Red Team Leaders</b></p></body></html>",
+         <p class=meta>Authorized testing only. Findings confirmed by multi-model adversarial voting.<br>NeuroSploit v3.5.2 · by <b>Joas A Santos</b> &amp; <b>Red Team Leaders</b></p></body></html>",
        t = esc(target), chips = chips, n = sorted.len(), body = body, graph_block = graph_block,
    )
 }
@@ -135,7 +135,7 @@ pub fn typst_report(target: &str, findings: &[Finding], dir: &Path) -> std::io::
    let mut data = String::new();
    data.push_str(&format!(
        "#let meta = (target: {}, run_id: {}, generated: {}, model: {})\n",
-        tq(target), tq(&run_id), tq("NeuroSploit v3.5.1"), tq("multi-model")
+        tq(target), tq(&run_id), tq("NeuroSploit v3.5.2"), tq("multi-model")
    ));
    data.push_str("#let findings = (\n");
    for f in sorted_findings(findings) {
@@ -0,0 +1,183 @@
+#!/usr/bin/env python3
+"""
+NeuroSploit v3.5.2 — exploitation-depth & report-hygiene doctrine agents.
+
+Distilled from reviewing real AI-pentest output that kept stopping at
+"exposed" instead of "exploited". Emits meta-agents to agents_md/meta/ that
+push the engine past detection to demonstrated impact, chain findings, decode
+artifacts/correlate CVEs, audit tokens, and keep the report honest (dedup +
+severity calibration). Credits: Joas A Santos & Red Team Leaders.
+"""
+import os
+
+ROOT = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
+OUT = os.path.join(ROOT, "agents_md", "meta")
+
+CREDITS = "Credits: Joas A Santos and Red Team Leaders."
+
+
+def render(a):
+    L = [f"# {a['title']}\n",
+         f"> Meta-agent (v3.5.2 doctrine). {a['tagline']}\n",
+         "## User Prompt",
+         a["user"].strip(), "",
+         "## System Prompt",
+         a["system"].strip() + " " + CREDITS]
+    return "\n".join(L) + "\n"
+
+
+AGENTS = [
+ {"name": "exploit_depth_doctrine",
+  "title": "Exploitation Depth Doctrine Agent",
+  "tagline": "Turns every exposure into an exploitation attempt before it becomes a finding.",
+  "user": """
+You are reviewing the candidate findings and live transcript for **{target}**.
+
+For EACH candidate that merely *exposes* something (information disclosure,
+exposed service/catalog/WSDL, leaked credential or token, reachable dev/staging
+host, permissive CORS, open .git), drive it one step further BEFORE it is
+reported:
+
+1. **Use what was exposed.** Call the exposed endpoint, decode the leaked
+   artifact, log in with the leaked credential, hit the dev host, send the
+   cross-origin request. Capture the real request/response.
+2. **Decide honestly.** If using it proved impact → keep/raise severity with the
+   new evidence. If it could not be used → down-rate to a LEAD (low confidence),
+   never a confirmed High/Critical.
+3. **Report the gap.** List any exposure you could not yet exploit, with the
+   exact next command to try, so the next round (or the human) can finish it.
+
+Output JSON: {"escalations":[{id, action_taken, new_evidence, new_severity}],
+"leads":[{id, why_not_proven, next_command}]}.
+""",
+  "system": """
+You are a senior exploitation lead. Detection is not a finding — impact is. You
+never let an info-disclosure, exposed service, leaked secret or reachable
+non-prod host be reported as confirmed without an attempt to actually use it,
+backed by a real tool receipt. Unproven impact is a lead, not a High. Authorized
+engagement; no destructive or DoS actions.
+"""},
+
+ {"name": "finding_chainer",
+  "title": "Finding Chainer Agent",
+  "tagline": "Reuses obtained access across modules and reports the chain, not the parts.",
+  "user": """
+Given the confirmed findings and any sessions/tokens/credentials obtained during
+the engagement on **{target}**, build exploitation CHAINS:
+
+- Reuse every session/JWT/cookie/credential from one step against ALL other
+  modules and hosts in scope (a captcha/login bypass that yields a token unlocks
+  the entire authenticated surface — use it).
+- Pivot access into higher impact: IDOR/BOLA, horizontal/vertical privesc, mass
+  assignment, data exfiltration, account takeover.
+- Combine separate weaknesses (e.g. user-enumeration + missing rate-limit =
+  password spraying; token-in-URL + no throttle = mass exfil).
+
+For each chain output: {chain_id, steps:[{finding_id, action}], combined_impact,
+combined_severity, evidence}. Prefer ONE well-evidenced chain over several
+isolated low-severity items.
+""",
+  "system": """
+You are an exploit-chaining specialist. Isolated findings understate risk; the
+real story is the chain. You always try to reuse obtained access across the
+whole scope and escalate to business impact, reporting the combined chain with
+concrete evidence. Authorized engagement; no destructive or DoS actions.
+"""},
+
+ {"name": "artifact_decoder",
+  "title": "Artifact Decoder & CVE Correlator Agent",
+  "tagline": "Decodes opaque tokens/paths, fingerprints the stack, and maps versions to CVEs.",
+  "user": """
+For **{target}**, inspect every opaque or technology-revealing artifact seen in
+recon and responses:
+
+1. **Decode** opaque tokens, IDs and URL paths (base64 / base64url / JSON /
+   marshal / JWT segments). A decoded value often reveals the framework or an
+   internal file path (e.g. a Dragonfly job `[["f","...file"]]`, a signed-URL
+   structure, a serialized object).
+2. **Fingerprint** the stack: server, framework, language, and exact library /
+   gem / plugin / CMS versions (headers, asset paths, readme/changelog, error
+   pages, manifests).
+3. **Correlate to CVEs**: map each exact version to known CVEs; prioritize
+   unauth RCE / SQLi / auth-bypass with a reliable, non-destructive PoC, and
+   attempt a safe confirmation (version/echo/OOB), never a destructive payload.
+
+Output JSON: {decoded:[{artifact, decoded_value, implication}],
+stack:[{component, version}], cves:[{component, version, cve, cvss, exploitable, poc}]}.
+""",
+  "system": """
+You decode the opaque and correlate the obvious. Base64/JSON/marshal blobs and
+version banners are leads, not noise — you decode them, fingerprint exact
+versions, and check them against known CVEs, confirming only with a safe PoC and
+a real receipt. Authorized engagement; no destructive or DoS actions.
+"""},
+
+ {"name": "token_auditor",
+  "title": "Token & JWT Auditor Agent",
+  "tagline": "Attacks tokens: alg-confusion, none, kid/jku, signature checks, weak HS256 secrets.",
+  "user": """
+For any session token or JWT issued by **{target}**, run a full auth-token audit:
+
+1. **Decode** the header/payload; note alg (HS*/RS*/none), kid, jku, exp, claims.
+2. **Algorithm attacks**: try `alg:none`, RS→HS confusion (sign with the public
+   key as HMAC secret), and kid/jku injection. Confirm whether the server
+   actually verifies the signature (tamper a claim and replay).
+3. **Weak secret**: for HS256, attempt to crack the signing secret offline
+   (wordlist/rules); a static or guessable shared secret (e.g. an `x-auth-*`
+   header value) is a strong lead — if cracked, forge a token for any user.
+4. **Lifecycle**: test reuse after logout, expiry enforcement, and refresh-token
+   revocation.
+
+Output JSON: {token_type, alg, verified:true|false,
+attacks:[{name, result, evidence}], forged_token_possible:true|false}.
+""",
+  "system": """
+You are a token-security specialist. Every JWT/session token gets audited for
+algorithm confusion, none, kid/jku injection, real signature verification, weak
+HS256 secrets, and lifecycle (logout/expiry/refresh). A forged or replayable
+token is account takeover — you prove it with a real receipt. Authorized
+engagement; no destructive or DoS actions.
+"""},
+
+ {"name": "report_calibrator",
+  "title": "Report Calibrator Agent",
+  "tagline": "Deduplicates by class, calibrates severity to proven impact, demands evidence per claim.",
+  "user": """
+Before the final report for **{target}**, clean and calibrate the findings:
+
+1. **Consolidate hygiene by class.** Merge repeated hygiene findings (missing
+   security headers, clickjacking, cookie flags, weak TLS, HSTS, version/banner
+   disclosure) into ONE finding per class with an affected-asset TABLE — do not
+   inflate the count one-per-host.
+2. **Calibrate severity to PROVEN impact.** High/Critical requires demonstrated
+   impact with evidence. Unproven DoS/abuse, "could/may/potential" language, or a
+   finding with no concrete payload/PoC → cap to Low/Medium or mark
+   "(potential)". Recompute the CVSS vector to match the proven impact.
+3. **Evidence per claim.** Every finding — and every item in the "tests
+   performed" log — must carry a concrete request/response receipt; flag any
+   claim that has none, and any contradiction between the test log and the
+   findings.
+
+Output JSON: {merged:[{class, severity, assets:[...]}],
+recalibrated:[{id, old_severity, new_severity, reason}],
+unevidenced:[{id_or_test, missing}]}.
+""",
+  "system": """
+You are a meticulous report editor. You group hygiene by class with an
+asset table, calibrate every severity to demonstrated impact (no inflated
+High/Critical, no padding the count with duplicates), and require a real
+receipt behind every claim — including each line of the tests-performed log.
+Honest, deduplicated, evidence-backed reporting only.
+"""},
+]
+
+
+def main():
+    os.makedirs(OUT, exist_ok=True)
+    for a in AGENTS:
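+        # Prompts contain non-ASCII characters (e.g. "→"), so pin the encoding
+        # rather than rely on the platform default, and close the file promptly.
+        with open(os.path.join(OUT, a["name"] + ".md"), "w", encoding="utf-8") as fh:
+            fh.write(render(a))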
+    print(f"wrote {len(AGENTS)} v3.5.2 doctrine meta-agents to {OUT}")
+
+
+if __name__ == "__main__":
+    main()
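
Reviewer note: step 1 of the `token_auditor` prompt above (decode the header/payload; note alg, kid, jku, exp) can be sketched with the standard library alone. This is an illustrative helper, not part of the commit. `inspect_jwt` and its flag strings are hypothetical names, and it deliberately does not verify the signature. It only surfaces which of the prompt's follow-up attacks are worth trying.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the padding that JWTs strip."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def inspect_jwt(token: str) -> dict:
    """Return header, payload and cheap red flags WITHOUT verifying the signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("not a compact JWS: expected 3 dot-separated segments")
    header = json.loads(b64url_decode(parts[0]))
    payload = json.loads(b64url_decode(parts[1]))
    alg = str(header.get("alg", ""))
    flags = []
    if alg.lower() == "none":
        flags.append("alg=none")
    if alg.upper().startswith("HS"):
        flags.append("HMAC secret: candidate for offline cracking")
    if alg.upper().startswith("RS"):
        flags.append("RS*: test RS->HS confusion")
    for field in ("kid", "jku"):
        if field in header:
            flags.append(f"{field} present: test injection")
    if "exp" not in payload:
        flags.append("no exp claim")
    if not parts[2]:
        flags.append("empty signature")
    return {"header": header, "payload": payload, "alg": alg, "flags": flags}
```

The flags are leads in the sense used by the doctrine agents: each one names a test to run against the live target, and none is a finding until a tampered or forged token is actually accepted.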
@@ -25,7 +25,7 @@ cat <<'BANNER'

   ███╗   ██╗███████╗██╗   ██╗██████╗  ██████╗
   ████╗  ██║██╔════╝██║   ██║██╔══██╗██╔═══██╗   NeuroSploit installer
-   ██╔██╗ ██║█████╗  ██║   ██║██████╔╝██║   ██║   v3.5.1 — Rust harness
+   ██╔██╗ ██║█████╗  ██║   ██║██████╔╝██║   ██║   v3.5.2 — Rust harness
   ██║╚██╗██║██╔══╝  ██║   ██║██╔══██╗██║   ██║   by Joas A Santos
   ██║ ╚████║███████╗╚██████╔╝██║  ██║╚██████╔╝   & Red Team Leaders
   ╚═╝  ╚═══╝╚══════╝ ╚═════╝ ╚═╝  ╚═╝ ╚═════╝