mirror of
https://github.com/CyberSecurityUP/NeuroSploit.git
synced 2026-06-29 23:05:30 +02:00
v3.5.2 — Exploitation Depth & Report Hygiene
Distilled from reviewing real AI-pentest output that kept stopping at "exposed" instead of "exploited". Pure-additive, back-compatible.

Behavior (injected into black/grey/chain exploit prompts via DEPTH_DOCTRINE):
- Exposed → exploited: any info-disclosure / exposed service/WSDL / leaked credential|token / reachable dev host MUST be used before it's a finding; otherwise it's a lead, not a confirmed High/Critical.
- Chain across modules: reuse obtained session/JWT/cookie/credential and pivot to IDOR/privesc/exfil; report the chain, not isolated parts.
- Decode & fingerprint → CVE; audit tokens (alg-confusion/none/kid/JWKS, weak HS256 secret cracking, lifecycle).

Deterministic post-pass (new crates/harness/src/hygiene.rs, wired into finish()):
- calibrate severity to PROVEN impact: unproven High/Critical (hedged, no payload, thin evidence) capped to Medium and re-titled "(potential)";
- depth_audit: flag exposures on a host with no real exploit;
- hygiene_summary: advise consolidating hygiene classes repeated across assets.

Unit tests cover calibration + depth audit.

5 new doctrine meta-agents (scripts/build_methodology_v352.py → agents_md/meta/): exploit_depth_doctrine, finding_chainer, artifact_decoder, token_auditor, report_calibrator (meta 17→22, total 343→348).

Version bumped 3.5.1 → 3.5.2 across crates/app/installers/docs; RELEASE/README updated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@@ -1,4 +1,4 @@
|
||||
<h1 align="center">🧠 NeuroSploit v3.5.1</h1>
|
||||
<h1 align="center">🧠 NeuroSploit v3.5.2</h1>
|
||||
|
||||
<p align="center">
|
||||
<a href="https://github.com/JoasASantos/NeuroSploit/stargazers"><img src="https://img.shields.io/github/stars/JoasASantos/NeuroSploit?style=for-the-badge&logo=github&color=8b5cf6" alt="Stars"></a>
|
||||
@@ -8,7 +8,7 @@
|
||||
</p>
|
||||
|
||||
<p align="center">
|
||||
<img src="https://img.shields.io/badge/Version-3.5.1-blue?style=flat-square">
|
||||
<img src="https://img.shields.io/badge/Version-3.5.2-blue?style=flat-square">
|
||||
<img src="https://img.shields.io/badge/Harness-Rust%20%7C%20tokio-e6b673?style=flat-square">
|
||||
<img src="https://img.shields.io/badge/License-MIT-green?style=flat-square">
|
||||
<img src="https://img.shields.io/badge/MD%20Agents-329-red?style=flat-square">
|
||||
@@ -24,6 +24,13 @@
|
||||
>
|
||||
> 📖 **New here? Read the [full Tutorial & User Guide →](TUTORIAL.md)** — every mode, flag, config and example explained.
|
||||
|
||||
> 🆕 **New in v3.5.2 — Exploitation Depth & Report Hygiene:** a **DEPTH doctrine**
|
||||
> makes the engine *use* what it finds (exposed → exploited), **chain** findings
|
||||
> across modules, decode/fingerprint artifacts → CVEs, and **audit tokens** (JWT
|
||||
> alg-confusion / weak HS256 secrets). A deterministic post-pass **calibrates
|
||||
> severity to proven impact** and **consolidates duplicated hygiene** findings.
|
||||
> See [RELEASE.md](RELEASE.md).
|
||||
|
||||
---
|
||||
|
||||
**NeuroSploit** turns a URL, a source repository, a running app, or a host/IP into
|
||||
|
||||
+60
@@ -1,3 +1,63 @@
|
||||
# NeuroSploit v3.5.2 — Release Notes
|
||||
|
||||
**Release Date:** June 2026
|
||||
**Codename:** Exploitation Depth & Report Hygiene
|
||||
**License:** MIT
|
||||
**Credits:** Joas A Santos & Red Team Leaders
|
||||
|
||||
---
|
||||
|
||||
## TL;DR
|
||||
|
||||
v3.5.2 hard-codes the discipline that separates a great pentest from a noisy
|
||||
one — distilled from reviewing real AI-pentest output that kept stopping at
|
||||
*"exposed"* instead of *"exploited"*. The engine now pushes every exposure to
|
||||
demonstrated impact, **chains** findings, decodes/fingerprints artifacts and
|
||||
correlates CVEs, audits tokens, and keeps the final report honest (deduplicated
|
||||
and severity-calibrated).
|
||||
|
||||
## Highlights
|
||||
|
||||
- **DEPTH doctrine (exploit, don't just expose).** A new doctrine is injected
|
||||
into every exploitation prompt (black/grey/chain): any info-disclosure,
|
||||
exposed service/catalog/WSDL, leaked credential/token, or reachable dev host
|
||||
**must be USED** before it can be a finding — call it, decode it, log in, hit
|
||||
the dev host. If it was only observed, it's reported as a **lead**, not a
|
||||
confirmed High/Critical.
|
||||
- **Finding chaining.** Reuse any session/JWT/cookie/credential obtained in one
|
||||
step across all other modules; pivot access into IDOR/privesc/exfil and report
|
||||
the **chain**, not isolated parts (e.g. captcha-bypass→admin JWT→authenticated
|
||||
surface; enum + no-rate-limit→password spraying).
|
||||
- **Decode & fingerprint → CVE.** Decode opaque tokens/paths (base64/JSON/marshal)
|
||||
and pin exact library/gem/plugin/CMS versions, then correlate to known CVEs and
|
||||
attempt a safe PoC.
|
||||
- **Token auditor.** JWT alg-confusion (RS→HS), `alg:none`, kid/jku injection,
|
||||
real signature verification, **weak HS256 secret cracking**, and token
|
||||
lifecycle (logout/expiry/refresh).
|
||||
- **Report-hygiene & depth pass (deterministic, in the harness).** After
|
||||
validation the run now:
|
||||
- **calibrates severity to proven impact** — an unproven High/Critical
|
||||
(hedged language, no payload, thin evidence) is capped to Medium and
|
||||
re-titled "(potential)";
|
||||
- flags **"exposed → exploited" gaps** — exposures on a host with no actual
|
||||
exploit get an advisory to go use them;
|
||||
- advises **consolidating hygiene** classes (headers/cookies/TLS/HSTS/
|
||||
clickjacking/disclosure) repeated across many assets into ONE finding with
|
||||
an affected-asset table, instead of inflating the count one-per-host.
|
||||
- **5 new doctrine meta-agents** (`agents_md/meta/`): `exploit_depth_doctrine`,
|
||||
`finding_chainer`, `artifact_decoder`, `token_auditor`, `report_calibrator`
|
||||
(meta agents 17 → 22; total library 343 → 348).
|
||||
|
||||
## Notes
|
||||
|
||||
- Pure-additive and back-compatible: existing modes, REPL, TUI, pause/continue,
|
||||
crash-recovery and reports are unchanged. The hygiene pass only annotates and
|
||||
down-calibrates unproven severities — it never invents or drops findings.
|
||||
- New unit tests cover the calibration and depth-audit logic
|
||||
(`harness::hygiene`).
|
||||
|
||||
---
|
||||
|
||||
# NeuroSploit v3.5.1 — Release Notes
|
||||
|
||||
**Release Date:** June 2026
|
||||
|
||||
+2
-2
@@ -1,4 +1,4 @@
|
||||
# NeuroSploit — Tutorial & User Guide (v3.5.1)
|
||||
# NeuroSploit — Tutorial & User Guide (v3.5.2)
|
||||
|
||||
A complete, hands-on guide to installing, configuring and running NeuroSploit —
|
||||
the autonomous, multi-model penetration-testing harness.
|
||||
@@ -98,7 +98,7 @@ Agents **degrade gracefully**: if `rustscan` is absent they use `nmap`; if neith
|
||||
### Verify
|
||||
|
||||
```bash
|
||||
neurosploit --version # neurosploit 3.5.1
|
||||
neurosploit --version # neurosploit 3.5.2
|
||||
neurosploit agents # {"vulns":196,...,"chains":12,"total":329}
|
||||
neurosploit models # all providers & models
|
||||
```
|
||||
|
||||
@@ -0,0 +1,27 @@
|
||||
# Artifact Decoder & CVE Correlator Agent
|
||||
|
||||
> Meta-agent (v3.5.2 doctrine). Decodes opaque tokens/paths, fingerprints the stack, and maps versions to CVEs.
|
||||
|
||||
## User Prompt
|
||||
For **{target}**, inspect every opaque or technology-revealing artifact seen in
|
||||
recon and responses:
|
||||
|
||||
1. **Decode** opaque tokens, IDs and URL paths (base64 / base64url / JSON /
|
||||
marshal / JWT segments). A decoded value often reveals the framework or an
|
||||
internal file path (e.g. a Dragonfly job `[["f","...file"]]`, a signed-URL
|
||||
structure, a serialized object).
|
||||
2. **Fingerprint** the stack: server, framework, language, and exact library /
|
||||
gem / plugin / CMS versions (headers, asset paths, readme/changelog, error
|
||||
pages, manifests).
|
||||
3. **Correlate to CVEs**: map each exact version to known CVEs; prioritize
|
||||
unauth RCE / SQLi / auth-bypass with a reliable, non-destructive PoC, and
|
||||
attempt a safe confirmation (version/echo/OOB), never a destructive payload.
|
||||
|
||||
Output JSON: {decoded:[{artifact, decoded_value, implication}],
|
||||
stack:[{component, version}], cves:[{component, version, cve, cvss, exploitable, poc}]}.
|
||||
|
||||
## System Prompt
|
||||
You decode the opaque and correlate the obvious. Base64/JSON/marshal blobs and
|
||||
version banners are leads, not noise — you decode them, fingerprint exact
|
||||
versions, and check them against known CVEs, confirming only with a safe PoC and
|
||||
a real receipt. Authorized engagement; no destructive or DoS actions. Credits: Joas A Santos and Red Team Leaders.
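
The decode step in the user prompt above can be sketched in a few lines of std-only Rust. This is a minimal illustration, not the project's decoder: `b64_decode` and `decode_artifact` are hypothetical names, and the sketch assumes the artifact is plain base64 or base64url with optional padding.

```rust
/// Decode a base64 or base64url string (padding optional) into raw bytes.
/// Returns `None` if a character falls outside both alphabets.
/// Hypothetical helper, written for illustration only.
fn b64_decode(input: &str) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    let mut acc: u32 = 0; // bit accumulator
    let mut bits: u8 = 0; // number of valid bits currently held in `acc`
    for c in input.trim_end_matches('=').bytes() {
        let v = match c {
            b'A'..=b'Z' => c - b'A',
            b'a'..=b'z' => c - b'a' + 26,
            b'0'..=b'9' => c - b'0' + 52,
            b'+' | b'-' => 62, // standard and URL-safe alphabets
            b'/' | b'_' => 63,
            _ => return None,
        };
        acc = (acc << 6) | u32::from(v);
        bits += 6;
        if bits >= 8 {
            bits -= 8;
            out.push((acc >> bits) as u8);
            acc &= (1 << bits) - 1; // keep only the leftover low bits
        }
    }
    Some(out)
}

/// Decode an artifact and return it as text when the bytes are valid UTF-8.
fn decode_artifact(artifact: &str) -> Option<String> {
    String::from_utf8(b64_decode(artifact)?).ok()
}
```

For example, `decode_artifact("eyJhbGciOiJIUzI1NiJ9")` yields the JSON text `{"alg":"HS256"}`, which is the kind of decoded value the prompt asks the agent to record with its implication.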
|
||||
@@ -0,0 +1,30 @@
|
||||
# Exploitation Depth Doctrine Agent
|
||||
|
||||
> Meta-agent (v3.5.2 doctrine). Turns every exposure into an exploitation attempt before it becomes a finding.
|
||||
|
||||
## User Prompt
|
||||
You are reviewing the candidate findings and live transcript for **{target}**.
|
||||
|
||||
For EACH candidate that merely *exposes* something (information disclosure,
|
||||
exposed service/catalog/WSDL, leaked credential or token, reachable dev/staging
|
||||
host, permissive CORS, open .git), drive it one step further BEFORE it is
|
||||
reported:
|
||||
|
||||
1. **Use what was exposed.** Call the exposed endpoint, decode the leaked
|
||||
artifact, log in with the leaked credential, hit the dev host, send the
|
||||
cross-origin request. Capture the real request/response.
|
||||
2. **Decide honestly.** If using it proved impact → keep/raise severity with the
|
||||
new evidence. If it could not be used → down-rate to a LEAD (low confidence),
|
||||
never a confirmed High/Critical.
|
||||
3. **Report the gap.** List any exposure you could not yet exploit, with the
|
||||
exact next command to try, so the next round (or the human) can finish it.
|
||||
|
||||
Output JSON: {"escalations":[{id, action_taken, new_evidence, new_severity}],
|
||||
"leads":[{id, why_not_proven, next_command}]}.
|
||||
|
||||
## System Prompt
|
||||
You are a senior exploitation lead. Detection is not a finding — impact is. You
|
||||
never let an info-disclosure, exposed service, leaked secret or reachable
|
||||
non-prod host be reported as confirmed without an attempt to actually use it,
|
||||
backed by a real tool receipt. Unproven impact is a lead, not a High. Authorized
|
||||
engagement; no destructive or DoS actions. Credits: Joas A Santos and Red Team Leaders.
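
The "decide honestly" step above maps naturally onto a small decision function. The sketch below is an assumption-laden illustration: `Attempt`, `Verdict` and `decide` are hypothetical names, and the rule that an empty receipt counts as "not proven" is an interpretation of the doctrine rather than code from the harness.

```rust
/// Outcome of trying to use an exposed artifact (hypothetical type).
#[derive(Debug, PartialEq)]
enum Attempt {
    /// The exposure was used; `receipt` is the captured request/response.
    Proved { receipt: String },
    /// The exposure could not be used yet.
    NotProven { why: String, next_command: String },
}

/// Where the candidate ends up in the agent's output (hypothetical type).
#[derive(Debug, PartialEq)]
enum Verdict {
    Escalation { id: String, new_evidence: String },
    Lead { id: String, why_not_proven: String, next_command: String },
}

/// Route a candidate to `escalations` or `leads`.
fn decide(id: &str, attempt: Attempt) -> Verdict {
    match attempt {
        // Only a non-empty receipt counts as proof of impact.
        Attempt::Proved { receipt } if !receipt.trim().is_empty() => Verdict::Escalation {
            id: id.to_string(),
            new_evidence: receipt,
        },
        // "Proved" without a receipt is treated as unproven (assumption).
        Attempt::Proved { .. } => Verdict::Lead {
            id: id.to_string(),
            why_not_proven: "no receipt captured".to_string(),
            next_command: String::new(),
        },
        Attempt::NotProven { why, next_command } => Verdict::Lead {
            id: id.to_string(),
            why_not_proven: why,
            next_command,
        },
    }
}
```

The design point is that the only path to an escalation goes through captured evidence; everything else lands in `leads` with the next command to try.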
|
||||
@@ -0,0 +1,25 @@
|
||||
# Finding Chainer Agent
|
||||
|
||||
> Meta-agent (v3.5.2 doctrine). Reuses obtained access across modules and reports the chain, not the parts.
|
||||
|
||||
## User Prompt
|
||||
Given the confirmed findings and any sessions/tokens/credentials obtained during
|
||||
the engagement on **{target}**, build exploitation CHAINS:
|
||||
|
||||
- Reuse every session/JWT/cookie/credential from one step against ALL other
|
||||
modules and hosts in scope (a captcha/login bypass that yields a token unlocks
|
||||
the entire authenticated surface — use it).
|
||||
- Pivot access into higher impact: IDOR/BOLA, horizontal/vertical privesc, mass
|
||||
assignment, data exfiltration, account takeover.
|
||||
- Combine separate weaknesses (e.g. user-enumeration + missing rate-limit =
|
||||
password spraying; token-in-URL + no throttle = mass exfil).
|
||||
|
||||
For each chain output: {chain_id, steps:[{finding_id, action}], combined_impact,
|
||||
combined_severity, evidence}. Prefer ONE well-evidenced chain over several
|
||||
isolated low-severity items.
|
||||
|
||||
## System Prompt
|
||||
You are an exploit-chaining specialist. Isolated findings understate risk; the
|
||||
real story is the chain. You always try to reuse obtained access across the
|
||||
whole scope and escalate to business impact, reporting the combined chain with
|
||||
concrete evidence. Authorized engagement; no destructive or DoS actions. Credits: Joas A Santos and Red Team Leaders.
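
A chain record like the one requested above could be modeled as follows. This is a sketch under stated assumptions: the type names are hypothetical, and treating the highest step severity as a lower bound for `combined_severity` is an assumption for illustration; the prompt leaves the actual combined rating to the agent's evidence.

```rust
/// Ordered severity scale; variant order defines the comparison.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Severity { Info, Low, Medium, High, Critical }

/// One step of a chain, mirroring `{finding_id, action}` plus its own severity.
struct Step {
    finding_id: &'static str,
    action: &'static str,
    severity: Severity,
}

struct Chain {
    chain_id: String,
    steps: Vec<Step>,
}

impl Chain {
    /// Assumed lower bound for the chain's severity: its most severe step.
    /// Returns `None` for an empty chain.
    fn severity_floor(&self) -> Option<Severity> {
        self.steps.iter().map(|s| s.severity).max()
    }

    /// Render the chain as a single readable line for the report.
    fn render(&self) -> String {
        let path = self
            .steps
            .iter()
            .map(|s| format!("{} ({})", s.finding_id, s.action))
            .collect::<Vec<_>>()
            .join(" -> ");
        format!("{}: {}", self.chain_id, path)
    }
}
```

Keeping the steps ordered makes the report read as one story (enumerate, then spray) instead of two unrelated low-severity items.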
|
||||
@@ -0,0 +1,30 @@
|
||||
# Report Calibrator Agent
|
||||
|
||||
> Meta-agent (v3.5.2 doctrine). Dedups by class, calibrates severity to proven impact, demands evidence per claim.
|
||||
|
||||
## User Prompt
|
||||
Before the final report for **{target}**, clean and calibrate the findings:
|
||||
|
||||
1. **Consolidate hygiene by class.** Merge repeated hygiene findings (missing
|
||||
security headers, clickjacking, cookie flags, weak TLS, HSTS, version/banner
|
||||
disclosure) into ONE finding per class with an affected-asset TABLE — do not
|
||||
inflate the count one-per-host.
|
||||
2. **Calibrate severity to PROVEN impact.** High/Critical requires demonstrated
|
||||
impact with evidence. Unproven DoS/abuse, "could/may/potential" language, or a
|
||||
finding with no concrete payload/PoC → cap to Low/Medium or mark
|
||||
"(potential)". Recompute the CVSS vector to match the proven impact.
|
||||
3. **Evidence per claim.** Every finding — and every item in the "tests
|
||||
performed" log — must carry a concrete request/response receipt; flag any
|
||||
claim that has none, and any contradiction between the test log and the
|
||||
findings.
|
||||
|
||||
Output JSON: {merged:[{class, severity, assets:[...]}],
|
||||
recalibrated:[{id, old_severity, new_severity, reason}],
|
||||
unevidenced:[{id_or_test, missing}]}.
|
||||
|
||||
## System Prompt
|
||||
You are a meticulous report editor. You group hygiene by class with an
|
||||
asset table, calibrate every severity to demonstrated impact (no inflated
|
||||
High/Critical, no padding the count with duplicates), and require a real
|
||||
receipt behind every claim — including each line of the tests-performed log.
|
||||
Honest, deduplicated, evidence-backed reporting only. Credits: Joas A Santos and Red Team Leaders.
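
The consolidation step above (one finding per hygiene class, with an affected-asset table) can be sketched as a grouping pass followed by a table renderer. The function names and the Markdown column layout are assumptions for illustration; the harness itself only emits consolidation advisories.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Group `(class, asset)` pairs so each class maps to its sorted,
/// de-duplicated assets. Assets are lowercased so `A.example` and
/// `a.example` count once.
fn merge_by_class(items: &[(&str, &str)]) -> BTreeMap<String, BTreeSet<String>> {
    let mut merged: BTreeMap<String, BTreeSet<String>> = BTreeMap::new();
    for (class, asset) in items {
        merged
            .entry(class.to_string())
            .or_default()
            .insert(asset.to_lowercase());
    }
    merged
}

/// Render one Markdown table row per class with its affected assets.
fn render_table(merged: &BTreeMap<String, BTreeSet<String>>) -> String {
    let mut out = String::from("| Class | Assets | Count |\n|---|---|---|\n");
    for (class, assets) in merged {
        let list = assets.iter().cloned().collect::<Vec<_>>().join(", ");
        out.push_str(&format!("| {} | {} | {} |\n", class, list, assets.len()));
    }
    out
}
```

Using `BTreeMap` and `BTreeSet` keeps the output order stable between runs, which matters when reports are diffed.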
|
||||
@@ -0,0 +1,26 @@
|
||||
# Token & JWT Auditor Agent
|
||||
|
||||
> Meta-agent (v3.5.2 doctrine). Attacks tokens: alg-confusion, none, kid/jku, signature checks, weak HS256 secrets.
|
||||
|
||||
## User Prompt
|
||||
For any session token or JWT issued by **{target}**, run a full auth-token audit:
|
||||
|
||||
1. **Decode** the header/payload; note alg (HS*/RS*/none), kid, jku, exp, claims.
|
||||
2. **Algorithm attacks**: try `alg:none`, RS→HS confusion (sign with the public
|
||||
key as HMAC secret), and kid/jku injection. Confirm whether the server
|
||||
actually verifies the signature (tamper a claim and replay).
|
||||
3. **Weak secret**: for HS256, attempt to crack the signing secret offline
|
||||
(wordlist/rules); a static or guessable shared secret (e.g. an `x-auth-*`
|
||||
header value) is a strong lead — if cracked, forge a token for any user.
|
||||
4. **Lifecycle**: test reuse after logout, expiry enforcement, and refresh-token
|
||||
revocation.
|
||||
|
||||
Output JSON: {token_type, alg, verified:true|false,
|
||||
attacks:[{name, result, evidence}], forged_token_possible:true|false}.
|
||||
|
||||
## System Prompt
|
||||
You are a token-security specialist. Every JWT/session token gets audited for
|
||||
algorithm confusion, none, kid/jku injection, real signature verification, weak
|
||||
HS256 secrets, and lifecycle (logout/expiry/refresh). A forged or replayable
|
||||
token is account takeover — you prove it with a real receipt. Authorized
|
||||
engagement; no destructive or DoS actions. Credits: Joas A Santos and Red Team Leaders.
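
Two of the probes above, the `alg:none` variant and the "tamper a claim and replay" check, are plain string operations on the compact JWT form. The sketch below only builds the candidate tokens; sending them and judging the response is left out. Function names are hypothetical, and the caller is assumed to supply the replacement payload already base64url-encoded.

```rust
/// The header `{"alg":"none"}` as unpadded base64url.
const ALG_NONE_HEADER: &str = "eyJhbGciOiJub25lIn0";

/// Split a compact JWT into (header, payload, signature) segments.
/// The signature may be empty; header and payload may not.
fn split_jwt(token: &str) -> Option<(&str, &str, &str)> {
    let mut parts = token.split('.');
    let (h, p, s) = (parts.next()?, parts.next()?, parts.next()?);
    if parts.next().is_some() || h.is_empty() || p.is_empty() {
        return None;
    }
    Some((h, p, s))
}

/// Swap the payload but keep the original signature. A server that accepts
/// this token is not verifying the signature.
fn tampered_payload(token: &str, new_payload_b64: &str) -> Option<String> {
    let (h, _, s) = split_jwt(token)?;
    Some(format!("{}.{}.{}", h, new_payload_b64, s))
}

/// Build the `alg:none` probe: unsigned header, original payload, empty signature.
fn alg_none_variant(token: &str) -> Option<String> {
    let (_, p, _) = split_jwt(token)?;
    Some(format!("{}.{}.", ALG_NONE_HEADER, p))
}
```

Either candidate only becomes a finding once the target accepts it and the accepted request is captured as evidence.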
|
||||
+1
-1
@@ -11,7 +11,7 @@ function Ok ($m) { Write-Host " + $m" -ForegroundColor Green }
|
||||
function Warn($m){ Write-Host " ! $m" -ForegroundColor Yellow }
|
||||
|
||||
Write-Host ""
|
||||
Write-Host " NeuroSploit installer (Windows) — v3.5.1" -ForegroundColor Cyan
|
||||
Write-Host " NeuroSploit installer (Windows) — v3.5.2" -ForegroundColor Cyan
|
||||
$arch = $env:PROCESSOR_ARCHITECTURE
|
||||
Say "Platform: Windows / $arch"
|
||||
|
||||
|
||||
Generated
+2
-2
@@ -871,7 +871,7 @@ dependencies = [
|
||||
|
||||
[[package]]
|
||||
name = "neurosploit"
|
||||
version = "3.5.1"
|
||||
version = "3.5.2"
|
||||
dependencies = [
|
||||
"anyhow",
|
||||
"clap",
|
||||
@@ -888,7 +888,7 @@ dependencies = [
|
||||
|
||||
[[package]]
|
||||
name = "neurosploit-harness"
|
||||
version = "3.5.1"
|
||||
version = "3.5.2"
|
||||
dependencies = [
|
||||
"anyhow",
|
||||
"futures",
|
||||
|
||||
@@ -3,7 +3,7 @@ members = ["crates/harness", "app"]
|
||||
resolver = "2"
|
||||
|
||||
[workspace.package]
|
||||
version = "3.5.1"
|
||||
version = "3.5.2"
|
||||
edition = "2021"
|
||||
license = "MIT"
|
||||
repository = "https://github.com/JoasASantos/NeuroSploit"
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
//! NeuroSploit v3.5.1 — interactive harness + CLI (`run` / `whitebox` / `agents` / `models`).
|
||||
//! NeuroSploit v3.5.2 — interactive harness + CLI (`run` / `whitebox` / `agents` / `models`).
|
||||
|
||||
mod repl;
|
||||
mod tui;
|
||||
@@ -11,8 +11,8 @@ use std::path::{Path, PathBuf};
|
||||
#[command(
|
||||
name = "neurosploit",
|
||||
version,
|
||||
about = "NeuroSploit v3.5.1 — multi-model autonomous pentest harness",
|
||||
long_about = "NeuroSploit v3.5.1 — a Rust multi-model harness that drives a pool of LLMs \
|
||||
about = "NeuroSploit v3.5.2 — multi-model autonomous pentest harness",
|
||||
long_about = "NeuroSploit v3.5.2 — a Rust multi-model harness that drives a pool of LLMs \
|
||||
(API key or local subscription: Claude/Codex/Gemini/Grok) to autonomously test a target. \
|
||||
After recon it INTELLIGENTLY selects only the agents matching the discovered surface, runs \
|
||||
them in parallel, then validates every finding by cross-model voting before reporting.\n\n\
|
||||
@@ -379,7 +379,7 @@ pub(crate) fn spawn_engagement(base: &Path, mut cfg: RunConfig, mcp: bool, mode:
|
||||
cfg.rl_path = Some(base.join("data").join("rl_state_rs.json").display().to_string());
|
||||
write_status(&workdir, "running", &format!("\"target\":{:?}", cfg.target));
|
||||
|
||||
println!(" ┌─ NeuroSploit v3.5.1 · by Joas A Santos & Red Team Leaders");
|
||||
println!(" ┌─ NeuroSploit v3.5.2 · by Joas A Santos & Red Team Leaders");
|
||||
println!(" │ run id : {run_id}");
|
||||
println!(" │ target : {}", cfg.target);
|
||||
println!(" │ models : {}", cfg.models.join(", "));
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
//! NeuroSploit v3.5.1 — interactive session (Claude-Code / Codex / Cursor-CLI style).
|
||||
//! NeuroSploit v3.5.2 — interactive session (Claude-Code / Codex / Cursor-CLI style).
|
||||
//!
|
||||
//! Launched when `neurosploit` runs with no subcommand. A persistent REPL with
|
||||
//! real line editing (arrow-key history recall, Ctrl-A/E/K, paste), model
|
||||
@@ -299,7 +299,7 @@ pub async fn repl(base: &Path) -> anyhow::Result<()> {
|
||||
let backends = harness::installed_cli_backends();
|
||||
println!("\x1b[1m");
|
||||
println!(" ███╗ ██╗███████╗██╗ ██╗██████╗ ██████╗");
|
||||
println!(" ████╗ ██║██╔════╝██║ ██║██╔══██╗██╔═══██╗ NeuroSploit v3.5.1");
|
||||
println!(" ████╗ ██║██╔════╝██║ ██║██╔══██╗██╔═══██╗ NeuroSploit v3.5.2");
|
||||
println!(" ██╔██╗ ██║█████╗ ██║ ██║██████╔╝██║ ██║ interactive harness");
|
||||
println!(" ██║╚██╗██║██╔══╝ ██║ ██║██╔══██╗██║ ██║ by Joas A Santos");
|
||||
println!(" ██║ ╚████║███████╗╚██████╔╝██║ ██║╚██████╔╝ & Red Team Leaders");
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
//! NeuroSploit v3.5.1 — TUI "Mission Control" mode.
|
||||
//! NeuroSploit v3.5.2 — TUI "Mission Control" mode.
|
||||
//!
|
||||
//! Concurrent panels that update live while the engagement runs in the
|
||||
//! background, with a composer input that stays active during execution:
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
//! POMDP belief-state world model (v3.5.1).
|
||||
//! POMDP belief-state world model (v3.5.2).
|
||||
//!
|
||||
//! The target is only partially observable, so we don't track booleans — we
|
||||
//! track a **belief**: a property graph whose nodes (host / service / vuln /
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
//! Verification / grounding engine (v3.5.1).
|
||||
//! Verification / grounding engine (v3.5.2).
|
||||
//!
|
||||
//! Hard rule: **no claim enters the world model without a tool receipt** — raw
|
||||
//! tool output, not the LLM's paraphrase. This is the empirical anti-hallucination
|
||||
|
||||
@@ -0,0 +1,186 @@
|
||||
//! Report-hygiene & exploitation-depth pass (v3.5.2).
//!
//! Encodes the post-engagement discipline learned from reviewing real
//! AI-pentest output, applied deterministically after validation:
//!   1. **Calibrate severity to PROVEN impact** — an unproven High/Critical
//!      (hedged language, no payload, thin evidence) is capped to Medium and
//!      re-titled "(potential)". No inflated severities.
//!   2. **Exposed → exploited** — flag info-disclosure / exposed-service /
//!      leaked-credential findings on a host that has no actual exploit, so the
//!      operator knows to *use* what was exposed (or down-rate it to a lead).
//!   3. **Consolidate hygiene** — when the same hygiene class (missing headers,
//!      clickjacking, cookie flags, TLS, info-disclosure…) repeats across many
//!      assets, advise merging into ONE finding with an affected-asset table,
//!      instead of inflating the count one-per-host.
//!
//! All functions are pure/deterministic; only `calibrate` mutates findings
//! (severity/title/confidence). The rest return advisory strings streamed to
//! the operator and recorded with the run.
use crate::types::Finding;

fn host_of(endpoint: &str) -> String {
    let s = endpoint.trim();
    let s = s.split("://").last().unwrap_or(s);
    let s = s.split('/').next().unwrap_or(s);
    s.split('?').next().unwrap_or(s).to_lowercase()
}

fn sev_rank(s: &str) -> u8 {
    match s.to_lowercase().as_str() {
        x if x.starts_with("crit") => 4,
        x if x.starts_with("high") => 3,
        x if x.starts_with("med") => 2,
        x if x.starts_with("low") => 1,
        _ => 0,
    }
}

fn short(s: &str) -> String {
    s.chars().take(64).collect()
}
|
||||
|
||||
/// Hedging words that signal an impact was described but not demonstrated
/// (English + Portuguese, since engagements are bilingual).
const WEASEL: &[&str] = &[
    "could ", "may ", "might ", "potential", "possible", "possibly", "teóric", "theoret",
    "poderia", "possív", "potencial", "if the ", "caso o", "caso a", "would allow", "permitiria",
];

/// A finding that *exposes* something (recon/disclosure) rather than being an
/// exploit with demonstrated impact.
fn is_exposure(f: &Finding) -> bool {
    let cwe = f.cwe.to_lowercase();
    let t = f.title.to_lowercase();
    ["200", "527", "538", "942", "497", "209", "548", "16"].iter().any(|c| cwe.contains(c))
        || [
            "disclosure", "exposed", "exposi", "exposure", "catalog", "catálogo", "cors",
            "banner", "version", "versão", "header", "cabeçalho", ".git", "enumerat",
            "fingerprint", "wsdl", "swagger", "missing security", "outdated", "eol",
        ]
        .iter()
        .any(|k| t.contains(k))
}

/// Reads as unproven: hedged or thin evidence AND no concrete payload.
fn looks_unproven(f: &Finding) -> bool {
    let blob = format!("{} {} {}", f.title, f.impact, f.evidence).to_lowercase();
    let hedged = WEASEL.iter().any(|w| blob.contains(w));
    let weak_ev = f.evidence.trim().chars().count() < 40;
    let no_payload = f.payload.trim().is_empty();
    (hedged || weak_ev) && no_payload
}

/// Normalized hygiene class, for consolidation advice.
fn class_of(f: &Finding) -> &'static str {
    let t = f.title.to_lowercase();
    if t.contains("header") || t.contains("cabeçalho") { "missing-security-headers" }
    else if t.contains("clickjack") || t.contains("frame") { "clickjacking" }
    else if t.contains("hsts") || t.contains("strict-transport") { "missing-hsts" }
    else if t.contains("cookie") { "cookie-flags" }
    else if t.contains("tls") || t.contains("ssl") { "weak-tls" }
    else if t.contains("cors") { "cors-misconfig" }
    else if t.contains("version") || t.contains("versão") || t.contains("banner") || t.contains("eol") || t.contains("outdated") { "version-disclosure" }
    else { "information-disclosure" }
}
|
||||
|
||||
/// Cap inflated, unproven High/Critical findings to Medium. Returns advisories.
pub fn calibrate(findings: &mut [Finding]) -> Vec<String> {
    let mut notes = Vec::new();
    for f in findings.iter_mut() {
        if sev_rank(&f.severity) >= 3 && looks_unproven(f) {
            let old = f.severity.clone();
            f.severity = "Medium".into();
            f.confidence = f.confidence.min(0.5);
            let low = f.title.to_lowercase();
            if !low.contains("potential") && !low.contains("potencial") {
                f.title = format!("{} (potential — impact not demonstrated)", f.title);
            }
            notes.push(format!(
                "severity calibrated: \"{}\" {old} → Medium (impact not demonstrated)",
                short(&f.title)
            ));
        }
    }
    notes
}

/// "Exposed → exploited": exposures on a host with no real exploit get flagged.
pub fn depth_audit(findings: &[Finding]) -> Vec<String> {
    let exploited: std::collections::HashSet<String> = findings
        .iter()
        .filter(|f| !is_exposure(f) && sev_rank(&f.severity) >= 2)
        .map(|f| host_of(&f.endpoint))
        .collect();
    let mut notes = Vec::new();
    for f in findings.iter().filter(|f| is_exposure(f)) {
        if !exploited.contains(&host_of(&f.endpoint)) {
            notes.push(format!(
                "depth gap: \"{}\" exposed but not exploited — USE it (call the endpoint / decode the artifact / log in / hit the dev host) to prove impact, or down-rate to a lead",
                short(&f.title)
            ));
        }
    }
    notes.truncate(8);
    notes
}

/// Advise consolidating hygiene classes that repeat across multiple assets.
pub fn hygiene_summary(findings: &[Finding]) -> Vec<String> {
    use std::collections::{BTreeMap, BTreeSet};
    let mut groups: BTreeMap<&'static str, BTreeSet<String>> = BTreeMap::new();
    for f in findings.iter().filter(|f| is_exposure(f)) {
        groups.entry(class_of(f)).or_default().insert(host_of(&f.endpoint));
    }
    let mut notes = Vec::new();
    for (class, hosts) in groups {
        if hosts.len() > 1 {
            notes.push(format!(
                "hygiene: '{class}' affects {} assets — consolidate into ONE finding with an affected-asset table (don't inflate the count one-per-host)",
                hosts.len()
            ));
        }
    }
    notes
}
|
||||
|
||||
#[cfg(test)]
mod tests {
    use super::*;
    fn f(title: &str, sev: &str, cwe: &str, ep: &str, ev: &str, payload: &str) -> Finding {
        let mut x = Finding::default();
        x.title = title.into(); x.severity = sev.into(); x.cwe = cwe.into();
        x.endpoint = ep.into(); x.evidence = ev.into(); x.payload = payload.into();
        x
    }

    #[test]
    fn unproven_high_is_capped() {
        let mut v = vec![f("Flooding DoS", "High", "CWE-770", "https://a/x", "could overload", "")];
        let notes = calibrate(&mut v);
        assert_eq!(v[0].severity, "Medium");
        assert_eq!(notes.len(), 1);
    }

    #[test]
    fn proven_high_is_kept() {
        let mut v = vec![f("SQLi", "High", "CWE-89", "https://a/x",
            "id=1' UNION SELECT version()-- returned 8.0.32 in the response body, proving injection", "1' OR '1'='1")];
        calibrate(&mut v);
        assert_eq!(v[0].severity, "High");
    }

    #[test]
    fn exposure_without_exploit_flagged() {
        let v = vec![f("Information Disclosure - .git exposed", "Low", "CWE-527", "https://a/.git", "leaked", "")];
        assert_eq!(depth_audit(&v).len(), 1);
    }

    #[test]
    fn exposure_with_exploit_on_same_host_not_flagged() {
        let v = vec![
            f("Information Disclosure - banner", "Low", "CWE-200", "https://a/x", "Server: IIS", ""),
            f("SQL Injection", "High", "CWE-89", "https://a/login", "dumped users", "1'--"),
        ];
        assert!(depth_audit(&v).is_empty());
    }
}
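
One detail of `is_exposure` above is worth noting: it matches CWE numbers by substring, so a value such as `CWE-116` also contains `16` and would be classified as an exposure. A stricter variant could parse the numeric IDs and compare them exactly. The sketch below is one possible tightening, not part of the commit; `cwe_ids` and `is_exposure_cwe` are hypothetical names, and the ID list is copied from the function above.

```rust
/// CWE IDs that `is_exposure` treats as exposure-type weaknesses.
const EXPOSURE_CWES: &[u32] = &[200, 527, 538, 942, 497, 209, 548, 16];

/// Extract every numeric ID from strings such as "CWE-200" or "cwe-16, CWE-89".
fn cwe_ids(cwe: &str) -> Vec<u32> {
    cwe.split(|c: char| !c.is_ascii_digit())
        .filter_map(|tok| tok.parse().ok()) // empty tokens fail to parse and are dropped
        .collect()
}

/// True only when one of the parsed IDs is exactly an exposure CWE.
fn is_exposure_cwe(cwe: &str) -> bool {
    cwe_ids(cwe).iter().any(|id| EXPOSURE_CWES.contains(id))
}
```

With exact matching, `CWE-16` still counts as an exposure while `CWE-116` does not. The title keywords would need similar care, since substrings such as `eol` or `frame` also occur inside unrelated words.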
|
||||
@@ -1,4 +1,4 @@
|
||||
//! NeuroSploit v3.5.1 harness — a robust multi-model runtime for the
|
||||
//! NeuroSploit v3.5.2 harness — a robust multi-model runtime for the
|
||||
//! markdown-driven autonomous pentest engine.
|
||||
//!
|
||||
//! The harness loads the `agents_md/` library, drives a *pool* of LLM models
|
||||
@@ -11,6 +11,7 @@ pub mod attack_graph;
|
||||
pub mod belief;
|
||||
pub mod creds;
|
||||
pub mod grounding;
|
||||
pub mod hygiene;
|
||||
pub mod pomdp;
|
||||
pub mod models;
|
||||
pub mod pipeline;
|
||||
|
||||
@@ -69,6 +69,16 @@ const REACT_DOCTRINE: &str = "METHOD (ReAct): work in explicit Thought → Actio
|
||||
Each Action runs ONE concrete tool command (e.g. a curl request); read its real Observation before the next Thought. \
|
||||
Base every claim on an actual observed response — never assume. Stop when you've either proven an issue or exhausted reasonable checks. Be token-efficient: no filler, no repetition.\n\n";
|
||||
|
||||
/// DEPTH doctrine (v3.5.2): push past detection to demonstrated impact, and
|
||||
/// chain. Distilled from reviewing real AI-pentest output that kept stopping at
|
||||
/// "exposed" instead of "exploited".
|
||||
const DEPTH_DOCTRINE: &str = "DEPTH (exploit, don't just expose):\n\
|
||||
- Exposed → exploited: any info-disclosure, exposed service/catalog/WSDL, leaked credential/token, or non-prod (dev/staging) host you find MUST be USED before you report it — call the exposed endpoint, decode the leaked artifact, log in with the leaked credential, hit the dev host. If you only observed it but never used it, report it as a LEAD (low confidence), not a confirmed finding.\n\
|
||||
- Chain across steps: reuse any session/JWT/cookie/credential you obtain in one step against every other module; if one bug yields access, pivot it into IDOR/privesc/data-exfil and report the CHAIN, not isolated parts.\n\
|
||||
- Decode & fingerprint → CVE: decode opaque tokens/paths (base64/JSON/marshal) and fingerprint the stack (server, framework, library/gem/plugin versions); map exact versions to known CVEs and attempt a safe, non-destructive PoC.\n\
|
||||
- Audit tokens: for any JWT, check alg-confusion (RS→HS), alg:none, kid/jku injection, whether the signature is actually verified, and weak/guessable HS256 secrets.\n\
|
||||
- Calibrate honestly: claim High/Critical ONLY when impact is DEMONSTRATED; unproven DoS/abuse is Low/Info or a lead, never inflated.\n\n";
|
||||
|
||||
/// Black-box web engagement: recon → parallel exploit → N-model vote → report.
|
||||
pub async fn run(cfg: RunConfig, lib: &Library, pool: &ModelPool, tx: Sender<String>) -> RunOutput {
|
||||
pool.set_progress(tx.clone());
|
||||
@@ -168,12 +178,13 @@ pub async fn run(cfg: RunConfig, lib: &Library, pool: &ModelPool, tx: Sender<Str
         let user = format!(
             "AUTHORIZED engagement — you have explicit permission to test {target}. \
              Do not ask for confirmation — proceed and PROVE each issue.\n\n\
-             {directives}{react}{doctrine}{body}\n\nWhen done, reply with ONLY a JSON array of confirmed findings (may be empty []). \
+             {directives}{react}{depth}{doctrine}{body}\n\nWhen done, reply with ONLY a JSON array of confirmed findings (may be empty []). \
              Each item: {{id,title,severity,cwe,endpoint,payload,evidence,impact,remediation,confidence}}. \
              `evidence` must contain the concrete proof (request/response excerpt).",
             target = target,
             directives = directives,
             react = REACT_DOCTRINE,
+            depth = DEPTH_DOCTRINE,
             doctrine = tool_doctrine(mcp_on),
             body = ag.user.replace("{target}", &target).replace("{recon_json}", &recon),
         );
@@ -387,11 +398,11 @@ pub async fn run_greybox(cfg: RunConfig, lib: &Library, pool: &ModelPool, tx: Se
         }
         let user = format!(
             "AUTHORIZED greybox engagement on {target} — you also have the source review below. \
-             Proceed and PROVE each issue against the LIVE app.\n\n{directives}{leads}{react}{doctrine}{body}\n\n\
+             Proceed and PROVE each issue against the LIVE app.\n\n{directives}{leads}{react}{depth}{doctrine}{body}\n\n\
              Reply ONLY a JSON array of confirmed findings (may be []): \
              {{id,title,severity,cwe,endpoint,payload,evidence,impact,remediation,confidence}}.",
             target = target, directives = directives, leads = leads,
-            react = REACT_DOCTRINE, doctrine = tool_doctrine(mcp_on),
+            react = REACT_DOCTRINE, depth = DEPTH_DOCTRINE, doctrine = tool_doctrine(mcp_on),
             body = ag.user.replace("{target}", &target).replace("{recon_json}", &recon),
         );
         match pool.complete_routed(Task::Exploit, &ag.name, &ag.system, &user).await {
@@ -439,12 +450,12 @@ async fn chain_round(pool: &ModelPool, target: &str, recon: &str, directives: &s
     let _ = tx.send(format!("chaining {} confirmed finding(s) for deeper impact…", confirmed.len())).await;
     let recon_ctx: String = recon.chars().take(2500).collect();
     let user = format!(
-        "AUTHORIZED engagement on {target}.\n\n{directives}{react}{doctrine}{recipe_block}\
+        "AUTHORIZED engagement on {target}.\n\n{directives}{react}{depth}{doctrine}{recipe_block}\
          CONFIRMED FINDINGS TO CHAIN:\n{summary}\n\nRecon:\n{recon_ctx}\n\n\
          Chain these into deeper impact (e.g. SQLi→RCE→LPE, SSRF→cloud creds, upload→LFI→RCE) and PROVE each stage. \
          Reply ONLY a JSON array of NEW findings \
          (may be []): {{id,title,severity,cwe,endpoint,payload,evidence,impact,remediation,confidence}}.",
-        react = REACT_DOCTRINE, doctrine = tool_doctrine(pool.mcp_config.is_some()),
+        react = REACT_DOCTRINE, depth = DEPTH_DOCTRINE, doctrine = tool_doctrine(pool.mcp_config.is_some()),
     );
     match pool.complete_routed(Task::Exploit, "chain", CHAIN_SYS, &user).await {
         Ok((m, text)) => {
@@ -623,6 +634,20 @@ async fn finish(cfg: RunConfig, _lib: &Library, recon: String, transcript: Strin
         let _ = tx.send(format!("grounding gate: demoted {demoted}/{before} ungrounded claim(s) (no tool receipt)")).await;
     }
 
+    // --- v3.5.2 report-hygiene & exploitation-depth pass ---
+    // Calibrate inflated/unproven High-Critical to Medium, flag exposures that
+    // were never exploited ("exposed → exploited"), and advise consolidating
+    // hygiene findings duplicated across many assets.
+    for n in crate::hygiene::calibrate(&mut findings) {
+        let _ = tx.send(format!("calibrate: {n}")).await;
+    }
+    for n in crate::hygiene::depth_audit(&findings) {
+        let _ = tx.send(format!("notify: {n}")).await;
+    }
+    for n in crate::hygiene::hygiene_summary(&findings) {
+        let _ = tx.send(format!("notify: {n}")).await;
+    }
+
     // --- POMDP belief: build from grounded findings, report residual uncertainty ---
     let mut wm = crate::belief::WorldModel::new();
     wm.deterministic = whitebox;
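The body of `crate::hygiene::calibrate` is not part of the hunks shown here, so the following is only a rough Python sketch of the rule as the commit message states it: an unproven High/Critical (hedged wording, no payload, thin evidence) is capped to Medium and re-titled "(potential)". The field names follow the JSON finding schema used in the prompts above; the hedge-word list, the `MIN_EVIDENCE_CHARS` threshold and the helper `is_unproven` are assumptions made for illustration, not the harness's actual heuristics.

```python
import re

# Hypothetical heuristics: the real rules live in crates/harness/src/hygiene.rs
# and are not shown in this diff.
HEDGE_WORDS = re.compile(r"\b(could|may|might|potential(ly)?|possibl[ey])\b", re.IGNORECASE)
MIN_EVIDENCE_CHARS = 40  # assumed cut-off for "thin" evidence


def is_unproven(finding: dict) -> bool:
    """Return True when a finding shows no demonstrated impact."""
    payload = (finding.get("payload") or "").strip()
    evidence = (finding.get("evidence") or "").strip()
    impact = finding.get("impact") or ""
    hedged = bool(HEDGE_WORDS.search(impact))
    return hedged or not payload or len(evidence) < MIN_EVIDENCE_CHARS


def calibrate(findings: list) -> list:
    """Cap unproven High/Critical findings to Medium in place; return one note per change."""
    notes = []
    for f in findings:
        sev = (f.get("severity") or "").lower()
        if sev in ("high", "critical") and is_unproven(f):
            notes.append(f"{f.get('id', '?')}: {sev} -> medium (impact not demonstrated)")
            f["severity"] = "Medium"
            if "(potential)" not in f.get("title", ""):
                f["title"] = f"{f.get('title', '').strip()} (potential)"
    return notes
```

Returning the notes as a list mirrors how `finish()` iterates over the result and forwards each entry to the progress channel; a finding with a real payload and a substantial request/response excerpt passes through unchanged.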
@@ -1,4 +1,4 @@
-//! POMDP decision layer (v3.5.1): value-of-information planning + the
+//! POMDP decision layer (v3.5.2): value-of-information planning + the
 //! anti-hallucination gate.
 //!
 //! The choice "scan more vs exploit now" is **not** a heuristic here — it falls
@@ -97,9 +97,9 @@ pub fn html(target: &str, findings: &[Finding]) -> String {
          h4{{margin:12px 0 3px;font-size:12px;text-transform:uppercase;letter-spacing:.5px;color:#8b5cf6}}\
          .b{{color:#8b5cf6;font-weight:800}}</style></head><body>\
          <h1><span class=b>NeuroSploit</span> Penetration Test Report</h1>\
-         <div class=meta>Target: <b>{t}</b> · v3.5.1 Rust harness · multi-model validated</div>\
+         <div class=meta>Target: <b>{t}</b> · v3.5.2 Rust harness · multi-model validated</div>\
          <div>{chips}</div>{graph_block}<h2>Findings ({n})</h2>{body}\
-         <p class=meta>Authorized testing only. Findings confirmed by multi-model adversarial voting.<br>NeuroSploit v3.5.1 · by <b>Joas A Santos</b> & <b>Red Team Leaders</b></p></body></html>",
+         <p class=meta>Authorized testing only. Findings confirmed by multi-model adversarial voting.<br>NeuroSploit v3.5.2 · by <b>Joas A Santos</b> & <b>Red Team Leaders</b></p></body></html>",
         t = esc(target), chips = chips, n = sorted.len(), body = body, graph_block = graph_block,
     )
 }
@@ -135,7 +135,7 @@ pub fn typst_report(target: &str, findings: &[Finding], dir: &Path) -> std::io::
     let mut data = String::new();
     data.push_str(&format!(
         "#let meta = (target: {}, run_id: {}, generated: {}, model: {})\n",
-        tq(target), tq(&run_id), tq("NeuroSploit v3.5.1"), tq("multi-model")
+        tq(target), tq(&run_id), tq("NeuroSploit v3.5.2"), tq("multi-model")
     ));
     data.push_str("#let findings = (\n");
     for f in sorted_findings(findings) {
@@ -0,0 +1,183 @@
#!/usr/bin/env python3
"""
NeuroSploit v3.5.2 — exploitation-depth & report-hygiene doctrine agents.

Distilled from reviewing real AI-pentest output that kept stopping at
"exposed" instead of "exploited". Emits meta-agents to agents_md/meta/ that
push the engine past detection to demonstrated impact, chain findings, decode
artifacts/correlate CVEs, audit tokens, and keep the report honest (dedup +
severity calibration). Credits: Joas A Santos & Red Team Leaders.
"""
import os

ROOT = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
OUT = os.path.join(ROOT, "agents_md", "meta")

CREDITS = "Credits: Joas A Santos and Red Team Leaders."


def render(a):
    L = [f"# {a['title']}\n",
         f"> Meta-agent (v3.5.2 doctrine). {a['tagline']}\n",
         "## User Prompt",
         a["user"].strip(), "",
         "## System Prompt",
         a["system"].strip() + " " + CREDITS]
    return "\n".join(L) + "\n"


AGENTS = [
    {"name": "exploit_depth_doctrine",
     "title": "Exploitation Depth Doctrine Agent",
     "tagline": "Turns every exposure into an exploitation attempt before it becomes a finding.",
     "user": """
You are reviewing the candidate findings and live transcript for **{target}**.

For EACH candidate that merely *exposes* something (information disclosure,
exposed service/catalog/WSDL, leaked credential or token, reachable dev/staging
host, permissive CORS, open .git), drive it one step further BEFORE it is
reported:

1. **Use what was exposed.** Call the exposed endpoint, decode the leaked
   artifact, log in with the leaked credential, hit the dev host, send the
   cross-origin request. Capture the real request/response.
2. **Decide honestly.** If using it proved impact → keep/raise severity with the
   new evidence. If it could not be used → down-rate to a LEAD (low confidence),
   never a confirmed High/Critical.
3. **Report the gap.** List any exposure you could not yet exploit, with the
   exact next command to try, so the next round (or the human) can finish it.

Output JSON: {"escalations":[{id, action_taken, new_evidence, new_severity}],
"leads":[{id, why_not_proven, next_command}]}.
""",
     "system": """
You are a senior exploitation lead. Detection is not a finding — impact is. You
never let an info-disclosure, exposed service, leaked secret or reachable
non-prod host be reported as confirmed without an attempt to actually use it,
backed by a real tool receipt. Unproven impact is a lead, not a High. Authorized
engagement; no destructive or DoS actions.
"""},

    {"name": "finding_chainer",
     "title": "Finding Chainer Agent",
     "tagline": "Reuses obtained access across modules and reports the chain, not the parts.",
     "user": """
Given the confirmed findings and any sessions/tokens/credentials obtained during
the engagement on **{target}**, build exploitation CHAINS:

- Reuse every session/JWT/cookie/credential from one step against ALL other
  modules and hosts in scope (a captcha/login bypass that yields a token unlocks
  the entire authenticated surface — use it).
- Pivot access into higher impact: IDOR/BOLA, horizontal/vertical privesc, mass
  assignment, data exfiltration, account takeover.
- Combine separate weaknesses (e.g. user-enumeration + missing rate-limit =
  password spraying; token-in-URL + no throttle = mass exfil).

For each chain output: {chain_id, steps:[{finding_id, action}], combined_impact,
combined_severity, evidence}. Prefer ONE well-evidenced chain over several
isolated low-severity items.
""",
     "system": """
You are an exploit-chaining specialist. Isolated findings understate risk; the
real story is the chain. You always try to reuse obtained access across the
whole scope and escalate to business impact, reporting the combined chain with
concrete evidence. Authorized engagement; no destructive or DoS actions.
"""},

    {"name": "artifact_decoder",
     "title": "Artifact Decoder & CVE Correlator Agent",
     "tagline": "Decodes opaque tokens/paths, fingerprints the stack, and maps versions to CVEs.",
     "user": """
For **{target}**, inspect every opaque or technology-revealing artifact seen in
recon and responses:

1. **Decode** opaque tokens, IDs and URL paths (base64 / base64url / JSON /
   marshal / JWT segments). A decoded value often reveals the framework or an
   internal file path (e.g. a Dragonfly job `[["f","...file"]]`, a signed-URL
   structure, a serialized object).
2. **Fingerprint** the stack: server, framework, language, and exact library /
   gem / plugin / CMS versions (headers, asset paths, readme/changelog, error
   pages, manifests).
3. **Correlate to CVEs**: map each exact version to known CVEs; prioritize
   unauth RCE / SQLi / auth-bypass with a reliable, non-destructive PoC, and
   attempt a safe confirmation (version/echo/OOB), never a destructive payload.

Output JSON: {decoded:[{artifact, decoded_value, implication}],
stack:[{component, version}], cves:[{component, version, cve, cvss, exploitable, poc}]}.
""",
     "system": """
You decode the opaque and correlate the obvious. Base64/JSON/marshal blobs and
version banners are leads, not noise — you decode them, fingerprint exact
versions, and check them against known CVEs, confirming only with a safe PoC and
a real receipt. Authorized engagement; no destructive or DoS actions.
"""},

    {"name": "token_auditor",
     "title": "Token & JWT Auditor Agent",
     "tagline": "Attacks tokens: alg-confusion, none, kid/jku, signature checks, weak HS256 secrets.",
     "user": """
For any session token or JWT issued by **{target}**, run a full auth-token audit:

1. **Decode** the header/payload; note alg (HS*/RS*/none), kid, jku, exp, claims.
2. **Algorithm attacks**: try `alg:none`, RS→HS confusion (sign with the public
   key as HMAC secret), and kid/jku injection. Confirm whether the server
   actually verifies the signature (tamper a claim and replay).
3. **Weak secret**: for HS256, attempt to crack the signing secret offline
   (wordlist/rules); a static or guessable shared secret (e.g. an `x-auth-*`
   header value) is a strong lead — if cracked, forge a token for any user.
4. **Lifecycle**: test reuse after logout, expiry enforcement, and refresh-token
   revocation.

Output JSON: {token_type, alg, verified:true|false,
attacks:[{name, result, evidence}], forged_token_possible:true|false}.
""",
     "system": """
You are a token-security specialist. Every JWT/session token gets audited for
algorithm confusion, none, kid/jku injection, real signature verification, weak
HS256 secrets, and lifecycle (logout/expiry/refresh). A forged or replayable
token is account takeover — you prove it with a real receipt. Authorized
engagement; no destructive or DoS actions.
"""},

    {"name": "report_calibrator",
     "title": "Report Calibrator Agent",
     "tagline": "Dedups by class, calibrates severity to proven impact, demands evidence per claim.",
     "user": """
Before the final report for **{target}**, clean and calibrate the findings:

1. **Consolidate hygiene by class.** Merge repeated hygiene findings (missing
   security headers, clickjacking, cookie flags, weak TLS, HSTS, version/banner
   disclosure) into ONE finding per class with an affected-asset TABLE — do not
   inflate the count one-per-host.
2. **Calibrate severity to PROVEN impact.** High/Critical requires demonstrated
   impact with evidence. Unproven DoS/abuse, "could/may/potential" language, or a
   finding with no concrete payload/PoC → cap to Low/Medium or mark
   "(potential)". Recompute the CVSS vector to match the proven impact.
3. **Evidence per claim.** Every finding — and every item in the "tests
   performed" log — must carry a concrete request/response receipt; flag any
   claim that has none, and any contradiction between the test log and the
   findings.

Output JSON: {merged:[{class, severity, assets:[...]}],
recalibrated:[{id, old_severity, new_severity, reason}],
unevidenced:[{id_or_test, missing}]}.
""",
     "system": """
You are a meticulous report editor. You group hygiene by class with an
asset table, calibrate every severity to demonstrated impact (no inflated
High/Critical, no padding the count with duplicates), and require a real
receipt behind every claim — including each line of the tests-performed log.
Honest, deduplicated, evidence-backed reporting only.
"""},
]


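def main():
    os.makedirs(OUT, exist_ok=True)
    for a in AGENTS:
        path = os.path.join(OUT, a["name"] + ".md")
        # Close each file deterministically and write UTF-8 explicitly: the
        # prompts contain non-ASCII characters (arrows, dashes) that can fail
        # or be garbled under a platform-default encoding.
        with open(path, "w", encoding="utf-8") as fh:
            fh.write(render(a))
    print(f"wrote {len(AGENTS)} v3.5.2 doctrine meta-agents to {OUT}")


if __name__ == "__main__":
    main()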
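Step 1 of the `token_auditor` prompt (decode the header/payload and note alg, kid, jku, exp) can be illustrated offline with the standard library alone. This is a minimal sketch and not part of the commit: `b64url_decode` and `triage_jwt` are hypothetical helper names, the lead strings are illustrative, and nothing here verifies a signature or contacts a target.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the padding that JWTs strip."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def triage_jwt(token: str) -> dict:
    """Decode header and claims WITHOUT verifying the signature; list follow-up checks."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    header = json.loads(b64url_decode(header_b64))
    claims = json.loads(b64url_decode(payload_b64))
    alg = str(header.get("alg", ""))
    leads = []
    if alg.lower() == "none" or not signature_b64:
        leads.append("alg:none or empty signature: replay a tampered token")
    if alg.upper().startswith("HS"):
        leads.append("HMAC alg: try offline secret cracking")
    if alg.upper().startswith("RS"):
        leads.append("RSA alg: try RS->HS confusion with the public key")
    for field in ("kid", "jku"):
        if field in header:
            leads.append(f"{field} present: test header injection")
    if "exp" not in claims:
        leads.append("no exp claim: check expiry enforcement")
    return {"alg": alg, "header": header, "claims": claims, "leads": leads}
```

Each entry in `leads` is only a lead in the doctrine's sense: it becomes a finding once the corresponding attack has been attempted against the live target and the request/response is captured as evidence.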
@@ -25,7 +25,7 @@ cat <<'BANNER'
 
 ███╗ ██╗███████╗██╗ ██╗██████╗ ██████╗
 ████╗ ██║██╔════╝██║ ██║██╔══██╗██╔═══██╗ NeuroSploit installer
-██╔██╗ ██║█████╗ ██║ ██║██████╔╝██║ ██║ v3.5.1 — Rust harness
+██╔██╗ ██║█████╗ ██║ ██║██████╔╝██║ ██║ v3.5.2 — Rust harness
 ██║╚██╗██║██╔══╝ ██║ ██║██╔══██╗██║ ██║ by Joas A Santos
 ██║ ╚████║███████╗╚██████╔╝██║ ██║╚██████╔╝ & Red Team Leaders
 ╚═╝ ╚═══╝╚══════╝ ╚═════╝ ╚═╝ ╚═╝ ╚═════╝