mirror of
https://github.com/CyberSecurityUP/NeuroSploit.git
synced 2026-06-30 07:15:30 +02:00
v3.5.1: POMDP belief-state + value-of-information planner + grounded anti-hallucination
Partial observability is now first-class:

- belief.rs — property-graph world model; nodes (host/service/vuln/exploit/cred) carry a probability, not a boolean. Bayesian observation updates; per-node Shannon entropy; mean-uncertainty + recon-frontier. Black-box = diffuse priors that sharpen with observation; white-box collapses toward deterministic (MDP).
- pomdp.rs — value_of_information(), decide() (recon vs exploit falls out of belief entropy), and may_assert() — the mathematical anti-hallucination gate: no exploitability claim while the belief is diffuse (high entropy) → observe first.
- grounding.rs — verification engine, hard rule "no claim without a tool receipt": empirical grounding for black-box (raw HTTP/OOB/error markers), symbolic for white-box (file:line into reviewed source). Ungrounded claims demoted + flagged receipt_missing (feeds future reward shaping).
- pipeline.finish(): grounding gate before reporting + belief-uncertainty readout.
- bump 3.5.0 → 3.5.1; README documents the v3.5.1 belief/grounding architecture and the infra/bandit/reward roadmap.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@@ -1,4 +1,4 @@
-<h1 align="center">🧠 NeuroSploit v3.5.0</h1>
+<h1 align="center">🧠 NeuroSploit v3.5.1</h1>

 <p align="center">
   <a href="https://github.com/JoasASantos/NeuroSploit/stargazers"><img src="https://img.shields.io/github/stars/JoasASantos/NeuroSploit?style=for-the-badge&logo=github&color=8b5cf6" alt="Stars"></a>
@@ -8,7 +8,7 @@
 </p>

 <p align="center">
-  <img src="https://img.shields.io/badge/Version-3.5.0-blue?style=flat-square">
+  <img src="https://img.shields.io/badge/Version-3.5.1-blue?style=flat-square">
   <img src="https://img.shields.io/badge/Harness-Rust%20%7C%20tokio-e6b673?style=flat-square">
   <img src="https://img.shields.io/badge/License-MIT-green?style=flat-square">
   <img src="https://img.shields.io/badge/MD%20Agents-303-red?style=flat-square">
@@ -23,6 +23,36 @@

 ---
+
+## 🆕 v3.5.1 — POMDP belief & grounded anti-hallucination
+
+The target is only **partially observable**, so v3.5.1 stops treating findings as
+booleans and tracks a **belief**:
+
+- **Belief world model** (`belief.rs`) — a property graph whose nodes
+  (host / service / vuln / exploit / credential) each carry a *probability*, not a
+  boolean. Observations update them with a Bayesian step; per-node **Shannon
+  entropy** measures how diffuse the belief still is.
+- **Value-of-information planner** (`pomdp.rs`) — "scan more vs exploit now" is
+  not a heuristic: when a node's belief is diffuse, the expected value of an
+  observation (recon) exceeds the risk-adjusted value of an exploit. The
+  `may_assert` gate is the **mathematical anti-hallucination rule** — the agent
+  may not claim exploitability while the belief is diffuse; it must observe first.
+- **Grounding / verification engine** (`grounding.rs`) — a hard rule: **no claim
+  enters the world model without a tool receipt** (raw tool output, not the LLM's
+  paraphrase). Black-box grounding is *empirical* (a real HTTP response / OOB
+  callback / error oracle); white-box is *symbolic* (a `file:line` into the
+  reviewed source). Ungrounded claims are demoted and flagged `receipt_missing`.
+- **Regimes** — black-box runs a true POMDP (diffuse priors that sharpen with
+  observation); white-box collapses toward a near-deterministic MDP (the world
+  model is built from SAST/dataflow, so uncertainty migrates to *path
+  reachability*, not state).
+
+> Roadmap (in progress on this branch): infra targets (IP + SSH/Windows/AD) with
+> Linux/Windows/AD host agents, a contextual-bandit tool router, and
+> value-of-information reward shaping.
+
+---

 **Autonomous, multi-model penetration-testing harness — Rust, CLI-only.**

 This branch is the **slim, Rust-only** distribution: the `neurosploit-rs/` workspace
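The value-of-information rule described in the README hunk above can be sketched in isolation. This is a minimal illustration, not the harness code: the per-kind weights are the ones that appear in `pomdp.rs` later in this commit, while the free functions `entropy`, `voi`, and the `rank` helper are hypothetical names introduced only for this example.

```rust
/// Node kinds, mirroring the `Kind` enum added in `belief.rs`.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Kind {
    Host,
    Service,
    Vuln,
    Exploit,
    Credential,
}

/// Shannon entropy in bits of a Bernoulli(p) belief (1.0 at p = 0.5).
pub fn entropy(p: f64) -> f64 {
    let p = p.clamp(1e-6, 1.0 - 1e-6);
    -(p * p.log2() + (1.0 - p) * (1.0 - p).log2())
}

/// Value of information: remaining entropy scaled by a per-kind weight.
/// The weights are copied from `pomdp.rs` in this commit.
pub fn voi(kind: Kind, p: f64) -> f64 {
    let weight = match kind {
        Kind::Exploit | Kind::Credential => 1.0,
        Kind::Vuln => 0.8,
        Kind::Service => 0.5,
        Kind::Host => 0.4,
    };
    entropy(p) * weight
}

/// Hypothetical helper: order (label, kind, p) candidates by descending VoI.
pub fn rank<'a>(mut nodes: Vec<(&'a str, Kind, f64)>) -> Vec<&'a str> {
    nodes.sort_by(|a, b| voi(b.1, b.2).total_cmp(&voi(a.1, a.2)));
    nodes.into_iter().map(|n| n.0).collect()
}
```

Under these weights a half-believed exploit node (p = 0.8, entropy about 0.72) outranks a completely unknown host (p = 0.5, entropy 1.0 but weight 0.4), which in turn outranks a credential the model is already nearly sure of (p = 0.99).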
Generated
+2
-2
@@ -695,7 +695,7 @@ dependencies = [

 [[package]]
 name = "neurosploit"
-version = "3.5.0"
+version = "3.5.1"
 dependencies = [
  "anyhow",
  "clap",
@@ -710,7 +710,7 @@ dependencies = [

 [[package]]
 name = "neurosploit-harness"
-version = "3.5.0"
+version = "3.5.1"
 dependencies = [
  "anyhow",
  "futures",
@@ -3,7 +3,7 @@ members = ["crates/harness", "app"]
 resolver = "2"

 [workspace.package]
-version = "3.5.0"
+version = "3.5.1"
 edition = "2021"
 license = "MIT"
 repository = "https://github.com/JoasASantos/NeuroSploit"
@@ -1,4 +1,4 @@
-//! NeuroSploit v3.5.0 — interactive harness + CLI (`run` / `whitebox` / `agents` / `models`).
+//! NeuroSploit v3.5.1 — interactive harness + CLI (`run` / `whitebox` / `agents` / `models`).

 mod repl;

@@ -10,8 +10,8 @@ use std::path::{Path, PathBuf};
 #[command(
     name = "neurosploit",
     version,
-    about = "NeuroSploit v3.5.0 — multi-model autonomous pentest harness",
-    long_about = "NeuroSploit v3.5.0 — a Rust multi-model harness that drives a pool of LLMs \
+    about = "NeuroSploit v3.5.1 — multi-model autonomous pentest harness",
+    long_about = "NeuroSploit v3.5.1 — a Rust multi-model harness that drives a pool of LLMs \
         (API key or local subscription: Claude/Codex/Gemini/Grok) to autonomously test a target. \
         After recon it INTELLIGENTLY selects only the agents matching the discovered surface, runs \
         them in parallel, then validates every finding by cross-model voting before reporting.\n\n\
@@ -276,7 +276,7 @@ async fn run_mode(base: &Path, mut cfg: RunConfig, mcp: bool, mode: Mode) -> any
     cfg.rl_path = Some(base.join("data").join("rl_state_rs.json").display().to_string());
     write_status(&workdir, "running", &format!("\"target\":{:?}", cfg.target));

-    println!(" ┌─ NeuroSploit v3.5.0 · by Joas A Santos & Red Team Leaders");
+    println!(" ┌─ NeuroSploit v3.5.1 · by Joas A Santos & Red Team Leaders");
     println!(" │ run id : {run_id}");
     println!(" │ target : {}", cfg.target);
     println!(" │ models : {}", cfg.models.join(", "));
@@ -1,4 +1,4 @@
-//! NeuroSploit v3.5.0 — interactive session (Claude-Code / Codex / Cursor-CLI style).
+//! NeuroSploit v3.5.1 — interactive session (Claude-Code / Codex / Cursor-CLI style).
 //!
 //! Launched when `neurosploit` runs with no subcommand. A persistent REPL with
 //! real line editing (arrow-key history recall, Ctrl-A/E/K, paste), model
@@ -191,7 +191,7 @@ pub async fn repl(base: &Path) -> anyhow::Result<()> {
     let backends = harness::installed_cli_backends();
     println!("\x1b[1m");
     println!(" ███╗ ██╗███████╗██╗ ██╗██████╗ ██████╗");
-    println!(" ████╗ ██║██╔════╝██║ ██║██╔══██╗██╔═══██╗ NeuroSploit v3.5.0");
+    println!(" ████╗ ██║██╔════╝██║ ██║██╔══██╗██╔═══██╗ NeuroSploit v3.5.1");
     println!(" ██╔██╗ ██║█████╗ ██║ ██║██████╔╝██║ ██║ interactive harness");
     println!(" ██║╚██╗██║██╔══╝ ██║ ██║██╔══██╗██║ ██║ by Joas A Santos");
     println!(" ██║ ╚████║███████╗╚██████╔╝██║ ██║╚██████╔╝ & Red Team Leaders");
@@ -0,0 +1,146 @@
+//! POMDP belief-state world model (v3.5.1).
+//!
+//! The target is only partially observable, so we don't track booleans — we
+//! track a **belief**: a property graph whose nodes (host / service / vuln /
+//! credential) each carry a probability that the proposition is true. Recon
+//! produces *observations* that update those beliefs via a Bayesian step; the
+//! per-node Shannon entropy measures how diffuse the belief still is.
+//!
+//! - **Black-box**: beliefs start uncertain (~0.5) and sharpen with observation.
+//! - **White-box**: the world model is built (near-)deterministically from
+//!   source/SAST, so beliefs collapse toward 0/1 — the POMDP degenerates into an
+//!   MDP and uncertainty migrates to *path reachability*, not state.
+//!
+//! This is the substrate for value-of-information planning (see `pomdp.rs`): when
+//! a node's belief is diffuse, gathering an observation about it is worth more
+//! than acting on it — which is also the anti-hallucination criterion.
+
+use serde::{Deserialize, Serialize};
+use std::collections::HashMap;
+
+/// What a belief node is about.
+#[derive(Clone, Copy, Debug, PartialEq, Eq, Serialize, Deserialize)]
+pub enum Kind {
+    Host,       // a host exists / is reachable
+    Service,    // a service/endpoint is present
+    Vuln,       // a specific weakness is present
+    Exploit,    // the weakness is actually exploitable
+    Credential, // a credential is valid
+}
+
+/// A single proposition with a probability of being true and the evidence count
+/// behind it (used for confidence/entropy).
+#[derive(Clone, Debug, Serialize, Deserialize)]
+pub struct Node {
+    pub id: String,
+    pub kind: Kind,
+    pub label: String,
+    /// P(proposition is true) ∈ [0,1].
+    pub p: f64,
+    /// number of independent observations folded in.
+    pub obs: u32,
+}
+
+impl Node {
+    /// Shannon entropy in bits of the Bernoulli(p) belief — 1.0 = maximally
+    /// uncertain (p=0.5), 0.0 = certain.
+    pub fn entropy(&self) -> f64 {
+        let p = self.p.clamp(1e-6, 1.0 - 1e-6);
+        -(p * p.log2() + (1.0 - p) * (1.0 - p).log2())
+    }
+}
+
+/// A directed edge: "from enables/leads-to to" with a transition probability.
+#[derive(Clone, Debug, Serialize, Deserialize)]
+pub struct Edge {
+    pub from: String,
+    pub to: String,
+    pub p: f64,
+}
+
+/// The belief: a property graph over the partially-observed target.
+#[derive(Default, Clone, Serialize, Deserialize)]
+pub struct WorldModel {
+    pub nodes: HashMap<String, Node>,
+    pub edges: Vec<Edge>,
+    /// true once beliefs were built deterministically (white-box → MDP regime).
+    pub deterministic: bool,
+}
+
+/// A sensed observation about a node: P(observation | true) vs P(observation | false).
+/// `positive` true means the observation supports the proposition.
+pub struct Observation<'a> {
+    pub node: &'a str,
+    pub positive: bool,
+    /// sensor reliability ∈ (0.5, 1.0]; how much one observation moves the belief.
+    pub reliability: f64,
+}
+
+impl WorldModel {
+    pub fn new() -> Self {
+        WorldModel::default()
+    }
+
+    /// Seed a node with a prior. Black-box priors are ~0.5 (unknown); white-box
+    /// callers pass priors near 0/1.
+    pub fn add(&mut self, id: &str, kind: Kind, label: &str, prior: f64) {
+        self.nodes.entry(id.to_string()).or_insert_with(|| Node {
+            id: id.to_string(),
+            kind,
+            label: label.to_string(),
+            p: prior.clamp(0.0, 1.0),
+            obs: 0,
+        });
+    }
+
+    pub fn link(&mut self, from: &str, to: &str, p: f64) {
+        self.edges.push(Edge { from: from.into(), to: to.into(), p: p.clamp(0.0, 1.0) });
+    }
+
+    /// Bayesian update of a node's belief from one observation. With sensor
+    /// reliability r: a positive obs multiplies the odds by r/(1-r), a negative
+    /// one by (1-r)/r.
+    pub fn observe(&mut self, o: Observation) {
+        let r = o.reliability.clamp(0.5 + 1e-6, 1.0 - 1e-6);
+        if let Some(n) = self.nodes.get_mut(o.node) {
+            let p = n.p.clamp(1e-6, 1.0 - 1e-6);
+            let prior_odds = p / (1.0 - p);
+            let lr = if o.positive { r / (1.0 - r) } else { (1.0 - r) / r };
+            let post_odds = prior_odds * lr;
+            n.p = post_odds / (1.0 + post_odds);
+            n.obs += 1;
+        }
+    }
+
+    /// Collapse a node to (near-)certainty — used by white-box when SAST/dataflow
+    /// determines the proposition deterministically.
+    pub fn set_known(&mut self, id: &str, truth: bool) {
+        if let Some(n) = self.nodes.get_mut(id) {
+            n.p = if truth { 0.98 } else { 0.02 };
+            n.obs += 3;
+        }
+    }
+
+    /// Mean entropy across nodes of a kind (or all). 1.0 = totally diffuse.
+    pub fn uncertainty(&self, kind: Option<Kind>) -> f64 {
+        let rel: Vec<&Node> = self.nodes.values()
+            .filter(|n| kind.map(|k| n.kind == k).unwrap_or(true)).collect();
+        if rel.is_empty() {
+            return 1.0;
+        }
+        rel.iter().map(|n| n.entropy()).sum::<f64>() / rel.len() as f64
+    }
+
+    /// Nodes whose belief is still diffuse (entropy above `thresh`) — the recon
+    /// frontier: where collecting an observation has the highest value.
+    pub fn frontier(&self, thresh: f64) -> Vec<&Node> {
+        let mut v: Vec<&Node> = self.nodes.values().filter(|n| n.entropy() > thresh).collect();
+        v.sort_by(|a, b| b.entropy().partial_cmp(&a.entropy()).unwrap_or(std::cmp::Ordering::Equal));
+        v
+    }
+
+    /// Is a proposition confident enough to *act/assert* on? (low entropy + high p)
+    pub fn is_confident(&self, id: &str, min_p: f64, max_entropy: f64) -> bool {
+        self.nodes.get(id).map(|n| n.p >= min_p && n.entropy() <= max_entropy).unwrap_or(false)
+    }
+}
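To make the odds-form update in `observe()` concrete, here is a standalone sketch with worked numbers. It follows the same arithmetic as the file above but drops the graph; the free functions `entropy` and `update` are names assumed for this illustration and are not part of the commit.

```rust
/// Shannon entropy in bits of a Bernoulli(p) belief, clamped like `Node::entropy`.
pub fn entropy(p: f64) -> f64 {
    let p = p.clamp(1e-6, 1.0 - 1e-6);
    -(p * p.log2() + (1.0 - p) * (1.0 - p).log2())
}

/// One Bayesian step in odds form with sensor reliability `r`:
/// a positive observation multiplies the odds by r/(1-r),
/// a negative one by (1-r)/r.
pub fn update(p: f64, positive: bool, r: f64) -> f64 {
    let r = r.clamp(0.5 + 1e-6, 1.0 - 1e-6);
    let p = p.clamp(1e-6, 1.0 - 1e-6);
    let prior_odds = p / (1.0 - p);
    let lr = if positive { r / (1.0 - r) } else { (1.0 - r) / r };
    let post_odds = prior_odds * lr;
    post_odds / (1.0 + post_odds)
}
```

Starting from a black-box prior of 0.5 with r = 0.8, one positive observation gives odds 1 × 4, so p = 0.8 (entropy about 0.72 bits); a second gives odds 16, so p = 16/17 ≈ 0.94 (entropy about 0.32 bits). A positive followed by a negative observation of equal reliability cancels out and returns the belief to 0.5. For comparison, the 0.98 that `set_known` assigns corresponds to roughly 0.14 bits.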
@@ -0,0 +1,87 @@
+//! Verification / grounding engine (v3.5.1).
+//!
+//! Hard rule: **no claim enters the world model without a tool receipt** — raw
+//! tool output, not the LLM's paraphrase. This is the empirical anti-hallucination
+//! anchor that complements the POMDP belief gate:
+//!
+//! - **Black-box**: grounding is empirical — the finding's evidence must look
+//!   like raw tool output (an HTTP response, an OOB callback, an error oracle),
+//!   not prose.
+//! - **White-box**: grounding is symbolic — a file:line reference into the
+//!   reviewed source (reachability/taint), checked against the collected context.
+//!
+//! Ungrounded claims are flagged (`receipt_missing`) so the reward layer can
+//! penalize them (the "claim without receipt" term).
+
+use crate::types::Finding;
+
+/// Verdict of grounding a single finding.
+pub struct Grounded {
+    pub ok: bool,
+    pub kind: &'static str, // "empirical" | "symbolic" | "missing"
+    pub reason: String,
+}
+
+/// Markers that suggest the evidence is a real tool receipt rather than prose.
+fn looks_empirical(evidence: &str) -> bool {
+    let e = evidence.to_lowercase();
+    let markers = [
+        "http/", "status", "200", "301", "302", "401", "403", "500",
+        "set-cookie", "location:", "content-type", "<html", "<script",
+        "server:", "x-", "alert(", "uid=", "root:", "sql", "error", "stack",
+        "callback", "oob", "collaborator", "$ ", "# ", "curl", "nmap",
+    ];
+    evidence.len() >= 24 && markers.iter().filter(|m| e.contains(*m)).count() >= 2
+}
+
+/// White-box: evidence should reference a source location present in `context`.
+fn looks_symbolic(f: &Finding, context: &str) -> bool {
+    // endpoint like file.ext:line, and the file appears in the reviewed source.
+    let loc = &f.endpoint;
+    if let Some((file, _)) = loc.rsplit_once(':') {
+        let base = file.rsplit('/').next().unwrap_or(file);
+        if !base.is_empty() && context.contains(base) {
+            return true;
+        }
+    }
+    // or the evidence quotes code that is actually in the context
+    !f.evidence.trim().is_empty()
+        && f.evidence.split_whitespace().take(6).collect::<Vec<_>>().join(" ")
+            .split_whitespace()
+            .filter(|t| t.len() > 4 && context.contains(*t))
+            .count()
+            >= 2
+}
+
+/// Ground a finding. `context` is the reviewed source for white-box (empty for
+/// black-box). Returns whether it has a valid receipt and of what kind.
+pub fn ground(f: &Finding, context: &str, whitebox: bool) -> Grounded {
+    if whitebox && !context.is_empty() {
+        if looks_symbolic(f, context) {
+            return Grounded { ok: true, kind: "symbolic", reason: "source location/quote matches reviewed code".into() };
+        }
+        return Grounded { ok: false, kind: "missing", reason: "no source reference into reviewed code".into() };
+    }
+    if looks_empirical(&f.evidence) {
+        Grounded { ok: true, kind: "empirical", reason: "evidence resembles raw tool output".into() }
+    } else {
+        Grounded { ok: false, kind: "missing", reason: "evidence is paraphrase, not a tool receipt".into() }
+    }
+}
+
+/// Apply the grounding gate to a finding set. Ungrounded findings are flagged
+/// (receipt recorded in `votes`) and demoted to unvalidated so they never get
+/// reported as confirmed. Returns (kept, demoted_count).
+pub fn gate(mut findings: Vec<Finding>, context: &str, whitebox: bool) -> (Vec<Finding>, usize) {
+    let mut demoted = 0;
+    for f in findings.iter_mut() {
+        let g = ground(f, context, whitebox);
+        if !g.ok {
+            f.validated = false;
+            f.votes = format!("{} · receipt_missing", f.votes);
+            demoted += 1;
+        }
+    }
+    findings.retain(|f| f.validated);
+    (findings, demoted)
+}
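The empirical check in `looks_empirical` is a marker-count heuristic, and its behaviour is easiest to see on a few inputs. The sketch below reuses the marker list from the file above as a standalone function; the sample evidence strings are invented for illustration only.

```rust
/// Marker list copied from `grounding.rs` in this commit.
const MARKERS: &[&str] = &[
    "http/", "status", "200", "301", "302", "401", "403", "500",
    "set-cookie", "location:", "content-type", "<html", "<script",
    "server:", "x-", "alert(", "uid=", "root:", "sql", "error", "stack",
    "callback", "oob", "collaborator", "$ ", "# ", "curl", "nmap",
];

/// Evidence counts as a tool receipt when it is at least 24 bytes long
/// and contains at least two distinct markers (case-insensitive).
pub fn looks_empirical(evidence: &str) -> bool {
    let e = evidence.to_lowercase();
    let hits = MARKERS.iter().filter(|m| e.contains(**m)).count();
    evidence.len() >= 24 && hits >= 2
}
```

A raw response such as `HTTP/1.1 200 OK` followed by `Server:` and `Set-Cookie:` headers passes, while a sentence like "The login form appears vulnerable to injection." does not. Note that the heuristic is permissive: prose that happens to mention two markers (for example "The server returned an sql error") also passes, so the check filters obvious paraphrase rather than proving that a tool actually ran.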
@@ -1,4 +1,4 @@
-//! NeuroSploit v3.5.0 harness — a robust multi-model runtime for the
+//! NeuroSploit v3.5.1 harness — a robust multi-model runtime for the
 //! markdown-driven autonomous pentest engine.
 //!
 //! The harness loads the `agents_md/` library, drives a *pool* of LLM models
@@ -8,7 +8,10 @@

 pub mod agents;
 pub mod attack_graph;
+pub mod belief;
 pub mod creds;
+pub mod grounding;
+pub mod pomdp;
 pub mod models;
 pub mod pipeline;
 pub mod pool;
@@ -604,6 +604,27 @@ async fn validate(candidates: Vec<Finding>, pool: &ModelPool, sys: &str, vote_n:

 async fn finish(cfg: RunConfig, _lib: &Library, recon: String, transcript: String, mut findings: Vec<Finding>,
                 selected: Vec<Agent>, rl: &mut RlState, tx: Sender<String>) -> RunOutput {
+    // --- Grounding gate: no claim without a tool receipt (anti-hallucination) ---
+    // White/grey carry source context; black-box is verified empirically.
+    let whitebox = cfg.repo.is_some() && cfg.target.starts_with('/');
+    let before = findings.len();
+    let (kept, demoted) = crate::grounding::gate(findings, &transcript, whitebox);
+    findings = kept;
+    if demoted > 0 {
+        let _ = tx.send(format!("grounding gate: demoted {demoted}/{before} ungrounded claim(s) (no tool receipt)")).await;
+    }
+
+    // --- POMDP belief: build from grounded findings, report residual uncertainty ---
+    let mut wm = crate::belief::WorldModel::new();
+    wm.deterministic = whitebox;
+    for f in &findings {
+        wm.add(&f.id, crate::belief::Kind::Exploit, &f.title, f.confidence.max(0.05).min(0.99));
+    }
+    let unc = wm.uncertainty(None);
+    if !findings.is_empty() {
+        let _ = tx.send(format!("belief uncertainty over confirmed findings: {:.2} (0=sharp,1=diffuse)", unc)).await;
+    }
+
     let _ = tx.send(format!("{} validated finding(s)", findings.len())).await;
     // Map findings to OWASP / MITRE / kill-chain stage for the attack graph.
     crate::attack_graph::enrich(&mut findings);
@@ -0,0 +1,109 @@
+//! POMDP decision layer (v3.5.1): value-of-information planning + the
+//! anti-hallucination gate.
+//!
+//! The choice "scan more vs exploit now" is **not** a heuristic here — it falls
+//! out of the belief. When a target node's belief is diffuse (high entropy), the
+//! expected value of an observation (recon) exceeds that of an exploit, because
+//! the observation is expected to sharpen the belief by more than the exploit's
+//! risk-adjusted payoff. That same criterion is the anti-hallucination rule: the
+//! agent must not assert exploitability while the belief about the target state
+//! is diffuse — it must collect more observation first.
+
+use crate::belief::{Kind, WorldModel};
+
+/// What the planner recommends doing next.
+#[derive(Debug, Clone, PartialEq)]
+pub enum Action {
+    /// Gather an observation about a still-diffuse node (recon).
+    Recon { node: String, voi: f64 },
+    /// Act on a node the belief is confident about (exploit/report).
+    Exploit { node: String, ev: f64 },
+    /// Belief is sharp and nothing actionable remains.
+    Stop,
+}
+
+/// Decision thresholds (tunable; could be learned later).
+pub struct Policy {
+    /// Above this belief entropy, recon dominates exploit (value-of-information).
+    pub explore_entropy: f64,
+    /// Minimum P(true) to allow asserting/acting.
+    pub assert_min_p: f64,
+    /// Maximum entropy to allow asserting/acting (the anti-hallucination ceiling).
+    pub assert_max_entropy: f64,
+}
+
+impl Default for Policy {
+    fn default() -> Self {
+        Policy { explore_entropy: 0.6, assert_min_p: 0.7, assert_max_entropy: 0.4 }
+    }
+}
+
+/// Expected value of an observation about a node ≈ how much entropy it can
+/// remove, weighted by the node's relevance (Exploit/Credential nodes matter
+/// most). A sharp belief has ~0 VoI; a diffuse one has VoI≈1×weight.
+pub fn value_of_information(wm: &WorldModel, node_id: &str) -> f64 {
+    let Some(n) = wm.nodes.get(node_id) else { return 0.0 };
+    let weight = match n.kind {
+        Kind::Exploit | Kind::Credential => 1.0,
+        Kind::Vuln => 0.8,
+        Kind::Service => 0.5,
+        Kind::Host => 0.4,
+    };
+    n.entropy() * weight
+}
+
+/// Risk-adjusted expected value of exploiting a node now: only worthwhile when
+/// the belief is both high and sharp.
+fn exploit_ev(wm: &WorldModel, node_id: &str, pol: &Policy) -> f64 {
+    let Some(n) = wm.nodes.get(node_id) else { return 0.0 };
+    if n.entropy() > pol.assert_max_entropy {
+        return 0.0; // too uncertain — exploiting now is gambling
+    }
+    n.p
+}
+
+/// Decide the next macro-action from the current belief: recon the highest-VoI
+/// diffuse node, or exploit the most-confident node, whichever wins.
+pub fn decide(wm: &WorldModel, pol: &Policy) -> Action {
+    // Best recon candidate by value-of-information.
+    let best_recon = wm.nodes.keys()
+        .map(|id| (id.clone(), value_of_information(wm, id)))
+        .max_by(|a, b| a.1.partial_cmp(&b.1).unwrap_or(std::cmp::Ordering::Equal));
+    // Best exploit candidate by risk-adjusted EV.
+    let best_exploit = wm.nodes.values()
+        .filter(|n| matches!(n.kind, Kind::Exploit | Kind::Vuln | Kind::Credential))
+        .map(|n| (n.id.clone(), exploit_ev(wm, &n.id, pol)))
+        .max_by(|a, b| a.1.partial_cmp(&b.1).unwrap_or(std::cmp::Ordering::Equal));
+
+    match (best_recon, best_exploit) {
+        (Some((rid, voi)), exp) => {
+            let ev = exp.as_ref().map(|(_, e)| *e).unwrap_or(0.0);
+            // Value-of-information dominates while the belief is diffuse.
+            if voi >= ev && voi > (1.0 - pol.explore_entropy) {
+                Action::Recon { node: rid, voi }
+            } else if let Some((eid, e)) = exp.filter(|(_, e)| *e > 0.0) {
+                Action::Exploit { node: eid, ev: e }
+            } else {
+                Action::Recon { node: rid, voi }
+            }
+        }
+        (None, Some((eid, e))) if e > 0.0 => Action::Exploit { node: eid, ev: e },
+        _ => Action::Stop,
+    }
+}
+
+/// Anti-hallucination gate. A claim of exploitability about `node` may only be
+/// asserted when the belief is confident AND sharp. Returns Ok(()) to allow the
+/// claim, or Err(reason) to force "collect more observation first".
+pub fn may_assert(wm: &WorldModel, node_id: &str, pol: &Policy) -> Result<(), String> {
+    match wm.nodes.get(node_id) {
+        None => Err("no belief about this target — observe first".into()),
+        Some(n) if n.entropy() > pol.assert_max_entropy =>
+            Err(format!("belief diffuse (entropy {:.2} > {:.2}) — recon before asserting exploitability",
+                n.entropy(), pol.assert_max_entropy)),
+        Some(n) if n.p < pol.assert_min_p =>
+            Err(format!("belief too low (p {:.2} < {:.2}) — not exploitable on current evidence",
+                n.p, pol.assert_min_p)),
+        Some(_) => Ok(()),
+    }
+}
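The two thresholds in `may_assert` interact in a way that is worth checking numerically. The following is a minimal sketch of the same two-step test on a bare probability; the `Gate` enum and the free function `gate` are hypothetical names for this example, while the thresholds passed in the usage below are the `Policy::default()` values from the file above.

```rust
/// Outcome of the assertion gate for a single belief.
#[derive(Debug, PartialEq)]
pub enum Gate {
    Allow,
    Diffuse, // entropy above the ceiling: observe first
    TooLow,  // belief is sharp but points the wrong way
}

/// Shannon entropy in bits of a Bernoulli(p) belief.
pub fn entropy(p: f64) -> f64 {
    let p = p.clamp(1e-6, 1.0 - 1e-6);
    -(p * p.log2() + (1.0 - p) * (1.0 - p).log2())
}

/// Same ordering of checks as `may_assert`: entropy ceiling first, then minimum p.
pub fn gate(p: f64, min_p: f64, max_entropy: f64) -> Gate {
    if entropy(p) > max_entropy {
        Gate::Diffuse
    } else if p < min_p {
        Gate::TooLow
    } else {
        Gate::Allow
    }
}
```

With the defaults (minimum p 0.7, entropy ceiling 0.4), a belief of 0.8 is still rejected as diffuse because its entropy is about 0.72 bits; the ceiling is only met once p rises above roughly 0.92. In practice the entropy ceiling is therefore the binding constraint for positive claims, and the minimum-p check mainly rejects beliefs that are sharp in the negative direction, such as p = 0.02.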
@@ -97,9 +97,9 @@ pub fn html(target: &str, findings: &[Finding]) -> String {
         h4{{margin:12px 0 3px;font-size:12px;text-transform:uppercase;letter-spacing:.5px;color:#8b5cf6}}\
         .b{{color:#8b5cf6;font-weight:800}}</style></head><body>\
         <h1><span class=b>NeuroSploit</span> Penetration Test Report</h1>\
-        <div class=meta>Target: <b>{t}</b> · v3.5.0 Rust harness · multi-model validated</div>\
+        <div class=meta>Target: <b>{t}</b> · v3.5.1 Rust harness · multi-model validated</div>\
         <div>{chips}</div>{graph_block}<h2>Findings ({n})</h2>{body}\
-        <p class=meta>Authorized testing only. Findings confirmed by multi-model adversarial voting.<br>NeuroSploit v3.5.0 · by <b>Joas A Santos</b> & <b>Red Team Leaders</b></p></body></html>",
+        <p class=meta>Authorized testing only. Findings confirmed by multi-model adversarial voting.<br>NeuroSploit v3.5.1 · by <b>Joas A Santos</b> & <b>Red Team Leaders</b></p></body></html>",
         t = esc(target), chips = chips, n = sorted.len(), body = body, graph_block = graph_block,
     )
 }
@@ -135,7 +135,7 @@ pub fn typst_report(target: &str, findings: &[Finding], dir: &Path) -> std::io::
     let mut data = String::new();
     data.push_str(&format!(
         "#let meta = (target: {}, run_id: {}, generated: {}, model: {})\n",
-        tq(target), tq(&run_id), tq("NeuroSploit v3.5.0"), tq("multi-model")
+        tq(target), tq(&run_id), tq("NeuroSploit v3.5.1"), tq("multi-model")
     ));
     data.push_str("#let findings = (\n");
     for f in sorted_findings(findings) {
@@ -1,4 +1,4 @@
-// NeuroSploit v3.5.0 — Typst report template (blank, structured).
+// NeuroSploit v3.5.1 — Typst report template (blank, structured).
 //
 // The harness generates `report.typ` per run by prepending a `findings` array
 // and a `meta` dict, then including this template's rendering logic. This file
@@ -24,7 +24,7 @@

 #set page(margin: 2cm, numbering: "1", footer: context [
   #set text(size: 8pt, fill: gray)
-  NeuroSploit v3.5.0 · #meta.target · confidential
+  NeuroSploit v3.5.1 · #meta.target · confidential
   #h(1fr) #counter(page).display()
 ])
 #set text(font: ("Helvetica Neue", "Helvetica", "Arial"), size: 10pt)