mirror of https://github.com/CyberSecurityUP/NeuroSploit.git synced 2026-06-30 07:15:30 +02:00

CyberSecurityUP a8676fee0a v3.5.1: POMDP belief-state + value-of-information planner + grounded anti-hallucination

Partial observability is now first-class:

- belief.rs — property-graph world model; nodes (host/service/vuln/exploit/cred)
  carry a probability, not a boolean. Bayesian observation updates; per-node
  Shannon entropy; mean-uncertainty + recon-frontier. Black-box = diffuse priors
  that sharpen with observation; white-box collapses toward deterministic (MDP).
- pomdp.rs — value_of_information(), decide() (recon vs exploit falls out of
  belief entropy), and may_assert() — the mathematical anti-hallucination gate:
  no exploitability claim while the belief is diffuse (high entropy) → observe first.
- grounding.rs — verification engine, hard rule "no claim without a tool receipt":
  empirical grounding for black-box (raw HTTP/OOB/error markers), symbolic for
  white-box (file:line into reviewed source). Ungrounded claims demoted + flagged
  receipt_missing (feeds future reward shaping).
- pipeline.finish(): grounding gate before reporting + belief-uncertainty readout.
- bump 3.5.0 → 3.5.1; README documents the v3.5.1 belief/grounding architecture
  and the infra/bandit/reward roadmap.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

2026-06-24 21:41:18 -03:00

File	Last commit	Date
agents_md	v3.4.1: harness intelligence — router, ReAct, dedup, token-trim, configurable MCP, +54 code agents, credits	2026-06-24 19:49:01 -03:00
neurosploit-rs	v3.5.1: POMDP belief-state + value-of-information planner + grounded anti-hallucination	2026-06-24 21:41:18 -03:00
.env.example	Merge NeuroSploit v3.3.0 — Autonomous MD-Agent Engine into main	2026-06-14 21:41:26 -03:00
.gitignore	v3.5.0: orchestration chaining + rich REPL (rustyline, model arrow-select, persistent history) + model-aware /key	2026-06-24 20:33:13 -03:00
README.md	v3.5.1: POMDP belief-state + value-of-information planner + grounded anti-hallucination	2026-06-24 21:41:18 -03:00
RELEASE.md	NeuroSploit v3.4.0 — Rust multi-model harness + Axum dashboard	2026-06-21 19:58:43 -03:00
setup.sh	v3.5.0: REPL quick-wins (Tab-complete, @file/@dir/@line, multiline, /theme, /attach, /context) + installer + README	2026-06-24 21:19:56 -03:00

README.md

🧠 NeuroSploit v3.5.1

Autonomous, multi-model penetration-testing harness — Rust, CLI-only.
by Joas A Santos & Red Team Leaders

⭐ If this is useful, star the repo — it helps a lot.

🆕 v3.5.1 — POMDP belief & grounded anti-hallucination

The target is only partially observable, so v3.5.1 stops treating findings as booleans and tracks a belief:

- Belief world model (belief.rs): a property graph whose nodes (host / service / vuln / exploit / credential) each carry a probability, not a boolean. Observations update them with a Bayesian step; per-node Shannon entropy measures how diffuse the belief still is.
- Value-of-information planner (pomdp.rs): "scan more vs exploit now" is not a heuristic: when a node's belief is diffuse, the expected value of an observation (recon) exceeds the risk-adjusted value of an exploit. The may_assert gate is the mathematical anti-hallucination rule: the agent may not claim exploitability while the belief is diffuse; it must observe first.
- Grounding / verification engine (grounding.rs): a hard rule that no claim enters the world model without a tool receipt (raw tool output, not the LLM's paraphrase). Black-box grounding is empirical (a real HTTP response / OOB callback / error oracle); white-box is symbolic (a file:line into the reviewed source). Ungrounded claims are demoted and flagged receipt_missing.
- Regimes: black-box runs a true POMDP (diffuse priors that sharpen with observation); white-box collapses toward a near-deterministic MDP (the world model is built from SAST/dataflow, so uncertainty migrates to path reachability, not state).
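
The belief update, entropy readout, and assertion gate described above can be sketched in a few lines of Rust. This is a minimal illustration under stated assumptions, not the contents of belief.rs or pomdp.rs: the `BeliefNode` type, the likelihood parameters, the entropy threshold, and the function signatures are all hypothetical, and the recon-vs-exploit decision is reduced here to a plain entropy threshold plus an expected-information-gain helper.

```rust
/// Hypothetical belief node: probability that one hypothesis
/// (e.g. "this service is vulnerable") is true.
#[derive(Debug, Clone, Copy)]
pub struct BeliefNode {
    pub p: f64,
}

impl BeliefNode {
    /// Bayesian update from one observation.
    /// `p_obs_if_true`  = P(observation | hypothesis true)
    /// `p_obs_if_false` = P(observation | hypothesis false)
    pub fn observe(&mut self, p_obs_if_true: f64, p_obs_if_false: f64) {
        let num = p_obs_if_true * self.p;
        let den = num + p_obs_if_false * (1.0 - self.p);
        if den > 0.0 {
            self.p = num / den;
        }
    }

    /// Binary Shannon entropy in bits: 0 when certain, 1 at p = 0.5.
    pub fn entropy(&self) -> f64 {
        let h = |x: f64| if x > 0.0 { -x * x.log2() } else { 0.0 };
        h(self.p) + h(1.0 - self.p)
    }
}

/// Expected entropy reduction (in bits) from making one binary observation.
pub fn expected_information_gain(node: &BeliefNode, p_obs_if_true: f64, p_obs_if_false: f64) -> f64 {
    // Probability that the observation comes back positive.
    let p_pos = p_obs_if_true * node.p + p_obs_if_false * (1.0 - node.p);
    let mut pos = *node;
    pos.observe(p_obs_if_true, p_obs_if_false);
    let mut neg = *node;
    neg.observe(1.0 - p_obs_if_true, 1.0 - p_obs_if_false);
    node.entropy() - (p_pos * pos.entropy() + (1.0 - p_pos) * neg.entropy())
}

#[derive(Debug, PartialEq)]
pub enum Action {
    Recon,
    Exploit,
}

/// Assumed gate: a claim is allowed only when the belief is sharp and favors the hypothesis.
pub fn may_assert(node: &BeliefNode, max_entropy_bits: f64) -> bool {
    node.p > 0.5 && node.entropy() <= max_entropy_bits
}

/// Assumed policy: observe while the belief is diffuse, act once it has sharpened.
pub fn decide(node: &BeliefNode, max_entropy_bits: f64) -> Action {
    if node.entropy() > max_entropy_bits {
        Action::Recon
    } else {
        Action::Exploit
    }
}
```

Starting from a diffuse prior of 0.5 (entropy 1 bit), one observation with likelihoods 0.9 / 0.1 moves the belief to 0.9 and drops the entropy below 0.5 bits, which is enough to flip this sketch's `decide` from `Recon` to `Exploit` at a 0.5-bit threshold.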

Roadmap (in progress on this branch): infra targets (IP + SSH/Windows/AD) with Linux/Windows/AD host agents, a contextual-bandit tool router, and value-of-information reward shaping.
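
The grounding rule from the v3.5.1 notes ("no claim without a tool receipt") can be pictured as a small check over a claim record. The sketch below is an assumption-laden illustration, not the code in grounding.rs: the `Receipt` and `Claim` types and the `ground` function are hypothetical names, and the only detail taken from the README is that ungrounded claims are demoted and flagged `receipt_missing`.

```rust
/// Hypothetical receipt types, mirroring the two grounding modes in the README.
#[derive(Debug, Clone, PartialEq)]
pub enum Receipt {
    /// Black-box: raw tool output (HTTP response, OOB callback, error marker).
    Empirical { tool: String, raw_output: String },
    /// White-box: a file:line location in the reviewed source.
    Symbolic { file: String, line: u32 },
}

/// Hypothetical claim record.
#[derive(Debug, Clone)]
pub struct Claim {
    pub title: String,
    pub receipt: Option<Receipt>,
    pub receipt_missing: bool,
    pub demoted: bool,
}

/// Demotes and flags any claim that has no usable receipt attached.
pub fn ground(mut claim: Claim) -> Claim {
    let grounded = match &claim.receipt {
        Some(Receipt::Empirical { raw_output, .. }) => !raw_output.trim().is_empty(),
        Some(Receipt::Symbolic { file, line }) => !file.is_empty() && *line > 0,
        None => false,
    };
    claim.receipt_missing = !grounded;
    claim.demoted = !grounded;
    claim
}
```

An empty `raw_output` is treated as no receipt at all, on the assumption that a paraphrase with nothing behind it should not count as evidence.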

This branch is the slim, Rust-only distribution: the neurosploit-rs/ workspace plus the agents_md/ agent library. It turns a URL (black-box) or a code repository (white-box) into an autonomous engagement driven by a pool of LLMs, reached via API key or local subscription (Claude Code / Codex / Gemini / Grok). The harness recons the target, selects only the agents that match the discovered surface, runs them in parallel, then validates every finding by cross-model voting before reporting.

The full project (Python engine, web GUIs, history) lives on the main branch.

📦 Install (one line)

curl -fsSL https://raw.githubusercontent.com/JoasASantos/NeuroSploit/main/setup.sh | bash

The installer auto-installs Rust if needed, clones the repo to ~/.neurosploit, builds the release binary, and links neurosploit into ~/.local/bin. Re-run it any time to update. Tweak with env vars: NEUROSPLOIT_REF (branch/tag), NEUROSPLOIT_DIR, PREFIX.

Prefer to build by hand?

git clone https://github.com/JoasASantos/NeuroSploit && cd NeuroSploit/neurosploit-rs
cargo build --release      # → target/release/neurosploit

⚡ Quick start (60 seconds)

# easiest path — just run it; the interactive session asks everything:
neurosploit

# or one-liner (subscription login, no API key needed):
neurosploit run http://testphp.vulnweb.com/ --subscription --model anthropic:claude-opus-4-8 -v

No login? Use an API key instead — see Authentication.

Build

cd neurosploit-rs
cargo build --release        # → target/release/neurosploit

Requires a Rust toolchain (rustup). Recommended: run on Kali Linux (or the Kali Docker image) so the offensive tools the agents use are already present:

docker run -it --rm kalilinux/kali-rolling
# then, inside the container:
apt update && apt install -y curl nmap ffuf nodejs npm
# rustscan (faster port scan): cargo install rustscan   (or grab a release from GitHub)

The agents degrade gracefully: if rustscan isn't installed they use nmap; if neither, they probe with curl. If a Playwright MCP browser is available they use it for JS-heavy pages, otherwise they fall back to curl.
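
The scanner fallback order described above (rustscan, then nmap, then curl) can be sketched as a pure selection function plus a PATH probe. This is an illustrative sketch only: `Scanner`, `choose_scanner`, and `on_path` are hypothetical names, and the README does not say how the agents actually detect installed tools.

```rust
use std::env;

#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Scanner {
    Rustscan,
    Nmap,
    Curl,
}

/// Pure selection logic; `has` reports whether a tool is available.
/// curl is the last resort and is returned even if `has("curl")` would be false.
pub fn choose_scanner(has: impl Fn(&str) -> bool) -> Scanner {
    if has("rustscan") {
        Scanner::Rustscan
    } else if has("nmap") {
        Scanner::Nmap
    } else {
        Scanner::Curl
    }
}

/// Looks for a file named `tool` in each PATH directory.
/// Simplification: does not check the execute bit or Windows extensions.
pub fn on_path(tool: &str) -> bool {
    env::var_os("PATH")
        .map(|paths| env::split_paths(&paths).any(|dir| dir.join(tool).is_file()))
        .unwrap_or(false)
}
```

Calling `choose_scanner(on_path)` applies the order to the local machine; keeping the availability check as a parameter makes the ordering testable without touching the filesystem.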

Usage

Run with no arguments for an interactive wizard:

./target/release/neurosploit

Or drive it directly:

# Black-box — subscription (no API key), Opus, browser via Playwright if present, verbose
./target/release/neurosploit run http://testphp.vulnweb.com/ \
    --subscription --model anthropic:claude-opus-4-8 --mcp -v

# Black-box — API keys, multi-model voting panel (1st finds, others adjudicate)
./target/release/neurosploit run http://testphp.vulnweb.com/ \
    --model anthropic:claude-opus-4-8 --model openai:gpt-5.1 --vote-n 3

# White-box — clone a vulnerable app and review its source
git clone https://github.com/digininja/DVWA /tmp/DVWA
./target/release/neurosploit whitebox /tmp/DVWA \
    --subscription --model anthropic:claude-opus-4-8 -v

# Offline pipeline self-test (no keys/login needed)
./target/release/neurosploit run http://testphp.vulnweb.com/ --offline

# Utilities
./target/release/neurosploit agents     # library counts
./target/release/neurosploit models      # providers & models
./target/release/neurosploit --help        # full help with examples

Options (`run` / `whitebox`)

Flag	Meaning
`--model provider:model`	Repeatable. First = primary; the rest fail over and form the voting jury.
`--subscription`	Use the local CLI login (Claude/Codex/Gemini/Grok) instead of an API key.
`--mcp`	Enable Playwright MCP (auto-provisioned via `npx`; backends without MCP use built-in tools).
`--vote-n N`	How many models must agree that a finding is real (default: 3 for `run`, 2 for `whitebox`).
`--max-agents N`	Cap agents run (`0` = all matching the recon).
`--offline`	Exercise the full pipeline without calling any model.
`-v, --verbose`	Log each agent launch, the recon step, and every vote.
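
To make the `--model provider:model` convention concrete, here is a minimal parsing sketch. `ModelSpec`, `parse_model_spec`, and `split_panel` are hypothetical names; the only behavior taken from the table above is the `provider:model` shape and the rule that the first model is primary while the rest act as fallback and jury.

```rust
#[derive(Debug, PartialEq)]
pub struct ModelSpec {
    pub provider: String,
    pub model: String,
}

/// Splits on the first ':' only, so model names may themselves contain colons.
pub fn parse_model_spec(s: &str) -> Result<ModelSpec, String> {
    match s.split_once(':') {
        Some((provider, model)) if !provider.is_empty() && !model.is_empty() => Ok(ModelSpec {
            provider: provider.to_string(),
            model: model.to_string(),
        }),
        _ => Err(format!("expected provider:model, got {s:?}")),
    }
}

/// First spec is the primary model; the remainder form the fallback / voting panel.
pub fn split_panel(specs: &[ModelSpec]) -> Option<(&ModelSpec, &[ModelSpec])> {
    specs.split_first()
}
```

Splitting on the first colon is a design assumption chosen so that a value such as `ollama:llama3:8b` would keep `llama3:8b` as the model name.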

Authentication — run via API key or subscription

You can run NeuroSploit two ways. They're independent: pick per run.

1) Via API (provider API key)

Export the key(s) for the providers in your model panel, then run without --subscription. Any OpenAI-compatible provider works.

# pick one or more, depending on the models you select
export ANTHROPIC_API_KEY=sk-ant-...        # anthropic:claude-*
export OPENAI_API_KEY=sk-...               # openai:gpt-*
export GEMINI_API_KEY=AIza...              # gemini:gemini-*
export XAI_API_KEY=xai-...                 # xai:grok-*
export NVIDIA_NIM_API_KEY=nvapi-...        # nvidia_nim:*
export DEEPSEEK_API_KEY=...                # deepseek:*
export MISTRAL_API_KEY=...                 # mistral:*
export DASHSCOPE_API_KEY=...               # qwen:*  (Alibaba DashScope)
export GROQ_API_KEY=...                    # groq:*
export TOGETHER_API_KEY=...                # together:*
export OPENROUTER_API_KEY=...              # openrouter:*
# ollama needs no key (local)

# then run via API (note: NO --subscription)
./target/release/neurosploit run http://testphp.vulnweb.com/ \
    --model anthropic:claude-opus-4-8 --vote-n 3 -v

# multi-provider voting panel via API (1st finds, the others adjudicate)
./target/release/neurosploit run http://testphp.vulnweb.com/ \
    --model anthropic:claude-opus-4-8 --model openai:gpt-5.1 --model gemini:gemini-2.5-pro

Or put the keys in a .env and source it (cp .env.example .env; edit; set -a; . ./.env; set +a).

Provider → env var → endpoint (all OpenAI-compatible):

`--model` prefix	Env var	Base URL
`anthropic:`	`ANTHROPIC_API_KEY`	api.anthropic.com
`openai:`	`OPENAI_API_KEY`	api.openai.com
`gemini:`	`GEMINI_API_KEY`	generativelanguage.googleapis.com
`xai:`	`XAI_API_KEY`	api.x.ai
`nvidia_nim:`	`NVIDIA_NIM_API_KEY`	integrate.api.nvidia.com
`deepseek:`	`DEEPSEEK_API_KEY`	api.deepseek.com
`mistral:`	`MISTRAL_API_KEY`	api.mistral.ai
`qwen:`	`DASHSCOPE_API_KEY`	dashscope-intl.aliyuncs.com
`groq:`	`GROQ_API_KEY`	api.groq.com
`together:`	`TOGETHER_API_KEY`	api.together.xyz
`openrouter:`	`OPENROUTER_API_KEY`	openrouter.ai
`ollama:`	(none)	localhost:11434

Run ./target/release/neurosploit models for the full provider/model list.
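
The provider table above maps naturally onto a lookup function. The sketch below is illustrative: the env var names come from the table, while `KeySource`, `key_source`, and `missing_keys` are hypothetical helpers that are not claimed to exist in the harness.

```rust
/// What a provider prefix needs in order to authenticate via API.
#[derive(Debug, PartialEq)]
pub enum KeySource {
    EnvVar(&'static str),
    NoKey,
}

/// Maps a `--model` prefix to its env var, per the table above.
pub fn key_source(provider: &str) -> Option<KeySource> {
    let var = match provider {
        "anthropic" => "ANTHROPIC_API_KEY",
        "openai" => "OPENAI_API_KEY",
        "gemini" => "GEMINI_API_KEY",
        "xai" => "XAI_API_KEY",
        "nvidia_nim" => "NVIDIA_NIM_API_KEY",
        "deepseek" => "DEEPSEEK_API_KEY",
        "mistral" => "MISTRAL_API_KEY",
        "qwen" => "DASHSCOPE_API_KEY",
        "groq" => "GROQ_API_KEY",
        "together" => "TOGETHER_API_KEY",
        "openrouter" => "OPENROUTER_API_KEY",
        "ollama" => return Some(KeySource::NoKey), // local, no key
        _ => return None,                          // unknown prefix
    };
    Some(KeySource::EnvVar(var))
}

/// Returns the env vars required by `providers` that `lookup` cannot resolve.
pub fn missing_keys(
    providers: &[&str],
    lookup: impl Fn(&str) -> Option<String>,
) -> Vec<&'static str> {
    providers
        .iter()
        .filter_map(|p| match key_source(p) {
            Some(KeySource::EnvVar(v)) if lookup(v).is_none() => Some(v),
            _ => None,
        })
        .collect()
}
```

In a real run the lookup would typically be `|k| std::env::var(k).ok()`; passing it in as a closure keeps the check independent of the process environment.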

2) Via subscription (no API key)

--subscription drives your local agentic-CLI login instead of an API key — install and log into one of the CLIs first:

`--model` prefix	CLI used	Login
`anthropic:`	`claude` (Claude Code)	`claude` then `/login`
`openai:`	`codex`	`codex` login
`gemini:`	`gemini`	`gemini` login
`xai:`	`grok`	`grok` login

./target/release/neurosploit run http://testphp.vulnweb.com/ \
    --subscription --model anthropic:claude-opus-4-8 --mcp -v

How it works

target ─▶ recon (curl/nmap/…) ─▶ INTELLIGENT agent selection (recon-aware)
       ─▶ parallel exploitation ─▶ cross-model validation vote
       ─▶ severity/score ─▶ report (HTML + Typst PDF) ─▶ RL reward update
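
The cross-model validation step in the pipeline above can be reduced to a threshold over per-model verdicts. This is a minimal sketch under assumptions: `Vote` and `is_validated` are hypothetical, and the threshold is read from the `--vote-n` description ("how many models must agree a finding is real").

```rust
/// Hypothetical record of one model's verdict on a finding.
#[derive(Debug, Clone)]
pub struct Vote {
    pub model: String,
    pub confirms: bool,
}

/// A finding is validated when at least `vote_n` models confirm it.
pub fn is_validated(votes: &[Vote], vote_n: usize) -> bool {
    votes.iter().filter(|v| v.confirms).count() >= vote_n
}
```

How the harness treats a panel with fewer than N models is not stated in the README; this sketch simply never validates in that case.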

Every run writes a self-contained folder runs/ns-<ts>-<target>/:

File	Contents
`status.json`	`running` → `complete` with a summary
`recon.json` / `recon.md`	mapped attack surface
`exploitation.md`	raw per-agent transcript
`findings.json` / `findings.md`	validated findings (reusable by other tools/AIs)
`report.html`, `report.typ`, `report.pdf`	final report (PDF via the Typst engine)

A reinforcement-learning reward store (data/rl_state_rs.json) biases agent selection on future runs.
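
One way a reward store could bias agent selection is to keep a running mean reward per agent and rank candidates by it. The sketch below is purely illustrative: the README does not describe the format of data/rl_state_rs.json or the update rule, so `RewardStore`, its fields, and the running-mean update are all assumptions.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory reward store: agent name -> (mean reward, sample count).
#[derive(Debug, Default)]
pub struct RewardStore {
    stats: HashMap<String, (f64, u32)>,
}

impl RewardStore {
    /// Incremental running mean: mean += (reward - mean) / n.
    pub fn update(&mut self, agent: &str, reward: f64) {
        let entry = self.stats.entry(agent.to_string()).or_insert((0.0, 0));
        entry.1 += 1;
        entry.0 += (reward - entry.0) / f64::from(entry.1);
    }

    /// Mean reward so far; agents never seen score 0.0.
    pub fn score(&self, agent: &str) -> f64 {
        self.stats.get(agent).map(|s| s.0).unwrap_or(0.0)
    }

    /// Orders candidates by descending score; ties keep their input order (stable sort).
    pub fn rank<'a>(&self, agents: &[&'a str]) -> Vec<&'a str> {
        let mut ranked = agents.to_vec();
        ranked.sort_by(|a, b| self.score(b).total_cmp(&self.score(a)));
        ranked
    }
}
```

A greedy ranking like this never explores unseen agents; the roadmap's contextual-bandit router suggests the project intends something more exploratory than this sketch.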

Agent library — `agents_md/` (303)

Category	Count	Purpose
`vulns/`	196	Exploit a specific vulnerability class
`recon/`	12	Information gathering / attack surface
`code/`	78	White-box source-code (SAST) review
`meta/`	17	Orchestrator, validator, scorers, reporter, RL

Each agent is a self-contained markdown playbook (## User Prompt methodology + ## System Prompt strict anti-false-positive rules). Drop a new .md into the matching folder and the harness picks it up.
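
The playbook layout described above (a `## User Prompt` section and a `## System Prompt` section in one markdown file) can be read with a small line-based parser. This is a sketch under assumptions: `Playbook` and `parse_playbook` are hypothetical names, and the harness's real loader may accept more structure than these two headings.

```rust
#[derive(Debug, PartialEq)]
pub struct Playbook {
    pub user_prompt: String,
    pub system_prompt: String,
}

#[derive(Clone, Copy)]
enum Section {
    Other,
    User,
    System,
}

/// Collects the text under `## User Prompt` and `## System Prompt`.
/// Returns None if either section is missing or empty.
pub fn parse_playbook(md: &str) -> Option<Playbook> {
    let mut user = String::new();
    let mut system = String::new();
    let mut section = Section::Other;

    for line in md.lines() {
        // Only level-2 headings switch sections; "### ..." stays as content.
        if let Some(heading) = line.trim().strip_prefix("## ") {
            section = match heading.trim() {
                "User Prompt" => Section::User,
                "System Prompt" => Section::System,
                _ => Section::Other,
            };
            continue;
        }
        match section {
            Section::User => {
                user.push_str(line);
                user.push('\n');
            }
            Section::System => {
                system.push_str(line);
                system.push('\n');
            }
            Section::Other => {}
        }
    }

    let (user, system) = (user.trim(), system.trim());
    if user.is_empty() || system.is_empty() {
        return None;
    }
    Some(Playbook {
        user_prompt: user.to_string(),
        system_prompt: system.to_string(),
    })
}
```

Rejecting a file that lacks either section is a design assumption here, made so that a half-written playbook fails loudly instead of running with an empty prompt.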

Safety

For authorized testing only. Agents are instructed to stay in scope, never run destructive/DoS actions, and require proof-of-exploitation. You are responsible for having permission for any target.

Credits

Joas A Santos & Red Team Leaders.

License

MIT.

Languages

Language	Share
Rust	70.6%
HTML	15.9%
Python	10.4%
Typst	1.2%
Shell	1.2%
Other	0.7%
