mirror of https://github.com/CyberSecurityUP/NeuroSploit.git synced 2026-06-29 23:05:30 +02:00

CyberSecurityUP d957429c09 feat(models): add Azure OpenAI provider + GOOGLE_API_KEY alias for Gemini

Resolves the only two open issues that still apply to the Rust build:
- #21 Azure OpenAI: new `azure` provider (OpenAI-compatible). Endpoint comes
  from AZURE_OPENAI_ENDPOINT, api-version from AZURE_OPENAI_API_VERSION
  (default 2024-10-21); the model name is the Azure deployment; auth uses the
  `api-key` header instead of Bearer. Use `--model azure:<deployment>`.
- #25 Gemini key confusion: GEMINI_API_KEY now also accepts GOOGLE_API_KEY
  (Google's standard env var) as an alias; local providers (ollama/litellm)
  require no key. .env.example documents both.

Kept under the v3.5.2 line (additive provider support).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

2026-06-26 14:17:25 -03:00


NeuroSploit v3.5.2 — Release Notes

Release Date: June 2026
Codename: Exploitation Depth & Report Hygiene
License: MIT
Credits: Joas A Santos & Red Team Leaders

TL;DR

v3.5.2 hard-codes the discipline that separates a great pentest from a noisy one — distilled from reviewing real AI-pentest output that kept stopping at "exposed" instead of "exploited". The engine now pushes every exposure to demonstrated impact, chains findings, decodes/fingerprints artifacts and correlates CVEs, audits tokens, and keeps the final report honest (deduplicated and severity-calibrated).

Highlights

DEPTH doctrine (exploit, don't just expose). A new doctrine is injected into every exploitation prompt (black/grey/chain): any info-disclosure, exposed service/catalog/WSDL, leaked credential/token, or reachable dev host must be USED before it can be a finding — call it, decode it, log in, hit the dev host. If it was only observed, it's reported as a lead, not a confirmed High/Critical.
Finding chaining. Reuse any session/JWT/cookie/credential obtained in one step across all other modules; pivot access into IDOR/privesc/exfil and report the chain, not isolated parts (e.g. captcha-bypass→admin JWT→authenticated surface; enum + no-rate-limit→password spraying).
Decode & fingerprint → CVE. Decode opaque tokens/paths (base64/JSON/marshal) and pin exact library/gem/plugin/CMS versions, then correlate to known CVEs and attempt a safe PoC.
Token auditor. JWT alg-confusion (RS→HS), alg:none, kid/jku injection, real signature verification, weak HS256 secret cracking, and token lifecycle (logout/expiry/refresh).
Report-hygiene & depth pass (deterministic, in the harness). After validation the run now:
- calibrates severity to proven impact — an unproven High/Critical (hedged language, no payload, thin evidence) is capped to Medium and re-titled "(potential)";
- flags "exposed → exploited" gaps — exposures on a host with no actual exploit get an advisory to go use them;
- advises consolidating hygiene classes (headers/cookies/TLS/HSTS/clickjacking/disclosure) repeated across many assets into ONE finding with an affected-asset table, instead of inflating the count one-per-host.
5 new doctrine meta-agents (agents_md/meta/): exploit_depth_doctrine, finding_chainer, artifact_decoder, token_auditor, report_calibrator (meta agents 17 → 22; total library 343 → 348).
Source from a GitHub URL. whitebox / greybox --repo (and the REPL /repo) now accept a git URL (https://github.com/owner/repo[.git]) or an owner/repo shorthand — the repo is cloned (shallow) into <base>/repos/ and reviewed automatically, no manual git clone needed:
```
neurosploit whitebox https://github.com/digininja/DVWA \
  --subscription --model anthropic:claude-opus-4-8 -v
```
Azure OpenAI provider (resolves #21). OpenAI-compatible: set AZURE_OPENAI_ENDPOINT (+ optional AZURE_OPENAI_API_VERSION, default 2024-10-21) and AZURE_OPENAI_API_KEY, then --model azure:<deployment> (the model name is your Azure deployment name; auth via the api-key header).
GOOGLE_API_KEY alias for Gemini (resolves #25 confusion). Gemini's API path reads GEMINI_API_KEY, and now also accepts GOOGLE_API_KEY (Google's standard env var) when the former is unset. Local providers (ollama/litellm) still need no key at all.
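
As a rough illustration of two of the token-auditor checks named in the highlights (alg:none and weak HS256 secrets), the sketch below uses only the Python standard library. It is not the harness's Rust implementation; `audit_jwt`, the helper names, and the idea of passing a small wordlist are assumptions made for the example.

```python
import base64
import hashlib
import hmac
import json


def b64url_decode(segment: str) -> bytes:
    # JWT segments are base64url without padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def audit_jwt(token: str, wordlist: list[str]) -> dict:
    """Report the declared alg and, for HS256, any wordlist secret that verifies."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    header = json.loads(b64url_decode(header_b64))
    alg = str(header.get("alg", "")).lower()
    result = {"alg": alg, "alg_none": alg == "none", "weak_secret": None}
    if alg == "hs256":
        signing_input = f"{header_b64}.{payload_b64}".encode()
        for candidate in wordlist:
            expected = hmac.new(candidate.encode(), signing_input, hashlib.sha256).digest()
            # Real signature verification: recompute the HMAC and compare.
            if hmac.compare_digest(b64url_encode(expected), signature_b64):
                result["weak_secret"] = candidate
                break
    return result
```

A token whose signature verifies against a dictionary word is evidence that can be used (forge a token, call the authenticated surface), which is what the DEPTH doctrine asks for before a finding is rated High.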

Notes

Pure-additive and back-compatible: existing modes, REPL, TUI, pause/continue, crash-recovery and reports are unchanged. The hygiene pass only annotates and down-calibrates unproven severities — it never invents or drops findings.
New unit tests cover the calibration and depth-audit logic (harness::hygiene).
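
The severity-calibration rule can be pictured with a minimal Python sketch. The real logic lives in the Rust harness (`harness::hygiene`) and may differ; the `Finding` shape, the hedge-word list, and the 40-character "thin evidence" threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, replace

HEDGE_WORDS = ("might", "could", "possibly", "potentially")  # assumed hedging markers


@dataclass(frozen=True)
class Finding:
    title: str
    severity: str  # "Low" | "Medium" | "High" | "Critical"
    evidence: str = ""
    payload: str = ""


def calibrate(finding: Finding) -> Finding:
    """Cap an unproven High/Critical to Medium and re-title it '(potential)'."""
    if finding.severity not in ("High", "Critical"):
        return finding
    hedged = any(word in finding.evidence.lower() for word in HEDGE_WORDS)
    # "Unproven" here means hedged language, no payload, or thin evidence.
    unproven = hedged or not finding.payload or len(finding.evidence) < 40
    if not unproven:
        return finding
    return replace(finding, severity="Medium", title=f"{finding.title} (potential)")
```

Note that the function only returns the input unchanged or a down-calibrated copy, matching the statement that the pass never invents or drops findings.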

NeuroSploit v3.5.1 — Release Notes

Release Date: June 2026
Codename: Interactive POMDP Harness
License: MIT
Credits: Joas A Santos & Red Team Leaders

TL;DR

The 3.5.x line turns the Rust harness into a full interactive REPL (Claude Code / Codex / Cursor-CLI style) on top of the multi-model engine: pick models with arrow-keys, configure API keys per provider, set target/repo/auth/creds and free-text instructions that steer the agents, then /run engagements in the background while you keep typing. v3.5.1 adds a POMDP belief spine with anti-hallucination grounding ("no claim without a tool receipt"), infra/host testing (IP + SSH + Windows/AD) with Linux/Windows/AD agents, attack-chain agents, a Mission-Control TUI, structured Typst reports, and resilient run control (live checkpointing, pause-on-quota, instant stop).

Highlights

Interactive REPL (neurosploit with no subcommand): real line editing (history ↑/↓, Ctrl-A/E/K, multiline), Tab-completion of /commands and @filesystem-paths (Claude-Code-style file menu), arrow-key model multi-select, per-provider API-key config, and a live context bar (model · cwd · mode▸target).
Engagement modes: black-box (run), white-box SAST (whitebox, set /repo), grey-box (greybox, /repo + /target), host/infra (/target <ip> + /creds for SSH / Windows / AD), plus the TUI dashboard.
POMDP belief state (belief.rs, pomdp.rs): a property-graph with probabilities + Bayesian update + Shannon-entropy uncertainty, a value-of-information planner, and a grounding gate (grounding.rs, may_assert) — findings must carry an empirical/symbolic tool receipt.
Infra / credentials (creds.rs): multi-block YAML (jwt/header/cookie, HTTP login, SSH, Windows/AD); real automated login; Linux/Windows/AD agents.
Attack-chain agents: sqli→rce→lpe, ssrf→aws, upload→lfi→rce, and more — injected as chain recipes during exploitation.
App-stack & CVE hunting: IIS/.NET (tilde shortname, WebDAV, ViewState), CMS (WordPress/Joomla/Drupal), app-server consoles, known-CVE exploitation.
13 providers incl. LiteLLM proxy and Gemini/xAI alongside the existing OpenAI-compatible set; subscription mode drives local agentic CLIs (claude/codex/gemini/grok) via stream-json.
Mission-Control TUI (ratatui): concurrent activity/findings/targets panels with a non-blocking composer active during the run.
Structured Typst report: executive summary, vulnerability-summary table, and per-finding sections (criticality, CVSS, OWASP/CWE, PoC, evidence, remediation) + an attack-graph / kill-chain mapping (OWASP/CWE/MITRE).
Per-project persistence (.neurosploit/, no database): session.json, runs.json, history.txt — resumes automatically on reopen.
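
The belief-state highlight mentions a Bayesian update and Shannon-entropy uncertainty. A minimal sketch of those two operations for a single yes/no hypothesis (for example "this endpoint is injectable") could look like the following. It is written in Python for brevity and is not taken from belief.rs or pomdp.rs; the function names and the likelihood parameters are assumptions.

```python
import math


def bayes_update(prior: float, p_obs_if_true: float, p_obs_if_false: float) -> float:
    """Posterior P(hypothesis | observation) via Bayes' rule."""
    numerator = p_obs_if_true * prior
    evidence = numerator + p_obs_if_false * (1.0 - prior)
    return numerator / evidence if evidence > 0 else prior


def binary_entropy(p: float) -> float:
    """Shannon entropy in bits of a yes/no belief; 1.0 is maximal uncertainty."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
```

A value-of-information planner would then prefer the tool call expected to reduce this entropy the most, though how the harness scores that is not described here.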

Run control (new in 3.5.1)

Background /run with a live progress bar, severity-colored findings, and the full file:// report URL on completion/stop.
3-way /stop: [1] validate findings so far → report · [2] raw report now without validating · [3] discard. Raw/discard abort in-flight agents immediately (running CLI children are killed via kill_on_drop); validate soft-stops so the validator still runs.
Crash/quit recovery: every finding is checkpointed live to .neurosploit/active_run.json; an interrupted run is recovered into /runs on the next launch, so /results, /finding and /report keep working.
Pause-on-exhaustion: when all models are rate-limited / out of quota the run parks (state kept) and prints ⏸ token/quota exhausted … PAUSED. Resume with /continue when your quota renews, or switch with /model <provider:model> (or the /model selector) then /continue.
Inspection: /results (live findings), /finding (pick one → full command + PoC + evidence), /expand / Ctrl-O (full untruncated commands), /status, /diff, /retest.

Usage

cd neurosploit-rs && cargo build --release
./target/release/neurosploit                              # interactive REPL
./target/release/neurosploit run http://target -v --model anthropic:claude-opus-4-8
./target/release/neurosploit whitebox --repo /path/to/code   # white-box SAST
./target/release/neurosploit greybox  --repo /path --target http://target  # grey-box
./target/release/neurosploit run <ip> --creds creds.yaml     # host / infra
./target/release/neurosploit tui http://target --subscription --mcp

Cross-platform install (Linux / macOS / Windows, x64 + arm64) via setup.sh and install.ps1. See README.md and TUTORIAL.md for the full walkthrough.

NeuroSploit v3.4.0 — Release Notes

Release Date: June 2026
Codename: Rust Multi-Model Harness
License: MIT

TL;DR

A new Rust harness (neurosploit-rs/) re-implements the autonomous runtime as a single, fast binary built on tokio + axum. It drives a pool of LLM models with concurrency limits, provider failover, and N-model validator voting — multiple models must independently agree a finding is real before it is reported — then serves its own solid web dashboard. It reuses the existing agents_md/ library (213 agents) unchanged.

Highlights

neurosploit-rs/ cargo workspace: harness lib crate + neurosploit binary. cargo build --release → one static-ish binary.
Multi-model pool (pool.rs): bounded concurrency + automatic failover across providers; the same panel is reused as the validator voting jury.
Pipeline (pipeline.rs): recon → parallel agent exploitation (semaphore bounded) → N-model adversarial vote → score → report. Streams live progress over a channel.
11 providers / 31 models (models.rs), all OpenAI-compatible: Anthropic, OpenAI, xAI, NVIDIA NIM, DeepSeek, Mistral, Qwen, Groq, Together, OpenRouter, Ollama. Models like Qwen / DeepSeek / Llama usable directly.
Axum web dashboard (app/): multi-model selection panel, live execution console, findings, agent browser, embedded HTML report. Single binary serves the SPA — no npm/build.
CLI: neurosploit serve | run <url> | agents | models, plus --offline mode to exercise the full pipeline without any API keys.
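
To make the N-model validator vote concrete, here is a small sketch of the idea using Python's asyncio rather than the harness's tokio code. The `Validator` callable type, the `needed` threshold, and treating a failed model as a "no" vote are assumptions for the example, not documented behavior.

```python
import asyncio
from typing import Awaitable, Callable

Validator = Callable[[str], Awaitable[bool]]  # hypothetical: one model's verdict on a finding


async def vote(finding: str, validators: list[Validator], needed: int, max_parallel: int = 2) -> bool:
    """Return True when at least `needed` validators independently accept the finding."""
    gate = asyncio.Semaphore(max_parallel)  # bounded concurrency across the model pool

    async def ask(validator: Validator) -> bool:
        async with gate:
            try:
                return await validator(finding)
            except Exception:
                return False  # a failed model counts as a "no" vote in this sketch

    verdicts = await asyncio.gather(*(ask(v) for v in validators))
    return sum(verdicts) >= needed
```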

Usage

cd neurosploit-rs && cargo build --release
./target/release/neurosploit serve                 # → http://127.0.0.1:8788
./target/release/neurosploit run https://t.example \
    --model anthropic:claude-opus-4-8 --model openai:gpt-5.1 --vote-n 3

NeuroSploit v3.3.0 — Release Notes

Release Date: June 2026
Codename: Autonomous MD-Agent Engine
License: MIT

TL;DR

NeuroSploit's pentest agent has been re-modeled into an autonomous, markdown-driven engine. You give it a URL; it composes a master prompt from a curated library of 213 markdown agents and drives a locally-installed agentic CLI backend (Claude Code / Codex / Grok CLI, or a Claude subscription) to run the engagement end-to-end — with Playwright MCP for proof-of-execution and a reinforcement-learning loop that adapts agent selection across runs. The old Python orchestration was retired to legacy/.

Highlights

New engine neurosploit_agent/ + ./neurosploit terminal launcher. Interactive (./neurosploit) or one-shot (./neurosploit run <url>).
213-agent markdown library (agents_md/): 196 vulnerability specialists (now covering LLM/AI, cloud/K8s, modern API/auth, advanced injection, protocol smuggling, logic/crypto/supply-chain) + 17 meta-agents.
Meta-agents for quality: recon, exploit_validator, false_positive_filter, severity_assessor, impact_evaluator, reporter, and rl_feedback — the pipeline validates and adversarially refutes every candidate before it can become a finding.
Pluggable agentic CLI backends with auto-detection: Claude Code, Codex, Grok CLI; subscription mode via Claude Code login.
Playwright MCP wired in (.mcp.json) so agents prove client-side execution (XSS/CSTI) and capture DOM/network/screenshots instead of trusting reflection.
Reinforcement learning (neurosploit_agent/rl.py + meta/rl_feedback.md): bounded per-agent weights with per-tech-stack affinity, persisted to data/rl_state.json.
Latest model registry (neurosploit_agent/models.py): Anthropic Claude 4.x, OpenAI, xAI Grok, Gemini, OpenRouter, Ollama, and NVIDIA NIM (PR #28, OpenAI-compatible integrate.api.nvidia.com, nvapi- keys).
Data-driven agent builder scripts/build_agents.py for extending the library without boilerplate.

Breaking changes

The monolithic neurosploit.py orchestrator and Python agent classes moved to legacy/ and are no longer the supported entrypoint. Use ./neurosploit.
Primary agent library moved from prompts/agents/ to agents_md/ (originals preserved; meta/role prompts split into agents_md/meta/).

Upgrade notes

Install at least one agentic CLI: Claude Code, Codex, or Grok CLI.
npx (Node) is required for Playwright MCP.
Copy .env.example → .env; set a provider key (or use Claude subscription).
./neurosploit backends to confirm detection, then ./neurosploit.

NeuroSploit v3.0.0 — Release Notes

Release Date: February 2026
Codename: Autonomous Pentester
License: MIT

Overview

NeuroSploit v3 is a ground-up overhaul of the AI-powered penetration testing platform. This release transforms the tool from a scanner into an autonomous pentesting agent — capable of reasoning, adapting strategy in real-time, chaining exploits, validating findings with anti-hallucination safeguards, and executing tools inside isolated Kali Linux containers.

By the Numbers

Metric	Count
Vulnerability types supported	100
Payload libraries	107
Total payloads	477+
Kali sandbox tools	55
Backend core modules	63 Python files
Backend core code	37,546 lines
Autonomous agent	7,592 lines
AI decision prompts	100 (per-vuln-type)
Anti-hallucination prompts	12 composable templates
Proof-of-execution rules	100 (per-vuln-type)
Known CVE signatures	400
EOL version checks	19
WAF signatures	16
WAF bypass techniques	12
Exploit chain rules	10+
Frontend pages	14
API endpoints	111+
LLM providers supported	6

Architecture

                      +---------------------+
                      |   React/TypeScript   |
                      |     Frontend (14p)   |
                      +----------+----------+
                                 |
                           WebSocket + REST
                                 |
                      +----------v----------+
                      |   FastAPI Backend    |
                      |   14 API routers     |
                      +----------+----------+
                                 |
              +---------+--------+--------+---------+
              |         |        |        |         |
         +----v---+ +---v----+ +v------+ +v------+ +v--------+
         | LLM    | | Vuln   | | Agent | | Kali  | | Report  |
         | Manager| | Engine | | Core  | |Sandbox| | Engine  |
         | 6 provs| | 100typ | |7592 ln| | 55 tl | | 2 fmts  |
         +--------+ +--------+ +-------+ +-------+ +---------+

Stack: Python 3.10+ / FastAPI / SQLAlchemy (async) / React 18 / TypeScript / Tailwind CSS / Vite / Docker

Core Engine: 100 Vulnerability Types

The vulnerability engine covers 100 distinct vulnerability types organized in 11 categories, each with dedicated testers, payloads, AI prompts, and proof-of-execution rules.

Categories & Types

Category	Types	Examples
Injection	12	SQLi (error, union, blind, time-based), Command Injection, SSTI, NoSQL, LDAP, XPath, Expression Language, HTTP Parameter Pollution
XSS	3	Reflected, Stored (two-phase form+display), DOM-based
Authentication	7	Auth Bypass, JWT Manipulation, Session Fixation, Weak Password, Default Credentials, 2FA Bypass, OAuth Misconfig
Authorization	5	IDOR, BOLA, BFLA, Privilege Escalation, Mass Assignment, Forced Browsing
Client-Side	9	CORS, Clickjacking, Open Redirect, DOM Clobbering, PostMessage, WebSocket Hijack, Prototype Pollution, CSS Injection, Tabnabbing
File Access	5	LFI, RFI, Path Traversal, XXE, File Upload
Request Forgery	3	SSRF, SSRF Cloud (AWS/GCP/Azure metadata), CSRF
Infrastructure	7	Security Headers, SSL/TLS, HTTP Methods, Directory Listing, Debug Mode, Exposed Admin, Exposed API Docs, Insecure Cookies
Advanced	9	Race Condition, Business Logic, Rate Limit Bypass, Type Juggling, Timing Attack, Host Header Injection, HTTP Smuggling, Cache Poisoning, CRLF
Data Exposure	6	Sensitive Data, Information Disclosure, API Key Exposure, Source Code Disclosure, Backup Files, Version Disclosure
Cloud & Supply Chain	6	S3 Misconfig, Cloud Metadata, Subdomain Takeover, Vulnerable Dependency, Container Escape, Serverless Misconfig

Injection Routing

Every vulnerability type is routed to the correct injection point:

Parameter injection (default): SQLi, XSS, IDOR, SSRF, etc.
Header injection: CRLF, Host Header, HTTP Smuggling
Body injection: XXE
Path injection: Path Traversal, LFI
Both (param + path): LFI, directory traversal variants
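
The routing rules above amount to a lookup with a default. A minimal sketch, assuming illustrative type keys (the real registry keys are not shown in these notes) and treating LFI as the "both" case:

```python
from enum import Enum


class InjectionPoint(Enum):
    PARAMETER = "parameter"
    HEADER = "header"
    BODY = "body"
    PATH = "path"


# Overrides taken from the list above; the dictionary keys are illustrative names.
ROUTING = {
    "crlf": [InjectionPoint.HEADER],
    "host_header": [InjectionPoint.HEADER],
    "http_smuggling": [InjectionPoint.HEADER],
    "xxe": [InjectionPoint.BODY],
    "path_traversal": [InjectionPoint.PATH],
    "lfi": [InjectionPoint.PARAMETER, InjectionPoint.PATH],
}


def injection_points(vuln_type: str) -> list[InjectionPoint]:
    """Unlisted types fall back to parameter injection, the stated default."""
    return ROUTING.get(vuln_type, [InjectionPoint.PARAMETER])
```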

XSS Pipeline (Reflected)

The reflected XSS engine is a multi-stage pipeline:

Canary probe — unique marker per endpoint+param to detect reflection
Context analysis — 8 contexts: html_body, attribute_value, script_string, script_block, html_comment, url_context, style_context, event_handler
Filter detection — batch probe to map allowed/blocked chars, tags, events
AI payload generation — LLM generates context-aware bypass payloads
Escalation payloads — WAF/encoding bypass variants
Testing — up to 30 payloads per param with per-payload dedup
Browser validation — Playwright popup/cookie/DOM/event verification (optional)
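
The first two stages (canary probe and context analysis) can be sketched as follows. This is a simplified illustration rather than the engine's code: the `nsx` prefix and both function names are hypothetical, and the classifier covers only three of the eight contexts and does not separate script_string from script_block.

```python
import hashlib
import re
from typing import Optional


def make_canary(endpoint: str, param: str) -> str:
    """Marker that is distinct per endpoint+param pair and unlikely to occur naturally."""
    digest = hashlib.sha256(f"{endpoint}|{param}".encode()).hexdigest()[:10]
    return f"nsx{digest}"


def reflection_context(body: str, canary: str) -> Optional[str]:
    """Rough context guess for a reflected canary; None if it is not reflected."""
    if canary not in body:
        return None
    if re.search(r"<script[^>]*>[^<]*" + re.escape(canary), body, re.IGNORECASE):
        return "script_block"
    if re.search(r"=\s*[\"'][^\"'>]*" + re.escape(canary), body):
        return "attribute_value"
    return "html_body"
```

The detected context is what lets the later stages pick payloads that can actually break out of where the input lands.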

POST Form Support

HTML forms detected during recon with method, action, all input fields (including <select>, <textarea>, hidden fields)
POST form testing includes all form fields (CSRF tokens, hidden inputs) — not just the parameter under test
Redirect following for POST responses (search forms that redirect to results)
Full HTTP method support: GET, POST, PUT, DELETE, PATCH, OPTIONS, HEAD

Autonomous Agent Architecture

3-Stream Parallel Auto-Pentest

The agent runs 3 concurrent streams via asyncio.gather():

Stream 1: Recon          Stream 2: Junior Tester      Stream 3: Tool Runner
  - Crawl target           - Immediate target test       - Nuclei + Naabu
  - Extract forms           - Consume endpoint queue      - AI-selected tools
  - JS analysis             - 3 payloads/endpoint         - Dynamic install
  - Deep fingerprint        - AI-prioritized types        - Process findings
  - Push to queue           - Skip tested types           - Feed back to recon
        |                         |                             |
        +----------+--------------+-----------------------------+
                   |
            Deep Analysis (50-75%)
            Researcher AI (75%)    ← NEW
            Finalization (75-100%)
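
The three concurrent streams can be sketched with `asyncio.gather()` and a shared queue. The coroutine names and bodies below are stand-ins invented for the example; only the overall shape (recon feeding a tester through a queue while a tool runner works in parallel) comes from the diagram.

```python
import asyncio


async def recon(queue: asyncio.Queue, endpoints: list[str]) -> int:
    """Stream 1: push discovered endpoints, then a sentinel to signal completion."""
    for endpoint in endpoints:
        await queue.put(endpoint)
    await queue.put(None)
    return len(endpoints)


async def junior_tester(queue: asyncio.Queue) -> list[str]:
    """Stream 2: consume endpoints as soon as recon produces them."""
    tested = []
    while (endpoint := await queue.get()) is not None:
        tested.append(endpoint)  # placeholder for sending a few payloads
    return tested


async def tool_runner() -> str:
    """Stream 3: stand-in for launching external scanners."""
    await asyncio.sleep(0)
    return "tools-done"


async def auto_pentest(endpoints: list[str]) -> list:
    queue: asyncio.Queue = asyncio.Queue()
    # gather() runs all three streams concurrently and returns results in call order.
    return await asyncio.gather(recon(queue, endpoints), junior_tester(queue), tool_runner())
```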

Reasoning Engine (ReACT)

AI reasoning at strategic checkpoints (50%, 75%):

Think: analyze situation, available data, findings so far
Plan: recommend next actions, prioritize vuln types
Reflect: evaluate results, adjust strategy

Token budget tracking with graceful degradation:

0-60% budget: full AI (reasoning + verification + enhancement)
60-80%: reduced (skip enhancement)
80-95%: minimal (verification only)
95%+: technical only (no AI calls)
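
The degradation tiers map directly to a small function. In this sketch the tier names and the handling of exact boundaries (for example, 60% falling into "reduced") are assumptions; the percentages are the ones listed above.

```python
from typing import Optional


def ai_mode(tokens_used: int, token_budget: Optional[int]) -> str:
    """Map budget consumption to a degradation tier."""
    if not token_budget:  # an unset budget is treated as unlimited
        return "full"
    used = tokens_used / token_budget
    if used < 0.60:
        return "full"  # reasoning + verification + enhancement
    if used < 0.80:
        return "reduced"  # skip enhancement
    if used < 0.95:
        return "minimal"  # verification only
    return "technical_only"  # no AI calls
```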

Strategy Adaptation

Dead endpoint detection: skip after 5+ consecutive errors
Diminishing returns: reduce testing on low-yield endpoints
Priority recomputation: re-rank vuln types based on results
Pattern propagation: IDOR on /users/1 automatically queues /orders/1, /accounts/1
Checkpoint refinement: at 30%/60%/90% refine attack strategy
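
As an example of one of these rules, dead endpoint detection needs only a per-endpoint counter of consecutive errors. The class below is a hypothetical sketch; the default limit of 5 follows the text, and resetting the streak on any success is an assumption about what "consecutive" implies.

```python
from collections import defaultdict


class DeadEndpointTracker:
    """Skip an endpoint once it has produced `limit` consecutive errors."""

    def __init__(self, limit: int = 5):
        self.limit = limit
        self._streak: dict[str, int] = defaultdict(int)

    def record(self, endpoint: str, ok: bool) -> None:
        # Any success resets the streak; only consecutive failures count.
        self._streak[endpoint] = 0 if ok else self._streak[endpoint] + 1

    def is_dead(self, endpoint: str) -> bool:
        return self._streak[endpoint] >= self.limit
```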

Exploit Chaining

10+ chain rules for multi-step attack paths:

SSRF -> Internal service access -> Data extraction
SQLi -> Database-specific escalation (MySQL, PostgreSQL, MSSQL)
XSS -> Session hijacking -> Account takeover
LFI -> Source code disclosure -> Credential extraction
Auth bypass -> Privilege escalation -> Admin access

AI-driven chain discovery during finalization phase.

Validation & Anti-Hallucination Pipeline

4-Layer Verification

Every finding passes through 4 independent verification layers before confirmation:

Finding Signal
    |
    v
[1] Negative Controls  — Send benign/empty probes. Same response = false positive (-60 penalty)
    |
    v
[2] Proof of Execution — Per-vuln-type proof checks (25+ methods). XSS: context analyzer.
    |                      SSRF: metadata markers. SQLi: DB error patterns. Score 0-60.
    v
[3] AI Interpretation  — LLM analyzes with anti-hallucination system prompt + per-type
    |                      proof requirements. Speculative language rejected.
    v
[4] Confidence Scorer  — Numeric 0-100 score. >=90 confirmed, >=60 likely, <60 rejected.
    |
    v
ValidationJudge (sole authority for finding approval)
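
How the layers combine into one number is not spelled out above, so the following is only a sketch under stated assumptions: the proof layer contributes 0-60 as described, the AI layer is assumed to contribute up to 40, and a matching negative control subtracts 60. The thresholds (>=90 confirmed, >=60 likely, otherwise rejected) are the documented ones.

```python
def confidence(proof_score: int, ai_score: int, control_matched: bool) -> tuple[int, str]:
    """Combine layer outputs into a 0-100 score and a verdict."""
    proof = max(0, min(60, proof_score))  # proof of execution: 0-60
    ai = max(0, min(40, ai_score))  # assumed weight for the AI interpretation layer
    penalty = 60 if control_matched else 0  # benign probe gave the same response
    score = max(0, min(100, proof + ai - penalty))
    if score >= 90:
        verdict = "confirmed"
    elif score >= 60:
        verdict = "likely"
    else:
        verdict = "rejected"
    return score, verdict
```

Under these weights a finding with a perfect proof score and a perfect AI score is still rejected when the negative control matches, which reflects the intent of layer 1.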

Anti-Hallucination System Prompts

12 composable anti-hallucination prompt templates injected into all 17 LLM call sites:

Prompt	Purpose
`anti_hallucination`	Core: never claim vuln without concrete proof
`anti_scanner`	Don't behave like a scanner — reason like a pentester
`negative_controls`	Explain control test methodology
`think_like_pentester`	Manual testing mindset
`proof_of_execution`	What constitutes real proof per vuln type
`frontend_backend_correlation`	Don't confuse client-side vs server-side
`multi_phase_tests`	Two-phase testing (submit + verify)
`final_judgment`	Conservative final decision framework
`confidence_score`	Numeric scoring calibration
`anti_severity_inflation`	Don't inflate severity
`operational_humility`	Acknowledge uncertainty
`access_control_intelligence`	Data comparison, not status code diff

100 per-vuln-type proof requirements (e.g., SSRF requires metadata content, not just status diff).

Cross-Validation

_cross_validate_ai_claim() — independent check for XSS, SQLi, SSRF, IDOR, open redirect, CRLF, XXE, NoSQL
_evidence_in_response() — verify AI claim matches actual HTTP response
Speculative language rejection ("might be", "could be", "possibly")
Default False — findings rejected unless positively proven
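
A minimal sketch of the "default False" stance, combining the speculative-language check with an evidence-in-response check. The function name `accept_claim` is hypothetical and is not the signature of `_cross_validate_ai_claim()` or `_evidence_in_response()`; the phrase list is the one quoted above.

```python
import re

SPECULATIVE = re.compile(r"\b(might be|could be|possibly)\b", re.IGNORECASE)


def accept_claim(ai_claim: str, evidence: str, response_body: str) -> bool:
    """Accept only a non-speculative claim whose quoted evidence is in the real response."""
    if SPECULATIVE.search(ai_claim):
        return False
    # Empty evidence can never confirm anything, so the default stays False.
    return bool(evidence) and evidence in response_body
```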

Access Control Intelligence

BOLA/BFLA/IDOR use data comparison methodology (not status code diff)
JSON field comparison between authenticated user responses
Adaptive TP/FP learning across scans (9 patterns, 6 known FP patterns)
Access control types auto-inject specialized prompts
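
The data-comparison idea can be illustrated with two parsed JSON responses for the same object: one fetched by its owner and one fetched by a different user. The function names and the notion of a caller-supplied set of sensitive field names are assumptions for this sketch.

```python
def leaked_fields(owner_view: dict, other_view: dict, sensitive: set[str]) -> set[str]:
    """Sensitive fields that another user receives with the owner's exact values."""
    return {
        key
        for key in sensitive
        if key in owner_view and key in other_view and other_view[key] == owner_view[key]
    }


def looks_like_bola(status_code: int, owner_view: dict, other_view: dict, sensitive: set[str]) -> bool:
    # A 200 alone proves nothing; the decision rests on the data that came back.
    return status_code == 200 and bool(leaked_fields(owner_view, other_view, sensitive))
```

This is why a 200 response carrying an error body or an empty object is not reported, while a 200 that returns the owner's email to another user is.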

Kali Sandbox & Tool Execution

Container-Per-Scan Architecture

Each scan gets its own isolated Kali Linux Docker container:

ContainerPool (global coordinator)
    |
    +-- Scan A: KaliSandbox (neurosploit-kali-abc123)
    |       +-- nuclei, naabu, httpx (pre-installed)
    |       +-- wpscan (installed on-demand)
    |       +-- sqlmap (installed on-demand)
    |
    +-- Scan B: KaliSandbox (neurosploit-kali-def456)
    |       +-- nuclei, httpx (pre-installed)
    |       +-- dirsearch (installed on-demand)
    |
    +-- max_concurrent, TTL, orphan cleanup

55 Security Tools

Category	Count	Examples
Pre-installed (Go)	11	nuclei, naabu, httpx, subfinder, katana, dnsx, ffuf, gobuster, dalfox, waybackurls, uncover
Pre-installed (APT)	5	nmap, nikto, sqlmap, masscan, whatweb
Pre-installed (System)	12	curl, wget, git, python3, pip3, go, jq, dig, whois, openssl, netcat, bash
APT on-demand	15	wpscan, dirb, hydra, john, hashcat, sslscan, amass, enum4linux, dnsrecon, fierce, crackmapexec
Go on-demand	4	gau, gitleaks, anew, httprobe
Pip on-demand	8	dirsearch, wfuzz, arjun, wafw00f, sslyze, commix, trufflehog, retire

Dynamic Tool Engine

AI selects tools based on detected tech stack
On-demand install → execute → collect results → cleanup
Tool output parsed and converted to structured findings
Results fed back into recon context for deeper testing

Researcher AI Agent

Hypothesis-driven 0-day discovery agent with Kali sandbox access:

Observe (recon data + existing findings)
    |
    v
Hypothesize (AI generates targeted hypotheses)
    |          - Logic flaws, race conditions
    v          - CVE-based attacks, misconfigurations
Plan Tools (AI selects from 55+ tools)
    |
    v
Execute in Sandbox (isolated Kali container)
    |
    v
Analyze Results (AI verdicts: confirmed/rejected)
    |
    v
Loop (max 15 hypotheses, 30 tool executions, 5 iterations)

Enabled via: ENABLE_RESEARCHER_AI=true + per-scan checkbox in frontend.

Intelligence Modules

CVE Hunter

Extracts software versions from headers, meta tags, error pages, JS files
Searches NVD API (NIST National Vulnerability Database)
Searches GitHub for public exploit PoCs
Correlates CVEs with detected versions
Optional API keys for higher rate limits

Banner Analyzer

400 known vulnerable version signatures
19 end-of-life version categories
Instant version-to-CVE mapping without API calls
AI-assisted analysis for unknown versions

Deep Recon

JavaScript file crawling for API endpoints, secrets, route definitions
Sitemap.xml and robots.txt parsing
OpenAPI/Swagger schema discovery and enumeration
Deep fingerprinting from multiple sources

Endpoint Classifier

8 endpoint type categories with risk scoring:

Type	Risk Weight	Priority Vulns
Admin	0.95	auth_bypass, privilege_escalation, default_credentials
Auth	0.90	auth_bypass, brute_force, weak_password
Upload	0.85	file_upload, xxe, path_traversal
API	0.80	idor, bola, bfla, jwt_manipulation, mass_assignment
Data	0.75	idor, bola, mass_assignment, data_exposure
Search	0.70	sqli_error, xss_reflected, nosql_injection
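
One way to use these weights is to order the endpoint queue by risk. In the sketch below the weights come from the table, while the path keywords used to recognize each type are hypothetical; the actual classifier's matching rules are not described in these notes.

```python
from typing import Optional

# Risk weights copied from the table above; the path keywords are hypothetical.
ENDPOINT_TYPES = [
    ("admin", 0.95, ("admin",)),
    ("auth", 0.90, ("login", "signin", "auth")),
    ("upload", 0.85, ("upload",)),
    ("api", 0.80, ("/api/",)),
    ("data", 0.75, ("export", "download")),
    ("search", 0.70, ("search",)),
]


def classify(path: str) -> Optional[tuple[str, float]]:
    """Return the highest-risk type whose keyword appears in the path, if any."""
    lowered = path.lower()
    for name, weight, keywords in ENDPOINT_TYPES:  # already sorted by descending risk
        if any(keyword in lowered for keyword in keywords):
            return name, weight
    return None


def prioritize(paths: list[str]) -> list[str]:
    """Order endpoints so the riskiest are tested first; unknown types go last."""
    return sorted(paths, key=lambda p: (classify(p) or ("", 0.0))[1], reverse=True)
```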

Parameter Analyzer

8 semantic categories for smart parameter prioritization:

ID params (id, uid, user_id) -> IDOR, BOLA
File params (file, path, include) -> LFI, Path Traversal
URL params (url, redirect, callback) -> SSRF, Open Redirect
Query params (q, search, filter) -> SQLi, XSS
Auth params (token, jwt, session) -> JWT Manipulation, Auth Bypass
Code params (cmd, exec, template) -> Command Injection, SSTI
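
The listed categories can be expressed as a name-to-vulnerability lookup. The names and vulnerability types below are copied from the list; the matching rule (exact name, or a `_name` suffix such as `order_id`) and the lowercase type keys are assumptions.

```python
# Categories copied from the list above; the matching rule is an assumption.
PARAM_CATEGORIES = {
    "id": (("id", "uid", "user_id"), ["idor", "bola"]),
    "file": (("file", "path", "include"), ["lfi", "path_traversal"]),
    "url": (("url", "redirect", "callback"), ["ssrf", "open_redirect"]),
    "query": (("q", "search", "filter"), ["sqli", "xss"]),
    "auth": (("token", "jwt", "session"), ["jwt_manipulation", "auth_bypass"]),
    "code": (("cmd", "exec", "template"), ["command_injection", "ssti"]),
}


def priority_vulns(param: str) -> list[str]:
    """Vulnerability types to try first for a parameter name; empty if unrecognized."""
    name = param.lower()
    for names, vulns in PARAM_CATEGORIES.values():
        if name in names or any(name.endswith("_" + n) for n in names):
            return vulns
    return []
```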

Payload Mutator

14 mutation strategies for WAF/filter bypass:

Double encoding, Unicode escape, case variation
Null byte injection, comment injection, concat bypass
Hex encoding, newline/tab bypass, charset bypass
Failure analysis: adapts strategy based on observed response patterns
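
Three of the named strategies are simple string transforms, sketched here with the standard library. The function names are illustrative, not the mutator's API, and the Unicode variant assumes characters in the Basic Multilingual Plane.

```python
from urllib.parse import quote


def double_encode(payload: str) -> str:
    """URL-encode twice, so '<' becomes '%253C'."""
    return quote(quote(payload, safe=""), safe="")


def case_variation(payload: str) -> str:
    """Alternate lower/upper case to slip past case-sensitive filters."""
    return "".join(c.upper() if i % 2 else c.lower() for i, c in enumerate(payload))


def unicode_escape(payload: str) -> str:
    """Rewrite each character as a \\uXXXX escape (BMP characters only)."""
    return "".join(f"\\u{ord(c):04x}" for c in payload)
```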

WAF Detection & Bypass

16 WAF signatures (Cloudflare, AWS WAF, Akamai, Imperva, F5, Sucuri, etc.)
Passive detection (response headers) + active probing
12 bypass techniques per WAF type
Auto-applied when WAF detected

Request Infrastructure

Resilient Request Engine

Automatic retry with exponential backoff
Rate limiting (requests/second configurable)
Circuit breaker (open after N consecutive failures, half-open probe, close on success)
Adaptive timeouts (increase on slow responses)
Per-domain rate tracking
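
The backoff schedule and the circuit-breaker states described above can be sketched as follows. The class and its defaults (threshold, cooldown, delay cap) are assumptions for illustration; the injectable `clock` exists only to make the behavior easy to test.

```python
import time


def backoff_delays(retries: int, base: float = 0.5, cap: float = 8.0) -> list[float]:
    """Exponential backoff: base, 2*base, 4*base, ... capped at `cap` seconds."""
    return [min(cap, base * 2**attempt) for attempt in range(retries)]


class CircuitBreaker:
    """Open after `threshold` consecutive failures; allow one probe after `cooldown`."""

    def __init__(self, threshold: int = 3, cooldown: float = 30.0, clock=time.monotonic):
        self.threshold, self.cooldown, self.clock = threshold, cooldown, clock
        self.failures = 0
        self.opened_at = None

    def allow(self) -> bool:
        if self.opened_at is None:
            return True  # closed: traffic flows normally
        return self.clock() - self.opened_at >= self.cooldown  # half-open probe

    def record(self, ok: bool) -> None:
        if ok:
            self.failures, self.opened_at = 0, None  # close on success
            return
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = self.clock()  # (re)open and restart the cooldown
```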

Auth Manager

Multi-user session management
Login form detection and auto-authentication
Cookie, Bearer, Basic, Header auth types
Session refresh on expiry

Multi-Agent Orchestration (Experimental)

Optional replacement for the 3-stream architecture. 5 specialist agents with handoff coordination:

Agent	Budget	Responsibility
ReconAgent	20%	Deep crawl, JS analysis, API enum, fingerprinting
ExploitAgent	35%	Classify endpoints, prioritize params, test, mutate, validate
ValidatorAgent	20%	Independent re-test, different payloads, reproducibility
CVEHunterAgent	10%	Version extraction, NVD search, GitHub exploit search
ReportAgent	15%	Finding enhancement, PoC generation, report creation

3-phase pipeline: Parallel (Recon + CVE) -> Sequential (Exploit) -> Parallel (Validator + Report)

Enable: ENABLE_MULTI_AGENT=true in .env

Frontend

14 Pages

Page	Route	Description
Home	`/`	Dashboard with stats, activity feed, severity charts
Auto Pentest	`/auto`	3-stream display, live findings, AI reports, Kali checkbox
Scan Details	`/scan/:id`	Findings with validation badges, confidence scores, pause/resume/stop
New Scan	`/scan/new`	Quick/Full/Custom scan configuration
Reports	`/reports`	Report listing with HTML/PDF/JSON download
Report View	`/report/:id`	Interactive report viewer
Terminal Agent	`/terminal`	AI chat + command execution interface
Vuln Lab	`/vuln-lab`	Per-type challenge testing (100 types, 11 categories)
Task Library	`/tasks`	Reusable pentest task templates
Scheduler	`/scheduler`	Cron/interval scheduling with CRUD
Settings	`/settings`	LLM providers, model routing, feature toggles
Sandbox Dashboard	`/sandbox`	Kali container monitoring, tool status
Agent Status	`/agent/:id`	Real-time agent progress and logs
Realtime Task	`/realtime`	Live interactive testing session

Key UI Features

Real-time WebSocket updates: live scan progress, findings, logs
Confidence badges: green (>=90), yellow (>=60), red (<60) with breakdown details
Validation Pipeline display: proof of execution, negative controls, scoring breakdown
Pause/Resume/Stop: scan control with 5 internal checkpoints
Manual validation: confirm/reject AI decisions
Screenshot evidence: inline per-finding in PoC section
Rejected findings viewer: expandable section with rejection reasons

Report Generation

Two Report Engines

Engine	Format	Style
Professional	HTML	Dark theme, collapsible findings, click-to-zoom screenshots, severity charts
OHVR	HTML	Observation-Hypothesis-Validation-Result methodology, PoC code blocks

Both engines support:

Executive summary (AI-generated)
Severity breakdown with visual charts
Per-finding: description, PoC, exploitation code, inline screenshots, impact, remediation, references
Rejected findings section (AI-rejected, pending manual review)
JSON export for programmatic consumption

Screenshot Placement

Screenshots are embedded inline within each vulnerability's PoC section — directly associated with the finding they evidence. No separate gallery at the end.

Vulnerability Finding
  +-- Description
  +-- Proof of Concept
  |     +-- Observation
  |     +-- Hypothesis
  |     +-- Validation (payload + request)
  |     +-- Exploitation Code
  |     +-- Visual Evidence (screenshots)  <-- HERE
  |     +-- Result (impact)
  +-- Remediation
  +-- References

Cross-Scan Learning

Execution History

Tracks attack success/failure across all scans
Records: tech_stack + vuln_type + target + success rate
get_priority_types(tech_stack) — returns types ranked by historical success
Auto-influences AI prompts and testing priority in future scans
Bounded storage (500 records, auto-save every 20)
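
A minimal sketch of a bounded history with success-rate ranking. `get_priority_types(tech_stack)` is the name given above, but its real signature, storage format, and ranking formula are not documented here, so everything else in this block is an assumption (persistence and auto-save are omitted).

```python
from collections import deque


class ExecutionHistory:
    def __init__(self, max_records: int = 500):
        self.records = deque(maxlen=max_records)  # oldest entries fall off automatically

    def record(self, tech_stack: str, vuln_type: str, success: bool) -> None:
        self.records.append((tech_stack, vuln_type, success))

    def get_priority_types(self, tech_stack: str) -> list[str]:
        """Vulnerability types for this stack, ranked by historical success rate."""
        stats: dict[str, list[int]] = {}
        for stack, vuln_type, success in self.records:
            if stack == tech_stack:
                hits = stats.setdefault(vuln_type, [0, 0])  # [successes, attempts]
                hits[0] += int(success)
                hits[1] += 1
        return sorted(stats, key=lambda v: stats[v][0] / stats[v][1], reverse=True)
```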

Access Control Learner

Adaptive true-positive / false-positive pattern learning
9 detection patterns, 6 known FP patterns
Influences ValidationJudge scoring in subsequent scans

LLM Provider Support

Provider	Models	Config
Anthropic Claude	claude-3.5-sonnet, claude-3-opus, claude-3-haiku	`ANTHROPIC_API_KEY`
OpenAI	gpt-4o, gpt-4-turbo, gpt-3.5-turbo	`OPENAI_API_KEY`
Google Gemini	gemini-pro, gemini-1.5-pro	`GEMINI_API_KEY`
OpenRouter	Any model via unified API	`OPENROUTER_API_KEY`
Ollama	Any local model (llama, mistral, etc.)	`OLLAMA_BASE_URL`
LM Studio	Any local model	`LMSTUDIO_BASE_URL`

Model Routing

Optional task-type routing to different LLM profiles:

Task Type	Recommended
Reasoning	High-capability (Claude Opus, GPT-4)
Analysis	Medium (Claude Sonnet, GPT-4-turbo)
Generation	Medium (Sonnet, GPT-4-turbo)
Validation	High-capability for accuracy
Default	Configurable

Enable: ENABLE_MODEL_ROUTING=true with profiles in config/config.json

Configuration

Environment Variables

# LLM API Keys (at least one required)
ANTHROPIC_API_KEY=
OPENAI_API_KEY=
GEMINI_API_KEY=
OPENROUTER_API_KEY=

# Local LLM (no key needed)
#OLLAMA_BASE_URL=http://localhost:11434
#LMSTUDIO_BASE_URL=http://localhost:1234

# Feature Flags
ENABLE_MODEL_ROUTING=false
ENABLE_KNOWLEDGE_AUGMENTATION=false
ENABLE_BROWSER_VALIDATION=false
ENABLE_REASONING=true
ENABLE_CVE_HUNT=true
ENABLE_MULTI_AGENT=false
ENABLE_RESEARCHER_AI=true

# Optional API Keys
#NVD_API_KEY=
#GITHUB_TOKEN=

# Token Budget (comment out for unlimited)
#TOKEN_BUDGET=100000

# Database
DATABASE_URL=sqlite+aiosqlite:///./data/neurosploit.db

# Server
HOST=0.0.0.0
PORT=8000
DEBUG=false

Installation

Backend

cd /opt/NeuroSploitv2
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
cp .env.example .env
# Edit .env with your API key(s)

Frontend

cd frontend
npm install
npm run build

Kali Sandbox (Optional)

docker build -f docker/Dockerfile.kali -t neurosploit-kali:latest docker/

Run

# Backend (serves frontend static files too)
python -m uvicorn backend.main:app --host 0.0.0.0 --port 8000

# Or development mode (run each command in its own terminal, from the repo root)
(cd frontend && npm run dev)  # Frontend hot reload on port 3000
python -m uvicorn backend.main:app --reload --port 8000

Requirements

Component	Minimum	Recommended
Python	3.10+	3.12
Node.js	18+	20 LTS
Docker	24+	Latest (for Kali sandbox)
RAM	4 GB	8 GB
Disk	2 GB	5 GB (with Kali image)

Backend Dependencies

Framework: FastAPI, Uvicorn, Pydantic
Database: SQLAlchemy (async), aiosqlite
HTTP: aiohttp
LLM: anthropic, openai
Reports: Jinja2, WeasyPrint
Scheduling: APScheduler
Optional: playwright, docker, mcp

Frontend Dependencies

UI: React 18, TypeScript, Tailwind CSS
State: Zustand
HTTP: Axios
Realtime: Socket.IO Client
Charts: Recharts
Icons: Lucide React
Build: Vite

Known Limitations

Hitting an Anthropic API budget limit interrupts the scan; set a fallback provider in .env
Multi-agent orchestration (ENABLE_MULTI_AGENT) is experimental
Playwright browser validation requires Python 3.10+ and Chromium
MCP server requires Python 3.10+
Container-per-scan requires Docker daemon running
Token budget tracking is approximate (estimates, not exact counts)
CLI report (neurosploit.py) does not embed screenshots (backend reports do)
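
To illustrate why token budget tracking can only be approximate, here is a sketch that estimates usage from character counts. The four-characters-per-token ratio is a common rule of thumb, not the project's method, and `TokenBudgetSketch` is a hypothetical name. The `None` budget mirrors leaving `TOKEN_BUDGET` commented out for unlimited use.

```python
import math

CHARS_PER_TOKEN = 4  # rough rule of thumb, not a real tokenizer


class TokenBudgetSketch:
    """Hypothetical budget tracker that charges estimated, not exact, tokens."""

    def __init__(self, budget=None):
        self.budget = budget  # None means unlimited
        self.used = 0

    def charge(self, text: str) -> int:
        # Estimate tokens from length; real counts depend on the model's tokenizer.
        estimate = math.ceil(len(text) / CHARS_PER_TOKEN)
        self.used += estimate
        return estimate

    def exhausted(self) -> bool:
        return self.budget is not None and self.used >= self.budget
```

Because the estimate ignores the provider's actual tokenizer, the tracked total can drift from what the provider bills, which is the limitation noted above.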

File Structure

NeuroSploitv2/
+-- backend/
|   +-- api/v1/              # 14 API routers (111+ endpoints)
|   +-- core/                # 63 Python modules (37,546 lines)
|   |   +-- vuln_engine/     # 100-type vulnerability engine
|   |   |   +-- registry.py          # 100 vuln info + 100 tester classes
|   |   |   +-- payload_generator.py # 107 libraries, 477+ payloads
|   |   |   +-- ai_prompts.py        # 100 per-type AI decision prompts
|   |   |   +-- system_prompts.py    # 12 anti-hallucination templates
|   |   |   +-- testers/             # 12 tester modules
|   |   +-- autonomous_agent.py      # Main agent (7,592 lines)
|   |   +-- researcher_agent.py      # 0-day discovery AI
|   |   +-- reasoning_engine.py      # ReACT think/plan/reflect
|   |   +-- validation_judge.py      # Finding approval authority
|   |   +-- confidence_scorer.py     # Numeric 0-100 scoring
|   |   +-- proof_of_execution.py    # Per-type proof checks
|   |   +-- negative_control.py      # False positive detection
|   |   +-- request_engine.py        # Retry, rate limit, circuit breaker
|   |   +-- waf_detector.py          # 16 signatures, 12 bypasses
|   |   +-- strategy_adapter.py      # Dead endpoints, priority recompute
|   |   +-- chain_engine.py          # 10+ exploit chain rules
|   |   +-- exploit_generator.py     # AI-enhanced PoC generation
|   |   +-- cve_hunter.py            # NVD + GitHub exploit search
|   |   +-- deep_recon.py            # JS crawling, sitemap, API enum
|   |   +-- banner_analyzer.py       # 400 known CVEs, 19 EOL versions
|   |   +-- endpoint_classifier.py   # 8 types + risk scoring
|   |   +-- param_analyzer.py        # 8 semantic categories
|   |   +-- payload_mutator.py       # 14 mutation strategies
|   |   +-- xss_validator.py         # Playwright browser validation
|   |   +-- xss_context_analyzer.py  # 8 context detection
|   |   +-- auth_manager.py          # Multi-user session management
|   |   +-- token_budget.py          # Budget tracking + degradation
|   |   +-- agent_tasks.py           # Priority queue task manager
|   |   +-- agent_orchestrator.py    # Multi-agent coordinator
|   |   +-- specialist_agents.py     # 5 specialist agents
|   |   +-- execution_history.py     # Cross-scan learning
|   |   +-- access_control_learner.py# TP/FP adaptive learning
|   |   +-- report_generator.py      # Professional HTML reports
|   |   +-- report_engine/           # OHVR report engine
|   +-- models/              # 8 SQLAlchemy ORM models
|   +-- config.py            # Pydantic settings
|   +-- main.py              # FastAPI app entry
+-- frontend/
|   +-- src/
|   |   +-- pages/           # 14 React pages
|   |   +-- components/      # Reusable UI components
|   |   +-- services/        # API client + WebSocket
|   |   +-- store/           # Zustand state management
|   |   +-- types/           # TypeScript interfaces
+-- core/
|   +-- llm_manager.py       # 6-provider LLM routing
|   +-- tool_registry.py     # 55 security tools
|   +-- kali_sandbox.py      # Per-scan container management
|   +-- container_pool.py    # Global container coordinator
|   +-- sandbox_manager.py   # Sandbox abstraction layer
+-- docker/
|   +-- Dockerfile.kali      # Multi-stage Kali Linux image
|   +-- Dockerfile.backend   # Backend service
|   +-- Dockerfile.frontend  # Frontend builder
+-- config/
|   +-- config.json          # Profiles, roles, tools, routing
+-- data/
|   +-- vuln_knowledge_base.json  # 100 vulnerability entries
+-- neurosploit.py           # CLI entry point
+-- .env.example             # Environment template
